
Overview

Bitcoin's price is highly volatile from second to second, yet clear trends are observable. Conventional modeling methods, however, often fail to produce models that can predict these trends. To address this, I used a Cox regression model to forecast whether a trend survives the subsequent second. The model predicted trend continuation well, but its ability to predict trend termination was subpar. Although the model's immediate production use may be limited, it can still yield useful insights and serve as a foundation for future model development and optimization.
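For reference, the general form of a Cox proportional hazards model, and the one-step survival probability it implies, can be written as below. This is the textbook formulation rather than the exact specification fitted here: the covariate vector x, the coefficients β, and the baseline survival function S_0 are generic placeholders, and which order-book features were used as covariates is not stated in this write-up.

```latex
% Hazard of a trend ending at duration t, given covariates x
h(t \mid x) = h_0(t)\, \exp\!\left(\beta^{\top} x\right)

% Implied survival function, with S_0 the baseline survival function
S(t \mid x) = S_0(t)^{\exp\left(\beta^{\top} x\right)}

% Probability that a trend alive at duration t survives one more step
P(T > t + 1 \mid T > t,\, x)
  = \left( \frac{S_0(t+1)}{S_0(t)} \right)^{\exp\left(\beta^{\top} x\right)}
```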

Data Pipeline

How Data was retrieved

  • The order book data is for BTC on Binance’s crypto exchange

  • The data was retrieved live via Binance’s order book websocket API

    • The websocket connection was coded in Python

  • The data received by the websocket was parsed for the needed information and saved to a CSV file
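The parsing and saving step can be sketched roughly as follows. This is a minimal illustration, not the original script: it assumes the message layout of Binance's diff depth stream (event time under "E", bid and ask levels under "b" and "a" as [price, quantity] string pairs), the sample message is made up, and the function names and CSV columns are hypothetical. The live websocket connection is left out so the example runs offline.

```python
import csv
import io
import json


def parse_depth_update(raw):
    """Extract the event time and bid/ask levels from one depth-update message."""
    msg = json.loads(raw)
    # Prices and quantities arrive as strings; convert them to floats.
    bids = [(float(p), float(q)) for p, q in msg["b"]]
    asks = [(float(p), float(q)) for p, q in msg["a"]]
    return msg["E"], bids, asks


def append_levels(writer, event_time, side, levels):
    """Write one CSV row per price level."""
    for price, qty in levels:
        writer.writerow([event_time, side, price, qty])


# Made-up sample message in the assumed layout.
sample = (
    '{"e": "depthUpdate", "E": 1700000000000, "s": "BTCUSDT", '
    '"b": [["35000.10", "0.5"], ["35000.00", "1.25"]], '
    '"a": [["35000.20", "0.75"]]}'
)

# An in-memory buffer stands in for the CSV file on disk.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["event_time_ms", "side", "price", "qty"])

event_time, bids, asks = parse_depth_update(sample)
append_levels(writer, event_time, "bid", bids)
append_levels(writer, event_time, "ask", asks)
csv_text = buffer.getvalue()
```

In a live setup, each message received from the websocket would be passed to parse_depth_update and the writer would point at a real file instead of the in-memory buffer.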

How the Order-Book data was created

  • For each order book update, data was recorded.

  • After the update data was recorded, the order book itself was updated

  • Data collected for each update

    • Quantity-weighted average of bid amounts (bqav)

    • Quantity-weighted average of ask amounts (aqav)

    • Variance of the quantity-weighted bid amounts (bvar)

    • Variance of the quantity-weighted ask amounts (avar)

    • Timestamp

      • Updates arrived once per second
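The per-update statistics and the book update can be sketched as follows. The exact formulas are not given above, so this assumes a quantity-weighted mean and a quantity-weighted variance around that mean, computed over the levels of one side. The function names, the dict-based order book, and the sample numbers are all hypothetical; treating a zero quantity as "remove this price level" follows the usual convention for exchange depth updates.

```python
def weighted_mean_and_variance(levels):
    """Return the quantity-weighted mean and variance of prices.

    levels is a list of (price, quantity) pairs for one side of the book.
    """
    total_qty = sum(qty for _, qty in levels)
    if total_qty == 0:
        raise ValueError("levels carry no quantity")
    mean = sum(price * qty for price, qty in levels) / total_qty
    variance = sum(qty * (price - mean) ** 2 for price, qty in levels) / total_qty
    return mean, variance


def apply_update(book_side, levels):
    """Apply an update to one side of the book, stored as {price: quantity}."""
    for price, qty in levels:
        if qty == 0:
            # A zero quantity removes the price level.
            book_side.pop(price, None)
        else:
            book_side[price] = qty


# Made-up bid levels from a single update.
bid_update = [(100.0, 1.0), (102.0, 3.0)]

# Record the statistics first, then update the book, in the order described above.
bqav, bvar = weighted_mean_and_variance(bid_update)

bids = {100.0: 2.0, 101.0: 2.0}
apply_update(bids, bid_update + [(101.0, 0.0)])
```

The same two functions would be applied to the ask side to obtain aqav and avar.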

Final Data preparation

  • The data was condensed into 5-second intervals

  • A variable called higher was created to denote whether the following interval had a higher or lower BTC price

  • The variable t_death was created to detect whether a trend is ongoing, that is, whether the previous interval has a matching higher value

  • Alongside t_death, t_dur (trend duration) was created to track how long the trend has lasted

  • Two variables were derived from the previously created variables

    • Varx

      • avar divided by (1 + bvar)

    • Meanx

      • bqav multiplied by aqav
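The derived variables above can be sketched in plain Python as follows. Several details are assumptions made for illustration: each interval is a dict with a hypothetical price field alongside bqav, aqav, bvar, and avar; higher is coded 1 or 0; t_death is coded 1 on the interval where the direction differs from the previous interval (the earlier trend ended) and 0 otherwise; and t_dur counts the consecutive intervals, including the current one, that share the same higher value. The original coding may differ.

```python
def add_trend_features(intervals):
    """Add higher, varx, meanx, t_death and t_dur to time-ordered intervals."""
    rows = []
    # The last interval has no following interval, so it is dropped.
    for current, following in zip(intervals, intervals[1:]):
        row = dict(current)
        row["higher"] = int(following["price"] > current["price"])
        row["varx"] = current["avar"] / (1 + current["bvar"])
        row["meanx"] = current["bqav"] * current["aqav"]
        rows.append(row)

    t_dur = 0
    for i, row in enumerate(rows):
        if i > 0 and row["higher"] == rows[i - 1]["higher"]:
            # Same direction as the previous interval: the trend continues.
            t_dur += 1
            row["t_death"] = 0
        else:
            # Direction changed (or first row): a new trend starts here.
            t_dur = 1
            row["t_death"] = 1 if i > 0 else 0
        row["t_dur"] = t_dur
    return rows


# Made-up 5-second intervals; only the price varies.
prices = [100, 101, 102, 101, 100, 101]
sample = [
    {"price": p, "bqav": 3.0, "aqav": 4.0, "bvar": 1.0, "avar": 2.0}
    for p in prices
]
rows = add_trend_features(sample)
```

With these made-up prices the direction flips twice, so t_death is 1 on the third and fifth rows and t_dur restarts at 1 on each of them.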

Results & Discussion

The regression model demonstrated an 83% accuracy rate for trend continuation but only a 25% accuracy rate for trend termination. Closer examination revealed an unnatural distribution of trend termination probabilities, with results heavily skewed towards either extremely high or extremely low values when a trend actually ended. This observation, coupled with the tendency for trend terminations to occur in clusters, suggests that incorporating a variable to quantify clustering could significantly improve the accuracy of trend termination prediction.
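Separate accuracy figures for continuation and termination, like the two reported above, can be computed along the following lines. This is a sketch under assumptions: the model output is taken to be a predicted probability that the trend ends, a 0.5 threshold turns it into a prediction, and the function name and toy data are hypothetical. The numbers below are made up and are not the project's results.

```python
def per_class_accuracy(actual_ended, p_end, threshold=0.5):
    """Return (continuation accuracy, termination accuracy).

    actual_ended: 1 if the trend really ended in that interval, else 0.
    p_end: predicted probability that the trend ends.
    A class with no observations yields None.
    """
    hits = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for ended, prob in zip(actual_ended, p_end):
        predicted = int(prob >= threshold)
        totals[ended] += 1
        hits[ended] += int(predicted == ended)
    cont_acc = hits[0] / totals[0] if totals[0] else None
    end_acc = hits[1] / totals[1] if totals[1] else None
    return cont_acc, end_acc


# Toy data: four continuations followed by two terminations.
actual = [0, 0, 0, 0, 1, 1]
probs = [0.1, 0.2, 0.7, 0.3, 0.9, 0.05]
continuation_acc, termination_acc = per_class_accuracy(actual, probs)
```

Reporting the two classes separately matters here because trend continuations outnumber terminations, so a single overall accuracy would hide weak termination performance.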

Notes

Cross-validation was not used, so the results should be treated as less conclusive. Some interactions between variables were not included in the model, and additional variables could be derived from the order book updates to improve the final model.

© 2023 by naijport. All rights reserved.
