Decrypting the world of Cryptocurrency with Random Forest-Part I

Poojadurai
7 min readMay 31, 2021

Since its inception, Cryptocurrency has gained an important place in the international financial landscape. The popularity of cryptocurrency has skyrocketed in 2017 due to consecutive months of exponential growth of their market capitalization, which peaked at more than $800 billions in Jan. 2018. Like common stocks, Cryptocurrency price fluctuations are non-stationary and highly noisy. Due to attractiveness of Cryptocurrency in terms of returns and risk, Cryptocurrency price prediction is gaining a growing attention. Indeed, with the development of machine learning , forecasting Cryptocurrency is receiving a particular interest.

In this blog, we will explore how Cryptocurrency price pattern can be predicted using a Random Forest Classification algorithm. The entire exercise is done with the help of XL8ML software and Tulip Indicator library in Excel. XL8ML is an efficient software that helps us to implement the state of art machine learning algorithms in Excel itself! Tulip Indicators (TI) is a library of functions for technical analysis of financial time series data.

Let us look at the data used first. Historical data of prices and volumes of Cryptocurrency can be obtained from Quandl. Quandl is a marketplace for financial, economic and alternative data delivered in modern formats for analysis. Below are the variables available in the dataset,

· Date : The date range is from 7th January,2014 to 17th May,2021

· Open : Opening price of the time interval

· High : Highest price reached during time interval

· Low : Lowest price reached during time interval

· Close : Closing price of the time interval

· Volume (BTC) : The Bitcoin trading volume indicates how many Bitcoins are being bought and sold on specific exchanges.

· Volume (Currency) : Total volume of Crypto coin in a particular exchange

· Weighted Price : The average price of an asset over a set period of time, weighted by volume

Weighted Price across various time period

Let’s get started! The first and foremost thing is to load the dataset into Excel. This is how the dataset looks after loading into Excel.

Data when loaded into Excel

We need to separate out the individual columns in order to analyze the data. This can be done using xl_s_split(input range,delimiter,trim) function from XL8ML software. Once we do that, we get the desired format.

Data after applying xl_s_split

Data cleaning and preparation is a critical first step in any machine learning project. Although we often think of data scientists as spending lots of time tinkering with algorithms and machine learning models, the reality is that most data scientists spend most of their time cleaning data.

Before we start cleaning data for a machine learning project, it is vital to understand what the data is, and what we want to achieve. Without that understanding, we have no basis from which to make decisions about what data is relevant as we clean and prepare our data.

In this data we find many cases where the price is 0. No worries! We have XL8ML to help us. XL8ML offers xl_t_where_neq(table,column,value). We use this function to choose only those values that are non zero.

xl_t_where_neq()

Now an important question arises in our mind. Is this data fair enough to predict the Cryptocurrency direction? Predicting which direction the price of Cryptocurrency will move is not easy, and that’s why indicators exist. Bitcoin indicators are tools combined with technical analysis that can help traders predict the price movement of cryptocurrencies with more accuracy. So let’s make use of them in our data!

Tulip Cell is a free technical analysis plug in for Excel, based on Tulip Indicators. It offers over 100 free technical analysis indicators, such as moving averages, stochastic oscillator, MACD, Bollinger Bands, and more.

For our model, we will be using the following technical indicators in addition to the historical data,

· Relative Strength Index

· Moving average Convergence/Divergence

· Exponential Moving Average

· Parabolic SAR

· Bollinger Bands

· On Balance Volume

· Chaikins Volatility

Our aim is to predict whether the value of a Cryptocurrency will rise or fall looking 7 days into the future. Predicting binary classes is much easier than predicting concrete prices.

To achieve our goal, we need to transform our data. Let us find the daily price changes for Open, High, Low, Close, Volume(BTC) and Weighted price variables. This transformation is done using xl_to_log_returns(input range,period) in XL8ML. It gives N period log returns as output. Since we daily change, let us take period as 1.

xl_to_log_return()

Now let us calculate the technical indicators using Tulip cell add in. Relative strength index is calculated using TI_RSI(input range,period). The RSI indicator helps traders identify when Bitcoin’s price is too far from its “true” value allowing a trader to take advantage before the market corrects itself. Using the RSI, traders can realize great trading entry points. Here I am taking period as 3 and the input is daily weighted price change.

TI_RSI()

Understanding Moving average convergence divergence is very necessary. When the price is experiencing a lower low but the MACD is making a higher low, it creates a bullish divergence. If the MACD is making a lower high, but the price is making a higher high, it creates a bearish divergence. Divergence will almost always happen right after a sudden price movement higher or lower. Divergence is just a cue that the price might reverse. The standard setting for MACD is the difference between the 12 and 26-period EMAs. The function used is TI_MACD(Input range, Short period, Long period, Signal period). Hence I have taken 12,26 and 5 as period values along with daily weighted price change as input

TI_MACD()

The Exponential Moving Average (EMA) is one of the most popular and useful indicators available to cryptocurrency traders. A moving average (MA) is calculated by calculating the sum of closing prices from a specified number of trading periods, and then dividing the sum by the number of trading periods. This is created using TI_EMA(Input range,Period).

TI_EMA()

Parabolic SAR indicator and is used to identify trends, reversals, and breakouts. The parabolic SAR highlights the direction of an asset’s price action and shows traders entry and exit points they can use to profit. This can be calculated using TI_PSAR(High range, Low range, Acceleration Factor step, Acceleration Factor Maximum).

TI_PSAR()

Bollinger Bands are used by cryptocurrency traders essentially as a tool to identify overbought and oversold assets. A crypto trader may use Bollinger Bands to track the performance of an asset like Bitcoin over 20 periods, then make buy or sell decisions based on the formations that develop. This is calculated using TI_BBANDS(Input range,Period,Stddev)

TI_BBANDS()

On balance volume is a metric based on daily volume. It can help you confirm market signals. For example, if you believe that the price of a cryptocurrency is increasing, you can calculate the on-balance value to see if it is also increasing. If it is, then you can partially confirm that the price increase is likely to stay. This is calculated using TI_OBV(Close range, Volume range).

TI_OBV()

Chaikin volatility indicator depicts volatility by calculating the difference between the high and low for each period or trading bar. It measures the difference between two moving averages of a volume-weighted accumulation distribution line. The great thing about the Chaikin indicator is that it often peaks at around the same time every single day. This means that is the perfect tool for helping to time your trades. It is calculated using TI_CVI(High range, Low range, Period).

TI_CVI()

Finally our data is all set! Let us take a look at it.

Data with technical indicators

Thus we have successfully done data cleaning and data preparation for our Random Forest Classifier model using XL8ML and Tulip Cell in Excel. In Part II of this blog, we will deep dive into concepts of Random Forest and build our first classifier to predict the direction of Cryptocurrency using XL8ML! Stay tuned for Part II!

--

--

Poojadurai

I am a curious learner who loves to explore the trending technologies and experiment them. I like to share my thoughts on latest topics.