How to predict Cryptocurrency price using LSTM Recurrent Neural Networks in Python

This is a post on how to predict cryptocurrency prices using LSTM Recurrent Neural Networks in Python. Using this tutorial, you can predict the price of any cryptocurrency, be it Bitcoin, Ethereum, IOTA, Cardano, Ripple or any other.

What are LSTMs?

LSTMs are a special kind of RNN, capable of learning long-term dependencies. They were introduced by Hochreiter & Schmidhuber (1997) and are very popular for working with sequential data such as text and time series. For a deeper understanding of RNNs and LSTMs, I highly recommend you check out The Unreasonable Effectiveness of Recurrent Neural Networks by Andrej Karpathy and Understanding LSTM Networks by Christopher Olah.

Why the interest in predicting Cryptocurrency prices?

The first cryptocurrency ever was Bitcoin, created in 2009 by an unknown person using the name Satoshi Nakamoto. Now, only 8 years after the first cryptocurrency showed up, the number of cryptocurrencies available over the internet is, according to Wikipedia, over 1,381 as of 31 December 2017 and growing. And according to this post on dailyforexreport.com, the total cryptocurrency market increased by 1,600% in 2017 alone. Evidently, there are people who are making a fortune trading cryptocurrency. Unsurprisingly, predicting cryptocurrency prices is one of the most researched topics of our time.

Let me be clear up front: even though this post is titled “how to predict Cryptocurrency price”, it is not financial advice. This tutorial is merely for educational purposes. Even though the LSTM model works fairly well on test data, there are many factors that affect the price of a cryptocurrency, such as public perception of its value, media coverage, investor sentiment, and scams, which we won’t be taking into account here. I am not responsible for any money you lose. However, I will gladly accept any share of the profit if you want me to. Anyway, let’s get started.

Predict Bitcoin’s price using a Neural Network

We are going to use Bitcoin as our cryptocurrency of choice. At the time of writing, its market cap is over 249 billion dollars. You can find historical data for the price of Bitcoin on coinmarketcap’s site here. I’ve simply copied the data from there and saved the file as all_bitcoin.csv. If you look at the data, it should look something like this:

[Image: historical Bitcoin price data]

If you plot the Date against the Closing value of Bitcoin, you should see a chart that looks somewhat like this:

[Image: bitcoin price chart]

Problem Definition:

We can phrase our problem as a regression problem.

That is, given the value of bitcoin today, what is the value of bitcoin tomorrow?

Data Preparation

Here, we don’t need the Open, High, Low, Volume and Market Cap columns. We also don’t need to worry about the Date column for training purposes, since the data points are evenly spaced. We can write a simple function to convert our single column of data into a two-column dataset: the first column containing today’s (t) Bitcoin price and the second column containing tomorrow’s (t+1) Bitcoin price, which is what we want to predict.

LSTM Network to predict Cryptocurrency price

Now let’s begin coding. First, import all of the functions and classes we intend to use. We will be using NumPy for mathematical operations, pandas to read the csv, scikit-learn for data preprocessing, and Keras with a TensorFlow backend as our deep learning library.

import numpy as np
import matplotlib.pyplot as plt
from pandas import read_csv
from keras.models import Sequential, load_model
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import math

Add the following code to hide TensorFlow’s runtime log messages:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 

Before we do anything, it is a good idea to fix the random number seed to ensure our results are reproducible.

# fix random seed for reproducibility
np.random.seed(36)

Now, let’s load the dataset as a pandas dataframe and drop the unnecessary columns. The rows are reversed so they run in chronological order (coinmarketcap lists the most recent day first). We can then extract the NumPy array from the dataframe and convert the integer values to floating point values, which are more suitable for modeling with a neural network.

df = read_csv('./data/all_bitcoin.csv')
df = df.iloc[::-1]   # reverse the rows so the data runs in chronological order
df = df.drop(['Date','Open','High','Low','Volume','Market Cap'], axis=1)
dataset = df.values
dataset = dataset.astype('float32')
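As a quick sanity check (the remaining column name, e.g. Close, depends on how the CSV was saved), you can peek at what is left after dropping the columns:

# quick sanity check: only the closing price column should remain
# (the exact column name, e.g. 'Close', depends on how you saved the csv)
print(df.columns.tolist())
print(dataset[:3])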

LSTMs are sensitive to the scale of the input data. It can be a good practice to rescale the data to the range of 0-to-1, also called normalizing. We can easily normalize the dataset using the MinMaxScaler preprocessing class from the scikit-learn library.

# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
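If you are curious what the scaler actually computes, here is a minimal sketch of the equivalent min-max formula on a toy array (made-up numbers, not Bitcoin prices):

# min-max scaling by hand: x' = (x - min) / (max - min)
toy = np.array([[100.0], [150.0], [200.0]])
manual = (toy - toy.min()) / (toy.max() - toy.min())
print(manual.ravel())                                                  # [0.  0.5 1. ]
print(MinMaxScaler(feature_range=(0, 1)).fit_transform(toy).ravel())   # same values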

Now we can define a function to create a new dataset, as described above.

The function takes one argument: the dataset, a NumPy array of closing prices. It returns two arrays, where X is the closing price of Bitcoin at a given time (t) and Y is the closing price of Bitcoin at the next time step (t+1).

# convert an array of values into a dataset matrix
def create_dataset(dataset):
  dataX, dataY = [], []
  for i in range(len(dataset)-1):
    dataX.append(dataset[i])
    dataY.append(dataset[i + 1])
  return np.asarray(dataX), np.asarray(dataY)
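As a quick check, here is what the function produces on a tiny made-up array:

# toy example (hypothetical values): X holds today's price, Y holds tomorrow's
toy = np.array([[0.1], [0.2], [0.3], [0.4]])
X_toy, y_toy = create_dataset(toy)
print(X_toy.ravel())  # [0.1 0.2 0.3]
print(y_toy.ravel())  # [0.2 0.3 0.4]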

To split the data into training and testing sets, we can use the train_test_split function from the scikit-learn library. First we build X and y with the create_dataset() function defined above, then take the first 80% of the data as the training set and the remaining 20% for testing the model. Passing shuffle=False keeps the chronological order intact.

# prepare the X and Y labels
X, y = create_dataset(dataset)
# Take the first 80% of data as the training sample and the remaining 20% as the testing sample
trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.20, shuffle=False)
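If you prefer not to rely on train_test_split for this step, the same chronological split can be written by hand; a minimal sketch (the exact split index may differ from scikit-learn's by one sample because of rounding):

# equivalent manual 80/20 split that preserves chronological order
train_size = int(len(X) * 0.80)
trainX, testX = X[:train_size], X[train_size:]
trainY, testY = y[:train_size], y[train_size:]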

The shape of the input data (X) for the LSTM network should be specifically in the form of [samples, time steps, features].

Currently, our data is in the form: [samples, features] and we are framing the problem as one time step for each sample. We can transform the prepared train and test input data into the expected structure using np.reshape() as follows:

# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
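It is worth verifying the shapes before building the network; with a single feature and a single time step, both arrays should end in (..., 1, 1):

# sanity check: expected shapes are (num_samples, 1, 1)
print(trainX.shape, testX.shape)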

We are now ready to design the LSTM model for our problem. The network has an input layer, a hidden layer with 4 LSTM blocks or neurons, and an output layer that makes a single value prediction. The LSTM layer uses the tanh activation function by default. We train the network for 5 epochs with a batch size of 1. Once trained, you can save the model using the model.save method, and later load a pre-trained model with load_model.

# create and fit the LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=5, batch_size=1, verbose=2)

#save model for later use
model.save('./savedModel')

#load_model
# model = load_model('./savedModel')

We can estimate the performance of the model on the train and test datasets once the model is fit. This will give us a point of comparison for new models.

We invert the predictions before calculating error scores to ensure that performance is reported in the same units as the original data.

# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)

# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform(trainY)
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform(testY)

# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(trainY[:,0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(testY[:,0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
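As a simple point of reference, you can also score a naive persistence baseline that just predicts tomorrow's price to be today's price; if the LSTM's test RMSE is not meaningfully lower than this, the network is not adding much. A minimal sketch, reusing the arrays from above:

# naive persistence baseline: predict y(t+1) = x(t)
persistence = scaler.inverse_transform(testX.reshape(-1, 1))
persistenceScore = math.sqrt(mean_squared_error(testY[:,0], persistence[:,0]))
print('Persistence baseline: %.2f RMSE' % (persistenceScore))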

Because each prediction corresponds to the next day (t+1), we need to shift the predictions so that they align with the original data on the x-axis. Then we plot the prepared data: the original dataset is shown in blue, the predictions for the training dataset in orange, and the predictions on the unseen test dataset in green.

# shift train predictions for plotting
trainPredictPlot = np.empty_like(dataset)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[1:len(trainPredict)+1, :] = trainPredict

# shift test predictions for plotting
testPredictPlot = np.empty_like(dataset)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+1:len(dataset), :] = testPredict

# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()

We can see the model did an excellent job of fitting both the training and the testing datasets.

[Chart: Bitcoin price with train and test predictions]

Now let’s predict the Cryptocurrency price for tomorrow. We can use the model we trained earlier and feed it today’s price, which is the last row of the test dataset. Keep in mind that the model was trained on normalized values, so whatever we feed in has to be scaled back into the 0-1 range first.

print("Predicted prices for the last 5 days: ")
print(testPredict[-5:])

# the model expects normalized input, so scale the last predicted price back to the 0-1 range
lastPrice = scaler.transform(testPredict[-1:])
futurePredict = model.predict(lastPrice.reshape(1, 1, 1))
futurePredict = scaler.inverse_transform(futurePredict)
print("Bitcoin price for tomorrow: ", futurePredict)
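Note that testPredict[-1] is itself a model output; if you would rather feed the actual last closing price from the dataset, the idea is the same, just re-normalize it first. A minimal sketch:

# feed the actual last closing price from the test set instead of a predicted one
lastActual = scaler.transform(testY[-1:])            # testY was inverse-transformed above
tomorrowFromActual = model.predict(lastActual.reshape(1, 1, 1))
print("Prediction from the actual last close: ", scaler.inverse_transform(tomorrowFromActual))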

According to our model, the closing price of Bitcoin on Jan 2, 2018 should be about 15076.88, which isn’t far off from the actual closing price of 14982.10. Our model correctly predicted a healthy rise in the price.

[Screenshot: model output with the predicted price for tomorrow]

For completeness, below is the entire code example to predict Cryptocurrency price:

# LSTM for closing bitcoin price with regression framing
import numpy as np
import matplotlib.pyplot as plt
from pandas import read_csv
from keras.models import Sequential, load_model
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import math

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 

# convert an array of values into a dataset matrix
def create_dataset(dataset):
  dataX, dataY = [], []
  for i in range(len(dataset)-1):
    dataX.append(dataset[i])
    dataY.append(dataset[i + 1])
  return np.asarray(dataX), np.asarray(dataY)

# fix random seed for reproducibility
np.random.seed(7)

# load the dataset
df = read_csv('./data/all_bitcoin.csv')
df = df.iloc[::-1]   # reverse the rows so the data runs in chronological order
df = df.drop(['Date','Open','High','Low','Volume','Market Cap'], axis=1)
dataset = df.values
dataset = dataset.astype('float32')

# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)

#prepare the X and Y label
X,y = create_dataset(dataset)

#Take 80% of data as the training sample and 20% as testing sample
trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.20, shuffle=False)

# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))

# create and fit the LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=5, batch_size=1, verbose=2)

# save model for later use
model.save('./savedModel')

# ...or load a previously trained model instead of re-training
# model = load_model('./savedModel')

# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)

# predict tomorrow's price from the last test prediction (still in the normalized 0-1 range here)
futurePredict = model.predict(np.asarray([[testPredict[-1]]]))
futurePredict = scaler.inverse_transform(futurePredict)

# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform(trainY)
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform(testY)

print("Predicted prices for the last 5 days: ")
print(testPredict[-5:])
print("Bitcoin price for tomorrow: ", futurePredict)

# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(trainY[:,0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(testY[:,0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))

# shift train predictions for plotting
trainPredictPlot = np.empty_like(dataset)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[1:len(trainPredict)+1, :] = trainPredict

# shift test predictions for plotting
testPredictPlot = np.empty_like(dataset)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+1:len(dataset), :] = testPredict

# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()


6 Comments on How to predict Cryptocurrency price using LSTM Recurrent Neural Networks in Python

  1. Thanks for a wonderful tutorial. And the simple step-by-step explanation is so easy to understand. One query: in the code to predict tomorrow’s price, we have to load the saved model first, right?

    • If you just completed the model.fit step, you don’t need to load the model. However, if you trained separately and want to reuse the same model file, you need to load it. Also, to predict tomorrow’s price, you need to provide the price up to today as the input data.

  2. Thank you for the tutorial, I have one question about the out-of-sample prediction (the Bitcoin price for tomorrow). The section that prints the prices for the last 5 days doesn’t appear to be correct. I’m trying to find any of those prices in the csv file, but they aren’t there, so they do not appear to be the correct prices for the last 5 days. Am I missing something?

    • Can you please provide a link to the csv file you are using? The code normalizes the provided dataset and, before printing the prices for the last 5 days, inverse-transforms the normalized predictions, which should give values approximately equal to the ones in the last 5 rows of the dataset.

  3. https://file.io/yZIiBN
    https://file.io/nRMnes
    Here’s the csv file and a screenshot of an edited version of your code. Instead of using a CSV, I’m calling an API to load the data. I can upload the whole code if you want, too. Either way, I’m having a hard time reconciling the last 5 days’ prices back to any of the input datasets. Thank you for your help researching!

    • At the beginning of the code, we use MinMaxScaler from the scikit-learn library, which transforms features by scaling each feature to a given range. We train the RNN on data scaled to the 0-1 range and later inverse-transform the results to get back to the original scale. Keep in mind that the values printed for the last 5 days are the model’s predictions (inverse-transformed back to dollar amounts), not the raw closing prices, so they will only approximate the values in your csv rather than match them exactly. Your output seems reasonable for the input csv file used. I hope that clears up your query. Glad to help! Please let me know if there’s more.
