This is going to be a post on how to predict Cryptocurrency price using LSTM Recurrent Neural Networks in Python. Using this tutorial, you can predict the price of any cryptocurrency, be it **Bitcoin, Ethereum, IOTA, Cardano, Ripple** or any other.

### What are LSTMs?

LSTMs are a special kind of RNN, capable of learning long-term dependencies. They were introduced by Hochreiter & Schmidhuber (1997) and are very popular for working with sequential data such as text and time series. For a deeper understanding of RNNs and LSTMs, I highly recommend you check out The Unreasonable Effectiveness of Recurrent Neural Networks by **Andrej Karpathy** and Understanding LSTM Networks by **Christopher Olah**.

### Why the interest to predict Cryptocurrency price?

The first Cryptocurrency ever was **Bitcoin**, created in 2009 by an unknown person using the name Satoshi Nakamoto. Only 8 years after the first cryptocurrency showed up, according to Wikipedia, the *number of cryptocurrencies* available over the internet as of 31 December 2017 was over 1,381 and growing. And according to a post on dailyforexreport.com, the total cryptocurrency market grew by 1600% in 2017 alone. Evidently, there are people making a fortune trading cryptocurrency. Unsurprisingly, **predicting cryptocurrency prices** is one of the most researched topics today.

Let me be clear up front: even though this post is titled "how to predict Cryptocurrency price," it is not financial advice. This tutorial is for educational purposes only. Even though the LSTM model works fairly well on test data, many factors affect the price of a cryptocurrency, such as public perception of its value, media coverage, investor behavior, and scams, which we won't take into account here. I am not responsible for any money you lose. (However, I will gladly accept a share of the profits if you insist.) Anyway, let's get started.

### Predict Bitcoin’s price using Neural Network

We are going to use Bitcoin as our cryptocurrency of choice. It has a market cap of over 249 billion dollars as of this writing. You can find historical price data for Bitcoin on coinmarketcap's site here. I simply copy/pasted the data from there and saved the file as all_bitcoin.csv. If you look at the data, it should look something like this:

If you plot the date vs Closing value of bitcoin, you should see a chart that looks somewhat like this:

### Problem Definition:

We can phrase our problem as a regression problem.

That is, given the value of bitcoin today, what is the value of bitcoin tomorrow?
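To make the framing concrete, here is a toy example (the prices are made up): each input X is a day's closing price and the target y is the next day's closing price.

```python
import numpy as np

# made-up closing prices for five consecutive days
prices = np.array([100.0, 110.0, 105.0, 120.0, 118.0])

X = prices[:-1]  # today's price:    [100, 110, 105, 120]
y = prices[1:]   # tomorrow's price: [110, 105, 120, 118]
```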

### Data Preparation

Here, we don’t need the Open, High, Low, Volume and Market Cap columns. We also don’t need to worry about the Date column for training purposes, since the data points are evenly spaced. We can write a simple function to convert our single column of data into a two-column dataset: the first column containing today’s (t) Bitcoin price and the second column containing tomorrow’s (t+1) Bitcoin price, to be predicted.

### LSTM Network to predict Cryptocurrency price

Now let’s begin coding. First, import all of the functions and classes we intend to use. We will be using NumPy for mathematical operations, pandas to work with the csv, scikit-learn for data preprocessing, and Keras with the TensorFlow backend as our deep learning library.

```python
import numpy as np
import matplotlib.pyplot as plt
from pandas import read_csv
from keras.models import Sequential, load_model
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import math
```

Add the following code to hide any TensorFlow runtime warnings:

```python
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
```

Before we do anything, it is a good idea to fix the random number seed to ensure our results are reproducible.

```python
# fix random seed for reproducibility
np.random.seed(36)
```

Now, let’s load the dataset as a pandas dataframe and drop the unnecessary columns. We can then extract the NumPy array from the dataframe and convert the integer values to floating point values, which are more suitable for modeling with a neural network.

```python
df = read_csv('./data/all_bitcoin.csv')
df = df.iloc[::-1]  # coinmarketcap lists newest first, so reverse to chronological order
df = df.drop(['Date', 'Open', 'High', 'Low', 'Volume', 'Market Cap'], axis=1)
dataset = df.values
dataset = dataset.astype('float32')
```

LSTMs are sensitive to the scale of the input data. It is good practice to rescale the data to the range of 0-to-1, also called normalizing. We can easily normalize the dataset using the **MinMaxScaler** preprocessing class from the scikit-learn library.

```python
# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
```

Now we can define a function to create a new dataset, as described above.

The function takes one argument: the **dataset**, a NumPy array of closing prices. It returns a dataset where X is the closing price of Bitcoin at a given time (t) and Y is the closing price of Bitcoin at the next time (t+1).

```python
# convert an array of values into a dataset matrix
def create_dataset(dataset):
    dataX, dataY = [], []
    for i in range(len(dataset)-1):
        dataX.append(dataset[i])
        dataY.append(dataset[i + 1])
    return np.asarray(dataX), np.asarray(dataY)
```

To split the data into training and testing sets, we can use train_test_split from the scikit-learn library. Let’s take the first 80% of the data as the training dataset and the remaining 20% for testing the model. Since this is time-series data, we pass shuffle=False so the split stays chronological.

```python
# prepare the X and y labels
X, y = create_dataset(dataset)

# take the first 80% of data as the training sample and the last 20% as the testing sample
trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.20, shuffle=False)
```
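A quick sanity check on a toy array shows why shuffle=False matters here: it keeps the split chronological, so the test set is the most recent 20% of days rather than a random sample.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# toy 'daily prices' 0..9 standing in for ten consecutive days
X = np.arange(10).reshape(-1, 1)
y = np.arange(1, 11).reshape(-1, 1)

trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.20, shuffle=False)
# trainX holds days 0-7 in order; testX holds days 8-9
```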

The shape of the input data (X) for the LSTM network should be specifically in the form of *[samples, time steps, features]*.

Currently, our data is in the form: [*samples, features*] and we are framing the problem as one time step for each sample. We can transform the prepared train and test input data into the expected structure using **np.reshape()** as follows:

```python
# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
```
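On a toy array, the reshape looks like this: each sample becomes a sequence of one time step with one feature.

```python
import numpy as np

# five samples with one feature each
X = np.arange(5, dtype=np.float32).reshape(-1, 1)   # shape (5, 1)

# -> five samples, one time step, one feature
X3d = np.reshape(X, (X.shape[0], 1, X.shape[1]))    # shape (5, 1, 1)
```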

We are now ready to design the LSTM model for our problem. The network has an input layer, a hidden layer with 4 LSTM blocks or neurons, and an output layer that makes a single value prediction. The LSTM blocks use the tanh activation function by default (with sigmoid activations for the gates). We train the network for 5 epochs and use a batch size of 1. Once trained, you can save the model using the model.save method. You can also use the load_model method to load a pre-trained model.

```python
# create and fit the LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=5, batch_size=1, verbose=2)

# save the model for later use
model.save('./savedModel')

# load a previously saved model
# model = load_model('./savedModel')
```

We can estimate the performance of the model on the train and test datasets once the model is fit. This will give us a point of comparison for new models.

We invert the predictions before calculating error scores to ensure that performance is reported in the same units as the original data.

```python
# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)

# invert predictions back to the original price scale
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform(trainY)
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform(testY)

# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(trainY[:, 0], trainPredict[:, 0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(testY[:, 0], testPredict[:, 0]))
print('Test Score: %.2f RMSE' % (testScore))
```

We need to shift the predictions so that they align with the original data on the x-axis. Then we plot the prepared data. The original dataset is shown in blue, the predictions for the training dataset in orange, and the predictions on the unseen test dataset in green.

```python
# shift train predictions for plotting
trainPredictPlot = np.empty_like(dataset)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[1:len(trainPredict)+1, :] = trainPredict

# shift test predictions for plotting
testPredictPlot = np.empty_like(dataset)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict):len(dataset)-1, :] = testPredict

# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()
```

We can see the model did an excellent job of fitting both the training and testing datasets.

Now let’s predict Cryptocurrency price for tomorrow. We can use the model we trained earlier and pass today’s price as the input parameter. Today’s price will be the last row in the test dataset.

```python
print("Price for last 5 days: ")
print(testPredict[-5:])

# testPredict was inverse transformed above, so scale the last price back to 0-1
# before feeding it to the model, then invert the resulting prediction
lastPrice = scaler.transform(testPredict[-1].reshape(1, 1))
futurePredict = model.predict(lastPrice.reshape(1, 1, 1))
futurePredict = scaler.inverse_transform(futurePredict)
print("Bitcoin price for tomorrow: ", futurePredict)
```

According to our model, the **closing price of Bitcoin** on Jan 2, 2018 should be 15076.88476562, which isn’t far off from the actual closing price of 14982.10. Our model correctly predicted a healthy rise in the price.
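As a quick sanity check on that claim, the prediction is within about 0.6% of the actual close:

```python
predicted = 15076.88476562   # model output quoted above
actual = 14982.10            # actual closing price on Jan 2, 2018

rel_error = abs(predicted - actual) / actual
print("relative error: %.2f%%" % (rel_error * 100))  # about 0.63%
```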

For completeness, below is the entire code example to predict Cryptocurrency price:

```python
# LSTM for closing bitcoin price with regression framing
import numpy as np
import matplotlib.pyplot as plt
from pandas import read_csv
from keras.models import Sequential, load_model
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import math
import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

# convert an array of values into a dataset matrix
def create_dataset(dataset):
    dataX, dataY = [], []
    for i in range(len(dataset)-1):
        dataX.append(dataset[i])
        dataY.append(dataset[i + 1])
    return np.asarray(dataX), np.asarray(dataY)

# fix random seed for reproducibility
np.random.seed(7)

# load the dataset
df = read_csv('./data/all_bitcoin.csv')
df = df.iloc[::-1]
df = df.drop(['Date', 'Open', 'High', 'Low', 'Volume', 'Market Cap'], axis=1)
dataset = df.values
dataset = dataset.astype('float32')

# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)

# prepare the X and y labels
X, y = create_dataset(dataset)

# take the first 80% of data as the training sample and the last 20% as the testing sample
trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.20, shuffle=False)

# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))

# create and fit the LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=5, batch_size=1, verbose=2)

# save the model for later use
model.save('./savedModel')

# or load a previously trained model instead of fitting again:
# model = load_model('./savedModel')

# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
futurePredict = model.predict(np.asarray([[testPredict[-1]]]))
futurePredict = scaler.inverse_transform(futurePredict)

# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform(trainY)
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform(testY)

print("Price for last 5 days: ")
print(testPredict[-5:])
print("Bitcoin price for tomorrow: ", futurePredict)

# calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(trainY[:, 0], trainPredict[:, 0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(testY[:, 0], testPredict[:, 0]))
print('Test Score: %.2f RMSE' % (testScore))

# shift train predictions for plotting
trainPredictPlot = np.empty_like(dataset)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[1:len(trainPredict)+1, :] = trainPredict

# shift test predictions for plotting
testPredictPlot = np.empty_like(dataset)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict):len(dataset)-1, :] = testPredict

# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()
```

### Comments

Thanks for a wonderful tutorial, and the simple step-by-step explanation is so easy to understand. One query: in the code to predict tomorrow’s price, we have to load the saved model first, right?

If you just completed the model.fit step, you don’t need to load the model. However, if you trained separately and want to reuse the same model file, you need to load it. Also, to predict tomorrow’s price, you need to provide the price up to today as the input data.

Thank you for the tutorial; I have one question about the out-of-sample prediction (“Bitcoin price for tomorrow”). The section that prints the prices for the last 5 days doesn’t appear to be correct. I’m trying to find any of those prices in the csv file, but they aren’t there, so they do not appear to be the correct prices for the last 5 days. Am I missing something?

Can you please provide a link to the csv file you are using? The model should normalize the provided dataset and before printing the price for last 5 days, inverse transforms the normalized list to give values approximately equal to the ones in the last 5 rows of the dataset.

https://file.io/yZIiBN

https://file.io/nRMnes

Here’s the csv file and a screenshot of an edited version of your code. Instead of using a CSV I’m calling an API to load the data. I can upload the whole code if you want too. Either way, I’m having a hard time reconciling the last 5 days prices back to any of the input datasets. Thank you for your help researching!

At the beginning of our code, we use MinMaxScaler from the scikit-learn library, which transforms features by scaling each one to a given range. We train the RNN on data scaled to the range 0 to 1 and later inverse transform the results to get back to the original scale. Keep in mind that the printed values are the model’s *predictions* after the inverse transform, not the raw closing prices, so they will only approximately match the last 5 rows of the dataset. Your output seems reasonable for the input csv file used. I hope that cleared up your query. Glad to help! Please let me know if there’s more.
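To make the transform/inverse-transform behaviour concrete, here is a quick check on a toy array: MinMaxScaler's round trip recovers the original values to floating-point precision, so any gap between the printed prices and the CSV comes from the values being model predictions rather than from the scaling itself.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# toy column of closing prices
prices = np.array([[100.0], [110.0], [105.0], [120.0]])

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(prices)         # values rescaled into [0, 1]
recovered = scaler.inverse_transform(scaled)  # back to the original price scale
```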