## Stock market prediction using Python – Part III

**Overview**

Machine learning is a fast-moving field, and new algorithms appear all the time. Many of them solve regression problems, and some are specialized for time series prediction. A time series is a dataset in which a single variable is recorded against time. In this quest to predict the future of the stock market we will start with a plain time series: the closing price by date. Eventually we will add more dimensions to the dataset, but for now we keep things simple and use time alone.

Prophet is open source software released by Facebook’s Core Data Science team. It is available for download on CRAN and PyPI. Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. It provides us with the ability to make time series predictions with good accuracy using simple intuitive parameters.
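As a rough summary of the additive model described above (the notation follows the Prophet paper and documentation and is not used elsewhere in this article), the forecast is a sum of separate components:

$$
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
$$

Here $g(t)$ is the trend, $s(t)$ collects the seasonal effects (yearly, weekly, daily), $h(t)$ is the holiday effect, and $\varepsilon_t$ is the error term that the model does not explain.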

In the previous articles on linear and polynomial regression we saw the dismal predictions made by those two algorithms. Linear regression predicted close to 0% CAGR over the next 10 years, while polynomial regression predicted a 3.97% CAGR from the stock market over the same period.

Only someone who understands the system can predict its future. Both of the previously discussed regression techniques were too simple for a system as complex as a stock market. Now let’s see how Prophet performs.

The first thing we need is the historical data. We can download it from the BSE India Archives. I have downloaded the data from 1989 to the current date. The downloaded data sheet has five columns: Date, Open, High, Low and Close. For simplicity we delete the Open, High and Low columns and keep only Date and Close.
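The column trimming described above can also be done while loading the file instead of editing the sheet by hand. The following is a minimal sketch, assuming the CSV uses exactly the five column names listed in the text; the sample rows are made-up placeholders, not real Sensex values.

```python
import io
import pandas as pd

# Hypothetical sample in the layout described above (values are placeholders).
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close\n"
    "2019-01-01,100.0,110.0,95.0,105.0\n"
    "2019-01-02,105.0,112.0,101.0,108.0\n"
    "2019-01-03,108.0,109.0,99.0,102.0\n"
)

# usecols keeps only the two columns we need; parse_dates converts Date to datetime.
df = pd.read_csv(sample_csv, usecols=["Date", "Close"], parse_dates=["Date"])

# Prophet expects the columns to be called ds (date) and y (value).
df = df.rename(columns={"Date": "ds", "Close": "y"})
print(df)
```

With a real download, the `io.StringIO` object would be replaced by the path to the CSV file, and the column names may need adjusting to match the actual header.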

Following is the complete Python code. The comments explain each step. Copy it into your Python IDE and run it to see the predictions as a graph.

```
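# import packages
import pandas as pd
# the package was renamed from fbprophet to prophet in version 1.0
try:
    from prophet import Prophet
except ImportError:
    from fbprophet import Prophet
import matplotlib.pyplot as plt
# function to calculate compound annual growth rate
def CAGR(first, last, periods):
    return ((last/first)**(1/periods)-1) * 100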
# comment either of the following two blocks.
# Block 1 for BSE SENSEX
################################################################################
# read the Indian BSE data file
df = pd.read_csv('D:\\python3\\data\\SensexHistoricalData.csv')
################################################################################
# Block 2 for DOW JONES
################################################################################
## read the US Dow Jones data file
#df_i = pd.read_csv('D:\\python3\\data\\DowJonesHistoricalPrices.csv')
## Dow jones data is in reverse order, i.e. from current date to the past dates.
## We need to correct this before proceeding
#df_i['Date'] = pd.to_datetime(df_i.Date)
#df = df_i.iloc[::-1]
################################################################################
# preparing data. Prophet only understands y and ds columns. Hence we need to rename
# our data frame columns
df.rename(columns={'Close': 'y', 'Date': 'ds'}, inplace=True)
# Model initialization. Create an object of class Prophet.
model = Prophet()
# Fit the data(train the model)
model.fit(df)
# Create a future data frame of future dates. Here 3650 is approximate number of days in 10 yrs time frame.
future = model.make_future_dataframe(periods=3650)
# Prediction for future dates.
forecast = model.predict(future)
# forecast has a number of columns. In this exercise we are considering only two of them.
# ds is the date column and yhat is the predicted value.
forecast_valid = forecast[['ds','yhat']].copy()
forecast_valid.rename(columns={'yhat': 'y'}, inplace=True)
# print the last predicted value (a single number, not a one-row Series)
print("Closing price at 2029 would be around", forecast_valid['y'].iloc[-1])
# print CAGR for next ten years.
print('Your investments will have a CAGR of', CAGR(df['y'].iloc[-1], forecast_valid['y'].iloc[-1], 10), '%')
# create a date index for input data frame.
df['Date'] = pd.to_datetime(df.ds)
df.index = df['Date']
# Create a date index for forecast data frame.
forecast_valid['Date'] = pd.to_datetime(forecast_valid.ds)
forecast_valid.index = forecast_valid['Date']
# plot the actual data
plt.figure(figsize=(16,8))
plt.plot(df['y'], label='Close Price History')
# plot the prophet predictions
plt.plot(forecast_valid[['y']], label='Future Predictions')
#set the title of the graph
plt.suptitle('Stock Market Predictions "Bse Sensex"', fontsize=16)
# set the title of the graph window
# (canvas.set_window_title was removed in Matplotlib 3.6; the manager method replaces it)
fig = plt.gcf()
fig.canvas.manager.set_window_title('Stock Market Predictions')
#display the legends
plt.legend()
#display the graph
plt.show()
```

**How to read the above image?**

The X-axis of the graph shows the dates from 1989 to 2029 and the Y-axis shows the market closing price.

The blue line displays the close price history from 1989 to 2019. The orange line represents Prophet’s best-fit model from 1989 to 2019 and its future predictions from 2019 to 2029.

**What are the predictions for BSE Sensex?**

- Market closing price in 2029 would be around **66612**.
- This corresponds to **5.46%** year-on-year growth (CAGR) over the next ten years.
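The two figures above are tied together by the CAGR formula used in the script, so one can be cross-checked against the other. The sketch below is only an arithmetic check: it inverts the formula to recover the starting close that the quoted numbers imply (a little over 39,000), and makes no new prediction. The variable names are illustrative.

```python
def cagr(first, last, periods):
    """Compound annual growth rate in percent."""
    return ((last / first) ** (1 / periods) - 1) * 100

predicted_close = 66612   # forecast for 2029 quoted above
quoted_cagr = 5.46        # percent per year, quoted above
years = 10

# Invert the CAGR formula to recover the starting value the two figures imply.
implied_start = predicted_close / (1 + quoted_cagr / 100) ** years
print(f"Implied starting close: {implied_start:.0f}")

# Round trip: feeding the implied start back in should reproduce the quoted rate.
print(f"CAGR check: {cagr(implied_start, predicted_close, years):.2f} %")
```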

**Prediction graph analysis:**

In the graph you can see that the orange line stays close to the blue line, which shows that Prophet fits the market data better than linear and polynomial regression did. However, its predictions are linear, just like those of linear regression! How can that be?

Is there anything wrong with the data we are using? Or is the algorithm itself incapable? Let’s find out by changing the data. Let’s use data with more ups and downs: the Dow Jones. We can download it from https://quotes.wsj.com/index/DJIA/historical-prices. I have downloaded the data from 1971 to the current date. The downloaded data sheet has five columns: Date, Open, High, Low and Close. For simplicity we delete the Open, High and Low columns and keep only Date and Close.
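The full script notes that the Dow Jones file lists the newest dates first and reverses it with `iloc[::-1]`. An alternative that does not depend on the file's row order is to parse the dates and sort on them. This is a minimal sketch with made-up rows; the `MM/DD/YY` date format is an assumption about the downloaded file and should be checked against the real data.

```python
import io
import pandas as pd

# Hypothetical rows in newest-first order (values are placeholders).
sample_csv = io.StringIO(
    "Date,Close\n"
    "01/04/19,103.0\n"
    "01/03/19,101.0\n"
    "01/02/19,102.0\n"
)
df = pd.read_csv(sample_csv)

# Parse the dates explicitly; the MM/DD/YY format is an assumption about the file.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%y")

# Sort oldest-first and rebuild the index so positions match chronological order.
df = df.sort_values("Date").reset_index(drop=True)
print(df)
```

Sorting keeps `df['y'].iloc[-1]` pointing at the most recent close, which is what the CAGR calculation in the script relies on.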

**How to read the above image?**

The X-axis of the graph shows the dates from 1971 to 2029 and the Y-axis shows the market closing price.

The blue line displays the close price history from 1971 to 2019. The orange line represents Prophet’s best-fit model from 1971 to 2019 and its future predictions from 2019 to 2029.

**What are the predictions for Dow Jones?**

- Market closing price in 2029 would be around **42470**.
- This corresponds to **4.56%** year-on-year growth (CAGR) over the next ten years.

**Prediction graph analysis:**

The predictions for the Dow Jones are along similar lines to the BSE Sensex predictions. The algorithm fits the training data well, but its forecast is linear. With the default settings used here, this is the nature of the algorithm: Prophet models the trend as piecewise linear and continues the last fitted slope into the future. It might be useful for solving other problems, but not the one we are trying it on.
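Why a curve that bends through the history turns into a straight line in the future can be illustrated with a small NumPy sketch of a piecewise linear trend. This is not Prophet's own code; the function name and all parameter values are hypothetical and chosen only for illustration. The slope changes at each changepoint in the history, and since there are no changepoints in the forecast horizon, the slope there is constant.

```python
import numpy as np

def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend with base slope k, offset m, and a slope change delta at each changepoint."""
    t = np.asarray(t, dtype=float)
    trend = k * t + m
    for s, delta in zip(changepoints, deltas):
        # After changepoint s the slope changes by delta; before it nothing is added.
        trend = trend + delta * np.maximum(0.0, t - s)
    return trend

# Hypothetical parameters, chosen only for illustration.
k, m = 1.0, 10.0
changepoints = [20.0, 50.0, 80.0]
deltas = [0.5, -1.2, 0.9]

history = np.arange(0, 100)      # period that contains the changepoints
future = np.arange(100, 150)     # forecast horizon, beyond the last changepoint

hist_trend = piecewise_linear_trend(history, k, m, changepoints, deltas)
future_trend = piecewise_linear_trend(future, k, m, changepoints, deltas)

# In the history the step-to-step slope takes several different values.
print("Distinct slopes in history:", np.unique(np.round(np.diff(hist_trend), 6)))

# Beyond the last changepoint the slope is constant: k plus the sum of all deltas.
final_slope = k + sum(deltas)
print("Slopes in forecast horizon:", np.unique(np.round(np.diff(future_trend), 6)))
```

Seasonality adds a repeating wiggle on top of this line, but the long-run direction of the forecast is set by that single final slope.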

This prophet has deeply disappointed me. Like all of mankind, I had high hopes for the prophet. We all know that **it** is coming! I hoped this prophet would show me a glimpse of **it** in its predictions. However, its linear predictions are as insipid as the teachings of all the previous prophets. “SABUN KI SHAKAL MEIN YE TO NIKALA KEVAL JHAAG, JHAAG, JHAAG. BHAAG!” (roughly: “It looked like soap, but it turned out to be nothing but foam, foam, foam. Run!”) Please ignore that last line; I could not control my emotions.

These linear predictions might help middle-class people who keep accumulating mutual funds throughout their lifetime. But for the legends who enjoy surfing the stock market waves, these predictions are of no use. They want something real, something factual.

The stock market is a complex system to crack. It has everything in it: money, power, happiness, fear, emotion, speculation and so on. It cannot be understood with the simple regression techniques we have tried so far. To crack this riddle we need something more complex, something more powerful, something similar to the human brain! We need neurons; we need a neural network. We need a memory-based model to make future predictions.

We all know that **it** is coming; the only question is “What is the date?” Can we answer this question using neural network algorithms? Can we solve this puzzle using just one dimension, time, or do we need to consider other dimensions as well? The answers to all these questions are coming up in my next article, “The Next Recession”. Till then, happy predicting!

**References:**

https://facebook.github.io/prophet/

Also read: Stock market prediction – Part II and Stock market prediction using linear regression