Transcript for:
Exploring Neural Networks in Trading

Machine learning and AI have taken over many aspects of our lives, and they now influence many decision-making fields, including trading and investing. So today we're going to discuss neural networks in trading and how we can create efficient strategies like this one, which can beat the S&P 500 buy-and-hold returns. You can pretty much create your own strategies using neural networks, or even improve your existing strategies by adding a little bit of machine learning and AI.

So let's start with what neural networks trading is. Before that: this is the third video in this series on our YouTube channel. A few weeks ago we discussed decision tree models, and a few months ago we discussed regression as well. Neural networks are a complicated topic for anybody to understand and learn, which is why I delayed this video this long, so I'll try to make it as simple as possible for you.

A neural network works just like your brain functions. Your brain has neurons, so when you see a number like three, the brain looks at the number, identifies the curves and so on, passes the signals along, and makes a decision: hey, it's a three. That's pretty much what a neural network, the machine learning model, does. When you give it an image of a cat, the neural network can identify that it's a cat through different kinds of training regimes.

There are different kinds of neural networks, and the most basic one is the feed-forward neural network, which is used for things like image classification. Let's do that example here. Assume you feed in an image of a cat. There is the input layer, which takes the cat picture, and then there are hidden layers, and each hidden layer carries certain weights. Those weights are initially set more or less at random, and the network identifies the image as something. Let's assume the initial identification is a cheetah. Obviously there's an error, because this is not a cheetah, it's a cat. So the network creates an error term, goes back through the network, and adjusts the weights: there are weights everywhere, and each one gets adjusted. Then it goes forward again and decides whether it's a cat or a mouse or a cheetah, and if there's another error it creates a new error term and goes back again, repeating the whole process over and over until it finally gets the fact that it's a cat. Then you've really got a neural network model.

What I just described is a simple example of a feed-forward neural network. The initial nodes are the input layer, so I'll mark them as I; then there's the hidden layer; and finally there's the output layer. The number of times this process runs over and over is called the iterations, or in other words the epochs. In this example we took a picture of a cat and finally concluded that, hey, this is a cat, and that whole model is called a feed-forward neural network.
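To make that loop concrete, here is a minimal sketch of a one-hidden-layer feed-forward network in NumPy, with random starting weights, a forward pass, an error term, and a gradient-descent weight update. The toy data, the layer sizes, and names like `W1` and `W2` are my own illustration, not the strategy's code:

```python
import numpy as np

# Minimal feed-forward network: one hidden layer, random initial weights,
# forward pass, error term, and a gradient-descent weight update.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))             # 8 samples, 4 input features
y = (X.sum(axis=1) > 0).astype(float)   # toy binary target

W1 = rng.normal(scale=0.1, size=(4, 5))   # input -> hidden weights
b1 = np.zeros(5)
W2 = rng.normal(scale=0.1, size=(5, 1))   # hidden -> output weights
b2 = np.zeros(1)
lr = 0.1                                  # learning rate

for epoch in range(100):                  # each loop is one epoch
    # forward pass
    h = np.tanh(X @ W1 + b1)                  # hidden activations
    out = 1 / (1 + np.exp(-(h @ W2 + b2)))    # sigmoid output
    err = out.ravel() - y                     # error term

    # backward pass: push the error back and adjust every weight
    d_out = err[:, None] * out * (1 - out)
    dW2 = h.T @ d_out
    d_h = (d_out @ W2.T) * (1 - h ** 2)       # tanh derivative
    dW1 = X.T @ d_h

    W2 -= lr * dW2; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * d_h.sum(axis=0)
```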
Now, on the other hand, a recurrent neural network is a different thing. A feed-forward network doesn't have a memory; once the model is created, it simply identifies a cat as a cat. When it comes to stocks, or for example predicting the next word, you have what's called a recurrent neural network, and its fundamental advantage is that it has a memory: it can remember things. It's used on sequences of data, for example stock data, where you've got the prices of yesterday, the day before yesterday, and so on, or a sentence like "I like to drink coffee," where we can use the earlier words to predict the final word, "coffee."

The step I described for the cat, where we go back after the error and adjust the weights, is called backpropagation, and when the signal then moves forward again, that's the forward pass. I've put a simple picture here that I got online where you can see the backpropagation. First the signal goes through the input units and the weights of the hidden layers; these are the hidden units, hidden units one and hidden units two. Earlier we drew a simple one with just one hidden layer, and here you can see multiple hidden layers, two of them. The network finally produces the output, the output is used to create the error term, the weights of the hidden and input layers are adjusted, and then it goes forward again. That's backpropagation and the forward pass.

The fundamental difference between a recurrent neural network and a normal feed-forward neural network is that its backpropagation runs not only through the hidden layers but also back through time, through the hidden states across sequential time steps. That's called backpropagation through time. Take the example "I like to drink coffee." At time T1 the first word is "I"; there's no previous hidden state, and a new hidden state, H1, is created. At the next time step you've got the word "like," the previous hidden state H1, and a new current hidden state H2, so it carries information from the previous hidden state as well. That's how it has memory. At T3 we've got "to," the previous hidden state H2, and a new current hidden state H3, so again it's carrying that memory forward, which the earlier cat example didn't have. Because of backpropagation through time, the recurrent neural network gets this advantage, and that is the whole idea of the memory advantage. Finally, at T5 we don't have any more input words; we've got the previous hidden state, H4, and based on H4's data we predict "coffee."

Just as the prediction is made for "coffee," you can use this to predict stock price data. Instead of "I," "like," "to," "drink," it can be the price of yesterday, the price of the day before yesterday, and so on, and you can use different kinds of inputs: instead of the price you can use the RSI, you can use the volume, and you can use the fifth day's return as the target, sitting in the fifth slot just like the fifth word. I've given a trading example here with times T1, T2, T3, T4, T5.
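To make the hidden-state recurrence concrete, here is a minimal sketch of one RNN forward pass through time, assuming a standard tanh cell. The weight names (`Wx`, `Wh`, `Wy`) and the toy two-feature inputs standing in for price and RSI are my own, not the strategy's actual features:

```python
import numpy as np

# One RNN forward pass through time: each step mixes the new input x_t
# (say, that day's price change and RSI) with the previous hidden state
# h, which is how the "memory" gets carried forward.
rng = np.random.default_rng(1)
seq = rng.normal(size=(5, 2))   # 5 time steps (T1..T5), 2 features each
hidden_size = 4                 # toy size; the strategy itself uses 64

Wx = rng.normal(scale=0.1, size=(2, hidden_size))            # input -> hidden
Wh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden
Wy = rng.normal(scale=0.1, size=(hidden_size, 1))            # hidden -> output
b = np.zeros(hidden_size)

h = np.zeros(hidden_size)       # no previous hidden state at T1
hidden_states = []
for x_t in seq:                 # T1, T2, ..., T5
    h = np.tanh(x_t @ Wx + h @ Wh + b)   # new hidden state H1, H2, ...
    hidden_states.append(h)

y_pred = hidden_states[-1] @ Wy  # prediction made from the last hidden state
print(y_pred)
```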
As inputs we've given the price and the RSI of the first day, then the price and the RSI of the second day, and so on. At each time step the memory is stored in the hidden state and the new information is carried over, and that helps us predict certain things in the stock market: we can predict whether the market is going to be positive or negative by using all the past RSI data and price data.

The recurrent neural network, like any neural network, has certain problems, and the fundamental one is computational: we need lots of computational power, unlike decision trees or regression. For our strategy we used QuantConnect, and they use CPUs, not GPUs, and we were still able to create one; but when you go to something more complex like an LSTM, it takes a long time to backtest. Even on QuantConnect I've seen it take 15, 20, 30 minutes to run, and after a certain time it can shut down because it's too much load for the machine to run the strategy. Even then, QuantConnect is pretty good at executing these machine learning models, and in the future you're going to have GPUs and everything to make the models even more powerful.

In this specific strategy we used 64 as the memory vector size. We can decide how big the memory vectors are, so we can go more complicated, from 64 to 128 to 256; in the strategy we used 64 because it's much faster to train the model. Then we can do lots of hyperparameter tuning: we can change the input features, we can change the epochs (the number of iterations the model trains for), we can change the hidden vector size I mentioned, and we can adjust things like the learning rate. These are the more advanced hyperparameter tuning steps, which I'm not going to explain here, but I have explained them thoroughly in our course.

All of this is covered in our machine learning and AI course, including this strategy. We've got eight machine learning and AI strategies, plus combined machine learning strategies where we combined a recurrent neural network with a regression and things like that, which gives us a fundamentally better edge. We've even used AI for portfolio rebalancing and for combining multiple models. We discuss five models: regression, decision trees, support vector machines, recurrent neural networks, and long short-term memory. After each of these sections there's a strategy, and after each strategy there's a quiz; a quiz would look something like this, for example this recurrent neural network quiz. The quizzes are there to check that you understood what was explained, so you can create your own recurrent neural network strategy, and you'll have the strategy code, the lecture, and all of that. We've also got different kinds of strategies, and all of these strategies have beaten the SPX CAGR-to-maximum-drawdown ratio; that's what we're aiming for in our course, and for any strategy you create, the fundamental comparison measure is the SPX CAGR-to-maximum-drawdown ratio. We also discussed applying Markowitz portfolio optimization to AI as well.

Coming back to this lecture: one of the advantages of neural networks is that we can apply them anywhere.
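To show the kind of knobs that tuning involves, here is a sketch of a hyperparameter block. The dictionary and its key names are my own framing; the lookback of 10 and hidden size of 64 come from the video, while the epoch count and learning rate shown are placeholder values:

```python
# Illustrative hyperparameter block for an RNN strategy.
# Each entry is a knob you can change and then re-backtest.
hyperparams = {
    "lookback": 10,         # time steps fed to the RNN (T1..T10)
    "feature_count": 2,     # e.g. price change and RSI; your choice
    "hidden_size": 64,      # memory vector size: 64 -> 128 -> 256 ...
    "epochs": 500,          # training iterations; more costs more compute
    "learning_rate": 0.01,  # step size for each weight update
}
```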
Imagine you have a strategy that's performing well: you can make it perform even better, or reduce its drawdown and things like that, using neural networks.

Going back to the recurrent neural network, the RNN has a fundamental problem, and that is the vanishing gradient problem: it tends to forget things that are far in the past. For example, when you read a book, say Stephen King's The Stand, which is really, really big, you tend to forget some of the characters because there are lots of them. Just like human beings forget things, the neural network also tends to forget, because of how the errors and weight updates propagate backward. To avoid that situation, the LSTM has a thing called a forget gate. Along with the hidden units there is a forget gate, which takes care of the forgetting problem and holds on to the important things from the past that need to be remembered. That helps improve the results for the LSTM. But the fundamental problem with the LSTM is that it's extremely computationally intensive, so it takes a while to backtest an LSTM strategy.

Let me bring up the LSTM strategy here. This is a strategy we built for the LSTM, where we did portfolio optimization and rebalancing using the LSTM machine learning model. These machine learning models, like neural networks, aren't only for creating efficient strategies; they can also rebalance a portfolio. Imagine you have a value investing portfolio and you need to adjust its allocation: you can use the machine learning model to decide how much money to allocate to SHY, how much to GLD, SPY, and TLT, and get superior returns while minimizing the drawdown. So models like the RNN, the LSTM, and many others aren't just for entry or exit conditions; they can improve your existing basic quant trading strategies, and they can rebalance a portfolio.

Now let's go back and discuss the code of the recurrent neural network. As you can see, I've imported the libraries NumPy and scikit-learn. There are standard Python libraries for neural networks and LSTMs, but for this specific RNN strategy I actually wrote the equations myself, so that you can understand how the process works and make changes if you want. The advantage of libraries is that they're easy to use: you just import them and run. The disadvantage is that sometimes you don't know what's happening inside, and it's good to know what happens inside, so I wrote the code for the recurrent neural network myself. You've got the start date and end date, which define the testing period; I've set the cash; and I've taken values of the RSI and the 200-day moving average. If you don't know anything about QuantConnect or Python, the first thing to do is visit our YouTube channel and go through Algorithmic Trading in Python Zero to Hero, and once you're done with that, our QuantConnect full tutorial as well, so you're up to speed on handling Python; then you can come back here and it will be easy to understand. Those are the prerequisites: you need a certain level of Python to understand what the code does.
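For intuition about what a forget gate actually computes, here is a minimal sketch of a single LSTM time step using the standard textbook gate equations. The variable names and shapes are mine; this is an illustration, not the course's LSTM code:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. The forget gate f decides how much of the old
    cell state (the long-term memory) to keep, which is what lets an LSTM
    hold on to important things from far back in the sequence."""
    z = np.concatenate([x_t, h_prev]) @ W + b     # all four gates at once
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)  # gate values in (0, 1)
    g = np.tanh(g)                                # candidate new memory
    c = f * c_prev + i * g    # keep part of the old memory, write some new
    h = o * np.tanh(c)        # new hidden state
    return h, c

# Toy usage: 2 input features, hidden size 4.
rng = np.random.default_rng(3)
W = rng.normal(scale=0.1, size=(2 + 4, 4 * 4))
b = np.zeros(4 * 4)
h, c = lstm_step(np.array([0.5, -1.2]), np.zeros(4), np.zeros(4), W, b)
```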
We warm up the data, 200 days of it, and then we set the parameters. We set how many days to look back: remember we did T1 through T5, which would be five lookback periods; in this case we're using 10 lookback periods. We set a feature count, and the features could be anything: a price change, an overnight gap, a volume change, an RSI. It's up to you what to choose. We're not going to show the exact input features we use, or our entry and exit conditions, but people who have taken the course have access to those.

You've got the hidden size, and remember I mentioned 64, 128, the memory vector sizes; that's dictated here. Then the learning rate is set, and initially the prediction is set to zero, because that's the value we're going to find: is the prediction positive or negative, meaning do we go long or short?

We've used StandardScaler to scale our numbers. For people who don't know StandardScaler: different inputs live on completely different scales. The RSI might give you a number like 30, while volume can give you something like 100,000. To make those numbers comparable, we use StandardScaler, which converts them to standardized values so there's no massive discrepancy, because a bigger number can dominate the model completely; we don't want the volume to take over from the RSI number, and that's why we use StandardScaler.

Then we use the initialize-RNN function, which I created, and the train-RNN function. The initialize function is where we set the weights and the biases; as I said before, the weights are set randomly the first time through, and when you get the error, the weights are adjusted accordingly. So here we've used the random function to create the weights, and we've also created the biases.

Finally we train the model. For training we're using 2000 to 2009, and we're testing on 2010 to 2024; I've discussed training and testing data in our previous videos. You can change this and update it as you go; you could train on, say, 1990 to 2000 and test on 2000 to 2010 instead. Then we get arrays that store the features and the targets. The features come from our get-features-from-history function, where you can feed in whatever input features you want, and obviously there's a target as well. We fit the scaler on the features and then transform them. And here are the epochs, the iterations: I could increase the iterations to a higher number, but again that comes with higher computational cost, and sometimes QuantConnect may not be able to handle it if you give it too many epochs. Then we track the epoch losses.
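As a rough picture of those two steps, here is a sketch of scaling features with scikit-learn's StandardScaler and of initializing random RNN weights. The function name `initialize_rnn`, the weight names, and the toy numbers are my own stand-ins for the strategy's actual code:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Put RSI (~30) and volume (~100,000) on a comparable scale so the big
# numbers don't dominate the model: fit the scaler, then transform.
features = np.array([[31.0,  98_000.0],
                     [28.5, 120_000.0],
                     [55.2,  87_000.0]])
scaler = StandardScaler()
scaled = scaler.fit_transform(features)  # zero mean, unit variance per column

def initialize_rnn(feature_count, hidden_size, seed=2):
    """Random starting weights and zero biases; training adjusts them
    every time the error is propagated back."""
    rng = np.random.default_rng(seed)
    return {
        "Wx": rng.normal(scale=0.1, size=(feature_count, hidden_size)),
        "Wh": rng.normal(scale=0.1, size=(hidden_size, hidden_size)),
        "Wy": rng.normal(scale=0.1, size=(hidden_size, 1)),
        "bh": np.zeros(hidden_size),
        "by": np.zeros(1),
    }

params = initialize_rnn(feature_count=2, hidden_size=64)
```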
Here is the forward propagation and the backward propagation. You can see we use the self.backward function to compute the epoch loss from the hidden states, the target, and the prediction. In the forward propagation we've got the hidden states: it starts as an empty array, and at each step we compute a new hidden state and append it. You can see we apply the weights and the biases, and we return y along with the hidden states. Then, with the hidden states, the y, and y_predict, the value we're trying to predict, we feed in the weights and do the backward propagation, getting the updates for the biases and weights, and we keep updating them as it runs through the different epochs, with each update scaled by the learning rate. So finally we have the complete model, backward propagation plus forward propagation, and it gets trained. Then comes the OnData method, which I'm not going to go through in detail because that's basically the entry and exit conditions, plus the get-features function with the input features; those are available to our members in their strategy code section.

Let's go through the results and see how it performed. We've included commissions; all our strategies come with commissions included. The strategy ran on SPY, as you've seen in the code, and we got a compounding annual return of 9% with a drawdown of 18.3%. Let's do the calculation. For the S&P 500, the CAGR-to-drawdown ratio is the CAGR, about 10%, divided by the maximum drawdown, about 55%, which happened in the 2008 financial crisis; that gives a ratio of 0.18. Unless we've got a strategy above that, we can't go ahead and execute it. In this case it's 9% divided by 18.3%, a ratio of 0.49, which is pretty impressive against the S&P 500 buy-and-hold ratio of 0.18. The advantage is that we can apply a bit of leverage: apply 2-to-1 leverage and you get around an 18% CAGR while the drawdown is still only around 36% or so, which is still below the maximum drawdown of an S&P 500 buy-and-hold. And there's a lot we can adjust in these numbers: we can improve the strategy by adjusting the epochs, adjusting or changing the input features, adjusting the hidden size, and so on. There are lots of things to play around with in the input features and the hyperparameter tuning, which we've explained thoroughly in the course, and that's one of the challenges we set you: here is a condition, these are the things you have to change, and these are the things that can improve the strategy.

So that's the RNN strategy, and you can improve it further by combining it with different models. Let me go through the performance of some of our machine learning strategies. You can see the different models here: the S&P 500 ratio is 0.18, our linear regression model has 0.32, the decision tree 0.45, the support vector machine 0.44, the recurrent neural network 0.49, and there's a combined model at 0.39 and another combined model at 0.45.
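To make that comparison measure explicit, here is the arithmetic above written as a tiny helper. The function name is my own, and the 2:1 leverage line is the same back-of-the-envelope scaling described here, ignoring financing costs and path effects:

```python
def cagr_to_max_dd(cagr: float, max_drawdown: float) -> float:
    """The comparison measure used throughout: CAGR / max drawdown."""
    return cagr / max_drawdown

spx = cagr_to_max_dd(0.10, 0.55)    # ~0.18 for S&P 500 buy-and-hold
rnn = cagr_to_max_dd(0.09, 0.183)   # ~0.49 for the RNN strategy

# Rough 2:1 leverage scaling: CAGR and drawdown both roughly double,
# so CAGR rises to ~18% and drawdown to ~37%, still below the S&P's 55%.
lev = cagr_to_max_dd(2 * 0.09, 2 * 0.183)
print(round(spx, 2), round(rnn, 2), round(lev, 2))
```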
There's another decision tree at 0.4. I don't think I've put in the LSTM; it might be down below. There you go, long short-term memory, and that one has also done spectacularly well. Then we've got the RNN-regression, which combines the RNN and the regression, and we've also got the RNN plus support vector machines as well.

So there's a lot of growth prospect when it comes to applying machine learning and AI to trading, and these are just the fundamentals; we're taking baby steps. I think within a two-to-three-year time frame people will be discussing much more what kind of machine learning model to apply to a specific trading strategy: can we apply it to a momentum trading strategy, how can we change the portfolio, how can we create multiple strategies using neural networks, and can we apply neural networks across multiple quantitative trading strategies to rebalance a portfolio? The growth prospects here are huge. I hope you guys enjoyed the video. Let me know if you understood everything, and let me know if you have any confusion about any of this; I'll be happy to help. Thanks for watching, bye-bye.