Transcript for:
FreqAI Overview and Configuration Guide

Alright guys, in this video I'll go through a FreqAI introduction and configuration: how it works, and the guidelines for building a machine learning system, deploying it, and trading it live. So, introduction: FreqAI is software designed to automate a variety of tasks associated with training a predictive machine learning model to generate market forecasts given a set of input features. Note that "features" here means two different things: the library's capabilities (like self-adaptive retraining) and input features in the machine learning sense. An input feature is basically anything we can measure about the market. So what data do we have in the market? We have open, high, low, close, and also the date. Let me just go to the blackboard and show you what I mean. So we have open, high, low, close, plus the date — that's our raw data, and those columns can be features as well. And volume too; volume is extremely important. Now from these five or six columns, including the date, we can derive much more information and generate many, many features — you can generate thousands of them; memory is the limit, and really it's just the computation on the data. For instance, we can take the percentage change of the close: since we'll be working with a pandas DataFrame, we just do df["close"].pct_change(), which returns (close / previous close) − 1 for each bar. We can also take, say, the maximum of the high over the last 10 bars.
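Those two features can be written down directly in pandas. A quick sketch — the column names follow the usual freqtrade OHLCV convention, and the toy frame is made up purely for illustration:

```python
import pandas as pd

# Toy OHLC frame standing in for the candle data the strategy receives.
df = pd.DataFrame({
    "close": [100.0, 102.0, 101.0, 103.0, 104.0],
    "high":  [101.0, 103.0, 102.0, 104.0, 105.0],
})

# Percentage change of the close: close_t / close_{t-1} - 1
df["close_pct_change"] = df["close"].pct_change()

# Rolling maximum of the high over the last 3 bars
# (the video talks about 10 bars; 3 keeps the toy frame readable).
df["high_max_3"] = df["high"].rolling(3).max()
```

The first rows of both columns come out as NaN because there is no previous bar (or not enough bars) to compute from — exactly the kind of NaN that FreqAI's data-cleaning step handles later.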
We'll get all of this data through freqtrade itself, so don't worry about that, and we can generate 10k-plus features based on simple user-created strategies — rapid feature engineering, in other words. So let's go through the features FreqAI provides. The first is self-adaptive retraining: it retrains our models during live deployments so they self-adapt to the market in a supervised manner. The second is rapid feature engineering, which means we can create large, rich feature sets with 10k-plus features based on simple user-created strategies — exactly what I just explained. Next is high performance, which is essential for machine learning and especially for deep learning and reinforcement learning models; you need a high-end GPU to train those. FreqAI's threading allows adaptive model retraining on a separate thread (or on a GPU if available), apart from model inferencing (prediction) and bot trade operations. The newest models and data are kept in RAM for rapid inferencing, so inference happens on the order of milliseconds. Then there's realistic backtesting: it emulates self-adaptive retraining on historic data with a backtesting module that automates retraining. So it basically runs this cycle of training and then retraining, over and over. (Backpropagation, by the way, is a different thing — that's the weight-update mechanism used inside deep learning — I'm getting ahead of myself.)
I'm assuming you know a little bit of deep learning, machine learning, and reinforcement learning — these areas of artificial intelligence — and that you also have at least a basic idea of technical analysis, so that you understand the basic functioning of the market. So we have machine learning, deep learning, and reinforcement learning. Machine learning will be our first step, deep learning the second, and reinforcement learning the third — and the best models I've seen or built are a combination of all three, because you need machine learning for some features, deep learning for certain other tasks, and reinforcement learning to tie everything together. Okay, back to realistic backtesting. It's self-adaptive learning, meaning the model first trains, then goes back and retrains. In the market we need that kind of awareness — we have a loose definition of awareness, but let's suppose for a moment that our ML model is aware of its environment, which is the market: it learns something about it, then retrains itself, and this cycle goes on indefinitely, as often as you define. Whether you retrain every hour, every day, or every week depends entirely on which type of trading you want to do — day trading, intraday trading, and so on. I think machine learning models are particularly useful for intraday trading, so we'll be using lower timeframes: 5 minutes, 15 minutes, and 1 hour for the larger perspective.
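The retrain-every-so-often idea is simple enough to sketch. This is not FreqAI's actual scheduler — `should_retrain` and the interval values are made up for the example — it just illustrates the decision the bot has to keep making:

```python
from datetime import datetime, timedelta

def should_retrain(last_trained: datetime, now: datetime,
                   interval: timedelta) -> bool:
    # Retrain once the configured interval has elapsed since the last run.
    return now - last_trained >= interval

# An intraday setup on a 5m timeframe might retrain every hour...
last = datetime(2023, 1, 1, 9, 0)
hourly = timedelta(hours=1)
# ...while a swing setup might only retrain once a day.
daily = timedelta(days=1)
```

The right interval is a trade-off: retraining more often keeps the model closer to current conditions but costs compute that could otherwise go to inference.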
Okay, with that out of the way, let's look at extensibility. The generalized and robust architecture allows for incorporating any machine learning library or method available in Python. So we can extend it to any machine learning library: TensorFlow, PyTorch — with TensorFlow we could go with Keras and build, say, a vision transformer using a transformer architecture from Hugging Face. The reason I'm telling you this is that it's the first library of its kind I've seen in the open-source space. We can use TensorFlow, PyTorch, scikit-learn, CatBoost, XGBoost, AdaBoost, LightGBM — deep learning libraries from Google, Facebook, Microsoft, plus independent libraries — and we can also create our own custom Python scripts and extend it however we want. Eight examples are currently available, including classifiers, regressors, and a convolutional neural network. Classifiers and regressors belong to classical machine learning, while convolutional neural networks are part of deep learning. Next we have smart outlier removal: it removes outliers from training and prediction data sets using a variety of outlier detection techniques. There are many of those — we can use SVM, we can use DBSCAN. It's a smart system that basically tells you whether your data is valid or not, because we don't want to feed garbage into our machine learning model.
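To make the idea concrete, here is a tiny outlier filter. FreqAI ships SVM and DBSCAN detectors; this sketch swaps in a much simpler z-score rule, which shares the same goal of keeping anomalous rows out of the training set:

```python
import numpy as np

def remove_outliers(X: np.ndarray, z_thresh: float = 3.0) -> np.ndarray:
    """Drop rows whose z-score exceeds the threshold in any column."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0          # guard against constant columns
    z = np.abs((X - mean) / std)
    return X[(z < z_thresh).all(axis=1)]

# 20 well-behaved rows plus one obviously broken one.
X = np.vstack([np.full((20, 1), 1.0), [[50.0]]])
clean = remove_outliers(X)       # the 50.0 row is dropped
```

A z-score rule is crude compared to SVM or DBSCAN — it assumes roughly unimodal data and can be masked by multiple extreme points — which is exactly why FreqAI offers the fancier detectors.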
These systems are very sensitive to the data you feed them. Most of the time when you go and train a model, even a small change in the data changes the model's performance or its prediction behavior, and it really drives you nuts — so keep in mind that this is really important, and with SVM and DBSCAN we have great algorithms to remove our outliers. Next we have crash resilience, which stores trained models to disk so that reloading after a crash is fast and easy, and purges obsolete files for sustained dry/live runs. What it does is save the model to disk so we can reuse it later and don't have to start training from scratch — we want to learn from experience, and the experience accumulates as we keep training the model. That's the cycle, and we have that feature here as well. Now let's move on to automatic data normalization, which is quite important. It normalizes the data in a smart and statistically safe way — something like (x − mean) divided by the standard deviation,
or along those lines — I might be wrong on the exact form here. Basically it comes down to how you want to normalize the data. Suppose you have ten thousand data points and you want to squash them into the zero-to-one range: you take the maximum and divide each value by it, so the largest value becomes one and everything else becomes a fraction. We don't want to feed huge raw values to the model; we want normalized values, and that feature is built in here, done in a statistically sound manner, which is very important. Next we have automatic data download, which computes the time ranges for data downloads and updates historic data in live deployments — so it automatically downloads and updates data during live runs. Then there's cleaning of incoming data, which handles NaNs safely before training and model inferencing — also a good feature. We also have dimensionality reduction, which reduces the size of the training data via PCA (principal component analysis). And there's one more feature: deploying bot fleets — you set one bot to train the models while a fleet of follower bots inference the models and handle the trades.
This bot-fleet setup is really the craziest, most amazing feature. In ML training this is something like what's called federated learning. For example, the way Google does it: they have a central system and all these mobile devices; the main model lives centrally, all the devices run inference from that model, data flows back, the central model gets updated, and everything stays in sync. It basically solves the holy grail — well, not quite the holy grail, but a huge technical problem in training and serving deep learning models. So now, let's get started. How do we start with FreqAI? The easiest way to quickly test FreqAI is to run it in dry mode with the following command: we run freqtrade trade, we pass it a config file, and we pass the strategy.
In this case that's the FreqAI example strategy, and we pass which model — the regressor, the machine learning or deep learning model — we want to use to train the ML system, plus the path of the strategy. It will run in dry-run mode and execute simulated live trades as well. You'll see the bot process do automatic data downloading followed by simultaneous training and trading. I've actually run it, and yeah, it works great. You know what, let me just show you, because it's easier with a picture. So here we have the system running on a server — this is our FreqAI example, our AI trading system, let's call it that. I log in with the password, and see, it's running. Let me show you the log. It actually fetches the data and trains on it — for ALPHA/USDT the training took 25.23 seconds, and then it started training for ALICE/USDT. This is a good message here: "dropped 0 training points due to NaNs in populated dataset", and the training data runs from the 8th to the 23rd, so around two to three weeks of data. And this is how it goes. In these logs you can see it found an order for the pair KCTK/USDT, and we're entering with this amount: short is false, so we're entering long, leverage is 1, the open rate is this much, open since this time. That's how it's trading, and you can see the performance here — a steadily rising curve. Let's look at the trades as well: exit signals and all these trades over here. Now let me show you the chart. See, a long signal executed here — it
entered long here, and then exited the long here, following that long entry. We can go ahead and look at any chart, including the candlesticks — it's actually pretty cool. Now let's go back to the docs. So we've done all this: we've seen the whole process of training, testing, and live deployment. An example strategy, a prediction model, and a config to use as starting points can be found here: the FreqAI example strategy, the model, and the example config. Let's look at those. In config_examples we have the standard freqtrade config, and then the config for the FreqAI model, which we'll read about later. The example also points to the strategy template, so we go to freqtrade/templates and here's our example strategy. See, we inherit from the IStrategy interface, and then in populate_indicators we populate all the indicators (which we'll read about later), plus the usual entry/exit logic and so on. So we have our template, and we can get our model as well: go to the freqai folder, look at the prediction models, and there's the LightGBMRegressor that was shown in the command. It inherits the base regressor model, and in fit it just takes the data and fits the regressor — it does most things automatically; we just pass the data. And we can get all these different models: a TensorFlow base model, CatBoostClassifier, CatBoostRegressor, CatBoostRegressorMultiTarget,
LightGBM classifier, regressor, and regressor multi-target. The most interesting one to me is the TensorFlow one — in a future video I'll cover how to build that TensorFlow model, because I don't see an example for it here. We just inherit, implement train, and that's it. So that's the prediction model side. Now back to the introduction: the general approach. You provide FreqAI with a set of custom base indicators, the same way as in a typical freqtrade strategy, as well as target values (labels) for each pair in the whitelist. FreqAI trains a model to predict the target values based on the input of custom indicators. The models are then consistently retrained with a predetermined frequency to adapt to market conditions. FreqAI offers the ability both to backtest strategies, emulating reality with periodic retraining on historic data, and to deploy dry or live runs. In dry or live conditions, FreqAI can be set to constant retraining in a background thread to keep models as up to date as possible. You can see it in the log: it's retraining roughly every hour, and you can also see the SVM outlier detector tossing outliers out of the training points so that our predictions don't get affected — and that's how this performance curve is building up. If you go to the daily stats: on the first day it made 43, which is 1.44% profit; on the second day 0.85%; and on the third day 0.69%. The curve is rising, so it's working quite well — we haven't seen any catastrophic drawdown so far.
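Since the whole approach starts from user-defined indicators, here is roughly what computing a few common ones looks like in pandas. These are simplified moving-average variants — real indicator libraries such as TA-Lib use Wilder smoothing for RSI and ATR — so treat the exact values as illustrative:

```python
import pandas as pd

def sma(close: pd.Series, period: int) -> pd.Series:
    # Simple moving average of the close.
    return close.rolling(period).mean()

def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    # Relative Strength Index, simplified with plain rolling means.
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(period).mean()
    loss = (-delta.clip(upper=0)).rolling(period).mean()
    return 100 - 100 / (1 + gain / loss)

def atr(high: pd.Series, low: pd.Series, close: pd.Series,
        period: int = 14) -> pd.Series:
    # Average True Range: rolling mean of the true range.
    prev_close = close.shift(1)
    tr = pd.concat([high - low,
                    (high - prev_close).abs(),
                    (low - prev_close).abs()], axis=1).max(axis=1)
    return tr.rolling(period).mean()

close = pd.Series(range(1, 31), dtype=float)   # a steadily rising market
high, low = close + 1.0, close - 1.0
```

On this toy uptrend RSI pins at its maximum (there are no losing bars at all), which is exactly the overbought signal the indicator is meant to surface.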
All right, an overview of the algorithm, explaining the data processing pipeline and the model, is shown below. This is important machine learning background, so let me go ahead and explain it, because it matters — and since all of this is integrated into freqtrade, let me give the developers credit: they've attacked a genuinely hard problem for any day trader, research analyst, or research scientist. It's worth at least appreciating what's going on in the background. So let me grab a pen. First, we create our strategy. We pass in our data, which is candlestick data — these are our base features. Then we have expanded features: we can expand those base features across multiple timeframes and multiple indicator periods, so we can use a 5 SMA, 20 SMA, 30 SMA, 50 SMA, 100 SMA; we can use RSI and stochastic RSI; we can use average true range (ATR) and true range. Off the top of my head: ATR is a measure of volatility — one of the best technical indicators out there for measuring volatility — and RSI does the same kind of job for observing overbought and oversold conditions, which is very important information for figuring out tops and bottoms of the market. We want to give the model as much data as possible, and as much good data as possible, so it can apply some linear algebra and come up with its own notion of what it means when we combine this, this, and this; what it means when we're oversold; what it means when we're overbought; what it means when ATR is increasing. As humans, we know that when we see
rising ATR, there's more and more volume coming in. If we passed the model just the open, high, low, close, and volume data, it could work that out on its own eventually — but you'd need far, far more data to train it, and that's why we do feature engineering: it's basically a clever way to train the model. Let me list a few more off the top of my head: cumulative volume, percentage change of the close — and we can capture a sort of upward or downward thrust by taking the cumulative sum of those close changes, which represents the strength of the upward or downward push. I'm getting more technical here, so let me stop and continue with the pipeline. Next we have past-candle features: on top of the current day's data, we can also pass in the last two or three days of data, so the model can infer what's going on and make a good decision for the current day. After this, we apply feature-set engineering and we get clean features, which
basically means removing any NaN (not-a-number) or None values and truncating those features; it also does outlier detection and removal — that's where SVM was used, as I showed in the log just now — plus normalization for statistical stability, and dimensionality reduction with PCA. So if we have 10k-plus features and we want roughly the same result, we can reduce them to 5k, or even 1k; it depends on how well the data represents the current market conditions. We'll also get a feature-importance plot showing which features matter most, and we can select features from there as well — visual inspection is very important in this process, and we'll do that later too. Then we move on to adaptive training and how it works: it runs on a separate thread and does the cycle I showed earlier. First it passes the data through: these are the features, this is the label, and they go into the model and metadata storage. From there we can also take a saved model: if it's the first training, we do the whole process; if it's a retraining, we fetch the previously trained model and update it on each retrain. That cycle goes on. The models are continuously stored to local disk and also kept updated in RAM for faster inference, and after that we can
purge the old models, and then we get our predictions out. One benefit of walking through these steps is seeing where we can save time on each of them: if you don't want to keep old models, just set the purge option to true; if you want to retrain every hour or every two hours, that's configurable too — we're essentially doing process and hyperparameter optimization, if you will. Then we get the predictions — that's model inferencing — and then we do the post-processing, which usually means denormalization, so we get back values on the original scale of what we passed to the model. Then we gather statistical quantities and determine the prediction confidence, for example with the DI (Dissimilarity Index): if the incoming data is too far from the index we define over the training data, we can tell whether the machine learning system is confident in its prediction or not. Then we pass those outputs on to handle our entry and exit logic with the predictions, and that completes the whole cycle. The entry/exit logic is very abstract here, but basically you can enter at a specific point — entry, stop loss, trailing stop loss. For example, suppose you set the trailing stop loss to the ATR value: you trail it, you give it to the machine learning model, and the model predicts the next value for where to place the trailing stop, which can be far better than raw ATR. This is really amazing. So we can get our trailing stop loss, we can do early exits as well, we can set some
parameters — all sorts of predictions can be wired into the entry and exit logic we define there. So that's the prediction side and that's our training data, and that's how the training works. Now let's move on to some simple machine learning vocabulary, as a refresher. Features: the parameters, based on historic data, on which a model is trained. All features for a single candle are stored as a vector; in FreqAI you build feature data sets from anything you can construct in the strategy. Labels: the target values that a model is trained toward. Each feature vector is associated with a single label that is defined by you within the strategy. These labels intentionally look into the future and are not available to the model during dry/live runs or backtesting — they're just the targets used to train the model. Training: the process of teaching the model to match the feature sets to the associated labels. Different types of models learn in different ways; more information about the different models can be found here. As an example, a label could be where to put the trailing stop loss — say, as a multiple of the ATR value: is it two times, is it three times? We get our prediction there, and those are our labels; training is simply how we teach the model that mapping. Train data: a subset of the feature data set that is fed to the model during training. This data directly influences the weight connections in the model — that's how deep learning works, the weighted multiplications in that big directed graph of linear
algebra. Next, test data: a subset of the feature data set that is used to evaluate the performance of the model after training. This data does not influence the node weights within the model — we don't train on it at all; we just use it to check whether the system has actually learned from the training data. The model never fits to it; we only look at the loss on it to judge how well the system is generalizing. Inferencing: the process of feeding a trained model new data on which it will make a prediction. After training, when we want the next prediction — what's going to happen in the market, whether we should go long or short — that's inferencing. Then there's the install prerequisites section, which we've already done, with a note: CatBoost will not be installed on ARM devices (Raspberry Pi, Mac M1, ARM-based VPS), since it does not provide wheels for that platform. So the CatBoost classifier can only be installed on x86 systems — AMD, Intel, and so on. For Docker users: a dedicated tag with the FreqAI dependencies is available, so you can replace the image line in your docker-compose file with that image. It contains the regular freqtrade dependencies, similar to a native install; CatBoost will again not be available on ARM-based devices. That's actually how I deployed this system.
I used the Docker setup for deploying it. Now, a common pitfall: FreqAI cannot be combined with a dynamic VolumePairList — which is fine, I think — or with any pairlist filter that adds or removes pairs dynamically, because we have to build a list of those pairs and keep it in memory as well, so we want a more or less static pairlist. This is for performance reasons: FreqAI relies on making quick predictions and retrains, and to do this effectively it needs to download all the training data at the beginning of a dry/live instance. FreqAI stores the data and appends new candles automatically for future retrains, which means that if new pairs arrive later in the dry run due to a VolumePairList, it will not have the data ready for them. However, FreqAI does work with a shuffle filter, or with a VolumePairList that keeps the total pairlist constant but reorders the pairs according to volume. And that's an important feature. Ask yourself: what moves the market? The market moves on volume. If we have 50 coin pairs, on some days 25 of them will have more volume than the other 25; on other days 10 of them will have more volume than the remaining 40. So we'll focus on whichever pairs have more volume.
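That reorder-but-keep-constant behavior is easy to picture. A sketch — the pair names and volumes are made up, and this is not freqtrade's actual pairlist code:

```python
def reorder_by_volume(pairs, volumes):
    """Keep the pairlist constant, but put the highest-volume pairs first."""
    return sorted(pairs, key=lambda p: volumes.get(p, 0.0), reverse=True)

pairs = ["BTC/USDT", "ETH/USDT", "XRP/USDT"]
volumes = {"BTC/USDT": 1200.0, "ETH/USDT": 3400.0, "XRP/USDT": 90.0}
ordered = reorder_by_volume(pairs, volumes)
```

No pair is ever added or removed, so FreqAI always has training data ready for every member of the list; only the priority changes day to day as volume shifts.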
That way we can capture bigger moves instead of sitting in trades on low-volume pairs. It's a good feature, but we have to keep the volume pairlist constant, and we can also use the shuffle filter to shuffle the coin pairs randomly — because with a random distribution, results converge to the mean over the long term, and that tells us whether the system is genuinely performing or not. This matters because we can't anticipate or predict what's going to happen in the market next, but a machine learning system might be able to — and that's why I'm doing this, why I'm recording this video and documenting it live: to see how we can use this system, how we can join forces with AI and work in conjunction with it. I'm getting a bit philosophical there. What I'm most interested in is learning from the AI system. Right now we don't have that much of an AI system, but we have machine learning, deep learning, and reinforcement learning, and that's where I want to focus. The process of developing the reinforcement learning system is incremental: going from ML to DL, then from DL to RL. And then we can use the follower mode to build one central system and have the others take inference from it. So I think that's it for this video. We should definitely give credit to these guys — they've developed amazing software here. In the next video
I'll go through the configuration file, showing you what we can configure, how we can configure it, what we should and shouldn't do, and how we actually go ahead and start building this FreqAI trading system. I'll see you guys in the next video. Thank you.