Transcript for:
MLOps Course Lecture Notes

This course can get you a high-paying job offer, and it is not just another machine learning course. I grabbed my first data science internship at a US-based startup whose pay was 2x what Google India pays its software engineers, and I have received offers from international companies across the US, UK, Germany, and elsewhere. People often wonder how it is possible to get that level of comparative pay while living in India and working a remote job. What I used to do was add a creamy layer on top of all my take-home challenges and projects, and that creamy layer was MLOps. This course is about adding that creamy layer on top of your projects and take-home challenges. It will teach you the fundamentals of MLOps, and we will also build one end-to-end project, from data ingestion to deployment, using several state-of-the-art tools like MLflow and ZenML. MLOps is a completely new field with very few resources around it; it is a gold mine and a game changer in the ML community. If you work through this with dedication and patience, you will succeed in learning MLOps, and you will be able to get the international-level job offer you want.

But wait, who am I? I am a lead data scientist at Triplet, where I have led several products in the creator economy. Along with that, I have worked as an MLOps engineer at ZenML, one of the fastest-growing MLOps frameworks, and I have experience as a data scientist at Artifact, building large-scale NLP products before GPT was even launched. Hopefully all of this experience makes me the right guy to teach you about MLOps. All the required links and resources are listed in the description; you can go ahead and check them out.

Hey everyone, welcome to another lecture of this MLOps course. We will start off with a short introduction to MLOps, and I will also make you aware of the terminology of MLOps, which is very important for understanding the later content of this course, as well as basics like pipelines and steps in ZenML, the library we will be using. But before that, we will make sure you understand why there is a need for MLOps, what the stages of MLOps are, and so on. So let's get started with this lecture.

First things first: there is exponential growth of data, and the importance of artificial intelligence has also increased over time. The data has increased, but we need to make sure we utilize that data in the right way and in a positive way, and that is where artificial intelligence comes in. Now you might be thinking: fair enough, we can just build a prediction model on top of it. But you should understand that machine learning is not just about building models. Why do I say this? Because your ML code, whatever model you train, is just 20% of your whole machine learning project, of the whole business problem. A lot of other things come into play, and I will prove to you throughout this video why the ML code is only 20%. ML in the industry is much more than training models.
That is validated by Jim Huen, one of the experts in MLOps, and it is also validated by Elon Musk, who said that machine learning engineering is just 10% machine learning and 90% engineering, which is something really interesting to sit with. You might notice that every other course online teaches only the machine learning part, which is building machine learning models, but who teaches the engineering part of it? You might be thinking it means data structures and algorithms, or design patterns, and of course those are factors, but there is a lot more to it than DSA, which we will explore throughout the course through our project.

In a typical corporate ML team, the following people are responsible for different tasks. Your data scientists discover the raw data, develop features, and train models. Your data engineers productionize the data pipeline; we will talk about the term "productionize" in a bit, but the data pipeline is where the data comes from and how it is made to work at large scale. Then you have an ML engineer who sits at the front to deploy the model so it can be used by users (we will talk about what deployment means in just a moment), then the service is integrated into your website or application, and then you have to monitor it; we will talk about each of these steps in great detail. And then you have a lawyer, to whom you should ask questions like: can I use this data for my model, yes or no? I am pretty sure you may not be familiar with many of these terms yet, so I will make sure you understand each of them, like training models, productionizing, deployment, integrating, monitoring, and all the rest, throughout this introductory lecture. The reason I am doing this introductory lecture is so that you understand every bit of the terminology the project will make use of.

So what does data science alone see? You might be thinking: okay, fair enough, you have pd.read_csv to read the data, you fit a preprocessor, you also fit a classifier, and then you call .predict and .score (there is a sketch of this view just below). That is what you see. But do you really think that by writing those three lines of code people will give you a job? Of course not. Ninety percent of the focus should be on engineering, and what engineering sees is much scarier than what data science sees. So, ML in production, if you are even somewhat aware of how it goes: the first step is you collect the data, you train the model, and you deploy the model in production. What does deployment mean? Let's talk a little about deployment so you understand the basics; we will have more sections afterwards to understand deployment in much greater detail. Deployment means that once you have the trained model, for example in an email spam detection project, the model currently sits on a local machine. How can you take that and integrate it into Gmail, so the model can make predictions for the users you are building it for?
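Here is that "three lines of code" view as a minimal sketch. This is a generic illustration of the point, not code from the lecture; the file data.csv and the label column are hypothetical.

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # the whole "data science only" picture: read, fit, predict, score
    df = pd.read_csv("data.csv")                       # hypothetical dataset
    X, y = df.drop(columns=["label"]), df["label"]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    clf = LogisticRegression().fit(X_train, y_train)   # .fit
    print(clf.predict(X_test[:5]))                     # .predict
    print(clf.score(X_test, y_test))                   # .score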
That is where deployment comes in: deployment means you have to make your local model available to the many users for whom you are building the model. You have to deploy the model online, and we will talk about deployment in very great detail in some time. But you might be thinking that is the whole process, so what does it actually look like? Basically, first you collect the data, you train the model, and then you deploy the model. Once your model is deployed, you go back to collecting data and training the model again, and this loop runs in the production environment. You might have several questions: what is the loop, what is a production environment, and so on. So let's talk in detail about what the loop means and what the production environment is. Take a very simple example: assume you have collected the data, trained the model, and deployed the code; assume your spam detection system is deployed in production and being used right now. Now suppose you change your machine learning algorithm, say from logistic regression to Naive Bayes. You have to go back, retrain, and then deploy that changed model, which means updating the model: that is one case, when a different model is needed or the ML algorithm changes. Another point: say you trained your spam detection project on a dataset that has become outdated, or new data arrives; for example, hackers or spammers change their strategy of sending spam emails. The data changes, and your model should be able to identify the new patterns the spammers are following. So if any data change happens, you retrain the model: new data comes in, the model is retrained, and then it is deployed again. That is why we call it a loop in the production environment; it is a never-ending process. You collect the data, train the model, and the deployment goes to production; if the model changes, you go back and push again, and if new data arrives, you go back to data collection and push again. I hope this makes sense; if it does not, don't worry, we will have a lot of examples to study.

Let me give one possible scenario of going back through the loop: model performance starts to decay. Once you train and deploy the model, after a certain period of time your model starts to decay. Take a fraud detection example: assume you have trained a model for fraud detection, you have deployed it as well, and you see your model is giving incorrect predictions.
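Here is a deliberately toy sketch of that collect-train-deploy loop. Every function below is a stub I made up for illustration; nothing here is a real library call.

    # hypothetical stubs standing in for real data collection, training, etc.
    def collect_data():        return "fresh data"
    def train(data):           return f"model trained on {data}"
    def deploy(model):         print("deployed:", model)
    def new_data_available():  return False   # e.g. spammers changed tactics
    def performance_decayed(): return False   # e.g. accuracy has dropped

    model = train(collect_data())
    deploy(model)
    # the never-ending production loop: either trigger sends us back around
    while new_data_available() or performance_decayed():
        model = train(collect_data())
        deploy(model)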
What most probably happened is that the fraudsters changed their strategy or patterns of fraud: the patterns the machine learning algorithm had learned have changed, so you need to recollect the data and retrain the model, which means going back and doing it again. And it may happen that after some time the hackers change their strategy yet again. That is what it looks like when model performance starts to decay: you go back into the loop, then retrain and redeploy the model. Another scenario: we might need to reformulate the problem when it is difficult to gather as much data as we need. Another is the violation of assumptions we made during training. Basically, when you train the model, we hold certain assumptions about the input data, for example that it will fall within a certain range; many assumptions come into play. If the assumptions that held for our training data change, we might need to reformulate them, or go back and accommodate the assumptions that are being violated. Or, simply, the business objective changes and basically everything restarts. So a lot of things can send you back through the loop, and it is a never-ending process: you have to continuously watch your model, monitor it, and so on.

So in ML in production, the data affects the output of the system, and it is very hard to make reliable: deploying, retraining, collecting, around the loop again and again, is very hard to make dependable, and that is where MLOps comes in. MLOps is a set of practices; it is not a library or a tool. It is a set of practices that aims to deploy and maintain machine learning models in production reliably and efficiently: to make sure that if anything changes in the data, the model is retrained (I am just taking one or two examples), and if the assumptions are violated, you go back again. We have to make this productionized loop reliable at large scale. The term MLOps is an extension of the DevOps methodology that includes machine learning and data science assets as first-class citizens within the DevOps ecosystem. I am pretty sure you might be a bit uncomfortable with that definition, so let's think about it another way, with a very simple example of MLOps. Assume you are given a game where you must build a beautiful city. If you build just one beautiful building in that city, is that enough, yes or no? A single beautiful building is not enough, because it needs electrical connectivity, it needs maintenance, it needs security systems, it needs connections to the roads and the railways, and a lot more. A single building is like a model: you have to connect it, secure it, and monitor it, among many other things, to make a fully functional city. And what do companies want? Full standalone cities, not a single building. That is exactly why people are not getting jobs: they are focusing only on constructing that one building, not the whole city. MLOps is the way of building the full city that is required.
We have talked about deployment, and you might be thinking it is very easy to deploy a model in production, but let me tell you that the trouble begins after deployment. You might wonder why, so let me walk through some of the things that need to be taken care of. The first one is accounting for latency. What is latency? You might be shocked by the statistic that 53% of visitors abandon a mobile site if it takes more than three seconds to load. So if your site takes more than three seconds to load, 53% of people will abandon it, and I will tell you why that matters: say you have deployed a 120-billion-parameter model, a very large model. Do you think that model will give a prediction in less than three seconds? That is really hard, and latency is one of the biggest problems: if visitors abandon the site serving your model, they are most likely not engaging with the brand, and most likely not going to buy or use your product. Another one is fairness. For example, Microsoft created a Twitter bot to learn from users, and it became a disaster: after deployment it started supporting various hateful, racist ideologies (I will not repeat them here), and Microsoft had to take it down within a matter of hours. They thought it would be so good, but eventually it learned very bad things, it would have needed retraining, and it never went into production again. Another one is the lack of explainability and auditability: it is very hard to explain a model's predictions, and we also have to make sure the system is trustworthy enough; that is why rules and guidelines keep coming, from the EU among others, to make sure we satisfy certain principles of AI. And finally, it is painfully slow. There was a survey conducted among data scientists about how much time they spend deploying machine learning models: 36% of them said they spend a quarter to half of their time deploying models, 20% said half to three quarters, and 7% said more than three quarters. It is very, very slow, and you will notice why when we build the project. I will be completely honest with you: when I was building the projects for this course, I built the whole model and data processing in just two days, and then spent a freaking whole week deploying it, because it is painfully slow. So that is the picture after deployment. Now let me talk a bit about model-centric versus data-centric development, and what exactly those terms mean.
Model-centric means you want to improve the model while not changing any data: you fix the data, so you have X amount of data and you freeze it, and then you iteratively improve your code or model by tweaking its parameters and expect the model to perform well. In data-centric development you do the opposite: you hold the model fixed and keep iteratively improving the data. A lot of the work out there is model-centric and only a little is data-centric, so I suggest you focus more on the data than on the model, though it is totally your choice. This is also said by Andrew Ng, again a real pioneer and, in a way, one of my teachers; I was in one of his webinars where he talked about model-centric versus data-centric ML, and it matches what we really experience in day-to-day life as data scientists.

So let's get started with the whole process of MLOps and what it includes. The first thing to ask is: what is the business problem we want to solve? That is the question to start from. Any MLOps project, any machine learning product, has to begin not with exactly what ML technique to use but with what business problem we are engaging with. Within that business problem we have to take care of several things, and the first is the cost of wrong predictions. Let's take a basic example: we want to forecast retail sales. What happens in a company is that because of wrong estimation, sometimes there is overstock of a particular product, which leads to wastage of resources, and sometimes there is understock, which leads to revenue loss. In both cases, understock or overstock of your products is a problem for a retail company, and you really want to solve it. So the first question is: what is the cost of wrong predictions if we do not estimate the right thing? Here it is quite high. Overstock, having more stock of your products than you can sell, leads to wasted resources and possible write-offs for unsold products; understock means missed sales opportunities and unsatisfied customers who cannot get things on time. Both carry high costs: on one side wasted resources, on the other missed sales. So if we solve this problem, we fix both the overstock and understock problems.

Let's break down the sales forecasting process. Basically, you decompose it into component tasks; notice that we have not jumped to the ML part right away, we are first talking about the problem we want to solve and then dividing it up. The sales forecasting problem divides into several tasks: first, data gathering; second, historical sales analysis; third, market trend analysis; and last, the actual forecasting. Data gathering means getting the required amount of data that
we need; historical sales analysis is the analysis of the past; market trend analysis asks what the trend of the market is; and the last one is the actual forecasting. So which of these can be solved with ML, and will it return a high ROI, a high return on the time we devote to it? Of course, all the tasks, data gathering included, have equal importance, but what we could eventually solve using ML is the actual forecasting: estimating what the stock levels should be in a certain time period. We can use AI/ML for this actual forecasting task, where it can analyze the past sales data and market trends to predict future sales with higher accuracy than whatever traditional methods are in use. So after dividing the problem into its components, we understand that actual forecasting is the piece we should solve for, by utilizing the past data and the market trends. The ROI can be estimated by the potential increase in sales and decrease in wastage due to improved forecasting: if your forecasting is good, you will notice a decrease in wasted resources, which means it is genuinely helping. Then you weigh what the cost of developing and maintaining the solution will be; if wasted resources are decreasing a lot, you might focus on building this AI/ML solution, and in that case you prioritize the implementation, which here is the actual forecasting.

Now, about structuring a project (I am just introducing this slide for now): there is a machine learning canvas you should really work through while building a project or solving a business problem using machine learning. The first element is the value proposition: define the problem, the importance of the problem, and who the end user is. You need to understand for whom we are building the product or service and who it will benefit; a useful template is the positioning statement "For {target customer} who {need}, our {product/service} is {product category} that {benefit}." Basically, we have to make sure the problem's importance is high enough to proceed with solving it. The second element is data sources: identify potential data sources, which can include internal databases, APIs, open datasets, and so on, and also consider hidden costs such as data storage, purchasing external data, and the like. The third element is the prediction task: is it a supervised or unsupervised problem, anomaly detection, classification, regression, or ranking? Just ask: what will my input be, what will my output be, what will the degree of model complexity be? This gives you much more clarity before the actual coding part. The next element is feature engineering, where you have to interact with domain experts. For example, you might be building something really valuable in the healthcare space, but you will need an MBBS doctor to get more information, understand the terminology, and extract more information from the available data sources; that is where feature engineering comes in. Then offline evaluation, which means you set up some metrics to evaluate your system before pushing it to deployment; pre-deployment means using the model on your own and understanding the prediction errors and what the cost of wrong predictions will be. Then, using predictions to make decisions: how will the end user interact with the predictions, and will it involve any hidden cost, such as human intervention? At last, we keep collecting new data for model retraining, to prevent the model's performance from decaying; we also consider the cost of data collection and the role of human intervention in data labeling, because having good labelers is very important for actually helping models extract patterns. Then you decide the frequency of model retraining and its associated hidden costs (at what interval will we retrain the model, and are there any changes in the tech stack to worry about), and you set up metrics to track the system: once your model is deployed you have to monitor it; for example, in spam detection you need metrics that keep checking whether your model is giving wrong predictions. Also identify situations where AI/ML may not be the best solution: there can be subtasks where you should look outside of AI/ML, and it is very important to ask whether we can solve something without it, because implementing an ML solution is hard and its cost is pretty big. So that is pretty much what we need to think through in the whole MLOps procedure.

Three main artifacts go into the workflow of building ML-based software: the first is data, the second is the machine learning model, and the third is code, and there are three main engineering phases: data engineering, model engineering, and code engineering. Let's talk about each, step by step. Data engineering means you have to collect the data, acquire it, and prepare it accordingly, while making sure of certain things along the way. The data engineering pipeline goes like this (a pipeline means a step-by-step procedure): first you ingest the data; then you explore and validate the data, checking that it comes from a trusted source and exploring it to understand it; you format and clean the data; you label the data if it is a supervised learning problem; and you divide the data into training, validation, and test sets so they can be used for training the models. I am assuming you already know about training and validation splits and all of that; I am not here to explain basic ML, I am here to explain the things that really matter.
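To make that data engineering pipeline concrete, here is a minimal sketch in plain pandas and scikit-learn. The file name sales.csv and the column names are hypothetical, not from the lecture.

    import pandas as pd
    from sklearn.model_selection import train_test_split

    # ingest
    df = pd.read_csv("sales.csv")                # hypothetical data source

    # explore and validate
    print(df.describe())
    assert df["units_sold"].ge(0).all()          # sanity check: no negative sales

    # format and clean
    df = df.dropna(subset=["units_sold"]).drop_duplicates()

    # split into training / validation / test sets (labels assumed present)
    train_df, temp_df = train_test_split(df, test_size=0.3, random_state=42)
    val_df, test_df = train_test_split(temp_df, test_size=0.5, random_state=42)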
The next phase is model engineering. The core of the ML workflow is writing and executing ML algorithms, and the pipeline goes like this: you train the model; you evaluate the model; you validate the model pre-deployment, making sure it is working well; you test the model on unknown, unseen samples that your model has never seen; and then you package the model so the business can use it, which can mean a .pkl file or similar. And last but not least, you deploy the model: you serve the model in a production environment, you monitor it so that things keep going well, and you record and log every prediction it makes so that you can go back if anything goes wrong. I hope you understand these three pipelines; we will go into greater detail when we actually build the projects and implement everything live, where you will see it much more easily, and we will use ZenML to develop, execute, and manage our machine learning systems.

Now I will talk about pipelines and steps in fairly brief detail and then move toward the projects, since this is already a long video; we will go deeper later on. So, what are pipelines? ZenML follows a pipeline-based approach to organize machine learning workflows (don't worry about the general setup for now; we will come to what exactly it is later). It is a methodology that promotes efficiency, reproducibility, and collaboration in your process. What is a pipeline? Think of a movie production process: a pipeline is a high-level workflow that organizes a series of tasks to create a final product. In the context of movie production, the tasks are script writing, casting, filming, editing, and distribution: casting depends on script writing, filming depends on casting, editing depends on filming, and distribution depends on editing. Everything is interrelated, everything is step by step; you cannot go from scripting straight to editing. Similarly, in ZenML your pipeline represents a complete ML workflow, and each step in it can be data preparation, feature engineering, and so on; feature engineering can only run once the output of the previous step is complete, and then you train the model, evaluate it, and deploy it. Here is a very basic example: step one, preparing the data, is a small function that loads the data, written using the step decorator; you have another function, also using the step decorator, that trains the model; another that evaluates the model (the "editing"); and another that deploys the model (the "distribution"). Now you combine all these steps into a pipeline using the pipeline decorator: you pass in the input data, give it to feature engineering, give the features to the model, give the model to the evaluation step, then deploy the model, and run this whole pipeline. It runs step by step and hands you the trained model. There are a lot of benefits to this, which we will discuss throughout the course.
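As a sketch of that shape, here is what such a pipeline can look like, assuming a recent ZenML release where step and pipeline are imported directly from zenml; the step bodies are placeholders, not the course's code.

    import pandas as pd
    from zenml import pipeline, step

    @step
    def load_data() -> pd.DataFrame:
        # "script writing": produce the raw material everything else depends on
        return pd.DataFrame({"x": [1, 2, 3], "y": [0, 1, 0]})

    @step
    def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
        # "casting": can only run once load_data has finished
        df["x_squared"] = df["x"] ** 2
        return df

    @step
    def train_model(df: pd.DataFrame) -> float:
        # "filming": a stand-in for real training; returns a dummy score
        return float(df["y"].mean())

    @step
    def evaluate_model(score: float) -> None:
        # "editing": inspect the result of the previous step
        print(f"score: {score}")

    @pipeline
    def demo_pipeline():
        df = load_data()
        features = engineer_features(df)
        score = train_model(features)
        evaluate_model(score)

    if __name__ == "__main__":
        demo_pipeline()   # executes the steps in dependency order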
In the next set of lectures I will introduce you to a couple of Colab notebooks, to make sure you are aware of the basic functionality of ZenML so that we can actually use it in the process, and then we will go on to building our first MLOps project. So let's get started with the ZenML introduction.

Hey everyone, welcome back to a new video in the MLOps course. Today I want to make you familiar with the core fundamentals of ZenML, because it is very important to understand its core concepts before we start building several projects with it. ZenML is an open-source library for building full-stack MLOps applications. The reason I want to use ZenML is that I personally worked there for about six to seven months with their core team, and it is super simple to use. You can use several other orchestrators available in the market, but among the easiest and best is ZenML, so that is what we will use; and if you hit problems, there is always a community where you can interact and resolve your doubts.

So let's get started with ML pipelines in ZenML. Today we will use a Colab notebook from ZenBytes, which the ZenML team has already built for us. With these notebooks the ZenBytes project aims to teach the core concepts, so I want to utilize them and record videos on top to make you understand some of the core concepts of ZenML. And don't worry, we will also be doing projects; this is just for core understanding, because as we say, the core is the power. First of all, we install the ZenML server, which is important for us. This is a command-line command that you would paste into the terminal if you are using VS Code, but you can also just get started in Colab; when we move to the projects you will see the way I do it there. We will also make use of scikit-learn, because I want to show you a demo by training a very simple model, and pyparsing, which is needed for Colab. I have already run the installs, so you need to do this but I do not, since it takes a bit of time to download. Next, you need an ngrok account if you want to see the visualizations and everything, which is pretty easy; you only need the ngrok account for Colab. If you are working in VS Code or plain Python you will have direct access, but for Colab you need ngrok. I had a token on screen here; I will hide it, I am sorry for that, so please create and use your own ngrok token. Cool. The next cell is just Colab setup; it is not something you have to learn. As the notebook says, you might be familiar with scikit-learn, PyTorch, or TensorFlow; an ML pipeline is simply an extension that organizes that work step by step, as in the movie
production process example I gave: scripting, casting, filming, and editing are steps that are interconnected with each other. The reasons we use pipelines are the following. First, we can easily rerun all our work, not just the model: you rerun each and every step from the start, which helps eliminate bugs and also makes our models easier to reproduce. Second, every pipeline run is tracked: normally, if you run your code once and then a second time, you no longer have access to the previous run, but with pipelines you have access to every run, each one tracked, which you can then use for several purposes, for example comparing two different versions of a model. Third, if the entire pipeline is coded up, we can automate many operational tasks, like retraining and redeployment, via CI/CD workflows. Don't worry if you did not fully understand that last line; when we do one simple project you will see how a pipeline helps us retrain and redeploy our model when anything in the data changes.

Okay, cool, so let's get started with ZenML. First of all, you need to have the ZenML library installed. We remove any existing files there and then initialize a ZenML repository, which is very important; it is the first step whenever you use the ZenML library. Running zenml init initializes ZenML in your current directory, and the exclamation mark in the notebook shows that we are dealing with terminal commands. Now I will show you what we will actually do: I am going to train a scikit-learn SVC, a support vector classifier, to classify images of handwritten digits. So basically we will do handwritten digit recognition using a support vector machine: there are several images, and each image is a handwritten digit from 0 through 9.
So basically we classify the images of handwritten digits into the numbers 0 through 9; if you are not sure what handwritten digit recognition is, you should first take a look online at what the problem is about. What we will do is load the data, train the model, test the model, and then look at the accuracy. However, this is not the full, rigorous way to do it; this is just for practice. Real problems can be 1000x more complex; this is a very basic, dummy version of the task, just to show you the mechanics. So: load_digits loads the dataset from scikit-learn's built-in datasets, then we reshape it, which is just a little bit of processing, and once the processing is done we divide our dataset into X_train, X_test, y_train, and y_test. The reason we divide it is so that we can train our model on X_train and y_train and then test the model on the test set. (Again, real machine learning modeling goes far beyond this; that is what we teach in the core machine learning course, so do not take the modeling here as inspiration.) Then we simply use train_test_split, create the support vector classifier, fit the model, and evaluate the model. Pretty simple: you run it, and you get the test accuracy.

Now, how can we divide this into a pipeline? With zenml init done in our ZenML repository, we will create our first pipeline, which will have the following components: it will import the data, train the model, and evaluate the model. So there are three distinct steps in this example: loading the data, training the model, and evaluating the model, and we simply write different functions for the different components. First we make use of the @step decorator, which you can simply import from zenml. The importer does not take anything; it only returns, and we annotate the types of what it returns. The importer will load the digits, reshape them, run the train-test split, and return X_train, X_test, y_train, and y_test. The reason we have to write out what it returns is that several things happen behind the scenes in ZenML: the trainer needs to know what type of input is coming to it (it will receive X_train and y_train), so ZenML verifies the data types the importer is sending. Stating what a step is going to return is also helpful for readability, and it helps the backend of the system. The outputs are annotated, with Annotated and np.ndarray: X_train is returned as a NumPy array, X_test as a NumPy array, and so on. This is the formal annotation that describes our outputs.
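For reference, before the conversion to steps, the plain scikit-learn baseline can be sketched like this; it is my reconstruction, not the notebook's exact cell.

    # minimal baseline: an SVC on the sklearn digits dataset
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    digits = load_digits()
    # flatten each 8x8 image into a 64-dimensional feature vector
    data = digits.images.reshape((len(digits.images), -1))

    X_train, X_test, y_train, y_test = train_test_split(
        data, digits.target, test_size=0.2, shuffle=False
    )

    model = SVC(gamma=0.001)
    model.fit(X_train, y_train)
    print("test accuracy:", model.score(X_test, y_test))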
Then we have another step, svc_trainer, again decorated with @step. It takes X_train, which we declare as a NumPy array, and y_train, also a NumPy array, and it returns the classifier: it trains the model and returns the fitted classifier. For the return type you can import ClassifierMixin from sklearn.base, which is the base type that scikit-learn classifiers mix in (you can easily search online for what ClassifierMixin is if the type confuses you); you could also just write SVC there as the model's type. The next step, the evaluator, takes X_test, y_test, and the model (typed as ClassifierMixin), returns a float, and is also decorated with @step; inside, it just computes the test accuracy score. There are of course several other classification metrics, but for now we take accuracy as a baseline. Now that we have the steps, we connect them. We use the ZenML @pipeline decorator, and the pipeline goes like this: first your X_train, X_test, y_train, y_test come from the importer you built; then you use the svc_trainer, giving it X_train and y_train from that step; and then you evaluate. Once you instantiate this digits pipeline and run it, it initiates a new run of the pipeline. I have already run it once, so this is version number two; in the dashboard you can visit version number one, revisit the accuracy there, and see the previous run as well. In the logs you can follow along: step importer has started, step importer has finished in 2.732 seconds, svc_trainer has started, svc_trainer has finished, the evaluator has started and finished, and then the whole digits run has finished. You can visualize your pipeline runs in the ZenML dashboard: run the cell and you will be prompted with a URL; go to it and you will land on the login screen, where the username should be default, so write "default" there and click Log in. Let me show you: I will run it quickly, and now it will train the second version, because we have re-initialized things. It starts the ZenML server and then opens the ZenML dashboard. Once it opens, you write "default" and click Log in; after logging in you can go to Pipelines and visualize your pipelines there.
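Putting the three steps and the pipeline together, here is my hedged reconstruction of the notebook's code, written against a recent ZenML API where step and pipeline import directly from zenml; the original ZenBytes notebook may use an older Output(...) syntax for multi-output steps.

    from typing import Tuple

    import numpy as np
    from sklearn.base import ClassifierMixin
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC
    from typing_extensions import Annotated
    from zenml import pipeline, step

    @step
    def importer() -> Tuple[
        Annotated[np.ndarray, "X_train"],
        Annotated[np.ndarray, "X_test"],
        Annotated[np.ndarray, "y_train"],
        Annotated[np.ndarray, "y_test"],
    ]:
        """Load the digits dataset, flatten the images, and split it."""
        digits = load_digits()
        data = digits.images.reshape((len(digits.images), -1))
        X_train, X_test, y_train, y_test = train_test_split(
            data, digits.target, test_size=0.2, shuffle=False
        )
        return X_train, X_test, y_train, y_test

    @step
    def svc_trainer(X_train: np.ndarray, y_train: np.ndarray) -> ClassifierMixin:
        """Train an SVC and return the fitted classifier."""
        model = SVC(gamma=0.001)
        model.fit(X_train, y_train)
        return model

    @step
    def evaluator(X_test: np.ndarray, y_test: np.ndarray,
                  model: ClassifierMixin) -> float:
        """Score the trained classifier on the held-out test set."""
        test_acc = model.score(X_test, y_test)
        print(f"test accuracy: {test_acc}")
        return test_acc

    @pipeline
    def digits_pipeline():
        X_train, X_test, y_train, y_test = importer()
        model = svc_trainer(X_train=X_train, y_train=y_train)
        evaluator(X_test=X_test, y_test=y_test, model=model)

    if __name__ == "__main__":
        digits_pipeline()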
The dashboard also gives you history: you can easily go to a previous run, open it, and see your model's score there, and so on. I hope you really understood what steps and pipelines mean in ZenML. These are just the basics; in the next lecture, 1.2, I will show you some of the magic ZenML can do, and I think you will be pleasantly surprised by that as well.

So let's get started with the new lecture, and the plan is pretty simple: first of all we will go ahead and understand what our data looks like, so that you get much more clarity about the problem statement we want to work on. I am not going to dwell too much on business objectives here; we will mostly focus on the technical aspects of building this MLOps project. Here you have the data: it is the Olist customers dataset, and it comes as a collection of tables. There is a customers table with customer ID, customer unique ID, customer city, and customer state; then you have a geolocation dataset; then an items dataset; and a lot more datasets besides. What we did was make our own custom dataset: if you go and look at it, it has a lot of features, combining everything into one table, and then it has the review score, which is our satisfaction score. I will quickly show you how it looks in an Excel sheet, because that is much more familiar than reading it in Visual Studio Code, where it currently looks a bit complicated. It takes a moment to open because the file is a little large; while it opens, note that I will also create several folders that are very important for us and that you can take as a template for starting off. Now that it has opened, you can see order ID, customer ID, order status, order purchase timestamp, order approved at, and a lot more features, and then finally you have the review score, which goes from one to five. For now we will not be using the review comments, and we will delete a lot of features; not because they do not hold importance, but because I do not want to make it complex initially. You can of course tweak it accordingly later, run it on the whole data, treat it as a full machine learning exercise, and so on; for now I want to keep it pretty simple. So this review score is our target variable, and all the rest are the input features we will use to predict the customer satisfaction score.

But before that, let's install the libraries listed in README.md. One note I have to make: you should perform all your installations and every operation in a virtual environment. I am currently in a virtual environment called customer-satisfaction. I personally use pyenv, but you can use conda,
venv, or literally any virtual environment tool you like for creating one. If you are not aware of what a virtual environment is: it isolates all of an application's dependencies into one environment so that dependency conflicts do not happen. If that does not fully land yet, we have linked a very nice resource in the GitHub repository, just before this section, to help you understand what a virtual environment means; it is very important to work inside one so that everything stays on the same page.

Cool, so let me quickly install things. First of all I pip install ZenML with the server extra; that part is for anyone who wants to run the whole project. Let me go and install ZenML first. You might see that it gives an error at first, so we have to quote the package spec, something like pip install "zenml[server]"; I hope that works, and if it does not we will have to go to the ZenML docs and check. Okay, cool, it is working, so it will take some time to download ZenML. I am installing it live here so that you can see exactly how these things work. I will also bring in the requirements.txt I have prepared, so you can follow along in detail: it includes things like catboost and lightgbm. We will not be learning about these algorithms; if you want to learn them, that is covered in my core machine learning course. This is just installing the libraries we need up front, and you can totally choose to ignore it, since we will be coding step by step so that you understand everything in greater detail. Now it shows the install is running, and I am pretty sure it has finished; it also says we have to upgrade pip, so I copy the upgrade command and run it. Honestly, the reason I like to upgrade is that the newer pip shows a pretty, colorful download display instead of the plain white one. I will clear the terminal quickly, and then I will go ahead and run zenml up. What does zenml up do? It brings up, or wakes, the ZenML server, so that you can view your pipelines and a lot of other things. But you can see it is not running, and the reason is that we forgot one very important thing beforehand: we have to run zenml init, which initializes the ZenML repository here. You will see that a .zen folder is created here as soon as the run completes, so let's wait a few seconds and let zenml init finish. The reason we want to create the repository is that we want to contain all our code inside it so it can be used for several other purposes, which we will appreciate later on. Okay, cool. It is the first run, which is why
it has been taking time, so let's go ahead and create the folders we need in the meantime. Now you see that the .zen folder is created, and it says: "your ZenML client version does not match the server version; the version mismatch might lead to errors or unexpected behavior, kindly refer to..." and so on. So let's do one thing: let's simply downgrade ZenML so the versions match, which should definitely clear this warning. I will tell you from my personal experience that it is very, very important to fix warnings, and the reason I want you to fix them is that sometimes you will later get a completely unexpected error and never realize that it originated here. So first make sure you resolve all the warnings; here it says the ZenML client version does not match the server version, so you either downgrade or do whatever else it takes to reconcile them. Leaving that to run, let's go ahead and create our folders. The first, the .zen folder, is already created. Next is the data folder. One thing we will do in the other projects of this whole course is stop using a CSV dataset; we will use PostgreSQL and retrieve the dataset from SQL, because in a real-world setting you do not usually use CSVs, you use SQL databases in the cloud or somewhere local, like PostgreSQL, retrieve the data from there, and work with it. We will do that later; for now let's keep it very simple and go ahead with the data folder. Another folder I want is a model folder, which will contain all my files for models and everything required for training the model; you can also name it "source", so let's name it src, which is more conventional. Now I will quickly create a folder named pipelines, which will contain all the pipelines we build; a saved_model folder, which will hold the model if you want to save it (you will not strictly need it, but we create it for reference); and a steps folder, which will contain all our components, the tasks that need to be done. At last, I create an __init__.py, there is always a requirements.txt, and I create run_pipeline.py, from which we can run our pipeline.

Now I will quickly start coding the data side first. As I said, this is not a formal machine learning engineering course; it is an MLOps course, so I will make sure to keep things very simple and clean (if you want to learn the more advanced machine learning material, the core machine learning course is out there to help you). The first thing we want to do is ingest the data, so we will start with steps: inside the steps folder, create a file named ingest_data.py.
This file will contain the step where we will ingest the data. I will quickly import logging here so that we can log when things complete, because logging is very important; then import pandas as pd; and then from zenml import step, as we saw in the basics of ZenML. Then I create a class called IngestData. (Oh my God, my keyboard is not working well, and I am using Copilot here, but I also want to give you good documentation, which we will write out nicely, so I will dismiss its suggestion and do it myself.) The class gets an __init__ that stores the data path, and then a get_data method that logs "ingesting data from the data path" and returns pd.read_csv(self.data_path). Alternatively, you can simply give pd.read_csv the file path directly; it totally depends on how you want to structure it. Then we create a step that uses this class. The step takes data_path, which will be a string of course, and it returns a very nice DataFrame. Inside, I will write a try statement. I really do not want to lean on Copilot for this part, because if it keeps helping me I cannot show you anything, so let me show you the way to write the documentation. First you write a description of the function: "ingesting the data from the data_path". Then you write Args, meaning the arguments it is going to take: data_path, which is the path to the data. Then what it returns: the pandas DataFrame. That is how you document ingest_data, and it is a very useful habit. Then we write the try/except workflow, which goes like this: we first instantiate our class, ingest_data = IngestData(data_path), then we simply say df = ingest_data.get_data(), and return df. This could easily be done in three lines or even one line of code, as shown by Copilot, but I want to keep it simple for the beginners watching. Then, except Exception as e: we log "error while ingesting the data" together with the error and raise it. This keeps us on the best practices of coding, and the same goes everywhere: we have to maintain it nicely. Let me also clean up the class docstrings: "ingesting data from the data path" for the class, plus a note that __init__ instantiates it and what its Args are; you do not need much more than that. So this is the basic workflow you should go through to create this step. The first step we have created is, of course, ingest data, and we will actually put it to use later on.
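Here is a cleaned-up sketch of ingest_data.py as described; it is a reconstruction, and details like the step name ingest_df and the exact log messages are my choices rather than the on-screen code.

    # steps/ingest_data.py
    import logging

    import pandas as pd
    from zenml import step


    class IngestData:
        """Ingesting data from the data_path."""

        def __init__(self, data_path: str):
            """Args:
                data_path: path to the data.
            """
            self.data_path = data_path

        def get_data(self) -> pd.DataFrame:
            logging.info(f"Ingesting data from {self.data_path}")
            return pd.read_csv(self.data_path)


    @step
    def ingest_df(data_path: str) -> pd.DataFrame:
        """Ingesting the data from the data_path.

        Args:
            data_path: path to the data.
        Returns:
            pd.DataFrame: the ingested data.
        """
        try:
            ingest_data = IngestData(data_path)
            df = ingest_data.get_data()
            return df
        except Exception as e:
            logging.error(f"Error while ingesting the data: {e}")
            raise e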
Once we have ingested the data, the next thing we need is to clean the data, so now we'll work on creating the step we'll use for cleaning. Let's quickly do one thing: first import logging, which is important, then from zenml import step, and then I create a step that cleans the data. It takes the DataFrame; I don't know yet what it will return, so let's leave that open for now and just pass. We want to make this step, the one that cleans the data. The next step I want to make is the one that trains our model, so I create model_train.py, and in it I'll create another step. Again the same thing: import logging, import pandas as pd, from zenml import step, and then you write the step, train_model, which takes its inputs, returns something, and trains the model. (My battery is low, I'm sorry for that.) So this is the model train file we need. Once model training is there, together with clean data and ingest data, the next step should be evaluation, so I create evaluation.py, and in it the same thing again: import logging, from zenml import step, and then define evaluate_model, which returns nothing for now. So those are the four steps we want to have. Now you might be saying that I haven't implemented anything; I will implement it. The first step is always to create a blueprint, so that the whole thing runs nicely, and wherever you go, you have to understand that the blueprint comes first. The stub files look roughly like the sketch below.
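A sketch of the blueprint stubs, with clean_data.py shown; model_train.py and evaluation.py follow the same shape with train_model and evaluate_model. The None return annotation is already included here, although in the video it only gets added after a type error a little later:

```python
# steps/clean_data.py - blueprint stub; the real cleaning logic comes later.
import logging  # imported now, used once the step is implemented

import pandas as pd
from zenml import step


@step
def clean_df(df: pd.DataFrame) -> None:
    """Cleans the data. Placeholder for now; it returns nothing yet."""
    pass
```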
Now I'll create a pipeline: the training pipeline, in pipelines/training_pipeline.py. From zenml we import pipeline and add the pipeline decorator, and then I'll create the training pipeline, which will consist of the following: it ingests the data, cleans the data, trains the model, and evaluates the model. Our training pipeline doesn't take much; most probably it just takes the data path as an input, and that's pretty much it. Then we import everything: from steps.ingest_data import ingest_df (let's make sure it follows the naming conventions; yes, ingest_df, cool); then from steps.clean_data import clean_df (I wasn't sure of the name, so let's check: clean_data, and we'll have clean_df, just to keep the good naming conventions); then from steps.model_train import train_model; and from steps.evaluation import evaluate_model (just making sure evaluate_model is there, cool). Once we have all of these steps, the pipeline body goes like this: df = ingest_df(data_path); then the cleaning, so clean_df takes df as input (fair enough, it returns nothing yet); after cleaning we have train_model, and then evaluate_model. For now everything just takes df as an input, so that the blueprint holds together. Once we have this, all that's left is to run the pipeline. How do we run it? Through the run_pipeline.py we created, so let's set it up as soon as possible: from pipelines.training_pipeline import train_pipeline (just train_pipeline, to keep following the naming conventions), and then under if __name__ == "__main__" we run the pipeline, passing the data path, which I'll just copy from my earlier path. There's a lot more you could do, like uploading to the cloud, which we'll get to later in the course. Cool, so let's run it: python run_pipeline.py. (When I code I usually listen to a lot of music, but not right now, sorry for that.) First problem: pandas isn't installed, so pip install pandas, quickly, because that's important. Then something happened around evaluation; right, I need import pandas as pd there, so let's go ahead. Next it says "module object is not callable": I imported pipelines instead of pipeline, bro, it's actually pipeline; fix that and move on. Then it gives a really interesting error: "wrong type for output of step clean_df". Why? It is expecting a pandas DataFrame, because we declared that the step returns a DataFrame, but it is actually giving None. So we have to write None as the return annotation here, and the same in the other stub steps; the framework expects each step to return what its annotation promises, and that's what was triggering the error. After those fixes, the blueprint pipeline and the runner look roughly like the sketches below.
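After those fixes, the two files look roughly like this; the CSV path is a placeholder, so point it at wherever your dataset actually lives:

```python
# pipelines/training_pipeline.py - blueprint version: every step just takes df for now.
from zenml import pipeline

from steps.clean_data import clean_df
from steps.evaluation import evaluate_model
from steps.ingest_data import ingest_df
from steps.model_train import train_model


@pipeline
def train_pipeline(data_path: str):
    df = ingest_df(data_path)
    clean_df(df)
    train_model(df)
    evaluate_model(df)
```

```python
# run_pipeline.py - entry point.
from pipelines.training_pipeline import train_pipeline

if __name__ == "__main__":
    train_pipeline(data_path="data/your_dataset.csv")  # placeholder path
```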
Now it only gives warnings, so we can go ahead; that's a pretty nice outcome. Let me explain what just happened. You can choose to completely ignore terms like stack, orchestrator, and artifact store for now; we'll explain them later. You'll see that ingestion has started, clean_df started and finished, evaluate_model started and finished, train_model started and finished; everything went through. Now I'll show you the very simple dashboard. Go to the dashboard, the username is default, and log in. When you're in, go to Pipelines (it might be super new for you), open the train pipeline, and go to the first run. Fair enough: ingest_df gives an output, which is the DataFrame, and if you open it you'll see the output along with some visualizations and its data types. You can see the data was imported; this stored thing is called an artifact. For every step you also get metadata: the name, the docstring (which is the documentation we wrote), the start time, the run time, and all those things, and then the outputs. An artifact is what's returned after every step; it is stored in some local store from which it can be retrieved later. You can see the URI where it is stored, and if you go to that particular location you'll find a very nice output sitting there. In the logs you might have noticed "using cached version" on a step; this is pretty interesting to understand, and we'll look at it in much greater detail; I'll show you a very nice example of it, just wait. Then you have clean_df, which finished, and evaluate_model and train_model; you can see that ingest_df gives its output while clean_df does not return anything yet, and there's a visualization you can for sure see for each. So now we can say our dashboard is working, our pipeline is running, and we are good to go with it. Cool. One thing I want to make sure you're aware of is caching. So what if I set enable_cache to False on the pipeline? Let's run it and look at the dashboard again: go to Pipelines, pick the latest run, and you'll see another version is there, version number four. This time ingest_data started, ingested the data, and finished from scratch. But in the earlier run something really interesting happened: "using cached version of ingest_df". ZenML has an amazing, super useful feature here: if nothing changes in the data, and nothing changes in the code of a step, it will reuse that step's output from the previous run. You can see how interesting this is at the log level: nothing changed, so we ended up reusing those results
because caching was enabled. When caching is disabled, it runs every step fresh, stage by stage: ingest started, ingest finished. If I set enable_cache back to True and run it again, it uses the cached version of ingest_df, clean_df also hasn't changed, and it evaluates and trains the model in a matter of seconds. You see how good this is: say you're retraining a large language model and this feature is there; you'll be super happy that the cached version was used. Sometimes caching causes errors, but most of the time it works like a charm. Okay, that's pretty much it; I hope you understood most of it. In the next session I'll implement all these steps and run them one by one, and afterwards I'll deploy the model using MLflow. I'll show you how we can use the MLflow experiment tracker, and then we'll build a very basic Streamlit application that uses the deployed model to make predictions; we'll use the MLflow deployment and MLflow tracking libraries and integrate them into ZenML. So we have the blueprint ready. What you've learned here: first, a way to write and structure the code; and second, a very important thing, that it's always good to start by preparing a blueprint and only then start coding it. I hope it really made sense to you. I'll be catching up with you in the next video, bye.

Hey everyone, welcome back to another video. What I'm going to achieve through this video is to implement all the steps listed out here, the clean data, ingest data, and evaluation stuff, and we'll do it in a very nice way: I'll show you the way to write code nicely by using design patterns. I hope you're already aware of design patterns before starting this project; if not, we have very nice resources, which will of course be linked before these lectures, covering the basics of patterns like the strategy pattern, factory pattern, and singleton pattern. (If you're not aware of these patterns, we also teach them in our core machine learning course, which you could consider enrolling in, or we'll add material before this section as well.) So let's get started with the actual implementation of data cleaning. In src we'll implement all the classes behind these steps, and then use those classes from within the steps. The first thing I really want to develop is data cleaning, which is obviously something we have to work on, so let's get started with creating the data cleaning classes. I'm going to start off by importing logging, in case we really need to log anything; then from abc import ABC and abstractmethod; and then from typing import Union. I'll import some basic libraries now and more as we need them, so import pandas as pd as well,
and from sklearn.model_selection we'll need train_test_split, because we are going to split the data here as well. Now I'll create an abstract class: an abstract class for defining a strategy for handling data. You might already be aware of the strategy pattern from several examples. This class, DataStrategy, will be an abstract class, documented as "Abstract class defining strategy for handling data". Okay, cool. In it we'll create an abstract method; the reason why we do it is probably already known to you: it makes sure that every data handling strategy exposes the same handle_data interface, so the data cleaning class can call any of them interchangeably. When we work on the other data cleaning strategies, you will see how handy this is. handle_data takes the DataFrame (data: pd.DataFrame), because I expect a DataFrame, and it should return either a pandas DataFrame or a Series; so, as I said, from typing import Union, and we annotate the return as Union[pd.DataFrame, pd.Series]. That is what it is going to return. However, this is just an abstract class, just a blueprint; it's what we implement in our strategies, where we override this method with our own custom solutions. So let's first start building the data preprocessing strategy. DataPreProcessStrategy inherits DataStrategy, the abstract class, so that we can override handle_data: it takes the DataFrame as input and returns the DataFrame as output. We don't need logging just yet, and we'll wrap the body in try/except. First of all, I'll drop certain columns from the data. As I already told you, we want to make this super simple, which is why I'll drop several columns; it's not that they are unimportant, they are actually very important, but just for simplicity in this project I'm going to delete some of the columns from the data, because that matters more right now. I have already written down the names of the columns I have to delete, so I'll just copy them over quickly; you can see they are the order_approved_at, order_delivered kind of timestamp columns, which is what's required here. Then data.drop with those columns drops them, and we simply go ahead with it. Cool. I really hope it works nicely; good. Next, there are certain columns that have null values, so I'll quickly fill them up. You can handle this in two or three ways, and you decide which when the data
is analyzed during the EDA part. I have already done the EDA on my side; as I said, you could do the EDA yourself and see which columns need what, but I did it beforehand for simplicity, so that it makes sense for you to get started directly with the project. So, there are certain columns that have null values, and we'll make them work. For the numeric ones, we fill each with the median of that column, with inplace=True; median means we take the median of the column, and inplace means we permanently apply it to our data. Then there is review_comment_message, whose null values we fill with "No review", because there are several missing values in the data and a constant string works there. That's very cool. Next, we will keep only the columns that are of numeric type, because we'll train our model on numbers. Understand the reason why I'm doing this: selecting only numbers is on purpose, because I just want to keep this project simple, so that I don't need to apply a lot of extra processing steps. So data = data.select_dtypes(include=[np.number]): the data now has only the numeric columns, so we don't need to worry about categorical encoding, ordinal encoding, or even tokenization; that is all removed. But you can do a lot of things here: you don't have to remove these columns, you could implement other processing strategies where you encode the data, tokenize the review comment messages, and much more. We'll also drop a couple more columns, and the columns to drop are the following: the first is customer_zip_code_prefix, and then order_item_id. The reason we drop these is that they are not important for us at all. So data = data.drop with those columns, return the data, and in the except branch log the error and raise e. Cool. So, you might be wondering what I did here: first, I dropped certain columns that are not required for us as of now, to keep the project simple; second, I filled up the null values that were present in those columns; then I selected only the data of numeric type (we're not handling the categorical data, just for the simplicity of the project); then I dropped a couple of remaining columns; and then I just return the data. That's pretty much what we're doing, and that's pretty nice. The whole preprocessing strategy looks roughly like the sketch below.
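Putting that together, the preprocessing part of src/data_cleaning.py looks roughly like this. The timestamp columns dropped and the median-filled column are stand-ins for whatever your own EDA flags; customer_zip_code_prefix, order_item_id, and the "No review" fill are the ones named above:

```python
# src/data_cleaning.py - strategy pattern for data handling (sketch).
import logging
from abc import ABC, abstractmethod
from typing import Union

import numpy as np
import pandas as pd


class DataStrategy(ABC):
    """Abstract class defining strategy for handling data."""

    @abstractmethod
    def handle_data(self, data: pd.DataFrame) -> Union[pd.DataFrame, pd.Series]:
        pass


class DataPreProcessStrategy(DataStrategy):
    """Strategy which preprocesses the data."""

    def handle_data(self, data: pd.DataFrame) -> pd.DataFrame:
        try:
            # Drop columns we don't need for this simple project
            # (placeholder names; use whatever your EDA tells you to drop).
            data = data.drop(["order_approved_at", "order_purchase_timestamp"], axis=1)
            # Fill numeric nulls with the median of the column (again, the
            # exact columns come from your EDA).
            data["product_weight_g"].fillna(data["product_weight_g"].median(), inplace=True)
            # Fill missing review text with a constant string.
            data["review_comment_message"].fillna("No review", inplace=True)
            # Keep only numeric columns so we can skip categorical encoding.
            data = data.select_dtypes(include=[np.number])
            # Drop the two ID-like columns named in the video.
            data = data.drop(["customer_zip_code_prefix", "order_item_id"], axis=1)
            return data
        except Exception as e:
            logging.error(e)
            raise e
```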
Now I'll create another strategy: the data divide strategy. DataDivideStrategy also inherits DataStrategy, and in it we quickly create the strategy for dividing the data into training and testing sets. Again we implement handle_data: it takes the DataFrame, and it returns the Union of pandas DataFrame and Series; you'll notice why I say Union, meaning both of them. Let me quickly explain what happens inside (this is all Copilot, that's why I love him): X is the data with the target variable dropped, and y is the target variable; then train_test_split gives us X_train, X_test, y_train, y_test, with test_size=0.2 and random_state=42. X_train is a pandas DataFrame, X_test is a pandas DataFrame, y_train is a Series, and y_test is a Series; that's why the output is the combination of both types. I hope that makes pretty good sense. Now, once we have that, we'll make a final class that utilizes both of these strategies. We'll create another class, DataCleaning: the data cleaning class, which preprocesses the data and divides it into training and testing sets. I'll quickly create its __init__, which takes self, the DataFrame, and the strategy you want to apply; the strategy's type is DataStrategy, our abstract class, so it can be either the data preprocess strategy or the divide strategy. Okay, cool, so then self.strategy = strategy, and we'll have another method (sorry, not class), handle_data, which returns either the Union output or a simple pandas DataFrame: it just returns self.strategy.handle_data(self.data). So the strategy is pluggable; for example, if someone chooses the DataDivideStrategy, the same class runs that one instead. Someone could use this class like so, under if __name__ == "__main__": say you read the CSV file (we're not going to do it right now), then construct DataCleaning by instantiating it with the data and the DataPreProcessStrategy; in that case the data cleaning uses the preprocess strategy, and its handle_data method runs it. In the same way you can give it the other strategy, the divide strategy, and it works just as nicely. I hope you really understood: this way of doing it is called the strategy pattern, where you first create the abstract class, then several strategies implementing it, which here are data preprocess and data divide, and then you create the final class that makes use of those strategies. This is actually very helpful for flexible code writing, for readable code, and for not writing so many if/else statements. A sketch of the divide strategy and the DataCleaning class is below.
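Continuing the same src/data_cleaning.py module; the target column name here is a placeholder, so use whatever your label column actually is:

```python
# src/data_cleaning.py (continued) - divide strategy plus the class that uses them.
from sklearn.model_selection import train_test_split


class DataDivideStrategy(DataStrategy):
    """Strategy for dividing the data into training and testing sets."""

    def handle_data(self, data: pd.DataFrame) -> Union[pd.DataFrame, pd.Series]:
        try:
            X = data.drop(["review_score"], axis=1)  # placeholder target name
            y = data["review_score"]
            X_train, X_test, y_train, y_test = train_test_split(
                X, y, test_size=0.2, random_state=42
            )
            return X_train, X_test, y_train, y_test
        except Exception as e:
            logging.error(e)
            raise e


class DataCleaning:
    """Class which preprocesses the data and divides it into train and test."""

    def __init__(self, data: pd.DataFrame, strategy: DataStrategy):
        self.data = data
        self.strategy = strategy

    def handle_data(self) -> Union[pd.DataFrame, pd.Series]:
        return self.strategy.handle_data(self.data)


# Example usage:
#   df = pd.read_csv("data/your_dataset.csv")
#   data_cleaning = DataCleaning(df, DataPreProcessStrategy())
#   processed = data_cleaning.handle_data()
```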
Now let's quickly implement this in the clean_data step; that's what matters for us next. clean_df takes the DataFrame as an input, and then we go ahead: first of all we import what we need, from src.data_cleaning import DataCleaning, DataDivideStrategy, and DataPreProcessStrategy. Cool. Once we import these, we use them inside a try/except. First we create the preprocess strategy: process_strategy = DataPreProcessStrategy(); then data_cleaning = DataCleaning(df, process_strategy), instantiating the class by giving it this strategy; and then processed_data = data_cleaning.handle_data(). What is that actually doing? We have the class, we give it the strategy we want to use, the data preprocess strategy, and then we call that strategy's method, handle_data, which handles the data. Then we have the divide strategy: DataDivideStrategy(), again wrapped in DataCleaning, but this time with the processed data instead of the raw data. Now we can simply unpack X_train, X_test, y_train, y_test from handle_data(), because it returns pandas DataFrames and Series. Then we log that data cleaning is completed, and in the except branch we log the error and raise e. Now, one thing is missing: is the step returning None? No, it's returning X_train, X_test, y_train, and y_test, so we need to declare that. We'll use Annotated, which lets us attach names to type hint parameters: from typing_extensions import Annotated (the formal way), and we also have to import Tuple, so from typing import Tuple. Then the return type is a Tuple where the first output is Annotated[pd.DataFrame, "X_train"], then Annotated[pd.DataFrame, "X_test"], and then "y_train" and "y_test" annotated as Series. So basically we are done: the step returns four outputs, DataFrame, DataFrame, Series, and Series, annotated using Annotated from typing_extensions. I hope that makes sense; the type error I was seeing was mostly because of this, and this fixes it. Now we can give it a very basic docstring: "Cleans the data and divides it into train and test", with Args: raw data, and Returns: training data, testing data, training labels, and testing labels. So now we have this step ready for us, using several strategies; the full sketch is below.
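The full step, roughly:

```python
# steps/clean_data.py - the implemented cleaning step.
import logging
from typing import Tuple

import pandas as pd
from typing_extensions import Annotated
from zenml import step

from src.data_cleaning import DataCleaning, DataDivideStrategy, DataPreProcessStrategy


@step
def clean_df(df: pd.DataFrame) -> Tuple[
    Annotated[pd.DataFrame, "X_train"],
    Annotated[pd.DataFrame, "X_test"],
    Annotated[pd.Series, "y_train"],
    Annotated[pd.Series, "y_test"],
]:
    """Cleans the data and divides it into train and test.

    Args:
        df: raw data
    Returns:
        X_train, X_test: training and testing data
        y_train, y_test: training and testing labels
    """
    try:
        process_strategy = DataPreProcessStrategy()
        data_cleaning = DataCleaning(df, process_strategy)
        processed_data = data_cleaning.handle_data()

        divide_strategy = DataDivideStrategy()
        data_cleaning = DataCleaning(processed_data, divide_strategy)
        X_train, X_test, y_train, y_test = data_cleaning.handle_data()
        logging.info("Data cleaning completed")
        return X_train, X_test, y_train, y_test
    except Exception as e:
        logging.error(e)
        raise e
```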
The next thing we'll work on is model development, which is pretty important. We'll make use of linear regression, which we will implement right away from here, so that it makes sense for you to get started: we'll implement a basic LR model. There is a lot more you could implement; in the repository for this project you'll find models like random forest, XGBoost, and CatBoost implemented, and after that we'll evaluate our model. Currently we are not focusing on core machine learning; we are just focusing on building a full MLOps project, so you can extend it to more complex situations later. The other pieces we have are evaluation, where we'll write the evaluation measures and then make the steps for them, and then we are mostly done; except that one thing is left, which is the deployment pipeline. We'll also deploy the pipeline, and you will be amazed to see the way we deploy it and the way we run it; we'll also use a Streamlit application to go along with this deployment. I hope that really makes sense; let's catch up in the next video.

So now, everyone, I'll go to the next step, which is model development; it's pretty important as well, so let's get started with it quickly and try to complete this project as soon as possible. Create model_dev.py, and in it I'll again create an abstract class and then extend that abstract class, so from abc import ABC and abstractmethod. Let's start off with it: we create a class Model, which takes ABC; this is the abstract class for all models, documented as "Abstract class for all models". Then we create an abstract method, called train, and that train takes X_train, which is the training data, and y_train, the training labels. We could also create a method known as optimize for tuning, but it's not required as of now, so let's just leave it. Then I'll create a very simple class; see my point again, I keep emphasizing it: first focus on learning about MLOps, and then on implementing complex models and stuff. So I'll just make a simple linear regression model on top of it: it takes X_train and y_train, plus **kwargs. For the training, we first import from sklearn.linear_model import LinearRegression, and let's name the class with Model in it, LinearRegressionModel, since that makes much more sense. Then quickly: reg = LinearRegression(**kwargs), then reg.fit(X_train, y_train), and return the regressor. We can also put this inside a try/except. Okay, that's very nice; the whole file looks roughly like the sketch below.
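A sketch of the file:

```python
# src/model_dev.py - abstract model class plus a simple linear regression (sketch).
import logging
from abc import ABC, abstractmethod

from sklearn.linear_model import LinearRegression


class Model(ABC):
    """Abstract class for all models."""

    @abstractmethod
    def train(self, X_train, y_train):
        """Trains the model on the given training data and labels."""
        pass


class LinearRegressionModel(Model):
    """Linear regression model."""

    def train(self, X_train, y_train, **kwargs):
        try:
            reg = LinearRegression(**kwargs)
            reg.fit(X_train, y_train)
            logging.info("Model training completed")
            return reg
        except Exception as e:
            logging.error(e)
            raise e
```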
One note: in the next project you'll see that model development is much more complex, because you first have to train the model, validate the assumptions, test whether things are working or not, tweak the data, and do feature engineering and cleaning; we'll do all of that in the next project, so you don't need to worry about it here. So this is our model development: a linear regression model that we simply fit, and with that the training of the model is complete. Let's quickly go to the model_train step and implement things there. I'll import from src.model_dev the LinearRegressionModel. Then the step will take several inputs: X_train as a pandas DataFrame, then X_test, y_train, and y_test. And what will it return? It will return that linear regression model; however, there is something known as RegressorMixin, so from sklearn.base import RegressorMixin. RegressorMixin is scikit-learn's base type for regressors, and since of course we are going to output a regression algorithm, that's our return annotation: the step trains the model and returns it. That's pretty much the task. We start with model = None, and we'll also make a config.py. config.py has from zenml.steps import BaseParameters, and we create something nice, ModelNameConfig, which extends BaseParameters and contains the model configuration we want to add: first of all, the model_name, i.e., which model we want to use. Back in model_train we import from .config import ModelNameConfig, and the step will also take config, typed as ModelNameConfig. So the config carries the settings: if config.model_name is "LinearRegression", we use that model, the LinearRegressionModel, and just train it on X_train and y_train; so trained_model = model.train(X_train, y_train), and the step returns the trained model. Otherwise we raise a ValueError saying the model name is not listed or not supported. The reason why I do it this way is that you might implement other models as well: you can just go ahead and implement a class RandomForestModel, and then, without changing anything else, say that if config.model_name is "RandomForestRegressor", you train that model instead. This is how it works; the sketch below shows both files.
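Sketches of both files; how the config object gets filled in can differ between ZenML versions, so treat the BaseParameters usage here as the pattern from the version used in this course:

```python
# steps/config.py - step parameters.
from zenml.steps import BaseParameters


class ModelNameConfig(BaseParameters):
    """Model configuration."""

    model_name: str = "LinearRegression"
```

```python
# steps/model_train.py - the training step.
import logging

import pandas as pd
from sklearn.base import RegressorMixin
from zenml import step

from src.model_dev import LinearRegressionModel
from .config import ModelNameConfig


@step
def train_model(
    X_train: pd.DataFrame,
    X_test: pd.DataFrame,
    y_train: pd.Series,
    y_test: pd.Series,
    config: ModelNameConfig,
) -> RegressorMixin:
    """Trains the model chosen in the config on the ingested data."""
    try:
        model = None
        if config.model_name == "LinearRegression":
            model = LinearRegressionModel()
            trained_model = model.train(X_train, y_train)
            return trained_model
        raise ValueError(f"Model {config.model_name} not supported")
    except Exception as e:
        logging.error(e)
        raise e
```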
You don't need to worry about a lot of things here; it's very simple to understand. Just keep the body in a try/except, with logging.error on the exception and raise e, as in the sketch. Good, so that's it about the training of the models. Next we'll quickly create the evaluation part, so let's go ahead and create evaluation.py in src. Again, we'll create a very basic abstract class here and then extend that abstract class into the strategies we want to use: from abc import ABC and abstractmethod, and then class Evaluation(ABC); it is an abstract class defining the strategy for evaluating our models. Then we have an abstract method, calculate_scores, which calculates the scores: it takes y_true, which is a numpy ndarray (so just import numpy as np), and y_pred, which is also an np.ndarray; the ground truth and the model prediction. Now I'll simply go ahead and create several strategies for it. The first strategy I shall create is MSE, and MSE inherits the abstract class Evaluation: this is the evaluation strategy that uses Mean Squared Error. Then we create calculate_scores, which again takes self, y_true, and y_pred, both numpy arrays of course, as copied from the abstract method. We start by logging that we've entered the MSE calculation, and then we can simply use scikit-learn's metrics: from sklearn.metrics import mean_squared_error and r2_score. So we have mse = mean_squared_error(y_true, y_pred), we log that it is done, and we return the MSE; otherwise, if anything goes wrong, we log "Error in calculating scores" and raise. That's pretty much it, it returns the MSE, and one strategy is done. Then we go and create another strategy, R2: this is the evaluation strategy that uses the R2 score, and we calculate its score the same way; Copilot fills most of it in automatically. Of course you can add your documentation on your own here; I'm not adding it right now, so please add it yourself, the way I have taught you to. Then we'll have one more evaluation strategy, RMSE, the evaluation strategy that uses the root mean squared error: again it's mean_squared_error, but with squared=False, so it calculates the root mean squared error. So now we have RMSE done too, and all our evaluation strategies are complete; src/evaluation.py looks roughly like the sketch below.
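A sketch of src/evaluation.py; note that mean_squared_error(..., squared=False) gives RMSE on the scikit-learn versions of this era, while newer releases moved this to a separate root_mean_squared_error function:

```python
# src/evaluation.py - evaluation strategies (sketch).
import logging
from abc import ABC, abstractmethod

import numpy as np
from sklearn.metrics import mean_squared_error, r2_score


class Evaluation(ABC):
    """Abstract class defining strategy for evaluating our models."""

    @abstractmethod
    def calculate_scores(self, y_true: np.ndarray, y_pred: np.ndarray):
        """Calculates scores from the ground truth and the model predictions."""
        pass


class MSE(Evaluation):
    """Evaluation strategy that uses Mean Squared Error."""

    def calculate_scores(self, y_true: np.ndarray, y_pred: np.ndarray):
        try:
            logging.info("Calculating MSE")
            mse = mean_squared_error(y_true, y_pred)
            logging.info(f"MSE: {mse}")
            return mse
        except Exception as e:
            logging.error(f"Error in calculating scores: {e}")
            raise e


class R2(Evaluation):
    """Evaluation strategy that uses the R2 score."""

    def calculate_scores(self, y_true: np.ndarray, y_pred: np.ndarray):
        try:
            logging.info("Calculating R2 score")
            r2 = r2_score(y_true, y_pred)
            logging.info(f"R2: {r2}")
            return r2
        except Exception as e:
            logging.error(f"Error in calculating scores: {e}")
            raise e


class RMSE(Evaluation):
    """Evaluation strategy that uses Root Mean Squared Error."""

    def calculate_scores(self, y_true: np.ndarray, y_pred: np.ndarray):
        try:
            logging.info("Calculating RMSE")
            rmse = mean_squared_error(y_true, y_pred, squared=False)
            logging.info(f"RMSE: {rmse}")
            return rmse
        except Exception as e:
            logging.error(f"Error in calculating scores: {e}")
            raise e
```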
The last thing to do is very simple: actually implement this in the evaluation step. First of all I import from src.evaluation the MSE, RMSE, and R2 classes; R2 is there, cool. The evaluate_model step takes a few things. First it takes the model, whose type will be RegressorMixin, so we have to import that; this is the regression base type, because it is a regression model. Then we get X_test, which is a pandas DataFrame, and then y_test. Now let's try to quickly implement the solution. First of all we get the predictions: prediction = model.predict(X_test), so the model predicts on X_test. Then we create the MSE class instance, mse_class = MSE(), and we use it: mse = mse_class.calculate_scores(y_test, prediction); you simply give it y_test and the predictions and you're done. Then we have the R2 class, and after that we calculate its score, and then the RMSE class with its calculate_scores, and that's pretty much it. Cool. Let's return at least two things: the R2 score and the RMSE, because those are the more useful metrics to look at. And we can put the body into a try/except with an "error in evaluating the model" log. So we are mostly done; but one thing is left here. Guess what it is: we are returning the R2 score and the RMSE, so we also have to indicate what the step returns. I'll import from typing import Tuple and from typing_extensions import Annotated, so the return type will be a Tuple of two annotated floats, the r2_score as well as the rmse. Now I hope it makes a little more sense; I really hope so. The evaluate_model step looks roughly like the sketch below.
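A sketch of the step:

```python
# steps/evaluation.py - the evaluation step (sketch).
import logging
from typing import Tuple

import pandas as pd
from sklearn.base import RegressorMixin
from typing_extensions import Annotated
from zenml import step

from src.evaluation import MSE, R2, RMSE


@step
def evaluate_model(
    model: RegressorMixin,
    X_test: pd.DataFrame,
    y_test: pd.Series,
) -> Tuple[
    Annotated[float, "r2_score"],
    Annotated[float, "rmse"],
]:
    """Evaluates the trained model on the test data."""
    try:
        prediction = model.predict(X_test)

        mse = MSE().calculate_scores(y_test, prediction)  # computed and logged
        r2 = R2().calculate_scores(y_test, prediction)
        rmse = RMSE().calculate_scores(y_test, prediction)

        return r2, rmse
    except Exception as e:
        logging.error(f"Error in evaluating the model: {e}")
        raise e
```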
With evaluate_model done, we are pretty much finished with ingesting the data, cleaning the data, training the model, and evaluating the model; everything is implemented. Now what's left? Wiring it into the pipeline, so let's go into the training pipeline, update it, and run it right away. I'll keep enable_cache set to True, and the pipeline takes the data path; that's good. clean_df takes the DataFrame, and what does it return? It returns X_train, X_test, y_train, y_test, so let's quickly unpack those from clean_df. Next is train_model: what does train_model take? It takes X_train, X_test, y_train, y_test, and the config as well. So model = train_model(X_train, X_test, y_train, y_test), and after that, r2_score, rmse = evaluate_model(model, X_test, y_test); I hope it really is X_test and y_test, and yes, that's right. So now we have the pipeline ready, we have everything ready; let's go and run the whole pipeline to see the magic. I'm pretty sure it will give some sort of error, but always be on the positive side, so let's just quickly run it. First error: no module named scikit-learn, so pip install scikit-learn (I'll use the easier install), and we wait for it to finish, because I'm actually using a new environment. Please activate your environment before working on the project; that's my request to all of you. Let's wait and see the magic; just a few seconds, and after that we're mostly done with the pipeline side of this MLOps project, and you'll see the dashboard. The next things left after this are the integration of experiment tracking, for which I'm using MLflow, and the deployment of our model using the MLflow deployer; once those two are done, we're mostly finished with the project. And you're seeing the way we do the project, the caching, the way we write the code; in the next set of projects you'll see much more challenging code and topics, which eventually you'll learn by yourself. Hmm, it's not running: "cannot unpack non-iterable StepArtifact object", so something interesting really happened here. Let's quickly go and see what the error is. clean_df expects the DataFrame and should return X_train, X_test, y_train, y_test, so check whether it is actually returning anything: and you see, it is not; we have to actually add the return statement. That's what the message is saying: the step yields None while the pipeline asks it to unpack the outputs. Add the return and run again: ingest has started, clean data completed, then it goes ahead and model training completed, and then something failed in the pipeline: "R2 is not defined". So let's go to the evaluation step: it's the class R2, not R2Score, so fix that name as well. Now watch how quickly it runs: it uses the cached versions of the earlier steps, only runs evaluate_model, and you're done; you see the magic. (You can simply pip install pyarrow to remove the warning it shows.) So let's quickly bring up the ZenML dashboard with zenml up and open it: log in with default, go to Pipelines, the train pipeline, and this run. It ingests the data and gives the output; clean data returns the clean_df outputs; those go into training the model, which returns the regressor; and X_test and y_test go into evaluating the model, which gives the R2 score and the RMSE. The fully wired pipeline now looks roughly like the sketch below.
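A sketch of the rewired pipeline; I'm passing the config explicitly here, which is one way to do it, though depending on your ZenML version the default ModelNameConfig may also be filled in for you:

```python
# pipelines/training_pipeline.py - fully wired version.
from zenml import pipeline

from steps.clean_data import clean_df
from steps.config import ModelNameConfig
from steps.evaluation import evaluate_model
from steps.ingest_data import ingest_df
from steps.model_train import train_model


@pipeline(enable_cache=True)
def train_pipeline(data_path: str):
    df = ingest_df(data_path)
    X_train, X_test, y_train, y_test = clean_df(df)
    model = train_model(X_train, X_test, y_train, y_test, config=ModelNameConfig())
    r2_score, rmse = evaluate_model(model, X_test, y_test)
```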
You see how magical this is, how really interesting these things have become. I really think this is the power, this is the future of ML; if you don't know about this, you're missing a lot, so I hope you'll dig into it in much greater detail. You are mostly done with the project; however, two things are left, which are deployment and the tracking of our experiments. I hope this makes sense; I'll be catching up with you in the next lecture, bye.

So hey everyone, let's come back to our project. The project is left with two things: the first is our favorite, the experiment tracker, and the second is the deployment of our model. I'll talk about what this experiment tracker means, and later about the deployment pipeline, so let's first talk about the experiment tracker. When you do data science or real-world machine learning engineering, you'll most probably see that you want to track every run you do, because you have to tweak the parameters, rerun, check the score against the previous one, compare several metrics, and see how well it was performing in the 38th run, or even in the first run. So we need to track every experiment we do here. Where should we implement our experiment tracker? It will be implemented over the train model step, so let's quickly implement the experiment tracker there. Go to model_train (and I'm so sorry for the background noise; this is India, and you keep hearing these voices). I'll simply import mlflow. Once we import mlflow, we need the experiment tracker object: before that we import the Client from ZenML, and then experiment_tracker = Client().active_stack.experiment_tracker, fetching the tracker from the active stack. Once we have this, we can use the MLflow tracker easily: in the step decorator we pass experiment_tracker=experiment_tracker.name, so that ZenML is notified that this step has the experiment tracker attached. Now what do we have to do in this step? We have to log our models. For that we use the scikit-learn autologger: mlflow.sklearn.autolog(), which will automatically log your model, its scores, and everything right there; the same kind of helper exists for several other libraries. I'll do a similar thing for evaluation: let me go to the evaluation part, where I have to do the same two things, copy over the import and the decorator, and then use mlflow.log_metric to log the MSE; the same goes for logging the R2 and the RMSE with mlflow.log_metric, so all three metrics get logged. A sketch of the model_train changes is below.
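A sketch of the tracked training step; note that Client().active_stack.experiment_tracker is None until the MLflow tracker is registered on the stack, which we do just below:

```python
# steps/model_train.py - with the MLflow experiment tracker wired in (sketch).
import logging

import mlflow
import pandas as pd
from sklearn.base import RegressorMixin
from zenml import step
from zenml.client import Client

from src.model_dev import LinearRegressionModel
from .config import ModelNameConfig

# The tracker registered on the active ZenML stack (see the stack setup below).
experiment_tracker = Client().active_stack.experiment_tracker


@step(experiment_tracker=experiment_tracker.name)
def train_model(
    X_train: pd.DataFrame,
    X_test: pd.DataFrame,
    y_train: pd.Series,
    y_test: pd.Series,
    config: ModelNameConfig,
) -> RegressorMixin:
    try:
        if config.model_name == "LinearRegression":
            mlflow.sklearn.autolog()  # auto-logs the model, parameters and scores
            model = LinearRegressionModel()
            return model.train(X_train, y_train)
        raise ValueError(f"Model {config.model_name} not supported")
    except Exception as e:
        logging.error(e)
        raise e
```

The evaluation step gets the same decorator, plus explicit mlflow.log_metric("mse", mse), mlflow.log_metric("r2", r2), and mlflow.log_metric("rmse", rmse) calls next to the existing calculate_scores calls.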
With those three metrics logged, we're mostly done here; just remember to import mlflow in the evaluation file as well. Now, guess what we do next: we simply go to run_pipeline and run the same pipeline. Once we run it, it will say that we're using the MLflow tracker, and then it will complain: "No module named mlflow". What you can do is use the ZenML integration: zenml integration install mlflow; just search for the command and paste it here, and it will take some time to install. But before running again, let me explain what the stack means. The stack is the containerized environment your project runs in, and it contains components: the artifact store and the orchestrator, which are "default" out of the box; you don't need to worry about what these terminologies mean yet, we'll talk about all of them in detail later. Basically, think of the stack as the environment you're working in. You will also need to tell ZenML "I am going to use MLflow, please register this experiment tracker", because the experiment tracker is part of the stack, just like the orchestrator (which, roughly speaking, is what runs this pipeline) and the artifact store. So first we install the MLflow integration, and as soon as it's installed we can go and register our experiment tracker: zenml experiment-tracker register mlflow_tracker --flavor=mlflow. But before that, run zenml stack list, and it will show the set of stacks we have here. (It's taking too much time, I guess... okay.) Now, here is a very common error you might get if you're using a Mac: you just have to do a couple of things, the first being to run the command zenml up. When it's disconnected, it can give another error, "Error initializing SQL store", error initializing whatsoever; this is a new one, but you can totally choose to ignore it. Fair enough, we bring it up, and if it gives the same error we run zenml disconnect and then try again; let's wait a few seconds and see if it works. Okay, cool, it's working fine now, very nice. So now we have the stack list; let's quickly go and see what it shows me. Guys, in the current stack everything is default: your orchestrator is default and your artifact store is default, the orchestrator being where
your pipeline runs, and the artifact store being where your artifacts get stored. So let's quickly go and also set up the experiment tracker: you can just go over the README, copy the command, and paste it, which is zenml experiment-tracker register mlflow_tracker --flavor=mlflow. It says it is unable to register the MLflow tracker because one with that name already exists in this workspace, so let's quickly change the name; then it should work fine, since I've clearly already used it somewhere. Next, the model deployer: let's quickly ignore what model deployer means for now (we'll come back to it), but let's do this registration as well, because it's important. I'll just copy the command too, which registers the MLflow model deployer. Fair enough, again it says it is unable to register because the name is taken, since I've already used it in the past, so what I'll do is simply rename it to something like mlflow_customer. Wait a few seconds, and it's done. So now, once you run zenml stack list, you should see your stack there: the defaults, and now your model deployer is mlflow_customer and the experiment tracker is the MLflow tracker. With that done, let's now run the pipeline, see what this leads to, and wait a few seconds for the run to complete. It initiates a new run, and then something happens: it says you're using an unsupported version, blah blah blah, you have to downgrade something or upgrade MLflow or whatever it is suggesting. You could totally choose to ignore this, but something interesting comes up, and honestly I'm not sure at first what this error means, so let's read it. The warning says to try upgrading or downgrading the scikit-learn version to a supported one, or to try upgrading your MLflow. So pip install and upgrade; we might need to upgrade the MLflow version a little, and then we'll see whether the error still persists; if it does, I'll try the one other solution I have in mind. Most errors in MLOps get fixed by upgrading, reinstalling, disconnecting and then connecting; even restarting your laptop fixes errors sometimes, because you don't always know what is happening behind the scenes, so you actually have to be very careful when initiating things. And indeed, that was a very simple one: after the upgrade it ran completely fine. Boom, that's nice. So let's go to the dashboard: log in with default, go to Pipelines, this run, and if you go and see the configuration you'll have the experiment tracker attached as well. The registration commands, roughly as they appear in the project README, are sketched below.
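Roughly, the setup commands (the component names like mlflow_tracker and mlflow_customer are just the ones picked here; if a name is already taken in your workspace, pick another):

```
zenml integration install mlflow -y
zenml experiment-tracker register mlflow_tracker --flavor=mlflow
zenml model-deployer register mlflow_customer --flavor=mlflow
zenml stack register mlflow_stack -a default -o default \
    -d mlflow_customer -e mlflow_tracker --set
zenml stack list   # confirm the new stack is active
```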
Now you might be thinking two things. The first thing you might be thinking is: "Hey Ayush, where can I find my tracking URI? How can I view the MLflow experiments on my own machine?" So let me tell you two things: how you can get the tracking URI, and how you can view the experiments. Let me quickly search for it; there was a very nice snippet for this, and we've got it. What you have to do is go to your run_pipeline.py and paste the snippet there, because it will print your URI; wait a few seconds and let it run (it will just use the cached version of everything). Okay, cool: it says your file store is available at this path, which makes sense. So your tracking data lives there, and now we run the MLflow UI against it: mlflow ui --backend-store-uri followed by the URI you got from that snippet, which points at the mlruns store (you can find this on the official pages as well). There might be some error; let's run it, since it's important. It says "got unexpected argument"; we probably have to wrap the URI in quotes, and at last, let's see if it works. It works, so open the address it gives and look around. Here's the run from three minutes ago: you go over there and you see the metrics listed, you see the parameters, and you see the model itself, literally logged as an MLflow model; you can use this logged model to make predictions, via MLflow or even pandas. You see how interesting this is; this is pretty amazing, it has logged each and every thing, and that's what I wanted to showcase to you. The snippet and the command are sketched below; all these commands are available, so you don't need to worry about memorizing them. Cool, so now you're done with the experiment tracker. In the next video we'll go ahead and worry about something known as deployment of our model; I hope that will sound good to you as well. Okay, I'll catch you in the next one.
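The snippet and the command, roughly (the printed URI is local to your machine):

```python
# Drop this at the bottom of run_pipeline.py to print the tracking URI.
from zenml.client import Client

print(Client().active_stack.experiment_tracker.get_tracking_uri())
```

Then point the MLflow UI at it, quoting the URI:

```
mlflow ui --backend-store-uri "file:/path/printed/by/the/snippet"
```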
Hey everyone, welcome back to this video. In this video I'm going to cover the last big piece, which is the deployment pipeline. We'll use the MLflow deployer to deploy our model locally, so that you can actually use it and make predictions. You might have seen the pattern where you just save the model, spin up some FastAPI application, load the joblib file, and serve that; that's really not what happens in a production use case. In production you use model deployment tooling, or you build deployment pipelines. MLflow's deployment tooling is mostly meant for local deployment; for deployment on AWS or Google Cloud you might use Seldon Core instead, because that's a much more advanced deployment tool. But as of now, let's go with the MLflow deployment tool to get started.

The first thing I'll do is create a `run_deployment.py` script. I'll close everything else down first; the reason I like to remove all the clutter is that it takes something off the load, and honestly that gives me much more pleasure, so I really like doing this. Okay, cool. Now, basically we'll create two pipelines, so let's also create the pipeline file, `pipelines/deployment_pipeline.py`. In that file we'll define a continuous deployment pipeline and an inference pipeline. I'll explain later what continuous deployment and inference mean exactly; as of now, just assume that the continuous deployment pipeline is like the traditional training pipeline we built before. Then, in `run_deployment.py`, we import both pipelines from `pipelines.deployment_pipeline`.

Next, I'm going to use click, so that we can state in a command whether we want to deploy, or predict, or whatsoever. I'll create a click command with a `--config` option; let me just quickly copy it over, since that's easier than writing the whole thing out. The option takes one of three choices: `deploy`, `predict`, or `deploy_and_predict`. So you can invoke it like `python run_deployment.py --config deploy`, or with predict, and so on. There's also a minimum-accuracy option; we'll come to what minimum accuracy means in a moment. Then I create the `run_deployment(config: str, min_accuracy: float)` function: if the config says deploy, we run the deployment pipeline, and if it says predict, we run the inference pipeline. A sketch of this entrypoint is below. Now you might be thinking: hey, is it done? No, of course not; we still have to actually build the deployment pipeline as well as the inference pipeline, and I still owe you the explanation of what this minimum accuracy means.
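Here's a sketch of that click entrypoint. The option names and choices follow what we just described; the two pipeline calls are left as placeholders until we build them:

```python
# run_deployment.py: choose between deploying and predicting via CLI.
import click

DEPLOY = "deploy"
PREDICT = "predict"
DEPLOY_AND_PREDICT = "deploy_and_predict"


@click.command()
@click.option(
    "--config",
    "-c",
    type=click.Choice([DEPLOY, PREDICT, DEPLOY_AND_PREDICT]),
    default=DEPLOY_AND_PREDICT,
    help="Run the deployment pipeline, the inference pipeline, or both.",
)
@click.option(
    "--min-accuracy",
    default=0.92,
    help="Minimum R2 score required to deploy the model.",
)
def run_deployment(config: str, min_accuracy: float):
    deploy = config in (DEPLOY, DEPLOY_AND_PREDICT)
    predict = config in (PREDICT, DEPLOY_AND_PREDICT)
    if deploy:
        ...  # run the continuous deployment pipeline (built next)
    if predict:
        ...  # run the inference pipeline (built later)


if __name__ == "__main__":
    run_deployment()
```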
So let's go over there and quickly create our deployment pipeline right away. I'll import numpy as np and pandas as pd, then `from zenml import pipeline, step`, and then from `zenml.config` we import `DockerSettings`; you'll see where we use the Docker settings shortly. Then I'll import the MLflow deployer pieces; I'll just copy and paste all of those imports so it's much easier, and as of now you can forget what they mean, we'll come back to them later. We've imported from the ZenML constants, and we'll actually make use of all of this, so please don't worry about it. We've also imported our steps: from `clean_data` we import `clean_df`, from `evaluation` we import `evaluate_model`, from `ingest_data` we import `ingest_df`, and in `model_train` you have `train_model`, which is already there.

Now, what do we want? Basically I want to train the model and, if the model is good enough in accuracy, deploy it. So first let's create the Docker settings, which describe the libraries and tools we need here: the required integrations are just MLflow, since that's the only integration library this pipeline uses. Then I'll create the base pipeline, the continuous deployment pipeline; I'll explain what this continuous deployment pipeline means as we go. We decorate it as a pipeline with `enable_cache=True`, so caching is enabled, and with the settings pointing at our Docker settings; that's pretty much it. The pipeline takes, first of all, the minimum accuracy (we'll come to what it means), then `workers`, the number of workers we need, and then `timeout`. What does timeout mean? If starting or stopping the deployment service hangs in a loop, at what point should we stop waiting? That's why we imported `DEFAULT_SERVICE_START_STOP_TIMEOUT` from the constants, so that we can stop the pipeline if it's taking too long. Inside the pipeline we first run `ingest_df`, which we already imported, and then the rest of the training steps; let's quickly go over to the training pipeline and copy everything across. Cool, I've imported everything, and we have the R2 score coming out of evaluation as well. A sketch of the file so far is below.
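A sketch of `deployment_pipeline.py` so far, assuming the step module paths from this project's layout; the exact ZenML import paths are version-dependent:

```python
# pipelines/deployment_pipeline.py
import numpy as np
import pandas as pd
from zenml import pipeline, step
from zenml.config import DockerSettings
from zenml.constants import DEFAULT_SERVICE_START_STOP_TIMEOUT
from zenml.integrations.constants import MLFLOW
from zenml.integrations.mlflow.steps import mlflow_model_deployer_step

from steps.clean_data import clean_df
from steps.evaluation import evaluate_model
from steps.ingest_data import ingest_df
from steps.model_train import train_model

# The only integration this pipeline needs baked in is MLflow.
docker_settings = DockerSettings(required_integrations=[MLFLOW])


@pipeline(enable_cache=True, settings={"docker": docker_settings})
def continuous_deployment_pipeline(
    data_path: str,
    min_accuracy: float = 0.92,
    workers: int = 1,
    timeout: int = DEFAULT_SERVICE_START_STOP_TIMEOUT,
):
    # Same steps as the training pipeline built earlier in the course.
    df = ingest_df(data_path)
    X_train, X_test, y_train, y_test = clean_df(df)
    model = train_model(X_train, X_test, y_train, y_test)
    r2_score, rmse = evaluate_model(model, X_test, y_test)
    # The deployment trigger and deployer step are added next.
```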
Now we come to the deployment decision. We have the trained model and its scores, so now we want to actually deploy the model, and there should be some criteria for deploying it. What can that criteria be? If your model's accuracy is greater than the minimum accuracy required to deploy, then and only then deploy the model; that's where your minimum accuracy comes into place. So let's quickly create something known as a deployment trigger; the deployment decision will depend on this trigger. First I'll create a config class, `DeploymentTriggerConfig`, which extends the base parameters and holds the minimum accuracy; I'm just adding a random number for now, don't worry, we'll change it. Then we create a step, `deployment_trigger`, which takes, first of all, the accuracy, which is a float, and the config. Where does the config come from? Basically we need the `DeploymentTriggerConfig` here. It's important to document what it does, so the docstring says: implements a simple model deployment trigger that looks at the input model accuracy and decides if it is good enough to deploy or not. It's a very basic trigger: it just returns whether the accuracy is greater than or equal to `config.min_accuracy`.

Back in the pipeline, the deployment decision will contain the deployment trigger. Should we use RMSE or the R2 score here? If you don't know about the R2 score, you can just go online and search about it; roughly speaking, it indicates goodness of fit. Let's go with the R2 score, because that's the better fit for a threshold like this. So now we have the deployment decision: it takes the R2 score, compares it with the required minimum R2 score (say 0.92), and only deploys this particular model if and only if the deployment decision is true. If the score clears the bar, we go to the next step, which is `mlflow_model_deployer_step`. What is that? Let me show you: we import it from the ZenML MLflow integration's steps module; it's an already pre-built step that we can use to deploy our model. We just have to give it certain parameters: what the model is, what the deployment decision is, the workers we need, and the timeout. Okay, cool; a sketch of the trigger and the deployer step wiring is below. So now we have the deployment pipeline done, and we can actually use it for serving. I think we're mostly done with it.
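And a sketch of the trigger plus the pre-built deployer step wiring; `BaseParameters` lives in `zenml.steps` in the version used here (and note the keyword is `deploy_decision`, which will bite us in a minute):

```python
# pipelines/deployment_pipeline.py (continued)
from zenml import step
from zenml.steps import BaseParameters


class DeploymentTriggerConfig(BaseParameters):
    """Criteria for deploying the model."""

    min_accuracy: float = 0.92


@step
def deployment_trigger(
    accuracy: float,
    config: DeploymentTriggerConfig,
) -> bool:
    """Implements a simple model deployment trigger that looks at the
    input model accuracy (here: the R2 score) and decides if it is
    good enough to deploy or not."""
    return accuracy >= config.min_accuracy


# Inside continuous_deployment_pipeline, after evaluate_model:
#     deployment_decision = deployment_trigger(accuracy=r2_score)
#     mlflow_model_deployer_step(
#         model=model,
#         deploy_decision=deployment_decision,
#         workers=workers,
#         timeout=timeout,
#     )
```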
Let's quickly go and run our continuous deployment pipeline; that's the best way to get started, and then we'll come back to building up our inference pipeline. So let's go over to `run_deployment.py`. This is the config we have, and this is the minimum accuracy required for us to deploy our model. Now we'll create something known as the MLflow model deployer component; this component is the MLflow deployer from our stack. Let me quickly go and import each and every library we need; I'll just paste them from my repository. Basically we use `MLFlowModelDeployer.get_active_model_deployer()`, which takes the active model deployer out of the stack. Then we compute: deploy is true if the config is deploy or deploy-and-predict, and predict is true if the config is predict or deploy-and-predict; if deploy, it runs the deploy branch, and if predict, it runs the predict branch. Cool. Then let me quickly import `continuous_deployment_pipeline` from the pipelines module and call it: it gets the minimum accuracy which is required, then the workers, let's name it three, and a timeout of maybe 60 seconds. So that's our continuous deployment pipeline invocation.

Now we can actually report on what happened; let me just copy this part from the ZenML MLflow examples repository, because these are mostly the same. It prints that you can run the MLflow UI so you can see a visual representation of your runs, and then it fetches the existing services with the same pipeline name, step name, and model name, so that we can tell whether any existing prediction server is there. If an existing service is running locally as a daemon, it prints that along with how to stop the service; if the service failed, it says the MLflow service has failed; otherwise it says there is no MLflow prediction server running. Since we don't have the inference pipeline yet, this is a very basic `run_deployment.py`, so we don't need to worry about much more; a sketch of the deploy branch is below. So let's just run this pipeline with the deploy config, `python run_deployment.py --config deploy`, and see if it gives any errors; if it does, we'll solve them right away.
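A sketch of the deploy branch as described; the data path, the pipeline and step names, and the status reporting follow the ZenML example we copied from, so treat the details as assumptions:

```python
# run_deployment.py: the deploy branch.
from pipelines.deployment_pipeline import continuous_deployment_pipeline
from zenml.integrations.mlflow.mlflow_utils import get_tracking_uri
from zenml.integrations.mlflow.model_deployers.mlflow_model_deployer import (
    MLFlowModelDeployer,
)


def deploy_model(min_accuracy: float):
    # Grab the MLflow model deployer component from the active stack.
    model_deployer = MLFlowModelDeployer.get_active_model_deployer()

    continuous_deployment_pipeline(
        data_path="data/olist_customers_dataset.csv",  # assumed path
        min_accuracy=min_accuracy,
        workers=3,
        timeout=60,
    )

    print(
        "You can run:\n"
        f"    mlflow ui --backend-store-uri '{get_tracking_uri()}'\n"
        "to inspect your experiment runs."
    )

    # Fetch existing services with the same pipeline/step/model name.
    services = model_deployer.find_model_server(
        pipeline_name="continuous_deployment_pipeline",
        pipeline_step_name="mlflow_model_deployer_step",
        model_name="model",
    )
    if services and services[0].is_running:
        print(f"Prediction server running at {services[0].prediction_url}")
    elif services and services[0].is_failed:
        print("The MLflow prediction server failed; check its logs.")
    else:
        print("No MLflow prediction server is currently running.")
```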
And here comes the first error: it says a materializer is not found, no module named materializer, in the deployment pipeline. Are we even using the materializer? I guess not; okay, let's remove that import, we're not using it, you don't need to worry about it. Let's go ahead and solve the next error it's going to give. It says invalid settings: settings can either refer to general settings or to stack components. There might be some other interesting error underneath, so first let's see where it's erroring: it's in `deployment_pipeline.py`. That's why I say most of our time goes into just solving these little errors. Where is it... okay, fair enough: we actually have to write the settings key as "docker", not "docker_settings"; the available keys are either "resources" or "docker". And mostly we're done; okay, cool, let's run it now. Next it complains about `main`: we named our click entrypoint `run_deployment` rather than `main`, so let's call that instead; sorry for that. Let's wait; please run... whoops. Now it says there's a naming error; I'm so sorry for that again. These are the very silly errors I make; I really want a tool that rectifies all of these naming errors and import errors for us. Next it says wrong arguments: the MLflow deployer step got an unexpected argument called `deployment_decision`. Where does it get that error? In the continuous deployment pipeline: the keyword is actually `deploy_decision`, not `deployment_decision`. Okay, cool, let's run; wait, I hope it works... now `deploy_decision` is not defined; why? Again I made a big mistake: it should be over here, not there. I'm so sorry for these mistakes; when things are super occupied, these slips happen.

So it initiates the new run, and it says: missing entrypoint input `data_path`. Right, we have to input our data path. Where to add it? I'll add it right here in the pipeline signature, `data_path: str`, because I guess that's cleaner, and then also pass the data path in from `run_deployment.py`; let's just quickly replace that. I hope it works now; if it doesn't, we'll again have to fix something. The inference pipeline is still left, so stay tuned; we'll mostly be done by then. Missing entrypoint input `data_path` again; where is it coming from, I guess... okay, okay, we made another small mistake: we have to actually pass it as `data_path=data_path`; sorry for that. These little slips keep on happening; you have to actually debug and see where your code is failing. Now it's using the cached versions again, like a proper deployer should. And then it says: an MLflow model with this name was not logged in the current pipeline run, and no running MLflow server was found; please ensure the pipeline includes a step configured with the MLflow experiment tracker that trains a model and logs it. So most probably, what I feel is that we have to set the caching to false, so the model actually gets trained and logged again.
Let's run it now. If it does not deploy the model this time, we're going to get in big trouble, so please, let's wait. This is a step where I just pray that it works out, because this is the step where you get the most errors, and if some unknown error happens, you have to spend a ton of time on it; tons of people have had to. It's not a very simple thing; you actually have to go inside your system and see whether it works or not. Now see what is happening: it says no materializer is registered for type LinearRegression, so the default pickle materializer was used, which is not suitable for production use. So basically we'd have to make a custom materializer; I'll show you how to make one later on. For now I'm just waiting for the MLflow model deployer step to deploy our model... and something really did happen: it says failed to start the MLflow deployment service, the service daemon is not running, and for more information about the status, please see the log file. So basically a timeout happened over there. Okay, let's go and see what we can do in this case. Failed to start the service, deployment service daemon is not running... let me run `zenml stack describe`: is there anything we did wrong in the deployer? Why is it skipping... okay, fair enough, something really interesting is happening here. Let's try running it again; this warning might be causing some problem. And there it is: skipping model deployment because the model quality does not match the criteria. So basically what's really happening is that the model is not matching the deployment criteria; our minimum accuracy is too high. Let's write 0.5. Again it says the same thing: skipping model deployment because the model quality does not match the criteria, using the last server deployed by the step instead. So it simply was not meeting the criteria; maybe that's why. Did I actually reduce my minimum accuracy, or maybe I have not? Minimum accuracy 0.5; now we can do just one thing: wait and watch.

I'll tell you what happens in these types of cases: you have to actually concentrate harder, and see what's going wrong, or what might go wrong. Even the smallest thing, even restarting your laptop, really works; I've literally chased an error for two days, then restarted the laptop and it worked like a charm. So just wait a few seconds and let's see whether it works or not. The deployment trigger started, the MLflow deployer service started... skipping model deployment because the model does not meet the criteria. My goodness; really interesting things are happening out here. So what I'll do is quickly go through the code and see, and let's also go and see what happens on the dashboard side. We have the deployment's existing services, and... okay, it will of course not work, because the run doesn't take nearly enough time for a fresh deployment. So I'll go ahead and check whether things are right on the service side, the MLflow deployment service, and then go into the deployment pipeline.
In the deployment pipeline view you can see the parameters that went into the MLflow deployment service steps. The service daemon is not working; okay, I get what the error is about now, I get it. Look at the trigger parameters: it checks whether your accuracy is greater than `config.min_accuracy`, and our minimum accuracy is this one, 0.5; okay, fair. Then you have the MLflow deployment loader step parameters: basically we have a pipeline name, a pipeline step name, and a running flag. So let me also copy over this MLflow deployment loader step; what it does is help you get the right stack and the deployment. Then we have the prediction service loader and the predictor, which we'll come to in some detail; I've already written that code, so I'll bring it in shortly. So let's try to run it one more time, maybe, and see if it works; but before that, let's just check that everything is working fine on the `run_deployment.py` side: we have the MLflow deployment services, we fetch the model deployer, do we have the MLflow model deployer, okay, we have `get_tracking_uri`, fair enough. So I'll just run it again, nicely, and see whether it works or not. If it doesn't, we'll actually have to go and talk to the ZenML team and see, because this may be something common on their side which they may have a solution to; or we may have to open a GitHub issue, and most probably the error will get solved that way. This is how we solve things: we're not experts at everything; we just go to people and talk to them about it.

We'll also try one more solution, which is this one, the linear regression model. Okay, fair enough, again it just does not match the deployment criteria. So I want to see the R2 score... the R2 score is so bad, bro! Okay, fair enough, that's why it wasn't deploying: the score is very bad, so the trigger rightly refuses. So I'll set the minimum accuracy to essentially zero; this might run, because I just want to showcase to you that it deploys the model. So now I'll go and run it: the R2 score is there, and of course it's greater than the threshold now. But why is the score about zero? Something really interesting is happening with the fit; stay with me, we'll fix that too. Okay, cool: updating an existing MLflow deployment service. It met the criteria now; it met the criteria! Now let's see what it does. It should work, you know; but if it doesn't, we'll come back and see. Updating an existing MLflow deployment service... I think it will mostly not work, because a fresh deployment doesn't happen this fast; it will probably say the daemon is not running, blah blah blah, and then we might need to try one more solution which I have in my mind. And yes, there it is again: the service daemon is not running.
For more information, please see the following log file. So I'll just check it out and come back very soon... Hey everyone, that was a very simple error. Basically, I've already tried this MLflow setup on a couple of environments, so there was a stale service still registered, and we couldn't reuse it. What I did was actually delete it. If you're working on a new stack, you actually have to use the new stack; that's why it was giving me the error. Now it's working totally fine. The only thing which is not working fine is the following import, so let's go and fix that: we need to import `cast`, so `from typing import cast`. Okay, fair enough; now we run the deploy again and see if it works. You can totally choose to ignore the warnings and such, or go and solve them if you want. So: it ingests the data first; after ingesting it cleans the data, and data cleaning completes; it goes to the next step and trains the model; then it gives some warnings which you can choose to ignore; evaluation completes and gives the R2 and RMSE scores; then the deployment trigger starts, the MLflow model deployer step starts, it updates an existing MLflow deployment service, and it starts the service with the latest model. So let's go and see whether it works; hopefully it should. And there we go: your model is available to make predictions right here, because the prediction server is actually running; you can also delete the deployment if you want. So now your model is successfully deployed.

Now what we need to do is actually make predictions from this model. So I'll quickly go to `deployment_pipeline.py`; there we already have the MLflow deployment loader pieces that will help us load the deployed model. Let's go and start writing the prediction service loader; I'll copy and paste parts of this since it's pre-written, but let's walk through it. We create a step with `enable_cache=False`, because sometimes caching is not very good here. The step is `prediction_service_loader`, and it takes a pipeline name, which is a string; a pipeline step name, also a string; a `running` flag that defaults to true; and a model name defaulting to "model". And what does it return? It returns an `MLFlowDeploymentService`. Basically, it gets the prediction service started by the deployment pipeline. First of all we get the MLflow model deployer stack component, which is very simple: `MLFlowModelDeployer.get_active_model_deployer()`. A sketch of the full step is below.
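Here's a sketch of the full step, including the service lookup we'll walk through next:

```python
# pipelines/deployment_pipeline.py (continued)
from zenml import step
from zenml.integrations.mlflow.model_deployers.mlflow_model_deployer import (
    MLFlowModelDeployer,
)
from zenml.integrations.mlflow.services import MLFlowDeploymentService


@step(enable_cache=False)
def prediction_service_loader(
    pipeline_name: str,
    pipeline_step_name: str,
    running: bool = True,
    model_name: str = "model",
) -> MLFlowDeploymentService:
    """Gets the prediction service started by the deployment pipeline."""
    # Get the MLflow model deployer component of the active stack.
    model_deployer = MLFlowModelDeployer.get_active_model_deployer()

    # Fetch existing services with the same pipeline/step/model name.
    existing_services = model_deployer.find_model_server(
        pipeline_name=pipeline_name,
        pipeline_step_name=pipeline_step_name,
        model_name=model_name,
        running=running,
    )
    if not existing_services:
        raise RuntimeError(
            f"No MLflow prediction service deployed by step "
            f"{pipeline_step_name} in pipeline {pipeline_name} for "
            f"model {model_name} is currently running."
        )
    return existing_services[0]
```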
Then what we do is fetch the existing services with the same pipeline name and model name. So we call `find_model_server` on the model deployer component, and in it we pass the pipeline name, the pipeline step name, the model name, and the running flag; that's pretty much it for the lookup. Then, if there are no existing services, we raise a RuntimeError, and in that RuntimeError we say something like: no MLflow prediction service deployed by this step in this pipeline for this model is currently running. I just copied that message from the ZenML examples, since these are the traditional errors you find there; copying it is not a big deal. Then you can print the existing services if you like, and then what do you return? You return the first of the existing services. So that's the prediction service loader: it loads the current prediction service so we can actually use it for model predictions.

Now what I'll do is create the predictor. The predictor takes the service, which is of type `MLFlowDeploymentService`, and the data, and it returns an `np.ndarray`, the array of predictions. But what array of predictions, on what data? We'll first create one more step here as well, a dynamic data importer, again with `enable_cache=False`. The `dynamic_importer` returns a string: think of it as downloading the latest data from a mock API, though here we'll just do `data = get_data_for_test()` and return that. We have to quickly build that helper, so let's go make `utils.py` and write it. We import logging and pandas, and from our data cleaning module we import `DataCleaning` and `DataPreProcessStrategy`. In `get_data_for_test`, we first get the data for the test, take a sample of 100 rows, actually clean the data with the preprocessing strategy, drop the `review_score` column, and convert the result into JSON format; that's why the importer is returning a string. Now let me write the predictor itself; I've already made this, so let me just copy and paste it, because it's pretty simple to understand. First of all it starts the service, then it loads the data, it keeps only the columns we want from the data, we convert that into a pandas DataFrame, we convert it into a list, we finally convert that JSON list into a numpy array, and then we make the prediction from that service. I hope it makes sense now; a sketch of all three pieces is below.
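Sketches of all three pieces; `DataCleaning` and `DataPreProcessStrategy` are this project's own classes, and the module path, data path, and dropped column are assumptions based on the dataset we've been using:

```python
# utils.py and pipelines/deployment_pipeline.py (continued)
import json
import logging

import numpy as np
import pandas as pd
from zenml import step
from zenml.integrations.mlflow.services import MLFlowDeploymentService

from src.data_cleaning import DataCleaning, DataPreProcessStrategy


def get_data_for_test() -> str:
    """Loads a small sample, cleans it, and returns it as JSON."""
    try:
        df = pd.read_csv("data/olist_customers_dataset.csv")  # assumed
        df = df.sample(n=100)
        data_cleaning = DataCleaning(df, DataPreProcessStrategy())
        df = data_cleaning.handle_data()
        df.drop(["review_score"], axis=1, inplace=True)
        return df.to_json(orient="split")
    except Exception as e:
        logging.error(e)
        raise e


@step(enable_cache=False)
def dynamic_importer() -> str:
    """Stands in for downloading the latest batch from a (mock) API."""
    return get_data_for_test()


@step
def predictor(service: MLFlowDeploymentService, data: str) -> np.ndarray:
    """Sends a batch of rows to the running prediction service."""
    service.start(timeout=10)  # a no-op if the daemon is already up
    payload = json.loads(data)  # the "split"-oriented JSON from above
    payload.pop("columns", None)
    payload.pop("index", None)
    df = pd.DataFrame(payload["data"])
    # The MLflow server expects a plain array of feature rows.
    return service.predict(np.array(df.values.tolist()))
```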
Okay, fair enough, so we're mostly done with the pieces; now, at last, we'll create the inference pipeline. The inference pipeline will disable the cache and use the Docker settings, and it takes the pipeline name we want plus the pipeline step name, both strings. First of all it uses the `dynamic_importer` to get the batch data; then it gets its service from the `prediction_service_loader`, passing the pipeline name and step name, with `running=False`; let's just write `running=False` for now. And then the predictor: yes, this is basically service plus data, so it uses that service and makes predictions on the data with `predictor(service=..., data=...)`. You can see how this all hangs together. Okay, cool, so now we're mostly done; let's go to `run_deployment.py` and run the pipelines. We import our inference pipeline, and once we've imported it we'll go ahead and run it: the pipeline name should be `continuous_deployment_pipeline`, and the pipeline step name is `mlflow_model_deployer_step`. A sketch of this wiring is below. I guess it should work now, most probably.

So let's run the predict config. Ugh, it's so tiring; please, fix this error; I want you all to fix this error, it's very basic: you just have to run `zenml downgrade`. Fair enough, then we get another error; nice, that one is expected: `from .utils import get_data_for_test`. I'm very happy with the progress, and one thing I'll tell you: it's very likely that you sometimes won't understand everything, because this is very conceptual, very technical material. So I want you to be very strategic in understanding this stuff; if you're not able to understand it all right away, that's totally all right. Next it says `DataPreProcessStrategy` is not defined; what does that mean? It's a naming issue in the import, so fix the name; if it errors again, we'll worry about it. Something really interesting now; I really hope it doesn't give us grief... Yahoo! Okay, not quite: the built-in materializer says it's unable to handle class numpy ndarray, and that it can only handle artifacts of the following types. Let's go and fix this up; this is so tiring. Where is it? In the predictor in the deployment pipeline: the `data` argument was annotated as `np.ndarray`, but the data we're actually getting is a string, not a numpy array; so change the annotation to `str`. If that works, great; if not, we'll have to worry more. Next: `json` is not defined, so `import json`. Anything else? Please give me the errors fast, bro; I'm so worried about errors; fix this up... and yes, we are done!

So now we've actually completed our stuff. Let's go look at the run over here and enjoy: you have the `dynamic_importer` and the `prediction_service_loader`, the importer's data output and the service output, and then the predictor uses both of them, the test data and the service, and outputs the predictions. If you go to the visualization you'd see something really interesting: your predictions have been made right there. I guess it's not showing the visualization itself because of some minor error, but you can see that the mean and the standard deviation of the predictions are right there. So it actually made the predictions. I hope it makes sense now.
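A sketch of the inference pipeline (continuing `deployment_pipeline.py`, so `pipeline`, `docker_settings`, and the three steps are the ones defined in the earlier sketches) and the predict branch:

```python
# pipelines/deployment_pipeline.py (continued)
@pipeline(enable_cache=False, settings={"docker": docker_settings})
def inference_pipeline(pipeline_name: str, pipeline_step_name: str):
    batch_data = dynamic_importer()
    model_deployment_service = prediction_service_loader(
        pipeline_name=pipeline_name,
        pipeline_step_name=pipeline_step_name,
        running=False,
    )
    predictor(service=model_deployment_service, data=batch_data)


# run_deployment.py: the predict branch then becomes:
#     inference_pipeline(
#         pipeline_name="continuous_deployment_pipeline",
#         pipeline_step_name="mlflow_model_deployer_step",
#     )
```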
We have deployed the model and made the predictions too. Now you might be thinking: how can I make single, one-off predictions? We're done with deployment and inference, and it's actually making good inferences out there, it is actually predicting, but it might still feel confusing, so let's not have too much confusion in your head and fix that too. I've made a simple `streamlit_app.py`; let me paste it in, and I hope it mostly works. From the deployment pipeline it imports the `prediction_service_loader`, and from `run_deployment` it imports the entrypoint; let's rename that to `main`, actually. Everything else is the same: the only thing is that if the person clicks the Predict button, it goes to the prediction service, and if the service is there, it creates the DataFrame, the same steps which we did in the predictor, and predicts from the data. A sketch is at the end of this section. So let's run `streamlit run streamlit_app.py`; please run... fair enough, the high-level overview image isn't found, so I'll just make sure to remove the image references. Okay, let's run it now and wait while it starts up. Cool, it's running. I'll just set all the inputs to zero... and now it's giving predictions: basically your predicted review score is 4.22. So it is actually using the deployed model for the prediction. You see, we haven't even saved a model to disk; there's no saved-model file anywhere; it's actually using the running service behind the pipelines. If you go and see the pipelines in the dashboard: go to pipelines, then to the continuous deployment pipeline, and you have the inference pipeline, which is done; then go back to the continuous deployment pipeline and you can see its runs are all there as well. So the whole pipeline is done, and we are done with one full project. I hope it was a really good project for you and that you understood it. Cool, that's the wrap of this project; in the next project we'll actually work on something like customer churn. Let's cover the next project separately, and I'll catch up with you later. Bye!
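For completeness, a minimal sketch of `streamlit_app.py`, assuming one numeric input per feature the model expects (only a single placeholder input is shown); calling the loader step directly like this works in the ZenML version used here:

```python
# streamlit_app.py
import json

import numpy as np
import pandas as pd
import streamlit as st

from pipelines.deployment_pipeline import prediction_service_loader


def main():
    st.title("Customer Satisfaction Prediction")
    payment_value = st.number_input("Payment value", value=0.0)
    # ...one st.number_input per remaining feature column...

    if st.button("Predict"):
        # Raises a RuntimeError if no server is running; run
        # `python run_deployment.py --config deploy` first.
        service = prediction_service_loader(
            pipeline_name="continuous_deployment_pipeline",
            pipeline_step_name="mlflow_model_deployer_step",
            running=False,
        )
        df = pd.DataFrame({"payment_value": [payment_value]})
        data = np.array(json.loads(df.to_json(orient="split"))["data"])
        prediction = service.predict(data)
        st.success(f"Predicted review score: {prediction[0]:.2f}")


if __name__ == "__main__":
    main()
```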