Transcript for:
Bangalore House Price Prediction Project

Hello everyone. Today we are going to predict the price of a new house in Bangalore. Here is how the finished app works. First you select the location, let's say 6th Phase JP Nagar. You enter the BHK number, let's say 4 BHK. For the number of bathrooms, let's say I give 3, and for square feet, I want 2000 square feet. You click predict and it gives you back the amount: it will cost you 1 crore 43 lakhs INR. So let's see how this project can be done. We are going to use Flask for building this website, Bootstrap for the design, and a little JavaScript to display the prediction on the same page, so that we do not redirect to another page. Okay, so let's talk about the dataset. I got this dataset from Kaggle; it is the Bengaluru house price data. You can check out the URL for that. I downloaded it, and now let's go and explore what the dataset looks like. Here I have imported pandas and numpy. I am reading the file, which is named bangalorehouse data.csv, into the data variable, and let's see the head. In this head, let me just zoom in.
So in this head, what we have is, first of all, area type, which is super built-up area, plot area, or built-up area. Availability means whether the flat is ready to move into: if I buy it today, can I move in today or not? So it is either Ready To Move or an availability date. Then there is the location. Then size, which is basically the BHK, such as 2 BHK or 4 BHK; in some places it says Bedroom instead, as we will see. Then society, total square feet, how many bathrooms there are, how many balconies there are, and lastly price in lakhs. Now let's see the shape: we have about 13,300 rows and 9 columns, as shown here. Next we look at info. In the info, three columns are float64 and I think that's it. There are many null values in society, and we will see how many; there are some in location and some in size. What I have done now is run value counts on every column, so let's go through them one by one. First, area type: there are four values, super built-up area, built-up area, plot area and carpet area, and the corresponding counts are about 8700, then 2400, then 2000. Then availability. There is Ready To Move, which means that if I buy the flat today I can shift into it today; there are about 10,000 of those, and the rest are the dates when the flat will be ready, with their counts here. Then the location column, which has many different locations. Whitefield is the most common one, appearing 540 times, then Sarjapur, and all the others follow. Then the size column, which says how many BHK there are, 2 BHK, 4 BHK and so on. Here there are basically two kinds of labels, BHK and Bedroom. We will fix that by extracting the integer. 2 BHK is the most common, about 5200 times. Next we have the society column. To be honest, I didn't do much analysis on society, because the society column is going to be dropped.
I'll tell you why in just a second: when we look at the missing values, we will drop society. Then there is the total square feet column, where ideally you want float or integer values. But there are some values where a range is mentioned, and we have to fix that. Wherever a range is mentioned, we are going to take the mean of the range. That is why the column is of object type. Then bathroom. The bathroom column is actually a float. It records 2 bathrooms, 1 bathroom, 3 bathrooms and so on. 2 bathrooms is the most common, at about 7000, then 3 bathrooms, which is kind of obvious. Then balcony: 1, 2 or 3 balconies. Houses with 2 balconies are the most common, at almost 5000. Price is actually a numerical column, but I have run value counts on it anyway. Now let us see how many null values there are. I have called isna and sum on the data. There are no null values in area type, no null values in availability, one null value in location, 16 null values in size, and about 5500 null values in society. Society and balcony have the most missing values. If you remember, our data shape was 13,300 rows, and society alone has more than 5000 missing values, so we are going to drop it. We are also going to drop balcony, because I have checked and it is not useful. So we are going to drop area type, availability, society and balcony, because these four features are of no use to our model. Next up, I ran the describe method. There are actually two numerical columns, bathroom and price. It says that the minimum number of bathrooms in our data is one, a one-bathroom flat. The first quartile is two bathrooms, the median is two bathrooms, the third quartile is three bathrooms, and the max value is 40. I don't know which flat has 40 bathrooms, but that is what our data says. Our mean bathroom count is 2.69. It doesn't matter because it is a 
categorical column. For price, our mean price is 1 crore 12 lakhs and our standard deviation is 148 lakhs. Our minimum price is 8 lakhs, the first quartile is 50, the median is 72, the third quartile is 120, and the max value is... how much is it? I can't even count it. 36 crores. Then I checked the info again. We had dropped some columns, so these are the columns we have left. Now let's fill in the missing values one by one. First I ran value counts on location. Whitefield and Sarjapur have the most values, and there is only one missing value here: the total is 13320 and here it is 13319. So it doesn't matter, we can replace it with any value, because one wrong row will not change anything. So we are going to fill it: I have applied fillna with Sarjapur, and the missing value in location is gone. Then I checked size and ran value counts. I had already seen that this column was problematic. How many missing values were there in size? Let's scroll up. There are 16 missing values in size, and 2 BHK is the most common value, so we are going to replace the missing values with 2 BHK. With only 16 rows, it won't make much difference to the data. So we are filling the null values with 2 BHK. Bathroom is actually a numerical column, and it had 73 missing values, so we have replaced those 73 missing values with the median. First I computed the bathroom median and then I applied fillna with it, so the bathroom null values went away. Then I checked the info: there are no null values now. If you remember, we still have to fix the BHK and Bedroom problem. So here, on the size column, which contains both BHK and Bedroom, I have applied str.split.
So first I split on the space and then picked the first element with str.get(0). What came back was still a string, so I converted it with astype(int) and stored it in another column, bhk. Then I looked at the flats where bhk is greater than 20. There are only 2 flats like this, with 27 and 43, so these are basically outliers in the data and you have to fix them. Now, I told you that total square feet has a range problem. Ideally you would want the square feet to be a float or an integer, but there are some ranges here and we have to fix them. What we are going to do is split the value on the hyphen and then take the mean of the two numbers, so wherever we get a range like this we will add 1133 and 1384 and divide by 2. I have made a function, convert range, which takes every value. Suppose we take this value and split it. In temp we then have the two strings 1133 and 1384. Then I check whether this is actually a range or not: if the split produced two values, the value really was a range. For a single number the length of temp would be 1, and for a range the length of temp would be 2. In that case I convert temp 0 to float and temp 1 to float, take their mean, and return that. If we get only one value after splitting, we return it as a float. If there is any problem in the conversion to float, we return None, because there are some non-numeric values here. Whenever we try to convert one of those to float it will throw an exception, and to handle that we have wrapped the conversion in try and except and return None. I have applied this function to the square feet column. I guess you know how the apply function works: you pass apply a function reference, like I have passed the name convert range, and it gives you the square feet as your input, so you will get the square feet of every row in this x 
and whatever it returns I have stored back in total square feet. Now if I look at data.head, you will see bhk, which is the integer value that was extracted. I could drop size, but I haven't done it yet; I will do it later. Now we are going to make a new column named price per square feet. It helps with things like removing outliers. Every price is in lakhs, and to get price per square feet we just divide price by total square feet. That is pretty easy, but the result will be in lakhs: if we do 39 by 105, we get a decimal value in lakhs. It is more readable if we convert it to rupees, not lakhs. So first I multiplied the price by 1 lakh, which converts it to rupees, and then I divided that by the total square feet. That gives me price per square feet, and this is what it looks like: for this house we have 3699 as the price per square feet. Then I ran describe. Where did we run the last describe? There it showed bathroom and price. Now we have more columns: bathroom and price, then bhk, because we converted that to a number, and then price per square feet. If you look at the distribution of price per square feet, its mean is almost 8000 rupees and its standard deviation is very high, around 1 lakh. The minimum price per square feet is 267 rupees, the first quartile is 4300, the median is 5400, the third quartile is 7300, and the maximum is so large that I can't even read it, but it is clearly an outlier. So we have made the price per square feet column. Now, location does not have outliers exactly, but it has a great many distinct values. If you look here, there are 1306 locations. If we take dummy variables for them or one-hot encode them, how many columns will we get?
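The three column fixes described so far (extracting the BHK number, averaging square-feet ranges, and computing price per square feet in rupees) can be sketched roughly as below. This is a minimal sketch under assumptions, not the code from the video: the column names size, total_sqft and price, the function name convert_range, and the sample rows are all made up for illustration.

```python
import pandas as pd


def convert_range(x):
    """Return a float for '1200', the midpoint for '1133 - 1384', else None."""
    parts = str(x).split("-")
    try:
        if len(parts) == 2:
            # A range: average the two ends.
            return (float(parts[0]) + float(parts[1])) / 2
        return float(x)
    except ValueError:
        # Values such as '34.46Sq. Meter' cannot be converted.
        return None


# Tiny made-up sample standing in for the real dataset (price is in lakhs).
data = pd.DataFrame({
    "size": ["2 BHK", "4 Bedroom", "3 BHK"],
    "total_sqft": ["1056", "1133 - 1384", "34.46Sq. Meter"],
    "price": [39.07, 120.0, 18.5],
})

# '2 BHK' and '4 Bedroom' both start with the number we want.
data["bhk"] = data["size"].str.split().str.get(0).astype(int)

# Ranges become their midpoint; unparseable values become missing.
data["total_sqft"] = data["total_sqft"].apply(convert_range)

# Multiply by 1 lakh (100000) so the ratio is in rupees per square foot.
data["price_per_sqft"] = data["price"] * 100000 / data["total_sqft"]

print(data)
```

Rows whose square feet could not be parsed end up with a missing price per square feet, so they would still need to be dropped or filled before modelling.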
1,306 or 1,305, whatever it is, we can't pass that many columns to the model, so we have to reduce them. How are we going to reduce them? We will replace every location that appears fewer than 10 times with other. For example, this one is Agara village. It appears once. So in the rows where the location is Agara village, we will write other instead. All the locations with low counts will become other, and that will reduce our number of locations. That is what I did here. First I stripped the values: the problem was that leading and trailing white space around the strings was creating different values for the same place. So I passed apply a lambda function that calls x.strip on every location, and I assigned the result back to location, so the space at the front and back is removed. Then I ran value counts on location again and stored it as location count. These are my new locations: 1295, so the number has gone down, of course. Then I looked at which locations have a count of less than 10. There are 1054 locations that appear this rarely in our dataset. As promised, we are going to change these locations to other wherever they appear in the dataset. I have applied a lambda function on location which checks, for every location, whether it is in that low-count part of location count. If it is, we write other there; otherwise we write back the location itself. And now, if I run value counts on the location column of data, you can see the result.
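The location clean-up just described (strip white space, count, and relabel rare locations as other) might look like the sketch below. The column name location, the variable names, and the sample rows are assumptions; the threshold is lowered to 2 only so the tiny sample shows the effect, whereas the walkthrough uses 10.

```python
import pandas as pd

# Made-up sample; note the stray spaces around some names.
data = pd.DataFrame({
    "location": [" Whitefield", "Whitefield ", "Whitefield",
                 "Sarjapur", "Sarjapur", "Agara village"],
})

MIN_COUNT = 2  # assumption for this small sample; the walkthrough uses 10

# Remove leading/trailing spaces so ' Whitefield' and 'Whitefield' match.
data["location"] = data["location"].apply(lambda x: x.strip())

# Count how often each location occurs and collect the rare ones.
location_count = data["location"].value_counts()
rare_locations = set(location_count[location_count < MIN_COUNT].index)

# Relabel rare locations; keep the others unchanged.
data["location"] = data["location"].apply(
    lambda x: "other" if x in rare_locations else x
)

print(data["location"].value_counts())
```

Fewer distinct locations means fewer one-hot columns later, at the cost of treating all rare areas as one group.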
Now other has 2886 rows, and we have reduced the number of locations from about 1,300 to 242, so when the column gets encoded it will produce 242 columns. Now we have to remove some outliers, starting with total square feet. If you look at it, the minimum is 1 square foot. I don't know where you can find a flat of 1 square foot, but this is surely an outlier. Then our first quartile is 1100, the median is 1276, the third quartile is 1680 and the max value is 52,000 square feet. What if I calculate total square feet per BHK, that is, how many square feet there are for each bedroom? When I calculated that and ran describe on it, I found this: there is a flat with 0.25 square feet per BHK. So I applied a filter on the flats where the square feet per BHK is less than 300, since that is not a feasible flat, and removed them. Basically, wherever total square feet divided by BHK is at least 300, we keep the row. Then I ran describe, and you can see that the minimum has now become 300 and our data shape is 12,530 rows. Earlier it was 13,300, so it has decreased. Next I ran describe on price per square feet, which was visible earlier too, and looked at the max value. It is surely an outlier, which is not great. To remove such outliers I have written a function, so let me explain it step by step. The function takes a data frame, basically our data frame data, and first creates an empty output data frame. Then I call groupby on location. As you know, when you iterate over a groupby you get a key, which is the location, and the sub data frame for that key. For that sub data frame I compute the mean of price per square feet, that is, one location's mean price per square feet, and one location's standard deviation of price per square feet. Then I filter the sub data frame: I am keeping the prices that lie within one standard deviation on either side of the mean.
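The two filters described here, the 300 square feet per BHK rule and the per-location band of one standard deviation around the mean, can be sketched as follows. The function name remove_pps_outliers, the column names, and the sample rows are assumptions. The sketch also collects the kept pieces in a list and concatenates once, which is a small variation on concatenating onto an output data frame inside the loop.

```python
import pandas as pd


def remove_pps_outliers(df):
    """Keep rows whose price_per_sqft is within one std of their location's mean."""
    kept_frames = []
    for location, sub_df in df.groupby("location"):
        mean = sub_df["price_per_sqft"].mean()
        std = sub_df["price_per_sqft"].std()
        # A location with a single row has an undefined std, so that row is dropped.
        in_band = sub_df["price_per_sqft"].between(mean - std, mean + std)
        kept_frames.append(sub_df[in_band])
    return pd.concat(kept_frames, ignore_index=True)


# Made-up sample rows.
data = pd.DataFrame({
    "location": ["A", "A", "A", "A", "A", "B", "B", "B", "B"],
    "total_sqft": [1000, 1200, 1500, 1100, 400, 1000, 1000, 1000, 1000],
    "bhk": [2, 2, 3, 2, 4, 2, 2, 2, 2],
    "price_per_sqft": [4000, 5000, 6000, 20000, 5000, 3000, 3050, 3100, 9000],
})

# Step 1: drop flats with less than 300 square feet per BHK.
data = data[data["total_sqft"] / data["bhk"] >= 300]

# Step 2: drop per-location price-per-square-feet outliers.
cleaned = remove_pps_outliers(data)
print(cleaned)
```

One standard deviation is a fairly aggressive cut; a wider band would keep more rows at the cost of leaving more extreme prices in the data.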
And I am dropping the rest of the prices. So if the price per square feet of a row in this sub data frame lies between the mean minus the standard deviation and the mean plus the standard deviation, I keep it. I store the result in genDF, concatenate it onto the output data frame, and ignore the index. This loop runs for every location. The output looks like this. I have run describe on it. It's taking some time; okay, it's done. Now you can see that the maximum price per square feet has reduced a lot: it was 173 and now it's 24. Our mean has also decreased and our standard deviation has also decreased. Next, I removed the BHK outliers. I made this function, BHK outlier, which takes a data frame. First of all I make an array which will store the indices that we need to exclude. The data frame has a location column, and we do a groupby on it, so we get each location and its sub data frame. Then I made a dictionary called bhkStats which will store the stats for each BHK; I will tell you which stats. Next, within each location, I group by BHK. So I group by location, and within every location I group by BHK, and I get this BHK DF. In the BHK stats I store three things for every BHK: the mean, the standard deviation and the count. The mean is the mean price per square feet for that BHK, the standard deviation is that of price per square feet, and the count is how many times that BHK occurs. To explain the next part I have to show you these BHK stats, so I will print the location and its BHK stats first. Just watch what happens. Okay, right now it looks like garbage, but trust me, there is a lot of information in it. For each location you have a dictionary, and in this dictionary every BHK has its stats for this location. So 2 BHK has count 3.
So 2 BHK occurs 3 times in this location in our dataset, and 3 BHK has its own stats as well. Now for the thing I want to show you. Let us think about what the outliers are here. Take this 3 BHK in 5th Phase JP Nagar: if I have a 3 BHK flat, what can be the lower bound of its price per square feet? One valid lower bound is the mean of the next lower BHK, that is, the 2 BHK mean. If you think about it, the lowest sensible price per square feet for a 3 BHK is the 2 BHK mean, which makes perfect sense. So that is exactly what we have done: if the current 3 BHK price per square feet is greater than the 2 BHK mean, we keep the row, otherwise we don't. The other condition is that we need enough data to estimate the mean of the lower BHK accurately. So I have required at least five 2 BHK flats, so that we can calculate the mean reliably. Then comes the condition that if the 3 BHK price per square feet is greater than the 2 BHK mean we keep the row, and otherwise we drop it. That is how I find the outliers. Now let me uncomment the code and explain it. Here I am grouping by BHK: I first collected the stats, and now I loop over the BHK groups again. For example, say the BHK that comes here is 3. Then I look up the 2 BHK stats. Now, what does this get function do? Normally you would have indexed the dictionary directly, BHK stats at BHK minus 1. If BHK is 3, that looks up BHK stats at 2, so you are finding the stats for 2 BHK. If 2 is not present as a key, that will throw an error. So instead I have used the get function. Okay, what happened here? I pressed shift enter by mistake. No problem. With get, if the key is not found, the stats will be None. So if there are no 2 BHK stats, the stats will be None, and the if statement will not run.
I have also added one more condition: to estimate the mean of 2 BHK accurately, there should be at least 5 data points, so I check for that too. Then, if the price per square feet of the current row, say a 3 BHK row, is less than the 2 BHK mean, its index goes into the excluded indices; if it is greater than the mean, I keep the row. Once I have all the excluded indices, I drop them from the data frame by passing the excluded indices with axis equal to index. I ran it, and it takes a little while. Now if you look at the shape, there are only about 7,300 rows. We started with 13,300, so that is what is left: 7,300. This is now our clean data. Finally, I dropped two columns which are no longer useful. One is size, which we were supposed to drop earlier but didn't, because I forgot. The other is price per square feet, which was only a feature used to remove outliers. Once we have removed those outliers it has no other use, so I removed it. And this is how our data looks.
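The BHK outlier rule explained above can be sketched like this: within a location, a flat is excluded when its price per square feet is below the mean of the next smaller BHK, provided that smaller BHK has at least 5 flats. The function name remove_bhk_outliers, the column names, and the sample rows are assumptions for illustration, and a plain list is used for the excluded indices.

```python
import pandas as pd


def remove_bhk_outliers(df):
    """Drop rows priced below the mean of the next smaller BHK in the same location."""
    exclude_indices = []
    for location, location_df in df.groupby("location"):
        # First pass: collect mean, std and count for every BHK in this location.
        bhk_stats = {}
        for bhk, bhk_df in location_df.groupby("bhk"):
            bhk_stats[bhk] = {
                "mean": bhk_df["price_per_sqft"].mean(),
                "std": bhk_df["price_per_sqft"].std(),
                "count": bhk_df.shape[0],
            }
        # Second pass: compare each BHK group with the stats of (bhk - 1).
        for bhk, bhk_df in location_df.groupby("bhk"):
            stats = bhk_stats.get(bhk - 1)  # None when there is no smaller BHK
            if stats and stats["count"] >= 5:
                too_cheap = bhk_df[bhk_df["price_per_sqft"] < stats["mean"]]
                exclude_indices.extend(too_cheap.index)
    return df.drop(exclude_indices, axis="index")


# Made-up sample: five 2 BHK flats at 5000, and two 3 BHK flats.
data = pd.DataFrame({
    "location": ["X"] * 7,
    "bhk": [2, 2, 2, 2, 2, 3, 3],
    "price_per_sqft": [5000, 5000, 5000, 5000, 5000, 4000, 6000],
})

cleaned = remove_bhk_outliers(data)
print(cleaned)
```

In this sample the 3 BHK flat at 4000 is dropped because it is cheaper per square foot than the average 2 BHK in the same location, while the one at 6000 is kept.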
So our features are location, total square feet, bathroom and BHK, and our target column is price. I have saved this clean data using data.to_csv. Then I made X and y: X excludes price, so it holds the features, and price goes to y. Then I imported everything: train_test_split, and LinearRegression, Lasso and Ridge, since I am going to apply these three models, as well as OneHotEncoder, StandardScaler, make_column_transformer, make_pipeline and r2_score. Then I ran train_test_split to get X train, X test, y train and y test, and checked their shapes. Here I used a test size of 0.2 and set the random state to 0 so that we get the same split every time. Then I built the column transformer. In this column transformer I first pass the one-hot encoder, and I set sparse equal to False, for a reason I will explain. I only put the one-hot encoder on the location column, because, if you remember, that was the only categorical column. Yes, bathroom and BHK are integer columns, but I am not treating them as categorical. So I put the one-hot encoder on location, and then I made a standard scaler to scale things. I applied linear regression first: I made a pipeline and passed it the column transformer, then the scaler, and then linear regression. What happens is that when you call pipe.fit on X train, the data goes to the column transformer, where the one-hot encoding happens first; then it goes to the scaler, where the whole data is scaled; and then linear regression is fitted on it. That is what a pipeline does. If I run this, okay, the fit is done. Now I call pipe.predict, pass X test, and store the result in y pred LR. If you look, the R2 score is 0.82. Next I used Lasso. I made the same pipeline again, keeping the column transformer and scaler in place, and this time I passed Lasso instead of linear regression. Then I fit it and checked its prediction score, which was 0.81 R2, whereas linear regression was giving me 0.82. And then 
the last thing I did was apply ridge regression. So, make pipeline with Ridge: in this pipeline I first passed the column transformer, then the scaler, then Ridge. I trained it, predicted, and its score came to 0.82. Now I have put all three in one place: no regularization, Lasso, and then Ridge. It turns out that no regularization and Ridge are both giving the same result. So I am going to dump whatever is in pipe, which at this point holds the ridge regression pipeline, because Ridge is giving the same result as linear regression anyway. I imported pickle and dumped the pipe under the name ridgemodel.pkl, opened in write binary mode. Now here I am in PyCharm. First I transferred the clean data and the ridge model: I copied them here into the project. Then I made a Flask skeleton. You will need these libraries: Flask, scikit-learn, pandas and pickle-mixin. So in this virtual environment, first go to the terminal and install all of them with pip install. Okay, those are the four libraries. The Flask skeleton I have made renders the template index.html. I have made a templates directory that contains index.html, and there is basically nothing in it yet. I already have the app running, so if I click on the link it takes me to this page, which is empty because we have not written anything. Next I am going to use Bootstrap, so I will pick up the starter template from the documentation on the Bootstrap website and paste it in. We will not use any of the JavaScript it includes. It prints Hello, world, and we will change the title to House Price Predictor. I actually ran this app on port 5001, because port 5000 has the demo application running. If I show you, where is it?
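The modelling and export steps described above (one-hot encode location, scale, fit linear, Lasso and Ridge pipelines, compare R2, pickle the last pipeline) can be sketched as below. Everything here runs on made-up rows, so the scores are not the 0.82 and 0.81 from the walkthrough. Two details are assumptions that differ from what is said in the video: the sketch asks the column transformer for dense output with sparse_threshold=0 rather than passing sparse=False to the encoder, because newer scikit-learn releases renamed that keyword to sparse_output, and it sets handle_unknown="ignore" so unseen locations do not raise an error. The file is written to a temporary folder only to keep the example self-contained.

```python
import os
import pickle
import tempfile

import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up stand-in for the cleaned data; price is in lakhs.
rows = []
for i in range(40):
    location = ["Whitefield", "Sarjapur", "other"][i % 3]
    sqft = 800 + 50 * i
    bhk = 1 + i % 4
    bath = 1 + i % 2
    price = 0.05 * sqft + 10 * bhk + 5 * (i % 3)
    rows.append((location, sqft, bath, bhk, price))
data = pd.DataFrame(rows, columns=["location", "total_sqft", "bath", "bhk", "price"])

X = data.drop(columns=["price"])
y = data["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# One-hot encode only the categorical column; pass the numeric ones through.
column_trans = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["location"]),
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array for the scaler
)

scores = {}
for name, model in [("linear", LinearRegression()), ("lasso", Lasso()), ("ridge", Ridge())]:
    pipe = make_pipeline(column_trans, StandardScaler(), model)
    pipe.fit(X_train, y_train)
    scores[name] = r2_score(y_test, pipe.predict(X_test))
print(scores)

# After the loop, pipe holds the Ridge pipeline; pickle it in binary mode.
model_path = os.path.join(tempfile.mkdtemp(), "RidgeModel.pkl")
with open(model_path, "wb") as f:
    pickle.dump(pipe, f)
with open(model_path, "rb") as f:
    loaded = pickle.load(f)
```

Pickling the whole pipeline, not just the regressor, means the web app can pass raw form values and let the saved encoder and scaler do the preprocessing.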
Yes, this application is running on port 5000, so I ran ours on 5001. Now that we have Bootstrap in place, let's refresh. Here the title is House Price Predictor and Hello, world is displayed, so this confirms that Bootstrap is working fine. Of course, if you are working with Bootstrap this way you need an internet connection, because everything is delivered through a CDN. This is the link. If you are offline the first time you load the page, you won't get Bootstrap, so it's recommended that you stay connected while developing your website. That being said, the first thing we will do is add things to this page. Let us look at the finished design. There is a card, with text in the header that says Welcome to Bangalore House Price Predictor. Below it is a form with a button, and at the end there is the prediction. There are four fields in this form: three text inputs and one select, where people choose the location. So first they select the location, then the BHK, then the number of bathrooms, then the total square feet. Let's quickly build this design. The first thing I'm going to do is put a class on the body, bg-dark, which will give a dark background. Then I'm going to make a div. Let me quickly write the code for this. What I did here is make a container with a row inside it, and inside that a card whose width and height are 100% and whose margin is 50.
Now let's go and see. Oh no, not this one. Here you can see the background, which is black, and you have a card, which looks empty because there is nothing in it yet. So let's start by making the header. This div gets the class card-header, and inside it there is an h1 that says Welcome to House Price Predictor, and we give it a text-align of center. Now let's add another div for the body, with the class card-body. Let's see how our website looks. I am actually pressing Control and reload, so that the page is hard reloaded: whatever Chrome has cached is ignored and the whole website is loaded again. So here you have the card header, and below it there will be a form. The form's method will be post and its accept-charset will be utf-8. Now what goes in this form? There will be five things: location, BHK, number of bathrooms, total square feet, and then the button. I will make a div with the class row, then a col-md-6 inside it for each field. Okay, so here I made a group: on the col-md-6 div I also put the form-group class and set its text-align to center. I gave it a label, Select Location, and then I put a select with the selectpicker class. I guess the class name is selectpicker, not select-picker. Ideally there should be options inside it, and we will add them in a moment, but let's first see whether our location field is displayed properly. It won't show any locations yet, but its skeleton will be there. Here we can see the location field, and since we didn't give it any options nothing is listed. Before sending the options, let us make all the other inputs. I will give this one the id and name location. For the second field I will ask the user to enter the BHK, and this one will not be a select but an input that they type into. I will put form-control in its class, give it an id and a name, and set the placeholder to Enter BHK. I will use bhk as its id. We have to make two more fields like this. If you see here, we have location, BHK, number of bathrooms 
which is an input, and total square feet, which is also an input, so let us quickly make those two. I am going to copy this block and paste it two times. In the first copy I change the label from Enter BHK to Enter Number of Bathrooms, and I change the id and name to bath. The last one is Enter Square Feet; its id and name will be total square feet, and I change the placeholder to square feet. That is almost done. If I go and check, I have the four elements, and at the end there will be a button. So here I have made a button. I will not put anything in its style. I have given it the classes col-md-12, btn btn-primary and form-control, and whenever it is clicked it will call the send data method. Okay, let's see. Here the button has been created. Now if you look at the finished website, there is a space for the prediction, so we have to make that too. My form is complete, and after it I will print the prediction. So I put a div here with a style of text-align center, and inside it an h3 containing a span. The span needs an id, so I will give it the id prediction, because when my prediction is ready I will fill this span by looking it up by that id. Okay, nothing will be displayed there for now. Let's see if the page looks right. Yes. We have the location field, so now let us look at where its options come from. The first thing we have to do is load the clean data, but before that we have to import pandas as pd. Then we read the data with pd.read_csv on the clean data CSV file. Our locations are the values of the location column of that data. I am taking the unique values of location and sorting them, so I am passing them to sorted here.
So I am sorting the unique values of location, and I am passing these locations to my index file so that I can loop over them and fill in the options. Here I am going to write a Jinja2 template for loop: for location in locations. And I am closing the for loop now, before I forget; you also have to end the for, and I always forget to. Inside the loop there is an option, and something has to go in its value and something in its inner HTML. What goes there? In both of them, location. Let's go and re-run this. If I refresh, okay, our locations are here. Our next task is that whenever we click this button, it should send the form data to a predict method here on our backend. Let's make the predict method, or if not the complete method, then at least its skeleton. So the route is predict, and in its methods I have to pass POST. Then def predict, and I'll return something for now. So our predict route is made. Now let's implement the send data method. It is actually a JavaScript method, so I'm going to open a script tag here. The script tag doesn't close automatically, I don't know why; the editor doesn't auto-complete it.
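The backend at this stage (an index route that passes the sorted unique locations to a Jinja2 for loop, plus a POST-only predict skeleton) can be sketched as below. This is an illustrative sketch, not the app from the video: the project renders templates/index.html with render_template and reads the clean data with pd.read_csv, whereas here an inline template string and a small made-up data frame keep the example self-contained. The placeholder return value in predict is also an assumption.

```python
import pandas as pd
from flask import Flask, render_template_string

app = Flask(__name__)

# Stand-in for the cleaned CSV that the real app would load with pd.read_csv.
data = pd.DataFrame(
    {"location": ["Whitefield", "other", "Sarjapur", "Whitefield"]}
)

# Inline stand-in for the select element in templates/index.html.
INDEX_TEMPLATE = """
<select class="selectpicker form-control" id="location" name="location">
  {% for location in locations %}
  <option value="{{ location }}">{{ location }}</option>
  {% endfor %}
</select>
"""


@app.route("/")
def index():
    # Unique locations, sorted, so the dropdown is easy to scan.
    locations = sorted(data["location"].unique())
    return render_template_string(INDEX_TEMPLATE, locations=locations)


@app.route("/predict", methods=["POST"])
def predict():
    # Skeleton only: the prediction logic is filled in later.
    return ""
```

To serve it locally on the port used in the walkthrough, one option is to call app.run(debug=True, port=5001) under an if __name__ == "__main__" guard. Because predict accepts only POST, a plain GET to /predict answers with 405 Method Not Allowed.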
Okay, whatever. I am going to make this function, which is named send_data. Here is what happens: if a form has a button or a submit input in it, then clicking that button triggers the submit event of the form. In a normal HTML form you would have an action attribute where you pass some URL, and whatever form data there is gets posted to that URL. But that takes you to another HTML page, and we don't want that, so we want to suppress this behavior. On the press of this button we don't want to go to any other page; we just want to send the data to the predict method, and whatever prediction comes back we will display in this span. To suppress the default behavior we have to write some code, so let me write it and then I will explain. Here I have used document.querySelector to select the form, and I have added an event listener on it, so whenever the submit event is triggered on that form, this function, form handler, is executed. What does form handler do? It receives the event and prevents its default behavior, so my form's default submission is suppressed. Let me comment this line out to show you what happens without it. If I refresh, select something, and click on predict, it says Method Not Allowed. Now if I uncomment it, I have suppressed the default behavior of the form. Let me go back and reload. Now if I hit Predict Price, nothing happens. That is what the suppression means. Now I can happily send this form data to the predict method, and I am going to do that using an XMLHttpRequest. Here I select the form and store its contents in a FormData object. FormData objects are what you use to hold form data, so whatever the user has entered or selected is saved in this FormData object.
Then I created an XMLHttpRequest object and opened a request to the predict URL. It has not been sent yet; I have only opened the request. At the point where I open the request, the user has just clicked the button, so the page should say Wait, predicting price. I will write that text into the span. How am I going to do that? With document.getElementById: I look up the element whose id is prediction and change its innerHTML to Wait, predicting price. Usually this will not be visible, because the prediction is too fast, but if our server gets slow you have to show the user that the price is being predicted and that they have to wait. Now let me refresh and hit the button: it says Wait, predicting price, and when the data comes back we will change it. So we somehow have to detect when our prediction arrives. Whenever we get the prediction we have to catch it, and we catch it using onreadystatechange. We assign a function to onreadystatechange. What function?
If the XMLHttpRequest is done, that means we have got back our response, which is in responseText. So I set the innerHTML of the prediction span to it, and our prediction value will be shown there. Now there are two more things left. You have to set onload, otherwise it doesn't work. onload is not a function, it is an attribute, and we assign an empty function to xhr.onload. At this point our XMLHttpRequest is only opened; we have not yet sent the form data. To send it, we write xhr.send and pass the form data. So basically we open a new request, send the form data, and whenever we get back the response we write it into the prediction span. If I refresh and run, everything should run fine: it says "Wait, predicting price" and then shows nothing, because our predict method, which we had hardcoded, is not returning anything yet.
Now let's fill in the predict method. For that we are going to need our model, so I will store it in a variable called pipe. We are going to use pickle; oh, I didn't import pickle, so import pickle. I load the file ridgemodel.pkl, opening it in read-binary mode. In this predict method we are going to get four things: the location, then BHK, then bathrooms and lastly square feet, which I will write as sqft. To access the location you write request.form.get with the name of the field. This select has the name location, so when the user selects an option, you get that data under the name location. You also have to import request, so from flask import request. Then BHK, whose name is bhk, I guess. The rest we also get with request.form.get. OK, for this one the name I had written was total square feet.
Now let's print and see whether the data selected by the user is coming through correctly. So I will return once. Let's refresh. I select 7th Phase JP Nagar, 2 BHK, 2 bathrooms, 2000. As I click on Predict Price and go to PyCharm, I see that all our data is being displayed correctly. So now we are clear to predict.
To predict we need an input. I pass a DataFrame as the input, because that is how our pipe was trained. This DataFrame has four columns: location, total square feet, bath and BHK, and we have to fill it with the user's data. I am passing a double list because that is how the data argument works: location, sqft, then bath, autocorrected, and bhk. Now my input DataFrame is ready. I call pipe.predict, pass the input DataFrame, and store the result in a prediction variable. I know the prediction is at the 0th position: predict gives us a list with only one value, so we take the 0th value. This gives you back a number, and you return str(prediction), whatever prediction comes.
Let's rerun and refresh. I have selected JP Nagar, 4 bedrooms, 3 bathrooms, 2000. As soon as I click on Predict Price it says 120 lakhs, or 1 crore 20 lakhs flat. You can see the precision is very high, so we have to reduce it somehow, and we are going to convert this from lakhs to rupees. To convert from lakhs to rupees you multiply by one lakh; a good way to write that is 1e5. And I will round it off to two decimal places. For that I am going to use numpy, though you can use anything else: import numpy as np, and then np.round with two decimal places. Let's rerun and refresh.
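The server-side steps described above (read the four form fields, build a one-row input, call predict, take the 0th value, convert lakhs to rupees and round) can be sketched in plain Python. This is only a minimal sketch under stated assumptions, not the code from the video: StubPipeline is a hypothetical stand-in for the pickled pipeline, a plain dict stands in for request.form, a list of lists stands in for the one-row DataFrame, the field names and the stub's pricing rule are made up, and the built-in round is used where the video uses np.round. The helper describe_inr is also hypothetical; it just shows the crore and lakh arithmetic used when reading the result aloud.

```python
CRORE = 10_000_000  # 1 crore = 100 lakh
LAKH = 100_000      # 1 lakh = 1e5 rupees


class StubPipeline:
    """Hypothetical stand-in for the pickled pipeline; returns prices in lakhs."""

    def predict(self, rows):
        # Made-up rule for illustration only: 0.0625 lakh per square foot.
        # Each row is ordered [location, sqft, bath, bhk].
        return [row[1] * 0.0625 for row in rows]


def predict_price(pipe, form):
    """Mirror the steps of the predict view for one submitted form."""
    # Form values arrive as strings, so convert the numeric ones.
    location = form.get("location")
    bhk = float(form.get("bhk"))
    bath = float(form.get("bath"))
    sqft = float(form.get("total_sqft"))

    # A double list: one inner list per row of input.
    rows = [[location, sqft, bath, bhk]]

    # predict returns one value per row; we sent one row, so take index 0.
    price_lakhs = pipe.predict(rows)[0]

    # Convert lakhs to rupees and round to two decimal places.
    price_rupees = round(price_lakhs * LAKH, 2)

    # The view returns text, which the page writes into the prediction span.
    return str(price_rupees)


def describe_inr(rupees):
    """Split a rupee amount into whole crores and leftover whole lakhs."""
    crores, remainder = divmod(int(rupees), CRORE)
    lakhs = remainder // LAKH
    parts = []
    if crores:
        parts.append(f"{crores} crore")
    if lakhs:
        parts.append(f"{lakhs} lakh")
    return " ".join(parts) if parts else "less than 1 lakh"


form = {"location": "6th Phase JP Nagar", "bhk": "4", "bath": "3", "total_sqft": "2000"}
result = predict_price(StubPipeline(), form)
print(result)                      # rupees as text, from the stub's made-up rule
print(describe_inr(120 * 1e5))     # 120 lakhs read as crores and lakhs
```

In the real app the same steps would sit inside the Flask view, with request.form in place of the dict and the unpickled pipeline in place of the stub.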
6th Phase JP Nagar, 4 bedrooms, 3 bathrooms, 2000 sqft: that is the selection. And it tells us how much is needed: basically you need 1 crore 43 lakh rupees to buy this house. OK, so I think that is working pretty well. Now, if you want to deploy this to Heroku, you are going to need something called Flask-CORS, for cross-origin requests. Basically, the XMLHttpRequest is an HTTP request and the URL provided by Heroku is HTTPS, so sometimes it causes problems. If you have that problem, you have to use Flask-CORS; just google it. And thank you for watching.