Transcript for:
Matplotlib Basics and Plot Customization

Welcome, everyone. In this session we are going to continue learning Matplotlib in Python. In the last session we saw the basic fundamentals behind Matplotlib and the basic data structures it uses. These data structures are arranged in a hierarchical fashion. At the top of the hierarchy we have figures. A figure object is basically like a blank canvas, so whenever you begin plotting in Matplotlib, you create a blank figure object. Inside that canvas you create one or more axes objects. Each axes object contains one plot, so you can create multiple plots in a single figure. Then, within an axes, we have the x-axis and the y-axis, which allow us to assign labels: we add labels, we add titles, we add ticks, and we can customize the ticks as well. So essentially, in the last session I gave you an overview of the data structures contained in Matplotlib and how they are arranged in the library.

In today's session we are going to dive further into the details. We'll practice some basic charts, and I'll also give you an introduction to some of the ways in which you can create multiple plots in one figure. We are going to see how you can customize plots in Matplotlib. By customizations I mean how you can assign colors, how you can create markers, how you can put multiple plots together inside a single figure object, and things like that. We will cover some plotting techniques, looking at which plots you use for categorical data, which plots you use for numerical data, and so on. Then we'll see how you can use Python libraries like pandas and
NumPy, and how Matplotlib works with pandas and NumPy together, and towards the end we'll summarize the session. This is going to be a practical session, so let me quickly share my screen.

All right, I'm in the same notebook we were practicing in during the previous class. In this notebook, as you can see, we had created a basic line chart. To create it we called the plot function and passed it the voltage and the current, so in a way we are simply visualizing Ohm's law using Matplotlib. A simple chart, nothing too complicated. Going forward, I want to create a couple more charts, but first let me demonstrate how you can customize these basic charts. Until now we have created a simple chart for V = IR, so let me quickly create those variables again. As you can see, the voltage variable here is a NumPy array, and in the same way the second variable, current one, which is what we will use here, is also a NumPy array.

Now let's see how you can customize the way the lines look in this chart. The default style is essentially this: on one axis we put the voltage, on the second axis we put the current, and if I plot it, what you get is a solid blue line. If you want, you can customize this. To change the color, there are abbreviations you can use to specify different colors. If you want a red line you can specify r, and this will change the color of the line to red. If you want a blue line you can specify b; for a black line you
can specify k; for green you specify g, and so on. Each color has a letter assigned to it, and when you provide that letter it is converted into the respective color.

Not only colors, you can also customize the way the lines themselves look. Right now what you are seeing is a line which is green and solid. If you require, you can also create a dashed line by passing two hyphens alongside the color. The first character signifies which color it is; following that, you write how you want to style the line. In this case I want a green line which is dashed, and the double hyphen indicates that you want a dashed line. If I run this, we now have a dashed line which is green in color. Again, if you want a red dashed line, you simply specify r followed by a double hyphen. The hyphen is used to specify the pattern here: a double hyphen means you want a dashed line, and a single hyphen creates a solid line. So single hyphen stands for solid, double hyphen stands for dashed. The default style used in Matplotlib is a solid blue line, so b followed by a single hyphen is basically the default way in which a plot is created, and you can see here we are getting a solid blue line.

All right, so these are some of the ways in which you can customize. Apart from that, there are other characters you can specify to further change the way these charts look. Suppose that instead of a dashed line you want markers only: instead of joining the data points together to create a solid line, you simply want a dot to represent each data point. Here you can see that if you specify a lowercase o, it simply means that you want
to draw each point as a dot without a connecting line. So this right here is that dotted plot, where each marker is represented by a circle. You can specify an asterisk if you want stars here, so you can see these asterisks are drawn as stars. If you put a caret, the caret is going to convert the markers into triangles, so these are the triangles. If you want red triangles you can specify r followed by a caret, and this will turn them into red triangles. So you get a bunch of different customizations when it comes to how you want to show your lines: the first character customizes the color, and the second character customizes how the line or marker looks. That's the first customization it allows.

If you want, you can also create multiple lines inside the same figure, and you can design how you want both lines to look. What I'll do first is simply call plt.show in order to hide the text output here, so that we are showing just the chart and nothing else. So the customizations we have done so far are that we have changed the colors and we have changed the styles of the lines and markers. Now I can call plt.plot, and here on the x-axis I'll plot voltage and on the y-axis I'll plot current one, and for this first plot we use a solid red line. The second plot is voltage versus current two; let's create a black dotted line for it. So in this case we are creating two different lines, and each line has its own style. Here you can specify a label as well. Let's say the first line's label is V = I * R, and for the second line let's imagine the label is something like V = I * R squared; I'm not sure what exact formula I had used, so I'm just guessing the formula here. Then you can call plt.legend to show the labels.
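
The format strings described above can be tried out with a minimal sketch. The session's actual voltage and current arrays are not shown, so the data here (voltage, current_1, current_2 and the 2 ohm resistance) is assumed purely for illustration. The ":" code for a dotted line is standard Matplotlib but is not mentioned in the session.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Assumed stand-in data (hypothetical values, not the session's arrays).
voltage = np.arange(1, 11)        # 1..10 volts
current_1 = voltage / 2.0         # I = V / R with an assumed R of 2 ohms
current_2 = voltage / 20.0        # a second, much smaller series

plt.figure()
# Format string: a colour letter followed by a line style and/or a marker.
(red_solid,) = plt.plot(voltage, current_1, "r-", label="current 1")     # solid red line
(black_dotted,) = plt.plot(voltage, current_2, "k:", label="current 2")  # dotted black line
(green_triangles,) = plt.plot(voltage, current_1, "g^")  # green triangles, no joining line
plt.legend()   # builds the legend from the label= arguments
# plt.show()   # in a notebook this displays the figure
```

Swapping "k:" for "k--" would give the dashed style discussed earlier, and "ro" or "r*" would give red circles or red stars.
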
With the legend in place, you can see that in the same chart the first line is the solid red line and the second is a dotted black line, and both lines are created in the same chart.

All right, next up we'll talk about how you create multiple plots in the same figure object in Matplotlib. Right now we were using one figure object and, inside it, overlaying multiple charts on the same axes. What you observe is that although this figure object contains only one axes object, on that same axes object I have two different line charts. Sometimes what can happen is that the scales of the two are different, so you see that there is not much movement in the bottom plot whereas there is significant change in the top plot. The reason is that both of them share the same y-axis, so whichever variable has the larger scale will dominate the variable with the smaller scale. In this case you practically do not see any change in the black line, whereas the red line shows a clear rise. To avoid this problem, people often break each chart out into separate axes, that is, they create multiple plots in the same figure object.

For that, the first thing you have to understand is the concept of axes and how axes are numbered in Matplotlib. In the example I'm showing, I'm creating a bigger figure object, but this time I'm dividing it into two halves. So what we have here is a figure with one row and two columns. Since we start counting from the left-hand side, this is the first and this is the second. So in this case we are
going to create a subplot. Here 1, 2, 1 indicates that we have a grid with one row and two columns, and that this is the first axes, in which I am creating the chart. Now, there are two ways of indexing. I have shown you the earlier way, in which you index an axes object by specifying the values like this. But if you want to locate a particular axes object by counting which one it is, you start the count from the top-left corner and go row-wise. So this subplot will have a count of one and this one will have a count of two. In the previous whiteboard, if you count in integers, this subplot will be one, this will be two, this will be three, and this will be four. So you can index values like this, or you can index each plot by its count, and the subplot function uses the count: you specify which subplot you are creating.

In our example we are dividing a figure into two halves. In the first half we are going to plot one chart, and in the second half I'm going to plot a different chart. Let me demonstrate that. Firstly I specify how many rows and how many columns are created inside the figure object, so we have one row and two columns, and then you have to specify the index of the plot you are creating. Here I'm going to create my plot in the first box of the grid. So now I have accessed a particular subplot inside my bigger grid, and after that I can call any plotting function I want for this subplot. In this case I will use the plot function to create an x versus y graph. Let's add a title, y = log(x); remember, this is the title of this subplot only. Now, for the next subplot, we'll again call the subplot
function. This time again we are creating a grid, and the grid is still 1 by 2; the grid is going to remain the same. I have created y versus log x in the first subplot, and now I'm going to create a different chart in the second subplot. How will I create the second chart? Again I can call plot, and this time let's plot y as np.exp of x, and give it a title with plt.title, y = exp(x). Then simply call plt.show and let's see what it looks like.

First of all, as you can see, this figure is much bigger than the previous figures we had created. Why is that? Because in this example we have created a bigger figure. In the previous examples Matplotlib was automatically creating a figure for us with the default dimensions, but here I wanted to change the dimensions of the figure, which is why I'm providing a parameter called figsize to create a bigger figure than the default one. Then I divided this figure into two halves. You use the subplot function to create the grid: the first parameter specifies how many rows, the second parameter specifies how many columns, and the third parameter specifies which axes you are plotting on. In this case you have 1 by 2, that is, a total of two axes inside this figure. Of these two axes, I am plotting on the first for the first chart, and for the second chart I am plotting on the second. The grid remains the same because we are plotting on the same grid. Then you can call any plotting function you want on the subplot you have selected, and finally you can call plt.show to display the output.
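
As a minimal sketch of the two-panel layout just described: a wider figure via figsize, then plt.subplot(rows, columns, index) to select each axes before plotting on it. The x values and the (12, 4) size are assumed; the session does not show its exact numbers.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(1, 5, 50)            # assumed sample points

fig = plt.figure(figsize=(12, 4))    # (width, height) in inches, assumed size

left = plt.subplot(1, 2, 1)          # 1 row, 2 columns, select axes number 1
plt.plot(x, np.log(x))
plt.title("y = log(x)")              # the title applies to the current subplot only

right = plt.subplot(1, 2, 2)         # same grid, select axes number 2
plt.plot(x, np.exp(x))
plt.title("y = exp(x)")
# plt.show()                         # in a notebook this displays the figure
```

Each pyplot call after plt.subplot acts on the most recently selected axes, which is why the same plt.title call produces two different titles.
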
With the figure shown, the first chart, on the left-hand side, is y = log(x). You can see that both charts have independent x-axes and independent y-axes, so if you want you can assign them individual labels. So I call plt.xlabel, and plt.ylabel with log of x; then again at the bottom, for the second chart, I can write plt.xlabel and plt.ylabel with y. What you will see is that both of them remain completely independent. You simply call the respective function after selecting each subplot, and you can assign individual x labels, y labels and titles. You can also add legends. Here you can specify a label for this chart, so I can pass label equals chart one, and for the other, chart two, and then simply call plt.legend and plt.show. You have to call plt.legend for both charts, so you call it twice. You can see here, chart one and chart two, the legends are showing, and you have the respective x label, y label and so on. So everything is customizable for both charts individually. I hope this is clear to everyone.

The next thing I want to demonstrate is a really interesting one. Let's say you want to create three plots. Instead of creating two charts I want three, in which the top two charts occupy plot one and plot two, whereas the bottom chart occupies the space of plot three and plot four. What do I mean? I'm back on my whiteboard. Imagine that you have a big figure object, and I want to divide this figure object into a grid. Let's say we want two rows and two columns. I'm just
dividing it into two rows and two columns; I hope this is clear. Now, this is plot one, this is plot two, this is three and this is four; this is how the respective axes inside a figure are numbered. The first subplot is going to be created in axes one, and let's say it contains a line chart. The second one also contains a different line chart. But the last plot has to span across both of the bottom cells. So instead of drawing it like this, I want to erase this line in between, and the arrangement looks something like this. In this example the last chart you create is going to span across the bottom two cells. So although you have a grid of two by two, you are creating only three plots inside it, such that the first two plots each occupy one cell individually and the bottom one spans two cells simultaneously. So how do you achieve this?

I'm back in my Matplotlib notebook. There will be slight modifications to the previous code, so I'll simply copy the previous code so that we can get there a little faster. First we are changing the grid. Here I'm creating a 2 by 2 grid, so we have two rows and two columns. The first plot is going to remain the same, which is y versus log of x, so we are not changing anything there. In the second chart the grid shall change as well: we are again creating a two by two grid, and we are plotting on the second axes object. The plot itself is going to remain the same. What we'll change now is this: I'm creating a third object here. One minute. This time the last object we have is spanning across two axes
simultaneously, axes three and axes four. So what I did is create a tuple, and inside this tuple I have specified the axes that I want this plot to span across. Three and four is what I want to use for creating the last subplot. Here also I can specify whatever plot I want to create. In this case let's say we want a chart of x versus the square root of x; x and y will remain the same variables, it doesn't matter, and y = sqrt(x) is what we want to create. We'll call the individual legends and finally call show.

If you run this, you can see what the output looks like. In this output we have the top two plots, where the first chart is y = log(x) and the second one is y = exp(x), and the third chart at the bottom is y = sqrt(x). This last chart spans the width of the top two charts. By passing the tuple we specified the two axes we want it to span, and it has spread the plot across both of them. This is a quick customization which allows you to create subplots that span multiple axes.

Great. Next we'll discuss some of the commonly used types of plots, one by one. For the purpose of this exercise let me read a simple dataset in this notebook itself. Let me check whether I have pandas imported, and let me increase the font size slightly. We'll read a CSV file with pandas, so I'll call pd.read_csv on the insurance file.
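
The spanning layout can be sketched as follows. Passing a tuple such as (3, 4) as the third argument of plt.subplot asks for one axes covering cells 3 through 4 of the grid. The x values, figure size and the labels "chart 1" to "chart 3" are assumed for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(1, 5, 50)              # assumed sample points
fig = plt.figure(figsize=(10, 8))      # assumed size

top_left = plt.subplot(2, 2, 1)        # 2 x 2 grid, cell 1
plt.plot(x, np.log(x), label="chart 1")
plt.title("y = log(x)")
plt.legend()                           # legend for this subplot only

top_right = plt.subplot(2, 2, 2)       # 2 x 2 grid, cell 2
plt.plot(x, np.exp(x), label="chart 2")
plt.title("y = exp(x)")
plt.legend()

bottom = plt.subplot(2, 2, (3, 4))     # tuple = first and last cell to span
plt.plot(x, np.sqrt(x), label="chart 3")
plt.title("y = sqrt(x)")
plt.legend()
# plt.show()                           # in a notebook this displays the figure
```

The figure ends up with three axes rather than four, and the bottom one is roughly twice as wide as each of the top two.
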
The insurance data is what we want. We are going to use this data and create some basic charts to visualize the relationships between its different variables.

The first chart is the line chart, which I have already shown you. You can create a line chart using the plot function, and you can customize the way the lines look. Creating a line chart is pretty simple, and usually you create a line chart for time series data. The insurance data is not a time series, but we do have a time series dataset, df_time_series, so let me read that. Here you can see on the x-axis we have the date and on the y-axis we have the count of passengers who have traveled on our airline. We used this dataset previously when we were practicing pandas, so I'm just going to use it again here and demonstrate real quick how a time series looks.

Before using this data as a time series, let me convert the date column into a datetime column. Right now, if I show you the data types of the columns, you can see the date column is currently read as a string and the count column is read as an integer. You can change the data type of these columns, as we have seen previously. So what I'll do is, on df_time_series, for the date column, I'll call pd.
to_datetime. I'll pass the column that I want to convert, and I'll pass the format string, percent-Y, hyphen, and so on; I have already explained previously how to write and understand these formats. Now if I show you the data types again, you can see the date column has been converted into a datetime type.

So the first chart, let me just label it again, is the line chart, and a line chart is usually used for time series data. For creating a line chart you can again simply call the plot function, and in this case you pass df_time_series.date for the x-axis and the count column for the y-axis. The thing is that this is not working with the pandas Series directly here, so I'll convert it into a NumPy array, and actually a one-dimensional NumPy array, which is why I don't pass data frames here. So let's create the x and y variables separately: df_time_series.date.values is what we want. If you look at x, this is a timestamp, and it's not going to visualize easily here. So let me keep it as a plain date instead; rather than converting it into a datetime, I'll keep it in text format. The reason I'm keeping it as text is that Matplotlib does not format a date column on a chart as conveniently; seaborn does, so when we practice seaborn I'll show you how you can use seaborn's implementation for this in a much easier way. Since we are discussing Matplotlib right now, I'll skip the conversion to a datetime column for now and show it later on, in the seaborn part. So here I'll create a line chart: on the x-axis let's say we want df_time_series.date.values, which I'll convert into a list, and on the y-axis I'll do df_time_series.count.values.
tolist. Actually, count is a method of the data frame in itself, so I cannot use the dot syntax for it. So now we have the x and the y variables: x, as you can see, is a Python list which contains all the dates, and y is a variable that contains the count for each date. If you call plt.plot here with x and y, it might take a little longer, and the labels will not look good at all. One minute. It's going to take some time, so to avoid the wait I'll just comment this line out.

I'll show you a much simpler way of creating line charts using Matplotlib and pandas together. I think I showed you this previously when we were practicing pandas. Whenever you have time series data in a data frame and you want to turn it into a nice plot, you call the plot function on the data frame. After calling the plot function, you specify which values go on the x-axis and which go on the y-axis. Here, on the x-axis I want to put date, on the y-axis I want to put count, and for kind I'll specify line. It will create a line chart for you based on the data you have provided. You can see it's much easier to read. Internally it is also using Matplotlib to create these charts; it's just that the conversion and the way the labels are created are abstracted away, so you do not have to worry about aligning them properly. Although the column holds dates, it automatically detected the months and rendered them in a much more readable format. So it's much nicer to use the pandas plot function whenever you want to create a plot of this kind from a data frame. Which kinds of plot it supports I have already covered in the final pandas video, on data visualization using pandas.
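
A minimal sketch of the pandas route described above. The airline file is not available here, so the six rows below are hypothetical stand-ins, and the "%Y-%m" format is an assumption about how the dates are written.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook
import pandas as pd

# Hypothetical stand-in for the airline data: monthly dates and passenger counts.
df_time_series = pd.DataFrame({
    "date": ["2020-01", "2020-02", "2020-03", "2020-04", "2020-05", "2020-06"],
    "count": [112, 118, 132, 129, 121, 135],
})

# Convert the text column to datetimes (the format string is assumed).
df_time_series["date"] = pd.to_datetime(df_time_series["date"], format="%Y-%m")

# df_time_series.count is the DataFrame.count method, so the column
# has to be selected with square brackets instead of dot syntax.
y_values = df_time_series["count"].tolist()

# pandas builds the Matplotlib chart and formats the date ticks itself.
ax = df_time_series.plot(x="date", y="count", kind="line")
```

The call returns a regular Matplotlib axes object, so titles and labels can still be adjusted afterwards with methods such as ax.set_title.
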
All of those charts are covered there in detail, so you can refer to that lecture if you are interested in learning more.

The next chart we want to talk about is the box plot, so let's create a box plot real quick. What's a box plot? A box plot is a kind of plot created for a numerical variable, that is, a continuous numerical column in your data. If you take a look at the insurance data, we do have a continuous column called expenses, and this continuous variable can be visualized using a box plot. The idea of a box plot is to see what the distribution of that continuous column looks like throughout its range, and whether there are any outliers in your data. I've spoken in detail about the identification of outliers as well: we talked about the interquartile range and how you use it to detect an outlier. A box plot is nothing but a visual way of showing the same information. So let me create a figure, a slightly bigger one, let's do 5 by 5, and then I'll call plt.
boxplot, and pass the variable I want to plot, let's say df.expenses, and then show it with plt.show. What you see here is the box plot of the expenses paid by different customers. The box region here shows your interquartile range: the height of this box is the interquartile range. Then the lines extending above and below the box are your whiskers. You can see how these whiskers are calculated by default: there is a parameter called whis, and you can use it to control where the whiskers lie. Here you can see it takes a float, and the default value is 1.5. What does this default indicate? It indicates the default position of the whiskers. 1.5 means the upper limit is pushed to the upper quartile plus 1.5 times the interquartile range, and the lower limit to the lower quartile minus 1.5 times the interquartile range. Whatever values lie beyond this range are usually considered outliers in your data. So in our data, all these circle markers you are seeing at the top represent the outliers present in our data.

You can similarly create box plots for body mass index as well. You can replace this entire thing with bmi, or you can do one thing: you can divide the figure into subplots. So plt.subplot, and here let's say we are creating one row and two columns, and let's add a quick title for this one, expenses. Similarly, I'll do a quick copy and paste for the second plot. We are not creating the figure again, so the figure call is supposed to go up to the top. Give me a minute, I'm just rearranging the code to make it look a little nicer.
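
To make the whisker rule concrete, here is a small sketch that computes the 1.5 times IQR limits by hand and compares them with what plt.boxplot marks as outliers. The expense values are hypothetical, not taken from the insurance data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical expense values (not the insurance data from the session).
expenses = np.array([1200, 1500, 1800, 2100, 2500, 3000,
                     3600, 4200, 5000, 30000, 45000])

q1, q3 = np.percentile(expenses, [25, 75])   # lower and upper quartiles
iqr = q3 - q1                                # interquartile range, the height of the box
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = expenses[(expenses < lower_limit) | (expenses > upper_limit)]

plt.figure(figsize=(5, 5))
result = plt.boxplot(expenses, whis=1.5)     # 1.5 is the default, written out for clarity
flier_values = result["fliers"][0].get_ydata()   # the points drawn as outlier markers
# plt.show()                                 # in a notebook this displays the figure
```

Matplotlib draws each whisker at the most extreme data point that still lies inside these limits, so the whisker ends sit on real observations rather than on the limits themselves.
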
So here we are creating a figure, and let's make it bigger, 10 by 5. Then I'm creating the first subplot, which contains the box plot for expenses, and in the second subplot I'm creating the box plot for body mass index, so I'm writing body mass index as the title there as well. Then I simply call the plt.show function to see the results. In this case you can see we do have some outliers in the BMI plot as well. We have people who are extremely obese, whose body mass index is more than 45, so there are people who are extremely overweight. You also see that expenses have values in the outlier category, that is, there are people who are paying exceptionally high expenses compared to the majority of the population. So this is a quick example of how you can create box plots using Matplotlib.

Next let's create a histogram; learning the histogram is much simpler. Just like you use a box plot for visualizing a numerical variable, histograms are also used for visualizing a single numerical column. For the histogram you can take this exact same data and the exact same code; all that changes is the name of the function you use to plot. To create a histogram you call the hist function and specify which column you want the histogram for, plus you have a bins parameter that lets you set the number of bins in the histogram.

Now, if you're not aware of what a histogram is, let me quickly explain, so here I am on my whiteboard. What exactly is a histogram? Imagine that you have numerical data. That data is going to have some lower bound and an upper bound. Imagine you have salaries, and let's say the minimum salary is zero
and the maximum salary is 1 million. A histogram is nothing but a way to visualize the distribution of salaries in this range, that is, which bucket has the highest number of people and which bucket has the least. To visualize what the distribution of salaries looks like in a company, you divide this entire range into buckets of equal size. Then you count how many people are in the first bucket, how many in the second bucket, how many in the third, and so on up to the last bucket. Once I have the buckets and the number of people in each bucket, I simply create a bar chart in which the height of each bar corresponds to the number of people lying in that bucket. This distribution is what a histogram tells you: a histogram is nothing but a way to visualize how the values are arranged inside a given range.

You can customize how many bins you want to create. Let's say we have 10 bins; you can increase this to 20, 30, 40, 80, 100, as many bins as you want. The more bins you have, the smoother this curve is going to look, and as the number of bins grows very large and the bins become very narrow, the outline starts to resemble a kernel density estimate. We'll talk about kernel density estimates later on as well. Just to recap: the more bins you have, the more closely packed they are and the smoother the curve looks, and that is basically what a histogram is all about. So in this case, imagine that I'm creating 100 bins, that is, dividing the range of expenses into 100 equal-size buckets.
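
The bucket counting from the whiteboard can be sketched in a few lines and checked against plt.hist. The salary values (in thousands) and the choice of 5 bins are hypothetical, picked so the buckets are easy to verify by eye.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook
import matplotlib.pyplot as plt

# Hypothetical salaries in thousands, ranging from 0 to 100.
salaries = [0, 5, 12, 25, 31, 38, 47, 55, 58, 63, 71, 100]
n_bins = 5
low, high = min(salaries), max(salaries)
width = (high - low) / n_bins            # every bucket has the same width

# Count by hand: work out which bucket each value falls into.
manual_counts = [0] * n_bins
for value in salaries:
    # The maximum value belongs to the last bucket, hence the min().
    index = min(int((value - low) / width), n_bins - 1)
    manual_counts[index] += 1

plt.figure()
# plt.hist does the same bucketing and draws one bar per bucket.
counts, edges, _ = plt.hist(salaries, bins=n_bins)
# plt.show()                             # in a notebook this displays the figure
```

Raising n_bins narrows each bucket, which is the knob the bins parameter exposes.
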
buckets of equal size, and then I create a bar chart of the buckets against the number of items in each bucket. The same exercise I'm going to perform for BMI. In the case of BMI you can create a smaller number of bins, just to visualize it a little bit better. You can call the show function and it's going to create the histograms for both cases. So you see here that the chart on the left, since the number of bins is high, looks a little bit smoother; the one on the right-hand side has fewer bins, but you get the idea. This tells you that for the majority of the population the premium they are paying is somewhere around 15,000 or so; 13,000 to 14,000 is what the majority of people are paying. And as the expenses go up, the number of people in the respective bucket keeps going down. There are very few people lying in this bucket, who are paying an extremely high premium.

If you take a look at BMI, this looks like a normal distribution, indicating that the majority of people have a body mass index around 30. Keep in mind that a BMI of 30 is usually taken as the threshold for obesity, so this population sits on the heavier side. The highest frequency occurs somewhere around 30, and as BMI decreases or increases from there, the frequency of people in the respective buckets goes down. So we see these histograms created for the body mass index and the expenses columns.

All right, now it's time for the scatter plot. The scatter plot is a 2D chart which is used to visualize two numerical columns together. In this case you call the scatter function, and in the scatter function you specify what you want to put on the x-axis and the y-axis. So here, let's say on the x-axis I want to keep df.bmi, and on the y-axis I want to keep
df.expenses. Pretty simple. After this, let's say you want to customize the color of each marker. In this case I can specify the color using the c parameter; c is short for color. You can change the color of the chart by specifying this parameter, and you can see here that this is what the scatter plot looks like.

You create scatter plots to visualize the relationship between two numerical columns: how does one column interact with the values in the other column? If you increase the value of one column, does the value of the second column increase or decrease? How the two columns interact with one another is what we try to answer using the scatter plot. In other words, it's a way to qualitatively see what the correlation between two numerical columns looks like. In the case of body mass index we do see that, in this region, if the body mass index increases too much the premiums definitely go up, which is not a good sign. If your body mass index is high, your chances of bad health also increase, so yes, you are more likely to pay a higher premium for your insurance.

And last but not least, we'll create bar charts. A bar chart is usually used to visualize a numerical column against a categorical column. So when do you use a bar chart? Say, for example, you did a simple aggregation on this data and calculated the age-wise average premium; let's call it df_average_premium. For this you take your data and do a groupby on the age column, and for the expenses you calculate the mean. So right now
if I show you what the values are, you can see for 18 years of age this is the average premium, for 19 years of age this is the average premium, for 20 years this is the average premium. Looking at these values numerically, it's very hard to determine which age group is paying the highest average premium and which age group is paying the lowest. It's very difficult to identify that by looking at the raw data. So what you can do is convert this into a nice bar chart.

For that you simply use plt.bar, and you specify the variable to put on the x-axis and the variable to put on the y-axis. As the documentation says, x can be a float or an array-like object, the coordinates of the bars, and height is a float or an array-like object as well. So in this case you can pass df_average_premium.index for the x-axis, and the values stored in the series for the y-axis. What you see here is that on the x-axis we have the age of the customers and on the y-axis we have the premium that they are paying. The premiums are stored in the series, which is why I passed the series directly, and the age is stored in the index of the series, which is why I extracted it from the index attribute. The index attribute returns the ages of the different customers.

If you do not want to do it like this, and you want to be a little bit more explicit, you can convert the above series into a pandas data frame. So you can do something like this: call the to_frame function, and then you can check the shape or the head; I had shown this previously as well. You can see here that this has been converted to a data frame, but right now the age column is the index. So I'll simply call the reset_index function and specify inplace=True. Then you can check the data frame, and you can see here that now we
have two columns: the first column is the age column and the second column is the expenses column. Then I'll pass the age column on the x-axis and the expenses column on the y-axis, and it still returns the same graph; the graph is not going to change. This is what the bar chart looks like. You can clearly identify that the lowest premiums are being paid by customers aged 21 and the highest premiums are being paid by the oldest customers. In general the trend is upward: as the age of a person increases, the average premium paid by that age group is continuously on the rise. So you can see a clear pattern here, with the oldest customers paying the highest premium and the youngest customers paying among the lowest.

All right, so that's pretty much all the plots that I had. In the next lecture we are going to practice yet another library for visualization in Python. This library is called seaborn. The benefit of using seaborn over Matplotlib is that seaborn plots are much more visually appealing: they look nicer, the colors are good, plus you get a lot more customizations on top of the regular plots. Apart from that, you have some advanced charts which are created for you by default in seaborn, which would otherwise take a lot of effort to create in Matplotlib. We'll talk about the details of seaborn in the next lecture.
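
As a recap of the four plots covered in this session, here is a minimal sketch you can run on your own. It is not the notebook code from the session: the data frame is synthetic (the values and the relationship between the columns are made up), the column names age, bmi and expenses are assumed from what was said on screen, and it uses plt.subplots for the layout instead of separate figure and subplot calls, just to keep it short.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the insurance data (all values are made up).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),   # ages 18..64
    "bmi": rng.normal(30, 6, size=n),
})
# Assumed relationship, for illustration only: expenses rise with age and BMI,
# plus a skewed noise term so a few very high values appear.
df["expenses"] = (2000 + 200 * df["age"] + 150 * df["bmi"]
                  + rng.exponential(4000, size=n))

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Box plot: one numerical column; outliers are drawn as individual points.
axes[0, 0].boxplot(df["expenses"])
axes[0, 0].set_title("Expenses (box plot)")

# Histogram: one numerical column split into equal-width bins.
counts, edges, _ = axes[0, 1].hist(df["bmi"], bins=20)
axes[0, 1].set_title("BMI (histogram, 20 bins)")

# Scatter plot: two numerical columns; c sets the marker color.
axes[1, 0].scatter(df["bmi"], df["expenses"], c="green", s=8)
axes[1, 0].set_xlabel("bmi")
axes[1, 0].set_ylabel("expenses")

# Bar chart: aggregate first, then plot index (age) against values (mean).
df_average_premium = df.groupby("age")["expenses"].mean()
bars = axes[1, 1].bar(df_average_premium.index, df_average_premium)
axes[1, 1].set_xlabel("age")
axes[1, 1].set_ylabel("average expenses")

fig.tight_layout()
# In an interactive session, drop the Agg line above and call plt.show() here.
```

If you prefer the explicit data frame route described above, df_average_premium.to_frame().reset_index() gives you age and expenses as regular columns, and passing those two columns to bar produces the same chart.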