Transcript for:
Python Pandas Overview and Features

Welcome to the Amit Thinks YouTube channel. In this video course, learn about Python pandas. Pandas is a powerful and easy-to-use open-source library built on top of the Python programming language. It is one of the most popular libraries for data analysis and manipulation in Python. Python with pandas is widely used in statistics, finance, neuroscience, economics, web analytics, advertising, and many other fields. Working with data sets, cleaning them, and making them relevant for data science is what pandas does. In this course, we have covered the following lessons with live running examples. Let us start with the first lesson.

In this video, we will learn about the overview and features of pandas: its usage, why it is important, and its role in Python. If you want to work with data, analyze it, and manipulate it, then the pandas library is one of the best libraries in Python. It is built on top of the Python programming language, and it is easy to use and powerful. It is widely used in different fields like statistics, web analytics, advertising, even neuroscience. If you have a data set, let's say you have a net data set with the records of all the users, and you want to clean the data and remove the duplicates, then use pandas. Easily load and read data sets into your Python program using the predefined, built-in functions of pandas. Load files of different formats, data sets like Excel, CSV, JSON, and XML. You can also handle duplicates, that is, find the duplicates and remove them. All this you can do with the powerful pandas library. It was initially developed by Wes McKinney in 2008.

Let us see the features. I told you before that if you want to analyze your data or manipulate it, then pandas is quite important. It is very useful in data science and data analysis: you can easily analyze and manipulate the data. If you want to read your CSV files, do it with pandas. You can also find and fix the inaccurate
data and the duplicates, as I told you before, and completely clean the data so that you can generate meaningful insights from it, just like what we do in data science. So it is one of the most powerful data science libraries. Also, if you have incomplete data or null data, you can easily handle it with pandas. You can also insert or delete columns from a DataFrame. A DataFrame is basically two-dimensional tabular data with rows and columns where you store your data; you can easily add or insert columns and also remove them. You can also group the rows and columns of a DataFrame or Series. A Series in pandas is basically a one-dimensional array, like a column in a table, and it can hold data of any type. So you can work with the Series and the DataFrame in pandas. In this video, we saw how we can easily work with pandas and understand its features and its usage. Thank you for watching the video.

In this video, we will learn how to set up and install pandas with PyCharm. We'll be using PyCharm to run pandas programs. Why? Because PyCharm provides a Community Edition, which is open source and freely available. These are the steps to install pandas with PyCharm. You need to remember that you need to install NumPy before installing pandas, because pandas is built on top of NumPy. At first, you need to install Python and pip. Why pip? Because pip is a package manager; it is used to download, install, and manage Python packages and libraries. So we'll be installing Python and pip first. Then we will install the PyCharm Community Edition, which is open source and freely available. Then we will connect Python and PyCharm, that is, we will connect step one and step two. After that, we will also learn how to run a first Python program on PyCharm. Then we'll be installing NumPy, because pandas needs NumPy. After that, we'll be installing pandas. We will install both on PyCharm so that we can run our first pandas
program in the next chapter. So guys, let us start with the first step, to install Python and pip. Let's start.

At first, go to the web browser. On Google, type Python and press Enter. On pressing Enter, the official website is visible; click on it. Here is the official website. Now keep the mouse cursor on Downloads. You can see the current version, Python 3.12, is visible; click on it to download. It will download; it's only 25 MB. Download completed. Now right-click and click Open to start installing Python. We clicked on the exe file. Minimize. The setup started here. At first, select "Add python.exe to PATH". Now click Customize installation. This shows what we are installing. It will install pip also; pip is used to download, install, and manage Python packages. It will also install IDLE; IDLE is an IDE to run Python programs. Now click Next. Under Advanced Options, it will install Python 3.12 for all users. It will also set the environment variables, and Python will get installed in the following location under Program Files. Keep it at the default; you can change it from here by clicking Browse, but keep it as it is, no problem. Click Install. Now the installation started. Guys, we have successfully installed Python; click Close.

Now let us verify the installation. Go to Start, type CMD, and click Open to open the command prompt. Here, type the command python --version and press Enter. This shows that we successfully installed the current Python version.

Now we will install PyCharm. For that, go to the web browser; I'm using Chrome, but you can use any web browser. Type PyCharm and press Enter. On pressing Enter, PyCharm is visible; it is owned by JetBrains. Click on it. Now we have reached the official website. You can directly click Download here, or you can also click Download here; let's click here. You can see PyCharm Professional is visible, which is for a 30-day trial; then you need to
pay, but we are going for the free version, that is, the Community Edition, and here it is. You can download it from here; it is freely available and open source, so you can use it. It is an exe file download; click on it and the download will begin. Here it is, the download started. It's only 418 MB; let's wait. We have downloaded the exe file for the PyCharm Community version, the free version. Right-click and click Open to begin the installation. Minimize. The installation started; click Next. Now it is asking where PyCharm will get installed. It will get installed in the following location; keep it as it is. It will take 1.5 GB. Click Next. Desktop shortcut: yes, we can create one. Update context menu: you can add this so that you can directly open any folder as a project. Keep the rest as it is and click Next. Start menu folder: yes. Click Install, and now the setup will begin. The desktop shortcut also got created. Now let's wait for this to complete. We have completed the PyCharm setup. I can also select Run PyCharm Community Edition and click Finish so that it begins, but I'll only click Finish. So in this way we can install PyCharm.

Now let us open PyCharm and connect Python with PyCharm. You can also go to Start and type PyCharm instead of going for the shortcut. Here it is, Community Edition; click Open. Now PyCharm will open for the first time. Here the terms are visible: I confirm, click Continue. Data sharing. PyCharm opened for the first time. You can go to Customize and here select, let's say, the light theme or the dark theme; I'll go for the light theme and keep the rest as it is. Go to Projects and click New Project. Here you can see where our project will get saved. If you want to change the name of the project, you can change it from here; let's say I'll name it Amit project. Now you need to set the environment. It will go for the virtual environment, a new environment, which is fine. You need to set the base interpreter; it is only showing Python 3.9 and 3.10.
Click on the three dots, go to Program Files, wherein we installed Python 3.12, click on it, select python.exe, and click OK. Here it is, we have set Python 3.12. Now you can see we have connected our Python with PyCharm. Remove this, because we are creating a sample program. Click Create. Here is our project. I clicked here for the menus. Now, this is our project and this is our environment. We need to add a new Python file to run our first Python program. I'll right-click on the Amit project, click New, and here select Python File directly. It will automatically go for the .py extension. Add the name of your Python file and press Enter; it will create it. Here it is. Now you can directly run your first project. Let's say I'll just print something. I'll go to File, Save All, then right-click and click Run demo. Here it is, Studyopedia is visible. That means we successfully printed using the print function in Python. You can also run using Run here: click Run demo, or you can use Shift+F10 or Alt+Shift+F10. You can also increase the font by going to File, Settings. Here Appearance is visible; go to Editor and click Font. The font is only 13; let me take it to 18, click Apply, then OK. Now the font will increase. Here it is; even the output font is 18, and the editor font also. With that, you can check the location also; here it is: C, Users, amit_, PyCharm Projects. amit_ is the username, and under PyCharm Projects is AmitProjects; here is the file. You can also go to the File option, click Settings, and select the project and the interpreter to verify again whether you have set 3.12 for our project. And yes, we have set the same 3.12 for our project.

We installed Python and PyCharm successfully, and we also ran a sample Python program. Now let us install pandas. I told you before that to install pandas we need NumPy, so first we will install NumPy, then
we will install the pandas library. Let's start. To install NumPy, go to File and click Settings. After clicking Settings, you need to go to your project; here is the project name, the same project. Click Python Interpreter. After clicking, click on the plus sign here and type numpy. That's it, here it is; this is the official one. Just select this and click Install Package. Now the installation will start. We have installed NumPy. Now type pandas. Here it is; click Install Package and it will install pandas also. Now pandas will get installed. We have also installed pandas; click Close. Now here you can verify that we have installed both NumPy and pandas. That's it, click OK. So guys, we successfully installed NumPy and pandas. Now we can start with our first pandas program. Thank you for watching the video.

In this video, we will begin with pandas and understand what a DataFrame is in pandas and how we can create it. We will also see some coding examples. Let's start. The DataFrame in pandas is two-dimensional tabular data, that is, like a table with rows and columns. To create a DataFrame, we use the DataFrame method, which includes the following parameters: data, that is, what you want to store in it; index, how to label it with indexes; columns, how to set column labels. Also, if you want to force a specific data type, use the dtype parameter, and if you want to copy the data, use the copy parameter. So these are the parameters. We will see the following examples to create a DataFrame; these examples will help you understand the concept of the DataFrame completely. First we will create a DataFrame. Then we will access a group of rows and columns. After that, we will access the rows and columns by integer positions using an attribute. With that, you can also name your own indexes instead of the default ones; we will see that in the examples. Also, we will run an example to iterate a DataFrame. So let's start with the
first example, that is, how to create a pandas DataFrame. We will use a built-in method, that is, pandas.DataFrame. Let's start with the first example.

Here is our PyCharm. We will be creating our project here, the pandas project. First we will enable the menus: click here, go to View, Appearance, and select Main Menu as Separate Toolbar. So we are having this open-source PyCharm Community Edition. Let us create a new project for our DataFrame: File, New Project, click on it. Now here, add the name of the project. Here it is, I have named it Pandas dataframe. You can also see the path wherein it will get saved; the project location will be the following: C, Users, Amit_, PythonProjects. That's it, click Create. Our project got created. Now we will create files. Right-click, Python File, because you want to create a Python file. I'll type the name, let's say demo1. It will automatically add the .py extension when I press Enter, because Python file is selected by default. Press Enter; it added it. Now we will create four more files, because we are having five examples: demo2, and so on. We have created five Python files.

Now let us begin with the first program. We have also added the comments using this. That's it, we will create a pandas DataFrame, our first program. First import pandas and add an alias; we are adding the alias pd so that we don't need to use the complete word pandas again and again. First, let us create a data set, which is a collection of data. Let's say I'll add the name data here, and within that I'll be adding my content. Let's say we are adding student records. We have added records here that include the student name, rank, as well as the marks of the students. Now we will create the DataFrame. Before that, if you want to change the font, go to File, Settings, and here type font. That's it, here it is; the size is 18, and I'll just set it to 17 or 16. You can also change the line height if you want, but I'll keep it as it is.
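A minimal sketch of this first program, assuming a data set shaped like the one described: a dictionary named data with Student, Rank, and Marks columns. The student names are borrowed from a later lesson in this course; the rank and marks values are hypothetical, since the transcript does not show them.

```python
import pandas as pd

# Hypothetical student records: the column names follow the lesson,
# but the rank and marks values are made up for illustration.
data = {
    "Student": ["Amit", "John", "Jacob", "David", "Steve"],
    "Rank": [1, 4, 3, 5, 2],
    "Marks": [95, 70, 80, 60, 90],
}

# Wrap the dictionary in a DataFrame; pandas assigns the default
# integer index 0..4 because no index argument is given.
df = pd.DataFrame(data)

# Two "\n" characters put a blank line between the heading and the table.
print("Student Records\n\n", df)
```

Each dictionary key becomes a column label, and each list becomes that column's values, so all lists need the same length.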
Click Apply, then OK. Now the font is fine. Now I'll create a new object, a DataFrame object, using the pandas.DataFrame method I told you about; within that, just add the data set, that is, data, and print it. I'll also add the text Student Records, and for new lines I have typed \n two times, so two new lines. I have printed the DataFrame like this. That's it. Go to File, Save All, then right-click and Run demo1. Let's see the answer. Here it is, we have printed it. We were having three columns: Student, Rank, and Marks. These 0, 1, 2, 3, 4 are the default indexes; we will see how to change the indexes later. We have printed our records: three columns and five records, with student names, student ranks, and student marks. That's it, we have printed our first DataFrame. Now guys, let us move to our second example.

In the second example, we will access a group of rows or columns in a pandas DataFrame using the loc attribute. Let's see. Here is our second example, demo2. Let us first start with importing pandas: import pandas as pd. Like we did before, we created an alias here. Now let us add a data set; we can take the same data set. Here it is, we added the data set; Student, Rank, and Marks are our columns. Similarly, create a new DataFrame and add the data set in it; use the pandas.DataFrame method to create the DataFrame in a similar way and add the data. We have added the data set, but we want to access a group of rows and columns. What we can do here is add our own index. We have added it. Now just print the records, df. Now we will access the value in the Student column corresponding to the row a label, that is, accessing a group of rows or columns. Corresponding to the row label, we will find the Student column value. For that you can use DataFrame.
loc (that is, DataFrame.loc). Within that, mention what I just said, that is, we want row a and the Student column. That's it, nothing else, and you can directly print it. I'll add a print here. You can also add a text like we did before: Value. For example, I'm printing it. So we added it, and now we will print it. Here you can see we can add it like this; this looks fine now. We can also fix it here. Now I'll just go to File, Save All, then right-click and Run demo2. Here it is, we have printed it. We have printed the student records first, and we have removed the default labels. For the Student column, we have printed the value, Amit, corresponding to the row a label.

Now let us see the third example. In the third example, we will access a group of rows or columns by integer positions in a pandas DataFrame. For that, we'll be using the DataFrame.iloc attribute. Before that we used loc; now we will use iloc. Let's see our third example. Import pandas as pd; we have created an alias. Now let us set the data set. We will take the code; here was our data set, and we can also take this to print it completely. Here it is. What we can do: I'll just fix this. Now this is fine; I'll do the same for all. Here it is: we have added our data set and our index to the DataFrame and printed it as Student Records, that is, df. Now what I need to do is access rows or columns by integer positions. Let's see. Use DataFrame dot, as I told you, iloc. Within that, add the numbers; just mention the positions, 1 comma 2. I need to print it also: print, \n for a new line, Value, and another new line. That's it; this looks fine. Right-click and Run demo3. Now it will display the group of rows and columns by integer positions. This was our DataFrame and this is our output. The value was 1, 2, that is, the first row and second row, so it has printed the first row and second row. This was the zeroth row.
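The label-based and position-based lookups from the second and third examples can be sketched together as follows. This is an illustration under assumptions: the row labels "a" to "e" and the rank and marks values are hypothetical, and the exact bracket form used on screen is not visible in the transcript, so a list of positions is assumed for iloc.

```python
import pandas as pd

# Hypothetical data set; only the column names and the first student
# name (Amit) are taken from the lesson.
data = {
    "Student": ["Amit", "John", "Jacob", "David", "Steve"],
    "Rank": [1, 4, 3, 5, 2],
    "Marks": [95, 70, 80, 60, 90],
}

# Assumed custom row labels replacing the default 0..4 index.
df = pd.DataFrame(data, index=["a", "b", "c", "d", "e"])

# loc is label-based: row label "a", column label "Student".
student_a = df.loc["a", "Student"]
print("Value:", student_a)

# iloc is position-based and counts from zero, so positions 1 and 2
# select the rows labelled "b" and "c".
rows_b_c = df.iloc[[1, 2]]
print("\nValue\n", rows_b_c)

# Positions 3 and 4 select the rows labelled "d" and "e".
rows_d_e = df.iloc[[3, 4]]
print("\nValue\n", rows_d_e)
```

Because iloc ignores the labels entirely, the same positions keep working even if the index is renamed.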
This is the first row, based on the index, so it has printed the following, that is, row b and row c. If I try to change it, let's say I'll go for 3, 4, now what will get printed? d and e. Fine: 0, 1, 2, 3; the third and fourth, d and e, got printed. So this is how, guys, we can group rows and columns and display them by integer positions with iloc.

Now let us see the fourth example. In this example, we will name our own indexes in a pandas DataFrame using the index argument. We just saw it; we will see it again. Here it is, let us see our fourth example. Import pandas as pd. Now mention the data set. Here it is, we have added three columns with values. Now we will use the index argument to set our indexes; I'll explain it now. df, a DataFrame: pandas.DataFrame, and add the data in it, that is, the data set. Now you need to add your own indexes using the index argument. That's it; within this, add your own indexes. I'll add, let's say, student one. I'll just copy this, and now student two, student three, student four, student five. So now we have added the records and explained it with indexes. Simply print: I'll now print the DataFrame, that's it, df. Within that you can add Student Records and two \n so that it looks fine. I'll just go to File and Save All, then right-click and Run demo4. Now let's see. We have printed our DataFrame. Before, it was having 0, 1, 2, 3, 4, the default indexes, but we can change the indexes. This looks like a better way to display a table: record student one, student two, student three, student four, student five. That's it. So this is how we can name our own indexes in a pandas DataFrame.

Now let us see the fifth example. In the fifth example, guys, we will iterate a pandas DataFrame using a for loop. That's it; we have shown how to display a DataFrame, but we will iterate it using a for loop. Let's see. Import pandas as pd. Now I'll just take the previous
example data set, and I'll also add indexes like this using the index argument, like we did before. That's it; our data set is here. We have added a data set inside our DataFrame using the pandas.DataFrame method, and our own indexes using the index argument. But we need to display it. You can display it like this also, like I did before: Student Records. Or, if I want to show the iteration, I can do it. Iteration will display your columns, only the columns. Let's say, if you want to display only the columns, you can just type a for loop and within that mention the DataFrame, that is, the DataFrame object, and this will print your columns: print columns. That's it. File, Save All, right-click, Run demo5. Here it is, displaying the columns using iteration: Student, Rank, Marks. We have displayed the columns. I created a pandas DataFrame to display the columns. So in this way, guys, we can easily work with the DataFrame. We saw the following five examples: to create a DataFrame, access a group of rows or columns, name our own indexes, and display only the column names.

In this video, learn about the DataFrame, its attributes, and methods. We discussed what a DataFrame is: it is two-dimensional tabular data, like a table with rows and columns. It has some built-in attributes and methods that extend the functionality of the DataFrame. The following are the attributes and methods that we'll be discussing with live running examples. dtypes returns the data types in the DataFrame, that is, which column has which data type. ndim returns the number of dimensions. For the size, we'll be using the size attribute. For returning the dimensions, we'll be using the shape attribute. To get the index, the index attribute will be used. Also, to get the transpose of rows and columns, we'll be using the T property. For methods, we can return the first n rows using the head method and the last n rows using the tail method. So let us see the
examples. In the first example, we will first create a DataFrame using the DataFrame method, and then we will get the data types of each and every column using dtypes. So let's see the example.

Here's our open-source PyCharm Community Edition. Let us create a new project: go to File, New Project, and name the project Pandas DataFrame attributes and methods. The location is the following for our project; click Create. Now it got created. We are having eight examples, so I'll be creating the files. Let us begin with the first file: right-click here, New, Python File, and enter the name of the file. When I press Enter, it will automatically add the .py extension, because Python file is selected by default. Here it is. Similarly, create the rest of the seven files: right-click, New, Python File. So here are our eight files. Let us add the comments here using the hash; let me add it.

Let us begin with the DataFrame dtypes attribute, which will return the data types in the DataFrame. For that, I'll first import pandas. This is an alias so that we don't need to mention pandas again and again. Now let us create our data set: create a data object and within that add your data. I'll be adding three columns. Student: add the records; let's say I'm adding records of five students. Now add Rank for the new column: comma, then press Enter, and within that add the ranks. Now add the marks of each student. Now we will create a DataFrame using the DataFrame method, and we will add our own index using the index argument. So df = pd.DataFrame, that is, pandas.DataFrame; within that add data, comma, and the index argument to add the row indexes. That's it; within this add your rows. Let's say I'll be adding row a, row b, and so on. So we have added the five rows. Now let me print the records: just mention the DataFrame object name, and you can also add a text here, Student Records. Add new lines using two \n, two new lines. Now the data types, using DataFrame.
dtypes (that is, DataFrame.dtypes). Print. We have used the dtypes attribute. File, Save All, right-click, Run demo1. Here you can check that we have the data types for each column, starting with Student; the other two were both integers, so int64 is visible. Here it is: using the dtypes attribute, we can return the data types.

In the second example, we will be returning the number of dimensions of the DataFrame using the ndim attribute, or property. Let's see the example, demo2. We will be using the ndim attribute to display the dimensions, that is, return the number of dimensions of the DataFrame. Import pandas as pd, an alias. Now let us add the data set; let us get it from here. So we have a data set now, data, with three columns and five records: Student, Rank, Marks. We have created a DataFrame using the DataFrame method; we have added the data here and our own index. We have displayed the student records, that is, the DataFrame. Now we need to get the number of dimensions; for that, use DataFrame.ndim. That's it: Number of Dimensions. You can also add a new line here. Save All, right-click, Run demo2. Here it is: we have the student records, three columns and five records. Obviously it's of two dimensions; that's why it's in a table-like form, that is, a matrix form.

Now let us see the third example. We'll be getting the size of the DataFrame, that is, the number of elements in the DataFrame, using the size attribute. Let's see. Here it is, the pandas DataFrame size attribute, to get the number of elements in the DataFrame. First import pandas and create an alias. Now add the data set; let me take it from here. We will also display the data set. Here is our data with three columns and five records. We have added our data to the DataFrame object here, we have also set up our own indexes using the index parameter, and we have printed the DataFrame. Get the number of elements in the DataFrame using DataFrame.size. That's it; print it. We can add a
text here, the message Number of Elements, and a new line also, here also. File, Save All, right-click and display it: Run demo. First we have printed our student records, and then the number of elements in it. This is having 15 elements, right? Calculate from here: 1, 2, 3 columns across 1, 2, 3, 4, 5 records, so 15 elements. So it has displayed the number of elements.

Now let us see the fourth example. We will get the shape of the DataFrame, that is, return the dimensions of the DataFrame in the form of a tuple. You can use the shape attribute. Let's see the example. Here it is, the shape attribute. Let me import pandas: import pandas as pd; we have created an alias. Now the data set; I'll take it from here again. There are five student records and three columns: Student, Rank, Marks. We have created the DataFrame using the DataFrame method, and within that we have entered our data set and our own index. Now, to get the shape, use df, that is, the DataFrame.shape attribute. That's it; just print it. You can also add a message and two new lines. Now I'll go to File, Save All, right-click, and Run demo4. Here it is, the shape is here: five rows and three columns. 5 comma 3 means five rows, three columns. So we have printed the dimensions also using the shape attribute.

Now let us see the next example. In this example, we will display the index of the DataFrame using the index attribute. Let us see the example. Here is the example for the pandas DataFrame index attribute. Import pandas as pd and add the data set, so that we can display only the index here. We have created an index using the index argument. Here is our data, and our index is here. We added the data to our DataFrame using the DataFrame method, and we were having five records and three columns. We need to just display the indexes; for that, use DataFrame.
index (that is, DataFrame.index). Print, that's it: DataFrame Index. File, Save All, right-click, Run demo5. You can see all the indexes are visible. These were our indexes: row a, row b, row c, row d, and row e. These are visible here, and we have printed them separately using the index attribute.

Now let us see the next example. In this example, we'll be getting the transpose of the rows and columns, and we will also understand what it is. Here it is, the pandas DataFrame T attribute. Let us import pandas and create an alias. Now add the data set. Here is our data, and we have added the data in the DataFrame method, and the index also using the index argument. Under the data we have three columns and five rows; the five rows will display the five records. We created a DataFrame using the DataFrame method and printed it. Now we need to just get the transpose. Transpose is the opposite, that is, rows and columns will get converted to columns and rows. For that, use df.T and print it. You can also add a text here for more clarity, and I've also added a new line before and after. That's it; right-click, Run demo. So now you can check here that the result is completely opposite. This was our actual DataFrame, and the result is the following: columns are in the place of rows, and rows are in the place of columns. This is how we can display the transpose using the T attribute.

In the next example, we'll be working on the methods of the DataFrame in pandas, beginning with the first one, head. If we use the head method, it will display the first n rows; by default it will display the first five rows, and if you mention any value within the head method as a parameter, then that number of rows will be returned. Let's see an example. Here it is, the head method. Let us import pandas and add an alias also, pd. Now let us create a data set. I am creating a data set here; we have three columns
and five rows. I'll add more to it for this example. I've added one more. Now we have six records, records of six students, with three columns, that is, student name, rank, and marks. We also need to add this for the additional record. That's it; we have printed it. We will return the first five rows by just using the head method, because by default it displays the first five rows. Print DataFrame.head: First Five Rows. It will print five by default; this looks fine. Now I'll just right-click and Run demo7. Here you can see we were having six records, and it displayed only five rows, the first five, not row f. Why? Because we have used only the head method. You can also change it: let's say I'll just mention First Two Rows, and 2 as the parameter. And here we have displayed only the first two. Why? Because I have mentioned 2 under head. Here we have displayed the top n rows using the head method, and the top two rows using the head method but with the parameter value. That's it.

Now let us see the next example. We just saw how to get the top n rows; to get the last n rows, use the tail method, with the same concept. Let's see the example. We will print the last five rows using the tail method, and we will also add a parameter under the tail method to get a specific number of rows. Let's see. Here is our pandas DataFrame tail method. Let us import pandas first and add an alias, pd. Now set the data set; let me take the following. I have displayed the data set here: three columns, Student, Rank, Marks, and records of six students with the student name, rank, and marks. We have created the DataFrame using the DataFrame method; within that we added the data from here and the specific index names. After that, we have displayed the DataFrame using the following print method. Now let me display the last n rows: df.tail. Print using the print method: Last Five Rows. You can
now mention Last Five Rows here, because by default it will display five rows. Run demo8. Here it is: this was our actual DataFrame with six rows, but we have displayed the last five rows. To display it properly, I just forgot the \n here. Right-click, Run demo8. Here it is, it is visible now. Now let me display, let's say, the last two rows. What I'll do is just mention 2. Here it is, the last two rows are visible: e and f. If you see the exact rows here, e and f are visible. In this video, we saw how we can work with the DataFrame attributes and methods. We saw all these eight examples to understand the concept properly. Thank you for watching the video.

In this lesson, we will learn how to easily join DataFrames in pandas. We will also see an example. If you have two different DataFrames, you can easily join them; the join method is used to join DataFrames in pandas. Let's say you have two DataFrames with different rows and columns; you can easily join them. Let us see an example.

So we have this PyCharm Community Edition, which is open source and freely available. I'll create a new project: go to File, New Project, and let me add a name. It will get saved in the following location; just click Create. So we have created it. We have a single example, so I'll just right-click, New, Python File, and add the name of the file, let's say demo1. When I press Enter, it will automatically add the .py extension. Press Enter. So here is our file. We will create our program to join two DataFrames in pandas. Import pandas as pd; so we have imported pandas and created an alias using the as keyword. Let us create the data sets; I'll create data sets for both the DataFrames. I have added the ID column. Now the Student column; ID, and here is our Student column: add the student names. Now, let's say, roll number. Now data2, for our second DataFrame. Let's say I'll add rank: add ranks, and marks. These are the ranks and marks of the students.
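A rough sketch of where this join example is heading, assuming two data sets shaped like the ones just described. The column name Roll and all of the ID, roll number, rank, and marks values are hypothetical; the transcript names the columns but does not show their values.

```python
import pandas as pd

# First data set: identity columns (values are hypothetical).
data1 = {
    "ID": ["S01", "S02", "S03", "S04", "S05"],
    "Student": ["Amit", "John", "Jacob", "David", "Steve"],
    "Roll": [101, 102, 103, 104, 105],
}

# Second data set: score columns for the same five students.
data2 = {
    "Rank": [1, 4, 3, 5, 2],
    "Marks": [95, 70, 80, 60, 90],
}

dataframe1 = pd.DataFrame(data1)
dataframe2 = pd.DataFrame(data2)

# join() lines the two frames up on their index (0..4 here) and
# appends the columns of dataframe2 to those of dataframe1.
result = dataframe1.join(dataframe2)
print("Joining two DataFrames\n\n", result)
```

The column names of the two frames must not overlap for this plain call to work; otherwise join() needs its lsuffix or rsuffix argument.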
let's say we have their records in different data sets, so we can easily merge them and join them. Now we will create a DataFrame. To create a DataFrame we use the DataFrame method, so I'll be doing the same. Let's say I'll name it dataframe1, and I'll be using pandas, that is, the pd.DataFrame method; add data1 in it. Print, let's say I'll print it: dataframe1. Now let me print dataframe2. Okay, data2 will get added here; this data will go into the second DataFrame. We can first display it: File, Save All, right-click, run demo1. Now we have our two DataFrames, here they are, dataframe1 and dataframe2. We can add this. This looks fine. Right-click, run demo1; now this looks fine. Now we need to join the two DataFrames. For that, let's say I'll name it resultant dataframe: dataframe1.join, and in the brackets mention dataframe2. That's it, this will join both of them. Print the resultant DataFrame, "joining two DataFrames". Okay, right-click, run demo1. We had two DataFrames of student records, one with ID, student and roll number, and one with rank and marks. The join method merges all of them; here it is, joined successfully. So in this way, guys, we can easily join two DataFrames in pandas.

In this lesson, learn how to concatenate DataFrames in pandas. Let's say we have two DataFrames and we need to concatenate them. We can easily do it using the concat method; this will concatenate the complete contents of the DataFrames. Let's see the example. First we will create two DataFrames and add content to them, then we will concatenate them. Let's see.

So we have this PyCharm IDE, the Community version, which is free and open source. We will first create a project: File, New Project. Let us add the name of the project: pandas concat dataframe, let's say. You can add any name. It will get saved in the following location, pandas concat dataframe. Okay, click Create. This will create a new project. Here it is, but no files are there.
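Looking back at the join lesson that just finished, here is a minimal sketch of the program it describes. The column names follow the walkthrough (ID, student, roll number, rank, marks), but the record values are assumptions for illustration and are not the data typed in the video.

```python
import pandas as pd

# Illustrative records only; the values are assumed, not taken from the video.
data1 = {
    "ID": ["S01", "S02", "S03"],
    "Student": ["Amit", "John", "Jacob"],
    "Roll": [101, 102, 103],
}
data2 = {
    "Rank": [1, 2, 3],
    "Marks": [95, 90, 85],
}

dataframe1 = pd.DataFrame(data1)
dataframe2 = pd.DataFrame(data2)

# join() aligns rows on the index (0, 1, 2 here) and puts the
# columns of both DataFrames side by side.
result = dataframe1.join(dataframe2)
print("Joining two DataFrames:\n", result)
```

Because join matches rows by index label, the two DataFrames here line up only because both use the default 0, 1, 2 index.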
We want a single Python file for our program. Right-click, New, Python File, and add the name; let's say I'll add demo1. When I press Enter it will automatically add the .py extension. Why? Because Python file is selected by default. Here it is. Now let us quickly add a comment. Here it is: concatenating two pandas DataFrames. import pandas as pd: we have imported pandas and also created an alias, pd, so that we don't need to write the word pandas again and again; we will just write pd. Now let us add the data sets. Since we'll be creating two DataFrames, we'll be adding two data sets. data1 is for the first DataFrame. Let's say I'll add ID, and within that add the IDs of the students, as an example. I'll add records of five students, so five IDs. Now add student, that means the names of the students. Here it is: Amit, John, Jacob, David and Steve. Now the roll numbers of the students. Now let us add the data for the second DataFrame. Here I'll add ID, let's say continuing from the previous one: s06, s07, s08, and the names for the next set of students. Above we added five students; now we will add three students, let's say, and we will concatenate them: Ben, Kane, Rohit. Now the roll numbers for the last three students, which we just added. We missed a comma here, and also here. We have two data sets now. We will create the DataFrames using the DataFrame method: dataframe1 is equal to pandas, that is, pd.DataFrame, and within that add data1. That's it. We can also add indexes using the index argument, so that we can set our own index labels. Let's say I'll add student1, and so on: an index for five students, student1 to student5. Now print the DataFrame, dataframe1. Okay, now do the same for the second DataFrame, let's say dataframe2: pandas, that is, the pd.DataFrame method, and data2. That's it. Let's say I'll add an index also: student6, student7 and student8. Okay, a total of eight students, five in the first and
three in the second. Fine, now I'll print it: dataframe2. Okay, now concatenate. Create a resultant DataFrame: pandas.concat, that is, pd.concat, with dataframe1 and dataframe2 passed together as a list. Print it; let's say I'll print "concatenating DataFrames", and now I'll print the resultant DataFrame. What did I do? I just used the concat method, mentioned the DataFrames I need to concatenate, and printed the resultant DataFrame. File, Save All. Here you can also add \n. Fine, right-click, run demo1. Now let us see the output again. We had dataframe1 with five student records and dataframe2 with three student records. Since we had different indexes (we added index labels for the students), after concatenating they appear in exactly the same order: student1 to student5 is here, this was the first DataFrame, and this is the second DataFrame. In this way, using the concat method, we can easily concatenate DataFrames. Thank you for watching the video.

In this lesson, learn what a Series is in pandas and how we can create it. We will also see some live running coding examples. You can consider a Series as a one-dimensional array, just like a column in a table. It also has labels, that is, it is a labeled array that can hold data of any type. To create a Series in pandas, use the Series method, pd.Series. Here are the parameters: data, index, dtype, name and copy. The data parameter is used to store the data in the pandas Series, just like we saw with DataFrame, where we had the data parameter as well. If you need to set your own indexes, use the index parameter. To set a specific data type, use dtype. To add the name of the Series, use the name parameter, and to copy the input data, use the copy parameter. We will see some examples so that we can easily understand the Series method and learn how to create a Series in pandas.

Guys, here are the examples for Series: first we will
create a pandas Series, then we will access a value from it. After that we will name our own indexes in a pandas Series, and then we will learn how to access a value with labels in a pandas Series. So let us begin with the first example, that is, how we can create a pandas Series using the Series method. Let's start.

Here we have our PyCharm IDE, the Community version, which is free and open source. Let us create a new project for our Series examples: go to File, click New Project, now add the name of the project. Here we have given the name Panda series; you can give any name. The location of the project will be the following. Click Create. Our project got created, here it is. Now let us add our first program file: right-click, New, Python File, and add the name of the Python file, let's say demo1 for our first program. It will automatically add the .py extension, because Python file is selected by default. Press Enter. Here is our first program. We have four examples in total, so let me create three more files in a similar way: right-click, New, Python File, demo2, and I'll press Enter. Two files created; let me create two more files quickly. Now here we have our four files. Let me add comments to them, then we will start our program.

Here it is, guys, our first program: create a pandas Series. I've added a comment using hash. Now let us import pandas: import pandas as pd. We have created an alias, pd, so that we don't need to write pandas again and again; for that we have used the as keyword. Let us add the data for our Series. Here it is: I have created a new object, data. Add some sample elements; let's say I added five integer elements. Now I'll create the Series using the Series method: s is equal to pd, that is pandas, dot Series, and within that add the data. pd is the alias we added. Display the Series, that is, s. We have displayed it. Go to File, Save All, right-click and run demo1. We have created our first Series.
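The steps just described can be sketched roughly as follows. The five integer values are an assumption for illustration; the transcript only says that five integer elements were added.

```python
import pandas as pd

# Five sample integers (assumed values, for illustration only).
data = [10, 20, 40, 80, 100]

# pd.Series wraps the list in a labeled one-dimensional array.
s = pd.Series(data)

# Printing shows the default index alongside the values,
# followed by the dtype of the Series.
print(s)
```

With no index argument, pandas assigns a default integer index starting at 0.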
Here it is, and the indexes got added by default on their own. So 0, 1, 2, 3, 4 are the indexes; these get added automatically, and it has also shown the data type of the Series.

Guys, let us now see the second program, in which we will access a value from a pandas Series using square brackets. These are used to access a value from a Series; you just need to put the index of the value you want to display inside them. So let's see. Here is our second program, to access a value from a pandas Series. First let us add some values. I'm taking the data from here, so let's say the following is the data. First I'll import pandas as pd. I've added the data here, with the default index, and we created the Series using the Series method. Display the Series, then access the value. Let's say we want to access a specific value: for that I'll first write the name of the Series, s, and within the brackets I'll add 2, let's say, to access the third value. This is the index 0, 1, 2, that is, the zeroth index, first index, second index. Here it is: 40 will get printed. Okay, File, Save All, run demo2. We printed the value at index 2, that is, zeroth, first, second: the value 40. In this way, guys, we can easily access a value from a pandas Series.

Now, how to name your indexes. By default, we saw that 0, 1, 2, 3, 4 come as the index. What if you need to add your own values for the index? We can easily do it using labels: you just need to set the labels under the index argument. Let's see the example: name your own indexes in a pandas Series. What I'll do is take this data and print it here. Okay, this is what we saw before, simple. So where can we now add the indexes? Just like we saw with DataFrame, after the data add the index argument and place your index labels. Okay, so we added five index labels for our five records. That's it; now display the Series. We have to display it in a similar way: print s. You can add a text message here: "Series with custom index labels".
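A small sketch of the two ideas covered so far, access through the default index and custom index labels. The data values and the exact spelling of the labels (RowA to RowE) are assumptions; the transcript only gives that index 2 holds 40 and that the labels sound like "row a", "row b", and so on.

```python
import pandas as pd

# Assumed sample values; index 2 holds 40 as in the walkthrough.
data = [10, 20, 40, 80, 100]

# Default index 0..4: s[2] looks up the label 2, i.e. the third value.
s = pd.Series(data)
print(s[2])

# Custom index labels, passed through the index argument.
labels = ["RowA", "RowB", "RowC", "RowD", "RowE"]  # label spelling assumed
s_labeled = pd.Series(data, index=labels)
print("Series with custom index labels:\n", s_labeled)

# The same labels can then be used for lookup.
print(s_labeled["RowD"])
```

The length of the index list has to match the number of values, otherwise pandas raises a ValueError.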
Now let's see what will be visible: File, Save All, right-click, run demo3. Okay, we made a mistake. This is fine now; it should work. And we ran it successfully. You can see the custom labels, RowA, RowB. If I don't add this, let me remove it, and now when I run it you can see the difference: 0, 1, 2, 3 will be visible. So to add your own, you just need to add the index parameter here: comma, index. That's it, and here it is: RowA, RowB, RowC, RowD, RowE. So in this way, guys, you can easily add your own custom indexes.

Let us see the next example: how to access a value from a pandas Series with labels. We saw before that we added a custom index for the Series. What if you need to access a value from the Series with the custom indexes? For that you need to refer to the label, that's it. Let's see the example. Here it is: access a value from a pandas Series with labels. Okay, let me take this code. We added data to a Series and also added custom indexes. Here is our data, and we printed it. Now we will access a value by referring to the label. Write the Series name, and within the brackets, where earlier we added a number, now we will add the specific index name, that is, the exact label, RowD. That means the following. Now you can add a text here: "value from a pandas Series with label RowD", the specific label. File, Save All, right-click, run demo4. Here it is: we added this, and the value for RowD got printed here. And we should remove the following, it's not required. Right-click, run. Okay, this was our Series and this is the value we printed. In this way, guys, we can easily access a value from a pandas Series with a label.

So guys, in this video we saw how we can easily create a Series in pandas. We saw four live running examples to understand the concept.

In this lesson we will learn about the attributes and methods of a Series in pandas. We saw what a Series is, and we saw how to create a Series with some examples, but now we will see what
are its built-in attributes and methods, which extend the functionality of a Series in pandas. Let's see. A Series in pandas is a one-dimensional array. It has some built-in methods for basic functionality, like the following, and it also has some attributes, for example to get the data type of your Series, its dimensions and shape, and also the index of the Series, which we create using the index argument. With that, we have some methods to get the first or last n rows, and also to display the summary of the Series. When we see the examples, the concept will be clearer. Let's see. We will begin with the dtype attribute, which allows us to return the data type of the Series. Let's see the first example.

Here we have used the PyCharm IDE, PyCharm Community version, which is free and open source; you can also use it. Let me create a new project here: go to File, New Project, now name the project. Let's say I'll name it Panda series. Okay, I'll click Create; before that you can see the location of your project. Click Create. Now the project got created, and we will add the Python file: right-click, New, click Python File, and name the Python file. I'll name it demo1 for our first program. When I press Enter it will automatically add the .py extension. Here it is, it got created. In total we have nine programs, so let me create eight more files in a similar way: right-click, New, Python File, demo2. Two files got created. Now I'll quickly create all the files first, up to demo9. So guys, we have created all nine files. Now let me add comments using the hash.

First we will begin with the pandas dtype attribute. This is used to return the data type of the Series. Let me first import pandas using import pandas as pd. I have created pd as an alias, using the as keyword. Let me add the data first; this will get stored in the pandas Series. I've created an object, data, and within that I'll
add some integers, five integer elements, for our example Series. Now I'll create a Series using the Series() method. So I add an object, pandas, that is, the pd.Series() method, and add the data in it. That's it, we have created a Series. Now I'll display the Series, and later on I'll add the dtype attribute. So I'll just print the Series. Okay, now to print the data type, just use series.dtype. This is what we did, the dtype attribute here. Now print it; here we can add "Series data type". File, Save All, right-click, run demo1. The following is our Series with our five elements, and the index got added on its own. The data type is here, int64; obviously, integer elements are here. So in this way, guys, we can return the data type of the pandas Series using the dtype attribute. Now let us see the second example.

In the second example we will return the number of dimensions of the Series using the series.ndim attribute. Let us see the example. Here it is, demo2, the ndim attribute. import pandas as pd. Okay, now I'll take the data from here and display it like this. So here is our data, we added it here. Series s, we printed it. This is the Series, and our data has five integer elements. Now I'll print the dimensions using the ndim attribute: mention s.ndim, that's it, and we can also print a message, "dimension". Now go to File, Save All, right-click, run demo2. So here is our Series, and it has printed that this has one dimension, that is, the number of dimensions is one. In this way, guys, using the ndim attribute we can display the dimensions. Let us see the next example.

In the third example we will return the number of elements in the pandas Series using the size attribute. Let us see the example. Okay, import pandas as pd. Now let us take the data from here. The data is the following; we added five elements, created the Series using the Series method and printed it. Now we need to get the size, that means the total size of the
Series. Here we have five elements, so five should be the output. series.size, that's it, and print it. Now, series.size: we will return the number of elements in the Series. Save All, right-click, run demo3. Here we had five elements, and I told you the output will be five. We can add a \n here. That's it, we printed "Series size", five. So guys, in this way we can easily find the number of elements in a Series using the size attribute. Let us see the next example.

The name attribute: this will return the name of the Series. But first we need to set the name, using the name parameter of the Series method. So let's see. Here it is, demo4, the name attribute. import pandas as pd. Now I'll show you the same example, and I'll first add the name of the Series, because we still haven't added it here. So I'll just add it. Our data is here, a five-element Series, and the data got added in the Series method. In the Series method itself you can add the name parameter and mention the name of the Series. Let's say I'll add a random name, "my number series". That's it; display the Series. We need to display this name. For that, s.name. Print it: okay, s.
name. Go to File, Save All, right-click, run demo4. Our name was "my number series", and the output should be the same. Here it is: even in the print of the Series itself it shows the name, but we displayed it using the name attribute: "my number series". This is what we named it here. So using the name attribute we can display the name of the Series. Let us see the next example.

In the next example we will be using the hasnans attribute, which returns True if your pandas Series has NaN values, that is, Not a Number. So we will be using the hasnans attribute; let's see the example. Here, let me import pandas, and I'll take the same example, let's say the following, and I'll add it here: the data and the name. We created the Series using the Series method and added the data. Let me add a NaN. To add NaN we'll be using NumPy. If you remember, before installing pandas we installed NumPy also; that's a necessity, because pandas is built on top of NumPy. Okay, so I added NaN here. Now I'll check for NaN. Here you can see that after adding this, the IDE automatically imported NumPy, because if you don't import it there will be an error. Since we had already installed it, it got imported on its own. Now, the series.hasnans attribute, that's it. Print "does the Series have NaN". I'll go to File, Save All, right-click and run demo5. Okay, yes, the Series has NaN, and the hasnans attribute also shows True, meaning we have NaN here. So in this way, guys, we can easily work with the hasnans attribute using both the NumPy and pandas libraries.

In the next example, guys, we will display the index of the pandas Series. But first we will set the index, because the default index visible in a Series is like 0, 1, 2, 3, 4. If you want to add a custom index, you need to use the index parameter of the Series, like we did with DataFrame. Let's see. Okay, import pandas as pd. Let me take the example from here, demo6. So our data is here, five
values, okay, and we also added a name. Let me add the index parameter here. Now we can add the indexes, the array for the index. Here it is; let me add something like num1, and so on for five values: num2, num3, num4, num5. Okay, I'm just displaying a sample example, nothing else. We added the index, and we created the Series like this using the Series method: we added data, then index, and then the name. Now display the index using the series.index attribute. Print it. Go to File, Save All. Okay, you can mention "return the index". Right-click, run demo6. Here it is: our Series, and we added custom index labels num1, num2, num3, num4, num5 for our five values. And it displayed the Series index here using the index attribute. First we added it using the index parameter, then we displayed the indexes using the index attribute. Now let us see the next example.

The head method: we will be explaining the head method of the pandas Series. It is used to return the first n rows of the pandas Series. If you don't mention anything in the parameter, it will display the top five rows, and if you want, let's say, the top three rows, just mention the number three inside it. Let's see the example to understand the concept. Here it is, the head method. Let me take this from here. We have data here with five values; we created the Series using the Series method, added our data and printed it. We will also add some more values here for our example, because we will be displaying the first five rows. For that, use the s.head method and add nothing in the parameter: the first five rows of the Series. That's it: return the first n rows, where n is five by default. You can see I have added nothing in the parameter. Right-click, run demo7, and here it is. We had added seven integer elements in the Series; when we used the head method it displayed only the top five rows. Now let me display the top two rows, let's say;
for that just add the parameter. Right-click, run demo7. Here it is: the Series, then we displayed the first five rows, then we displayed the first two rows only, using the head method with the parameter value 2. This is how, guys, we can work with the head method to display the first n rows. Let us see the next example.

In this example we will do the opposite, that is, we will return the last n rows of the pandas Series using the tail method. Similarly, if you don't add anything in the parameter, it will display the last five rows. If you add, let's say, three under the tail method, it will display the last three rows of the pandas Series. Let's see the example to understand. Okay, import pandas as pd. We will return the last five rows. Let me take the same example. I have eight elements now, and I have created a Series using the Series method and displayed it. Now I need to get the last n rows: use the series.tail method and print it. First I'll print the last five rows, because I won't add anything in the tail parameter. File, Save All, right-click, run demo8. Now you can see the last five rows are visible: we had eight elements, and the last five rows are visible. Now I'll display the last two rows using the tail method, and I've mentioned that the parameter is 2. Similarly, right-click, run demo8. Now first the last five rows will be visible, then the last two rows will be visible, because we added 2 to tail. In this way, guys, you can return the last n rows of the pandas Series. Now the last example.

The info method: it is used to display the summary of the pandas Series. So let's see what gets included in the summary. Here is demo9. Okay, now I'll take the example with the index, so that we get the complete summary. We have the index and also the name here. So here is the data with five elements, then we have created a Series using the Series method and printed the Series. Now get the info: series.
info(). That's it; print it: "Series summary". Okay, go to File, Save All, right-click and run demo9. Okay, so complete info about the Series is visible. This is the Series: the name was "my number series" and the data type was int64. The class is the following, the pandas Series class. Index: five entries, num1 to num5, which is fine. Five non-null values. Data type int64, and memory usage is also visible. So we have displayed the entire information about the Series using the info method.

So guys, in this video we saw how we can easily work with the attributes and methods of the pandas Series. We saw nine examples.

In this lesson we will learn how to combine two pandas Series. Let's say you have two Series in pandas and you want to merge them, combine them. You can easily do it: for that we have the combine method. Within that you also need to pass a specific function. That function will compare both Series element by element and return, for example, the larger or the smaller value from the two. When we see the example, the concept will be clearer. Let us see the example.

Here we are using the PyCharm IDE, the free and open source PyCharm Community Edition. Now let us create a new project: go to File, New Project, add the name of the project. Here is our name; you can add any name, and here is the path of the project, the complete path. Click Create directly. Now the project got created. We need to add a file, a Python file: right-click, New, Python File. Let us name the file; I'll name it demo1, and when I press Enter it will automatically add the .py extension, because Python file is selected by default. Here it is. Let me add the comment. Okay, to begin with, import pandas. So here we have our import: import pandas as pd. We have created an alias, pd, so that we don't need to write pandas again and again; we achieved this using the as keyword. Now, guys, let us first set the data for both Series, beginning with the first Series. Let's say I'll add five elements.
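As a rough sketch of the program this lesson builds: the two sets of five values below are the ones read out during the walkthrough, and the function name demo is the one mentioned at the end of the lesson. The variable names are otherwise illustrative.

```python
import pandas as pd

# Values as read out in the walkthrough.
series1 = pd.Series([10, 20, 40, 80, 100])
series2 = pd.Series([25, 5, 75, 95, 45])


def demo(x1, x2):
    """Return the larger of two values; combine() calls this once per pair."""
    if x1 > x2:
        return x1
    return x2


# combine() walks both Series index by index and applies demo()
# to each pair of elements.
res = series1.combine(series2, demo)
print("After combining:\n", res)
```

Returning the smaller value instead only requires flipping the comparison inside the function; the call to combine stays the same.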
Now, for the second Series, we have added five elements also; that is the data. Now we will create the Series using this data: series1, pandas.Series, data1. We have used the Series method to add the data. Now do the same for the second Series, series2, and add the second data. Here it is. Now we can display the Series. So we have printed both Series, series1 and series2. Now we will add a function for combining them, that is, to find the largest value. So I'll create a function. In Python we create a function using the def keyword. Let's say I have two values; I'll be comparing the values one by one, so I'll use an if statement: if x1 is greater than x2, you need to return the first value, obviously, and else, if x1 is less than x2, then return the x2 value. That's it. We will now combine. For that I'll be creating a new object, res, and within that I'll combine both Series: series1.combine, add series2, that's it, and the function also. Obviously the function is important for the decision. Okay, series1, now it's fine. Now display the result, res. Yes, that's it. You can also mention a message, "after combining". Now I'll just go to File and Save All. Okay, I made a mistake; I need to add series2, since we are combining one with two. Now right-click and run demo1. Let's see the output. We have the following two Series, series1 and series2, and it will fetch the largest value. Here it is. How did this work? It compared both Series, because we added the following function. Which is the largest value from 10 and 25? It's 25. From 20 and 5 it's 20, from 40 and 75 it's 75, from 80 and 95 it's 95, and from 100 and 45 it's 100. The output displays the largest value by comparing both Series. Here it is: 25, 20, 75, 95, 100. So in this way, guys, we can combine two Series and add a function. It uses a specific function for the decision, which was mentioned by us as a parameter of
the combine method, that is, demo here. In this way, guys, you can easily work with the combine method to combine two pandas Series. Here it is, the def demo() function. Thank you for watching the video.

In this lesson we will learn how to work with categorical data in pandas. It is basically a pandas data type which corresponds to categorical variables in statistics. A categorical variable takes on a fixed and limited number of possible values; examples you can consider are gender, blood type and others. In this lesson we will see two examples. First we will learn how to create a categorical Series in pandas, and second we will see how to create a categorical DataFrame in pandas. We can use the pandas.Categorical method also, but we will use the dtype parameter of the Series and DataFrame methods to create a categorical Series and a categorical DataFrame respectively. Let's see the first example, how to create a categorical Series in pandas.

Here we have our PyCharm IDE; the PyCharm Community version is free and open source. I'll create a new project: go to File, New Project, now add the name of the project. Let me name it Pandas categorical data. You can add any name, and the project will get saved here. Create. Here it is, our project got created. We have two examples, so we need to create two Python files: right-click, New, Python File, and add the name of your Python file. When I press Enter it will automatically add the .py extension, because Python file is selected by default here. Press Enter. We have created our first file. Now let me create our second file: right-click, New, Python File, add the name demo2, press Enter, and we have
both the files now. Let me add the comments quickly. Now let me create a program so that we can create a categorical Series in pandas. First import pandas and add an alias to it using the as keyword. Now let us create a categorical Series. We can directly use pandas, that is, pd.Categorical, but we can also use the dtype parameter; we will see that. So I created a new object. Now I'll use the pandas.Series method to create a Series, and after that I'll make it a categorical Series by just typing the dtype parameter and setting the data type to category. So here I'll add my values; let's say I'll add demo values. Okay, I have added values here, five values. Now I'll just display the categorical Series: just display s, that's it. Okay, I'll go to File, Save All, right-click and run demo1. Here our categorical Series will be visible, and you can see the data type is category, and it shows that we have four categories, that is, P, Q, R, S. It is shown here. In this way, guys, we can easily create a categorical Series. So we created the categorical Series successfully using the dtype parameter.

Now let us see the second example, where we will create a categorical DataFrame using the dtype parameter. We will create three categorical columns. Let us see the example. Here is our second example, to create a categorical DataFrame. Import pandas. Create a categorical DataFrame: okay, here it is, df, our object, pandas.DataFrame, that is, pd.DataFrame, to create our DataFrame. Now within this we will add our data and set the categories, and at the end we will type the dtype parameter: we will set the dtype parameter for the data type as category. That's it. We can directly use the pandas.Categorical function also, but I'm creating it in an alternative way. So let me create my first category here. Let's say the category name is cat1, and I'll be setting a list in it using the list method: let's say P, Q, R, S, some random values. The first
got created. Let me create the second category and the third also. Now I'll change it to cat2 for the second category and cat3 for the third category, and I'll also add some values, P, Q, R, P here and some random Q. This would be fine, and the data type we already set to category. Good to go. Now display the DataFrame. One by one I'll be displaying the data type of each column. We have three columns here: cat1, cat2, cat3. Go to File, Save All, right-click, run demo2. Okay, we have our three categories here, and I have displayed the data type, and it shows that the data type is category, which is fine. I just wanted to show it here. I thought I made a typo, but no, it's fine. Right-click, run demo2. Here it is. In this way, guys, we can easily create a categorical DataFrame.

So guys, we saw how to create a categorical DataFrame also. In this lesson we saw how we can create a categorical Series and a categorical DataFrame.

In this lesson we will learn how to work with categories in pandas. For that we'll be learning how to append new categories and how to remove a category. First we will see how to append new categories, and then we will see the second example, to remove a category. Let us begin with the first example, in which we will use the add_categories method to append the category. Let's see the first example.

We are using the PyCharm Community version, a free and open source IDE. We will create a new project: go to File, New Project, and add the name of the project. You can add any name, and it will get saved in the following location. Click Create. Here we have created our project. Now let us add files: right-click, New, Python File, add the name of the Python file, demo1. When I press Enter it will automatically add the .py extension. Here it is, we have created it. We have two examples, so I'll create the second file: right-click, New, Python File, demo2, press Enter. I've added the second file also. Let me add the comment. Here we have our first program. Let us import
pandas import pandas as pd pd is an alias now let us create a categorical series okay s is equal to pandas that is pd.Series and within this we will add a series and set it as a category type dtype is equal to category okay now I'll add the data here okay I've added the data now I'll print the series okay now append a category okay now I'll be using add_categories okay now I'll type add_categories to append let me add t here okay and that's it display the updated category s okay go to file save all right-click run demo1 and let's see the result first we were having four categories okay here it is p q r s now we added one more category appended it's visible in the end and we have five categories now in this way guys we can easily add a new category to a categorical series we saw how to append a category now we will see how to remove a category using the remove_categories method so let's see the example demo2 okay remove a category import pandas as pd okay we can take it from here I'm taking the data from here that's it here it is we have categories here set using the dtype parameter and we have displayed the series now s is equal to s.cat.
remove_categories so let's say I'll remove r from here and I'll just print the series okay updated category s let's see file save all right-click run demo2 we were having four categories initially p q r s here it is four categories now we removed one of the categories and we have only three remaining because we deleted r okay here it is we removed the r category in this way guys you can remove a category in this lesson we saw how we can easily work around categories we created a category using the dtype parameter we appended a new category then we also saw an example to remove a category

in this lesson we will learn how to read a CSV file in Python pandas we can easily read and access a CSV file with pandas for that the read_csv method is provided by pandas it is a built-in method so herein we will create a CSV file and we'll read it using this method we will create the following CSV file let us first create it I'll be creating it on the desktop right-click on the desktop new add an Excel worksheet let's say I'll name it students I'll change the extension to .csv here when I'll keep the cursor here it will ask me to change the extension or not go for yes if you're not getting the extensions here what you need to do go to here and view herein show and you need to enable the file name extensions this works for Windows 11 okay that's it I did that so the extensions were visible now we have created students.
CSV click on it I have opened it okay now let me add a demo content to it let's say student rank and marks okay I have added student and rank I'll add here marks so I'm just creating a basic data set okay generally CSV files are very huge but I'm showing an example here okay that's it okay now it's visible so here is my data set I'll go to file and just save it I'll print this using the read_csv method okay we will see these three examples first we will read the CSV using read_csv then we will also show the top n rows and the last n rows of the dataframe that is our CSV file the first example we will be using the same CSV file which we just placed on the desktop students.csv and we will use the read_csv method let us do it we have this PyCharm IDE the free and open source edition PyCharm Community Edition so let us create a new project file new project add a name to your project read CSV it will get created in the following location click create it got created now let us add a new file since we are having three examples so I'll be going for three files right-click new python file add the name to your first file it will automatically add the extension you just need to press enter now here it is got created create two more files new python file demo2 we have created three files let me add the comments okay we have added the comment I'll import pandas as pd we have imported pandas and added an alias here now we will input our CSV file which we just created we will load this CSV in a dataframe using read_csv now we will create a dataframe object I have created df is equal to pandas that is pd.read_csv and now add the path here it is right-click you can copy as path from here or you can go to show more options and click on copy as path okay now right-click and paste your path now I'll print it print the dataframe that will include our CSV file save all right-click run demo1 this may show an error here it is
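
A minimal sketch of the read_csv step described above, assuming a small made-up students file. The Amit and David rows follow values mentioned in the indexing lesson of this course; the John row, the column capitalization, and the `C:\demo` path are placeholders. The file is written to a temporary folder so the sketch runs on any machine. The last lines show two equivalent ways to write a Windows path as a Python string, which is one common reason a pasted path fails; the exact error shown on screen is not captured in the transcript.

```python
import os
import tempfile

import pandas as pd

# Made-up rows standing in for students.csv (column names are assumptions).
csv_text = "Student,Rank,Marks\nAmit,1,95\nJohn,2,90\nDavid,3,80\n"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "students.csv")
    with open(path, "w", encoding="utf-8") as f:
        f.write(csv_text)
    # read_csv loads the whole file into a dataframe and adds a default 0, 1, 2... index.
    df = pd.read_csv(path)

print(df)

# Two equivalent spellings of a (hypothetical) Windows path:
raw_path = r"C:\demo\students.csv"        # raw string: backslashes are kept as typed
escaped_path = "C:\\demo\\students.csv"   # each backslash doubled
same = raw_path == escaped_path
print(same)
```

Forward slashes such as `"C:/demo/students.csv"` are generally accepted by Python on Windows as well.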
an error so how to fix it there are multiple ways and it depends on your system what will work and what won't so I'll show you every possible way first type r here okay now run okay this still won't work so remove the r add a path like this double slash now right-click and run demo1 again if this won't work you need to go to your CSV open it okay go to file save as browse herein select it as CSV MS-DOS let's say I'll name it students new okay I'm giving a new name go to tools web options here and go to encoding select UTF-8 click okay and save now we have our students new.csv add it here right-click run demo1 it worked you can see so in this way you can work around your CSV and display the CSV data it automatically added the indexes and this was our CSV okay you can close it now right-click run demo1 and you can see we successfully displayed the read_csv data in this way guys you can also achieve the same guys we saw a CSV file and read it using the read_csv method now guys we will display the top n rows of a dataframe using the head method okay if you won't mention anything in the parameter it will return the top five rows else if you'll add any parameter let's say two I'll add so it will return the top two rows let's see okay so this was our CSV file we were having five records let me add two more records open this okay now save it that's it this is our CSV now go to your program now demo2 okay herein we will display the top n rows by default it will show five rows so let's see we will take it from here okay and we will display I'll right-click and you can see seven records are visible we want to display the top five records so top n rows okay for that what I'll do I'll just mention dataframe.
head that's it and I'll just print it I told you by default it will show the top how many top five rows okay I'll just run it again right-click run demo2 so we were having seven records and it is now displaying the top five records why because we have used the following now let's say I want only two rows so just mention two here that's it right-click run demo2 and it will display two records top five rows top two rows and all the seven records in this way guys we can easily display the top n rows using the head method now let us see the last example in which we will use the tail method to display the last n rows if we won't mention anything in the parameter it will display five rows last five rows else let's say if I add two it will display the last two rows let's see the example here it is now we have our CSV here we just saw how to create and how to run it display the last n rows type df.tail and that's it it will display the last five by default okay right-click run demo3 we are having seven records in our CSV now only the last five are visible because we have used the tail method let's say I need to display the last three rows for example so I'll add three here that's it right-click run demo3 so we have displayed the last three rows using the same method tail in this lesson we saw how we can read a CSV file we saw how to read how to display the top n rows and how to display the last n rows

in this lesson we will learn how to read an Excel file in pandas we will read an Excel file and perform operations on it to read an Excel file we use the read_excel method that is a built-in method of python pandas okay first we need to install the openpyxl package then we need to use the read_excel method so since we are using the PyCharm IDE for this course so I'll be installing it on the PyCharm IDE first let us create a sample Excel file I'll be creating a sample seven rows Excel file generally Excel files are
quite huge but for a sample example for easier purpose we are just showing this small Excel file okay let us create it go to the desktop let's say I'll be creating a new Excel file on the desktop right-click new Excel I'll name it cricket double click I opened it now I'll add some content to it okay cricket let's say player rank points okay so let me add some names cricketer names rank let me add it and I'll just select both of these and drag it points let me set the points as well so this will be our sample Excel I've added the points that's it now click save here minimize we have created this Excel file with the extension .xlsx now we will install the openpyxl package and begin with the following programs the first program is to read an Excel file then we will display the top n rows and then we will display the last n rows I'll load my Excel file in the dataframe begin with the first program so let us read our Excel file using the pandas.read_excel method we will load the Excel file read the data and store it in a pandas dataframe here is our PyCharm IDE we are using the free and open source PyCharm Community Edition let us create a new project for it file new project add the project name okay this should be fine you can add any name here is the path of the project click create we have created our project now let us add a file right-click new python file demo1 it will automatically add the .py extension because python file is by default selected press enter here it is we created now create two more files for our other examples demo2 right-click new python file demo3 press enter we have created all three files now let us add the comments also so here is our first example to read the Excel file first import pandas as pd we have also created an alias now guys I told you to read an Excel file we need the openpyxl package okay so we will install it go to file click settings here and we have our packages we
have the packages installed we installed numpy and pandas you just need to go here itself this was our project name okay we just went here after clicking settings interpreter and click the plus sign here and type the package you want openpyxl right when you'll click here you can see it is a library to read write Excel files fine click install package we have installed openpyxl successfully click close now it will be visible here here it is we installed it click okay now guys we can easily work around our Excel file let us begin first we will input the Excel file load the Excel in the dataframe dataframe object we have created it we use pandas that is pd.read_excel okay now mention the path okay so our path is on the desktop herein right-click on your file you can click copy as path or you can go to show more options on Windows 11 and click copy as path on Windows 10 also you can find this here we have copied the path right-click paste here is a path you can print the Excel file records now that means df okay and you can also mention a message here let me save it file save all right-click run here we have our error okay for Excel we can type r here let's see right-click run demo1 this may fix it no still not fixed or I can remove this and type okay now right-click and run demo1 let's see now it is still not fixed okay the same error do the same thing which I did before for CSV open file save as browse under Excel it's fine tools web options select UTF-8 okay and you can give it a new name let's say I'll give it save you can type cricket new and rest looks fine run demo1 permission issue I'll copy this cut and let me paste it here or I can directly click paste here okay I'll close it first I've copied it now the path is here it's in E drive run demo1 it's fixed now okay we have fixed it so in this way guys you can read your Excel file so you may find some errors while working on an Excel or CSV file so I have shown
you ample options to fix it okay so this should work in 2024 okay so guys we saw how to read an Excel file using the read_excel method we will display the top n rows of a dataframe that is wherein we loaded our Excel using the head method if you want to display the top first five rows then only use the head method if you want a specific number let's say you want two rows so use head and in the parameter mention two and only the top two rows will get returned let's see the example here it is display the top n rows of the dataframe in pandas import pandas as pd okay now let me take it from here I'll input the Excel file and we placed it in E drive okay Excel file records are displayed here now we want to return the top n rows for that I'll use df.head and I won't place anything inside it because I want the top five top five rows right now here is our Excel file okay it is having six records okay the top five will get displayed top n rows file save all right-click run demo2 these were our records and these were top five not the last one so the following top five is visible here so this is how we can work around the head method we can also display specific rows let's say top two I want I'll just mention two here in the brackets and that's it right-click run demo2 here it is we have displayed the top two okay we saw how to work around the head method to display the top n rows now we will see how to display the last n rows using the tail method it works in a similar way but it will display the last n rows if you want to display the last two rows only you can use the tail method and in the bracket mention two let's see import pandas as pd let us load our Excel and print here is our Excel it will display the last five rows that means it will ignore the first one so only these will get printed let's see now return the last n rows print df.tail method right you can add a message also like we did before last five rows
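
The head and tail calls described above can be sketched as follows. This is an illustration under stated assumptions, not the code shown in the video: the six player rows are invented, and the dataframe is built in memory so the sketch runs without a file. With a real workbook the first step would be `pd.read_excel(path)`, which needs the openpyxl package installed.

```python
import pandas as pd

# Invented stand-in for the six-record cricket sheet (names and numbers are made up).
df = pd.DataFrame({
    "Player": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "Rank": [1, 2, 3, 4, 5, 6],
    "Points": [900, 880, 860, 840, 820, 800],
})

top_five = df.head()     # no argument: first five rows
top_two = df.head(2)     # first two rows only
last_five = df.tail()    # no argument: last five rows, so the first record is skipped
last_three = df.tail(3)  # last three rows only

print("Top five rows:\n", top_five)
print("Top two rows:\n", top_two)
print("Last five rows:\n", last_five)
print("Last three rows:\n", last_three)
```

Both methods return a new dataframe and leave `df` unchanged, so they are safe to call just for a quick look at the data.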
run demo3 and the last five rows will get printed okay the last five rows will get printed using the tail method similarly if you want to display a specific number of rows so let's say I'll add three last three rows mention the parameter as three under the tail method right-click and run demo3 and we will be displayed with the last three Ben Rohit and Kan and these were the last three Ben Rohit and Kan so guys in this way we can easily work upon the tail method so guys we saw how we can read an Excel file in pandas we read an Excel file displayed the top n rows as well as the last n rows of the dataframe

in this lesson we will learn what is indexing in pandas okay with that we'll also see some examples indexing basically means to index and select specific data in pandas let's say you have the following CSV file and you want to get a specific record let's say the record of the student Amit you can easily get it or you want to get a specific column you can easily achieve this for this video we will use the following students.csv file in the previous lectures we have seen how to create a CSV file how to add data to it you can use Excel also to create a CSV file add the data and save it as .csv the following CSV file we will consider here are the records and we will work on this students.
csv file okay indexing basically means to select specific rows and columns of data so here we will consider a dataframe we will input a CSV file and load the CSV file into the dataframe the following operations will be performed first we will use the indexing operator then we will use the loc attribute and the last example will be to work on the iloc attribute let's see the first example here we will use the indexing operator to retrieve the records we have our PyCharm IDE here we have used the PyCharm Community free and open source edition so let me create a new project go to file new project here add the name of the project pandas indexing you can add any name the location of the project is the following click create we have created our project pandas indexing now let us add some files right-click new name your file I'll name it demo1 and the .py extension will get automatically added why because python file is by default selected press enter here is our first file now let us create two more files right-click new python file demo2 press enter right-click new python file demo3 press enter here it is we have created three files now let me add the comments quickly let us see the first example to use the indexing operator for indexing let me import pandas first import pandas as pd pd is an alias here here we have imported pandas and created an alias using the as keyword now let us input the CSV file and load the CSV file into the dataframe the dataframe object type pandas that is pd.
read_csv because we need to read the CSV herein add the path following is the path E drive okay or you can directly go to right click and select copy as path or go to show more options and select copy as path the following is also visible on Windows 10 this is Windows 11 right-click paste add double slash to run it display the CSV file records or dataframe let me print this right-click run demo1 so we ran it successfully and here is our dataframe our CSV loaded successfully check from here same data we will use the indexing operator let me create a new object res this will store our output and use the indexing operator the following the record you want to retrieve add the column okay here you can also set the column for the indexes so I'll add a specific column that means student always the student will be visible because we have set it as an index if we will remove it the default 0 1 2 3 4 indexes will be visible now the marks will also be visible because we have retrieved it using the indexing operator now print add a new line and just mention the following let's see the output I told you that here are all the records we wanted only marks records and we have also set student as index initially okay that's why it's visible like this now we are provided with our marks here using indexing in this way guys we can easily work on the indexing operator now let us see the second example guys we saw how to work with the indexing operator and we retrieved the specific column now in the next example we will retrieve a single row record using the loc attribute let's see here is a second example okay now I'll just take the code from here till this because we have imported CSV here successfully and we have set the column for the index as student so our index is now the student column that is the student name now we need to retrieve a single row let's see which row we will access use df the same dataframe loc operator
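
A minimal sketch of the two lookups this part of the lesson covers: selecting the marks column with the indexing operator after setting the student column as the index, and fetching one row by label with loc. It is an illustration, not the code from the video. The Amit (rank 1, marks 95) and David (rank 3, marks 80) rows match values quoted in this lesson; the John row and the column capitalization are assumptions, and the CSV text is read from memory so nothing needs to exist on disk.

```python
import io

import pandas as pd

# Assumed layout of students.csv; only the Amit and David rows come from the lesson.
csv_text = """Student,Rank,Marks
Amit,1,95
John,2,90
David,3,80
"""

# index_col makes the Student column the row index instead of the default 0, 1, 2...
df = pd.read_csv(io.StringIO(csv_text), index_col="Student")

# Indexing operator: pick a single column; the Student index stays attached.
marks = df["Marks"]
print(marks)

# loc: pick a single row by its index label.
amit = df.loc["Amit"]
print(amit)
```

Dropping the `index_col` argument would bring back the default integer index, and `df.loc["Amit"]` would then fail because "Amit" would be a value in a column rather than a row label.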
within this mention which record you want to access I want the record for student Amit okay print just print this stuff that's it right-click run demo2 the following were our records and the student was our index here and we wanted the records of only the student with name Amit so here it is rank one and marks 95 the same is visible here we can easily retrieve a single row using the loc attribute in pandas now we will see our last example to perform indexing in pandas using the iloc operator this will allow us to retrieve the rows and columns by position the following is our example to perform indexing using the iloc attribute I'll just load the CSV and add it to the dataframe okay similarly we did this for the first two examples and our index is the student column the following was our file we retrieved the following row in the last example in this example we will retrieve the rows and columns by position let's say we want row three records so what I'll do I'll create a new object res df.iloc for the third row we will use the second index print res that's it let's see the output right-click run demo3 we were having a complete dataframe we loaded our CSV and we wanted the third row so we did this using iloc we just added the index David marks 80 rank three third one 1 2 3 David marks 80 rank three is visible so we easily saw how to work with the iloc attribute to retrieve the rows and columns by position guys in this video we saw how we can work with indexing in pandas we saw how to work with the indexing operator the loc attribute as well as iloc and we retrieved records

in this lesson we will learn how to select multiple columns in pandas let's say you have a dataframe and you want to select multiple columns at once you can easily do it okay with that if you want to access more than two columns set it as a range we will see both of these examples in which we will select only two columns specific two
columns then we will also see how we can select more than two columns in a range let us begin with the first example in this we will create a dataframe and to select specific columns mention them under the indexing operator like this okay just mention the name of the columns you want to select that's it let's see the example here we have used the PyCharm IDE the free and open source PyCharm Community Edition let us create a new project file new project add the project name select multiple columns and here is the path of the project we have created click create here is a project now let us create a python file right-click new python file here we need to name demo1 on pressing enter .py will get automatically added because the python file is by default selected I have pressed enter and demo1.py is visible create another file we have two examples right demo2 and here it is we have two files demo1 and demo2 now let us add the comments quickly let us see the first example how to select two columns first we will import pandas and create an alias so that we don't need to use the word pandas again and again here is our alias pd we have added this using the as keyword here it is now let us create the data for a dataframe I'll mention data here you can add any name now add the columns within this add the records let's say I'll add five this will have student name here is the rank here is the marks now add a dataframe pandas that is pd.DataFrame pd.
DataFrame and within that mention the data that's it we have created a dataframe now print the dataframe or you can directly say student records okay I'll use df and in brackets I need to select two columns let's say I need to select the rank and marks columns so directly mention rank and comma marks that's it do not mention the columns you don't want to add and print this selecting only two columns okay go to file save all right-click run demo1 okay this was the first letter was in capital now it's fine right-click run demo1 we were having student rank marks columns we only wanted rank and marks so the following is visible the indexes are the default 0 1 2 3 4 visible here and here also selected two columns and displayed it we saw how we can select specific two columns from a dataframe in the second example what we will do we will select multiple columns in a range let's say we want columns from third to fifth for that mention the following 2:5 we have used dataframe.columns for this let us understand this with an example second example import pandas let me add the data set from here and I'll be adding more of them now okay student rank marks also add let's say ID okay we have added ID also comma let's say I'll also add roll number let me add address just for demo okay we have added it comma okay that's it now we have our six columns we will display it right-click run demo2 okay we have missed it we missed the comma right-click run demo2 here we have our records okay now we will just try to retrieve the records let's say I'll just type dataframe in the bracket I'll type dataframe.columns okay and within that mention 2:5 and print it this will select columns third to fifth right-click run okay 2 to 5 means third to fifth okay 2 to 5 will retrieve roll rank and marks okay here it is in this way we can easily select multiple columns in a range okay third to fifth 1 2 3 third to fifth roll rank
marks okay so here we have set the dataframe as df you can also mention it as dataframe and then you need to just set dataframe here as well as here also okay for an example we have named it as dataframe you can use df or any name and the same worked for the first example in this video we saw how we can select multiple columns we saw two examples first we selected only two columns then we selected multiple columns in a range using the colon operator

in this lesson we will learn how to add a new column to a pandas dataframe first we will create a dataframe and then we will insert a column to an already created dataframe so we have two ways that means we will see the following two examples we will add a new column using the insert method this will allow you to add the column data the column name as well as the location of the column that is where you want to place the new column in the second example we will assign a new column using the assign method but in this the new column will get automatically added to the end okay let us see the first example I told you we will add a new column using the insert method this will allow you to add the location of the new column the name of the new column and the data of the new column that means the insert method is having the following parameters the first will allow you to add the location wherein you want the new column the second will be the name of the new column and the third will include the values of the new column here it is okay now let us see the example here we have our PyCharm IDE we are using the free and open source PyCharm Community Edition okay go to file create a new project file new project add the name of the project here I've added pandas add column you can add any name it will get saved in the following location click create we have created it now let us add a new python file right-click on the project new python file add the name of the Python file
let's say my first file will be demo1 it will automatically add the .py extension because the python file is by default selected press enter here it is demo1 create a new file again right-click new python file demo2 press enter okay so we have two python files now let us add the comments also okay let us see the first example we will add a new column to a dataframe using the insert method import pandas we have added an alias pd using the as keyword now add the data for your dataframe here it is first I'll add ID then I'll add student column rank marks okay let me add records of five students so I have added ID now add the name I'm adding the name here okay now add the rank that's it marks so we have created our data now create the dataframe let's say I'll create an object dataframe like this I'll use pandas that is the pd.DataFrame method and within that mention the data that's it dataframe created successfully you can print the dataframe also mention a message like we are having student records okay now we will insert a new column I told you using dataframe.insert within that first mention the location wherein you want to place I'll mention two then the name of the column I have placed column name and then the values okay it has also shown the syntax on its own on the PyCharm IDE okay I have added all the details now what we need to do I'll just print it print the dataframe mention the message updated dataframe that's it now go to file save all right-click run demo1 let's see what we did first we created a four column five row table for student records then we added a new column that is roll number and we placed it at index two that means third position ID student third position roll number why because we added two here that means index two means position third in this way guys we can add a new column guys we saw how we can add a new column using the insert
method we also placed the column at a specific location in the second example we will learn how to add a new column using the assign method we will add a new column to an already created dataframe but the new column will get added to the end so this is the purpose of the assign method just mention the column name and the values that's it nothing else herein we have our second example import pandas and add an alias now we can take the records from here we added four columns ID student rank marks now we need to use the assign method to add a new column okay mention dataframe okay create a new object for the output resdf let's say within that mention the dataframe.assign method under this mention the name as well as the values like this okay I have added it mention it like this okay this looks fine now display the updated dataframe print that's it dataframe updated dataframe go to file save all right-click run demo2 here what we did we added the records four columns okay then we printed the updated dataframe now we actually made a minor mistake we just printed this now we need to print resdf that's our resultant dataframe again right-click run demo2 herein I told you using the assign method we can display the new column at the last location okay we added it so guys in this lesson we saw how we can add a new column using the insert method that will allow you to also set the location of the new column and using the assign method that will allow you to place the new column at the end of the current dataframe

in this lesson we will learn how to delete rows or columns in a pandas dataframe okay for that we use the drop method using this method you can remove a specific row or column under the drop method you need to mention whether you want to delete a column or a row let's see how first we will see how to drop a column then we will see how to drop a row so these two examples we will cover in this
concept let us begin with the first example in which we will learn how to drop a column using the drop method okay I told you we'll be playing around the axis okay if under the drop method we will set axis is equal to 1 it will drop a column or we can also set axis is equal to columns to achieve the same let's see the example here we have our PyCharm IDE we are using the free and open source PyCharm Community Edition let us create a new project go to file new project add a name okay pandas delete rows columns you can add any name okay and the location of the project is the following click create here we have created right-click here now to create a new python file right-click new python file mention the name of the file I'll mention demo1 you can mention any name when I'll press enter it will automatically add the .py extension because python file is by default selected I've added right-click new python file demo2 press enter and now we have created both the files now let us begin first I'll add comments okay I've added the comments now let me import pandas as pd pd is an alias we have added using the as keyword let me add a data set first a data for the dataframe I'm adding the ID first now the second column let's say student that will include the students name then rank and then the marks okay we have added the marks also now that's it create a dataframe we have created an object here dataframe type pd that is the pandas.DataFrame method and within that mention the data that's it we have created a dataframe display printed student records drop a column okay let me create a resultant dataframe object I'll mention the dataframe.drop method and within this I'll drop a column okay let's say I'll drop the marks column for that mention the column name that's it and then mention the axis I have marked the axis parameter as columns because I want to delete the marks column now mention the result printed resdf was having a result okay
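
The column drop described above can be sketched as below, as an illustration under stated assumptions rather than the exact program from the video. The five student rows are invented, apart from Jacob sitting in the third row as mentioned in this lesson, and the names `res_by_name` and `res_by_number` are hypothetical; the column names follow the four columns listed in the lesson with the capitalized `Marks` spelling.

```python
import pandas as pd

# Invented records; only the column layout (ID, Student, Rank, Marks) follows the lesson.
data = {
    "ID": [101, 102, 103, 104, 105],
    "Student": ["Amit", "John", "Jacob", "David", "Steve"],
    "Rank": [1, 4, 3, 5, 2],
    "Marks": [95, 70, 80, 60, 90],
}
dataframe = pd.DataFrame(data)
print("Student records:\n", dataframe)

# axis="columns" tells drop that "Marks" is a column label, not a row label.
res_by_name = dataframe.drop("Marks", axis="columns")

# axis=1 is the numeric spelling of the same thing.
res_by_number = dataframe.drop("Marks", axis=1)

print("After dropping Marks:\n", res_by_name)
```

drop returns a new dataframe and leaves `dataframe` itself untouched, which is why the result is stored in a separate object before printing.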
This looks fine. I'll go to File, Save All, right-click, run demo1. Okay, Marks: the M is capital. Fine. Right-click, run demo1. Here we were having four columns; now we have three columns, because we have removed the Marks column by just mentioning axis as columns. You can also mention it as 1. Let's say I'll mention 1, and it will work in a similar way. Right-click, run demo1: four columns, and now three columns. We have deleted the Marks column. So you can mention 1 here, or columns. In this way you can drop a column from a pandas dataframe: we use the drop method, with the axis value set to 1 or columns.

Now, in the next example, we will delete a row, again using the drop method, but in the brackets we will set the axis to 0; that is the rows axis. So you can mention 0, or you can directly mention index, to delete a row.

Here we have our second example. Import pandas as pd. Now let us create our data for the dataframe. I can take it from here: the data, and I'll print it. I have created four columns. Now I'll drop a row. For that, create a resultant dataframe: dataframe.
drop, and within this mention the row with index 2; that means the third row will get removed. So mention 2, and axis equal to, I told you, index or 0, and that's it. Just print the result; rdf will be having the result. You can mention a message: dataframe after removing a row. Go to File, Save All, right-click, run demo2. Here, I told you, I was having five records and four columns. I have removed the second one, that means index 2, that is the third position: 1, 2, 3. I have removed the Jacob one from here. You can see the student Jacob's record is removed. In this way we can work with the drop method to remove a row. You can also mention 0 here, I told you, axis 0 directly. I'll right-click and run demo2, and you can see the same result: Jacob's record is deleted. So in this way, guys, we can delete a row. In this video we saw how we can easily delete a row or column using the drop method. What you want to remove depends on what you set in the axis parameter.

In this lesson we will learn how to iterate over rows and columns in pandas. To iterate over rows and columns, we will see some built-in functions provided by pandas. First we will iterate over rows and then we will iterate over columns. Let's begin with iteration over rows. To iterate over rows, we will use the following two methods: iterrows, and the second one, itertuples. We will see two examples, one for iterrows and the second for itertuples. Let's see.

Here is the PyCharm IDE. We are using the free and open source PyCharm Community Edition. Go to File, create a new project for this lesson, and name it. I'll name it, let's say, pandas iteration, because we are learning iteration here. You can mention any name, and the project will get saved here. Click Create. We have created it: pandas iteration. Now let me add two files. Right-click, New, Python File, name the file; let me name it demo1. When I press
Enter it will automatically add the .py extension. Why? Because Python File is selected by default. Press Enter. Now create a new file again: right-click, New, Python File, demo2, and we have created two files, demo1 and demo2.

Now we will learn how to iterate over rows. Let me add the comments first: pandas iterrows method to iterate over rows. Let me import pandas first. I'll also add an alias: import pandas as pd; pd is an alias. Now I'll add the data, the data set for our dataframe. I have created a data object. For the first column I'm adding records of five students, starting with ID, comma, Student; this will add the name. I'm adding the names of five students. Next let me add a Rank also, the final column for our example. Now create a new object, and we will create a dataframe: I'll be adding pd, that is pandas, dot DataFrame, data. Now print the dataframe. You can mention here: Student Records.

Now we will iterate over rows. I'll mention: for row in dataframe.iterrows. So we have used the iterrows method. Now mention print here and print the row. Colon, we missed the colon. File, Save All, right-click, run demo1. It will display the rows one by one. Here it is: here is the dataframe, and here are the rows, displayed one by one. Okay, I should add a message: display the rows. We have displayed the rows one by one. Okay, this looks fine now. I've added a new line, so this is a particular row. So we saw how to iterate over rows using the iterrows method.

In the next example, we will iterate over rows using the itertuples method. As the name suggests, each row is returned as a Python tuple object. So let us see the example. Here is our pandas itertuples method. Import pandas. Now consider the data; let's say I'll take the same data. Okay, three columns, and we added the data to a dataframe using the DataFrame method, and we displayed the dataframe here. Now iterate over rows using itertuples: for row in dataframe, our dataframe name was the following, dot itertuples, that's it, colon,
and print the row, that's it. Here you can also mention a message, let's say: display records as a tuple object. File, Save All, right-click, run demo2. Here you can see our dataframe with three columns and five records, and here are the tuple objects. We have iterated, and now all the records are visible as tuple objects. We saw how to iterate over rows using itertuples; it returned a Python tuple object for each record.

Now we will see how to iterate over columns using the items method. This method allows you to iterate over each and every column, and each result is a pair: a label, which is the name of the column, and a column object, which holds the column values. Let's see the example. We will see how to iterate over columns using the items method; this will display a label with the column name and a column object with the column values.

Import pandas. Create a data set; let me take it from here and paste it. We have our data with three columns, and we have records for five students. Okay, we have created a dataframe. Now iterate over columns. Use the for loop: for a, b in dataframe.items, which will allow you to iterate over columns, that's it, colon, and print a, then print b, one by one. Go to File, Save All, right-click, run demo3. Now let's see the output. First it displayed the entire dataframe. Why? Because we printed the student records, the dataframe. Then, one by one, we have printed the columns: first the column name, then the column values. Okay, it's written here: column name. This was for the first column, then came the second column, and then the last column. So one by one it iterated. We can also mention a message here: iterate over each column. Okay, this is fine, we have mentioned it here: iterating the columns one by one.

So we saw how to iterate over columns using the items method. In this lesson we saw how to iterate over rows and columns in pandas.

In this lesson we will learn how to sort the data
in pandas. For that we have various methods; here we will use the sort_values method. Two examples will be considered: first we will sort the dataframe in ascending order, then we will achieve the same in descending order. Let's begin.

To sort the dataframe in ascending order, which is the default, we use the sort_values method. For an ascending sort you don't need to set the ascending parameter of the sort_values method, because ascending is the default. If you want to sort by a specific column, then mention its name in the by parameter of the sort_values method. Let us see the example; we will sort the dataframe in ascending order.

Here we have the PyCharm IDE. We are using the free and open source PyCharm Community Edition. Let us create a new project: go to File, New Project, and mention the name of the project. I'll mention pandas sort; you can mention any name. It will get saved here. Click Create. We have created our project. Now let us add a file: right-click on the project, New, Python File, and mention the name of the file. I'll mention demo1; you can mention any name. When I press Enter it will automatically add the .py extension. So I pressed Enter and it is visible now. Create another file, because we have two examples: right-click, New, Python File, demo2, and press Enter. We have created demo2 also.

Okay, now let us begin with the program. First I'll mention the comments. Let us begin with the first program. I'll import the pandas library first: import pandas as pd. Now let us add the data set. I'll add the data for my dataframe, so I'll create an object, data, let's say. Within that, let me mention three columns. The first one will be Student, that is the student name; let's say I'll add the records of five students. Then Rank; I have set the rank. Then Marks: 95, 70, so I'm placing the marks for the five students, for an example. Now I'll create the dataframe. Let me create a new object, let's say res, okay, let's say
df, for the dataframe. Now pandas, that is pd, dot DataFrame; mention the data in it. That means we have created the dataframe. We will now add the indexes. We have seen in the previous lecture how to add an index, so we are doing the same here. Now display the records, df. Okay, we can also mention a message here, that is, Student Records.

Now we will sort in ascending order. This is the default, so you don't need to mention the ascending parameter, but we need to use the by parameter. I'll show you: df.sort_values, and within that I'll set only the by parameter, because I want to sort according to a particular column. So I'll sort according to the Rank column, that's it, stating which student will be on the top and which will be at the bottom. And just print it. File, Save All, right-click, run. Here is our dataframe, unsorted. Now we have sorted it in ascending order according to rank, so at the top is Amit, and in the last you can see the student David, because his rank was five. In this way, guys, we can sort in ascending order, the default.

Guys, we saw how to sort the dataframe in the default ascending order using the sort_values method. We sorted according to the rank; we have just set the Rank column in the by parameter, that's it. Let us see the next example, in which we will sort the pandas dataframe in descending order. What we will do now is set the ascending parameter to False; that means the opposite, descending order. The same by parameter will be used to sort according to a specific column, like we did in the last example. Let's see the example.

Here it is, in descending order. Import pandas, create an alias. Now let us create our data for a dataframe; let me take it from here. Here we have created our data and added it to our dataframe, that's it. Our data was having three columns, Student, Rank and Marks, with five records, the records of five students, and
we also added an index for the dataframe using the index parameter, and we printed the dataframe, that's it. Now we need to sort in descending order. We will also set the by parameter, so I'll just use the df.sort_values method and set the by parameter. What is the by parameter? It sorts according to a specific column, so we will sort according to the Rank column, this Rank column. And we will sort in descending order by just setting the ascending parameter to False; that means descending, right? Okay, now just print this, that's it. Go to File, Save All, right-click, run demo2. Now we will see whether we have achieved it or not. Student Records: three columns and records of five students. And now we have sorted it by rank in descending order, the opposite, so the fifth rank is for David and the first rank is for Amit. So we successfully sorted our dataframe. In this lesson we saw how we can sort the pandas dataframe in the default ascending order and in descending order.

In this lesson we will learn how to handle duplicates in pandas. If you want to find and remove duplicate rows in a pandas dataframe, use the duplicated as well as the drop_duplicates method. Both of these are built-in methods of Python pandas. Let's say you have a data set with a lot of rows and columns and you want to find the duplicate records; maybe they got added while inserting data. So you need to find them and remove them. Both of these can be easily achieved; we will see the same in this lesson. Both of these methods can be used on a pandas dataframe or series. We will find and remove the duplicates, beginning with the first one: we will find the duplicates using the duplicated method. It will find the duplicates and will return a series of True and False values; that means if the row is a duplicate, True will be returned. For our example we'll be using a smaller data set, so that it's easier for you to understand it. Let's see
the first example, to find duplicates. Here we have our PyCharm IDE. We are using the free and open source PyCharm Community Edition. So create a new project: go to File, New Project, and add the name of the project. Let's say I'll add pandas duplicates. Here is the location of the project. You can add any name for the project; I have named it pandas duplicates. Click Create. We have created the project, here it is. Now we will create a file: right-click, New, Python File. Now I'll add the name of the file. You can add any name; I am adding demo1. It will automatically add the .py extension when I press Enter, because Python File is selected by default. Press Enter. Okay, so it is visible. We have two examples, so I'll add another file: right-click, New, Python File, demo2, press Enter. So we have our two files.

Let us start with the first program, to find the duplicates. I'll add a comment first. Now let us begin with the first example. First I'll import the pandas library, import pandas, and I'll create an alias, pd. Now let us create the data set for our pandas dataframe. So here it is, the data object I have created. Now I'll add my data. I'll add three columns. The first one would be Student, that will include the student name, so let us add the student names, comma, now add the Rank, the student rank, then the Marks. Okay, here you can check that we have added duplicate records. Let's say these records got added by mistake, so we need to find these duplicate records so that we can remove them. For that we'll be using the duplicated method. If the data set is really huge, then this method is really helpful.

Now we will create the dataframe. We have taken a df object; type pd, that is pandas, dot DataFrame, and within that mention the data. That's it, we have created a dataframe here. Now you can print it; let's say I'll print Student Records. Okay, now find the duplicates. I'll take an object, and I'll mention dataframe.duplicated, and I'll just print the
res, okay, that is our output. You can mention a message, describing duplicates, because it will show True or False values. Go to File, Save All, right-click, run demo1. Here were our student records, and Amit was repeating; it was a duplicate. So it has mentioned that the following is a duplicate of this. It has marked this with a True value. We found a duplicate using the duplicated method; here it is. So guys, we saw how we can find duplicates in a dataframe using the duplicated method.

In the next example we will remove these duplicates using the drop_duplicates method. Here is our second example: remove duplicates using the drop_duplicates method. Import pandas first. Okay, now mention the data set; let's say I'll take the same data set. I'll create the dataframe. I've copied, now I'll paste. So this was our data with three columns, and we added this data to a dataframe using the DataFrame method and printed it here. A duplicate record was the following, the Amit one, which is repeating here also and here also. So we will remove it. For that, let me mention res is equal to df.drop_duplicates; this will remove them. Now I'll print res, that is the resultant dataframe after removing the duplicates, with a message: new dataframe after removing duplicates. Okay, File, Save All, right-click, run demo2. Here, I'll show you. We were having five records and three columns. Two of them were duplicates, that is, the following was a duplicate of the first one. This method will remove the following: you can check here, after John there was Amit, but it got removed, and David came in directly. So guys, if your data sets are really huge, then this method is actually a blessing, so that you can find and remove the duplicates.

We removed the duplicates using the drop_duplicates method. In this lesson we saw how we can easily handle duplicates. Handling duplicates in a dataframe or series means to find and remove the duplicates; we did that
using  both these built-in methods in this lesson we will   learn how to clean the data in pandas cleaning the  data means basically to work on an incorrect data   to fix it or the data can also have null values or  it can be a duplicate data in pandas we have some   built-in functions to fix such incorrect data  in this lesson we will consider the following   demo. CSV file okay and we will try to fix it  the data in the demo. CSV file is having null values and here is a demo. CSC file okay you  can see we have some empty values we will work   on this data in the previous lectures we saw how  to create a CSC file using Excel Microsoft Excel   okay we also saw how to read it and we also  handle duplicate data okay now we will see the   following examples using the built-in functions  of pandas and we will clean the data using the   is null method not null drop any and fill any  method all of these have different properties   okay here we will try to find the null values  and we'll replace them with true here we will   find the not null values and replace them with  true and the following two are basically used   to drop the rows or replace the null values  with a specific value let's say want to set   a value 100 for the null values in our CSU file  you can do it with fil method let us start with   the first example to clean the data using the  isal function find the null values and replace   them with true so what about the non-null values  those will get replaced by false let's see the example here we have our PyCharm okay we are using  the PyCharm Community Edition which is free on   open source let us create a new project go to file  click new project here and add the project name you can add any name to the project and  the following is the location of the project click create here's our new project let us  create a new file right-click new python file add the the name of the file let's say  I'll add demo 1 it will automatically add the py extension because 
the Python File is selected by default. Press Enter, and we have created our first file. Since we are having four examples in total, I'll create all the files quickly: right-click, New, Python File, demo2, demo3 and the last, demo4. Okay, we have created all the files. Now let us add the comment and create our first program.

Now let us start with the first program, the pandas isnull method. I'll import pandas first: import pandas as pd; pd is our alias. Now let us input our CSV file; we will load this in our dataframe. This is the dataframe object: df is equal to pandas, pd.read_csv, to read a CSV here, and add the path. To get the exact path, go to your file and right-click. This is Windows 11, so you can copy this path. Or if you're having Windows 10, okay, I clicked on Show more options; on Windows 10 the following will be visible, and you need to just click on Copy as path. Go to your project, right-click and paste the path, that's it. One more slash, that's it. And just display the CSV file records: what you need is to print df, the dataframe, that's it.

Now we need to find the null values and replace them with True using the isnull method. Take a new object, rdf let's say, is equal to dataframe.isnull, that's it. Now return the new dataframe: print df. You can also type to_string. Now I'll just mention a message. File, Save All, run demo1. Okay, I made a mistake; it should be rdf, because the new dataframe is the following. It looks fine. Run. Here you can check: the following was our dataframe, the CSV, with two null values, and the null values will be replaced by True and the rest will be replaced by False. So you can easily find the null values using this method. Guys, we saw how to use the isnull method.

The second example covers the notnull method. What will this method do? It will find the non-null values and replace them with True, the opposite of the previous method, and for the null values it will return False. Let's see the example. Here it is, using
the notnull method. Import pandas. Okay, input the CSV file; or what I can do is take the complete code, copy, paste. Okay, we have input the CSV file using the read_csv method, and we have also printed the dataframe in which it was loaded. Now we will replace the non-null values with True. Create a new dataframe, rdf, the resultant dataframe: dataframe.notnull. Okay, now return the new dataframe, because it will be having our output, rdf. You can also add to_string here. Now mention a message: new dataframe, that's it. Go to File, Save All, right-click, run demo2. Okay, now, these were our null values under Points, and they are now replaced by False, and the rest of the values are True. So this is the opposite of the previous function, the isnull method; this is the notnull method. Guys, we saw how to work with the notnull method to find the non-null values and replace them with True.

Let us see the next example. In this example we will use the dropna method. This method is used to drop, that is remove, rows with null values. Let us see example three, using the dropna method. We have the following null values, so these rows will get deleted. Now let me import pandas: import pandas as pd. We have imported the library. I'll load our CSV file from here, and we'll also print it. Here it is, we have loaded the demo.
CSV, and we printed it after loading it in the dataframe. Now find and remove rows with null values. rdf: we have created a new dataframe for the result, dataframe.dropna, that's it. Now what we need to do is return the new dataframe, that is rdf. You can also mention to_string. Okay, now just mention a message: after removing rows with null. This is fine. File, Save All, right-click, run demo3. Now we can check: the following was our dataframe, the CSV, and the 6.1 and 4.5 frequency rows will get deleted. You can't find them; after 3.2 you will directly have 1.2. Here it is: 3.2, 1.2. So we have deleted the null values. If by mistake you added such values, or they got added, you can easily remove them. If your data sets are really huge, then these functions are really helpful. Guys, we saw how we can use the dropna method to drop rows with null values; we deleted the entire row.

Now the last example, in which we will use the fillna method to replace the null values with a specific value. Demo4, the fillna method. In the brackets you can see we have a value; this is the value we want to fill in place of the null values. Import pandas as pd. Now take your dataframe, your CSV loaded and added to the dataframe, and paste. Okay, we have loaded our CSV using the read_csv method, added it to the dataframe and printed the dataframe. Let's say I'll place 111 in place of all the null values. The null values are here, so 111 will get placed here. Take the resultant dataframe, resdf: df.fillna, and add the value 111. You can add a message also: after replacing null with a specific value. Go to File, Save All. We have added 111. Run. These two were null values, and we have replaced them with 111. You can add a backslash n, a new line, here also. Okay, this looks fine now. Right-click, run. Now it will be displayed properly. Right-click, run. Now this looks fine. Okay, we have replaced them with 111. So in this way, guys,
we can work on data; we can clean the data. We saw how to replace null values with a specific value. We replaced them with 111; you can add any value. Guys, in this video we saw how we can use the built-in functions of Python pandas to clean the data. We found the null values and replaced them with a specific value. We also saw how we can display True in place of null or non-null values. Thank you for watching the video.

In this lesson we will learn how to perform operations on text data in pandas. If you have text data in your series or dataframe, you can easily perform operations on it. For example, if you want to convert the entire text data to lowercase, use the lower method. If you want to convert the entire text data to uppercase, use the upper method. If you want the same data in title case, with the first letter of each word capitalized, use the title method. You can also get the length of each and every element using the len method, count the non-empty cells using count, and if you want to search for any value in a column, use the contains method. So in this lesson we will see these six examples, and we will cover the following six built-in functions of Python pandas. This will allow us to perform operations on our text data, that is, strings. Let's start.

In the first example we will focus on the lower method, which will allow you to convert your text data to lowercase. Let's see the first example. Here we have our PyCharm IDE. We are using the free and open source PyCharm Community Edition. So create your new project: go to File, New Project, and add the name of your project. You can add any name; I'll type, let's say, pandas string operations. The location of the project is the following; the project will get saved here. Click Create. Here is our project. Now we need Python files for our six examples. I'll create the first file: right-click, New, Python File, and enter the name of the file. Here I have mentioned demo1; you can mention any name. It will automatically add the .py
extension, because the Python File is selected by default. Now just press Enter, and you can see it is visible, and the path is also visible. Similarly create five more files, because we have six examples in total, of six built-in functions in pandas: right-click, New, Python File, up to demo6, that's it.

Now we will focus on our first file, and we will use the lower method. Let me add the comments also. Now let us start with the first example. At first I'll import the pandas library, and I'll also create an alias, pd, so that we don't need to write pandas again and again. To create an alias I have used the as keyword. So let us create the data. We will store this data in a pandas Series, so I'll create a new object, data let's say. Here we are adding mixed text, that is, mixed case text. Here we have five names in different cases. Now create a series. Let me add the object; let's say it's series. pandas, that is pd.Series, and I'll place this data in it. That's it, we have created a series. Now display it; to display the series, just print this object. Now we will convert the text data to lowercase. For that, I told you, just mention series.str.lower, that's it. You can also mention a message. File, Save All, right-click, run demo1. Here we have our series data in different formats. You can see Trent Martin is having mixed case, Trent is having uppercase, and we have converted the entire data to lowercase. So guys, we saw the first example, to convert all the text data to lowercase using the lower method.

Let us see the second example. In the next example we will do the opposite, that is, we will convert the entire text data to uppercase using the upper method. Let us see the example. We will use the upper method. Import the pandas library: import pandas as pd. Let us take the text data from here: right-click, copy, paste. We have entered the data and created a series here. Let me do some changes. Let me do some other changes, because I'll convert
everything to uppercase. Okay, let me convert now: the series.str.upper method. We have converted the text to uppercase. Save All. Now all these elements will get converted to uppercase. Right-click, run demo2. Here it is: the series with mixed cases, and now we have converted it to uppercase. Guys, we saw how to convert the entire text data to uppercase using the upper method.

Let us move to the next example. In this example we will use the title method. This title method will allow you to convert the entire text data to title case. What is title case? In title case, each word has its first letter in capital. So let us use the title method. Here we have our demo3, the title method. Import the library: import pandas as pd. Okay, now let us take the data for our series and print it: copy, paste. Now we have the data, we added the data to the series, and that's it, we printed the series. Now we will convert it to title case. Let me do some changes. Okay, let me keep this as it is, and the rest, let's say, I'll change. Now we have mixed cases here, so that we can understand the concept. Now let us convert: just mention series.
str.title, and we can also print a message: camel case data. File, Save All, right-click, run. We ran it, and this was our series data in mixed case, and now we are having the title case, that is, in Jacob the first letter is caps, in Amit also the first letter is caps, and it works for the other elements. So in this way, guys, we can convert text to title case. We saw how we can use the title method to convert our text data, that is our series, to title case.

In the next example we will get the length of each element in the series using the len method. Let's see the example. We will use the len method. Import the library: import pandas and create an alias. Now let me take the data and paste. Let me do some changes. So I have done the changes. We have the data here, we added this data to our series, and using the Series method we have created our pandas Series. Now we just need to get the length of each element. Mention the text, let's say: length. Save All, right-click and run. Here is the series and the length of each element. The following length is 10: the following, my name, Amit Diwan, its length is 10. In this way, for each element we found the length. Guys, we saw how we can get the length of each element in the pandas Series using the len method.

The next example covers the count method, with which we will count the non-empty cells in a series. Let's say we have five elements in a series and two of them are NaN values, null values; then the output will be three, because those three are the non-empty cells. Let's see the example. Here we have our count method. Import pandas as pd; pd is an alias. Get the data; let's say I'll take the following data. I'll paste it here, and for the following I'll mention np.
nan, and it automatically added the numpy library also. If you remember, before installing pandas we needed to install numpy. So I just added NaN using numpy, and it automatically added the library import. Let's say I'll also remove this, and let's say this, and I'll mention numpy NaN. So the output should be three: three non-null values. Okay, series.count. Now I'll go to File, Save All, right-click, run demo5. And now you can see that we were having three non-null values, therefore the count is three. In this way, guys, you can find the count of elements in a pandas Series. We counted the non-null values using the count method.

Our last example covers searching for a value in a column. We will achieve this using the contains method. Let us see the example: the contains method. Import pandas. Add the data: right-click, copy, right-click and paste. So we have the data here, sample data. We created a series using the Series method and added this data. Now search for a specific value: series.str.contains. Let's say I'll find where Amit is located. The matching element will be shown as True. Does the specific value exist in our series? Amit, yes. Right-click, run. Here it is: Amit is visible here, and for the same position True is visible. In this way, guys, we can work with the contains method to find a specific value. We worked on the contains method to search for a value. In this lesson we saw how we can perform string operations on text data. We saw these six examples; we worked on a series.

In this lesson we will learn how to perform datetime operations in pandas. The datetime operations include getting the current date and time, getting the specific day of a week or a year, checking whether the year is a leap year or not, and checking whether a day is the last day of the month or the first day of the month. So in this lesson we will cover the following operations, nine datetime operations in total, to work with date
and time and to understand the concept completely the first example includes getting the current date and time we will use the Timestamp.now method for this this is a built-in method of python pandas let's see the example here we are using the PyCharm IDE it has a free and open source PyCharm Community Edition so create a new project go to file new project add the project name I have added Pandas datetime you can add any name and the location of the project is the following click create we have created the Pandas datetime project now to run our python code right-click and create a new python file right-click new python file add the name of the file let's say the name I'll add is demo1 the .py extension will get added on its own since python file is selected by default press enter we have nine examples in total so let us create all the files quickly eight more files first we will get the current date and time for that import pandas and create an alias we have created pd as an alias so that you don't need to write pandas again and again okay now we will get the current date and time with pd.Timestamp.now okay and we just need to print it this will get you the current date and time go to file save all right-click run demo1 here it is the current date is 29 December and the time is 4:38 p.m. 
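As a rough sketch of what demo1 might look like, and of a few of the datetime operations listed for this lesson: the variable names and the fixed timestamp below are assumptions chosen for illustration, not the exact code typed in the video.

```python
import pandas as pd

# current date and time as a pandas Timestamp (the value changes on every run)
now = pd.Timestamp.now()
print("current date and time:", now)

# a fixed timestamp so the results are repeatable; the hour value 16 is an assumption
ts = pd.Timestamp(year=2023, month=12, day=29, hour=16)
print("day of week:", ts.dayofweek)   # Monday is 0, so Friday is 4
print("day of year:", ts.dayofyear)
print("leap year:", ts.is_leap_year)
print("month end:", ts.is_month_end)
```

Using a fixed timestamp for the attribute checks keeps the output the same on every run, while Timestamp.now depends on when the script is executed.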
guys we got the current date and time using the Timestamp.now method now we will get the day of the week using the Timestamp.dayofweek attribute let's see the example demo2 to get the day of the week let us import pandas set a timestamp using pd.Timestamp so I have set the timestamp like this with the year month day and hour you can also set it like this and print the timestamp okay now guys display what you want the day of the week for that I'll directly print timestamp.dayofweek that's it and you can also mention a label here like this day of week go to file save all right-click run demo2 let's see the day of the week is 4 pandas counts Monday as 0 so 4 means Friday okay in this example we saw how to get the day of the week using the dayofweek attribute in the next example we will get the day of the year using the dayofyear attribute import pandas as pd we will add a timestamp let me take it from here okay we have set the year month day as well as the hour okay we have printed the date and time now display the day of the year timestamp.dayofyear also you can mention a label day of year file save all right-click run demo3 okay 29 December and the day of the year is 363 guys we saw how to get the day of the year using the dayofyear attribute now we will get the number of days in a month using the built-in days_in_month attribute let us see the example get the number of days in a month so for the month which we have set December it will display 31 as the answer import pandas take the timestamp we have set the year to 2023 and the following is the month and day and we have displayed the timestamp now display the number of days timestamp.days_in_month days in the month okay right-click run it's visible the date and time the month is December so the days in the month will be 31 there are 31 days in December so we got the number of days in a month using the days_in_month attribute now we will 
check whether the year is a leap year using the Timestamp.is_leap_year attribute let's see the fifth example check if the year is a leap year okay import pandas now we will take the timestamp check for a leap year timestamp.is_leap_year that's it and you can also enter a message is this year a leap year file save all right-click run demo5 here it is is this a leap year no 2023 is not a leap year in this way we saw how to check whether any year is a leap year or not using the is_leap_year attribute now we will check whether the date is the last day of the month using the Timestamp.is_month_end attribute let's see check if the date is the last day of the month in pandas import pandas add an alias set the timestamp let's say we have the timestamp and we have displayed the date and time check if the date is the end of the month print timestamp.is_month_end is this the month end let's see file save all right-click run demo6 False let me set the day to 31 and it will show that yes this is the month end run yes True in this way guys we can easily find whether the date is the last day of the month we used the is_month_end attribute to find whether the date is the last day of the month that is the month end now the seventh example in which we will check if the date is the first day of the month using the is_month_start attribute check if the date is the first day of the month import pandas and create an alias now guys get the timestamp display the timestamp using pd.Timestamp and we have set it here now what we need to do is check if the date is the first day of the month with is_month_start you can also mention a text here is this the first day of the month that is the beginning file save all right-click run demo7 is this the first day of the month no but what we can do is set the day to one now this looks fine okay right-click run demo 
7 is this the first day yes we have set it True this was December 1st 2023 in this way guys we can find whether the date is the first day of the month or not it has displayed True now we will find whether the date is the last day of the year using the Timestamp.is_year_end attribute let's see here it is check if the date is the last day of the year in pandas import pandas first and create an alias now add a timestamp okay we can set the date to let's say 29th December and we have displayed the date check if the date is the last day of the year print timestamp.is_year_end okay file save all right-click run demo8 now this is not the last day of the year I can set it to 31st December and now True will get printed right-click run True 31st December is the last day of the year okay guys we saw how to check if the date is the last day of the year using the is_year_end attribute now the last example in which we will check if the date is the first day of the year using the is_year_start attribute let's see the example okay check if the date is the first day of the year in pandas import pandas timestamp data right-click copy paste and check if the date is the first day of the year print timestamp.is_year_start is this the first day of the year file save all run and the date was 31st December so obviously it will be False it is the last day of the year not the first day so what I'll do is just type 2024 month one day one right-click run it should display True and True is visible so guys we can easily check if the date is the first day of the year we achieved it using the is_year_start attribute in this lesson we saw how we can work with date operations in Python pandas we saw nine examples thank you for watching the video

in this lesson we will learn how to remove white space or specific characters from text data in a pandas series or 
dataframe so for that we have the following three built-in methods and these are provided by python pandas if you want to strip white space or specific characters including new lines from the left and right of a string you can use the strip method if you only want to strip white space new lines or specific characters from the left use lstrip and if you want to achieve the same from the right side use the rstrip method okay so let us see these three examples in the first example we will use the strip method so let's say your string has a newline character or some other characters on the left and right side let's say a tab then you can easily remove them from both sides using the strip method use this method through the str accessor on your series or on a dataframe column let us see the example we are using the PyCharm IDE PyCharm has a free and open source version that is the Community version so we are using the same open source version create a new project go to file new project here add the name of the project okay let's say I've added the following name you can add any name and here is the location of the project click create we have created the project now let us add the python file wherein we will write our code right-click new python file let me add the name of the file I have added demo1 you can add any name when I press enter it will automatically create a new python file with the .py extension because python file is selected by default here I pressed enter and it created demo1 now create two more files because we have three examples right-click new python file demo2 and now demo3 now we have our three files let us add the comment here we have the strip method in Python pandas okay now let us import pandas we have created an alias here pd okay and now I'll add the data for our pandas series here is the data let's say the following is the data okay 
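The three methods described above can be sketched side by side as follows. This is only an illustration under assumptions: the sample names, the extra characters and the variable names are hypothetical and are not the data used in the video.

```python
import pandas as pd

# hypothetical sample data with extra characters on both sides
data = ["\n\tAmit!\t", "!!John\n", "\tDavid!!\n"]
series = pd.Series(data)

# the set of characters to remove: exclamation mark, newline and tab
chars = "!\n\t"

both = series.str.strip(chars)     # removes from the left and the right
left = series.str.lstrip(chars)    # removes from the left only
right = series.str.rstrip(chars)   # removes from the right only

print(both.tolist())
print(left.tolist())
print(right.tolist())
```

The argument is treated as a set of characters, not as a fixed prefix or suffix, so any mix of these characters at the chosen end is removed. Calling the methods without an argument removes only white space.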
I'll add some characters on the left and right such as \t now I'll just create the series series is equal to the pd.Series method that's it and add the data so we have created our series easily now display it we have displayed the series okay we also have characters on the left and right which will get removed by the series.str.strip method okay guys we need to remove the \n and \t as well as this so I'll just mention them inside the brackets okay now right-click run demo1 now you can see we had some special characters here so I just removed them using the strip method you just need to add those characters here that's it guys we removed the specific characters from the left and right side in our pandas series using the strip method now let us see the second example in which we will strip from the left side only okay so here it is import pandas now I'll take the data from here and print it paste now we have trailing and leading specific characters strip from the left okay now I'll use print series.
str.lstrip okay within this mention what you want to remove okay let me add \n \t here also now I just want to remove the following only \n \t from the left side save all right-click run now the exclamation mark \n and \t will be removed from the left side not from the right side you can see the output here it is we have removed them from the left side we achieved this using the lstrip method now the rstrip method with which we will remove white space as well as specific characters and new lines from the right side in a pandas series or dataframe so let's use the rstrip method here is our example for rstrip it will remove from the right side import pandas as pd we have imported our library now add the data okay we will keep the same example and we have created a series using the following data now let's say I'll write remove characters from the right side okay that's it within the brackets mention what you want to remove I'll do the same I want to remove this and this and also this from the right side only using rstrip file save all right-click run demo3 you can see we have removed from the right side we have removed this this and this you can check here on the right side in this lesson we saw how we can remove white space and characters from the left and right side we focused on all three of these built-in functions of python pandas

in this lesson we will learn how to group the data in a dataframe after grouping the data we will perform operations on it first we will work on the groups concept then we will perform aggregation operations on it so here we will first split the data into groups then iterate over them view the groups and perform aggregation operations on groups like getting the mean of the group data let's see we will begin with the first example to split the object and combine the result in the first example we will use the groupby method to split the object we will group the rows or columns 
into specific groups so in this coding example we have three columns player rank and year and we will group by the player column let's see the example we are using the PyCharm IDE okay PyCharm has a free and open source Community Edition we are using the same now let us create a new project go to file new project here and add the name of the project you can add any name okay the following is the path of our project click create our project got created here it is visible now let us create a new file right-click new python file name the file I have named it demo1 you can add any name python file is selected by default so it will automatically add the .py extension press enter we have created our first file we have five more examples so I'll create all the files right-click new python file demo2 demo3 demo4 demo5 and demo6 okay here are the six files and now let us add our code and run it first I'll add the comments okay we have added the comments we have five examples in total so I'll just delete the extra file right-click delete okay now start with the first example import pandas now let us create the data we will add this to the dataframe let's say I'll add an object named data add the first column player comma the second column rank and the third column year okay that's it we have created our data now create the dataframe from the data we have added the data to the dataframe now display the dataframe okay Cricket player records now we will group the data we will group the data on the player column okay for that create a resultant object that is res here dataframe.
groupby method and set the column in it now you can display the first entry using the first method after grouping res.first() this will display the first non-null entry of each column for every group okay go to file save all run now we have displayed the dataframe first okay and the dataframe is visible and the first non-null entry is visible guys we saw how to use the groupby method now we will iterate over the groups using the for loop okay we will iterate through the player groups one by one here is the second example iterate the group import pandas now we can take the data from here okay now paste it we have our data here with three columns and six records we have added it to the dataframe and we have printed it now what we are doing is group by player we did this in the last example also let me add a new object dataframe.groupby and in the brackets add player we have grouped by player now iterate for name comma group in your output that is groupRes the object in which we grouped add name here and print the name and then the group one by one let's see the output file save all right-click run demo2 first the dataframe is displayed then the players are displayed one by one so there were two records with the name Amit and David had only a single record and John had two also Steve had a single record so we grouped it and iterated one by one using the for loop so guys we used the for loop to iterate through the player groups we created the groups using the groupby method in the next example we will view the groups using the groups property let us see the example view the group import pandas create an alias add the data paste it here we have three columns I told you before the data is added here to create a dataframe three columns and six records and we have displayed the player records now we will group by player and view the groups for that you can directly mention it like this df.groupby so we have grouped it by the player 
.groups that's it let's see what is visible go to file save all right-click run here we have shown the dataframe and we have grouped it like this we have viewed the groups so it has shown the index values in the brackets zero and two are Amit David is three John is 1 and 5 that is the following 1 comma 5 and Steve is four okay so this is how we can view the groups guys we saw how we can use the groups property to view the groups now we will perform aggregation operations on groups using the agg method we can get the mean or even get the size of each group using these operations so we will see two examples first we will get the mean of the group data and in the fifth example we will get the size of each group okay let us see first we will get the mean of the group data for that first group the data and then use the agg method with a mean function so here we will use the numpy mean function let's see the fourth example get the mean of the group data import pandas okay now get the data let's say we are taking this data we will also add additional data copy paste now we have data with three columns let me add one more column let's say points I've added the points we have printed the dataframe first we have created the dataframe using the DataFrame method and added the data then we have displayed it now use groupby to group let me add a new object and group by let's say the year column this time now use the agg method to perform aggregation use groupRes that is our object for the result within that mention points we are getting the average of points so it should be points the exact column name here it is .agg() in the brackets mention numpy.mean that is np.
mean now for this import numpy as np okay we already installed numpy if you remember pandas is built on top of numpy so before installing pandas we installed numpy and created an alias np and we have performed the aggregation you can mention a label here mean file save all run demo4 here it is okay Cricket player records we displayed the dataframe and after that the mean is visible okay here is the mean what I can do is add some more values to the year okay that means let's say I'll set it to 23 now I can right-click and run okay now it's fine we have displayed the mean in this way guys we can perform aggregation operations we will now perform the next example we will see how to get the size of each group with aggregation okay we will group the data using the groupby method like we did before okay and then we will use the numpy size function to get the size of each group let's see here it is get the size of each group import pandas as pd okay we have imported it now get the data set right-click copy right-click paste okay here is the data with four columns and we have printed the data inside a dataframe that is we have created a dataframe using the DataFrame method and we have displayed it now you can group the data create a new object let's say dataframe.
groupby within that mention the column by which you want to group that is player in this case aggregation is performed using agg and numpy.size returns the size of each group let me print now group.agg in the brackets mention numpy.size but we haven't imported numpy yet so let's say I'll import it here numpy as np we already installed numpy if you remember pandas is built on top of numpy so we installed numpy before installing pandas in the previous lectures we have created an alias here for numpy and used it to get the size that's it okay go to file save all right-click run demo5 so what we did is we printed the dataframe first and then we displayed the size that is the size of each group we have two records with the name Amit and two records with the name John that's why it's visible like this okay so we have displayed the size of each group using numpy.size in this lesson we saw how we can group the data using these examples we grouped the data using the groupby method then we performed all these operations including the aggregation operations as well

in this lesson we will learn about statistics operations in Python pandas for that we'll be using the statistical functions these are the built-in statistical functions provided by python pandas you can easily apply these to a pandas series or dataframe okay so here are the built-in functions for example if you want the sum of the values use sum if you want the count use the count method max and min for getting the maximum and minimum values respectively to get the mean use the mean method for the median of the values you can use the median method std is for standard deviation and describe returns the summary statistics for each column we will work through these statistical functions one by one let us start with the first function that is the 
sum function as the name suggests sum is used to return the sum of the values let us begin with the first example here is our first example we are using the PyCharm IDE PyCharm provides a free and open source version PyCharm Community okay so here it is let us create our first project go to file new project here and enter the name of the project let's say I added Pandas statistical functions and here is the location of the project click create to create the project we have created the project now we need to create python files let us create the first file right-click on the project new python file click on it add the name of the python file let's say I'll add demo1 when you press enter it will automatically add the .py extension because python file is selected by default press enter here we have created our first file you can check the exact path of the file we have seven more examples that is eight built-in functions in total so I'll create all the files quickly right-click new python file demo2 we have created all the eight files let us go to the first file let me add the comments also before beginning the program let us start with the sum method import the pandas library we can also add an alias to it so that we don't need to write pandas again and again and can directly mention pd now we have created an alias using the as keyword now let us create a data set okay you can name it data or since I'll be adding marks of students let me name it marks now let me add the marks of let's say the math subject I'll add the marks of six students comma let me add the marks of another subject science let's say and now English okay we have added the marks of students for the following subjects okay now let us create the dataframe we will create the dataframe using the DataFrame method pd.DataFrame in the brackets mention your data that is the following marks now we have created the dataframe display the dataframe and print 
it now we will display the sum of marks in each column for that use the sum function that is dataframe that is df.sum() that's it we can also mention a text here sum okay go to file save all right-click run demo1 here we will see the sum of marks we have the following dataframe with the marks of students in math science and English six records and here is the sum of each column the sum for math is calculated like this 90 plus 85 plus 98 plus 80 plus 55 plus 78 and in the same way the sums for science and English marks are calculated we saw how to work with the sum method to get the sum of the values the next example is to get the count of non-empty values for that we'll be using the count method let us see the example second example count method in Python pandas import pandas and add an alias now mention your data okay let's say I'll take the same data and I'll make some changes now select it copy paste now let me add some empty values okay we have created our marks data and printed it we have added it to the dataframe and also added some None values we need to count the non-empty values okay that means for math it will be five because the sixth one is None let me display okay we are counting the non-empty values in each column print so dataframe.
count method count of non-empty values file save all right-click run here it is our dataframe math has five non-empty values here it is so five is visible in the same way for science and English okay so this is how we can work with the count method so guys we returned the count of non-empty values using the count method now return the maximum of the values using the max method if you need to get the maximum values let's say the maximum marks for example you can use the max method let us see the example max method in Python pandas import pandas as pd okay now add the data set so here we have the marks of students in math science and English we have added the marks to the dataframe using the DataFrame method and then we have printed it return the maximum of the values dataframe.max using the max method okay save all now right-click and the maximum marks for each subject will be visible right-click run now the maximum marks in mathematics was 98 in science it was 96 and in English it was 95 and the same is visible here we returned the maximum of the values using the max method we returned the maximum marks now the opposite return the minimum marks that is return the minimum of the values using the min method let us see the example min method okay import pandas as pd now let us add our data to the dataframe let me get the same okay we will add it to the data set we have added marks to the dataframe and created a dataframe marks in math science and English return that is display the minimum of marks in each column for that I told you use the dataframe.min method that's it okay minimum marks file save all right-click run demo4 here the marks for math science and English are visible the minimum is 55 here the minimum is 59 here and the minimum is 65 here and the same is returned by the min method here it is guys we returned the minimum of the values from a column using the min method now the next example here we 
will get the mean of the values okay using the mean method so we can get the mean of the marks using this method according to our data set let's see mean method in Python pandas import pandas add an alias now add your data here is our data and we have added it to the dataframe also using the DataFrame method okay three columns now we will get the average of mathematics marks science marks and English marks print dataframe.mean that's it okay go to file save all right-click run demo5 to get the mean here we have the mean the mean marks for mathematics are 81 for science it's 80 and for English it's 80 we calculated them using the mean method okay guys we returned the mean of the values using the mean method in Python pandas now we will get the median of the values for that we'll be using the median method let us see the example get the median import pandas here is the data right-click copy and paste it here and we will get the median of this data which we already added to the dataframe using the DataFrame method display the median of marks in each column what is a median it is the middle of a sorted series of values okay file save all right-click run columns math science English and these are the median values for all the three columns guys we returned the median of the values using the median method in Python pandas now in this example we will return the standard deviation of the values using the std method let's see the example here is the std method import the pandas library add the data we will add the same dataframe the marks of three subjects pasted here it is marks of three subjects and we added these marks to the dataframe and displayed the dataframe dataframe.
std printed get the standard deviation using the std method okay go to file right-click run demo7 and here we have the standard deviation columns math science English okay and here is the standard deviation for all three math science and English guys we returned the standard deviation of the values using the std method here is the last example the describe method if you want to return the summary of each column then use the describe method here it is describe method import pandas okay now add the data okay let me take the following data with some empty values paste okay we have the marks for math science and English and these are the columns with some empty values also now display the summary using the describe method go to file save all right-click run demo8 here is the summary of statistics okay it displayed all the statistical functions count mean standard deviation min max and the rest of them okay so this was the summary using the describe method we have used the describe method to return the summary statistics for each column in this lesson we saw how we can work with the statistical functions we worked on the following eight functions provided by python pandas

in this lesson we will learn how to plot in pandas to plot we will use the plot method and the Matplotlib library the Matplotlib library has a pyplot module which will be used for plotting and to display the figure in the end we will use the pyplot.
show method first let us install Matplotlib and run our first example that is how to plot a dataframe in pandas after that we will run the following examples to plot a histogram then a pie chart scatter plot and area plot let's begin with the first example how to plot a dataframe in pandas we will also install Matplotlib so we will run our programs on the PyCharm IDE PyCharm has a free and open source version that is Community Edition we are working on the same let us create a project go to file new project mention the name of the project let's say I'll mention Pandas plotting and here is the location of the project you can add any name to your project click create our project got created now let us add the python file right-click new python file add the name of your python file let's say I'll add demo1 when I press enter it will automatically add the .py extension because python file is selected by default press enter here it is when you keep the mouse cursor on it it will display the location of the file okay if you remember we have five examples so for that I'll create more files right-click new python file okay now let us add the comment here is the comment let us plot a dataframe now I told you to plot a dataframe we also need Matplotlib so let me first import pandas so we already have pandas we also need to install Matplotlib you can directly mention matplotlib here and when it shows this you can just keep the mouse cursor on it and install Matplotlib or go to file settings go to your project our project was Pandas plotting right go to Project interpreter and just type matplotlib and click install it will install now we have installed Matplotlib close and now it will be visible here also okay here it is Matplotlib click okay now there won't be any error you can see now I told you we need to use pyplot okay so I'll use matplotlib.pyplot it is 
the module and we will create an alias let's say plt so that we don't need to write this again and again now let us add the data set I'll add a sample data set let me add the temperature values now for wind now for precipitation this is the sample data now we have four columns you can say in the data set okay now directly create a dataframe object and create your dataframe using pd.DataFrame so this is the method to create a dataframe now I told you before to plot we will use dataframe.plot and then plt.show to display file save all right-click run demo1 here it is we have displayed our first figure and these are the legends if you want you can save the figure from here okay let me mention a name that's it save close it here it is here is the figure okay so in this way guys we can plot a dataframe in pandas we also used Matplotlib and the pyplot module guys we've worked on the first example to plot a dataframe now let us move to the second example to plot a histogram a histogram is basically a graphical representation of a frequency distribution we will create a histogram for that we will use the plot method on the same dataframe we will mention the column for which we need to prepare the histogram and under the kind argument we will mention hist to create a histogram let's see the example import pandas okay then I told you we need to import Matplotlib we installed it in the previous example so it is installed for the entire project create an alias plt for the Matplotlib pyplot module add a data set for the dataframe I'll take it from here paste so here is our data then we created a dataframe now we will plot a histogram on the basis of the humidity column .plot and within this you need to mention hist to prepare a histogram that's it now display the figure using the show method I told you that is plt.show what is plt it is an alias for our pyplot 
module. Here it is, plt. That's it. File, Save All, right-click, run demo2. Here we have our demo2 output based on the humidity values; that means we have prepared a histogram. We created a histogram using the hist value under the plot method.

Now let us see how we can display a pie chart. A pie chart is basically used to display data in a pictorial form that is divided into slices. You must have seen a pie chart while watching a cricket match, in which it is shown which area of the ground got what percentage of the runs; that is represented using a pie chart. Okay, we will use the plot.pie method to achieve this, and we will draw it on the basis of the humidity column in our data set. Let's see the example.

Demo 3. Import pandas first, because we will create a pandas dataframe, and also create an alias. Import Matplotlib and the pyplot module, because we will use this library, and also create an alias. Now enter the data set; let's say I'll take the complete data set from here, copy and paste. Okay, so here is the data set. Prepare the pie chart on the basis of the humidity column. For that, okay, we need to use the dataframe.plot.pie method and set the y axis to humidity. You also need to set the index here, to display in which city the humidity level had that particular value, so I'll just mention the cities. Since we have 10 values, I'll add 10 names. Okay, so we have added the index labels; these are basically index labels. We have seen index examples in the previous lectures also, in which we used the index argument to create indexes. We are doing the same here, and it gets created using the pandas dataframe. Now we have created a pie chart using the dataframe.plot.pie method.
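
The pie chart construction described above can be sketched roughly as follows. The numbers and the ten city labels are hypothetical placeholders, since the transcript does not show the actual values; the forced Agg backend is also an assumption, added only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove for a normal desktop run
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data (10 rows); the values used in the video are not shown.
data = {
    "Temperature": [30, 32, 28, 35, 31, 29, 33, 27, 34, 26],
    "Humidity": [60, 55, 70, 45, 65, 72, 50, 80, 48, 75],
}

# Hypothetical index labels: one name per row, as described in the video.
cities = ["City A", "City B", "City C", "City D", "City E",
          "City F", "City G", "City H", "City I", "City J"]

# The index argument supplies the labels that appear on the slices.
df = pd.DataFrame(data, index=cities)

# Draw one slice per row, sized by the Humidity column.
ax = df.plot.pie(y="Humidity")
```

Each slice is labelled with the corresponding index entry, which is why the index is set when the dataframe is created.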
Now, how to display it? Use pyplot, okay, matplotlib.pyplot, so just mention its show method. That's it. Go to File, Save All, right-click, run demo3. The humidity levels according to cities will be visible. Here it is, it's visible. Maximize, and here it is: okay, the pie chart is visible with all the correct index labels. We created a pie chart using the plot.pie method.

Now we will create a scatter plot. Each value is basically represented by a dot, so if you want to display the relationship between two variables, you can use a scatter plot. Similarly, we will use the kind argument and set it to scatter, so that we can easily create a scatter plot. Here you need to set both the x and y axes; we will set temperature and humidity respectively for the x and y axes from our data set. So let us see the example.

Prepare a scatter plot. Okay, import pandas first and create an alias; this is for the dataframe. Then import Matplotlib and also create an alias; we have created plt as the alias. Now create your data set. Okay, this is fine without the axis. So we have again created the complete data set. We'll be using only two columns from here, but still we have created it. Now, to plot, I've told you before that I need to use the plot method, and within that mention the kind parameter as scatter to create a scatter plot. You also need to mention the x and y axes: mention the x axis as temperature and the y axis as humidity. Okay, and display the figure using the Matplotlib module, that means the alias. Go to File, Save All, right-click and run the demo. Here it is: we have created a scatter plot on the basis of temperature and humidity. This basically shows the relationship between two variables, humidity and temperature.

Now we will see how to create an area plot. For that, we'll be using the plot.
area method. It is basically used to display quantitative data visually. Okay, so you can see an area plot as an area filled with colors or textures, that is, specifically the area between the axis and the line. Okay, so we will use the plot.area method. Let's see: plot an area plot. Import pandas for the dataframe and create an alias. Import Matplotlib and its pyplot module. Now let me take the data set. Okay, here it is: here is the data set, and we have created a dataframe. Now we will use the complete data from the data set to create an area plot: dataframe.plot.area, that's it, and display the figure. plt was the alias for our pyplot module under Matplotlib, so we are using the same. File, Save All, right-click, run. Okay, it is showing an error, because this is not a method. This is fine now. Now right-click, run: no error, and here is our area plot. I told you it is filled with colors or textures, and the same is visible. Temperature, humidity, wind and precipitation are all visible.

So in this way, guys, we can create an area plot using the plot.area method. Guys, we saw how to work with plotting in pandas. We first started with plotting a dataframe, and then we saw these four examples to plot a histogram, a pie chart, a scatter plot and an area plot. We used both the pandas and Matplotlib libraries.
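
As a recap, here is a minimal sketch of the dataframe plot, histogram, scatter plot and area plot calls covered in this lesson. The column values are hypothetical, since the transcript does not show the real data. Placing all four plots on one 2x2 figure and forcing the Agg backend are assumptions made to keep the sketch short and runnable without a display; in the video each example lives in its own file and ends with plt.show().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove for a normal desktop run
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data with the four columns named in the video.
data = {
    "Temperature": [30, 32, 28, 35, 31, 29, 33, 27, 34, 26],
    "Humidity": [60, 55, 70, 45, 65, 72, 50, 80, 48, 75],
    "Wind": [10, 12, 8, 15, 9, 11, 14, 7, 13, 6],
    "Precipitation": [0, 2, 5, 0, 1, 6, 0, 8, 0, 4],
}
df = pd.DataFrame(data)

# One figure with four axes (a layout choice for this sketch, not from the video).
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# 1. Plot the whole dataframe: one line per column, with a legend.
ax_line = df.plot(ax=axes[0, 0])

# 2. Histogram of a single column via the kind argument.
ax_hist = df["Humidity"].plot(kind="hist", ax=axes[0, 1])

# 3. Scatter plot: both the x and y columns must be named.
ax_scatter = df.plot(kind="scatter", x="Temperature", y="Humidity", ax=axes[1, 0])

# 4. Area plot of all columns (stacked by default, so values should not mix signs).
ax_area = df.plot.area(ax=axes[1, 1])

fig.tight_layout()
# plt.show()  # with an interactive backend, this opens the figure window as in the video
```

Passing ax explicitly keeps each plot on its own axes; without it, a Series plot such as the histogram is drawn on whatever axes are currently active.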