hey guys welcome to today's session on data analytics using Excel now data is everywhere right from texting someone to buying something online or even posting something on social media every single thing involves creation of massive amounts of data every minute every industry be it Healthcare e-commerce technology or Finance has data at its Center of operations so now it becomes increasingly important to analyze this data so that we can find some meaningful Trends and opportunities for the future data analytics in the recent times has been the most sought after domain to make sure you are ahead of this competition we here at Great learning have brought this wonderful course so that you can begin your data analytics journey and land your dream [Music] job if you haven't subscribed to our Channel yet we request you to hit the Subscribe button and turn on the notification Bell this is done to make sure you don't miss out on any of the new updates or video releases of grade learning and of course if you enjoy this video show us some love do like the video knowledge increases by sharing so make sure you share this video to your friends family and whoever can make use of it and at the end of it make sure to comment on this video if you have any queries and suggestions and I would be more than happy to respond to all of your comments before moving on let's talk about our learning objectives so we'll start by discussing three different phases of a data analytics project and have a bird's eye view on each one of them then we'll discuss about the importance of data cleaning and the different ways in which we can clean data next we'll talk about the basic functions and formulas that are a must to know when it comes to data analysis furthermore we'll be talking about different ways in which we can sort and filter data in Excel next we'll learn about two different types of lookup functions vertical lookup and horizontal lookup functions then we'll know what is conditional formatting and also how can we make color scales data bars and also highlight cells then we'll know about data validation how to create drop- down list and fix length in a Cell then we'll know about what are pivot tables what is the the importance of pivot tables and how can we use them to analyze data in the end we'll talk about data visualization using Excel we'll learn how to make different types of charts like tree charts scatter maps map charts Etc phases of a data analytics project now the journey of raw data from to meaningful data can be categorized into three phases the First Data cleaning second is data analysis and the third is data visualization now for dat data cleaning our first step would be to set up a problem statement that would be aimed in solving a specific business problem of your client now what is this problem statement for example a phone manufacturer would like to know how many customers actually bought their product while seeing an ad on YouTube now this would involve data the Second Step would be to perform an initial analysis on data so we need to check whether the data is complete consistent and accurate next we'll check for any misspellings in typos so the raw data can consist a lot of errors and mistakes that need to be removed as soon as possible so we would remove any misspellings typos or even duplicates row can be duplicated columns can get duplicated so we need to remove that as well last is checking for bias now what is a bias for example let's assume that your data set is drived from a survey conducted by a food company to know how good is their food so one of the questions in that column was how good was the food so this statement is inaccurate because it contains the word good now which is biased towards the positive side of the company and would not give them a clear picture of how their product is which is why it becomes essentially important to remove these kinds of bias as as soon as we get the data next phase is data analysis so this is the core of your operations since your data is already cleaned and organized in this phase you will be playing around with data to find some meaningful insights so you'll start by deeply understanding the problem statement finding appropriate relationships between data sets and also adjusting and formatting data by using certain methods and techniques like lookup functions conditional formatting data Val validations and using pivot tables going forward we'll look each one of them in depth next phase is data visualization now Studies have shown that human beings are more adaptive in understanding via visual representation now data visualization provides your analysis to be delivered to your clients and stakeholders in the most efficient manner so that they can make well informed and thoughtful decisions and also find some greater underlying Trends and opportunities for the future so in this phase we'll be learning about different types of charts like Scatter Plots tree charts map charts bar charts and many more next up is data cleaning let's assume that you are given a data analytics project now as a data analyst your first objective is to make sure your data set is out of any irregularities misspellings or typos consider it like a bowl of salad that you are preparing the ingredients would be salt pepper fresh vegetables and some sauces but now if the vegetables are not fresh your final product would be dull and unappetizing which you won't even eat which is why it becomes essentially important that your effort does not go in vain same is the case with the data analytics Journey if you do not incorporate data cleaning process then your final result would be hampered and even consisting of a lot of irregularities that your stakeholders won't even like which is why data cleaning plays essential role to make sure your end results would be in line and uniform with the stakeholders expectations moving on we look at some of the ways in which we can clean data now Microsoft Excel has been the most dominant platform available in market for decades now and it still stands at number one so we'll do all our analysis henceforth in Excel so giving you a basic tour so this space is called a spreadsheet space then we have different tabs on the top right so these are known as ribbons so you can perform various activities choosing these so this is a data set verticals are called as columns horizontals are called as rows the first row is called a header or a column heading so this is the heading of different columns that we have here so now we have this raw data with us and we need to clean it so in order to do that we'll perform certain task so the first one being we need to check if there are any duplicates present so I'll look around it but I don't think so I see any but uh since it's very confusing this way we'll Excel already has one function called removing duplicates so we'll click on data here and then we'll click on remove duplicates so it will automatically select the entire data set and you can also check which all columns do you want in our case we'll Click on each one of them so that entire data set is selected then we'll click on okay so our naked eye could not see that there were duplicates but there were so there were two duplicates that got removed and there are 2 three unique values left so we'll click on okay so the duplicates have already removed now this is a very small data set uh for understanding purposes but while you will analysis data you would not get this small data set ever so then certain things like remove duplicates becomes really important to know next up we'll just glance through it and we'll see if there are any irregularities present so I don't see any but here we go so in this 14th tab we see years of experience as 90 so now let's just think so 90 is not feasible value as year of experience so it would be a mistake on the client's part or from where you received the data set so you need to cross examine it so whether it is 9 or 19 or something like that so you need to check with you should not make any changes to it you should ask your client first that what is the correct value and then make changes and also while you're doing this changes you should also maintain a log book or you should just create a duplicate of the raw data just in case if your data cleaning you miss certain values so that you can recover it later okay so another thing is I would arrange my salary employe column from let's say lowest to high highest so this is from low to highest this is s function so lowest to highest I see that 1 lakh 1,000 7002 here employee of the salary 999,000 is below it so I don't think so that this was correct so there's a gap in between so considering all other salaries they are in five digits one is in six digits so it can be a possibility that is this empty space is by mistakes we'll remove that empty space and click okay right then we'll again click on this and do sort so this gets sorted but again you need to check with your client that your the removed space was correct or not so these are certain things that can be an issue when you analyze it in the later stage so it becomes essentially important to correct everything in advance at the first phase itself so next we move on to functions what are functions functions are predefined instructions or formulas that perform a specific calculation in a particular order in a spreadsheet so now what is this instruction or formula that we are talking about so it is basically in mathematical terms a calculation that is performed for a specific purpose and that can be anything for example it can be addition subtraction multiplication or even finding an average in order to perform these if you do it manually it would take a lot of time and effort you physically have to put the plus operator between every single digit to perform a sum so in that case Microsoft Excel already has some predefined formulas so that you can save your time and effort now let's talk about how do we write functions so basic syntax of function would comprise of three major components the first is start of a function so in order for Excel to know that you are inputting a function it needs to start with an equality operator then we need to Define what kind of calculation we need to perform so it can be some Maximum minimum average or anything that we want we need to input that after equality then comes the range of Cs so after uh the parenthesis we would input the range of sales so range what it is let's look at this example in which it's written E15 and E20 now E15 would be the first cell and E20 would be the last cell for the data set we want to do that specific calculation so in this case 50220 would be E15 5644 would be E20 Now by clicking on the first cell and dragging it towards the last cell will give me the range of the cell then we close the parenthesis and then press enter so it would give us the average of this data set now coming to what are the major types of functions that are required in data analytics process so that would be five different types first one being count if length Left Right concatenate and trim so these five operators are mostly used to analyze large chunks of data moving on we'll look at each one of them in depth so the first one is countif so countif is a function that counts the number of cell that matches a specific condition so it its syntax would be equals to the start of the function then count if then the range in which we want count if to operate and the condition so what is the value that we are searching for for example in a class of around 20 students let's say we have three students whose name is abishek so and we need to find how many times the word Abhishek has repeated so we'll use countif function for that we'll start off with equality operator then we'll write count F open parenthesis and the range of the function so in our case the range would be the total number of students comma the condition so condition is abishek close parenthesis now the output would be the total number of times abishek has been repeated in the entire data set this saves us a lot of time and effort now let's move on to excel platform and explore count if little more so this is a spreadsheet which contains a data set on movie titles and the language in which they are made so our objective is to find the total number of movies in English French and Italian so for that we'll use count if function so we'll write equals to count if open parenthesis our range so our range is a language column so from Germany we'll drag all the way through Italian comma inverted commas criteria so we need to find how many movies are made in English so we'll write English e n g l i s h close inverted commas close parenthesis and press enter so now we got total movies that are made in English are five similarly for French we'll do count if equals to count if open parenthesis the range of the function so the range would again be the same from German to Italian comma criteria French F yes close inverted commas close parenthesis press enter so French movies are also five and we can count that as well 1 2 3 4 and five yes it's correct similarly for Italian count if open parenthesis the range Germany to Italian Germany to Italian comma open inverted commas Italian and press enter so total movies made in Italian is six so this is how count if reduces our time and effort for long data say this is comparatively very short data set but in cases of long data set it saves us a lot of time and effort next is length function so length is a function that Returns the length of a text string now let's assume that a cell contains a word India now India has total number of five characters i n d i a so now this functions would return be the total number of characters in that particular cell so in this case it is five so the syntax is equality L en Len in parenthesis the cell in which we want the function to act next we'll look at another function called left and right function so similar to length function left and right function returns specific number of characters but from left and right side in a Cell so it will tell the total number of characters from the left or the right side so it syntax is equality left or right in parenthesis the cell in which we want the function to act comma number of characters from each side now let's look at these two examples in Excel so here we have the same spreadsheet of movie title and language now we'll know how do we use it so we'll put equality sign and we'll write l n Len open parenthesis we'll select the language cell German and then we'll close the parenthesis and press enter so now German has total six characters G E R M an so it will give me exactly the total number of characters in a particular cell we'll try one more example uh let's say we go for this so we'll type here equality L en Len open parenthesis we'll click on cell A2 and close parenthesis enter so total number of characters in Atlantis is 8 so we got the number eight let's just say one more example so I want Miss Jerry total number of characters in Miss Jerry so if we count manually it is 1 2 3 4 5 6 7 8 9 nine characters let's see what this gives I'll type equality alen Len open parenthesis click on cell A2 close parenthesis press enter now here seems to be an error that it gives 10 characters now we need to know that length function also counts the space in between so this is a point to be noted next we'll look at left and right function so now this is a spreadsheet in which we have the name of states and the counties with their PIN codes now let's just say I want to extract the PIN codes from this so I'll first type inin code that I need to extract now I can see that PIN codes of are of only six digits so I can use left function to extract this so I'll write equals to lft left open parenthesis the cell B2 comma since pin code is of six digits so I'll write six and close parenthesis press enter therefore I get the PIN code of each County so if I click on this plus sign over the right bottom corner and drag all the way through the spreadsheet I'll get the PIN codes of all the counties present so now I don't have to manually write everything if I need the PIN codes of these so this makes our work easier next we look into concatenate function so this function combines multiple teex cells into a single cell so for example you have your first name and last name now you want a cell in which you it should be full name so this concatenate function helps join two different cell into a single cell so it syntax is equality concatenate open parenthesis cell one comma cell 2 closed parenthesis next is trim function so trim function helps remove extra unwanted space from the cell so a lot of times in raw data there's lot of unwanted space within the cell which is why it could hinder your data analysis going forward so trim function syntax is equality trim open parenthesis cell one close parenthesis now we'll see these two functions on Excel so as we can see this spreadsheet has first name and last name written and we want full name so in order to use concatenate function we'll use equality we'll type in concatenate open parenthesis first cell comma second cell close parenthesis and enter so now it gives me a full name but as we can see there is no Gap in between there's no space in between but what if we want space so for that we'll use equality concatenate open parenthesis cell one comma Now in inverted commas we'll put a space which will tell Excel that yes you want a space in between comma last name close parenthesis and enter so now it gives you space in between your full name now if I hover my cursor towards the right bottom and drag all the way through so it will give me the same operation in the entire sheet so this is about concatenate next we move on to trim so this is the same movie the title spreadsheet now we can see there's a lot of unwanted space here here here and here so in order to remove that I'll use a trim function so I'll create an extra column now here I'll write equality t r IM trim open this cell close enter now I'll drag I'll hover over the right bottom and drag all the way through so now we can see all the errors here here here all the cells that had empty spaces at the start are removed next is sort and filter functions so what is sorting and filtering sorting is basically rearranging your data in the in your spreadsheet in the way you want so it can be ascending or descending you can also Define your own set of rules by using custom sort option now let's look at these examples in Microsoft Excel so here we have a spreadsheet of employee ID birth date age city name Department job title and gender so now we need to sort this data according to whatever we prefer so let's say I want the city name to be in ascending order so I'll click on any of the sales in this column and click go to the data ribbon and click on lowest to highest so it will give me the names in ascending order alphabetically now we if I want it to be in the reverse order reverse alphabetical order so I'll get that as well now if I want my city name to start from descending order from uh reverse alphabetically and also I want my birth date to start from the oldest to the newest no now how would we do that so we'll go to sort option here so it will open a tab so as we can see it automatically selects the entire spreadsheet now in this we can put a preference in the way we want so our first preference is to make sure the city name is in uh reverse alphabetical order so I'll put in sort by city name we can choose any other but we want city name sort on now you can choose cell color font color cell values Etc but here I want cell Valu so I'll click on here next I want what type of order I want so in this case I want reverse alphabetical so I'll go from Z to A right next I also want that my birth date should start from the oldest to the newest now I'll add a level and then my second preference would be birth date again we need to do it by cell values because I haven't created any other cell color or font color so I'll go with cell values next I'll do it newest to oldest or oldest to newest I want oldest to newest so I click on that and I'll press okay so now the city name are in Reverse alphabetical order as we can see and the birth date are from the oldest to the newest so we can see that here as well from starting from 1942 going ahead till the last section of 1994 so here this is the use of sort function now if I need to sort by different color sets so let's take an example of that as well so let's say I want my department to be color coded so I'll put produce as let's say which color click on H ribbon I'll choose yellow then I want the meat stab to be green next I'll use the bakery as red as red now I'll go again I'll click here now I'll go again to data and sort now I want I'll clear all of these now I'll add a level my first preference here would be to code it by Department name next I have color coded it so I'll click on sell color okay now Excel automatically detects what all color I have chosen so I have created three colors red green and yellow so it shows all three of them here now I can click on any one of them that I want to be at the top so I click on red on top on bottom I'll choose on top then click okay so the red color coded scheme gets on the top of my column next if I want to create I want green to come up next so what will I do I'll go again go to the sort tab now I'll create my second preference then by again Department name cell color this time I'll choose green on top okay so after red my second preference is green so and so on and so forth I can do multiple color sets and sort accordingly next we'll know about filter fil functions so filter functions basically Narrows down the way you want your specific data to be viewed so let's say in this example this is a movie set in which we have the title ear Jer language director and average ratings if it has got now I only want to see let's say English movies what all movies are in English so now sorting It Out by English would be one option but to make it more easier I'll click on this filter tab so it will automatically create a drop- down list in which we'll have different types of values that are stored in that particular column since we need only English we'll unselect everything else and click on English and hit okay now we'll only get the movies that are in English language and nothing else similarly if he wants any other language let's say French so we'll unclick English click on French and press okay now we'll only get French movies so this makes our job little bit easier on what we want to do with this data next we look into lookup functions the first one is V lookup or vertical lookup so the function that searches for a certain value in a column to return a corresponding piece of information of another column in the same data set so I'll explain this with an example let's say that you are working in a firm that is currently restructuring and you are assigned with this task to arrange the different salaries of employees according to their departments now you the data set that you are given only only has employee IDs and the corresponding salaries now your task would be number one to find who all employees work in what all departments and number two extracting those salaries from that employee IDs so in order to extract that from the employee ID if you go one by one manually it would take you a lot of time but this can be cut short through lookup functions so let's see how that works first we look at the syntax of this so the syntax would be equality V lookup open parenthesis now V lookup functions have four different parameters the first one is look of value second is table array third is column index number and the fourth is true and false we'll know what that is 1 by one so the first one is the lookup value so lookup value would be the thing that we already know so in our case it would be employe ID second is table array so this would be selection of entire data set third would be column index number now what it is let's see so let's say that in the data set you have two columns number one is your employee ID and number two is the salary okay so in order to extract the salary that would be in column two so you would put the column index number as two and true and false so true is if you want a approximate match and false is if you want an exact match of it so now let's look at we look functions in Microsoft Excel now we have a spreadsheet which contains employee IDs years of experience and salary of the employee now we need to find the salary of employee and we know what all employee ID is there so employee ID would be the thing that we already know and to find is salary employee now let's say that I want to find uh a salary of this cell 2161 so I'll type here 216 61 and I want to know the corresponding piece of salary so that corresponding piece of salary is 9014 90140 so now with looka functions we want if we just put the employee ID the corresponding piece of Sal salary should automatically come right so we'll use V lookup in that case so in the salary column I'll use the syntax equality V look up open parenthesis now the first is the lookup value so our lookup value would be employee ID so I will type employee ID here so I'll click on this cell comma second is a table array so our table array is the entire data set so I'll select the entire data set next is I'll use the column index number so since the salary employees is is in column number three so I'll put as three comma Now I want now it says that you want an approximate maths or exact math so I would go with an exact math so I'll write false false close parenthesis and enter so it says na since I have not input any employee ID so let's cross check it so let's say 21 2126 right so I'll type here 21 26 and press press enter so now it gives me 49,942 Let's cross check so the employee ID 2126 has a salary of 49,942 absolutely correct so this is the way in which we use look up functions and we can type any other thing let's check of uh let's say 3773 and press enter 9,29 let's see 73773 9,29 is it's absolutely correct so in this way we can use it we can use it for years of experience as well so let's say I want to type experience exp exp so I can use the same function we look up open parenthesis look up value is again I'll put the employee ID to find the years of experience comma we'll select the entire data set as a table array comma so years of experience is in colum number two so I'll put two comma I want an exact math so I'll write false or instead of false I can also put zero that is also exact match so I'll put zero and close parenthesis and press enter so now it says that 3773 with a salary of 9,29 has an experience of 5.4 Let's cross check check it 3773 99029 has the Sal has the experience of 5.4 years so this way we can use lookup functions and it is very efficient to use in longer data sets as well next we'll talk about H lookup so similar to V lookup H lookup is the same except instead of column it searches row here so rest Remains the Same the certain value in a row would return a corresponding piece of information and uh the syntax of it also Remains the Same but instead of V we have H over here and also uh the column we used to search in V lookup here in this case in hookup we look at horizontal rows so the column would be replaced by row so let's look at the example in Microsoft Excel and know more about it so this is a spreadsheet here we have a table but now instead of columns we have rows what does that mean so generally we used to have column headers over here but now in this case these are horizontal headers or row headers we can call so the table is horizontally now similar to V lookup we'll know the price of different office supplies using its lookup so this is the cell in which we need to find we'll we'll type the values the known value the known name like stapler pins sheets pens erasers or finders so we'll get the price of it automatically so let's click on this price tab press equals to H look up H look up parenthesis open now it says lookup value so our lookup value is this cell comma our table array is the entire tables range comma row index number now instead of column index number we have row index number so our first row would be this second would be this third this and fourth this so we we want the price of it which is in the fourth one so we'll write here four comma now we need a exact match so we'll type false or zero whatever suits you then parenthesis close and press enter so since we haven't typed anything it shows in a we'll write here pins press enter it shows me the price of pins right we'll look at one more example so let's say I put stapler press enter it gives me the price of it so similar to V lookup hookup also works exactly the same but in this case we have horizontal tables you would not get much example of this while working on data sets but uh still as an example you need to know that hookup also exists now let's just move into conditional formatting so what is conditional formatting as the name suggests conditional formatting is nothing but formatting of data sets using certain conditions input by the user so by giving certain conditions only the values of the cell that matches that conditions would be altered and the test would remain the same now you would ask what is the use of conditional formatting let's assume that you are a teacher in a 200 strength class now you want to figure out who all students achieved a grade of more than 90% in order to do so manually it would take a lot of time and effort on the same time if you use conditional formatting it would make your work lot easier and lot simpler saving you a lot of time and effort let's just look at some of the features that conditional formatting has so the first one is highlight cell rules so it highlights cells as per the selected conditions or rules so you can find anything that is greater than equal to a certain value that you want to input as a condition the second one is top and bottom rules so in this selected cells with the highest and the lowest value can be altered as per the selected condition the third one is datab bars now it applies a visual bar within each cell the length of the bar would be the magnitude of the value within that range the next is color scales so it is similar to datab baras but with color gradients with one color assigned to the lowest value and the second color assigned to the highest value the next is icon sets so instead of color scale it gives certain icons like increasing decreasing and Etc now let's just look at conditional formatting on the Microsoft Excel platform all right so in this spreadsheet we have a data set that describes up the number of companies different companies and the quarterly revenues of those companies so now we will do conditional formatting in this data set so our first objective is to find the companies with Revenue per quarter less than $40,000 so for that I'll select the data set now I will click on the home ribbon and conditional formatting so since I need it less than so I'll highlight sell rule options I'll choose then less than option now uh in this first colum I will type the value that I want to be less than so it would should be less than 40,000 so it would be less than 40,000 and then uh what characteristics I need so should it be a light red fill with dark text or yellow fill green fill or custom also we can choose but I'll go with green fill with light dark green text so I click on okay so now I have all the companies whose quarterly revenues less than $40,000 highlighted in green right so it becomes visually clear of how many compan IES are there so in order to do one more thing I can do uh now the second question is find the bottom 10 companies per quarter okay so quarterly I need to find what all companies have are in bottom 10 when it comes to revenue right so I'll click on quarter one I'll select the range click on conditional formatting okay let's first undo what we did for our first objective so I'll just so in this case I'll just select my first quarter and click on conditional formatting now top and bottom rules so I need bottom 10 items so I'll click on bottom 10 and I can highlight it in let's say yellow fill with dark yellow color I'll click so here it is already mentioned 10 right so for example if the question would be bottom five so I could just say five so now it will show me the bottom five companies in particular quarter right I'll click I'll go to 10 and click click okay so and the same thing I can do here now what happens if I select the entire data set and do bottom 10 companies so then in this case it will give me the bottom 10 companies of the entire data set but since the question asked per quarter so we have to individually select the column of quarter individual quarter to do our analysis okay so next we'll Explore More on this so so let's say conditional formatting what are datab bars okay now according to how large the value is in particular cell the gradient will be filled inside so let's just say I'm selecting little clear ones let's select the red ones now uh when we look at the values here $2,584 so it's a very small value as compared to the entire data set so it gives me a small bar inside the sale in case of these three companies in in their fourth quarter their values are comparatively High when compared to the entire data set so that is why it has a higher bar gradient per cell right so in this way we can distinguish as well now see this 999,000 is almost full because it is too high of a value relatively to the other sales uh here uh whereas this 11,000 it's quite low it's one of the lowest I believe so that is why it is slightly filled right so this is how we use data bars we can also use uh color scales so see let's see what it is first I'll just undo it now I'll see color scales so it will also give me different color scales or depending on uh how the values are the greater value so it is divided into basically three parts uh top half bottom half and this thing I can actually put colors I can put new rule actually so I need uh let's say two two color format right uh so I'll go for let's say three three color scale so now it will divide the entire data set into three parts and for the value we can choose so in this dialogue box we can see select a rule type so format a sell based on their values this is what we will click click on it it says uh what should be the minimum lowest values okay so yeah I'll go with percentile lowest 10 % uh will be red the around midpoint around 50% would be yellow and the highest values let's say percent highest 10% would be green I'll click on okay now it will tell tell me the close to the top ones would give me dark green the lowest one would give me red and to the midpoint would give me yellow so this is how we do our color scaling and we can also do lot of things like icon set so I'll just undo it so I'll just okay so in this case we can do icon sets so icon sets are basically different shapes and indicators that we can put we can use I'll just choose this one okay so it says hash it is because the column length is not enough so I'll just slide it a little bit this also or I can just double click so it gets bigger so now it gives me what values are so this is the midpoint so it shows a horizontal Arrow this is on the lower half side this is on the upper side so it gives me distinctions between the so basically the entire data set all this condition formatting is relative to one another so how many values in the entire data set would be so it would compare those values accordingly and set the symbols right so yeah we can clear rules also clear rules of okay data set also we can do clear rules of the entire sheet so it would go blank and uh also we can create new rules such as let's say format only values that are above and below certain things so above above format so I can give them different I can border I can fill colors in this so I'll say blue color I need to fill okay okay so the values that are above average would be highlighted in blue and the rest would not be affected so different rules we can set as well of whatever things we need so this is how conditional formatting works we can also find uh average values above average below average uh percent in percentage also and these all things we can do next is data validation so data validation is a very interesting tool which allows you to control what can or cannot be entered in your spreadsheet so data validation is a very important tool when it comes to data handling process like collecting information so let's say that you are rolling out a survey in which you want the user to fill only certain values in particular columns so data validation allows you to do that so let's see how do we manage data using data validation okay so here we have Microsoft excel in this spreadsheet we have a table in which we need to fill certain columns so let's say the first one is name so in this column let's assume I only want the name should be a maximum of 20 characters long so for that condition I need to choose this range click on data data validation so here the tabs appears so in that settings tab we need to fill the criteria that needs to be filled for the particular range so I'll click on since we need characters so I'll choose text length minimum character one maximum character uh let's say 12 and say okay now let's check this out now if I type a Word let's say sing so it gets filled yes filled but if I try to fill something that is large let's say uh abake Paul click on enter so now this says that the value does not match the data validation restrictions defined for this set so since this name is more than 12 characters long the input is invalid so now we'll retry and we'll put something a little less let's say we click enter now it satisfies the condition so in this way we can set certain conditions for this column similarly we can make a a drop-down list for gender column so in order to do that I'll again go to data validation colum press data validation now I will put allow as a list so now I need to select the source so I can just write here uh let's say mail comma female and then press okay so I only selected one column I did not select the range but I could select the entire range and do that but in this case I can choose it's male or female so it will automatically come here in whatever we choose so there are only two things we can choose we can't write anything here that would be invalid so in this way we can choose list now comes uh we can put whole number or decimal numbers in marks criteria so we can set the range from what number to what number we want in that particular column so let's do that as well let's see data validation allow only whole number or decimal so since Marks can be in decimal like 10.5 12.5 50.5 so we'll choose decimal for in between let's say 0 to 100 and then we'll press okay so in this case let's see 1 enter 10 enter 55.5 enter everything is being taken by Excel but if we choose anything beyond our range let's say 115 so it gives me an error that this value does not match the data restrictions so we'll retry and put 1.5 which satisfies in this way we can do it but let's say say if it's a survey and you're rolling out to the user how would the user know what values can we fill so the error me message should be customized right so in that case we can do a thing let's say data validation so this condition we set so we can put here error alert the title would be let's say in valid input input and uh error message would be please input values between one between 0o and 100 and press okay now we go beyond our limit and press enter so it would give us the error message that we customized so our title was invalid input and the description was please input values between 0 and 100 so in this way the user can know that what values we need to feed right or we can do one more thing so we can just do again data validation or we can just put an input message title uh let's say input input message only value ose between 0 and 100 and press okay so in this case if I go here so it will give me the input message that only values between 0 and 100 are allowed so the user would beforehand know what values we need to feed before getting an error message so in this way we can do that and we can also set definitely we can set our data validation time also we can set here date on time so I'll set date between let's say date of birth between 0 1 10 1998 2 0 1 uh 12 2,000 and press okay so now they can do it in the same way 01 1 2 1999 and it will be okay entered so in the same way we can do that now let's come to uh another most important concept so what if we need to create a drop- down list for for the subjects so the number subjects are these math English physics science physics Hindi and biology so if I need to make a list what was our method we used to write here maths comma English comma physics comma Hindi comma biology right so we'll click on okay so then it used to come this all the details that we can fail now let's say that the data set is quite big that you can't manually put every single name with comma in between if you want to make a list so in that case we can put the source as another data set link so let's see how to do that I'll just go back I'll click on this I'll select the range then I'll click on data validation I'll allow H it as list the source now I can actually select the source so I'll click on this I'll select all these and press enter so now the range is selected and I'll click on okay so I will I will automatically get what the uh secondary link that is generated the data set from there will be retrieved here so let's say physics or we can set anything maths but it has a disadvantage so in that case you cannot delete your primary data set so let's see if uh we delete what happens let's say I delete physics okay and now we see the drop- down list so now maths English blank Hindi biology so it will remain blank over here so that is why this is one of the drawbacks in which we can't delete a primary data source next is Pivot tables so what are pivot tables pivot tables are used to sort count Group total or reorganize stored data in a table so now you would ask me what is the difference between a normal table and a pivot table so pivot table can categorize rows and columns into different sections that you would need to do now in case of a normal table you cannot categorize into rows and columns you can but you would have to use sort and filter and then manually uncheck all the selected options and select the ones that you need but in case of pivot table it automatically summarizes your data in the way you want so it change it helps change rows into columns and columns into rows and also the pivot Fields options so you can group them and do Advanced calculations like uh sum average Maxima Minima and different other things as well so now we look at an example of how pivot tables help us so let's go to Microsoft Excel so now here we have a spreadsheet in which we have a data set so this data set is about different products being sold uh so in this case we have a product line defined so it it is classic cars motorcycles and a lot of other things so first of all this is a type of data set that you would encounter when you do data analytics so this is a classic example so there there would be a lot of columns that you would encounter and N number of rows so in this case let me check how many rows are there it is close to 3,00 2,800 close to 3,000 Ros so this is the type of data you'll be dealing with it can be much more it can be 10,000 Ros as well but now let's talk about the point pivot tables so here we have uh quantity mentioned order number mentioned price sales order date month in which the these were sold different years in which these were sold and uh the name of the customer their contact details address state county territory country postal code Etc so these are the things that are actually given to us now our purpose is to create a table in which we can find what are the different product lines that were stld in different months so an example of what is our uh objective I'll show you this type of pivot table so in this case we have the sales of different uh product lines in different months so this is how a t pivot table is so we'll make it together going forward but uh this is our objective of what we need to do different rows are mentioned in the column here in the rows here row headers and different mon different product line are mentioned in the column headers here and the grand total is given in the last column and one grand total is given in the row header so what is this now in January uh let's say classic cars were sold at 3 lakh 3,000 right now in this case the grand total of January of all product lines is given here now the grand total of all the different months of all the product lines are given over here right so this is how a pivot table looks like now let's build one together so this is again our raw data and now we need to create a pivot table out of it so we'll click on any cell in the spreadsheet and click on insert now now when we click on insert we have an option called pivot table so we'll click on pivot table here dialog box open now in this case we are asked certain questions so select a table or range now Excel automatically detects uh your table so as you can see in the data set you would have a running line right running box sort of thing here at the edges so it automatically selects all the the spreadsheet data set now it would ask me which where do I want the pivot table to be whether it is a new worksheet or existing worksheet now if I in case of new worksheet it will add another sheet uh to this Excel file and uh in existing worksheet it will add it beyond the last column so it will create a column on the right side beyond the last column so uh in this case we will select a new worksheet because it is convenient so while clicking on new worksheet now we'll get something like this so it will be a column it will ask to build a report choose the fields from the pivot table field race now pivot table Fields is this dialogue box so we have the the fields are basically the column headers so in this case our topmost line the column headers are our Fields so we we can choose whatever fields we want below it are four different columns so the first column would tell us uh what filters do we need so basically you can choose any filters that you want you can drag all the columns that you need below uh in this case it would be selected now for our objective we need to find uh number one the sales or in different months and of different product line so uh let's say so columns I can put any rows also I can put any according to what I need now the value column could only support numerical value so it only takes numerical values so for example we need sales so this is the most clear option so I'll just put sales over here now something appears over here we'll see what it is now this is the sum of sales of everything nothing is put on rows and columns so it would not show anything so let's say that I want my product line to be the column okay let's say row so so it will come over uh on the row headers the product line so these are the total categories we have classic cars motorcycles plane ships trains trucks vintage cars so these are the categories of product line we have in total in in the data set now if I need months right so I'll put it in columns so now it will give me the sum of sales in January of classic cars vintage cars shapes and everything and it is divided into months right now I can do different thing as uh we saw our objective is we need to create it in rows should be our months and product line should be our columns uh in our pivot table we'll just switch it so I'll put here I'll interchange it so now it becomes exactly the same that our objective was so the sum is mentioned everything is mentioned in here so we can make it more worthy if we add uh the currency symbol so it would be much more legible now let's take an example of how we are going to do it now if we would look closely it gives me months and total sales okay so let's go back in our data set and look at some details so uh in this case we have different months different categories of product line and different sales right but now if we look closely there is an ear ID as well so now we have different years like 2003 I can see four and five I can see so in this case while we made pivot table so our January would consist of three years combined together that would be 2003 4 and 5 so in order to distinguish in terms of years we'll just remove our month ID and we'll put in year ID so now we know in 2003 4 and five how many cars was what what was the worth of cars that were sold and the grand total of the cars sold in all the different product lines and also in total uh different years 2003 4 and 5 now let's say I want to know that in the January of 2003 how many cars were sold right so in that case what we need to do we need to put month ID below the row ID now it would distinguish first on the basis of year so as you can see first on the basis of year then it will distinguish it will give me a list of okay uh January 2 uh it is not in order but we can put it in order that's not an issue but in January how many cars were sold so this data would not be available so it is just empty but the data that is available in the uh in the data set it will put it here and make the grand total of it now in similarly in case of 2004 and 2005 now let let's just say that I don't need the sales I want the quantity how how many was sold in particular months and years of the different product line so in this case I'll go again to the pivot table fields and unselect I'll drag the sum of sales back to the field option and click on the quantity ordered in values since for the sum of sales we clicked on the dollar symbol now we need to unclick it so I'll go to the home ribbon and in case of currency I'll put General no specific format so it will give me the quantity of different things sold in different months of different years so in this way I'll get the quantity so many more things we can do uh when it comes to the use of pivot tables so this was one of the examples third and the final phase of your data analytics journey is data visualization now considering your data analysis is done and you are fit to represent your data to your internal and external stakeholders now internal stakeholders are someone it can be your boss your senior managers the top brass of your company and external stakeholders are your clients for whom this project has been allotted now considering the time constraints they might have in their day-to-day work they cannot go through your data sets one by one so in order to make them understand more quickly and in a more efficient manner we will use pictorial representations now these repres presentations can be charts right it can be pie chart it can be bar chart line chart and different Scatter Plots and even some Advanced charts like map charts entry charts so considering all these scenarios now it becomes your responsibility to make sure all your analysis is pictoric represented so let's move on to data visualization the first chart that we will be talking about is a pie chart now pie chart represents part to hold relationship so since it's a very basic type of chart in Excel we learn about it with an example so let's say you have three friends and all of them have apples right so you have the maximum amount of apples that is four the other three have three apples two apples and one apple respectively now in order to represent this data pictorially what you would do you would make a pie chart right so all your three friends and including you will be represented on this pie chart now since you have the maximum amount of apples that is four you would get the maximum chunk so this area would be yours with four apples your friends would get the one with three apples will get a slightly lesser chunk three apples the one with two apples will get slightly smaller slice and the one with one apple will get the smallest slice so this is how a pie chart is represented based on the value in the data set the more the value the bigger the slice or the bigger the percentage in of that particular slice so it has three major components uh pie chart the one is area the second one is central angle and the third one is Arc Length so Excel does it automatically but let's take an examp just's look at what this is so area would be this chunk here central angle is the angle it makes from the center and the Ark length is a circumference of that particular slice so these are the three major components of a a pie chart but Excel does it itself so you just need to feed values and Excel will make a pie chart for you the second chart that is the most important and most widely used is a bar chart now bar chart uses bands of different heights to show different amounts so that they can be distinguished so as you can see one side the y axis here has uh values numerical values in it and x-axis has basically the categories in which it is divided so this is represented in x-axis you can do Vis Versa also then it will be a horizontal bar graph this is a vertical bar graph so you can compare different uh like months or different categories of food items or any number of things so this is generally used in areas where you want to find how much difference is there between two categories so this is the most effective one there the third one is line chart now line chart represent shows the changes over a certain amount of time so it connects one section to the other section in case of bar grass there was no interconnection between these things but in a line chart you can see in January to February you have a line drawn over here so this gradient shows what is the difference between the two months right so this is how line chart operates so each categories are along horizontal and the value is over vertical so this is the most favored method to do because otherwise if you do vice versa in line chart it would not make sense right so it is also preferred when one parameter is non-numeric which is the horizontal one and the other one is numeric the vertical one right now let's look at these examples in Excel now let's look at the example of these charts using these tables so this table I used it during the hookup session so I'll use it again again so we here we have office supplies product ID quantity and price mentioned so in order to make pi and bar charts let's select the range of price and office supplies then click on insert and then uh as you can see under charts here we have different types of chart you have PIV with charts we have map charts and all these types of charts that are represented here so I'll choose pie chart so I'll click on pie chart then they it gives me an option to choose either 2D Pi 3D Pi or dognut so in this one I'll choose 2D Pi chart so I'll click on that now it makes it selects it highlights me the different range that we selected and automatically mentions it over here now in this case I can cannot see the percentage in which it is divided right in order to do that I'll go on chart Styles right I'll click on the downward arrow and choose whichever ever one I want okay so I see that this one here is clear okay so I'll choose this one right so now I get percentage I get the different categories in which it is divided and I get get the name now as we can see this is not the name of the chart right so I can double click and change to by chart or something or pie chart prices something like that so I can change it to whatever I want and uh so this is how it is represented now coming on to next chart that is uh bar chart so I'll just delete it I'll click again on this the same thing I'll select both the ranges that we selected earlier then click on insert and here the first section is insert bar chart so I'll click on the download Arrow now again it gives me different options so I can get 3d effect I can get horizontal this is what I mentioned you can use horizontal bar charts as well or vertical bar charts so 3D horizontal 3D vertical I think vertical looks much more clear to understand so I'll use 2D I click here so now I again get this chart and since it is not mentioned any percentage or it's quite clear 500 here but if I want to change certain things I can again go to chart Styles I can change how visually they appear and different see this is the quantities are the price is mentioned in the bar itself so I can choose whatever this one looks pretty good I'll choose this one so now I can see that it is clearly mentioned everything I'll change it to probably price bar chart something like this but as you can see there are no mentions of uh what is in y axis and what is in x-axis so it is not mentioned okay I I want to put something here x is on on the axis I want to put a category or something like that so how to do that so we click on on the left side here add chart elements so I'll click on that access titles so now I can have a horizontal access title I can have a vertical access title right so we have all learned in our Junior classes that the access titles are an important ones so I'll write here price I'll write here category so in this way we can make bar charts so in order to make a pivot chart let's take an example so for pivit charts this is a data set that we analyzed before so I'll use the same data set so we made this kind of pivot table over here right so I'll use this in to make a a line chart and this will also give you an example of what pivot charts are so making a chart from pivot table is called a pivot chart so let's see how this works so it's basically basic charts but in pivot table I want different uh months to be represented via line chart in a pivot table so I'll go to again again insert then uh so since we are already in pivot chart we would not click on pivot chart but we can use different uh charting pattern so since I want line chart I'll use uh 2D 3D area does not look very distinctive so I'll use 2D line okay so here also as you can see if I hover over this one it sales stacked line so it shows me the sum of sales and uh the months right if I click on this it also shows me the sum of sales but with a different uh y- axis so the values are different the various bars in which it divides is different so in case of percentage also I can get so in this case you have different percentages mentioned so I I can click on any one of them I'll click on this one now let's say I only selected this this range but it gave me all range so in pivot chart it gives you all range me you can actually filter it from this filter option here so if I unclick uncheck all and I only click motorcycles and click okay so it will only give me the data of motorcycles in every month and uh its values right and here also the pivot chart also only uh summarizes it into motorcycles and months so this is very good here in which we can do it you can actually change the month also if you want to see a particular month we can do that as well so this is how a line chart and a pivot chart works next we'll look into two of the more advanced charts that are not commonly used in our analysis but are a must to know so the first one is tree chart or tree map as we say so it provides a hierarchial view of data so just as the example shown in the picture so we can see this is the example of tree map and here we have different countries and some value in which they are arranged okay this value I would know because it is the data set which which I am using but this tree maps are not used to depict the exact data that is in the data set so if you just want to know okay which proportion is maximum to which within the hierarchy so this would tell us that so the first one uh is India so so India has the maximum amount of value in a particular cell that that's why it has the maximum maximum area second is Saudi Arabia then Russia Germany Australia China and America and Spain so the ones with bigger values are towards the left side of the map and the ones with are less value or towards the right side so that is why it gives a good comparing proportions within the hierarchy the second chart is called map chart so it is used to navigate values and shows categories across different geographical regions so in this case uh we will use only the names of countries to represent any data set so the countries will be highlighted according to the values so this is most commonly used by uh let's say World Bank or uh United Nations categories to represent datas across the globe right so and also some of the companies who are involved in logistical supplies Imports exports use these kinds of maps to represent data let's look at the examples in Excel so here we have a spreadsheet in which there are different the countries and GDP per capita are mentioned here so in order to create tree Maps or map chart Let's uh select the columns and uh then click again on insert and here we have the first one here it represents hierarchial chart so this is a tree map so we'll click on tree map now it automatically uh extracts the values from the range we have selected and makes a tree map out of it now in this case uh we can change the the layout according to our wish so I can choose any one of the ones in chart Styles so as we can see the ones with the greater value Qatar and us are towards the left side while the ones which are quite small are towards the right side right okay so I choose this okay I can make changes from this plus sign here if I want to see something if I don't want to see something or or if I want to change the layout how it is styling here also you can change it and color also if you want the scales to be of a different color multicolor how do you want to change it it's upon you so you can do all these things using this now coming to the second type of chart that is the map chart so we'll undo whatever we did okay again we'll select the data set then we'll click on insert and uh towards the middle of the charts we have a thing called Maps so if I click on this field map so it will generate the countries and their values according to how much it is there so we can see in this case all the countries which are mentioned are are in the map right so we can change the layout again from charting Styles we can create whatever uh we desire and also we can change the way it is uh there so wait I'll just if you double click on the chart you will get format plot area so you can actually change the fill how it is filled you can put a solid gradient fill or pattern fill so in case I want the countries which are not selected here I can actually change make them the way uh I want so okay it is not represented in this but in this case I can do it pattern fill yeah I can change the way it is so the ones which are not selected will get uh change so the ones so it is clear for us to know which one of them is selected right this was all about map charts now let's look at what we have learned in this course so far so we started by knowing the three different phases of a data analytics project then we moved into knowing what is the importance of data cleaning and what are the some some ways in which we can clean data next we move into knowing different basic functionalities of excel so we learned about count if function trim function length concatenate Etc then we sorted and filtered data using in Long data sets and also analyzed the data using V lookup and X lookup functions then we learned about conditional formatting datab bars color scales Etc then we knew about the data validation techniques how do we validate the data next we learned about how to make pivot tables how to analyze data using pivot tables lastly we learned the importance of data visualization in Excel and also learned about some basic graphs like pie chart bar graphs and also some Advanced ones like tree chart and map charts so this was all about the course hope you learned an amazing bit of knowledge by this if you haven't subscribed to our Channel yet we request you to hit the Subscribe button and turn on the notification Bell this is done to make sure you don't miss out on any of the new updates or video releases of grade learning and of course if you enjoy this video show us some love do like the video knowledge increases by sharing so make sure you share this video to your friends family and whoever can make use of it and at the end of it make sure to comment on this video if you have any queries and suggestions and I would be more than happy to respond to all of your comments