Transcript for:
Analyzing T20 World Cup Data for Earth

The T20 Cricket World Cup finished just a few weeks back with England claiming victory over Pakistan, and today we are beginning a cricket data analytics series using the same T20 World Cup data. We will begin this project by scraping data from the ESPN Cricinfo website, then doing data cleaning and transformation in Pandas, and then eventually building dashboards in Power BI. Before we begin any technical work, let's look at the exact problem statement and the special stakeholders that we have in this project. The whole atmosphere is charged with one type of heat.

We like your cricket. Surrender Earth. You need to fight 8 billion people for that. We destroy. Then you get nothing.

Negotiate. You defeat us in cricket, you get Earth. If you lose, join me as an intern. Deal. Tony, we've got to save Earth. Give me your best 11. Best 11 what, data analysts? No, cricket players.

As you saw, Planet Spota has challenged Planet Earth to play cricket. And Nick Puri has assembled the secret agents of field to work on this project to find out the best 11 players based on the T20 World Cup cricket data. Tony Sharma is in charge of this project.

He is not only a senior data analyst, he is also a cricket subject matter expert. So in the next video, you will see what kind of algorithm Tony Sharma comes up with to pick the best 11, a team that can go and defeat the aliens. At the end of the video we are going to give you a challenge based on this project, and by working on that challenge you will be able to win exciting prizes, so make sure you watch till the end.

Nick, this is the requirement you gave me. You said we don't know the strengths and weaknesses of our opponents, but give the best eleven from our planet. So this is what I'm going to do. I'm going to give you a team that will score 180 runs on average. At the same time, this team will be able to defend 150 runs.

So you have a margin of 30 runs to play with. Do you think that's enough for you? Yes, this sounds like a good target because although aliens have not played cricket, they have a technology where they can learn things really fast. Okay, so this is how I've done this. I've made different positions for players and I've selected parameters for each of them.

I'm going to quickly show these parameters to you and let me know if you want to add some parameters or anything. This is just an idea I want to give on how I'm going to select the team. The first ones are the openers. They are going to open the innings.

These are the power hitters. So they will be hitting the balls out of the park, and they will also score runs; it's not just about hitting. So that's how I've selected the parameters: batting average, strike rate and, very important, boundary percentage. They should be scoring over 50 percent of their runs by boundaries, because in the power play the fielders are going to be inside and they should be hitting outside the inner circle. That's the plan, and this partnership should be giving us at least 50 runs within the first five overs.

Got it. So we'll basically get players like Sehwag, correct? Nick, Sehwag isn't playing anymore. Oh, okay. I see.

Okay. Alright, then we have the middle order or the anchors. So here, yeah, they won't be hitting the ball as hard as openers, but they can shift gears and hit the balls if needed. So they will have a better batting average.

They will bat for a longer time. That's why I've also included the average balls faced. You know, they can also bat aggressively if they want, so those are the kind of players you'll have here. I'm going to select three players for this position, so overall we have five players now. These five players will give us at least 120-plus runs in 13 to 14 overs. That's the plan.

I would like to see Virat Kohli in this list. Oh, he's playing. I think he's certainly a part of this team. Great.

And of course, the special requirement you gave is to consider the T20 World Cup 2022 for the selection of the team, so I'm taking only that particular data. Okay.

All right, so this is going to be an interesting role. I'm going to get one player for this role, a finisher role. If you are chasing, this particular person should be able to hit like crazy, go berserk. But if you lose wickets early, like if you lose the middle order badly, this person should be able to stabilize the innings with the rest of the lower-order batsmen. So that's the kind of player we're looking for.

We're looking for more of a batsman here than a bowler. So ideally I would need an all-rounder here, but a batting all-rounder rather than a bowling all-rounder. And this will leave us with five more players, and my seventh and eighth players will be all-rounders or lower-order batsmen.

So mostly these can be spinners, because the last three slots I'm going to keep for fast bowlers. These spinners are the ones who can also hit, and they can hit without thinking: they come to bat and they just start hitting. That's the kind of players we want, because they will mostly be coming in around the 17th or 18th over. That's why I've kept the parameters in such a way that the batting average should be at least 15, and you can see the strike rate is more than one point.

So even if they score 15 runs, they should be scoring that in around 8 or 9 balls. And you can see the bowling economy is really good. It's less than 7, which means if these two players bowl all the 20 overs, they will be giving away only 140 runs. That is good for us because our team target is to keep the defense under 150. And their bowling strike rate is less than 20, which means if they bowl 20 balls, they will get at least one wicket. If they bowl all the 20 overs, they should be getting at least 6 wickets.
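The arithmetic behind these thresholds can be checked with a short Python sketch. The variable names are made up for illustration; only the numbers (economy under 7, bowling strike rate under 20, 15 runs in 8 or 9 balls, 20 overs) come from the conversation above.

```python
# Thresholds quoted in the conversation for the two all-rounder slots.
MAX_ECONOMY = 7                # runs conceded per over (upper bound)
MAX_BOWLING_STRIKE_RATE = 20   # balls bowled per wicket (upper bound)
OVERS = 20
BALLS_PER_OVER = 6

# Upper bound on runs conceded if bowlers at this economy bowled every over.
max_runs_conceded = MAX_ECONOMY * OVERS                  # 7 * 20 = 140

# Lower bound on wickets over a full innings at one wicket per 20 balls.
total_balls = OVERS * BALLS_PER_OVER                     # 120 balls
min_wickets = total_balls / MAX_BOWLING_STRIKE_RATE      # 120 / 20 = 6.0

# Batting strike rate (runs per 100 balls) implied by 15 runs off 8 or 9 balls.
strike_rate_8_balls = 15 / 8 * 100
strike_rate_9_balls = 15 / 9 * 100

print(max_runs_conceded, min_wickets)
print(round(strike_rate_8_balls, 1), round(strike_rate_9_balls, 1))
```

So the 140-run ceiling sits under the 150-run defense target, and 15 runs in 8 or 9 balls corresponds to a batting strike rate somewhere between roughly 167 and 188.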

Nice. Yeah. And here comes the specialist fast bowlers because we need to rattle the team. We need to rattle the sport arms.

You know, I don't think they should be having capability to face these bowlers. This is the full pack we have got. They are threatening.

They can take a wicket at least every 16 balls, or even fewer than that. This is the parameter I've set for them. They can bowl very fast, and they can also bowl dot balls over 40 percent of the time, which means if they bowl four overs, around 10 of those balls will be dot balls, which is great for a T20 game.

Exactly, dot balls are so precious in T20.

Yes, and with these 11 players, Nick, I'm pretty sure we'll save the Earth. What do you think?

The algorithm looks pretty solid to me. I'm excited to see the final results which our Power BI dashboard can produce.

Yeah, so this is already in progress. I'll be showing you the final dashboard, and we can do an analysis together to pick the final eleven. Looking forward. Thank you.

Thank you, Nick. See you then. Bye.

Folks, we had our resources on GitHub, but we have moved to this resources section on codebasics.io because the file sizes were getting very big. So go to codebasics.io, click on resources here, and you will be able to find all the files. Here you can click on download, and then you can just create an account for the codebasics website and log in. Once you're logged in, you will be able to download all these files.

Now you can download whatever files you need. Sometimes people may not need all the files. We are going to provide a download-all button later, but this way you can download just the required files. For example, I'm going to download my web scraping code, and once it is downloaded you can open it, unzip it, and here you will be able to see all the JS files that we need in Bright Data. Similarly, you can download the rest of the files as well.

Folks, why so much fuss? Just use something ready-made. For example, BrightData.

BrightData uses proxy networks for building web scrapers that work seamlessly. They have various solutions such as residential proxies, web unlocker and so on. They also provide ready-made data sets.

The tool that we are going to use specifically is called data collector. If you use the link which is given in the video description below, which is a special link for Code Basics, you can log in here. It says work email, but don't worry, you can use your personal email ID and still log in. Just select some values from these two drop-downs and you will be able to create an account. Once the account is created, if you look at the dashboard, you will find that you have a $15 balance. I want to thank Bright Data for giving this $15 free credit to all Code Basics viewers. For our data scraping work, we need hardly $3 or $4, so this is more than enough. You can use this credit for your other projects too.

We are going to capture four types of data for our project. Number one is this detailed match results table.

So we will scrape this entire table. When you click on this particular scorecard link, you will get a detailed bowling and batting scorecard. So we are going to grab all these tables as well.

And then when you click on any player, you will get player-specific information, so we'll grab a few fields from this also. I'm going to click on user dashboard and go to something called a data collector. Now, a data collector is basically a web crawler, which will go to the website and collect the data.

I have already created the four collectors. So this one is for the match results. So let's say this page, you want to grab that, right? Let's look at that code and understand how that works. So when you right click and say inspect, it will show you the HTML tags for that particular page.

And you can click on this, and let's say you are getting this particular table, or this particular cell. When you look at this, see, this is one row, this is the second row, the third row and so on. This particular table is inside this tbody, and the table class is engine table. So now let me show you my collector. I will go to my collector here and say edit code, and in this one you will see the interaction code as well as the parser code. Now, this is JavaScript code.

JavaScript is better suited for web scraping because JavaScript code runs inside the browser, so it becomes easier. Here, this particular link that you're seeing is nothing but this particular page. So what I'm saying in my collector is go to that page and then collect parse. Now when you say parse, it is going to execute all this code.

Now, once again, this is JavaScript code where I am locating the engine table. Then I'm going into tbody and then tr.data1. So if you look here, see, there is engine table, then tbody and then tr.data1.

And this one is an array. So if you go through this array one by one, you'll be able to get all these records. So that's exactly what I'm doing here.

See, the first element is team 1, the second is team 2, then the winner and so on. If you look at any row here, the first one is team 1, the second one is team 2, as you can see on the left side, then the winner, the margin, the ground and so on. So that's exactly what we are doing here, and then it grabs all this data into the match summary array. This is basically a JavaScript array, and it will return that.

Okay, so let me just run this here. It will take some time, but you will see that the page is loaded in the browser. So it executed this navigate function, and now it is collecting all this information. If you look at the output, you will see this information is now available as JSON, and if you download that JSON file, this is how it will look. Now let me just format it so that you can see it properly. You can see that now you have team one, team two, who is the winner, margin, ground, match date and the scorecard.
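The collector's parser in the video is JavaScript. As a rough Python analogue of the same idea (not the author's code), each table row is a list of cell texts, and pairing the cells with field names in order gives one record per match. The key spellings, the function name and the sample cell values below are assumptions for illustration only.

```python
import json

# Field order as listed in the video; the exact key spellings are assumed.
FIELDS = ["team1", "team2", "winner", "margin", "ground", "matchDate", "scorecard"]


def rows_to_match_summary(rows):
    """Pair each row's cell texts with FIELDS and collect the records."""
    match_summary = []
    for cells in rows:
        match_summary.append(dict(zip(FIELDS, cells)))
    # The collector output wraps the list under a single key.
    return {"matchSummary": match_summary}


# Placeholder cell values, not real scraped data.
sample_rows = [
    ["Namibia", "Sri Lanka", "Namibia", "<margin>", "<ground>", "<date>", "<id-1>"],
    ["Netherlands", "U.A.E.", "<winner>", "<margin>", "<ground>", "<date>", "<id-2>"],
]

result = rows_to_match_summary(sample_rows)
print(json.dumps(result, indent=2))
```

Printing with `indent=2` gives the same kind of formatted JSON view that is shown after downloading the collector output.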

So we grab this particular first table in its entirety. See we have the entire match result for T20 World Cup. Now, the way Bright Data Collector works is they're going to use a smart proxy network.

OK, and using that infrastructure, it will do web scraping so that you have seamless data collection without having to worry about some website blocking your IP and so on. Because it is sort of like a VPN, it is using different IPs and you will not have any trouble in your data collection process, the kind of trouble that you have when you are using a plain Python script.

If you're new to Bright Data, what you can do is use one of the templates. For example, I can click on develop a self-managed collector and use one of these templates. See, here you are collecting data from Quora, for example, and here you are collecting data from YouTube. So you use that template and you run that collector, and you will get an understanding of it.

I have provided the code for all these four collectors on GitHub; the link is in the video description below. If you look at the web scraping code, see, here is the batting summary and here is the bowling summary. For each one you have the interaction code and the parser code.

Now let me show you the batting summary web collector. Here I can go to batting summary, click here, then go to the advanced section and edit the code. The first page that you have here is this same page. Okay. So you're going through that match summary page and then you are collecting all the links.

All right. Which links? So these links, see, when you right click and say copy link address, you get this particular link, right?

So we are collecting all those links in this particular code. All right. Then we are calling next stage. So in the next stage, it will execute this particular code. Okay.

And this will be the link of that match scorecard. If you want to see it from the previous run, I have all these links, which I collected from the first stage. And in the second stage, what you will do is go through this scorecard.

So for example, this is my scorecard. If you inspect this particular page, and let me click on this element and go here, you will find that you are collecting data from a table called ci scorecard table. If I check that in my collector, you will find that here I am collecting that particular table and going through the first innings, second innings, etc. This code is not as hard as you think. It is just HTML.

You need to have basic knowledge of HTML. You're going through those HTML elements and you're just trying to grab data from them. So you're going through the entire list, all the scorecards one by one, and you're putting that summary here.

Okay. And when you click on this, it's going to take time. What it will do is it will load this page.

Then it will go to this link. Then it will grab all these tables, the batting tables. Then it will go back, go to the second link, collect all the tables and so on. When you click on this, it will do a sample run for one or two matches. But if you want to run this collector for the entire data collection, you can click here and say initiate manually, so it will run the whole collector. You can also set it on a schedule, or initiate it by API from some Python code as well.

I'm going to show you match results execution because it doesn't take much time. So you click on this and in the delivery preferences, you know, I'm going to type in my email ID here. So it will deliver the data as a JSON to my email ID.

All right, so just say initiate manually, start, and see, it is starting. Now it will run internally. It is using the smart proxy framework, and it is going to the ESPN Cricinfo website and grabbing that data. Once the data is ready, it will send me an email.

All right, when I check my email, I got the match results, and when you click on download results, it downloads this particular JSON file. When you look at this JSON file, see, I have the entire match summary. So this way I ran all the collectors and grabbed all the data, and I have put that data on GitHub as T20 JSON and CSV files. All right, so these are the JSON files. If you want to directly get that data, you can get it from here, but I highly recommend that you use Bright Data for data collection, because data collection is a super important part of any data science or data analytics project. When you log in, make sure you are seeing this $15 credit, and then you can go to data collection, collectors, and create your first collector by clicking on this open IDE.

So when you do that, you can just say start from scratch. You are creating a blank collector. You're not using any of these templates.

And when you see the JavaScript UI, what you will do is use the code which I have provided. Again, check the video description carefully for all the download instructions. You will have this file, for example t20 world cup match results.js. Open that file; it has two sections, interaction code and parser code. For the interaction code, I can just copy-paste this here, whereas for the parser code, copy-paste just that particular portion here in the parser code. And that's it. This is your collector.

It is ready. Okay, and you can run it and verify it. You can just say finish editing and give a name to that collector. So let me just say finish editing here. It takes a few seconds, and then your new collector is ready. You can edit the collector name; you can call it whatever, say T20 World Cup match summary. Okay, you saved it. If you just cancel that icon, you will see that collector. Now you can run that collector manually, and if you want to edit the code, you can once again go inside. So this way you create four different collectors. Again, I have provided the code to you. If you need any help, you can click on the help button, you can do learning, and you can watch Bright Data tutorials on YouTube as well.

We are done with the web scraping part, and the files that we extracted from ESPN Cricinfo are available in the T20 JSON files folder.

Now if you want to use these files ready-made, check the video description below; you should be able to download them. If you look at these four JSON files, for example batting summary, we have this batting summary JSON element for each match. For example, Namibia versus Sri Lanka is one element in this array. Then the second element in this array is UAE versus Netherlands, and so on. Now when we pull this data into Power BI, it would be beneficial if the data were available as a CSV file, something like this, where it's just a single flat table and you have the cricket match and the corresponding batting score. See, here also there is UAE and the batting score and so on, and here is the batting position.

So we need to do a transformation: basically, we need to transform this JSON file into this particular CSV format. You'll also see an additional column, for example whether the player is out or not out. In the JSON file we did not have that information. All we had was this: if dismissal is blank, that means the player was not out, but if the dismissal had some string, that means the player was out. So we have to do this data transformation, and Python Pandas is probably the best way of performing it.

If you don't know the Python programming language, you can go to codebasics.io, where there is a super affordable, easy Python course for total beginners that you can follow. Then for Pandas, search for code basics pandas tutorial; you'll find my playlist, which is very popular. Just watch the first six or seven videos, and that's it. It will take you less than one hour.

Assuming you now have some Python and Pandas knowledge, let's start a Jupyter notebook and work on the transformation. I went to my cricket analytics folder and launched Jupyter notebook by running this command, and it looks like this. See, I am at a location where the T20 CSV files and JSON files directories are at the same level. I am going to create a new Python 3 notebook and import some necessary libraries. We are going to use pandas and another library called json.

Then I will open the first file, which is the match results. So I will say match results as f, and then data is equal to json.load with the file pointer, and it will load that data for you. Let's print what kind of data it loads. So here I have the match summary. Remember the match summary table? Let me show you the JSON file so that you get an idea. My match summary is this one, where basically it's a match and who won the match. So between these two, Namibia won the match by this many runs, and so on. If you look at the match summary element, see, it's just one element, and if you look at this array, this array also has one element.

Okay, so let's look at that element. First of all, the data array has only one element; you can confirm it by printing just one element. Then data[0] and then match summary, if you print that, is your main list where you have all the match results. So what I can do is create a data frame out of it. I will just say df_match is equal to pd.DataFrame; in the data frame you can supply the entire list as an array, and then you can print the head of that data frame.
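The loading step described above can be sketched as follows. This is a minimal, self-contained version that reads from an in-memory stand-in instead of the real file; the key names (`matchSummary`, `team1`, and so on) and all values are assumptions or placeholders, not the actual scraped data.

```python
import io
import json

import pandas as pd

# Stand-in for the downloaded match results file: an outer array with a
# single object whose "matchSummary" key holds the list of matches.
raw = io.StringIO(json.dumps([
    {"matchSummary": [
        {"team1": "Namibia", "team2": "Sri Lanka", "winner": "Namibia",
         "margin": "<margin>", "ground": "<ground>",
         "matchDate": "<date>", "scorecard": "<id-1>"},
        {"team1": "Netherlands", "team2": "U.A.E.", "winner": "<winner>",
         "margin": "<margin>", "ground": "<ground>",
         "matchDate": "<date>", "scorecard": "<id-2>"},
    ]}
]))

# With a real file this would be: with open(path) as f: data = json.load(f)
data = json.load(raw)
print(len(data))  # the outer array has one element

# The main list of match results becomes one row per match.
df_match = pd.DataFrame(data[0]["matchSummary"])
print(df_match.head())
print(df_match.shape)  # (rows, columns)
```

With the full tournament file, `shape` is where the 45 rows and 7 columns mentioned in the video would show up; this stand-in only has two rows.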

See, wonderful. So my data frame is created, and I can quickly check how many elements it has. There is an attribute called shape; when you use it, it prints the shape, basically 45 rows and 7 columns. Now I'm going to do a few processing steps. I will use this scorecard as kind of a key of this particular data frame. What I mean by that is I want to treat it as a match ID so that I can connect with other tables, because I have other tables, and when I import them in Power BI I need a way to link them, sort of like primary key and foreign key in SQL terms. So I want to treat this scorecard, which is basically a unique ID, as a primary key for this table. So let's do that. I will just rename that column.

I will just say df match dot rename. And how do I want to rename the column? So I want to say that rename scorecard as match ID axis is equal to one, which is, you know, you have column axis and rows axis. And when you print head, you will see the column is renamed. If you're not getting this, just hold on.
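A minimal sketch of that rename, using a tiny stand-in data frame. The column names `scorecard` and `match_id` are assumed spellings of what is said in the video, and the values are placeholders.

```python
import pandas as pd

# Tiny stand-in for the match results data frame (placeholder values).
df_match = pd.DataFrame({
    "team1": ["Namibia", "Netherlands"],
    "team2": ["Sri Lanka", "U.A.E."],
    "scorecard": ["<id-1>", "<id-2>"],
})

# axis=1 tells rename that the mapping applies to column labels, not row labels.
df_match.rename({"scorecard": "match_id"}, axis=1, inplace=True)

print(df_match.head())
```

An equivalent spelling is `df_match.rename(columns={"scorecard": "match_id"})`, which avoids having to remember which axis number means columns.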

Later on, you will understand why I am calling it a match ID. Now, once I have this particular data frame, I want to export this data into a CSV file, so that all the data is in a single CSV with nice columns.

But before I do that, let me process the batting summary. So the other file that I have is batting summary. Okay, so I will say batting summary; you can add markdown cells in the notebook. And let's see what this one looks like.

So, batting summary has this particular format. Let's see in the notepad++. Where is my batting summary?

So, batting summary. First of all, the outer array has multiple JSON objects. You see, these are all multiple JSON objects.

If I open the first JSON object, it has an element called batting summary. See, it has just one element, batting summary, and in that batting summary there is the score of one match, Namibia versus Sri Lanka. See, all the players, player number one, player number two, are in one element. If you close this and open the second element, you will find the UAE versus Netherlands match. So each of these matches is represented by one single JSON element inside my array. Okay. So now what I will do is go through those records. I will say for record in data, where each record is one match and one match has multiple records for the players. So if I print, for example, record batting summary, what I will get is the batting summary of one match.

So Namibia versus Sri Lanka. Okay. So there's this. So I'll get total, I think, 11 records. And if you want to just append everything in one array, right?

Because our eventual goal, folks, is to get a single list. Basically, I should have a single list where all the matches are present. And if you want to create a single list in Pandas, what you can do is I can create a list called all records.

And then I can just keep on appending. I can just say all_records.extend, and extend basically means you have one array and you are appending another array after it.

So let me just show you. Let's say you have an array called a and another array called b. If you say a.extend(b) and print a, you get this. So it's just joining those arrays. Similarly, here I am joining all the records, and after I join all of them I can print all_records. See, I got all the records in a single flat list. So there is Namibia versus Sri Lanka, and if I scroll down I will see UAE versus Netherlands and so on. From this I can create a data frame like this, and when you print it, see, it is continuous. If you print the tail of it, you will see the final match, Pakistan versus England. I don't know if you have seen it; it was a pretty interesting match, the World Cup final 2022.

Now that I have a data frame ready, I just want to do some analysis. I'll print, let's say, the first 11 elements or so. See, I want to do a couple of things here.
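The flattening step can be sketched in plain Python. The `battingSummary` key, the field names and the player names below are placeholders chosen for illustration; only the structure (one object per match, each holding a list of player records) follows the description above.

```python
# extend joins one list onto another, element by element.
a = [1, 2, 3]
b = [4, 5]
a.extend(b)
print(a)  # [1, 2, 3, 4, 5]

# Stand-in for the batting summary file: one object per match.
data = [
    {"battingSummary": [
        {"match": "Namibia Vs Sri Lanka", "batsmanName": "<player 1>"},
        {"match": "Namibia Vs Sri Lanka", "batsmanName": "<player 2>"},
    ]},
    {"battingSummary": [
        {"match": "U.A.E. Vs Netherlands", "batsmanName": "<player 3>"},
    ]},
]

# Collect every player record from every match into one flat list.
all_records = []
for record in data:
    all_records.extend(record["battingSummary"])

print(len(all_records))
```

Using `append` here instead of `extend` would produce a list of lists (one inner list per match), which is not the single flat table that a data frame needs. Passing `all_records` to `pd.DataFrame` then gives one row per player per match.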

First of all, here the dismissal column, I want to convert this column into out or not out. Okay, so I want to have a column like this which tells me if the player is out or not out. And the way you can do that is look at the dismissal column and if you don't find anything in it, if it is blank, it means the player was not out. So let's first do that.

And how do you do that? Well, you can create a new column called out or not out in pandas based on the dismissal column. So my dismissal column is this, and on this you can use the apply method. If you've seen my pandas tutorial, you will know it. So you can say: on the dismissal column, apply some transformation and create a new column called out or not out. This is how we create a new column. And on the existing column I want to apply some transformation. What is that transformation? Well, lambda x. Lambda is just a short way of writing a Python function. Here I am going to use the ternary operator. When I say x, you are getting each value from this column. So the player is out if the length of x is greater than zero, else it is not out. Okay, so let's see. Okay, df is not defined, obviously; I'm used to writing df all the time. So you see, you have out and not out, and if I print a few more records, you will see that whenever there is a blank it is not out, and otherwise it is out.
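A minimal sketch of that transformation on a small stand-in data frame. The column names (`dismissal`, `out/not_out`), the label strings and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Stand-in batting data: an empty dismissal string means the player was not out.
df_batting = pd.DataFrame({
    "batsmanName": ["<player 1>", "<player 2>", "<player 3>"],
    "dismissal": ["c <fielder> b <bowler>", "", "lbw b <bowler>"],
})

# The lambda receives each dismissal value as x; the ternary expression
# returns "out" for a non-empty string and "not_out" for a blank one.
df_batting["out/not_out"] = df_batting["dismissal"].apply(
    lambda x: "out" if len(x) > 0 else "not_out"
)

print(df_batting)
```

This relies on blanks being empty strings. If the column contained missing values (NaN) instead, `len(x)` would raise an error, so those would need to be filled first.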

Okay. All right, now that I have out and not out, I don't need this dismissal column, so I can just drop it. This is how you drop it: you say drop, which column you want to drop, and inplace is equal to True. If you don't specify inplace equal to True, it will not modify that data frame; it will return a new data frame. So when you run this, dismissal is gone and all you have is out and not out.

You have another column called batsman name, which has some issues. See, you have this kind of special characters, and you want to remove them. You can simply locate all those special characters and then apply some lambda function. For example, you can say df batsman name dot apply. You can use a regular expression, you can use a replace function, you can do any number of things, but just to keep things simple, I found this character, and there are other records where I found this other character, so I'm just removing them. And now you see, in Kusal Mendis you don't see that particular character. Let me see. Yes, see, here you don't see that extra character.

All right. Now, how do you connect this particular data frame with the match data frame?

Because as I said, for our visualization purpose later on in Power BI, we need a way to link all these tables. Now, just carefully notice these two tables, okay? So, this is one table that I have, okay?

And this is the second table that I have. I can just use the snipping tool and take a screenshot of that so that you get an idea. So let's say I have this particular table, and then I have this other table. Now I want to connect these two tables. How do I do that? See, I have Namibia and Sri Lanka as team names, and here I have Namibia versus Sri Lanka. That is the only key I have between these two tables so that I can join them, or link them, basically, because I have scorecard here but I don't have a scorecard in this other table. So the only thing I have is match. Here it says Namibia versus Sri Lanka, and here it says team one and team two. Now I could take team one, then use vs in the middle, and then team two, and that way I can join them. But the problem could be that in this table the names are reversed. So here, let's say, it's Netherlands versus UAE, whereas here, let me print this, I have UAE versus Netherlands. So if I simply say team 1 versus team 2, it's not going to work.

I have to use both combinations: team 1, then vs, then team 2, and also team 2, then vs, then team 1. That is the way I can connect them. And for doing that, I need to go back to this code and create a kind of dictionary.

Okay, so dictionary like this. So let me just do that here. So I want to create a match IDs dictionary.

Okay and the dictionary will look something like this. So let's say there is Namibia versus Sri Lanka. Okay so let's say it looks something like this. And then I can have maybe a match ID as a value. Right and then I can have same thing.

But in the reverse order. Why do I need this? Because the order is not guaranteed, so I need to have both. And then I have, let's say, Netherlands versus UAE, correct? So if I have this kind of Python dictionary, where I have team 1 versus team 2, team 2 versus team 1 and the match ID, that is going to be helpful. I can create that dictionary by going through the rows. Let me just remove this, I think you got the idea, or let me put it in the second cell. So I can say for index, row in df_match.iterrows. There is a function called iterrows which will go over each row one by one, and every row has team one and team two. You can use that as your key one. This will be your key one because, see, Namibia versus Sri Lanka, I'm creating that by doing this. And then, for Sri Lanka versus Namibia, you can have key two. Key two is nothing but reversing the order, team two versus team one. And then you want to add that into this dictionary.

So here I will say add this into my dictionary with key one. And what is the value you are adding? The match ID. So you will say row match ID, and you will do the same thing for key two. When you now look at match_ids_dict, it looks something like this. See, for every pair of teams there is the original order, the reverse order and the corresponding match ID.

Now you'll be like, okay, so how do I use this dictionary? Well, let's go back. The way you use this dictionary is to bring a match ID column into this batting data frame. How do I do that? Well, this data frame has a match column, so I can look into my match ID dictionary. If for this match I want to get a match ID, I will say match_ids_dict and this match name in quotes. If you give this, see, you get the match ID. So I can create a new column in this data frame, call it match ID, equal to batting dot... I want to apply a transformation on this match column, so I will say on this column apply map. Map is sort of like a function similar to apply, and you are going to give it this mapping.
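The dictionary-and-map approach described above can be sketched like this. The column names, the " Vs " separator and the sample values are assumptions; the real match strings may be formatted differently.

```python
import pandas as pd

# Stand-in match table (placeholder IDs) and batting table.
df_match = pd.DataFrame({
    "team1": ["Namibia", "Netherlands"],
    "team2": ["Sri Lanka", "U.A.E."],
    "match_id": ["<id-1>", "<id-2>"],
})
df_batting = pd.DataFrame({
    "match": ["Namibia Vs Sri Lanka", "U.A.E. Vs Netherlands"],
    "batsmanName": ["<player 1>", "<player 2>"],
})

# Register both team orders for every match, since the order used in the
# batting table is not guaranteed to match the order in the match table.
match_ids_dict = {}
for index, row in df_match.iterrows():
    key1 = row["team1"] + " Vs " + row["team2"]
    key2 = row["team2"] + " Vs " + row["team1"]
    match_ids_dict[key1] = row["match_id"]
    match_ids_dict[key2] = row["match_id"]

# map looks up each match string in the dictionary and returns its ID.
df_batting["match_id"] = df_batting["match"].map(match_ids_dict)

print(df_batting)
```

Two things are worth keeping in mind with this design. A match string that is not in the dictionary maps to NaN rather than raising an error, so it is worth checking for missing IDs afterwards. And if the same two teams met more than once, the later row would overwrite the earlier one in the dictionary, so the team-name key only works when each pairing is unique.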
So you're just mapping it basically Okay, and when you do this you get Match IDC now you found a way to link these two tables okay and then you will export this particular table as is into a csv file and the way you can export this table is by doing this okay i have that file open by the way that's why but i'll just say temp dot csv for example and when i see temp dot csv say in my csv files folder see temp.csv and it looks something like this it is same file as i was showing before basically it is this file okay now in the interest of time i'm not going to go over the entire transformation for other files too because the code can be bigger and the tutorial will get so long so i'm going to share with you the entire notebook you can take a look later on but see here i you first process match results then i process batting summary and at every point i'm exporting the file so i exported match summary first then i exported the batting summary and then i exported balling summary see balling summary and the player information now player information the name i have given is no images because for player i need their image as well so i have two files like player csv with no images so let me show you that let me remove this stamp.csv and see i have dim players.csv and no images so no image is something that my notebook is giving okay and you can see there is player name there is team batting style bowling style player role and the description some players don't have description but some player do so and in this file we have manually added their images so we collected their images manually and we added it here in dimplayers.csv so if you open that file you will see an extra column okay so i have dimplayers open looks like already so in that we have an extra column called image and if you click on this image you will see the players picture basically right see if you click on any image you will see that particular player's image. 
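The match ID lookup described above can be sketched in pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (team1, team2, match_id, match), the " Vs " separator, and the sample rows are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df_match = pd.DataFrame({
    "team1": ["Namibia", "Netherlands"],
    "team2": ["Sri Lanka", "UAE"],
    "match_id": ["M001", "M002"],
})
df_batting = pd.DataFrame({
    "match": ["Sri Lanka Vs Namibia", "Netherlands Vs UAE"],
    "batsmanName": ["Player A", "Player B"],
})

# Store both team orders, because the order in the batting table
# is not guaranteed to match the order in the match table.
match_ids_dict = {}
for _, row in df_match.iterrows():
    key1 = f"{row['team1']} Vs {row['team2']}"
    key2 = f"{row['team2']} Vs {row['team1']}"
    match_ids_dict[key1] = row["match_id"]
    match_ids_dict[key2] = row["match_id"]

# Series.map looks each match string up in the dictionary.
df_batting["match_id"] = df_batting["match"].map(match_ids_dict)

# With no path, to_csv returns the CSV text; pass a file name such as
# "temp.csv" to write it to disk instead.
csv_text = df_batting.to_csv(index=False)
```

Any match string that is missing from the dictionary comes back as NaN, so a quick check for missing values in the new column is a cheap way to spot team-name mismatches.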
Alright, so I hope you had fun and now in the next step, we will be importing this CSV file into Power BI.

I'm going to launch Power BI Desktop. I have already installed this application. If you don't have it, you can just search YouTube and find the installation instructions. So Power BI Desktop, click on that. Close this and just say File, Save As. Just go to Downloads and give some file name; I will just say t20.

Then you will go to Get Data, click on More, and import the entire folder of CSV files. Now if you look at our Downloads folder (check the video description below, I have given the instructions to download the CSV files), either you have used Bright Data to capture these CSV files or you can use the ready-made files which I have given to you. Here in this folder you see five CSV files. For dim players we have two versions, one with images and one without, so I'm going to delete the no-images CSV file because that's not going to be useful. Here I will click on Folder, Connect, and go to that folder and grab all those four files. So This PC, you go to Downloads, t20 CSV files. Okay, it's going to import all four files.

Now here you click on Transform Data to perform data transformation in Power Query. Power Query is a component inside Power BI that allows you to do data transformation. Here, right-click and say Duplicate. I'll tell you why I'm doing that, but go to the first step and then click on this Binary. When you click on that, it's going to expand that file. If you look at the steps on the right-hand side, the first file is dim match summary, and when we clicked on it, it expanded that particular file. So I will just call this dim match summary. Similarly, I'll duplicate these raw steps multiple times, and then here I will expand dim players. Once again click on this and it's going to expand it. I will call it dim players, and do similar things for the next two files.

Now that I have expanded all the files, I will quickly look at the data and perform some transformation. When I'm looking at dim players, I see it did not recognize the column names properly. This can happen.

So what you will do is you will say go to the transform tab and then use first row as headers. So see now it is using name, team, image etc. In the previous step if you look at it, it did not have that right.

It was having column 1, column 2 etc. And now I said use this first row as a header row. So after that it looks something like this.

When I glance through the data I notice a couple of issues. Number one is that the player name has this (c) in brackets. This means captain, basically: in the cricket scorecard, if someone is a captain, you will see c in brackets. I don't need that, I just need the player name. So you are going to apply a transformation where you use the Extract option and say Text Before Delimiter. The delimiter is the bracket, right? You want to get the text before the bracket. So in this case, if you get the text before the bracket, you will get only Shakib Al Hasan. You say OK, and you see that the c is gone. In the previous step you see Shakib Al Hasan with c in brackets, but in the next one that c is gone. It is a usual practice to trim the data after this delimiter step, so you will say Format and Trim, and just in case there are extra spaces, it is going to remove those. One other thing I do is sort these values. If I say Sort Ascending and quickly glance through the values, I find some duplicates. See, Matthew Wade: the same record, for some reason, appears two times, and you want to remove these duplicates. In real life, you will always find these kinds of data problems.

Okay. So I'm going to now remove the Sorted Rows step here, and then I will right-click and say Remove Duplicates. That will remove the duplicate values. See, right now the row count is 213, and in the previous step we had 219 rows.

So it removed six duplicates. Many times what will happen is you are building dashboard. At that time, you will encounter issues and then you have to go back to Power Query and perform this data transformation. Okay.
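For readers who prefer to do the same cleanup in the notebook rather than in Power Query, here is a rough pandas equivalent of the three steps above (text before the bracket, trim, remove duplicates). The column names and the sample rows are placeholders, and deduplicating on the name column alone is an assumption about what the Remove Duplicates click did.

```python
import pandas as pd

# Placeholder rows: one captain marker, one stray space, one duplicate.
players = pd.DataFrame({
    "name": ["Player One(c)", "Player Two ", "Player Two ", "Player Three"],
    "team": ["Team A", "Team B", "Team B", "Team A"],
})

# Text before delimiter: keep everything before the first "(".
players["name"] = players["name"].str.split("(").str[0]

# Trim: remove leading and trailing spaces left behind.
players["name"] = players["name"].str.strip()

# Remove duplicates, here judged on the name column only.
rows_before = len(players)
players = players.drop_duplicates(subset=["name"]).reset_index(drop=True)
rows_removed = rows_before - len(players)
```

Comparing the row count before and after, as the video does with 219 and 213, is a simple sanity check that only the expected duplicates were dropped.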

I kind of knew about these steps, so I'm doing them directly. But in general, you will glance through the data, you will sort it, you'll do a bunch of things to figure out all these data issues.

Now that we are done with the dim players data transformation, we can move to the next one, which is the match summary. When I glance through this data, it all looks good. The only thing I'm going to do here is create a new column called stage. In this particular T20 World Cup of 2022, any match that was played before 22nd October was a qualifier match; otherwise it was considered to be part of the Super 12. So the logic we are going to use is: if the date is before 22nd October 2022, then the match was a qualifier, otherwise it was Super 12. I'm going to create a new column called stage, and it will be a conditional column. Let me just show you so that you get an idea. Click on Add Column and say Conditional Column. In that conditional column you have match date. The new column name, by the way, is stage; stage could be Qualifier or Super 12. And if match date is before... before which date? Well, 22nd October, so I'll go and pick 22nd October. I just know it; it's a fact that I know. If that is the case, then the stage was Qualifier. You know, in any tournament there are qualifiers and then there is the main tournament. Otherwise, it is called Super 12. And when you say OK, see, you get something like this.
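The same conditional column can be sketched in pandas. This is only an illustration with made-up rows and assumed column names; it uses a strict "before" comparison so that a match played on 22nd October itself counts as Super 12.

```python
import pandas as pd

# Made-up rows; the column names are assumptions.
matches = pd.DataFrame({
    "match_id": ["M001", "M002", "M003"],
    "matchDate": ["2022-10-21", "2022-10-22", "2022-11-13"],
})
matches["matchDate"] = pd.to_datetime(matches["matchDate"])

cutoff = pd.Timestamp("2022-10-22")

# Default every match to Super 12, then overwrite the earlier ones.
matches["stage"] = "Super 12"
matches.loc[matches["matchDate"] < cutoff, "stage"] = "Qualifier"
```

Using `<` rather than `<=` matters only for matches on the cutoff date, which is exactly the boundary case worth checking by eye after building the column.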

So all of these were qualifiers, and then the tournament started from here. Actually, it's not less than or equal to, it is less than, so I'm just going to modify the formula and hit OK. See, Scotland versus Zimbabwe was the last qualifier match, and from the next match onwards we had the main tournament, which is the Super 12. So just to summarize, I created this new stage column here, and then I'm going to change the type to text. Here ABC123 means text or number, so I'll change it to text only.

The next one is fact bowling summary. I go here and rename a few columns. For example, bowling team: I can just say team, so double-click this and type team. Then you have a column like 0s, so I'll just say zeros; I feel zeros is a better column name. Once again, you will see that you might be renaming a lot of columns, so here I will say fours, sixes and so on. Now see, for calculating different statistics for bowling performance, I am going to do some transformation with overs.

So many times you have 2.5 overs, which is 2 overs and 5 balls. So it is better if I have a column called balls. Then if I have runs, I can divide runs by balls to get the bowling strike rate and so on.

Therefore, I will create a balls column from this. The way you do that is you go to overs and first split the column by delimiter, and that delimiter is the dot. You just say OK, and it creates two columns, overs.1 and overs.2.

So, for example, see 2.5. I previously had 2.5 here, okay? It will now say 2 and 5 in the first and second columns, overs.1 and overs.2. Then in this second column you can replace null values with 0, right? 4 and null is basically 4 overs, so the null can be replaced with 0. You just go here, Replace Values, null, replace that with 0. And now you can add a new custom column.

So you will just say Add Column, create a custom column, and call it balls. And balls is nothing but overs.1 into six, correct? So two overs will be how many balls? Two into six, 12. And then if you have any additional balls, you add them, and that is overs.2. So you insert that and you get this particular formula. If you look at the different overs here, for example this first one is four overs, so that will be 24 balls, correct? See, 24 balls. This row is 18 balls. But let's check this one: this is 17 balls. 17 balls is 2 overs and 5 balls, so 2 overs is 2 into 6, 12, plus 5 is 17. See, 17 here.

Then let's perform some quick transformation in the batting summary as well. Again, I'm going to rename these columns to fours and sixes. And I don't like that this column is in text form, so maybe I can have a simple column called out. If the batter is out then the value is one, otherwise it's zero, so not out is zero. How do you do that? You go to Transform and say Replace Values, and if the value is not out, you replace it with zero. If the value is out, once again you say Replace Values, and out becomes one. So now we have a nice binary zero-one type of column: not out is 0 and out is 1. See, something like this. By the way, this balls column should be a number, so I'll just say this is a whole number. In the batting summary you also need to do a similar transformation for the captain, where you remove the (c) text. So once again you click this and say Extract, Text Before Delimiter, bracket, and you will see that the c thing is gone. See, the name had c in brackets, and in the next step it is gone. Now we are done with our transformation. You can go to Home, Close & Apply.

After data transformation we need to look into data modeling. For that, you can go to this particular tab and you will see it has already established some relationships based on column names. Let me just pull the fact tables into the middle. Okay, so I'm going to arrange them nicely.
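Going back to the overs and out columns for a moment, the two transformations can be sketched in pandas like this. The sample values, the column names, and the "out" / "not out" labels are assumptions for illustration; the real CSV may store them differently.

```python
import pandas as pd

# Overs stored as text such as "2.5" (2 overs and 5 balls); assumed format.
bowling = pd.DataFrame({"overs": ["4", "2.5", "3"]})

# Split on the dot into whole overs and extra balls.
# reindex guarantees a second column even if no value contains a dot.
parts = bowling["overs"].str.split(".", expand=True).reindex(columns=[0, 1])

# A missing second part means no extra balls, so treat it as 0.
parts = parts.fillna("0").astype(int)

# balls = whole overs * 6 + extra balls
bowling["balls"] = parts[0] * 6 + parts[1]

# Batting side: turn the text flag into a 0/1 number (labels are assumed).
batting = pd.DataFrame({"out_status": ["out", "not out", "out"]})
batting["out"] = batting["out_status"].map({"out": 1, "not out": 0})
```

Because `map` yields integers here, the resulting out column can be summed directly, which avoids the text-versus-number problem that tends to surface later when aggregating.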

Fact tables are basically the transactions, and dimension tables are basically the attributes. You can just Google it if you want to read more about fact and dimension tables and the star schema. Now you can see that when you hover your mouse cursor here, it established this link between the two tables based on match_id. It's a one-to-many relationship. Similarly, here also it established a relationship based on match ID. Now we will link the dim players table with this one. The player name is called bowler name here, and it is called name there. So I will left-click, drag and drop on name, and that will establish a relationship. When you hold your mouse cursor over it, see, bowler name and name are highlighted, which means these two tables are linked through that particular column. Same thing here: batsman name is linked to name. And that's it, our data modeling in this case is pretty simple.

Once data modeling is done, the next step is to create DAX measures.

So DAX measures are something that we'll be using in building the actual visuals. Okay. So for that here, I will create a category or a folder where I can keep all those DAX measures.

So I will click on Enter Data here and I will call the table key measures. Here it says key measures; just ignore that, this is just a category where you're going to add all your measures. I will click on this and say New Measure. Now, we need a couple of measures, and I have given an Excel file (again, check the video description, you can download it from there) with a complete list of measures that you need for this project. The first one is total runs, and that one is pretty simple. You can click on this icon to expand the formula bar and Ctrl+scroll to make it bigger. Total runs is equal to the sum of all the runs in the fact batting summary table, right? So here, fact batting summary, runs, and that's it. That creates the total runs measure. This Column1 is not needed, so you can right-click and delete it. So total runs is one measure that we have created. Now let's create the other one.

The other one that we need is total innings batted. Once again click here, and I'm going to just copy-paste that formula, so Ctrl+C, Ctrl+V. It uses fact batting summary and takes a count of match IDs, and that will be the total innings batted. Now, what's the purpose of this measure exactly? If you have no idea about DAX etc., I would suggest that you check my course on codebasics.io for Power BI, where I have explained Power BI in detail with a lot of practical and fun learning, and where we complete a real-life project with one and a half million records. The course has received amazing reviews, so go check that out. I'm going to go quickly through DAX measure creation because this is not a detailed Power BI course.

Correct. That's why. All right.

So let's say I have created these two measures, and I want to quickly check them. What I can do is pull the player's name here, and then in that table I can add those measures. Let's say I add total runs, and I look for a particular player. For example, my favorite one, especially in the last series, was Suryakumar Yadav, and here it says he scored 239 runs. You can quickly check whether the measure is correct by opening the fact batting summary CSV file. When that file is open, you can create a filter: go to Data and Filter, and in the filter type in Suryakumar Yadav. If you highlight his runs, you will see 239, and that's what I have, 239, right?

Similarly you can create other measures. So what other measures do we have here? Let's check. The other one is total innings dismissed. Here I will say New Measure, total innings dismissed is equal to the sum of (once again, you can Ctrl+scroll to see the whole thing) batting summary, out. So, how many times the player got out. When I pull total innings dismissed in here, see, I'm getting an error, and if I click on it, it says that the function SUM cannot work with values of type String. That means we created this measure based on the out column, and it looks like our column is not a number. So you can click on Home, Transform Data to go to Power Query, and you see the out column in fact batting summary is ABC, which means it's text. You can change it to a whole number, say Close & Apply, and then it will show up fine. So Aaron Finch was out two times, and you can see all these statistics.

In a similar fashion you will be creating all the rest of the DAX measures. Once again, this file is given to you. Download it; I want you to create all these measures, and once you create them it's going to look something like this. You will also group them in these kinds of folders. So how do you group them? Let me show that. Let's say you have these three measures and you want to group them. What you can do is enter a display folder here: you can say batting, and it will put the measure in a batting folder, and then you just drag and drop the others. Drag it here, drag it here. This way, see, you have a nice batting folder. So your goal, by the way, this is an exercise for you. Folks, this is not hard.

Don't worry. You can do it. Your goal is to create all the measures.
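As a cross-check outside Power BI, the three measures shown so far (total runs, total innings batted, total innings dismissed) can be reproduced with a pandas group-by. This is a sketch with placeholder players and assumed column names, not the project's data. It also mirrors the type problem from the video: the out column starts as text and has to be converted before it can be summed.

```python
import pandas as pd

# Placeholder rows; "out" is deliberately stored as text at first.
batting = pd.DataFrame({
    "match_id": ["M001", "M002", "M003", "M001"],
    "batsmanName": ["Player A", "Player A", "Player A", "Player B"],
    "runs": [30, 12, 45, 8],
    "out": ["1", "0", "1", "1"],
})

# Summing a text column does not give a count of dismissals,
# so convert it to whole numbers first.
batting["out"] = batting["out"].astype(int)

measures = batting.groupby("batsmanName").agg(
    total_runs=("runs", "sum"),            # sum of runs
    innings_batted=("match_id", "count"),  # count of match IDs
    innings_dismissed=("out", "sum"),      # sum of the 0/1 out flag
)
```

Filtering one player's rows in the CSV and adding up the runs by hand, as done for the 239 check above, is the same validation idea applied to a single row of this table.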

You know, I just showed you two or three measures, but you have to create all these measures on your own and put them in their respective folders. Once you are done creating measures, you also have to create some calculated columns. Calculated column is just jargon, actually; it's nothing but something similar to an Excel formula. For example, let's say I go to this CSV file and remove the filter, and I want to know the total runs from boundaries. Let's call that boundary runs. I can use a formula in Excel, correct? Whatever number of fours you have into four, plus whatever number of sixes you have into six, will be your boundary runs. And if you drag this formula down like this, you will see the boundary runs for every row. For example, this guy scored one four and one six, so that will make it ten runs.

Similar to this Excel formula, we are going to create a calculated column with DAX. So let's go to our file here. By the way, this visual was just for your validation, so you can delete it. Here in the batting summary I will create my first calculated column, which is boundary runs. How do you do that? You go to this table view and say New Column. This is the name, boundary runs, and boundary runs is equal to fact batting summary fours into four plus fact batting summary sixes into six. If you have a little bit of Excel knowledge, this is pretty straightforward. So you create a new column here, and once again, for validation, you can look at any row. See this one, for example: this person, Pat Cummins, hit two fours and one six. Two fours is two into four, eight, and one six makes eight plus six, fourteen. So fourteen runs, pretty straightforward. Create all these three calculated columns. And once again, if you want to look at the final PBIX file, which I have given (again, check the video description, you are given all the assets), that final file has all those measures. Anything which has this symbol is a calculated column, so see, it has boundary runs. It has pretty much everything that you need.

The visual here correctly represents how much attention dashboarding gets, and now we are going to start dashboarding for our project.
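The boundary runs calculated column from a moment ago is a row-by-row formula, so it maps directly onto a pandas column expression. The sketch below uses placeholder player names and assumed column names; the first two rows reuse the arithmetic from the video (one four and one six, then two fours and one six).

```python
import pandas as pd

# Placeholder rows; column names are assumptions.
batting = pd.DataFrame({
    "batsmanName": ["Player A", "Player B", "Player C"],
    "fours": [1, 2, 0],
    "sixes": [1, 1, 0],
})

# Row-level formula, like dragging an Excel formula down a column:
# boundary runs = fours * 4 + sixes * 6
batting["boundary_runs"] = batting["fours"] * 4 + batting["sixes"] * 6
```

The difference from a measure is that this value is computed once per row and stored, whereas a measure is aggregated on the fly for whatever rows a visual is showing.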

When you work as a data analyst in any company, usually business managers will provide you with some kind of rough mock-up showing how they want to see the dashboard. Here they have given this image, which they might draw on a notepad with a pen, where they want to see different tabs for power hitters, anchors, fast bowlers.

Remember, previously we covered different criteria for each of these categories. When you select any category, let's say hitters or anchors, you will see the players in that category along with statistics such as their runs, strike rate, batting average, etc. And on the right-hand side, you want to have a criteria filter where, based on our criteria, we can see a list of players here.

And then at the bottom, you will have some kind of trends for various statistics and bottom right would be the scatter chart between strike rate and batting average. Now they can provide this mockup in a rough format like this or sometimes people use powerpoint or some other tool and they will just draw their rough ideas and as a data analyst it is our responsibility to communicate back. You know communication is a very important skill when it comes to data analyst career.

So We are going to provide you all these mockups. The name of this file is mockup.txt which you will find in the video description below. Once again, when you download all those files, you will see this file. Now I'm going to take this stage2.pbix file which has all the DAX measures created.

You can also do the same. You can just get that file and I will start building our visual. So here, the first thing that we are building is the page for power hitters.

What I will do is go to the dim players table, grab the name of the player and just drag and drop it here. It shows me all the names of every single player. And if you check our mockups, we want to have certain columns in this table, such as, let's say, team, then the batting style, then the innings batted. So you can go to key measures and pick innings batted, and then the total runs that these players made, the total balls they faced, strike rate, their batting average and so on.

Once you have all these columns, you want to look at your criteria for your openers, your power hitters. Here it says their batting average should be greater than 30 and strike rate should be greater than 140. So now you will use this filter pane to filter those players, because this list is showing all the players, right? You want the batting average to be greater than 30, so I say batting average is greater than 30, Apply Filter, and see, it filters the players. Then you want the strike rate to be greater than 140, so I go to strike rate and say it is greater than 140. Then innings batted greater than 3, and boundary percent greater than 50.

So innings batted, where is it? Okay, innings batted should be greater than 3. And then boundary percent is greater than 50. 50% is 0.5, right? So that is greater than 0.5.

Apply Filter, and then the last one is that batting position should be less than four. So in whatever matches they played, their batting position should be somewhere at the top of the order. So it is less than four, and when you apply all these criteria you see a nice list of players who can be your potential power hitters in your final 11. You can also make some visualization-related changes. For example, I want to see total runs as a horizontal bar chart. So you go to the Visualizations tab, pick total runs, choose Conditional Formatting and turn on Data Bars, and the bars look something like this. You can sort these columns as well, by the way, so if you click on a column you can see the player with the highest runs. We all know Jos Buttler; I still remember their innings against India.

Jos Buttler and Alex Hales just killed it in the semi-final. So you see they have pretty good runs, strike rate and so on. Now folks, building the whole dashboard is a process of a few hours, and I don't want to waste your time going over all of that. So I'm just going to show you the stage 3 file.
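The power hitter filter applied a moment ago can be written down as a small pandas sketch, which makes the criteria easy to tweak. Everything here is illustrative: the players and numbers are invented, the column names are assumptions, batting average and strike rate use the usual cricket definitions (runs per dismissal, runs per 100 balls), and boundary percent is assumed to be boundary runs divided by total runs.

```python
import pandas as pd

# Invented per-player totals; not real tournament data.
stats = pd.DataFrame({
    "name": ["Player A", "Player B", "Player C"],
    "innings_batted": [6, 5, 2],
    "innings_dismissed": [4, 5, 1],
    "total_runs": [200, 120, 90],
    "balls_faced": [130, 100, 50],
    "boundary_runs": [120, 40, 60],
    "avg_batting_position": [1.5, 2.0, 3.0],
})

# Derived statistics (assumed definitions, see the note above).
stats["batting_avg"] = stats["total_runs"] / stats["innings_dismissed"]
stats["strike_rate"] = stats["total_runs"] / stats["balls_faced"] * 100
stats["boundary_pct"] = stats["boundary_runs"] / stats["total_runs"]

# Criteria from the video: average > 30, strike rate > 140,
# innings batted > 3, boundary % > 50%, batting position < 4.
power_hitters = stats[
    (stats["batting_avg"] > 30)
    & (stats["strike_rate"] > 140)
    & (stats["innings_batted"] > 3)
    & (stats["boundary_pct"] > 0.5)
    & (stats["avg_batting_position"] < 4)
]
```

In this toy data only Player A survives: Player B misses the batting average cut, and Player C has strong rates but too few innings, which is exactly why the innings threshold is part of the criteria.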

So once again, in whatever files you have downloaded (check the video description), you will find this file called stage 3, and it has all the raw visuals created. You will have a page for power hitters, for example, then a page for anchors or middle order. How do you create this page? Well, when you come here, this will be page one, so you name it power hitters or openers. Then you create a new page for anchors or middle order: you just double-click and type in anchors middle order. Then you drag and drop and start building the visuals here, and when you build the visuals in the raw format they will ultimately look something like this.

Now, if you are new to Power BI and you don't know the basics of these visuals, you can go to codebasics.io and take our Power BI course; it covers all of those concepts in detail. But for this video I'm going to give you this stage 3 file so that you can check the various properties of the visuals in case you don't know how to build them. When you click on a visual, it will show you what kind of visual it is, so this one is a card visual. And if you want to look at the formatting, you can just click here and look at the various properties. So we are going to assume that you have built all these raw visuals, and now it is time to beautify them, to make them look good and connect the different pieces. Once you beautify your dashboard, it's going to look something like this. This is what we have built, but you can build it as per your own preference for colors and visual behavior.

We have provided you with all the dashboarding tips here, which you can use to build the dashboard. See, I'm not going to spoon-feed you, because when you go work in the industry, it becomes essential that you use your Googling skills to figure things out.

If you are trying to change the color of some visual, you can just Google it. Googling is an art that can be tremendously helpful, and this is a unique opportunity for you to use that art to learn certain things on your own while we provide you with full assistance.

So read dashboarding tips and then try to make the entire dashboard look pretty good. In our Power BI course also we have an entire chapter on designing an effective dashboard. Now you will be glad to know that this particular dashboard is designed by one of our students.

So he took our Power BI course. His name is Ashish Babariya. He also participates in our codebasics.io resume project challenges.

So let's get started. If you go to the website, you will find the resume project challenges, which are free for everyone to participate in. He has won our first prize in two challenges. And if you click on this LinkedIn icon, you will see his posts, where he builds beautiful, beautiful dashboards. Now, Ashish's background, if you check it: he's a trade agro specialist.

So he comes from a non-technical background. He learned Power BI. Look at his background.

He's a trade agro specialist, so he doesn't have formal training in data analytics etc. He learned Power BI mainly from the codebasics channel and various other resources, and through his participation in the resume project challenges we were able to spot his talent. He is now working with us as a freelancer and helping us with all these projects. So look at the quality and professionalism of this dashboard. He learned things on his own at a later part of his career, and you can become like Ashish too.

Now you can click on the various player categories. In power hitters, you want to have a few players from this particular list, and if you look at the filter criteria, see, these are all the filter criteria that we had. Now look at the filter criteria for fast bowlers, for example. In fast bowlers you want the bowling economy to be less than 7, and they should be taking a wicket every 16 balls. So if you go to our fast bowlers page (Ctrl+click on that) and click on this visual, you will see all the filters that we have. See, bowling strike rate is less than 16, dot ball percent is greater than 40% and so on. So you see all those details here. And if you want to modify any criteria, it's super easy: you just go here, type in a value, and you can play with different things. In the next part of this session, what we're going to do is invite Tony Sharma, who is a subject matter expert and in charge of this project.

He will help us decide the final 11. So if you Ctrl+click on it, see, we have our final 11 almost decided, but you can modify certain criteria and substitute different players. And if you know cricket, just look at this team. It looks pretty solid, like an unbeatable team. So see, data analytics with Power BI can help you generate data-driven insights that can make a huge difference to the problem that you're trying to solve. We are going to provide you with this final file also.

If you have questions about some visual behavior, you can click on it and check the different visuals that we have. You can look at the format and the various properties that these elements have. And I will quickly play a time-lapse view so that you have some idea of how this visual was built. So Nick, we are just a few minutes away from saving the world.

How does this dashboard look? Dashboard looks amazing but can you show me the best 11? Not yet.

We are getting there. Just give me a few minutes. Let me explain how I have done this. So in the previous session we had, I explained what are the parameters I'm using, right? And now you can see these parameters come into action, come live.

So this is how I've created the parameters. You can see how I filtered the players for the openers. These are the power hitters. And, you know, I also have like a solid graph of them, like to understand the consistency, to understand their playing trend and all of that. And I also have a scatter plot to show how their batting average fares for the strike rate.

So you can see these are the players I've got here, correct? You have players who can strike the ball at 160, even close to 170, and at the same time give me an average of 35. You can see that. And Jos Buttler here gives me the highest average, and he is a good striker as well. He strikes at 140 plus, which fits our parameter.

So you can see that these players are there. Out of these players, I'm going to select Jos Buttler because you can see he's consistent across all the matches. He has played all the matches pretty decently and he's a good wicketkeeper as well. We need a wicketkeeper, so he's going to be our wicketkeeper-batter.

And his partner, Alex Hales, is a good choice for a second opener. But again, he's not consistent in all the matches. Also, I need a left-hand combination with a better strike rate. So I'm going to choose Rilee Rossouw from South Africa.

He's a better option for me. He can strike the ball like crazy. I can show you the combined performance of both. I'm just selecting both.

So if these two players, Rilee Rossouw and Jos Buttler, play together, they will give us 40 runs on average at a strike rate of 150 plus. So if the two bat without losing a wicket, we'll hit our target of 180: if they bat 120 balls, they will give us 180 runs. And they will stay in for at least four overs on average, because the average balls faced, you can see, is 23.9.
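The arithmetic behind those two claims is short enough to spell out. This is just a worked check of the numbers quoted above (a strike rate of 150, 120 balls, and 23.9 average balls faced); the function name is made up for the example.

```python
def projected_runs(strike_rate: float, balls: int) -> float:
    """Runs scored at a given strike rate (runs per 100 balls)."""
    return strike_rate / 100 * balls

# A full T20 innings is 20 overs of 6 balls each.
balls_in_innings = 20 * 6

# Batting all 120 balls at a strike rate of 150 reaches the 180 target.
full_innings_runs = projected_runs(150, balls_in_innings)

# 23.9 balls faced on average is just under four overs.
overs_per_stay = 23.9 / 6
```

So the 180 figure assumes the pair keeps that strike rate for the whole innings, while the 23.9 balls figure says that on average they last about four overs.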

And they score 60% of their runs in boundaries, which is perfect for what we need. You can see the consistency is pretty much there. It's dropping here and there, but together as a package they will give us what we need. I love this feature that you can select two players and see their statistics on an average basis.

Because this way, let's say if these two players are playing and if I have questions on their partnership statistics, I can get the view of those numbers easily here. It is not exactly like a partnership, but it will give you their combined performance. Got it. I got the point.

Yes. And so I'm also going to select Alex Hales just as my reserve opener. So I'm potentially going to select three players for this position. But the two I'm going to play are Jos Buttler and Rilee Rossouw.

All right, let's move to the anchors, where we'll select three players. Again, you can see the filters I applied on this data: batting average, strike rate, innings batted, exactly like we discussed.

Here the interesting part is we have Virat Kohli on top with the most runs, followed by Suryakumar Yadav. So let's look at this chart, the scatter plot. Virat Kohli is clearly the winner because he gives us a lot of runs. He's a run machine.

We definitely need to pick him. And for the second player, I couldn't think of anyone better than Suryakumar Yadav, because you can see his average is 60. He could give us 60 runs on average.

And he's striking at 190. This is the best we have in the team. Even our openers did not score at that strike rate. This guy can come in and propel the score.

So we need him in the team for sure. And these two players are solid. They have a good partnership. It's a great idea to play them together. For my fifth position, I have three options: Lorcan Tucker, Glenn Phillips, and Daryl Mitchell.

So based on the statistics, based on the averages, it's easy for me to choose Glenn Phillips. He has a strike rate of 160, which is really high for the position, and he scores at an average of 40. So that's the reason I go for him. He's clearly my number 5.

Okay. Right. Let's move on to the next one. This is going to get slightly tricky because I have many players in this position whom we can pick.

So like I said, I'm looking for a batting all-rounder here. And that batting all-rounder could be a batsman who can score runs at a very high strike rate and at the same time anchor the innings, or this person can be a fast bowler or a spinner as well. In order to justify my selection, I will come back to this place again, but I'll select my specialist fast bowlers first, so that it becomes very easy for me to decide whether I need a fast bowler in this position, whether I need a bowler at all, and whether I need a good bowler or a good batsman in that position.

So I will select my fast bowlers first. Here my selection is super, super easy. I'm going to select this guy, Sam Curran. You can see his economy is 6.53 and his bowling average is 11.38, which means he gets a wicket for every 11 runs that he concedes.

And his bowling strike rate is also staggering. He gets a wicket every 10-odd balls, which means if he bowls his full four overs, we are definitely getting two wickets. And look at this guy. He has got 11 wickets at an economy, a crazy economy, of 5.3.

He gives less than six runs an over. We definitely need him on our team. He's a fast bowler, and he picks up a wicket in less than 10 balls. So these two are definitely in our team.
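The three bowling measures used here are bowling average (runs conceded per wicket), bowling strike rate (balls bowled per wicket), and economy (runs conceded per over). A minimal sketch of how they turn into a per-spell expectation follows; the helper names are hypothetical, and the strike rate of 10 is an assumption standing in for the "10-odd balls" mentioned above.

```python
BALLS_PER_OVER = 6


def expected_wickets(overs: float, bowling_strike_rate: float) -> float:
    """Wickets expected in a spell; strike rate is balls bowled per wicket."""
    return overs * BALLS_PER_OVER / bowling_strike_rate


def expected_runs_conceded(overs: float, economy: float) -> float:
    """Runs expected to be conceded in a spell; economy is runs per over."""
    return overs * economy


# Sam Curran's full quota of four overs, using the figures quoted above
# (economy 6.53, a wicket roughly every 10 balls).
print(expected_wickets(4, 10))                    # 2.4
print(round(expected_runs_conceded(4, 6.53), 2))  # 26.12
```

Four overs is 24 balls, so a strike rate near 10 gives a little over two wickets per spell, which is where the "two wickets" claim comes from.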

And of course, Shaheen Shah Afridi. How can we ignore him? The kind of player he is.

He's a left-arm fast bowler, and he can really rattle the batsman. He's one of my favorites. Yeah.

So you're picking these three. Tim Southee is good, but if I have to pick three, I'm going with just these three. So these are the parameters I've applied here.

You can see innings bowled, bowling strike rate, bowling average, exactly like we discussed in the parameter session. So let's see their combined performance just to get an idea. Sam Curran, Anrich Nortje, Shaheen Shah Afridi.

You can see these guys will pick a wicket for every 11 runs they give, which means if these three bowl all the 20 overs, they will get the full team all out for about 110 runs, 113 runs. And they will get a wicket about every 11 balls, so if they bowl all the 20 overs, the team is all out. Now imagine they're bowling the first six overs. How many wickets will they take? If they pick a wicket every 12 balls, six overs is 36 balls, so they will pick about three wickets on average. It would be awesome if you pick three wickets in the powerplay.

That's going to be amazing. They will definitely scalp the top three. They will open the middle order for us.

And look at the economy. The combined economy is six, which means if they bowl the first six overs, they will be giving away just 36 runs. And the dot balls they produce are close to 50%, which means if they bowl all the 20 overs, the batsmen are able to score off only half of them, like only 10 overs' worth of balls. For the other 10 overs, they're scoring nothing. It's a dot ball.
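The combined figures for the three fast bowlers can be turned into the same kind of projection. The sketch below is an illustration under stated assumptions, not the dashboard's formula: `project_attack` is a hypothetical helper, and the inputs (a wicket every 12 balls, economy 6, 50% dot balls, 11.3 runs per wicket) are the rounded numbers quoted in the discussion.

```python
BALLS_PER_OVER = 6


def project_attack(overs: int, strike_rate: float, economy: float,
                   dot_ball_pct: float) -> dict:
    """Project wickets, runs conceded, and scoring balls for a block of overs."""
    balls = overs * BALLS_PER_OVER
    return {
        "wickets": balls / strike_rate,                      # balls per wicket
        "runs_conceded": overs * economy,                    # runs per over
        "scoring_balls": balls * (1 - dot_ball_pct / 100),   # non-dot balls
    }


powerplay = project_attack(overs=6, strike_rate=12, economy=6, dot_ball_pct=50)
full_innings = project_attack(overs=20, strike_rate=12, economy=6, dot_ball_pct=50)

print(powerplay["wickets"], powerplay["runs_conceded"])  # 3.0 36
print(full_innings["scoring_balls"])                      # 60.0

# Ten wickets at roughly 11.3 runs per wicket gives the all-out total.
all_out_runs = 10 * 11.3
print(round(all_out_runs))                                # 113
```

With these inputs the powerplay yields three wickets for 36 runs, and over a full innings only 60 of the 120 balls are scored from, which matches the "10 overs of dot balls" point.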

These three are the major strength of our team. All right. So let me come to the all-rounders now before I go to the finisher role. We have three solid fast bowlers, so in my all-rounders I'm going to have the spin element.

But at the same time, I want these two players to bat as well, at the highest strike rate. So that's the reason, as I explained earlier, I've kept the strike rate parameter at 140. If I relax the strike rate, let's say I set it to 100, I'm going to get more players. I even get Ben Stokes here, but I'm considering a strike rate of 140. Oh, got it.

I was expecting to see Ben Stokes. Now I know why he's not showing up. Yeah.

Yeah. I will explain why Ben Stokes is not here on the other page as well, because I wanted him in that role earlier. So you can see from this graph, Rashid Khan is really good in terms of, you know, the bowling strike rate. I mean, I can't say it's really good, because bowling strike rate should be lower, right? It's not good. And you have Sikandar Raza, who has the economy. We are kind of looking for people in this zone.

This zone is the best zone: players who have a lower economy and a lower bowling strike rate. So Shadab Khan is really the winner.

He definitely deserves a spot because you can see his performance. He's quite consistent as well. You know, his bowling average is good. Whenever he got a chance to bat, he has batted well as well. This is his bowling and batting performance.

It's of high consistency. But Shadab Khan, can I play him at number seven? I'm not sure.

Because if my number six did not bat well, I want someone more capable to bat at number seven. So I would play Shadab Khan at number eight. And at number seven, I would play Sikandar Raza.

This is the person I'm going to play at number seven, because look at his strike rate. It's 147, and he also has a very high batting average of 27 runs for a number seven. He normally played at number five or number four, but for number seven, this kind of average with this strike rate is great. So that's going to be my number seven and number eight. I've got my nine, ten, and eleven as well, so I just need to select my number six. Again, choosing this position is slightly difficult. I will show you why. Let me take out the filter on the strike rate.

You would see that I also have Ben Stokes here. Ben Stokes could not make this list because his strike rate was too low in the series. His strike rate was just 105. If a batsman is scoring at just a run a ball, we cannot have him at number six, because this position might require him to hit like crazy. So that's the reason Ben Stokes does not find a place. So I'm going to set it to greater than 130. And again, let me say whom I'm going to pick from here.
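The effect of the strike rate filter can be sketched in Python. This is a simplified stand-in for the Power BI slicer, not its actual implementation: the `Batter` class and `filter_by_strike_rate` helper are hypothetical, the comparison is assumed to be strictly greater-than, and only the two strike rates quoted in the discussion (Sikandar Raza 147, Ben Stokes 105) are used.

```python
from dataclasses import dataclass


@dataclass
class Batter:
    name: str
    strike_rate: float  # runs per 100 balls


def filter_by_strike_rate(players: list[Batter], minimum: float) -> list[str]:
    """Return the names of players whose strike rate exceeds `minimum`."""
    return [p.name for p in players if p.strike_rate > minimum]


candidates = [Batter("Sikandar Raza", 147), Batter("Ben Stokes", 105)]

print(filter_by_strike_rate(candidates, 140))  # ['Sikandar Raza']
print(filter_by_strike_rate(candidates, 130))  # ['Sikandar Raza']
print(filter_by_strike_rate(candidates, 100))  # ['Sikandar Raza', 'Ben Stokes']
```

Stokes only appears once the threshold drops to 100, which is why he is missing from the all-rounder page at both 140 and 130.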

So we have three fast bowlers, two spinners. Glenn Maxwell looks like a good option, but he's a spinner again. I'm not sure whether my sixth bowling option should be a spinner, but his bowling average is good.

You can see his economy is six. His bowling strike rate is 6.33, which means he has got a wicket almost every six balls. So he's one of my options. My other option is Marcus Stoinis.

I'm going to play with this guy. His bowling average is not that good. His economy is not that good, but he's a good striker of the ball.

And he is someone I can also press to anchor the game. So I could go for Marcus Stoinis if I want a better batsman with a high strike rate, or if I want someone more balanced, I could go with Hardik Pandya. This guy will give me a good batting average, but his strike rate is not that great, and his consistency is also not that good in this World Cup. But his bowling is good.

If I really need the sixth bowling option, I would go for Hardik Pandya. But with the kind of bowlers that I have, Shaheen Shah Afridi, Sam Curran, Anrich Nortje, and all those players, I don't think I would need Hardik Pandya in my team. Then I will go for Marcus Stoinis.

So, I will make my decision easier by showing the final 11. If I go to the final 11, I have already picked the top 11, like I said. So, you can see, this is our batting 11. Jos Buttler and Rilee Rossouw will be opening the game. Virat Kohli, followed by Suryakumar Yadav, Glenn Phillips.

And at number 6, I've selected Hardik Pandya here. But I want to go with, let's say, Marcus Stoinis. You can see the combined performance of the team. Batting average is 37.76.

The team on average scores 37 runs. Strike rate is 150. You know, all the facts are given here. But since my sixth batting position should increase the batting strength, I'm focusing more on the batting now, because our bowling is already great. I'm selecting Marcus Stoinis and removing Hardik Pandya.

You would see my batting average improved from 37 to 39.6 and my strike rate improved from 151 to 154.4. So this is the reason I would go for Marcus Stoinis. But I can also go for Maxwell, right?

You can see that the performances of Maxwell and Marcus Stoinis are very much comparable. They have a very similar strike rate, almost identical. And Maxwell has bowled well. He has taken his catches well. But he's a spin bowler.

He's a right-arm off-break bowler. So since I need my sixth bowling option to be someone who can bowl pace, I'm going for Marcus Stoinis. But...

On a given day, we will still have options. You can still choose Glenn Maxwell on a given day, depending upon the pitch. If the sixth bowling option can be a spin, then my second option is Maxwell. If I need someone more in the bowling side, if some of our fast bowlers get injured, then I would choose Hardik Pandya for number six. So I would keep all these three options open.

But right now, in my final 11, I would have Marcus Stoinis. And in the opening, I would go for Rilee Rossouw. But if you want Alex Hales, he's also available for selection. Got it.

Just by looking at this list, I'm feeling super excited. Anyone who is following cricket, if they look at this list of players, they will be like, this team is unbeatable. This is the best team that we got on our planet.

And I'm sure we would win this game and save our planet. So Tony, this team is great, but I know that Mitchell Starc is a pretty good fast bowler from Australia. Can you tell me why he's not in the final 11? Sure.

I think he must have not met our standards. Let me check. I'm pretty sure he's there, but he didn't meet our standards.

So if I recall correctly, maybe his bowling strike rate is not under 16. So let me take this out. And maybe he didn't bowl dot balls like we want, and his bowling average might not be less than 20. I should remove that as well. And his economy is definitely not less than 7. Let me take that out and let me see if we have him in the list.

I don't think we have him yet. So maybe he didn't even play the four innings. Yeah, you can find him here now. You can see his economy is 8.5, which is more than, you know, 7. His bowling average is 34. His bowling strike rate is 24. So that's the reason why he's not there.

And he has also bowled in just three innings. We wanted at least four innings. Yeah, that's the reason.
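The check walked through above can be expressed as a small rule function. This is a sketch under assumptions: `failed_criteria` and the dictionary keys are hypothetical names, the thresholds are the ones mentioned in the conversation (at least 4 innings, economy under 7, bowling average under 20, bowling strike rate under 16), and the dot-ball filter is left out because its threshold is not stated.

```python
def failed_criteria(stats: dict) -> list[str]:
    """Return the fast-bowler selection criteria that a player does not meet."""
    checks = [
        ("innings bowled >= 4", stats["innings"] >= 4),
        ("economy < 7", stats["economy"] < 7),
        ("bowling average < 20", stats["average"] < 20),
        ("bowling strike rate < 16", stats["strike_rate"] < 16),
    ]
    return [label for label, passed in checks if not passed]


# Mitchell Starc's figures as read off the dashboard in the discussion.
starc = {"innings": 3, "economy": 8.5, "average": 34, "strike_rate": 24}

for reason in failed_criteria(starc):
    print("fails:", reason)
```

With these numbers all four checks fail, which is why each filter had to be removed in turn before he showed up in the list.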

All right, it's clear now. Okay, here's your final 11. Nick, go get us a cup and save the planet. The future of this planet is focused on these 11 players. They are traveling to the wilderness, out in the dark, to bring us light. And may we all, citizens of data, hope and pray, our analysis and insights shall work.

Defeat these opponents and bring us glory. Three weeks later. You defeat us in cricket, you get Earth.

If you lose, join me as an intern.

Now comes the most interesting part of this entire project series, which is an exercise. By working on this exercise you will be able to win an exciting prize, which is a 20% scholarship on one of our premium courses on codebasics.io.

For the exercise, what you need to do is this. Number one: in the tooltip for players, let's say for Suryakumar Yadav, you are seeing India versus Pakistan, India versus whatever team. You all know he is from India, so you need to remove that IND and just have versus Pakistan, versus Sri Lanka. So you need to update that tooltip. That is exercise number one.

The second one is you have to update the visual look and feel of the entire dashboard. You can change colors, the placement, you know, different visual aspects of the entire dashboard. So come up with your own design and colors.

Number three is providing some more insights. So in addition to whatever we have covered in the dashboard, try to add more insights. And then once you have that dashboard created, you can write a nice LinkedIn post, tag me and Hemanand, and add a particular hashtag so that we know that you have submitted this.

We are going to provide some sample LinkedIn posts in the description below. These are for resume projects, but basically you can write a nice post, upload your project on NovyPro, or maybe just shoot a simple video and make a LinkedIn post on it.

I wish you all the best and I hope you learned a lot of different things in this project. This is going to be an excellent project for your resume, and make sure you write that LinkedIn post. That way you can draw attention from potential recruiters. If you like this video, give it a thumbs up and please share it with your friends. If you have any questions or comments, post them in the comment box below, or we have a Discord server for codebasics, so you can go there and post your question as well. Thank you.