Transcript for:
Airline Data Analysis Project Overview

Hello and welcome to my channel Tech Classes. In this video we are going to make Airlines project. This is also a complete project. If you have not seen my earlier Data Answers project which I had made by piping then go and see it now . Many people have used it. I had said in the comment area that I have also brought a project on the sequel, similarly, a complete project in which we will create a report, create a project regarding a business problem and then in the report, we will present its analysis in D form of words which In this way a complete project of data analysis is divided, so in this way I have created this project by doing it using sequel and python, we will extract the data in it from the database and will form the analysis on it using sequel and python so this For the project, you need to have knowledge of Python and Sequel and how can we write a sequel by using Python, so this project is also a complete data and its project, you can put it in the school, you can put it in the resume and by modifying it. You can add more things in this , so let's see the steps of the project , in which steps we will complete the project, so first of all we will understand the business problem. First of all, we will understand the business problem. I started the same in the first project also. We had taken the business problem and done all the analysis on its basis, so first of all we will understand the business problem, we will import the libraries which we will do further in the project, next database connection, you extract the data, we have to do that with that data. If a data base is present in the form of different tables, then we will first connect to the database, after that we will extract the tables. We will extract the data from it. Exploring the tables present in the database, first identify the variables. So first yes, then we will We will explore the data. When we get any database, we should explore the tables in it. Which columns are there, which data is there? As we form the idea, in a way, this is data exploration and you can use exploratory data. Analysis So we will first understand the database and which are the tables and which are the columns in it. By doing that we will reach our goal. Next analyzing will be by identifying the variables of D's tables. There will be two types of analysis, one will be basic analysis. And the second one will be Analyzing Occupancy Rate so when we will understand the business problem then you will know what is Occupancy Rate and how we can increase profit ability from Occupancy Rate. Last creating report with which gives results and analysis on d company then last we Whatever our findings will be in the report, whatever our results will be , we will present it in the form of words. I got some comments in this video that I asked how we can convert the project into the form of a report, so you do not have to do anything. You have to write whatever your analysis is in the report, whatever steps you have formed, it has to be represented in word form, as if you are making any project, you have to explain it to anyone, so now how? If you explain , then the form of your explanation will be there, the steps will be there, you have to write the same thing in the report. Who is the report for? Those who do not understand coding, those who are only concerned with the result, like they are stakeholders, management or they are clients. If there is a report for this then you can create such a report and if you are making the best report then you will have to create a report to represent your analysis in it. So let's start with understanding the business problem . So here I have created a business problem. The project we are making for an airline industry, which is our Python sequel project, is for an airline industry which is facing some challenges like here we have written Structure Environment Regulations Higher Flight Tax Increase Prices And tight labor market which is resulting in increase in labor cost, so what is happening to them is that their profit is decreasing and investment from their side is increasing, so we have to minimize it, we have to reduce it, this is our further goal. So it is written here that the result is company challenge they are looking for you conduct and analysis of their database which is our data which is inside it which the company has provided to us then find waste increase by occupancy rate which can help in average profit on the basis of So the profit to the airlines What happens is that their tickets are sold, they get profit from the bookings, so as it is an industry, it may have more than one aircraft and there will be some seats in it, so if those seats are fully booked, then It can make good profit but sometimes it happens that the flight is flying fine but not all the seats are booked till the source destination, some ETV remains there and there is no booking there, then there is loss for the pay and lines. If so, we have to reduce this thing, how will we increase the occupancy rate by seeing that its profit will also increase, so give this date is a complete business problem, here I create a report like the project I had created. Data in that also I had discussed a report, in this way this is also a report in which we will take a challenge, in this we had to do a toro analysis, it just means on one data, we had to do a complete analysis as to why the flood is happening, so in it we asked research questions. Hypothesis was taken for this but there is no such scene in this. An answer has to be provided for the question, so here we have included only challenge objective and business problem and after that our basic analysis will start, so whatever the challenge is, there are three challenges. Have environment regulations flight tax and tight labor market so here I have explained it in detail Give d objective What is the objective of the project Increase occupancy rate by increasing d occupancy rate They can enjoy d average profit on what and me you get d impact Of the challenge, they are fencing, which I just said, if we release your bansi rating, then our profit will increase but by what percentage, how will we calculate the occupancy rate, how will we increase your bansi rate, we will do all this in Python and With the help of sequels, you need to improve your pricing strategy, develop your customers through changes in market conditions and customer preferences, attract and retain customers, enhance customer experience, help you focus on providing a seamless and convenient experience to your customers, from booking to arrival, you need to differentiate yourself. Sales in highly competitive industry and increase customer loyalty is the objective of what we got yesterday, we have to increase an occupancy rate, increase profit, improve pricing strategy, this is the main objective of airline industry and on the basis of this we will make the project. So the goal of this task is to identify opportunities to increase occupancy rate which can ultimately increase the probability of performing flights. There can be some reasons like the condition is not right or its facility is not good enough. Because of this, he will have to get fancy rates, so we have to identify more flights like this, tell the company and do the analysis, so let's start with importing libraries, this is necessary, you create the project, so let's import first give SQL 3. By doing Import Pandas SPD, Pandas is a very good library which we can use for manipulation and for some visualization, here we will import the visualization library and nothing else. Next, we will now connect and give the database connection. So our database is loaded. Our connection has been created, both our cursors have been created, now we will see how many tables are there in our data base, which are the tables in which we have to do from master where tai = table and table in cursor dot patch and There are commands, you must know all this before doing this project, if you do not know then first go to the sequel, after that you will understand how we can do it, how we are executing, how we are getting the result, will create a list of tables. And it will return it, here we will enter it and whatever value is there on the first index, we will put it in the table list, then we call this line as list comprehension, it means to execute it completely in a single line, whatever the value of the list and Then we will print here the table list, you can see all the list of tables present in D database, these are the names of all the tables, Aircraft Data, Airport Data, Boarding Pass, Booking, Flight, C, Ticket, Flight and Tickets, meaning in one airline, one airline, whatever, whatever in the industry. Things are important, database has been created according to them, aircraft data is there in it. Everything is there in the data of the aircraft. Give the airport data. What kind of boarding pass. Whose boarding pass, from where to where. Booking. Only the data of the flight is there. Give the tickets. Flights and tickets. If we will explore all these tables, then after this we will see the data. Exploration So in data exploration, first of all we will see one by one in all the tables which columns are present and what type of value has to be kept in that column. First of all we will start from the first table which is aircraft data so to do aircraft Whatever result will be given directly to us [Music] In the sequel query, first of all we need sir data, once the data is there then we will simply execute the query with select star from table name so let's start from table name what is aircraft data and after that second The input is let's connection , what it will return will be the data frame, so we can do all the functionalities of the data frame on this, so first of all we will do that head, now we can see that starting here there will be five daily returns. There are three columns in the aircraft data , it means the airline , its range is given and its model name is given, just change the name, the airport code is given and in this the airport code is the booking reference in the booking s is the booking date. And give the total amount, how much amount has been spent, the data, you can do it, keeping your analysis in mind, if we have this type of data, then for which tour we should do it, how will we know that we have to do the sequel? How will we know that we have to do it in Python? So IT depends on the data and the business problem you are going to solve. The data here is in the data base and is in the form of many tables, so now we will CC it. By loading it in the file, then if it is done with Python then it can be done with Python too, but now the data is ours in the database and in real time and in real time, the company does not have a data base and in this way you can do it easily. There is some company, what needs to be analyzed, so we are doing it with both Python and Sequel, manipulation will be done with Python and we can show a lot of insights by using the functionalities of Python and from Sequel, we will set the data which There will also be retirement, right now the data here is small, so it is getting fat easily, but the data is in very large amount. In the company, if you have to write a query, then you should come to pack the data, even if you can do it with Python. If you are traveling then first is flight ID flight number scheduled departure scheduled arrival departure airport arrival airport status and actual arrival will be found but what should do means they have not flown yet so here we have given slash and kid of missing value. Here is the airport code, departure airport, arrival airport, from when till when, its timing is the schedule and its ID and number. Next table is our business economy and comfort aircraft code, how many seats are available in it, it is given here and the fare. Condition What is the fair condition of that seat means that seat comes in business class or economy class so next we will see okay we have ticket number flight ID fair condition so okay equals you every way have ticket number book You will get the reference and passenger ID ticket number from this column. This is the booking reference. You will get the booking reference from the booking table and this is the passenger ID so date. Complete table date they have in this data based on these tables. We will take the columns which can be important for us for further analysis. Types of Joe D Call by one, we will do all the tables and print the data type of its column Supreme Court. Now we will put it here, for that we are getting a cursor return of this type. Out of which we will have to pack the data first, then we will be able to print it. In which data type, that pen is present in the database. It is here in the database . Like this, it is in text format. You can see the integer types. So we can see the data in every We do this keeping in mind that there should not be any missing values ​​in it, the values ​​should not be wrong in it or its data type is correct, if we have to change the format , then we will change it, then after that we will check for missing values. Four tables, if we have to check in each table, then we will do it again in the table list, so first of all we will print it, create a table request , then you can directly pass variables in it, then select from table, then as many times as this iPad will do, it will be different from each table. Different table names will be known and data file will be stored in DF table, then from that we will know which table has missing valid in which column. Here all the values ​​are zero, so it means that all the tables are missing and our data. It is very consistent and clean so I have taken three queries here ho I have plans give more hundred seeds ho d number of tickets booked and total amount on changed with time and d third one it is calculated d average charge per h aircraft with different Fire conditions are these three basic and also questions through which we can tell the customer or client more about his data or can give some more meaningful insights, whatever is important for him , so here we will come out first. Which are the planes which have more than 100 sixes, so total we have 9 aircraft, out of them let us see which are the aircraft which have 100, the aircraft which have 100 are given 10 . We can do it by number and from sequel also we can do group by, so we have to group the aircraft by number and we have to find out the number of seats that on how many number of seats the aircraft are present, so here we will count the number of seats. Give the name of the table from which it is already C and give on which group by we have to do it, we have to do the group by which is more than 100, the connection here, so this is the aircraft number of the seat, we got it here but we Here the condition has to be imposed, we only want that aircraft which has more than 100 seats, so here we will put heaving close, heaving number of seats, here we have taken D count of seats on number of seats, so heaving number of seats is greater than 100 . Out of nine, we have 6 such aircraft which have more than 100 seats. Next we will check how the number of tickets booked and the total amount changes with time. So for this we have a very good plot for this so that we can If any value related to time changes from time to time, then we can show it, which is the table with tickets, from here you will get the ticket ticket number and booking reference and if you get the booking reference from the booking table, then we will get these two tables. We will have to do inner join, then we will go to the booking reference and we will get the count of how many are booked, it is ok because we do not want just tickets, we want tickets which are booked, so from n select star from give tickets, inner join so First, this is our table. Second, this is our table. Now we will write on whom we have to join and okay, equals is the parameter, total amount is given here, all the number of tickets will come out. To find out the number of tickets, we will You will do group by here. Here is another way to show me how you can do it by doing it using data from Sikar. First of all, we will convert the book date of the ticket. First of all, we will convert the book date into date time because first look here. It is possible that I have not stored the date here which is our booking date, that object has to be changed in date time, if we have to plot the gram so book all this has been converted into date time here, after this we check the date time here. is also given, so we do not need the time, but we will group it by the number of days, what is the average number of tickets booked in a day, so if we want to find out the number of tickets, then from the booking date given here, we will There is value of time, if we have to remove it, then we will get a particular date and according to how many tickets have been booked for each day, we can count it and put it in group by date and then we will need the date from it, so we are here . We will make a new pen on the pay and give OK, we have got the date for every value here for the booking date, what will we do by group by, we will count it every day, we have to plot the supposition X, after this we will calculate the size of the figure. Let's set its font size. Let's set its font size. Number of Okay, the date has been plotted D Number of tickets Here on the A axis here is the date 2017 starting from 6 Here we have data till 2017 Meaning we have data of two maths and here first from 0 It has started increasing a lot, 8000 like average here only 8000 is visible, you must be seeing a little constant, there are 8 and 1 place and in one place a speck is visible as if there has been a season, like if it is the time of August then it is August. Rakshabandhan comes or any other festival comes, so who knows, the bookings for that may have increased because of this, it could be any reason, you can imagine, just next we have to plot the total amount. You can get a graph of this type : Sohan, booking will come out of booking stable , connection has to be given, I am forgetting it again and again, so now we have to find the sum of total amount, so here we will write the sum of total amount , so now we have book date here. As we had done before , we had to find out the date for the earlier plot, we had to convert it into data first, so now from this table we have the book date, like we had done for the earlier plot, we had got the date, in this way we will get its So let's copy it from here, we have to change the table name here, OK Knowledge CD New Column Date, so here we got the date, now we have to find the total sum for each date, so here we are. But we can also do it by group by total amount so it will give us the total amount of how much people have spent every day and we mean how much has been spent on the airline and we remove it. Here is the plot and you can see both the plots. It is quite similar because according to the number of tickets being booked, we will get the amount, so exactly, a little, a lot, up and down, so here we have it, so this is D number of tickets and D total amount on change with time. Let C D Times Calculate D Average Charge Per H Craft With Different Fair Conditions So in the question we have average charge and fair condition so the charge is we will get the total amount as in the question average par condition so we have to take two tables like this In which these three things are there and if not then we have to join two tables, so first of all we will see the amount and share condition, then we are getting that ticket from the flight table, no, we need the aircraft code, now it has slide ID, so we need something like this. You will have to see the table in which the flight ID will be found and the aircraft code, in which the ID is also given and the aircraft is also given, which flight is of which aircraft, so we will join both these tables, star nav on flight ID, on which we have to group. Because we will get the average only by grouping, so group by and average amount, what will we do to import the fair condition, we will get our average amount , you can see our chart has been plotted here, here is our average amount, give the aircraft code and you can see here. There are only two such aircraft which have only economy class which does not provide business class and there are seven seven three aircraft which have comfort and fair condition and apart from that all of them have business class and economy class but the price of business class is higher . There is economy class, which everyone knows, and here, the three-one-nine aircraft, both its economics and business class are quite high with respect to other aircraft, so in this way you can compare and present your analysis. You can do this in the form of report, so we have written some lines in the report for these three analyses. In this way, you will make the project, for the portfolio, for the resume, you have to present your analysis in the same way. Just by making a notebook, that project will not be divided. You will have to present it in any form. Presentation report has been created in the form of report. Let's go to it and see the basic analysis how we have presented it in the word. After objective basic analysis data is provided inside in the number plane with more. Let 100 seeds be D number of tickets booked and total amount and change over time and D average par per e aircraft with different par conditions So we have started here first of all how for whatever we are going to do in basic analysis, I have given here One complete line is written on it so that the other person knows that we The voice of execution is basically developing strategies to increase hockey pencil rates and optimize pricing. What we have done with HN is also presenting to us which is our business problem which is ok fancy rate has to be increased and profit has to be increased. On the basis of this, I have formed the analysis form, I have not formed any analysis which is irrelevant, we have business problems , so here I would like to give a tip that whenever you are doing a project, then do not do anything in it. What happens is that you have got the data and as much as you know statistics, as much as you know the plot, you have plotted it, now you have to move slowly, you have to take this business problem first and give it to you according to that. Analysis has to be done not that you are analyzing outside it, I had told this in my earlier project also and in this also I am confirming to you that you have to make the project in this way, so whatever result we got. From the table, we have represented it in the form of a table here Utilized Line Chart Visualization So here we have written Chart B Observe Date D Number of Tickets Book Exhibits and Gradual Increase from June 22nd to July 7 Followed by Relatively Stable Pattern From July 8 till August which we saw in the table, well, it started from July and remained stable till August around 8000 with noticeable p in. Okay booking where the highest number of tickets booked on a single day so here. You can see the highest number of single day bookings so far in 2017 in two months, which is here and this can happen on 14th August. Tide tu d number of tickets book and and's true was similar trade in d total revenue earned by d company through d analyze time period designing suggest date father exploration of d factors contributing to d pick in ticket booking b increasing over which revenue and Optimizing Operation Strategies So this sir, all this analysis around our business problem has been done around this, total amount and around d time waver able to generate n bar graph and graphically compare d data after b completed d average cost per competition. Associated with different condition graph figure date is business so here 773 aircraft has only comfort class cn1 and cr2 plane which has only economy class which is visible from here clearly when different prices welcome e aircraft compare de charge business class Are consistently greater two on economy class this and that b seen across that place regardless of fair conditions so as I said the fair condition of business class is higher economy class and 319 is like this is craft both of which are priced very high Have you compared other plans? Analysis rate will see how we can analyze the occupancy rate. Here I have written three steps to increase the turnover. First, calculate H aircraft, D total revenue, give P air and D average revenue. We will move on to the aircraft and its After this we will calculate by matching the total revenue turnover increase by giving which the aircraft is 10% higher of you can see the rate. So how will the annual rate which is the annual turnover increase if we calculate the occupancy rate in the second step? If we increase by 10%, then we will analyze the occupancy rate in this way. First of all, we will write here Total Revenue Ticket Counter, then we will get tickets on average revenue and give is aircraft polls, this is what we have to calculate for aircraft food, so in this we will enter the table. Who can do it? Okay, in that we also get the aircraft, we also get the aircraft code, we also get the revenue and tickets, on which flight we want to join, then we will have to do group buy first and we will make it look like Sabri first of all. So let's add to this, we have got the aircraft , so first of all we will write, now to get the occupancy rate also, we will have to join the table because from there we will get the seat count, how many seats have been booked, which was the table of voting pass. Where is it here ? Here you will get body pass with seat number. After taking out the board, you get the light ID and its seat count, then in each flight ID, how many times that seat has been booked, how many times that seat has been booked, how many times that aircraft has come, that is, whatever flight it has had, this is the seats count in it, like here. Pay three one nine is fifty one but if we see the total count for three one nine here then 116 number of seats so we also need the number of seats in this, what will we do for the total number of seats, so for that we will write another query and Then we will merge it in a single query. Now we have to join these two tables on which plane, so how will we join it. If you have read all of them then you will understand well, so first of all we will put bracket here. Let's give and this table which will be returned from this query, we take it as S, give it S. A, remove the connection of the subscribe, from here, whatever table will be returned from this, and give it S. B. And we group on which. And we will group it by and out of the number of six, how many average number of seats are booked and after that we will calculate the average number of six on the total number of C. The occupancy rate formula is the average of A and C count on B. Dot number is because the average seat is booked, we have calculated its occupancy rate. Now let us increase this occupancy rate by 10% and see how the total revenue of ours increases. The variable name is Occupancy Rate N, so by 10% we You can also say 0.1 [Music] 0.1% of Increase Document 46% has been reduced to 10%, after this we will calculate the total revenue, so total revenue, total revenue equals, you will do fly diary, multiply it , increase annual turnover was calculated from increase occupancy rate. Total turnover is found, total revenue is there, if you increase it, then whatever increase will happen, the amount will increase [Music] Here you can see the total turnover, we will get this amount here, Profitability, D, Overall income, Per air, and Average revenue, Per ticket, H Aircraft, R, important metrics. You consider that airlines have optimal pricing options and allocating resources on this information . We have to write a lot of lines which are suitable for the client in front and in a way the floor is visible that is how we started the lines and how we did it after that. Whenever you take this report while giving a presentation, you will start with the introduction first, so here I have written the introduction, you can also reduce its wording, you can also convert it into presentation form, but I have written it for your difference. So that you can understand that yes exactly how can we create the report or how can we write the word to present so show on d bill of figure d total revenue total ticket average graph with d highest The total revenue is so and sorry so this is the ticket count, the average value of the ticket is the highest, the total revenue is this, but the average is the lowest , the economy class is the lowest in this aircraft, so this can be D region date most of d people go this aircraft ticket one it's cost is less compare tu adarsh ​​so its cost if we see from this chart of SU 9 then its is quite less compare tu other aircraft so it may be that many people Its tickets are charged more, that is why its average value quadratic is less, so D aircraft with release total revenue is C and one air and D possible reason behind this is only offers Economics were in list price and it would be because of its Pure condition and less facility. Now what do people think that it is okay if there is a chip then we buy it, we have to buy more and more from it but if there is too much of a chip then people do not prefer it, they want some quality at least. If it is there, it could be because its condition is very bad, its facility is very bad, that is why it has taken very less tickets and its revenue is also very less, after that we have written, after that we have written, so the average D average was calculated . So for that we have written like this Major House in Airlines Successfully Devour Fool Devour Sits and Discovered Chances You Most of the Fancy Rates by Using This Matrix Rate New and Profit Ability Will Lavishing Operation Expense Associated with They Can See Pricing Strategy and Customer Satisfaction Are the factors which were my influence Occupancy Rate Number of seats and Occupancy Rate of Per D Exam in D Possible Increase So here we have tried to find out fancy rates and revenue greater value and service to consumers by optimizing Pricing and Other Operations Figure Shows Total Revenue Increase After Increasing D of You Can See It by 10% Here It is Written New Date A Sach Total Revenue Per Air Average Revenue Per Ticket Average of Incidents Airlines Proficiency on This Critical and Can Find Improvement on areas and modified pricing and route plans today are a result of assessing the indicators so like the annual turnover which has increased more, the one which has higher fancy rate, do not pay much attention to the one which has lower rate, we can increase it. We can see their pricing strategy, we can do marketing for them, we can get all these things done, we can recommend them, so similarly I have written here, you can read it, so airlines can find area is one important. Feature Date Can Set An Alarm Seat With Black And C Factor Date People Are Not Buying Tickets From These Aircrafts They Should Decide Reasonable Price According To Condition And Facility Of Air And Short And It Should Not Be Very Chip And Height But Most Considerate Should Not Come De Price of Consumer Happiness and Safety Airlines Airlines Mast Tree Balance Between De Necessities But Profit and Significance of Delivery High Quality Service All this is just a suggestion which we have given to the airlines, so in this way you can create the report. Yes, for your analysis and this is a complete project which you can put in your portfolio, you can put it in your resume, you can show that we have taken a data, we have taken the database of some company, we have helped it and done a complete analysis to increase its profit. Has provided so that they can make their own streets and can change the price. I hope you liked this project. If you want more similar projects then do tell in the comment area how did you like this project and what type of it. project you want thank you