Transcript for:
SQL SELECT Statement Execution Sequence Notes

Let's take an inclusive example for all clauses and parts we've covered in Select statement that can be used, and to learn the execution sequence of the Select statement; i.e. when DBMS and backend execute this statement, what comes first to be executed. It's required to display the department name, the maximum salary for each of the departments, provided that the average of this department is higher than 1200. It's also required to arrange the results returned according to department name. First of all, I need to display the department name and its maximum salary. So, I'll write in Select the data I want to display. This data is returned from which tables? It's returned from tables Employee and Departments. As I use more than one table, I should add a relative Join condition. Here, I need the departments and salaries of their employees. So, the relation that can be used for it is the department number in Employee table should equal the department number in Departments table, to set that this employee works in this department. We will add Alias, as we mentioned before, as the column's name is repeated. "For each department" means that the table should be divided into groups of data. It should be divided based on the department. So, I'll write Group by department name. "Given that its average is greater than 1200" This is the condition where I set that the returned data should meet an condition that the average salary should be greater than 1200. So, I'll write Having the average salary greater than 1200. I need also to sort the results set by the department name. I'll write: Order by Dname. I'll name the max "as Max", in order to display a readable column. Now, I've written an inclusive statement covering all clauses and conditions that we can use. We may use more than one condition in Where. Not only Join conditions, I may write And then add the conditions I want to filter data more. When we run this statement, we will receive this results set; 2 departments and here are their maximums. The first step the DBMS does is to call the tables. So, the first part to be implemented is the From clause. Assuming I've called the tables to the memory to be available in this way. The second step to be done is to implement the Where condition. Check all the conditions used in the Where clause in order to achieve them all. The condition used here is to link data. So it adds the department number in front of the other department number. Here, it's like creating a big table consists of Employee with its data and Departments table with its data. So, I should enter data in the correct order here. As these 2 records have no data here, so they will disappear completely, as they don't meet the Where condition. This is the format that should be done. It's like getting data of the 2 tables and writing them all together. Next, it implements the Group By. In order to implement the Group By, it gathers groups of records, based on the attribute I've selected. I've written that Group By Dname. So, it will gather all the records with the same Dname in one group. Here, it gathered each group of data then it set that this group is gathered together based on the department name. Starting from the Group By step, DBMS cannot see or access the data in tables, except data in Group By label itself. What does this mean? I here wrote Group By department name. So starting from the first group it cannot see anything except department name. Assume that it created a label for each box of them then entered the data of Group By. So I'll write here that what is displayed here is Department 1 then here is displayed Department 3, and here is displayed Department 2. Next, it calculates the aggregate functions used. This means, it finds any aggregate function then calculates its value. Here is the value of Maximum and the value of Average. It should calculate each value of them and record it in the label of its group. For this group, I'll write its own maximum and its own average. Now, each group of data has its own label which is the department name, maximum salary and average salary. Next, it implement the condition in Having. It should check here what is the condition? "Having average of salary greater than 1200. Which one of them has an average that is greater than 1200. The first 2 groups verify the condition. While the third group doesn't verify the condition. So it can not be displayed in the results set. Next, it implements Select; step no. 6, checking what needs to be displayed. It confirms that data needs to be displayed is available in the label of each group or not. It is required to display the department name and maximum of salary. The department name is available and maximum of salary is available. That's OK. There is no problem. It adds them in a table in the results set. We've named the maximum salary as Max. Here is the results set. The last step done is to implement the Order By clause. It checks what the base that should be used for ordering or for sorting. It'll sort by Dname. Is Dname available? Yes, it is available here, and it is displayed. This means it's available in the group itself and displayed here. So, I'll sort by department name. As I didn't mention a sorting order, so it'll sort ascendingly. Here is data sorted ascendingly. So the results set of the query we've written should be like that. This is the execution sequence for how DBMS executes the statement in the backend. The most important point to care about is that when selecting here, we cannot display from anywhere except from data available in Group By. i.e. data displayed these labels. So, when it displayed the Dname and the maximum salary, they were available here in the group's label. However, anything else that isn't available, I cannot access it. The same case here in Order By, any data used for sorting should be available in a group. Either it's displayed in Select or not. So, anything mentioned in Select should be available in Group by. Anything mentioned in Order By should also be available in Group By, in case of using Group By. So, if I need to display the department number, I should mention it in Group By.