Transcript for:
Aggregation and Annotation in Django ORM

when you have a lot of data in your Django applications you often want to aggregate that data in order to get insights into values and summary statistics across your broad data set and this is commonly used when you have dashboards and Reporting applications and when you're creating business intelligence software you want to get insights into all of the data that's stored in the database not necessarily on a row level but as a whole in this video we'll look at ways to Aggregate and annotate data using the Django orm we'll see how we can use aggregation to combine multiple values from multiple rows into a single output value for example summing up a column in the database to get a final value that's the sum of all of those values another example taking the maximum value from a column or the minimum value these are all aggregations that we can apply over a query set of Records in our Django applications we're going to dive into that in this video and we're also going to see how to use annotation in a Django application and when we annotate our model in Django we are adding a summary record or some sort of aggregate gated record for every single model in the query set so rather than a single output number we are annotating every Django model in the query set with a particular value that's not defined by default on that model we'll dive into that as well and we're also going to look at two key functions in the Django orm the values and values list functions these allow us to only pull certain values from the database and rather than storing them in full Django models we can use Python dictionaries and lists instead which can be better in terms of performance but it can also help us with annotation and aggregation so let's get started and we'll see how we can achieve these things in Django now we're going to start by looking at two functions in the Django orm and these functions are defined on query sets in Django and if you look at the right hand sidebar it's these two functions here the values function and the values list function so let's click the values function and we're going to read the documentation here this returns a query set that returns dictionaries rather than model instances when it's used as an iterable and each it's one of those dictionaries represents an object in the database and the keys correspond to the attribute names of the model objects so it's a bit different rather than a model it's returning a query set of dictionaries and we can use this to only fetch a subset of columns from a table rather than all of the columns Now by default when you use Django functions you are going to get all of the columns back in the models for the query set we saw in a previous video how you can use the dot only and Dot defer functions to change this Behavior but there is a difference here the only and the fair functions the return Django models with some of the fields missing the values and values list functions are going to return dictionaries and python lists respectively and Django also questions against the use of only and defer unless you really know what you're doing with those functions the values and values list functions are a bit safer to use so we're going to dive in and see how to work with these in this video now I'm going to open vs code and we have our ormscript.pi file here and here's a query that would get all of the restaurant fonts from the database from that restaurant table now if we go to the models.pi file and we look at the restaurant model here you can see all of these different fields that are defined on that restaurant model and let's say that we don't need to get back all of these fields when we issue this query we only want to get the name of the restaurant which is a car field and it's stored in the database for the restaurant's name we can go back to the script here and we can change the dot all function and instead what we're going to do is we're going to use the dot values function and what we pass as arguments to the dot values function and the name of the fields that we want to get back from this particular model and you can also fetch fields from related models as we'll see later on let's say here that we only want to get back the name field from that table and once we get that we're going to get the first record in the table as well so we're going to extract only one single Row from this table and then we're going to print these to the terminal now I've called this restaurants but it's actually going to be a restaurant model instance what we're going to do at the bottom is run Python manage.pi and we'll use the Run screen script command and run this orm script now you can see at the bottom we get back this dictionary here it's not a Django model in this case because we've used the dot values function it's returning a dictionary and the keys in the dictionary in this case the name key match the fields that we pass in from the model to this values function and let's say that as well as the name field we also wanted to get the date that the restaurant opened if we add that as another argument and rerun this script we now get back a dictionary that contains two keys and two values the second key here is the date open key which maps to a date time dot date object in Python so one of the key points in this video is that the values function here is not returning a Django model it's returning a dictionary and it's only bringing back the keys that you specify as arguments to the function and this can be beneficial if you're fetching a lot of data from the database and you don't need to get all of the columns and you also don't need to construct python model classes using this method it's simply going to return the data and construct a dictionary instead that can be much faster so that's one benefit of using values and there's also another function called values list as well instead of a dictionary that's going to return a list of values for each object now I'm going to change the name of this variable to restaurant and we're now going to demonstrate that this is just a dictionary in Python we can use the normal dictionary notation to get a particular value and if we rerun this script we see here we get the name of the restaurant and that's because we're asking for that particular key from the dictionary that's returned from the dot values function for the first row in the database so to sum up these dictionaries for the values function they represent rows in the database table but only with the columns that you're interested in and if we remove the DOT first function here and just look at the query set in general you can see that what we get back is a Django query set containing dictionary objects so it's a dictionary and each one of those dictionaries is for one row in the table and it contains those two keys for the name of the restaurant and the date it was opened now before we move on what I'm going to do is print the connect action dot queries and we're going to see what the query is that's being executed here in order to get back these dictionaries so let's rerun the script below and what you can see is this select query here I'm going to make this a little bit bigger so we can see that we've got a select statement but you can see that we're only looking at the date opened and the name columns from that table we're not getting back all of the columns so that's one effect of using the dot values function as you can see it returns dictionaries containing those keys but the SQL query and the data returned are going to return less data than you would normally use with a Django or in function and there's also much less overhead in constructing these dictionaries than there is in constructing the Django model classes so this can be performant as well let's now move on to something a little bit more complicated it turns out that the dot values function can take additional keyword arguments and these can be used to transform fields and perform aggregations on some of the fields that were fetching from this particular function so for example let's say that we not only wanted to get the restaurant name but we wanted to get them an upper case what we need to do in order to do this is use a Django database function that's defined and this will then run a function in the underlying database to pull out those values after that function has transformed the value so what we're going to do at the top is bring in an import from django.db.models.functions We're importing a function called upper so what we're going to do here is remove the date open field we don't need that here and then instead of just referencing the name here what we're going to do is give it a keyword argument I'm going to call that name upper I'm going to set that equal to calling this upper function that we've imported and passing the name field into that and let's also only get the first three records back from that table so now I'm going to remove the connection.query statement and we're going to rerun the function I'll rather rerun this script and you can see at the bottom we now get this query set where the key is called the name upper and that matches the keyword argument that we passed into the values function but the value itself is the name of the restaurant but it has been converted it to uppercase and that's been done at the database level by this function so these functions are defined in djangles DB dot models.functions module they operate at the database level and in order to see that we can bring back the connection dot queries and rerun the script and then when we look at the SQL statement you can see that not only is it selecting the name column but it's passing the database upper function there and that's where this conversion is occurring let's now move on earlier on I mentioned that we can get foreign key Fields using the values function as well so what we're going to do now is we're going to try and get all of the ratings from the database but we're also going to get the name of the restaurant that's attached to that rating so restaurant is a foreign key on the racing model we want to follow that foreign key and get the name field out when we use this values function so I'm going to remove these two lines at the top here and I'm going to set up our reference to the restaurant.type choices dot Italian value we're going to pull out Italian restaurants only in this query so let's do that just now we're going to use rating dot objects dot filter and we're going to filter this by the restaurants restaurant type and we can follow that folding key to the restaurant model from the rating and then access the restaurant type field and we're going to set that equal to Italian so what we're doing here is filtering to only the ratings that are applied to Italian restaurants but what we want to do then is use the dot values function here and let's pull out two particular values firstly the raw rating itself which is the numerical value between one and five but we also want to get the restaurant's name that has this particular rating what we can do to get that is use the restaurant field which if we go to models.pi and scroll down to this rating model this is the foreign key field it's called restaurant so we're accessing that in the values function and again just like in the filter function we can use these two double underscores here to follow the reference to the restaurant model and in this case we want to get the name of the restaurant so we access the name field here and that's going to give us back the name of the restaurant so let's store this query and available code ratings and we're going to then print those ratings to the terminal and I'm going to remove connection.queries and rerun our script here at the bottom if we look at the results of the script you can see that it's a query set containing these dictionaries for each row in the database and if we look at the dictionary Keys we have the rating for example two out of five and we also have the restaurant name and the key for that is just the key that we provide to the values function and we get back the names of the restaurant so rather than the foreign key instance we're getting back a particular field on that folding key and the values function and that is the name field so the key Point here is that the double underscore in Django it can be used in for example the filter function but it can also be used in the string values of fields that we passed through the dot values function to follow the foreign key reference so this is the values function but we saw in the Django documentation if we go back here that there were two functions the second one is called values list so let's look at that this is similar to values except that instead of returning dictionaries it returns tuples when iterated over and each Tuple contains the value from the field or expression that's passed into values list just like the values function so let's go back to vs code I'm going to clear out this code that we had here and what we're going to do is we're going to go back to the restaurant model and I'm going to fetch all restaurants here but what we're going to do is use the restaurant.objects.values list function and again we pass one or more fields to the values list function I'm just going to pass the name field and then we're going to print the result that we get back to the terminal here below so let's clear the terminal and execute this script and you can see that what we're getting back is a little bit different now we have a query set as before but we are not getting dictionaries anymore we're getting tuples containing in this case just the name of the restaurants so there is a tuple for every Row in the table that's returned from this query and each element in the Tuple represents the elements that are passed in so again if we pass in date open there's an additional field here and we re-execute the script you you can see that the tuples now contain not only the name of the restaurant but they also contain this datetime.date object that represents the date opened field in the database now whether to use values or values listed depends a bit on the context for me I prefer dictionaries because it's much more explicit what these values actually are because they have that key in the dictionary for example the name key and you can much more easily tell what the values are representing using that method but the values list function is handy as well in some circumstances so let's see another useful example of values list we're going to show the flat equals true keyword argument and you can use this when you're only fetching a single field from the values list function so again if we re-execute this and we see what happens without that keyword argument you can see that each element in the returned query set is a tuple with a single value now you might want to get a better representation of this you might not want to have a tuple with a single value for every element in that query set you might just want to return all of the values in python list with no tuples in that query set so what we can do is we can pass the flat equals true keyword argument to the values list function and if we rerun this script you can see that the resulting query set Now does not have any tuples it just contains the strings that represent the restaurant's name so we flattened that query set and removed the tuples and we're just getting back the values of that field and that will work for any field in the table for example if we use the latitude field which is a floating Point number field we get back a query set where each value in the query set is just the number there's no tuples here with a single element so I like this argument a lot and I use it a lot in Django applications if you're constructing a list on the back end from database values this can be a very handy function to use so that covers the values and values list functions in Django what we're now going to do is move on to aggregation and annotation and later we're going to see how to use the values function with annotations in order to help you group data by a particular column so let's start with aggregation what aggregation does is it breaks down multiple values into a single value and it does this by gripping or clustering records on a particular field or set of fields in your Django model so we're going to see how to use some common aggregation functions now and let's start with the simplest aggregation we're going to count the number of Records or rows in a particular database table so I'm going to remove this code here and I'm going to paste that query that we've already seen in this series restaurant.objects dot count if we print that to the terminal you can see what we get back as a result at the bottom is the number of rows in the restaurant table in this case it's 14 and we can also perform this counting aggregation on a filtered query set for example if we wanted to get only restaurants whose name starts with the letter c we can use that filter and then use the dot count method after that if we rerun this script now we get back only two restaurants in the table now Django has an additional method for aggregation and that is the dot aggregate method and this method allows more complex aggregations to be performed in your Django model classes now the Django models module has some objects for performing aggregations we're going to see one of those now for the counting aggregation so at the top from django.db.models I'm going to import the count function and we're going to see a different way of Performing this query to get the number of Records in the table so rather than using the dot count method we're going to use the dot aggregate method and what we're going to pass into that is an aggregation that we want to perform so what we want to pass in here is we want to count the number of IDs in this table and we're going to print that out to the terminal so let's rerun this script and you can see that we have a dictionary that's being returned from the aggregate method and it has a key called ID count and the value there is 14 which matches the value that we originally got from the dot count function and if we uncomment connection dot queries and save the file and rerun the script we're going to see that what is executed on the database is the count aggregation function and that's executed in the underlying database to give us back this value of 14. and note as well that we have an alias to the result here it's called ID underscore count and that matches what we see in the dictionary here but if we want to give the returned value a different Elise we can pass this as a keyword argument so for example to count the number of rows I might give this a keyword argument of total and when we run this script again you can see the dictionary now has a key of total that maps to 14 and you can see the Alias and the underlying SQL query has also been changed to total now one thing to note is that the aggregate function is a terminal function for Django query sets it returns a dictionary so we cannot chain additional functions onto that so the tons of dictionary of key value pairs representing the aggregations that you're performing but you can then perform a filter function after that and that's because this is a terminal clause for a query set so let's now look at more interesting aggregation we're going to look at the average aggregation let's say that we wanted to aggregate all of the ratings in the database and we want to calculate the average rating that's stored in this database if we go back to orm script.pi I'm going to import another aggregation function from the models module and that's the average aggregation function now at the bottom here I'm going to comment connection dot queries out and we're going to print a different aggregation here that's rating.objects.aggregate and we're going to give the aggregation a name of average and we're going to use that average function that we've imported and pass to the rating field and the rating field is just a field on the rating model that's the integer field that stores the actual numerical rating itself so we're going to use that and we're going to aggregate it and calculate the average rating in this table and print that to the terminal so let's clear the terminal below here and we'll execute the script and you can see the average rating in the database is three so that's the average rating across all of the rating objects in the database again we can apply a DOT filter functional and in this case I'm going to filter to only restaurants whose name starts with the letter c that will filter down the query 3D set to only two records as we've seen earlier and then over those two records again we're going to call the aggregate function and calculate the average rating over those two records so let's rerun the script and you can see now the average rating for those two particular restaurants beginning with the letter c is actually four out of five stars now so that's two aggregations we've looked at the count and the average aggregation I'm going to bring in two more now and that's the minimum and the maximum aggregations and you can imagine what these do they calculate the minimum value in a particular column or the maximum value so let's clear this print statement here and this time what we're going to look at is the sale database table so sale.objects.aggregate what we're going to pass in here is the minimum so we're going to get the minimum value over the income field in that table so let's go back to models.pi the sale model represents a table where we track the sales made by restaurants in our system and each sale has an income which is a decimal field and that tracks the amount of money made and particular seal in this imaginary system so let's go back to the script and what we're doing here is we're aggregating and we're getting the minimum value of the income here now you'll notice I've put minimum but we've actually used the maximum aggregation function so let's change that to minimum and what we're going to do is print out the results here so let's execute the script at the bottom and you can see the minimum sale made in this particular table is 5.71 and if we go back and change this to maximum we can change the name of the keyword argument and we also need to change the name of the aggregation function if we rerun the script at the bottom you can see that the maximum seal is 98.89 now I'm going to do is quickly go to the management script used to actually create this data and this is the for Loop that creates all 100 sales in the database and you can see the income field is chosen uniformly at random between 5 and 100 so it makes sense that we have values here such as 5.71 for the minimum value and 98.89 for the maximum value these are kind of a cutting the boundaries of that range so it makes sense here that the Max and Min functions are returning those values now the aggregate function can take multiple keyword arguments here if we want to perform multiple aggregations over a database table so I'm going to bring this to a new line and add back that minimum aggregation that we had earlier so we're calling the aggregate function we're passing a minimum and a maximum keyword argument to that and we're using the associated aggregation functions here over at that income field let's execute the script and you can see below we get back a dictionary containing two values now for each particular aggregation and again we can use additional aggregations as keyword arguments for example the average and I'm going to bring in one more aggregation at the moment that's the sum function if we wanted to get a total sum of all of the incomes we can pass another keyword argument let's call it sum and we're going to use that sum aggregation function again over the income field and that model the sale model so let's execute this script and if you look at the dictionary that you returns from the aggregate function you can see we now have four keys and the sum of all incomes is 5290.99 so we're now calculating the sum of incomes we have the average income which is 52 and we have the minimum and maximum aggregations too so we're calling this aggregate function we're passing multiple aggregations in and we're getting a dictionary back with a value for each one of them so these are very common aggregation functions let's see something a little bit different now something very common instead of aggregating over the whole table let's say that we wanted to aggregate our filtered subset of these values so we might be running some kind of report and we only want to get the values for the last month of data or the last 31 days of data what I'm going to do at the top of the Run function is set up a variable called one month ago and we're going to import Django's time zone module at the top so from django.utils let's import the time zone module so we're setting this variable of one month ago which is equal to timezone.now and I'm we're subtracting a Time Delta of 31 days so it's not exactly one month ago it's an approximation that we're going to use in this video and then what we're going to do is we're going to get all of the sales that were made in the past month so we can use the sale.objects.filter function and if you look at the sale model and models.pi there's a field called date time so let's copy that and we're going to use that in our filter function and we're only going to get sales whose date time is greater than or equal to the date specified in this variable one month ago so this statement here is going to get back all sales that were made in the past month and then what we can do is rather than calling sale.objects.aggregate we can use the sales query set that we're getting back on line 12 and we can call the aggregate function on that query set and that's going to calculate these aggregations only on the sales that were made in the past month so let's run this script again and we're going to see the values that are returned from these aggregations and you can see the minimum value of 6.91 in the past month we have a maximum value of 99.7 now you may notice that that's higher than maximum value before we did the filtering and that is because I just regenerated this data randomly because the sales that I had were not from the previous month I originally generated this data quite a while ago when I started this series so the date time field on these sale models was quite a while past and therefore I wasn't getting any results when I did this filter statement so I regenerated the data that's why we now have a random value that's higher than what we had before but this is the maximum value as filtered from the past month sales and we also have the same for the average it's 58 and the sum is now 3491 so this is how we can take a date based aggregation of our database data using Django and these types of filtered aggregations can be built up very easily with the Django orm and for the statist sessions among us there are other aggregation functions that we can use here for example we have a standard deviation and a variance aggregation function that we can perform in exactly the same manner as we did here below so lots of possibilities with Django and aggregation the key Point here is that aggregation will look at the fields that you give it and it will return a single value based on whatever the aggregation function is it could be the minimum the maximum the sum the average the count and so on and the aggregate function will return a python dictionary with each result in that dictionary as a key value appear let's now move on to annotation in Django's orm now the difference between aggregation and annotation is that when you annotate values you're going to get a value added to each model in the query set that you have coming back from the annotate function whereas the aggregate function it does not do that it returns a single value based on the aggregation so when you use the annotate function that we're about to see rather than breaking all of the rows down to a single value The annotation is going to be applied to all of the models in the query set so annotation can add new and important data to your Django models in a query set so let's start with a contrived example here I'm going to remove this code and I'm going to add a comment here explaining what we want to do here so let's move connection dot queries down to the bottom what we're going to do is we're going to fetch all restaurants and then let's assume that we want to get the number of characters in the name of the restaurant so if a restaurant was called XYZ we want to add the number three to the model now what we're going to do at the top here is input another function we already imported the upper function earlier on and that transformed the restaurant name to uppercase we're going to import another function that we can use at the database level and that's the length function and we're going to use this to calculate the length of the restaurant name and then we're going to annotate each model in the query set with that particular number with the length of the name so let's see how to do that now we're going to set up a variable here called restaurants and it's going to be equal to restaurant.objects and this time we're going to use the dot annotate function now the annotate function does work similarly to the aggregate function we can pass a keyword argument representing the name of the field that we're going to annotate on our models so let's call this one length name and we're going to set that equal to an expression here where we use that length function that we import and we're going to calculate the length of the name field on the restaurant model so if we go to the restaurant model this is one of the fields on the model it's a car field so we're going to calculate the length of the car field for each row in the database table and then we're going to annotate that model with this field called Len name that contains the number representing the length of the restaurant's name so in the line below that let's print something to the terminal we're going to call restaurants.first to get the first value from that query set and then we're going to look at the Len name attribute which will exist on that object so let's run the script and at the bottom you can see that we get back a value of 13. so what's happened here is that the restaurant model which normally doesn't have a field called Lynn name has been annotated with that extra information and that extra information represents the length of a particular field that already exists on the model and what we can also do here is use the dot values function so I'm going to do is call Dot values and we can get two values here we're going to get the name but we can also get the annotated value as well called length name and we're going to see what output we get back for that particular query you can see if we look at the terminal I'm going to expand this the query set that's returned contains two keys one has the name of the restaurant and the other has the annotated field called learn name with the length of that restaurant's name and you can inspect these names and the lengths of the names to verify that this query has worked successfully so the point of the annotate function is to add additional information to your Django models in a query set based on some kind of aggregation or some kind of database function that's applied to the data in that table and we can also use these annotations in a filter statement in Django so let's imagine now that we want to filter these restaurants to only those whose name contains more than 10 characters so after we've annotated the query set with this function what we can do is then we can call the dot filter function here so I'm going to call Dot filter and then on a new line we're going to specify the filter expression we're going to take that new field Len name that we've annotated on the models and then we're going to use the greater Lan operator here and in fact let's use the greater than and equal to look up here and we pass the value that we want here in this case it's 10. if we now re-execute this script and we look at the output you can see that we only get back values whose restaurant's name contains 10 or more characters and if we change the lookup to just greater than and re-execute this we're going to see that we get back even less rows in this particular query set only two are returned whose name contains more than 13 characters let's see another annotation here we're going to use another database function that I'm going to bring in from the functions module and that function is called concat and that's used to concatenate two or more fields from the database let's now see how to use that in this run function below and in order to use concat we're also going to bring in a couple of extra fields from the models modules let me remove all of these aggregations for now and I'm going to bring in the car field that we've seen on Django models already and also a value object we're going to see how to use that in a second let's now go down to the Run function and I'm going to remove this code here and we're going to start by using a concat operator now and we can also remove this comment I'm going to add another comment here explaining what we want to do with the concatenation what we want to return here is something that looks like this we want to get each restaurant's name here and then in Brackets what we want to get here is the average rating for that restaurant from all of the ratings in the rating table that are applied to that restaurant so we're going to build this concatenation with the concat function that we've just imported and what we're going to do is we pass multiple Expressions into this function and the database is going to concatenate these together at the database level and return them to the client so let's start by adding the restaurant's name here this is this part of the string that we want to get back from the concat function the restaurant name and then what we want after that is we want to add a raw value and that's why we imported this value object from the models module we can pass a literal into to the value function and if we look at what we want to construct above after the restaurant name we want to have a space here and then a bracket with the text called rating so what I'm going to do is pass that literal in here we use a space and then bracket and then the word rating will be put in there along with colon and then another space and we also want to get rid of the closing bracket because that's going to be applied after we aggregate together the ratings here so that's our first letter on here after we do that we want to pass in the number and this number represents the average rating for the restaurant so what we can do in the concat function is use the average aggregation function and what we're going to do is import that at the top again from the models module I think I'll remove that by accident re-importing that and what we're passing in is the average and we're looking at the restaurant table here so what we're going to do is we're going to get the ratings and we're going to follow that Fallen key to all of the ratings and calculate the rating average here so just to explain this the restaurant model has a reverse foreign key to all of its ratings and we can follow the foreign key to those models and look at the raw rating under the hood and then we're going to aggregate them with the average function and then all we need after that is this closing bracket once we've calculated the number we just need to close off this expression so we can use the value function again and this time all we're passing in there is that closing bracket so what I'm going to do is move this expression to a new line we need one keyword argument to pass in here and that is the output field we're going to set the output field to a car field in Django and what this is going to do is it's going to take these values here it's going to calculate this concatenation and then we're telling Django give us back that output as a car field as a string so I hope that makes sense let's now see how to use this concatenation that we built up as an argument to the annotate function so I'm going to create a variable here called restaurants again and the query is going to be restaurant.objects.annotate and let's pass a variable a keyword argument here called message and that's going to be equal to that concatenation that we top above so for every restaurant in the table we're going to calculate this concatenation and add it to each model with a field name of message so let's now iterate over each restaurant R in that set of restaurants and what we're going to do is we're going to print out our DOT message to see what's been added to the model so let's clear the terminal and rerun this script and you can see what we get back below is the results of these concatenations we have the restaurant name and then we have our brackets containing the average rating for each restaurant so this is a bit of a more complex example we're taking the concatenation operator and we're passing multiple values to that for example the name of a field in the Django model a raw string value with the value function and we're also passing another aggregation into the concat operator using in this case the average aggregation over the ratings and then when we built this concatenation up we can use it as an argument or a keyword argument here to the annotate function in order to give the Django models return from the a database query that extra information that can then be used in your application for example you could take the message field that's been added here and you can display that in your Django templates so that end users can see the average ratings for these restaurants that's just one example but we're going to move on to some more practical examples now let's assume that we want to annotate each restaurant object with its total value of sales so the total sum of all sales made for that restaurant I'm going to remove all of this code here and we're going to start from the top of this function let's create another variable called restaurants and it's going to be equal to restaurant.objects dot annotate and The annotation here we're going to pass in is called total sales we'll give it that name and we're going to use the sum aggregation function here which we need to import again from the models module and this time from the restaurant we're going to follow the foreign key to its Associated sales and then we're going to look at the income field on the sale model and we're going to aggregate those and sum them up for all restaurants in the database now what I'm going to do below that is use a print statement and we're going to pass a list comprehension in here and for each restaurant in the data we're going to print the total sales so let's make the terminal a bit bigger and execute this script and we can see the output here for each restaurant we're getting back a decimal containing the total sales for that restaurant and that's been brought in from the sales table that's being aggregated in the database and summed up for all of the values for that restaurant now when we call annotate in this way we're getting back every single column from the database table as well as any annotated columns as well so what I'm going to do is bring back the connection.query so that we can see this and I'm going to use the P print function here in order to see this better and I'm going to remove the print of the total sales so let's go back to the script and execute that and we can see the queries that are being executed I've got no queries here and that's because I'm not evaluating the query set so let's bring that back and re-execute now if you look below you can see the SQL query that's being executed for getting all of the columns from the restaurant table and you can notice this column as well where we're using the sum aggregation function and we're casting the result of that as a numeric value and returning that with the Alias of total sales and you might also notice there's a group by statement here in the SQL and we're gripping by every single field on the restaurant table now that's fine but let's say we don't need to get back every single column from the table after we've performed our annotation we can then use the dot values function to specify the values that we want to get back so for example we might want to get the name but also our aggregated field called total sales and that is coming from this annotate function the keyword argument that we pass in there called total sales now if we execute the script we're going to see that we get an error and that's because we're now returning a dictionary so we can't use this dot notation because this is a dictionary we need to index in with this syntax here and get that particular key back from the dictionary so let's rerun this and now we get back the values accordingly but if we look at the select statement you can see we're only getting the name and the aggregated field called total sales when no longer looking at every single column so again we can use dot values in order to limit what's coming back from the database and that can help improve performance and remember what you get back from values are python dictionaries so we've seen here how to use annotations to get back summary information that can be added to all models in a query set for example here we've annotated each restaurant with the total number of sales that they've made by using the sum function and following the foreign key to the sale tables income field let's comment out connection.queries and remove this code above let's see another example now we want to annotate restaurants with the number of ratings that they've received we're going to set up a variable here called restaurants and this time we're going to use restaurant.objects dot annotate again and we can annotate each restaurant with a number of ratings that they've received by giving out a keyword argument which we'll call num ratings and then we're going to use the count aggregation that we saw earlier to count the number of ratings that have been applied to that restaurant model so let's import count again at the top from the models module and then below that we're going to use those restaurants that we got back above and let's call the values function and we're going to get back the name of the restaurant and we can also use that annotated field num ratings and we're going to see the number of ratings applied to every restaurant in the data let's run this script and you can see that for example the first restaurant with the name of Pizzeria one has four ratings the second one has two and so on so the numbers here they're coming from this count aggregation function that we are adding to the annotate method and again if we look at the queries quickly we're going to build up towards something new here if we look at the SQL that's executed again we can see the grip by operator here and we're gripping by every single column in the restaurant table and this is what ensures there's an aggregation for each row in the database so there's 14 rows in the restaurant table and for each model that comes back here we're going to get back an aggregation for the number of ratings for every single Row in that table now we might want to also get the average rating for those restaurants and we can do the same with the annotate function as we did before with the aggregate function we can pass multiple keyword arguments in here so I'm going to pass a second one in and this is going to have a keyword argument called average rating and we're going to calculate the average of the restaurant's ratings using this statement here if we look at the values function below as well as the number of ratings and the name I'm going to add another field here for the average rating so let's run this script again and remove connection.queries and we're going to see the output of this and we can see now as well as the number of ratings we're getting back an average rating for that restaurant so again annotation is being applied for every role in the table that's coming back from this query so we're Gathering the number of ratings and the average rating for every restaurant in the database now we might not want to perform an aggregation for every single Row in the database instead you might want to create a specific aggregation by gripping by a subset of the columns now an example of this would be rather than calculating the average rating for every single restaurant individually we might want to calculate an average for Just For example the Italian restaurants in the data and as well as Italian restaurants we might just want to calculate these aggregations on each restaurant type so that we can report for example on how the average rating differs for Italian restaurants against Indian restaurants and so on so how can we specify a subset of fields for aggregation here when we use the annotate method what we can do is Before We call annity we can call the dot values method so what I'm going to do is go back to the documentation for the values method and we're going to read this paragraph at the top this is very important to understand how annotations work in Django now ordinarily annotations are generated on a pair object basis an annotated query set will return one result for each object in the original query set and we've already seen that however when a values Clause is used to constrain The Columns that are returned in the results it the method for a evaluating annotations is slightly different instead of returning an annotated result for each result in the original query set the original results are grouped according to the unique combination of fields specified in the values Clause so an annotation is then provided for each unique group based on the values that we pass into the dot values function and as an example just below here where we consider an author query that attempts to find out the average rating of books written by each author so this query author.objects.annitate will calculate for each author in the database the average book rating for that author but if we look below that you have a different query here where before we call annotate we're calling the dot values method and we pass the author's name into that method so in this example authors will be gripped by the name so you get an annotated result for each unique author name now the difference here is that if you have two authors who have the same name their results will be merged into a single result in the out of the query so this is very important if we use the values function before annotate the grippings are performed based on these columns that we pass to values we're going to see an example of this now in our application what we're going to do is calculate the number of ratings that have been applied to each type of restaurant in the data now the difference here is that there are 14 restaurants and each of these restaurants has a particular type if we look at the type choices here they can be an Indian restaurant Chinese and so on so if we use the dot values function and be passing the restaurant type field before we use the annotate function what's going to happen is the annotations are going to be applied at the level of the restaurant type so for example we'll get the number of ratings for each of these types and that will be returned as values on these models let's see an example now we're going to go back to this script here and I'm going to remove the average function here for now and we can remove that from the results below here so before we call Dot annotate what we're going to do is we're going to call Dot values and we're going to pass in the field that we want to group by and in this case it's the restaurant type field on the restaurant model we're going to grip by that field and then we're going to calculate the number of ratings that are applied to each of these groups now before we run this script and see the output we're going to remove the dot values function here and we're just going to print the output of this statement here where we use dot values and then dot annotate if we run the script below we can see the output that we get and this time we have a single field in The Returned dictionaries from the dot values function the restaurant type is the key and that maps to the type of restaurant for example a Chinese restaurant and then we get back the number of ratings the total number applied to all Chinese restaurants in the data so this time we're getting back a result only for each distinct gripping of that restaurant type field that we're passing into the dot values function so there's only a small subset of restaurant types for example Chinese fast food Indian attack Italian and Mexican so what we're getting back here is not a result for each row in the table but we're getting back a result for each distinct grouping so this allows us to perform more complex aggregations using Django and if you look at the generated SQL by uncommenting the connection dot queries and re-execute the script let's make the terminal bigger so we can see this if we look at the group by statement here you can see that we're gripping only by the restaurant type now so before we were gripping by every single column in the table in order to get an output for each row but this time we're gripping by the restaurant type so we're going to get an output for each distinct restaurant type and the underlying database so that is the effect of using the dot values function and passing a field or set of fields to that before we call the annotate function when we call values before dot annotate Django is going to use the gripping described by the dot values function in order to compute that annotation so I hope that makes sense let's move on to the final thing in this video we're going to see how we can use the filter and order by functions on these annotated films that are added to Django models so let's go back to sales made by restaurants I'm going to delete this and replace it with this statement here where we're annotating the restaurants with a total sales field let's uncomment connection dot queries and we're going to run the script again and we can see the output here containing all of the restaurants and each one of these restaurants now has a field called total sales and we can use a for Loop to print out the total sales for each of the restaurants in that query set so let's execute the script again and we're going to get back some numbers here but you can see the numbers are not ordered in any way they are only ordered by the restaurant primary keys that are coming back naturally from that table now the Django orm does have an order by function and we can order by the results of an annotation if we chain the order by function after The annotation and we can pass the name of the annotated field and that is total sales this needs to match what we pass in here in this case we're ordering by the total sales and that should give us back a different ordering of this query set based on the number of sales that the restaurant has taken in so if we re-execute the script you can see this time the numbers are in order from smallest to largest based on the number of sales that that restaurant has made so the very first restaurant in this query set is the one that took in the least amount of money from these sales and the one at the very end is the one that took in the most amount of money and just like normal in the order by function if we use this syntax here we can get back things in reverse order so the restaurant with the most amount of sales will be first in the query set and this will now be in descending order as you can see here and as well as ordering we can also use the dot filter function to filter the results of the statement based on some sort of value that we pass into the filter function so you can see the output of the total number of sales on the terminal below let's say we only wanted to get back the restaurants who had taken in less than 300 in sales we can use that annotated field total sales and then chain the less than lookup to that and we pass in the number 300 here and then what's going to happen is we're going to get a filtered query set only containing restaurants whose sales or the total sales is less than 300 and you can see that here the value is coming back are only those two restaurants who have taken in less than 300 in sales and if we uncomment the connection dot queries and re-execute the script we're going to look at the SQL query that's generated here let's expand the terminal and we can see that we have a select statement we perform the aggregation here and cast it as a value of total sales and you can see we are using a group by operator and again we're gripping by every single field in the restaurant table because we're getting back an annotation for every single Row in the table but what we have below the group by is this halving operator in SQL and this allows us to do some filtering based on aggregated results so we're taking the results of the group by operator and we're taking this sum here and we're only returning the values that match this statement here the values are less than 300. now this has been quite a long video one last thing to show before we finish is that we can actually aggregate the annotated values so what I'm going to do is remove the for Loop and we're going to look at the restaurants that we got back from the above query and that's the restaurants whose total sales was less than 300. what we're going to do now is chain the aggregate function to that so restaurants dot Aggregate and we're going to take the two restaurants that we got back above and we're going to calculate the average sales from those restaurants and we're going to pass this annotated field called total sales enter the average aggregation function here just like that and what we can do is we can print these results to the terminal and what we're going to see below is the average total sales for all of the restaurants in the query set that we got back on line 10. now let's remove connection.quaries actually we don't need to see the query but what we can see in the terminal is that we get back a dictionary with the average sales key and the value is 252.7 for the 2 restaurants that we got back in the above query so the point of this is that we can perform annotations we can then perform optional filtering and ordering and then we can aggregate if we want the results of those annotations by passing the annotated field into our aggregation functions to return a single value back from the database so that's all for this video we've seen a lot of things in this video how to use the values and values list functions in Django we've seen how to use the Aggregate and annotate functions and we've seen how we can use the values and annotate functions together in order to perform a specific gripping operation in the underlying database in the next video we're going to look at some more advanced querying operations in Django but for now if you've enjoyed this video please like And subscribe to the channel be much appreciated and we'll see you in the next video