Transcript for:
SQL Querying Fundamentals

in this video you're going to learn the basics of sql querying in just 15 minutes and by the time we're finished you'll be wondering what all the fuss was about are you ready then let's get started hello and welcome to vitamin bi bringing you business intelligence for beginners and beyond on this channel i help you do more with data so if you're new here consider subscribing right so as i said in my intro i'm going to teach you the basics of sql querying the essentials that you'll need to know if you're considering becoming a business intelligence analyst obviously we can only go so far in 15 minutes but i promise you that by the time we're finished you'll be able to understand quite a lot of what you might be called upon to do we'll be focusing specifically on querying data not creating or deleting databases tables or data just selecting data from tables that we want to return for our specific needs before we get started let me give you a little context you might be wondering why you need to learn sql with bi tools becoming more and more sophisticated can't they write the sql queries for me the answer to that is yes and no yes when it comes to requesting data for different charts graphs and tables that you might want to present in dashboards in these cases the bi tool will have some kind of graphical interface to be able to drag and drop fields to return data but no in that in a lot of cases you'll need to write sql queries to pre-aggregate filter and select only the data necessary for the project that you're working on these are often called views and you create views mainly because it's more efficient than connecting to all data when you only need some of it so you could have a table that contains millions of rows of data but by creating a view of it your bi tool will only need to work with a few thousand making analysis and presenting results much faster so you see knowing how to query with sql is a useful skill to have let's jump onto my computer and get started so today we're working with a sql lite database that i found online containing data compiled from the fifa video game by ea sports so soccer stats i'll leave a link to the database in the description if you want to play around with it too in terms of the tool i'm using to query the database it's called navicat it's the premium version that allows you to connect to and query pretty much any relational database but they also have a version for specific rdbms like mysql sql server etc link also in the description here we're in the navicat interface on the left we can see that i'm connected to the soccerdb connection that i've created and in that connection is a database called main and in that database we have various tables country league match player etc we're going to be working mainly with the player table if i double click it we'll see the data contained within that table we've got three different ids player name birthday height and weight what i want to do is write sql queries and ask the database to return specific data from this table so i'll open a new query window now when we're asking for data we're using what's called a select statement so that's what we need to write first and if we want to return the whole table in the result we do so with a star then we specify the table that we want to select everything from in this case the table called player select star from player then we run the query and the database returns all data from that table you may notice that i've written everything in lowercase because the sequel isn't case sensitive however i have the option to what's called beautify the sequel and when i do you'll notice that select and from have been capitalized although it's not obligatory this does actually make your queries easier to read it's not so important with such a small query like this one but when you've got one with tens if not hundreds of lines of code it really does make life easier so we've selected all fields or columns and all rows from the table but what if we only want to select specific fields simple instead of the star we just write out the field names we want separated by commas like this player underscore name comma birthday i run this and those are the fields returned when we select specific fields we can also rename them by creating aliases to do this we use the as function so player name as name and we can see that update we could specify an alias containing a space but to do this we would need to put the name in quotes like this full name let's go back to all data and look now at how we can ask the database to only return specific rows in the result to do this we use a where clause with different operators for example we could select only players with a weight of 190 pounds so we use the equals operator we could also select any player with a weight greater than 190 pounds or greater than or equal to 190. you see it's not that complicated and we've gone from over 11 000 records to just under a thousand being returned we can also specify more than one condition by using either and or or so weight greater than 190 and height greater than 190. so both conditions must be met if i change it to or it means that either condition must be met in order for the row to be returned so that's selecting rows based on the value in integer fields what about when it comes to text values well we can also use the equals operator to find any row where the text matches exactly with what we specify player name equals in single quotes aaron galindo but we could also use the like operator which achieves the same goal but what if we want to select just rows from all players called aaron so basically the player name field starts with aaron here we can use a percent character after the text to look for like so and we get all aarons returned if we wanted to find any player name that ends with aaron we put the percent character before nothing that ends with aaron we could put a percent before and after which would basically mean any player name that contains aaron and we've got rolando aarons finally here's an example which puts the percent in the middle of text so here we'll return all rows where the player name starts with a and ends with n you can go even further with the like operator by using underscore to represent single characters an example would be like t underscore m percent so the underscore is asking for any rows where the player name starts with t then any character between the t and the m followed by anything and run we've got tamas tamir tim timmy etc there are also things called wild cards that allow you to go even further when specifying the rows you want to return but we won't go into those for this video we will however finish looking at the where clause by talking about a couple more operators that you can use for text fields we have in which allows us to specify multiple or statements more simply however it can only be used for exact matches so we can't use percents or underscores with it so let's look for ronaldo and messi open brackets cristiano ronaldo comma lionel messi integers we can use the between operator so i could say where weight between 180 and 190. finally we have the is null and is not null null being empty in this table we don't have any null values but if i open up the match table we can see that there are lots in there so select star from match where home player 1 is null or is not null there we go let's go back to our player table and see how we can sort results we can do this using the order by clause the rows are currently sorted by the id field in ascending order but if we wanted to sort by weight we can say order by weight this sorts by weight in ascending order because that's the default sorting method when not explicitly specified but we can change this to descending order by adding d e s c now we can see who the heaviest player is when it comes to sorting data using order by this isn't something that's overly important for our use case because when we're selecting data to create views we're usually then going to connect that view to a bi tool and use that for building individual chart queries and these tools will let you apply sorts to the data now we're going to look at how to join data from different tables to demonstrate this i'm going to use the player attributes table we can see that in this table we have the player id but not the player name to be able to create a view containing the player name plus the overall rating we would need to join the data from the player attributes and player tables i'm going to start by specifying player api id comma date comma overall rating now i want to get the player name from the player table how do i do that well because we're going to need fields from two different tables we need to specify which field comes from which table we do this by writing the table name dot field name so player attributes dot player api id same for date and overall rating now we can specify perhaps after the player id player dot player name if i run this i'll get an error saying that there's no such column as player dot player name because we're saying that we're looking for it in the player attributes table this is when we need to specify the joining of the tables using one of the four join types i'm not going to go into them in this video but i'll probably do a separate video that explains them in more detail so don't forget to subscribe so in this case we'll use an inner join and we'll say inner join player the player table on and now we specify what fields we want to use to create the join in this case we're going to use the player api id field which is contained in both tables and again we have to add the table name first dot field name equals player dot player api id and run now we have the player name appended to the player attributes table data pretty clever right but let me show you something even more clever i showed you earlier how to use the as function to create aliases for fields well we can actually do this for tables as well which makes the query much cleaner first i'm going to give the player attributes table an alias of a and the player table an alias of b i just need to write these two letters after where i've specified each table so a after player attributes here and b after player here once i do this i can now replace the table name for each field like this you see much cleaner so you may notice that there are multiple rows for each player on different dates so how would we add up all of these different values for each player well to do this we're going to want to aggregate the data using the sum aggregator i'll add this to the overall rating field and run the query ah that's not what we were expecting but that's because we haven't specified in the query how we want to group the data together this is when we need to use the group by clause when we use the group by clause we need to add into it all of the fields that we need to group so in this case essentially all fields apart from the overall rating a dot player api id comma b dot player name comma a dot date and run this is correct but not exactly what we want we want to combine all of the ratings for the different dates so we actually want to remove the date field from the specified fields as well as the group by and run that's better but let's clean it up by aliasing the sum overall rating as rating let's sort this descending order by rating descending strange that ronaldo and messi aren't in there at the top let's try and see why this is perhaps there are just more entries for some players when we sum them up to find this out we could add a count of b dot player name and yes in this case how about using an average i'll replace sum with average a avg and run and there we go that makes more sense we're going to go one step further and filter this result to do that we can use the having clause the having clause is only applied to the result of the group by function so isn't the same as the where keyword that's applied before and it's only applied to numeric values so in this case i'm going to ask for only ratings above 85 i need to write it directly after the group by and before the order by so having rating greater than 85 and run only 26 records and there we have our final query i'll beautify it and this is what it should look like you can see that all of the sql keywords are capitalized and you can see that we've come a fairly long way in a very short space of time we've only covered a very small part of sql as a whole so there's masses more to learn but what i've shown you today is a really good foundation for giving you the confidence to dive deeper did i not say that once we'd finished you'd wonder what all the fuss was about if you got value out of this video please do like share and subscribe for more videos like this one why not start with this playlist here as always thanks for watching i've been adam finer and until the next time stay be i curious you