Hey, what's up everybody? So for those of you who do not know me, my name is Caleb, from Caleb the video maker, too. And kill crate comm so I did a a design series A while ago is my first video series here on YouTube, about 25 videos long. And then I went on to make a 67 part video series over MySQL. And I did a couple other ones, I have over 100 databasing videos, and I wanted to, I wanted to try to tackle this, again, to try to make you better explain things better, and also bring you to a new level of skill. So what what level of difficulty is this course going to be? It's going to be for beginners. So why do you want to use a series you most platforms, programs, applications rely on a back end database. That's how we store information in this world, everything we do is powered by a database. So it's important that you know how to properly implement database concepts designs, to know how to properly structure data. So that way, you get the best optimized database that will set you apart from the rest of the IT team or the rest of your classmates, it will give you the skills needed to get that job or whatever you want to do with it. Also, if you're just a business person, or you have your own project, and you need to learn how to build a website, or you need a database to store all the information you're using, this series will be helpful. Basically, database design is the fundamentals of databases, you should know how to properly structure data, this series is not going to be very specific on a specimen a certain database management system. So this isn't like MySQL database design. This isn't like Oracle, this isn't SQL Server, this isn't whatever else you use, this is going to be general database design. So then you can take these skills and apply them to any of the databases that you are working on that are relational, which I mean through this series, you will you'll really understand what that means you don't get that. So the first part is going to be a lot of the concepts on the chalkboard, and I'm not going to like write tons of scripts on here. So don't worry, it's not going to be confusing to where you can't see the code on a screen. Because in reality, a lot of database design is not done on a computer first, first, you have to think about it. And then you kind of just draw your plans, whether it be in your head or on a document. Or if you use a database design program. I use a chalkboard because I can easily draw concepts, thoughts, way the way data is connected, and all that. So a lot of these databasing videos, they're going to be on this chalkboard because that's all you really need. We're not we're not programming a database, we're designing a database. And designing is art, you draw it, you design it. So a lot of the concepts from this video will be useful no matter what you will be using for your database. So database design. Before we talk about database design, let's first talk about a database because we can't design a database if we don't know what a database is, that doesn't even make sense. So what is a database? And that's a really good question. I'm glad you asked. Well, I can actually tell you what a database is. And that's what this video is about. So a database is something that stores data. Alright, another vague term data. And I want you to guys, I want all of you to think about data, very broadly, don't think too specific. Think of data as anything we can store. In a database or anything we can write down. Anything that has a value can be anything. data can be the list of every single customer of your store since 1995. Or it can be a list of every single transaction in a online shopping center. Or it can be a list of every user and their username. Or it can be a list of every single weather event. Since 1927 can be pretty much anything. It doesn't have to do with specific people data is so broad. So database is obviously broad too, because the database stores data. So yeah, just think of it like this. Here we have a database of bases where we store data. Pretty simple. Alright, so now, that's pretty vague, and I'm not sure if that was very helpful. So now let's break it down to where we see it from a technical standpoint. Everything in this world, not everything, but almost everything is ran from a back end database. It's what stores all of our information. So for example, I like to use websites, very examples, because they're very clear. And they're very easy to understand the concept of this. When you go to a website, and you sign up, you, you give them a username, you give them a password, and you give them an email, all of that, when you sign in the next time, it allows you to sign in, well, how did that website remember what your username, email and password was? That's because all of this information that you put in, when you registered, all of these values are given to a database. So here you can say is our database. Sorry for my craptastic handwriting, but don't don't even worry about it. This is what's said to be the front end. This is said to be the back end. That's because we don't necessarily see what's going on back here, we can put in our email, our username and our password, and they can get stored in this database. That way, when we come back again, and we log in, we give them a username and a password, it takes that value, it checks it with the database to allow you to get into the account. That is a perfect example of a database. So the database can store that information. But it's not limited to only registration, you can use this to store all of your records of sales in a store. And also for that just goes on. Don't think of it as a limited thing, because there's so many things you can do with the database. So many things. Next question you may be asking is how does it differ from, like a spreadsheet, for example, so a spreadsheet, basically, if you don't really know much about spreadsheets, it's basically where you have columns. And then we have rows, and we can store information about like, we can enter the values for each column. spreadsheets are good, if that's what you need for your very small company sees me. But if you want to get to any complexity, where you can, like, imagine if we wanted to select, let's say this was a spreadsheet for all of my employees, I own a huge business with 100 employees. And I want to select every single employee who has missed less than three days and have been a an employee for at least three years, and I want to give them a $100 bonus, well, that's something if I had a spreadsheet, I'd have to manually go through that check every single value, that's not gonna work, because time is money in business. But anyways, with a database, we can run queries to get those kind of values to figure out which employees have been a member, a employee for three years and volumous, three days or less. So that's how it differs from a spreadsheet, basically, what you can do with it, if you have a spreadsheet with 10 employees, or like I'm sorry, with 10 values, and that's like you can figure out all of your information in a couple of seconds, well, then you might not need a database, that's up to you. The other thing is with with spreadsheets, it's it's a it's either like all data or no data. Let's say I just this stored every single detail about every single employee that I had, and all I wanted was their name and their address. Well, we can't really do that easily with the spreadsheet. Unless we take all those columns and like copy and paste and all this extra garbage or find some algorithm to do things fancy Lee, if you know what you're doing, you can probably do it with a database, we can selectively choose which color. So we can say only these two columns Do we want returned, and we can ignore the rest. On top of that, we can allow different users to access different information. Back to the website example. We have Users page, and then we had a database. When here you put your username, email and password that's stored on a database. But that's not necessarily the only thing that's stored on a database. A user might only get to access that but a database administrator might get to access when they registered, the transactions they've made on your shopping website or what they've watched what what they viewed, the comments they've made, how often they visit the web page, it's pretty much limitless to what you want to do with the database. So we can see that two different users, this person only gets to view some things at the database. This person can only can view pretty much anything with a spreadsheet. It's kind of all or nothing. Do you have access to the spreadsheet? Yes, I do. Well, here it is. If you don't have access, well, then too bad, you don't get any of it. So that that's awesome. That's a security feature as well, because you don't want this person to get all of the hidden information that's stored within this database. So hopefully that makes things pretty darn clear. The next video, I'll kind of be talking about how data is stored in a database. Because I didn't I was going to get into that in this video, but I kind of ran out of time. But for the sake of just your understanding, if you decide not to watch the second video, which you will watch the second video for I relations, that's a term where we get the term relational database. So this series is for what's known as a relational database. That is a specific kind of database. And that's going to be most of the relation that's going to be most of the databases that you work with. If you're working with database design. We're going to be talking about the relation, very basic. Don't freak out, but it is math, we're not going to go super in depth with math, if you want to know more about the math, I have blogs on my website about that. Caleb curry.com, forward slash blocks, advertising, not blogs, advertising house advertising, anyways, basically, a relation is just a connection between data. So let's just think of silly example, if we have the two sets of numbers, two, four, and then the other set was six, eight, and then we kind of combined these sets. So we've kind of like crossed here, and then we cross down here. And where are these cross is our new set. So we have the combined set of six, two, and then we have the combined set of six, four, and then we have the combined set of eight, two, and then a combined set of a four. And like you're probably wondering, like, what the heck does that have to do with database design? Well, somewhat, we kind of do the same thing with database design relation in relational database comes from this mathematical concept. But rather than combining sets of numbers, we're combining attributes of real world things. So let's apply this to the database. First, you need to know two terms. The first one is an entity. The second one is an attribute. So an entity is anything we store data about attribute or the things that we store. So if an entity is a person, an attribute would be their name, their username, their password, their hair color, their address, their phone number, their relationship status, the person they're in a relationship with the orders that they've made, or whatever, whatever it is, it's depends on what you need to store for that specific application. So entity is what we store the data about. attribute is what we store. When we apply this to this, this connection with the the the relation, we have the attribute types and the attribute values. So let's say we have this person over here. And we'll name him Caleb, because it's an awesome name. All right. And then we will also give him a username Caleb curry with no spaces. And we'll give him a password. Pi 123. And then over here, we have the set of attributes for a table we have name, username, and password. We can make this connection of data so the name is Caleb the username Caleb curry passwords, pi 123. That makes sense, although graphically It looks like a mess. So we store these in tables, which is exactly what we were doing with the relations. So we have up here in our columns we have we have the name and then we have the username and then the password. Don't worry if you can't read my handwriting you get it. We store this in a table. Right? So these are the things we need to connect with this person over here. We named Caleb and I accidentally erased them, whatever. We'll just leave it at that. So that's Caleb. And we want to store this information in this table, which is a graphical way to, to illustrate how we're storing data in a database. So we take his name, Caleb, we take his username, Caleb curry, we take his password, pi 123. We kind of get this, this table concept, it almost looks like a spreadsheet that we talked about in the last video. Although it is slightly different, and we'll be talking about that as time goes on. This guy is the entity that we're storing information about. These are the attributes that we need to store about this person, or any other person. So we enter specific values for each one of these attributes to make the relation between the attributes, and the entity. Hopefully, that makes somewhat sense. And if it doesn't, don't freak out this all kind of I mean, this is complex thought. And you don't necessarily need to understand every single detail to design a database. So I'm just trying to explain things. And we'll get through. These are the attributes of this person, this entity, an entity is basically just anything we store data about it's something can be a person in this case, but it can also be an order, it can be a customer can be a client, it can be a employee, an employer, whatever it is, it doesn't even have to be a person. So now, the relation idea, you kind of understand that we're just making the connection between the attributes, and the entity, the value that the entity attributes, we're entering that in the database. That's, that's kind of a weak explanation. But honestly, it doesn't really matter that much for getting into basic database design. So if you want to learn more about that, just look up relational database design, you'll figure it out. So now, we have the entity over here. And then we have the attributes. So in table terms, you probably know, rows and columns, you might not know specifically what they are. And I'm going to tell you this right here is a rough. You see that might be run out of space here. I think we're good. Yeah, we're good. So this, all of these values are a row. So a row is all of the attribute values for a specific entity. So this row would be easier just to go like this, this row, points to that entity, that person. Pretty simple. A column are all of the values for a specific attribute type. So the attribute type is username, we can have another person Billy Joe, and his name, his password can be within you, and his name can be bill. This is another person in our table. So this is talking about a different person, we have this person over here. pretty chill, or this row is talking about this entity. The username is a column where we have all of the attribute types username, So Caleb curry, Billy Joe, and whatever more if we put more values into the table. Okay, so it's again, a little messy, so I'm just going to clean it up a bit. Erase. So just to summarize a bit, we have very first thing we have an entity type. That's that's the word you want to know entity type, which talks about entities I'll just I'll just tell you what an entity type is in just a sec, entity type. And then we have the attribute type. So basically, the type means it's a category. So the person was a specific entity, the case the guys named Caleb, that was an entity, it was the person then we also had the entity, Joe or Billy or whatever his name was. Those are two separate people to separate entities, although you could tell that they were somewhat similar. They're both in the same table, they were both talking about the username, password, and so forth. So the entity type is user, a specific entity would be Caleb, or bill, or Jamie, or Jake, or whoever it is. So the entity type is basically the category of the entities that we're storing. The category is a user and all the all the entities within it should be a user. So my camera stopped recording by itself for some reason, which is super annoying. But anyways, our entity type is the category of entities such as user, every single entity should be a user. Caleb was a user, Billy was a user, we could have Jimmy john, we could have got Katherine, we could have Caitlin, we could have Cassandra, we could have, you get the point. The attribute type is basically the categories of attributes. So we had the username, we had the name, we had the password. These are attribute types, because we don't actually have a specific value. If like, no one's name, his name. When we actually give these specific values, they no longer become types, and they become specific examples. So this guy over here, here's our entity, we can store. So basically, here's our entity, we can store attributes about this entity, the username can be Caleb curry. The name can be Caleb. And the password can be pi 123. So these are specific attributes about that guy. Now all these are all like complex terms, and blah, blah, blah, blah, who really cares? What really matters is can you put this into a table for people to understand who really cares if you know the terms, but I mean, I'm sure your professor care. So making that anyways. Table, the columns or the attributes, so we could have, for example, an ID, a password, a user name, and email, we could keep going address, phone number, and whatever else. And then a row is where we give a value for every single one of these attributes, all talking about one entity. So like this guy, this guy right here? Well, his ID, for example, 72, his username is Caleb curry, his email is cool, guy@hotmail.com blah, blah, blah, blah, blah, all of this row talks about this guy. So basically, just think attributes are columns. entities are individual rows. Sweet, then this whole thing right here. This is the table, and that is the entity type. So the entity type would be user because every single row within this table should be a user. Oh, another thing I would like to mention real quick, another name in mathematics for a row is a tupple, or tuple. I think it's terrible. I don't really know. But that is another term you'll hear. So yeah, tupple is a row. Thank you. See ya, sorry for like yelling and getting all intense, but fun. Alright. This will be about database management systems. And because this series is over relational database design, it will be specifically relational database management systems. And we'll talk about the difference. So first, let's just talk about in general, what is a database management system. So this is often shortened to B dB, and S for database management system. And relational database management system is the same thing except it starts with an R. So the our DBMS, so I'm just gonna write that because it's easier. Alright, so where we left off in the last video, we learned? Well, actually, like both of the videos, we learned that a database can be used to store a ton of information. So we have this database, just think of it like a barrel with a ton of information just piled in there. So like usernames, and like passwords and junk, and transactions. And all this stuff is listed in here. And as you saw on the last video, it looked very similar to a spreadsheet. So what makes it different, we have all this data. Now what do we do? What do we do? Just look at it? No, don't be knew, obviously we do cool things with it. Obviously, I don't even know what that was about. So what we can do is we can run what's known as a query. So a query, just sort of like searching our data, doing something with it, and giving us cool results. So a database management system, uses our data in manages it, allowing us to view it in a human friendly way, and do cool things such as search for values, change the appearance of data, changed the way data is stored, and so forth. So for example, we have 6 billion users, okay, just for just for an example. And we want to delete anyone who hasn't been active in a year. Well, without a database management system, all we have is a bunch of data. And we'd still have to go manually do all that and delete all that. So the database management system, is what allows us to easily run a query, alright, select everyone who hasn't been online in a year. And then we can delete everyone who basically, basically just delete everyone who hasn't been x active in a year. Basically, a relational database management system is just a sub category of a database management system, a specific kind that is designed to work with relational databases. So yeah, not much of a difference there. So just some of the things that these database management systems can do. The first one is obviously run fancy queries to give us specific results, such as give us all of the transactions that were processed before 2011, June 10. That's something fancy we can do with a database management system. It also allows us to change the way our data is presented. That's something known as a view mechanism. So basically, if you mechanism allows us to change the surface appearance of our data. So when we store our values in our database, we might have it to where it's, we're going to store an ID, we're going to store a username, I'm just going to shorten these up to make it easy. We're going to store username, we're going to store a password, we're going to store an email. And then we're going to store street address, state, city, so forth. But let's just say this is how it is for now. Here is the actual table structure for the user table. So here are all of our attributes for a user. Get rid of this. So it's clear? Well, with a view mechanism, we're able to get different views of this data at the surface level. So for example, if we had two people accessing this data, one, what's his name him, Jim, and the other one? Let's name her, Jane, ie, Jamie, Jim and Jamie. Let's say Jim, he doesn't really care about the emails, because all he's working on is to find out information about usernames and passwords. So what he might do is he might create a new view of just the username and the password. Once again, I'm just shortening these, just to make things simple for illustration, I probably wouldn't put them in the database like that, unless you wanted to, but whatever. So this is a specific view, we can select the username and the password. But the internal structure of the data has not changed. This is still how it's stored within the database. Jamie, on the other hand, she cares about the ID and the email. So she creates a new view. Just like that. So this, these are both called views. So the view mechanism allows us to create different views. This can be used for database administrators, as well as for other applications, such as when we have a web page, we may want. We may only want users to be able to access their username, password, and email. But stuff such as the date they registered, that might be all private on the database. So then we can just create a view giving them the information that they have access to and that is a layer of security. So basically, if we have a huge database for a business, well, the database admin administrator is going to have access to it all. The president of the company will probably have access to it all. But the people who are advertisers, they probably only need to know information about advertising, the the people who monitor comments on your website, they are not going to need access to the sales and the transactions, they're going to need access to the the table for comments. That's a security feature that a view mechanism allows us to do by giving different views. So not every single person who uses the database has the privilege of creating a new view, just because you can access the database does not mean you can do things such as update data, change the way it's structured, and so forth, that is probably something only the administrator or the owner of the business is going to be able to do. That means we can allow users on websites, they can only ask, they can only access their specific username and password, they can't access everybody's username and password only theirs. But if this guy, Jim, who is hired at the company, he can access the username and password of every single user. That's a security difference difference between the two people, not allowing the user of the website to get too much information. And at the same time, not limiting Jim too few too little information. So that is a really good security feature. So what else a relational database management system allows us to do transactions, which we'll be talking about as the time comes, basically a transaction is what it sounds like, it's when we do something with the data that for the technical terms, basically, it is either complete all the way or it doesn't work at all, if we have a multiple step thing, such as transferring money to an account, deducting that much money from the other person's account, so forth, updating the data boom saved. If at any time the power goes out, and the server crashes and it doesn't complete, well, then the transaction is canceled, nothing is saved. That's something you can do with a relational database management system. So that's pretty much the basics of relational database management system, just in case you're not really sure, like the terms and stuff, examples of relational database management systems or just database management systems. MySQL is one. It allows us to build and run a database. SQL Server is basically like the Microsoft, if you like Microsoft, it's run on the server, and then you access it. So you have Microsoft server, we have Oracle database we have, I don't even know how to pronounce that I post gray SQL or something, and so forth, the list goes on. But basically all of the big terms that you hear in databasing, they're referring to the relational database management system, or the database management system, if it's not relational database, don't really don't really think of the database and the relational database as two separate things. They are what they're used as one. But if you do want to be like conceptual, well things, we have the database, which stores the information, and then the relational database management system, which allows us to manipulate that data. When we have something such as MySQL. We don't really have a difference between these two. And everything's just kind of like, okay, we use MySQL to create the data. It's stored. My SQL is kind of like all we do the actual files, they're stored on hard drives on a server or your home computer wherever you're running your database from. And the relational database management takes the data on the hard drive and puts that into presentable tables for people to like the administrator to view because if we have like a hard drive disk, let's just say this is a hard drive disk. And some of the data is going to be stored like right here. Other parts of data that's going to be stored like right here, and some right here. Well, if this is a table the relational database management system is going Want to be able to take that and put that in the appropriate location to make it presentable, so don't really think of it like, Oh, we have a database, and then a relational database management system, it's kind of think of them as one and all in themselves. So with stuff like MySQL. The other thing, finally, I know, I'm kind of running long on this video. The other thing a relational database management system is going to do is it's going to create consistency. In bigger piece of chalk here, consistently, it's going to make consistency behind for the front end. So on a website, we could have someone put their full name in one box, and that can be stored on a database as two separate columns. first name last name. Well, let's say if we updated this, and we decided to make it one column for some reason, which I don't recommend, well, the front end didn't change, because the only thing that changed is the back end. With a database, we can have consistency on our front end to make our users happy, never having to worry about things changing. On the back end, the things can change. For example, the way we store dates or our time zones or our character sets, well, that's generally, that's not going to affect the front end, unless the data on here is directly affects the front end. For example, if this is a username column, and we delete the username column, well, that's going to cause issues, because it's not a column on the database. But basically, when we have a database, the front end is generally consistent, as well as we have server side scripting languages, such as PHP, or whatever else. Well, that is going to hide our database and even even step farther. So basically, rather than saying, oh, there's no username column, it'll say something like error, try again, later, something very vague, so people don't understand like, Oh, crap, I can pack this guy because I know his database structure, I can figure out how to mess up his website or his business or whatever. So yeah, that was a server side scripting language. I'm not gonna write that out, because it's huge. Yeah, that is the basics of relational database management system. Hopefully, that was helpful. In this video, we'll be talking about SQL. So s q, l. Alright, so what is SQL? SQL is a programming language used to communicate to a database. Basically, we speak English databases do not so SQL is kind of like the mediator between computer database and human English, think of it like that. It's almost English, it's very simple. And in this series, we're not actually going to be programming any databases, although we will be talking about general SQL concepts. So it's important that you know about SQL. So SQL is not like a specific relational database management system. programming language, it is a general language used for every single relational database management system. From vendor to vendor, such as MySQL to SQL server or something else, it might vary a little bit, but the general concepts of SQL stay the same. So this is this is going to be a general concept that you should know. So SQL first is used to define the database structure. And then it manipulates the data within. So basically, we can kind of think of SQL as two categories here. One, it defines the structure and then to manipulates by that it means we insert data, and then we can search it or delete it or update it. These are the two subcategories of SQL. We actually for some reason, name these their own languages. So this is data Definition Language for DDL and this is data manipulation language, or DML. Some people may say there's more categories, but I mean this is pretty much the domain to so think of it like this. I'm just going to erase this here. Now that we have this. We have DDL DML when we have like a table, for example, I'm just gonna write this down here. And we have the columns. So we have like ID user name, password. Well, that's data Definition Language. data manipulation language is what we would use to insert a new value, such as the ID of 72, the username of Caleb curry, the password of pi 123. That's data manipulation, language. So when you think of a table, and if, obviously, if you have a big database, you'd have multiple tables. Well, all of this structure and the connection between tables, that's all data Definition Language. So it gets all of the main structure of the database. And then the, the data within the database is data manipulation language. So those are the two main categories. Just I just want to talk about some of the features of both of these languages. So basically, when we define tables, we have a create statement. So basically, all SQL I write in all capital letters. That's just a pattern that I've chosen. So create is an example of an SQL command used to create tables. And we can create a database this way. That is data Definition Language, because we're defining the structure of our database, a command such as update, or, yeah, I guess, update if we're updating the actual values within columns. Well, this is manipulation. language, because we're not changing the structure, the columns stays the same, just a specific attribute of an entity changes. we're updating Caleb Curry's username to Caleb curry 123. Well, that's a data manipulation language. Another thing that SQL can do is what's known as a join. And joins are very important, very important with a relational database management system. Because with a relational database management system, we break things up into separate tables, for simplicity sake, for one, sorry, I was I added the scene there, I think I was whatever. Rather than having a spreadsheet with like tons of information, we break it off by entity. So here's a table about the user, here's the table about the orders, here's a table about the customers, then we can join those values to get a new, larger table, a new view, that looks like a huge organized table. I mean, basically, it looks, it doesn't look like we broke it up into multiple tables. So if we have a table here, and then we have a table here. And let's just say we have, this is a sale table. And this is a user table. Within the sale table, we have a sale ID, we have a user ID the person who bought the stuff. And then we have the objects that they bought. And I mean, this is this this silly example just for you to understand the concept of a join. And then within the users table, we have the the name, the user ID, and so forth, we can do a join here, we could take the objects that they bought, the person the objects that the person bought, as well as the name of the person who bought it, connected by the user ID, which will we'll talk about this more, then we can get a new view, which has the name and the items that that name, but it's all talking about the same entity, this user. So a naming convention is just a pattern that people do or that you do to keep things consistent. So not only will this naming convention help keep my database consistent across different tables and columns. But this naming convention is not only mine, other people use this naming convention. That means when I go to a different person to review my database, they already understand what's going on. Now, obviously, there's multiple different naming conventions, there's not one way to do everything, and it's not required. This is just how I decided to do it. And this is because most people who use my SQL, which is what I learned first use this, but for different relational database management systems, you may have different systems of naming things. So I'm just going to get Explain what I'm going to do. Anytime I write SQL, I put it in all capital letters. So select is an example. Fortunately, we're not going to be doing much SQL. So I'm not going to have to write in all capitals on a chalkboard because it's, it's quite complicated. But when we're talking about anything I named myself, I use all lowercase letters, I just use all lowercase letters, because I just think it makes it more simple. So when we have a table, such as a user table, I don't capitalize the EU. And then when I have an ID, I just put user ID or just ID whatever, but I don't capitalize, Id I just keep it lowercase. And for spaces, I just use an underscore, so I don't use any spaces. So basically, that's my naming convention that I use, and yours will vary. And we haven't really got to this yet. So you might not understand this. But if I have a foreign key, basically, if I have a copy a connection over here, to where we have the same column over here, I just make sure their name both the same thing. So I know what they're talking about. So yeah, I, some people will do things such as put like user table, or they capitalized certain words. And if you want to look at other naming conventions for your specific relational database management system, go right ahead. In fact, I'd encourage it. But just so you understand what I'm doing here, I will be doing another thing, when we're talking about stuff in the real world. Sometimes I don't necessarily use that naming convention. Because if I'm writing everything I need to store about a user, well, I might just put, we need to store the user's address, I'm not going to put the underscore users underscore address, I might just put the user's address with space in English. And then when I convert when I start making this into a database, structure, schema, whatever you want to call it, then I'm going to convert that to just address lowercase. We're going to be talking about Okay, what is database design? And when we talk about what's a database, we never really talked about database design. Why do we have to design a database? What What's so special about it, it needs to design it's not like we're building a house or something? Well, in a way, it's it's kind of like that it's we're building schematics to build the best database. How do you measure whether a database is good or bad? Well, it has to do with data integrity, which we'll talk about that in more depth in an upcoming video. So data integrity is basically where all of your data is correct, it's up to date, there's no disconnected data. For example, if we have two tables. And for some reason, these tables are connected in some way. Well, if if for some reason this link breaks, well, then when this table updates, this table gets left behind, and then the data becomes incorrect. That's a data integrity problem, because the data is not managed properly. So with a good designed database, we prevent data integrity issues. So all of our data is up to date. We don't have, we don't have repeating data, we don't have old data that should not be in there. Which, if for example, if you have a database, you can store lots of information from like long ago, and that's fine. But you don't want data that has recently changed, and it hasn't changed in the update. So you want the database to always be up to date. Unless for some reason you're storing like an archives, let's say you have a store, and you want to store an archive of every single order you've ever made or sold or whatever, well, then you can store all that in a database. And that doesn't really need updated because it's an archive. But if you have, if you have a customer in your database, and the address is out of date and doesn't update, well then you have a problem. Another example is if the person up has four so if your database is designed badly and you have repeating data, well then one example of the data can update and the other ones left behind. So now you have like two addresses for one person, which should just be one address. That's another example of a data integrity problem. So we'll be getting into like specific cases of data integrity problems as we go on. But when we talk about data design, database design has Whole, people usually break it up into three sections, two or three, it just kind of depends on, like, what people mean it, it varies from person to person. So okay, first, we have the conceptual. And then we have the logical. And then we have the physical. Alright, these are all known as schemas of schemas, whatever you however you want to pronounce it, but that's this. Alright, so we have the conceptual schema, the logical schema and the physical schema. A schema or us just think of a schematic, like when you're building something, it just tells you the way the data is structured. The conceptual, this up here is when we're just kind of thinking about things, how things are related. We don't think of limitations as an Oh, we're not going to be able to do that, because we only have this much availability or something. This is kind of like unlimited brainstorming phase. And as we move down, this is more specific to where we say, Okay, this is how we're going to program our database. So up here, this up here is more general. This down here is more specific. Once you get down here, you have to start worrying about, alright, what kind of relational database management system Am I going to use. But when you're up here, you can just kind of think of general concepts how data is related. So the farther you go down to the physical, the more specific things get. I mean, that's pretty obvious. So rather, rather than thinking of it like three steps, I kind of think of it as a continuum, we are taking our general ideas and moving them to specific implementations. Hopefully that makes sense. But basically, when we talk about these individually, the conceptual just talks about how data is related. Because in a relational database management system, we have relationships between tables, if you think of, we have a users table. And then we have a sales table. And these users can go to this website, and buy products. That's our example. Here. It can be anything but this is just an example. Well, every single sale has to be bought by someone, specifically one user. So that means there's a connection between the user and the sale. So we have this connection here. That is part of the conceptual design, we say, this table, the sales depends on the user. So that's, that's because with without a user, we can't have a sale, because there's no one to buy the stuff. And a user doesn't doesn't necessarily have to have a sale. So it's dependent this way. That would be the parent, that would be the child. Hopefully that makes sense. When we get to the logical, we start planning out our table structure. So we have all these relationships between data, well, then we can realize all right, we have specifically a user's table, and we have X number of columns, the username, the password, the ID, the address, the phone number, the credit card information, if that's something you need to store, anything like that. Then we have another table. And is the the sales so we have the sales table. So we have user, use a table, and then we have the sale table. And then this, this is connected in some way. So we have user can have multiple sales, and a sale can only have one user. That's a logical database design, because we're actually structuring our columns are data types, our relationship for drawing those out, so we have this logical drawing or schema, the physical, that is when we start actually implementing that into a database. So when we're talking about just designing and we're not actually creating the database, the physical design would be figuring out what relational database management system will work best. What are our table types? What what server Are we going to store this on? How are people going to access this? Is it going to be over the internet, it's going to be over computers programs. So usually database will be on a server, and this will be accessed by multiple different people. That's because the database is like the core of all data. For example, if we only had this one person using the data, well, then we wouldn't need this huge database, we can just store that information on their computer, because they're the only one on there. It doesn't even make sense. So typically, everything's kind of accessed on a server. So the physical would be okay, what kind of server we're using? How are people going to access our data? Is it going to use a webform? Do Is that something we need to program? And then when you physically implement that is that that's when you actually program the database, build the webform, or whatever you are doing, installing the server and testing, building the users who who's going to access it, the views? What are the different views going to be? What data is going to be returned, and so forth. This is also when you really focus on security. Because logically, when we're in the logical phase, we're still kind of thinking, Okay, no one's going to try and break it, no, everyone's going to use the database correctly. But in the real world, that doesn't work out that way, people, either they try to break things, or they just don't know what they're doing. Or there's errors or there's incorrect data. Well, that's when somewhere along this continuum right here is when we need to start worrying about security. We don't want people breaking into our database. And we don't want our database to fall apart and have data integrity issues. So hopefully, that clears some things up. So the database design, something's up real quick. database design, is using skills that you know, to build a database that is not going to have data integrity issues, update anomalies, we'll talk about those two, but that's basically when if for some reason you're storing two of the same data, like if my address is on in the database twice, well, that's a problem. Because when I update my address, only one may update, for example. And now the address says I live in two different places. That's, that's wrong. So a database design is a method to separate information over multiple tables, rather than having one huge table like this, where I store my address, we store our sales, we store our, our customers, we store our our favorite video games we store our relationships with. Yeah, you get. So basically, this is a bad design, because it's all in one table. And when when I have to update it, so we have, let's just say this is a row, right? And I have the columns are ID, and username. And then we have us again need to be a little bigger. And then we have username, we have address, let's go phone number. It's easier phone. And then we have favorite color. When we store like this, well if we need to put a new favorite color, we have the ID seven. That's me, we have the username Caleb curry, we have the phone number one 800. Favorite color cammo. And then if I want to put a new favorite color in, well, we can't because we don't want to we already put one in there. So we need to create a new route, say seven Caleb curry one 100 dot dot dot. And then favorite color. Green. Right? Well, now we have all those duplicate data, we have this in here twice, do this in here twice, we have this in here twice. And we have, we have to have two rows for favorite color. So your you have to find a solution to this problem. But now if I decided to update my phone number because I got a new phone number, well now I'll update it and I'll change it to 1234567 whatever it is, well now I updated my phone number but for some reason, we have a data integrity issue, because my phone number is in here, two different ways that that's a bad database design. And in this series, we're going to be talking about fixing problems like that. And the best way to do this would be to like separate it into multiple tables. Plus, you probably don't need to store your favorite color. Unless you're running some kind of like art website or build a business, so it's possible. But we can do the same thing with, like orders how many orders if I had to put my orders in that table, which is another bad idea, we have the same data integrity issues. So this series is going to talk you through getting rid of those data integrity issues, so you have the best database design, for best performance. We're going to be talking about data integrity. Alright, so what is data integrity, we talked about it a little bit in the last video, but this video is going to be going over in more depth. data integrity is just having correct data in your database. So when you're when we're databasing, we don't want repeating values, we don't want incorrect values. And we don't want broken relationships between tables. For example, in the last video, I gave the example, you have a table here. And this is the user table. This is weird, right? And like this, and then we have a table over here. And that is the sale table, which is where you sell products. For example, this is just an example. You can you can implement other examples. But this is just to illustrate my point. A sale is when a user buys product, that means a sale needs a user as the buyer. Otherwise there's no sale, we can't have a sale, if no one bought it. Does that make sense? That means the sale is dependent on the user. And we have a relationship the user buys stuff, stuff is bought by a user, this is a known as a Can you see this is known as a relay ship on ship. Now, do keep in mind that the term relational database does not come from the word relationship. It comes from relations, which is the mathematical connection of sets, which we talked about that in the earlier video. But just don't get that confused. But relational database does have relationships between tables. So this if for some reason this was broken. And now we have a sale that was bought by a user or no user. So for example, let's say this sale was bought by user with the ID of seven, okay. And then we have a user of ID of seven. So there's a connection, the sale was bought by the user by ID of seven, which points back to the user. Make sense? That's data integrity. But if this relationship is broken, and it says the sale was bought by the user with the ID of seven, and then this person is removed from the database. Well, now this sale points to a table or a user that doesn't exist. And that might be that might be practical, in some sense to say that sale was bought by that user one he did exist. But for things that need to update consistently, we don't want that to happen. So that's example of integrity issues. So you want to keep that relationship to say, Okay, well, we got rid of the user, we need to update the sales to say, well, either the person is deleted, or also do something with the sales. So yeah, sorry, if sounds a little confusing. Okay, so the three main types of data integrity. First one is entity integrity. We talked about entities in like one of the first few videos, an entity is anything we store data about. So we had the user to begin with. This is an entity. What it means by entity integrity is basically unique entities. When we have a user, we often give them IDs This is known as a key. So an ID is used to enforce uniqueness among the entities. Like for example, if we had a table for users, and we store the the name of the person and then we store their address. Now let's just go to phone number. Because when we talk about addresses, it's best to break it up into like street state, and I don't want to go through all that. So we could potentially have a person with a repeating name. So let's To say his name's Caleb, and his phone number is one 800. You wish? All right? Well, if we had another person, if they both lived in the same house, for example, and this was their house phone, well, sometimes people name their children the same name as them. So now we have two people with the same phone number. And you look at this and say, Okay, well, either either we have two people with the same phone number, or we have two rows talking about the same person. And we don't know, because we don't have proper data integrity. So a way to solve this is to either add more columns to where the row would be unique, such as a social security number, or we could add an ID on. And this would be like six, and seven. So now we're talking about two separate entities, this is talking about this guy, because his ID is seven. This one is talking about this guy. Because this ID is six, they're two separate entities, that's entity integrity, having uniqueness among your entities, we have uniqueness with the rows, and we also have only one user table, we're not going to have another user table, because that information should go in the original user table. Onto referential integrity. This is the one we talked about earlier, is when we reference the ID of a table and another table. If we had a comments table on the website, right, and we say who the comment is by, we have a user ID, which points back to this ID over here. That's because the comment has to be posted by a user. Otherwise, we have a comment that didn't have someone who posted it, which doesn't make sense in the practical world. So we always have to have referential integrity, which basically says, Every time there's a comment, there needs to be a person who posted that comment. And that connection needs to remain it needs to stay. If for some reason this is disconnected. And then this user is removed for this table was changed, well, then this could change. I mean, this stays the same. And then there's a disconnection to where this is changed. And this is old, outdated data, where either the user no longer exists. And the comment still says he does, or something else like that. That's referential integrity. And that'll make more sense once we start designing and stuff. But yeah, domain integrity, that brings up a new word domain. domain is basically just the acceptable values for a column for say, it's what we are storing the range of what we are storing within a database. So when we're talking about tables, we have columns. So these columns up here, let's say we have a phone number. Well, that means we should have a certain number of digits, we have the area code, and then the three digits and then the four digits. And that's at least us in my change other places, but that is 10 digits. So we know that the phone number should be 10 digits, and it should be numbers. So if for some reason I put the word cake in the phone number column, Well, for one, this is going to cause an error if we have domain integrity, because it's not a 10 digit number. It's just a word. That's not the proper domain. It's not a digit is not numbers, and it's not tenable. So we know that's going to cause an error. So that's the basics of data integrity. When we don't have data integrity, we have errors. When we have data integrity, we do something to correct those errors, such as saying, all these all need to be numbers. That's how we implement data integrity by setting database rules and how you do that with the database. Well, that kind of varies upon the relational database management system that you use. Some relational database management systems allow for general rules, saying okay, this has to be a range of this numbers. Some are more vague where they, where they only allow you to implement data types, data types would be like, integer 10 digits. So when we talk about practically implementing data integrity, different relational database management systems have different things you can do almost every single relational database system that I have, I'm pretty sure all of them, but who knows. They allow for data types. So a data type is pretty, pretty self explanatory. It's the type of data. So in database, we classify things as generally, integer, or text, or date. So we either have numbers, text, or dates, that's like three general categories, which we'll be talking about that more in a future video. But the data type can put limits on what we're allowed to store. For example, we can say, okay, we only want a character column with the maximum of 20 characters. That means we can put a string in the column that says, I like p 1-234-567-8910 1112 1314 1516 17, that'll fit. But if I wanted to change this pizza to lasagna, well, then I just added a couple extra characters that might cut it off, I'm not sure 1-234-567-8910 1112 1314 1516 1718 1920. So that's the maximum we could put in that data. It's like, that's an example of implementing that data integrity for the characters. So this, this will be for the domain integrity, we can also have referential integrity with what's known as foreign key constraints, which allows us to connect to tables, because when we drop tables, we could say that they're connected in some way. But how do we enforce that in a database, we use foreign key constraints. For example, we can say if this is the parent, let's say this is the users. And this is the comments. The user places a comment, so this is the parent This is the child, because the child can't exist without the parent, you can't have a comment without a commenter. Well, using foreign key constraints, we can say, Alright, if this user gets removed, also remove all of the comments by this user. That's an example of foreign key constraint. As for the non repeating data, that is done with just designing your database in the best way. So yeah, I know that video was kind of jump back and forth from concept to concept and maybe a little confusing, but I'm sure you will get it as time goes on. I just want to do a little bit of review, as well as go over the vocabulary that we've learned. So if you don't really want to know, we're going to go over some new ones too. But if you don't want to review, then feel free to skip this video in the next one. So this video will be about data and database design terms. So the very first term we learned was data. Data is pretty much anything we store in a database, you pretty much know what that is sounds pretty good database. That is what we store our data in. And then we have a specific kind of database, which is a relational database. And what's special about a relational database is that it stores things in tables. We have a database management system, or database managed DBMS for short. That is how we manage a database using code. That's how we control our database. We have a specific kind of database management system called a relational database management system, which is just used to control our tables and the values within our tables of a relational database. Pretty Simple. Now we're going to introduce a new word to you, which is no or no, however you want to pronounce it. I've been told I pronounce it wrong, but I don't really care. Anyways, this is when someone does not enter a value within a column on a table. So if we have a table, and we have something such as fax number, well, not everyone is going to have a fax number. So when we have a guy will say we have an ID here, we have an AI guy with the ID of seven. And we don't have a fax number, this emptiness is said to be no. So basically, you'll hear empty value, although, by definition, No means no value. So it's kind of a bad definition. But who cares? No, basically means there is no data in that specific field. And we'll be getting into more terms like that in a second. So that is what normal means. That's one you will need to know. Another one is an anomaly, or anomalies. So, anomalies are basically errors within our data in our data integrity. So when we have an anomaly, it's something that goes away from what we expect or from the normal. When we update something, for example, and instead of updating one column, it updates 10. And it wasn't supposed to. That's an anomaly. So yeah, you'll need to know that as well. The other one is integrity. We implement database integrity, to protect against anomalies. So we went over three different kinds of integrity. We went over entity integrity, over referential and we went over domain. Real quick what these are, entity integrity is saying uniqueness among the table, referential keeps the connections, the keys, foreign keys, primary keys, they keep them connected across multiple tables. So we could say these two tables have some kind of connection. And then domain is basically saying a column within a table has all of the expected values. So we have a phone number, we would expect numbers, not text or not a date, that is domain integrity. And the word domain is basically the range of values that are acceptable to store within a column. So these are our data, vocabulary that you should know. Now we're going to move on into the design term. So designing is what we're here for. So this is fun. Very first thing you need to know is an entity. Which is anything we store data about a user a, a mortgage, a transaction, a credit card information, anything the comments a comment, they're all entities, that's what we're storing. data about, then we have attributes. attributes are the things we store about the entity a comment, a, an attribute of a comment would be the date that it was made the link, how many characters is it? who posted, that's an example of an attribute a relation. We're talking about mathematics, it's a connection between two sets of data. When we're talking about databasing. Put very simply, it's just a table. So a relation is another name for a table where we have data within our database. So we could have a structure of different tables. So these can be connected. Just I don't know however you want to connect them. This can be our database, where each one of these different sets of different sets of information or data is known as a relation is a table. next term is a tupple however you want to pass that to winging it here. That's another thing for a row. It's a, it's all of the attributes about a specific entity. So if we have this guy, we can say, his name is Caleb, his phone number is blah, blah, blah, blah, blah, his address, his age, his or his birthday would probably be better how long he's been a member of this website, or whatever it is, this all of these values about this entity. That's what that is. That would also be a row on a table, because we physically represent relations with a table. So when we have a table, we have the columns, so we have name, phone number, address, social security number. And then when we enter all of those, for a specific person has the ID of seven, the name Caleb, the phone number, blah, blah, blah, the address, blah, blah, blah, that is known as a tupple, or a row. Another term is table, which we just talked about that which is just a physical represent representation of a relation. Tables are what we store in a database to organize a my like cutting my head off there. To organize our data. Within a table, we have rows and we have columns. The rows would be a specific individual entry within our table. So like that guy we just drew all of his values would be his row, the columns are that specific, is a specific attribute of that person, such as a name. This is the column where we would put the value, Caleb. All right. Now we also have another three, three more terms that we haven't learned yet, which you will possibly hear. They're kind of older terms, but they they're just talking about the same thing. You may hear the terms, file, record. And then fields. So what are these talking about? Well file is just another name for a table, a record is another name for a row, and a field is another term for a call. So other than that we have the words is value. A value is the information that we put in to a specific column. So here is a value Kayla, or subscribe, or get rid of, or 700 or June 16, whatever it is, those are all values. All right. Oh, man, it's already 10 minutes. All right, then we have an entry, you may hear the term entry, which I like to use this for the like the it's almost like a verb, the when you enter data, you get an entry. So basically, an entry is just another name for a row. So um, there's obviously a lot of similar names. And we'll be making a table saying what we're just just getting database design. The process of designing your table to remove anomalies and have data integrity. game or schema, however you want to pronounce that, that is just a physical, that's just a drawn out structure of our, our database, then we have a new term here. It's called normalize, or normalization. normalization is basically just a bunch of steps that we're going to follow to help us get the best database design. So that's just that's part of database design, when we go through these normal forms, to get the best database design, we'll talk about that more in an upcoming video. But just I want you to know that the term normalization is the process of building the best database design. The other one and naming conventions. And naming convention is just a consistency used to make things consistent. So we named tables a certain way we named columns a certain way. we name our database a certain way We draw our database schema a certain way. So basically, convention is anything that is done repeatedly to create consistency. And the last one here is keys, which I'm not sure if I talked about this or not yet, I think a little bit, but not much. But key is something to make everything unique within a table. This is how we make connections among tables, we connect IDs, we may say, Oh, this comment was posted by the user with the ID of seven. Well, that points back to the user table. We do all that with keys. And we'll learn more about those as time goes on. Alright, so you see, we have a lot of terms and a lot of them are repeats for this meaning the same thing for example, we have relation. And then we have table. And then we have file. So relation table and file are all talking about the physical structure, a table pretty much record with circle diesel's record would be a row, so we have a row. And then we have a sub, these are all talking about rows within a table. And then we have, how should I do this? Let's just cross them out columns. And then we have attributes. And then we have a field, those are all talking about a column in a table. And also entry, you can kind of think of that as a, a row, because you put like, you enter data into the table and you get a new row. The value is just a specific, you can think of it like we read some of this. We have this table, and we break it up into columns. So we have ID, name, and then phone number, for example. Those are a column headers, and then we have specific dice, seven, Halo. And then we have another guy eight. Well, if we separate this into cells, which is another term, we have a specific value that we've put in here, such as john, that is a value. No entity, that's the only one that's so we just just went over value entity is what's left that's not circled or crossed out or squared, or some entity is I don't like to really assign this to like a row or a table or anything because an entity is just something we're storing data about. So that could be an individual row, or it could be the table type. So for example, we have a table for users while the user is the entity, but a specific, a specific entity would be a row within this table. Because we'd have the entity for this guy over here. And then we'd have one for this guy down here or for this check over here. Or lady I should say. So these all get their own rows to see. So an entity that's is something we store data about. This video will be just some more vocabulary, there's not as many we're broken up into two categories. This one first one is SQL. So first we learned about SQL, which obviously is program language to use to binary mention this I'm horrible. Structured Query Language is what SQL means. What does that sort of stands for, it's used to connect to your database. Hi, sorry, I just got distracted. I was waving this paper. And like my lens is like flashing for some reason. Anyways, the next part was data Definition Language, which is used to define it's part of SQL. So SQL has data Definition Language, and D and L, data manipulation language DDL is used to define the structure of our database. data manipulation language is used to insert values in our database, as well as update them, delete values and search values and so forth. Oh, the other thing was, I never really mentioned this, but I do want to mention it here, SQL keywords. So anytime we have the word keywords in a programming language, it's talking about something that is reserved that you're not supposed to use for your user defined data. So for example, Select in SQL, this is a word that your database is going to recognize, as Okay, we want you to do something, we want you to select data. It's a keyword, because it's already defined. So I want to I don't want to name my tables or my columns, any keywords because it can cause errors or confusion, or just make things complicated. I always put SQL keywords in capital letters. Right. So that's pretty much all for the SQL. Here's my race. Go America, and every other country, oh, bright presentation. So the very first thing is we have the front end of an application. The so the front end is what the user sees. The back end is what's going on behind the scene. When we have a website, we don't just have what the website looks like, we have code that gives us that look, when we put in data to forms, we have languages that communicate with databases, to create a easy to follow non confusing front end. But the back end is complicated as heck. So that's the difference between front end and back end front end is what the end user at a particular time will see, the back end is what's going on behind the scenes. So often, when we're working with databases, we will program a front end because we don't want everyone accessing our server, it's insecure. And it's just not a good idea. So we often create a front end application to access our database, most commonly a website, although there's other forms, but whatever. So here's our database. Rather than people signing into the server and using SQL code to get information, we often program front ends for people to access this data in a different way. So each one of these front ends would be like a website page, for example. So here would be your registration page here would be your users page, here would be your profile, your homepage here would be if you like you can even have different tier systems. So this could be a page where the website administrator can add or delete users, they have that extra privilege. This is also the view system, which we will talk about in a second. Well, the front end doesn't allow us to directly type in our own SQL, it only allows us to use what the front end gives us, which is then communicated to the database. So that's what a front end end is, the back end would be the server side code used to communicate with the database. So when we when we're talking about clients and servers, we have those two terms, we have client, and then we have server. So very first thing is client. client is think of it if we have this wall here. Over here, we have a computer. And then over here, we have the database. This is the client side, because we're accessing the database over here is the server side. So client, client side and server side, I don't really have room to write that. But the server serves instances of this database to the client where they can access the data in some way depends on how we the privileges we give them. So when we talk about different privileges for different people, we use what's known as a view. So personally, I finish this client side and server side. And we communicate from the client side to the server side by a server side scripting or programming language. It's huge. Well, my hands tired. We can give custom views. So a view is just taking the data from the database and illustrating in a different way than how it's actually be stored. For example, a user on a website can visit, let's say, we're on like a social network and I visit my friends page. It'll tell me their their name, possibly their username, if that's what they use, their date of registration, their phone number, if that's public, for example, anything like that their relationship status, although information that is private may only be accessible to the user who owns it, for example, their password or their email, or their friends lists, that might be a different view, depending on what page you are on what page you are on, and the user that you are. So I may be able to access my own password and email. But that might be private to other people. That's the view. Views often use joins, which basically Connect data from multiple tables to create a new table. We have two tables here. users table, comments table. Can you see that? Can you see it? You better be able to see it? Well, a comment is always given by a user. And we connect it using the ID. It's called a foreign key connection. Well, this ID, let's say it's the ID of 25. Right? Well, instead of having, okay, this column, and it was by the person with the name, with the user with the user ID of 25, we want to say, Caleb curry. So we have a join, taking the name from the user table, and a plot and connecting it with the comment of the comments table to get this new generated. Table. This is just a view. It's not actually stored in the database like this, we're just creating a new illustration of how the data really is stored. And then we can have the name of the person followed by the comments, or comments of that person. You may be asking, why don't you just use the name as a reference? Well, that's because many multiple people can have the same name, for example. So we always need something that's unique for this connection. Whether if you don't want to use an ID, you could use something like a natural key. Use your name, for example, but I often use IDs. And we'll be talking about that more in upcoming videos. So yeah, that is all of the vocabulary that we have gone over so far, pretty much. If you have any questions about any of those, just leave a comment, I recommend that you write all those down, I should have told you that at the beginning. But anyways, I recommend you write those down, as well as the definitions, study those, learn them. And especially study from the previous video study all these different terms that mean the same thing. Because after a while, you get kind of used to those terms. But if you're just learning and you don't know that these are talking about the same things, well, then you can get confused very easily. So I recommend you study. Alright. A row is also called a tumble and is also called it whatever, blah, blah, blah, that will help you that way. When I say whatever word it is, you know what I'm talking about. And you don't have to like, check your charts or whatever, or go watch other videos. So that's something I definitely recommend that you memorize, as well. For the SQL stuff. It's not as important, like the DML DDL. But I do recommend that you memorize all of these terms as well. We'll be talking about atomic values. So atomic values, basically just mean that means that the value stores one thing. Everything we store in a database should be one thing. When we have columns, they are about one thing. All of the values within that column are one thing. We have a column named. Well, what does a name consist of, you can have a first name, middle name and last name. Well, this, although we're talking about one thing and name, it's not really necessarily atomic, because we're storing three names within one name. So if we wanted to make this atomic, the best thing to do for database design would be to break this up into first name, middle name, and last name. So now I have Caleb, Daniel, through three separate names, right? Alright, so when we do store things and database we want to go, we want to try and make them atomic. Now we don't want to go to the extreme, where we break everything apart to where it doesn't even make any sense. But I just think of like, atom, an atom atomic, when this word came about, we never really split an atom. So we thought of an atom as the smallest, indivisible, into visible, I thinks the word piece of information in this world. Now we can split it, but whatever, just so sometimes, you may think, Oh, this isn't really a good term, but everyone will still use this term. So just think of atomic as the smallest one, one individual thing stored. Now we don't want to go to the extreme. For example, if we have the word, subscribe. This is a word, it's one word, don't go to the extreme, where we separate it into like word like individual letters. because it'd be ridiculous. It's like, Alright, we store in a table. Instead of having first name, last name, we have first letter of the first name, second letter of this lab, first name, third letter, the first name for further than first name. That doesn't make sense. We want to break it down into we can treat it as one thing, for example, an error, a phone number, it often has an area code. Well, we don't want to break a phone number up into multiple portions, I often just store phone numbers as one, because you treat it as one thing. When we come when it comes to addresses, one address consists of like multiple, we have we have a street address, we have a city. And then we have a state, we have an area code for a zip code I mean. So addresses, I often break this up into multiple columns, rather than storing one address with 123. Epic lane. legit California 73821. Rather than storing all of that, in one column, I break this up into street address, city state area code. And then that allows us to run more complex queries on our data because we can select all of the users from the state of California, or we can select all of the users who live in this in this city, or pretty much anything like that. So this is part of one of the normal forms, which we will be talking about in future. But as for now, just think of storing everything as one. So not only do we want the column to be about one thing, like an address is just an address, or a phone numbers, just a phone number. But we want the values within that to be singular as well. For example, we could have a column favorite movies. This isn't going to work, because within that column, we could store unlimited movies. And that is also improper. So that's not going to work as well, every single column header should be singular favorite movie. And you may be asking, Well, how would you store multiple movies? Well, that has to do with more complex relationships. And that is something we'll be talking about very shortly. But the solution is not to store multiple things within a column. This video will be about relationships. And I don't mean relationships as dating advice, because I probably couldn't give any of that. But besides that, we're going over database relationships. So what is a relationship? Well, in a database, everything is connected. It's Think of it like a giant web, somehow something is connected to something else rather than storing everything in a giant table. Sorry, my what my boards kind of wet, it's not really writing very well, we split that. And then we have two smaller tables for manageable data simplicity, as well as data integrity reasons. So now, we still have to maintain the connection between data. So we take one huge table, break it into two tables, and then we have what's known as a relationship. There are multiple kinds of relationships. And we'll be talking about that in the next video. But for this video, we'll be talking about the general concept of relationships. So relationships, talk about entities, an entity is anything we store data about. So this is an entity. And this is an entity. And they are related in some way they're connected. There's a relationship between them. This erasers scrap There we go. Okay, there we go. So now my chalkboards. Alright, so I have to worry about entities. There's two terms that we talked about earlier. entities and attributes, entities or anything we store data about. attributes are the things we store about the data, well, when it comes to entities, when that attributes are what we store about the entities, so basically, we have entities. And then we have the attributes for the entity, right, and then we have a different entity so that as part of our database design, we figure out all of the things we're storing data about, think of like a college, we could store information about the student, the professor, the classes, the faculty, the whatever else. So we're gonna have multiple entities, and then multiple attributes about that entity. And then we can have a third entity. So just think club, student, Professor class. This is all part of our database, our database. So our database structure consists of three entities and attributes about each one of those entities. What's an example of an attribute? Well, it can be the student's name, the student's address, the classes, they're taking, the, what, there, whatever, it doesn't really matter. They're a major, that's another entity, we could have an entity about the the majors, such as science, or whatever else. And then we can have attributes for that, like the classes that are required to get that major, how long it takes, and so forth. That will become part of our database structure. Well, these are still related in some way. Because we have a student is a student is going to school for a major. So there's a connection from this entity down to the major. So let's just replace these with like, here's student. And then Okay, that looks like crap. I could have put some specific examples in here. We could have a student we could have a professor sample proof, then we can have class. And then we can have major. Well, a student goes to school to get a major, a student takes classes. The classes are taught by a professor, the professor's typically have a specific area that they teach. So they might be part of the major like the major league to teachers for that major. Whatever. Students Yeah, you get the point, everything's connected. So when we design relationships, that tells us how we want to structure our tables. That's going to make a lot more sense in the next video. So I just wanted to introduce the idea of relationships in this video. The next video we'll be talking about a specific kind of relationship. So, one to one, that is a type of relationship, there are two other ones, one, to many, and many to many. Here are three types of relationships. And we will be talking about each one of these individually, in the next three videos. This one will be over one to one relationships. So I wonder one relationship basically just means that one entity has a connection with one other entity. And that's all they can have. I like to think of a marriage. So we have a husband and wife. This is a one to one relationship, because a husband can have one wife, and that wife can have one husband, the husband can't have multiple wives, and the why the wife can't have multiple husbands, it's one to one, because one entity is limited to one other entity. This entity is also limited to one other entity, if they want to one relationship. So another example of a one to one relationship would be a social security number. Alright, so a social security number in America is basically just a unique number every citizen is given. There are, I think there's occasional repeats, although generally, you think of a social security number as being unique to one person. So we define it as a one to one relationship. Even if there's possibilities of being repeated data. One person only has one social security number. Generally, that social security number is only supposed to have one person. So we have a a person. And then we have a social security number. One person has one social security number one social security number is assigned to one person, if you want to draw out your relationships, you can say person as one social security number, social security number. assigned. I'm sorry, yeah. Okay, so assigned to one person. So a person is assigned a social security number, social security number is assigned to one person, we would not even though there's a possibility that the social security number might be repeated. We want to design it this way. Because that's like saying, the social security number is supposed to be assigned assigned to multiple people. It's not supposed to it's supposed to be unique for the person. So we designed as a one to one relationship. We're going to be talking about a one to many relationship in the next video and then a many to many relationship. And after that, we'll be talking about how we design this within a database. How do we structure our tables. So in the previous video, we talked about a one to one relationship, that is one of the three kinds of relationships that we can have. The other one is one too many. What that means is that one entity can have a relationship with multiple other entities, but a specific entity can only be related back to one other entity. I know that sounds a little confusing when confusing when we're talking about entities and all this conceptual, high level thinking. But think of think of it like comment on a website. When you go on to YouTube or whatever and you go below, you can leave a comment on my videos. That comment is only owned by you. It has your has your name. That is a one to many relationship. The reason for that is you can make many comments, but an individual comment is only owned by one person. So to draw that out, you have I have a user. And we have multiple comments. You can see that the user is able to make multiple comments. But each individual comment, like this one, for example, is only owned by one user, that is a one to many relationship, or one coin, many. Can you see that? Yeah, I think you can. So this is how a one to many relationship works. If you want to think of, again, like the marriage thing. Think of a king, what his name him king, Sam. And Sam demands to have multiple wives, and he's super controlling of his wives. So Sam has multiple wives, right. So I'm just kind of dry on a chalkboard here. We have Sam the king over here. Give him a crown, give him a big smile. And he is royal and famous among the lions. And he's really selfish with this woman, he demands to have 500 wives, or however many wives he desires. So we have all of his wives over here. So he has these two wives, and then he has this wife over here. She's not so good looking. But anyways, he is married to these three wives. Although he's very, he's very protective and very selfish with his women. And he does not want these wives to marry other men. He they're strictly married to the king. So the king can marry multiple women. But each woman can only marry one person. Oops, I kind of already have the line drawn, drawn. So they can only marry one person, they're not able to go marry another guy, that would be a whole nother way that ain't gonna happen off with their heads. See what I'm saying? It's not going to work. So that is a one to many relationship. In the last video, we talked about one to many relationships, the video about that we talked about one to one relationships. Well, now we're talking about the third possibility of relationships. And that is many to many relationships. So when I think of many to many relationships, I like to think of a polygamous marriage, where multiple husbands can have multiple wives. And multiple wives can have multiple husbands. I just draw this out. So you can kind of understand what this would be like. We have the king right here, right? So let's say he's a king, and he decides to marry three women. Let's kind of make up real quick. This guy's Why do you need three wives? Come on? There we go. So this guy is married to these three women, right? And then he is very protective. But then he decides Oh, what the heck, they can marry different men if they want. who really cares. So this girl marries this other guy, right? So now she's married to two people. Right? What's wrong, sir? Only the king can do that. Well, not now. Because it's a one to many relationship. One King can marry multiple women. And one woman can marry multiple men. So this guy is married to this girl. And this girl is married to this guy over here. And this guy's married to this girl over here. So Sally is married to Jim, who is also married to this girl over here. See my hand drawn here. And this girl, surprisingly, is married to this guy. She's married to Kathy who is also married to Jim. And by the way, Jim is married to the second wife of the king, who the king is Mary did this girl over here. And you see, well, it gets very complex very quickly. And that is an example of a many to many relationship. Now, obviously, you're not going to be storing, like, stuff like this in a database. So let me think of a practical example real quick. I can think of a college, specifically a class, and an instructor. Alright. So that's it, let's think of a class and a student. So here's a class. And here is a student. Okay. Now, one class, how many students can be in a class multiple, you know that. So we can have multiple students in this class. Let's draw boxes for quicker. So let's say all these boxes are students. A class can have multiple students. Well, not only that, but a student can take multiple classes. So this student can take this class. And this class can be taken also by that student, and this student over here. And this student can take another class, which can have this guy in it, and this guy, and this guy can be in this class as well. And by the way, he is also a drama jock. He is also taking another class, if he's in that class, which has that student and that student, and it just gets very complex. So you can see this might be very, very difficult to store within a relational database. And there are reasons we actually can't store this in a relational database. You might be thinking, well, what if you need to store the classes that a student is taking in this students within a class, we will be talking about in the up that in the upcoming videos, with solving many to many problems, many to many relationships do not work in a relational database, the only two that will work is a one to one and a one to many relationship. But this is just the concept, you think of a concept of a many to many relationship, think of a class and a student class can take many can have many students and a student can take many classes. In the last three videos, we talked about one to one relationships, one to many relationships, and one to many to many relationships. This video, I wanted to talk about how we design these relationships, I decided I was going to break them up into three videos. So first, we'll be talking about designing one to one relationships, and then one to many, and then many to many. So this video will be about designing one to one relationships in a database. Alright, let's begin. So the first thing we need to realize is that when we have a one to one relationship, the the attribute side, so we have an entity. So let's say let's just say we have a connection between a person and their, their username. Alright. So we have this, we have this guy here, let me get rid of these boxes. I don't mean this. We have this guy. And he has a username on a website, we'll just say it's Caleb curry, because that's usually my username for things. And we'll just name this guy. Right? Well, that username, the only person who's going to have that username is him. And he is only going to have one username, that account is only going to have one username. That's a one to one relationship. So we need to realize that this is exclusive to this person. It's all hits, no one else owns it, no one else is connected to it. It's only hits. So that means we can often store one to one relationships as attributes rather than entities. So rather than his username being an entity, it's an attribute. If you don't really know what that means, don't worry, I'll be explaining in just a second. But just think of it as it's exclusively his so that is part it describes him. What is this guy's username? It's Caleb curry. It's an attribute of that person. When we're talking about one to many relationships, though, are many too many. We think of like a class. The students of the class are not exclusive to that class. They can take other classes. Therefore it wouldn't really make sense to store them as attributes because it's not Really, it just doesn't make much sense. And I know, it'll, it'll be clear, so just Just wait a second. So we can actually have a table. The relationship between the account and the username of the account can be stored just as a column within the table. So we have an ID. And then we have a name, for example. And then a username. This username is exclusive to this ID of this person. So like six, Caleb, Caleb curry, that is how we would design a one to one relationship, we would put it within the same table. If we have a new row, within our table, we now have a new person, this person can't have another username that Caleb curry, they have to have a different one. So we can make it john 123. This name is exclusive to that person. So we just store it as a column within our table. Make sense? So here are our individual rows. This points to this ID, it's an individual entity. So the entity is the the account, the user name is an attribute to that an account, there will be occasional times when we store a one to one relationship over multiple tables. And I will explain now when that would be the case, let's just think of an example of a database for a credit card company, we have a relationship between the card holder and the card. By card holder, I just mean the person who who gets the card and can use it to buy things, the person who owns it pretty much. And then we have the card. All right. Now let's just say this, this company only allows you to have one credit card. So this card holder can only have one card, and this card can only be owned by one card holder. That means we have a one to one relationship, one to one, how would we store this in a database? Well, if we just went with the attribute thing that we talked about earlier, here's what it would look like. Here's our table. And I'm just going to list the columns. So we have the cardholder table. First thing is we'd have like a an ID for the card holder, we would have probably card holders first name, the card holders last name, and then we would have the card. So we could say, well, that's their card, we could let's say we could assign every card and ID or a card number. So we can say card number. If they have a card, we can give it a value if if they don't have a card, we can leave it No. Well, that can be considered an attribute of the card holder. Because all these points back to the cardboard. Now, if we want to store more information about the card, such as the the issue date when the card was given to the person, so we could say, card issue. date. Well, now we can see that this is relying on the card, not the cardholder. So if we want to store extra attributes, about the attribute in the one to one relationship, we can move that to a new table. And we can make it a card table, give it an ID. And then we can say card number and then issue date. Now we can store as much information about this card. So if we want to store the maximum out, be a late fee. Pretty much anything we want to store about this card can be stored in this table. And we can replace this card number with a reference. So I'm going to card ID and that points back to this ID. You see what I'm saying? This is a one to one relationship over multiple people, just like this card has a connection to the card holder. There's a one to one connection over multiple tables. The reason we did this is so it could store more information about the card. Because if we stored it all in one table, we would have bad database design. And I'll show you why. in just a sec. Alright, so here's our table. Can you see that, all right, we have the ID and then the name of the person. And then we have the card number, and then the max amount, and so forth. Well, you can see that this table is really about two things, it's no longer about one thing, we would have to say this is a user and card table, which is not proper, because we're not following the rule of one where a table should be about one entity. And a row should be about one entity. Because now we have the user side, and then we have the cards, it's like trying to store two tables. In one, the only time it's acceptable to have the card number in the users table is when we don't store more information about it. Because now all we have is a card number. The card number is about the user and points to the user. But the max amount of the card has nothing to do with the user, therefore, it's reliant, dependent upon the card. So I wonder one relationship, in conclusion, one, a one to one relationship. The way it is stored is either an attribute within the table, or if you need to store more information about the entity, then we can have another table and then just use foreign keys to connect them which that's something we'll be talking about in an upcoming video. Almost always though, you will be seeing a one to one relationship, used as an attribute. So when you think if you have a one to one relationship between something, you have a connection, let's say it's a dating website, the user is in a relationship with another user, that that's a one to one relationship, we could just have that as a column within our user table. So we could have the user table, we could have the user ID, and the name, first name, last name, a phone number, whatever else. And then we can say, relationship. That's a one to one relationship, we could just have that have that be an ID of another person. And we don't have to worry about having another table for that. This video, we're talking about designing one to many relationships. First, I want to bring back an example from the last video into this video. So we can kind of explain the differences between one to one and one to many. So in the last example, we had a credit card company where a person is allowed to have only one card and a one card can only be owned by one person. Let's draw that out. We'll just draw the table and the columns within that table. So two tables here, we have the user table. And then we have the card table. This is when we store a one to one relationship over multiple tables. Otherwise, we could just store it as a column because I mean, we're only storing one thing. We have the user ID. And then we have a card ID. And then to connect these as a one to one relationship. We have a foreign key. So we put a connection card ID in the user table, the user ID in the card table. So Alright, ID, user ID, that means that what that does is it connects these tables. So user ID to this user ID, card ID to this card ID, that is how we would make a one to one relationship over to tables. And then we can store other stuff about the user in this table, other stuff about the card in this table. Now, to convert this to a one to many, we can change this to something very small. And one to many relationship would be saying, one user can have many cards, but a card specific card can only be owned by one user. So there's no card can't be owned, or co signed or owned by two people. Now, the way we do this, is let me just erase these columns real quick. As a start from scratch. So we have a user table, and then we have a court ID. Once again, we need IDs for both of them. So we have a user ID, and then a card by the user owns multiple cards. And we don't know how many they own there, they can have one, they can have two, they could have three, they could have four, they could have five, so it wouldn't make sense to put the card ID in here as multiple columns. Because I would put five columns for card ID, well, then what if someone had six cards? What are they going to do? They're not, they're not going to have a sixth one. So what we do is we put the user ID in the card table, which points back to the user, I d in this table. So now, if we made specific examples of this, let's say we had this guy here, giving a talk. Right. So we've got this guy right here. And his user ID is 63, just a random generated number by the database. Well, we also have cards over here. So we have this credit card, we have this credit card, we're actually drawing card cards here. So we had the barcode and the name and the codes and all that crap. And then we have this card over here. We make a connection, we want all of these to point back to this guy. So we give a user ID, we could say the user id 6363 63. So now we know that each card is owned by the user with the ID of 63, which we can figure out is this guy, whatever his name is, we can join that. So now we store a one to many relationship into tables, where the many side gets a foreign key to the single the one side. And if you don't know about foreign keys, that's fine, basically, in simple terms of foreign key is just that that Id connection is user ID is a foreign key that points to the user ID in the main table, foreign key points to the primary key and the primary table. This is also said to be a child. I'm sorry, a parent child relationship. This will be the parent because every single child has to point back to the parent. That is how we store a one to many relationship in a database. So how would we know if we had this example? We had comments, the comments section on a shopping website. So you're allowed to leave a review there. Let's say you can leave a review. And we had this table these this table setup. We had the users. And then we had the reviews. And we had a an ID user ID. And then we had a review ID. I just give that to every single table. Every day I give every table an ID. Then we have the username. And then we have the actual review. So the user has a username, the review has the actual comment. And we could also say we could say we have a review title. So we can say like oh This review is this Product sucks, or this product is really awesome. And then we can have a paragraph saying, why that products awesome, for example. Right now, this is not properly designed, because there's no connection between the user and the review. Because if we look at the review table, who put who posted the review, it doesn't say all we have is the review ID, the review and the review title. It doesn't say who posted it. So if we put a key foreign key that points back to the user table, that would say that is the parent, the person who posted that review. So we can have a user ID, which points back to the user ID in the primary table? Is this a one to one relationship? Or is this a one to many relationship? Well, this, we could have multiple reviews, and each one would point back. So we could have a another review by and we can have the same ID. Let's just think of some examples. Let's say we have the review ID of six. And then the review is awesome, are good. And it's by the user ID of 62. Well, then we could have another one with a review ID of seven. And it could say, Great. And then we can add it from the user, Id 62. Well, these are both from the user id 62. So we can say this is a one to many relationship, because multiple reviews going back to one user, if we want to wanted it. So a user can only post one review, well, then we need to put the review ID in the user table. So now we have this sort of like this double connection thing here. Because this review ID connects it. So it's saying only one review ID can be from the user. And honestly, this, this can probably be improved, because the review is not necessarily connected to the user that directly. For example, if the website is solely forgiving, one review, well, then we could give it a review ID. But that might not be the best example, simply because the user might buy multiple products. And each review could have a product ID easy. So I mean, it kind of gets complex and just kind of have to study. But the review, it would probably be best to make it a one to many relationship, because it wouldn't really make sense to have it as a one to one relationship. We want it as a one to many, because multiple reviews for multiple products can be given. So this points to a product, this points to a product, this points to a product, and they're all by a specific user. Alright, so yeah, that's just some basic design concepts that you will need to understand. When you have a one to many relationship, just store over two tables, and give a foreign key to the many side pointing back to the one side. Before we start the many to many relationship design, I wanted to talk about Parent Child relationships. So we have when we design these relationships, a table is always the parent and a table is always the child. And when we're talking about keys, keys keep things unique. And they always they're they're used to connect tables that are related. So every t table has a key that keeps it unique. When we create a foreign key, it references that primary key in a another table. So the primary key is the parent. The foreign key is the child. The child points back to the parent inherits the values from the parent. For example, the parents ID is a six then the foreign key pointing back to The parent is at six, the job. So in a one in a one to one relationship, where we just store everything in one table, which is the most practical way to do things, we don't have to worry about primary keys or foreign keys, we just have one table. And that's all we have, they only have columns within this table. So we have the user ID, the password, the username, the whatever else. All those are just columns within a table, we don't have to worry about a parent or child, because they're not making any relationships across multiple tables. But when we get to one to many, that's when we have this table pointing. Or has many children. So you would say this is the one side it's the parents. Then we have each of the parents children. These all point back to the parent with a foreign key. So if you think of like comments on a YouTube video, every comment has a poster or a user account of a person who clicked post a comment. That means that the commenter is the parent, and the comment is the child. Another way to think about that is if you if you were given a comment on YouTube, you can figure out who posted that comment. That's because the child always points back to the parent. Now, if you look at a user account on YouTube, such as Caleb, the video maker to awesome channel, check it out, and be sure to subscribe. Well, you don't necessarily know every single comment that that parent has given, there might be a feed in there that says recent comments. But when we look at the child, we always know the parent because it inherits the values. But the parent doesn't inherit anything from the child. Think of like in real life, you can see a married couple, and you don't know if they have children or not. But if you see a little baby, we think, all right, that baby has a parent or parents. That's the similar idea with database design. Think of this wall here between the parent and the children. The children inherit values from the parent such as the foreign key, a seven, that tells us who the parent is. That is important. I know, it seems like okay, who really cares, who's the parent or child, just get on with the database design, you knew, I'm going to unsubscribe? Well, this is important, because when we get to understanding how to design each relationship, we need to understand where we put the foreign key, which table does the foreign key go into. that is decided by the child because the child always has the foreign key that points back to the primary key. So user ID, primary key, user ID, foreign key points back to the primary key user ID. So foreign keys always go in the children, primary keys go in the parent. When it comes to many, to many relationships, we have problems with Parent Child relationships. And that is why we can't store many to many relationships in a database. And that's what we're going to talk about in the next video when we talk about many to many relationships, specifically. But just know that when we have an entity, think I want you to think of whether or not it would have a parent, a lot of the times it would but other times not so much. But if you think of think of an order for a store, an order needs a parent, because an order is when you buy product who is buying it, someone has to buy it. Otherwise it wouldn't. It wouldn't exist, the child would not exist without the order. So let's let's relook at this. This is the user and this is the orders. We can't have a child without a parent logically, it just doesn't make sense. So if there's no parent of the order, the order cannot exist. First, we need to keep that connection. And order needs someone who bought that order. Hopefully, that's kind of making sense. So just keep in mind, parent child relationships. I'm not trying to repeat myself, but I'm trying to repeat myself. So you get destroyed in your brain, foreign key goes on the child, primary key goes on the parent. And soon we'll be talking more in depth about keys. So just for now, memorize primary key and foreign key primary is the user ID. The foreign key is a reference to that user ID. It's in a foreign table, and it points back to the original table. This video will be about many to many relationships. Oh, and we're talking about designing many to many relationships. So I said a couple of weeks ago, videos ago that we have problems when we design many to many relationships. They don't work out, right? They don't. And I mean, I never really explained why. Well, this video, I'm going to explain why. Think of, let's think of a good example, a college class can have many students and a student can have many classes. So we have two entities we have to worry about. We have a class. Let's just pluralize it because it makes more sense, the classes, and then the students. So let's just think of an example of a way we would try to treat we would try to design this, and it's not gonna work out, right, but we'll just we'll just try. Here we have classes table, and we have a specific class such as math, math, 101, whatever, really. So we have Matt, we can't even read that my handwriting is so bad. So we have math one on one. And then we list all of the people within this class. So we have student number one. And then student number two. And then student number three. All right, well, we can try that. Except, what if a student drops out? Well, then we have a null value, which we don't want to have those if possible. And what if we need four students? Well, that's not going to work because we only have three columns. So how else can we do it? Well, we can try just having students. Well, then every single column could have like 100 students, and we already learned about the atomic rule, we only want one column to store one value one student. So what do we do? Well, okay, that's not gonna work. Let's try over here. Let's try it on the student side. We have a table for students. So we have Jimmy, and he's taking his first. Alright, let's let's list the columns. So we have the name. And then we have class number one. And then we have class number two. And then we have class number three. Well, that's cool. What if he wants to take four classes, though? Well, then we have to add a new column. And let's say someone psycho crazy guy takes like 20 classes. So now we have all the way down to class 20. And then a new person comes to the school. And he only takes one class. So now his first class is like math. And then there's 19 columns for that guy who are empty, because we have to half the columns for every single person. We think of it like this, we drew that out. So where you can see it better, we would have a table. And then we'd have the ID of the person. And then the classes. So once puts like class one, class two, class three, class four plus five, six. Let's just leave it at that. And we'll just kind of make rows for these. We have new guys sign up. So we have a guy with the ID of six. And his first class is math. Right? And then his second classes, geology and then Spanish, and then fashion. And then biology and then chemistry. And then we have a new guy come and he his IDs, eight, and he only takes science right? Well now we have no No, no empty nothing. We wasted all of this space. And you can assume that's going to keep happening. So that's wasted space in our database, bad design. So how do we possibly do? Well, the trick is, because because we think of if we try to think of like parent children, who's the parent in this situation? Well, a class has multiple students. So that would make the class the parent and the students, the children, but a student can have multiple classes. So that would make the student the parent and the classes the children. So how is it that this is the parent, and this is the parent at the same time, that means this points to a child. And this points to a child. So this parent is the parent of this parent who is also the child of this child, which that just like blew my mind, I don't even know what's going on. So the way we do this, is we break it up into two, one to many relationships. So this many to many, is going to become a one to many. And then a second, I think, I guess all right. Okay, just for you guys, as information, these these colon things, I think the correct way to do that is many two, and four, like the like the end keep going on. Just don't don't worry about that. We'll talk about that. in upcoming videos, I'll just write that for my sake. So we're going to break this up into one many. And then another one too many. But the other way. So that is how we fix that problem. All right. So let's implement that with classes. And students. What we need is what's known as an intermediary table, or a junction table, they're both the same thing. I'll write that out. So you can hear me enter mediary or a junction. There's also like 50, other names for this table, but I mean, just intermediary table will work fine. So I n t e r m e d ay ay ay ay er y. And what that is, that is how we connect our tables. So we break this up into a total of three tables. So we have the intermediary table, which is the one I just talked about. And then we have the students table. And then we have the class table. So class, student, this is the intermediary table. So we have a one to many relationship going this way. And then we have a one to many relationship going this way. So one, many, one, to many. I know that seems a little confusing at first, but trust me, it'll make sense. Once we start explaining things more. So we have one to many, one to many. And this is the intermediary table. So let's just draw this connection. So we kind of make can visualize this a little more. And kind of, like split this to see how it works. So we're storing the many side of both relationships in one table. That's because we know that a one to many relationship is spread out over two tables. So one, to one, to the to, it's being shared. So what would we name is, we would say like, we can name it. class students, for example. And these are going to have foreign keys that point back to the class and the students. So now that we kind of understand what it looks like when we draw it, let's give specific examples to illustrate this. Actually, like what it would work like look like whatever. Alright, so let's say we have our classes over here. And our students over here, okay, so first things first. We have metaphor, just keep it simple math. Science, I mean, obviously, the more in depth like math 101, math, two of five, or 1200, or whatever. And then we have English. But just for simplicity sake, let's say these are the only three classes in the college. Alright? Then we have all of our, our students, we have Caleb, we have celeb, we have calib. And then we have, man, I'm so conceited, all I ever talked about is myself. Alright, let's get rid of my name. Let's go with Johnny, and Jake, and Sally, and Claire, Claire. Alright, so here are all of our students for our class. classes, students, we connect these in the middle. So first things first, we give them all an ID, because that will be our primary key. So we'll just give these random Assigned Numbers, which is pretty much what a primary key is a surrogate primary key. So we'll get this 163, this is something our database would do for us. This one will be 75. And this one will be 89. So you guys can see that clearly, we'll do the same thing over here. So john will have the ID of eight. Jake, the id 17. Sally, the ID of 16. And Claire, the ID for just getting 666. Because she's 666. I'm just kidding, we'll just we'll just go with six. Now we use the intermediary table to connect these IDs. So we have this table here. We have two columns. We have the class ID. And then we have the user ID. And then if you wanted to title these tables, we can make it classes, class students and students, the intermediary table, what an intermediary is, it's something that connects two things, it's like, it allows this table to talk to that table. So it's a connection between these two tables. We put the ID of the user with the class that they're taking. So let's say john is taking English in science, we would have 875 889, right really big, I'm going to run out of room really quickly. And what this does, is using this intermediary table, we can figure out that all of these user IDs point back to a specific person. And these IDs point back to a specific class. We are not repeating data, because we need to know the user have this class ID This is a foreign key pointing back to one specific value. So that means if john decided to drop out, well, then we have foreign key constraints to get rid of these. That way, we don't have to worry about incorrect data or users. I guess, I'm sorry, this should be like students, sorry. But we don't have to worry about students who we don't have to worry about students being enrolled who don't exist. So that is the solution to in many to many relationship. This video is getting really long. So Sally could do the same thing. We could say 16. And then 75. She's taken science, Claire, she's taken a she's taken math. We keep going on if we really wanted to, like can make this table as big as we wanted. And basically this is the easiest, best way to use as much databasing resources storage as possible, as well is now we solved the many, I'm sorry, the parent child relationships. That's because we have the parent over here. We also have the parent over here. They point to the child table. So this child table becomes the child of both of these parents. Sort of how in real life every single person is two parents and the person is this in Tire the row mean, okay? applying that to this, the parent is the class and the student of the class, and the child is a row where we have both the class and the student, this right here needs to be unique. It's an individual child, we can't have 663. Again, because that's saying, Claire is taking science and Claire, I'm sorry, math, Claire's, taking math, and player is taking math, that doesn't make sense. So this right here is unique 63 to six, you can have it in there twice. So the child is an individual connection between the parent and the other parent. So that is how you properly store many to many relationships. And I know the all these relationships are kind of complex, and you don't really understand with like a stupid chalkboard. But like I said, we will be getting into on screen computer videos and the introduction. I mean, I said that in the introduction, we're not going to do that introduction, because that wouldn't make sense. So yeah, we will be explaining these concepts. And then we will be applying these concepts to actual databases. Once we get through all of the database concepts. This video will be a summary of all of the relationship garbage that we learned. And basically, this video is going to be like, all of it in one, and it's going to be awesome. And it's going to make a whole lot more sense after this. So the very first thing that you need to realize when it comes to relationships, is that there's not always a defined relationship for every circumstance. And it's sometimes you have, it's something that you have to decide on your own. So my old database design series, which if you want to check that out, just go to my YouTube channel. A lot of people were confused about, okay, is a student to class? Is that a one to one or one to many or a many to many relationship? Well, that's something that you're actually gonna have to decide on your own. Sometimes you have to think logically about the application of the relationship. So let's think of the example of a class and a professor. What kind of relationship is that? I don't know. Let's try to find out. So we have the class. And then we have the professor. People would ask me, like, Okay, what kind of relationship is this? Well, it depends on the application. If it's me, it can literally be all three, it just depends on what you need for your application. If it's a one to one relationship, that means a class can be taught by one professor, and a professor can teach one class. If it's a one to many relationship. That means either a class can be taught by one professor, and a professor can teach many classes, or the opposite of class can be taught by many professors. But a professor can teach only one class. Or it can be many to many relationship, a professor can teach multiple classes that can be taught by multiple professors. So it depends on the college because some colleges are going to allow the class to be taught by multiple professors, other colleges are not going to allow that. So that depends on the application of the database. If I'm building a database for a college that does allow multiple professors for a class, that means it's a many to many relationship, because many professors teach many classes. Many classes are taught by many professors. Right? So when it comes to designing one to one, one, to many, and many to many, I'll show you how to do all three of those for a class to Professor. So if we have one to one we could have, for example, we could have the professor right. We're gonna have a professor table only know how to spell it. But prof is I don't know if there's two F's, who cares, Professor table, right? If it's a one to one relationship, we can have the professor ID, the name of the professor. And then we can have a class. That's the class that they teach. And it's the only class that they teach. That class is only taught by the head professor. That was, that's how you store one to one relationship. And typically, you're not going to do that with another entity. When it's when it comes to like a name. You typically will do that because the name is just assigned to that person. A class can generally I mean, typically professors can teach multiple classes unless you're some Kind of adjunct teacher, but class, in this case is a one to one relationship. If we wanted to make it a one to many relationship, or then we would break off class here, get a new table class, we have a class ID. and here we can put more information about the class and we'd have the class name. This is the most common way of storing information on two tables. You rarely very rarely store a one to one relationship over multiple tables, but it is possible we talked about that earlier, we'd have class name, and then we can have more information about the class. And then we have a foreign key class ID, that would make this the parent because class ID points to class ID, if you wanted this to be the parent, well, then we would switch that we would make Professor ID, which was probably the most logical way to do that. But you have to keep that in mind because it can be a one to many directions, a one to many this direction, or can be a one to many this direction. And obviously, typically, the class is going to be the many side because a class can be taught by multiple, I mean a professor teach multiple classes. That means that professors on the one side that many is the class, which means we need to give the professor ID on this side. So the proof ID is a foreign key, referencing the proof ID in the professor's table. That is how you draw a one to many relationship. This direction one to me, parent child. Now if we wanted to make that in many to many relationship, we'd actually had to break this up even farther into three tables. So now we have the professor's table. And then we have the class table. And then we have the probably class professors table, which is the intermediate intermediary table. So we have this table here. This table here, this table here, this, these all, this will have an ID, Professor ID, and this will have an ID, class ID. Then you reference this in this intermediary table, where we'd have Professor ID of like six with the class of seven, Professor ID of six, with the class of eight, which is basically saying this professor with the ID of six is teaching two classes. Now, when it comes to IDs for this table, this whole thing can be an ID in itself. Or if you want, you can give it a new ID to say class Professor ID. And that's kind of unrelated. So that's pretty much we'll get into that later. But that is pretty much the song of designing relationships. One to One is typically in one table, one to many, it's two tables with the child having a foreign key pointing to the parent, and then a many to many has to parents, with the intermediary being the child pointing back to both of the parents. Hopefully, that is all crystal clear. And awesome. Because now you literally know how to design every binary relationship and database design. And by binary, I mean a relationship between two tables. So you can have a binary relationship which we talked about, there's one to one over two tables, there's one to many over two tables, and there's many to many over two tables logically, but then we have to break those up into the best way to store those to store those in a database. But this is basically binary because we're talking about two entities. A class and a professor, a student and a class. You can have relationships between multiple tables, but that is something I'm sorry multiple entities, but that is something we will have to address in a future video. Whoa, didn't see you there goodness. Anyways, In this video, we're going to be talking about key words. So, a key is not something you use to unlock a door. keys in database are something else. A key keeps everything unique. Basically, that's that's the easiest way to remember what a key is. So when you think in a database, we kind of structure things in tables. And we want to kind of do that we do we do that. We do do that haha. Anyways, if we have a table here, we have the columns, which are the attributes of each thing. So like, let's make it a user account table, because that's always the easiest example. So yeah. Okay, user. So that's the title of the table. Now, each individual row within this table is going to be a new user, each column is going to be a thing a bout the users. So the columns can be username, first name, last name, password, and email. As an example, I mean, you can really put whatever you really want in there. Well, a key is something that is going to separate this row from the rest of the rows. So in this example, think of the things that can be duplicated, think of all the possible values that you could put within one of these columns. And think, is it possible to duplicate it in your application? I mean, yeah, it's possible to duplicate it if you allow it to be duplicated. But what naturally should not be duplicated. So typically, when you sign up for a website, or a, a game or something, they'll ask for your email. Right? Now, when you sign up, you put your email and you typically have to go to your email, and click like confirmation or something to register your email. So they know that it's a genuine email account. Basically, just so they can have your email and they can send you emails and stuff. But yeah, that is an example of what we could use as a key. That's because every single row within this table should have a separate email, some websites allow you to use the same email more than once. If that's the case, then this would not be a good key. This kind of key is known as a natural key. Because it's naturally already in our at our table, we don't have to define a new column just for the sake of a key. And we'll be getting into more of that in upcoming videos because doing a whole bunch of videos of Ricky's so you get them like really good. All, like I'm gonna go over pretty much everything. But anyways, even if you make it to where only one person can hit us a certain email, and that email can only be used by one person. Or that could be used as the key. The way that works, is it basically let's say we have emails 123 at blah, blah, blah, a VC, blah, blah, blah, x, y, z, blah, blah. And these are three separate rows. Well, this is going to let us know which person we're talking about. And, like now, if someone has the first name, first name and last name by chance, let's say we have two Caleb Curry's Caleb curry, Caleb curry, we can't use the name as a key because two people had and it's, it's going to confuse us because Okay, do we have two people with a name Caleb curry, or one person Caleb curry in the database twice? Is that an error on our part or what? So we can't have the key. Be in there more than once. So it's never there's it's always unique. So the key is always unique. He said, he said, he said, Alright, so what is another example of a key that we can use in this table? Let's get rid of this email. Let's say we're not using the email anymore. And let's say the website allows multiple emails from different people. So like if my mom signed up, and then I wanted to sign up, I can use my mom's email. So we have two people from the same email. That's an example of some websites that allow you to do that, but not tons of websites do that, but it is a possibility. It all just depends on what you want to do. It's up to you because I guess it's up to the if you're making a database for yourself with it's up to you, but if you're making it for someone else, then you get your their rules and you follow them. Let's think of some other ones. Alright, well email, let's just say it's knowing your password. Well that's really a bad one. Because passwords, they're not necessarily unique because two different accounts can have the same password. And if if you didn't allow that to where they could have the same password, you try to put in a password for your new account. And it'd be like, Oh, this password is already in use. And then you're like, Oh, so then you can try that password on other accounts and try to hack into people. So that would not be a good database design, obviously. So password, that ain't gonna work because not every password is unique. First Name, Last Name, it could possibly work. But it's really bad. Maybe first name, last name, and middle initial or middle name. But even that there's, there's people with the same first name, last name and middle name. So that's first name won't work, last name won't work. And the combination of first name and last name will work. And yes, keys can be a combination of two columns. We'll be getting into that too. What we have left is user name. Well, user name is generated word that is used to represent your account. It's typically not a name or an email. It's something like Caleb 123, or whatever, you know what I'm saying. So then, when people will talk to you, they're talking to Kayla 123, rather than Caleb curry, or they can put the name there. But some websites use usernames, or some games use usernames or programs use username, so you have a username and a password. Well, username is always unique. Because if you have two people with the same username, when you try to sign in, how is the database going to know which one you're trying to sign into, you know, they're you they have to be unique. Some of that kind of makes sense. That can be used as a key because we could have the username, cave of food. And then we can have doors 123 and then pop. These are three different people, all the other stuff, first name, last name, password, email, they point to that key. So we have a first name, and a last name, and password, and an email for the user with the username, Caleb, see, that's how the key works. We have a first name, last name, password, and an email for the person with the with the user, I'm sorry, we have first name, last name, pass email for the user with the name, dog 123. And then we have a first name, last name, password and email for the user with the username, pi, that's kind of how the key works. The other thing is that the key should never be changing. So the second thing, never changing. Now, why is this? Well, that's because when we create keys, we use them to kind of structure our entire database, everything points towards the key. And the last example we had, I'll just draw key for representation. Let's say this is the key and we'll just say it's the username. Right? Well, every other column is part is points back to that key. So we have the name of the person, we have the password. And we know that these are pointing to this specific person, if we got rid of this key and that we have two people with the name Caleb with the password, pi. Well, now, are we talking about the same person? Are we talking about two separate people? Or what we don't really know? But now if we give it keys and key let me get get that key back. So this let me see if I draw key right? Probably not. Close enough. All right. Here's our key. Now we know that this Caleb points to this specific person we can have the the username, or you might often see numbers like 72 the ID, which we'll talk about that too soon. Now, if we have a separate key like this, well then that is a different person so we can have this guy have the username. Caleb is awesome. 72 this got me like sexy chicks. 74. See what I'm saying? So that's kind of how the keys work. We never want them to change because then it's kind of it's confusing because all of these things objects are part of this key. Basically, this key is used to define uniqueness. Well, if we can change it, it questions the integrity of our database. Because between tables, we connect things by keys. So let's say this is a user table, this is a common table. If we have it to where we can change the key, well, then this connection is, you're gonna have to update this connection all the time. Now, if we have 50 tables connected to this user table, I'm just going to draw out for snakes like well, I won't draw all 50. Well, now when we update this key, this key connection. So we have a comments by a user, we have sales, we have friends, we have messages, whatever, it doesn't really matter, well, then we change this key, well, that means every single one of these connections are going to have to update that requires a lot of work from the server. Makes sense. So we don't want them to change, because we don't want to have to have that extra work from our database. So that's one thing, unique, never changing. The other thing. These are just three general roles, there's some other suggestions and stuff, but I'll get into that soon. But the other thing is never know. That means empty VAT or no value. So when we have a table, and we have a key value, let's say we have a user ID, this guy six, this guy's five, discuss for this guy's 30, it doesn't really matter the order. And then we have their name, whatever, blah, blah, blah, well, we don't want it to where we can have a blank ID and still have information. Like, Sally. This Sally doesn't have a key, that is another thing we do not want to do with our keys. So here are the three main rules. And we'll get into more in upcoming videos. But the key should always be unique. Never be changing, and it should never be empty. You should not allow a row without a key. This video, I wanted to talk a little bit more about introduction to keys, although I'm going to go out just a bit off topic. And I'm going to talk about indexes. So I don't think we talked much about these in this series yet. But we are going to be talking about an index. Alright, so what's an index? I'm not going to be explaining every single detail of indexes, because we're not really talking about indexes. In this video, we're talking about keys. So I'm just explaining enough so you understand how this relates to keys. So think of an index, best example I have is a book. So if I have a really awesome book, like, like physics or something, what I can do isn't like, Oh, I want to know about this. I can flip open to the back, right. And I can travel to the end. And I be like, Oh, that looks nice page 7832. And I can read about that. Well, that's a lot quicker than being like, Oh, I want to know about this. page one, not on their page two? Nope, not they're not not there. Oh, no, not there. Huh. Keep going through the entire book until you find it. That will take forever. Another good example is a phonebook. When you go through a phonebook you have, everything's alphabetic, and it has the data right there. So when you think of an index in a book, it points you to the data. Whereas a phone book, it just has the data there, you can be like, Oh, I'm gonna go to the letter G, find grandma in color. Right? That's kind of how an index works. And it's very similar for a database. We add a table, and it's like a user table. And let's say we have 6 billion rows in this great, crazy number. And we want to find everybody with the name Caleb just came. We want to find everyone with the name Caleb. Well, without indexes. Basically the database is going to go row one. Does it have the name Caleb? No, it does. row two doesn't have a name. Caleb no row three. Yes, it does. We'll put that we'll put that back in the return statement. And then it goes through every single thing every single row within the entire table, which can take ages. So if we're given an index and we saw We order that data in a way that the database knows how to find certain, like it could know exactly where the C's are, well, then it can go straight to the C's, grab all the Caleb's and boom, it's done. It's like, super fast. That's kind of how an index works. Well, what's not to do with keys, when when we define a key, it's actually a type of index. That's because keys and indexes are used for SELECT statements, where clauses and joins. So I could do something like select everything from users, for example. There's just like a SQL example. I mean, it'll vary from whatever programming like relational database management system, you use this as a general example. And then I could say, where now this is where I specify. Sorry, drop the wait. Hold my books anyways. I'm selecting all of the columns. And then I want to select every column where first name has the value Caleb, for example. Well, that's how we would do it. So this right here, this little where thing? Well, it's best to have an index for that. Another thing we commonly do are joins, which is basically combining two tables. And I'm not going to go into all the syntax for all that. Basically, we have a table over here, users, and then a table over here, comments. And we want to make a generate a new view, where we have all of the comments from a user and all of the information about that user, we could have a new generated view. But we can join these statements. And how does it know how to join it, it uses the I uses the key, right? So if you have a primary key, and let's say we use username, so the key on drala key here. The key is username. Well, how would we join it, we would say, I want you to take this table, and this table, and then combine all of the rows, where the username is Caleb curry over here. And the username is Caleb curry over here. So that way it knows all of these comments are from Caleb curry, which we can assign to that user with the name Caleb curry, or you might often see it with like an ID number. So it could be like ID six, take all the comments from the user with the ID of six, join it with the SEC, take all the rows with the user with the ID of six, join it with all of the rows with the user of ID six, and it combines them individual rows for each ID. Within when we make a primary key, or just a key for now, we're defining an index. So the database can easily access the order of our data and combine things. So that is how kind of indexes relate to keys. Key is a type of index. There are all kinds of types of index. Look up tables. So what is a lookup table. Let me I have to kind of explain something else first before we're going to get into this. So just kind of relax and take. Take a good look. I'm going to erase this so I can have some room. So let's say we have a membership table or membership database, and we have a user table or members table. And within that we have like a membership status. Okay. And let's say there's like 10 options, I guess. So you could have like, bronze, silver, gold, platinum, titanium, and then like not a member or like a trial member or partner or special member or member above all members. I don't know I'm just kind of making stuff up. Anyways, some people will create what's known as a lookup table, so they'll have a table over here with every single option for the membership So we could have the memberships. And then within here, we would have, each individual row is one of the membership statuses, right. So we have like 123456789, I'm just gonna go at nine because can't fit anymore, nine individual rows. And then we so that's one column, the ID, for example, which will mean we could use a different key if we needed. But for this, we'll just use an ID. And then we have a column to say what the membership status is. So we're gonna have like, gold, silver, bronze, platinum, and so forth. Now, we can have a members table. And then a column within this member table, can reference the key of the membership. So remember, like one or two videos ago, I talked about how keys make connections between tables? Well, this is how that's done. Here's an example of a connection using a key, what we would do is we would have like the member, the members, user member name, or their ID, for example. And then we have their first name, last name, address, phone number, billing type, whatever it is, and then we would have a column membership, you see. And within here, let's just say for example, we didn't have this table, this was gone, we just had this member table. Well, when we have 6000 rows within here, the membership type would get repeated. Lots and lots of times, we would have like gold, gold, gold, gold, bronze, platinum, platinum, platinum, bronze, glow, platinum, bronze, and go on go on for all of the rows. And that would work. That's that that is a possibility for your database. I'm not saying it's going to cause your database to explode, or you'll get fired or anything. Well, unless you're told not to do that you might get fired. That's a possibility for the database design. Now let's think what kind of relationship is this? One member can have one membership, and a membership can have multiple people having a membership. So it's a one to many relationship. The one side is the membership, you'd have one of these, and then you have many people using that. So that means if we if you remember from the relationships, we need to take the key from this one, and put it over into this table. And now if we add this new relationship in, we don't have all of that repeating data because all we do is reference one individual row. And there's some other benefits to that, which I'll explain in a second. So now we can be like rather than gold, gold, gold, silver, bronze, platinum, we have 111234 for all of our rows. And we still have individual information like Caleb, Jimmy, Jake, Sally's Sammy, and Sarah. Well, when we do this little connection thing, we have a new connection between these tables. We're taking the key of the membership, which is this column right here. So here's our key. And we are taking that key, and we are moving it over into the member table. So now this column right here, within the member table, points back to the key of the membership table. This is known as a foreign key. Now this will help protect the integrity of our database. That's because there's only nine possibilities for the membership and if someone puts something else we could flag an error. The other thing is we can have it to where it will not accept an empty value. And then if for some reason we decide to change our like let's say we wanted to change silver to like I don't know diamond well Then we can change it here. And because this value is not over here, it's only a reference, it will automatically figure it out. Because you see the key never changed. The key is still one. So one still points using one. So this connection is still here, the value or I'm sorry, I guess it should have been two. Whatever you get my point up, we can change the name of gold, we can make it epic. If you want epic membership, well, now, we didn't have to change 6000 rows within our member table, we only had to change one row within our membership table. And all these connections stay the same. So the IDS helped create better connections between tables requiring less maintenance for our database. And it also protects our integrity, because we don't have to worry about incorrect values within the membership. Because it's all connected back to the membership table, which we change from here. So we update epic, it updates every single instance of epic within our entire database, or it gives us an error or something, we can set it using what's known as foreign key constraints. That key foreign key constraint. So because we're using what's known as a relational database, which is the type of designing we're doing right now, we have relations tables, and everything is connected across the database. It's not all stored in one individual table. But to protect our integrity of our database. We create connections between keys. That way, we don't have to worry about incorrect data. So what do keys oops, and don't match? What do keys do for us? Let's write that out. Well, first thing, they protect our integrity, I made a video over integrity. So if you want to check that out, be sure to do that. So they protect our integrity. How do they do that? Well, we don't have to worry about only some values updating. If you remember, we had the members table. And when we updated gold, and we named epic, it updated every single row within the member table. So it protects all of our values. So updates, less maintenance for us less incorrect data. What else is the key deal keeps everything unique. And you might have saw that in the member table, we use the the key number one, multiple times, we had three people with the member ID, I'm sorry, the membership ID one. So three people had gold, for example. Well, it's still unique, because they're still only talking about one individual gold. Over here in our membership we had gold was one of the options, we could have three people. Alright, you see up for her arms, these three people can all have a gold membership. And it's still unique because we're only talking about one individual value with three references to that value if we got rid of these foreign keys, and we just had to say what membership status they had. Well, now, this guy's going to be gold. This guy's going to be gold, this gal is going to be gold. So now I have 123 times that gets rid of the uniqueness from the key. So it protects our uniqueness improves our speed. So basically, improved functionality of our database. Function I'm running out of room and it does a whole bunch of other things. Basically, it makes updating easier, so less work for us. Because we don't have to go through manually and check every value, we can just update that individual value. You know. The other thing is that allows for added complexity. Long, hands getting tired. Okay. Now you can even see all that. Okay, so here's some examples. I'll see, I'll show you how we could say it allows for added complexity. Let's see. I'm going to redraw the example we had, but I'm not going to draw super in depth. So don't, don't wait. Don't mean, don't worry, it won't take that. We had a members, and then we had the membership. All right. Now let's focus on this membership table. Let's make it a little bigger. Actually. What we had earlier it was we had an ID, we had 1234, and so forth. And then we had the title of whatever they are. So if you wanted to draw this better, you could put the column headers, so we could have, like ID. And then we could say, name of the membership. And then we could title the table, we could have it membership, membership. Yes. Well, what we can do, we can have added complexity, because now we can add a new column. And we could say more about the gold membership. So we could say, price. And we can have a monthly price of $60. For example, if we didn't do this whole lookup table thing. And we put it all within our members table. Let's see how that would look. Let's try it. All right, let's have the members table. And let's not even have a membership table. Well, now we have the guy with the ID, 789, and three, and we'll have their names be Caleb. Tom Tommie trace. So those are the three people I was talking about with the gold memberships. Well, what what membership to the heck, okay, gold, gold, and gold. Now, how much does that cost them 6060 and 60. Just with this little example of three people, we already have four unnecessary values. Because we have repeating data, we have G and G twice more, we have 6060, twice more that eliminates the functionality of our database, because we're having tons of repeating data. So you might think the best way to do is is get rid of this price. Well, now we get rid of the complexity of our database. And sometimes complexity is a good thing because it allows us to store more information. When we use the common or when we I'm sorry, when we use the lookup table. We can just have that as another column. And we can put more things we could have another column saying, how long does this membership last? We have another column saying? I don't have a special purpose, or what can use within our gym? Or what can we use? And are we allowed to bring friends? What's the price for friends, we can do all that kind of stuff within this table and just reference that within the other table. It's only a set number of options. So we reduce the repeating data. If we wanted to do at all Well, we'd have to put each individual comp column and we'd get tons of repeating data. You know what I mean? Yeah, so that is an example of how because that would take forever. So that's very common. It's common for things with a set number of options. So like membership type, there's only so many options. You can have states, sometimes sometimes people do it for sex, male or female. Some people do it for I can't think of anything else right off the top of my head. But anytime there's a set number of options, they will often put a lookup table with the ID, the name of that option or what that option is like gold, silver, bronze, or male female or California, all the other states, Wyoming, Texas Russia, well, then we only have a set number of options. So that's that's the, then we can put more information about it, we could say, male, and then, like what the male is allowed to do on the website, or anything that allows for extra complexity for our database. In the last couple of videos, we've talked about all of the characteristics of keys in general, we talked about the word key, and databasing. Keys are the key to success. Anyway, anyways, this video, we're going to be talking about specific kind of keys, because all of the last videos were kind of concepts. But now we're going to be talking, how do we actually implement that into a database, I understand that we want keys and we want references from table to table, and we want it all to be connected. But how do I do that? Well, the main two types of keys that you're going to hear are known as primary and foreign. Actually, we're not going to be talking about these two kinds of keys. In this video surprise, we're going to be going a little bit broad, more a little bit more broad. So when you're like a beginner and you're learning database design, you'll probably learn that these are the two kinds of keys, when, when you're want to learn more advanced database design, you'll learn about different categories of keys. So before we get into these two, which if you guys haven't heard of those just hold up, if you ever have those keep holding up. We're going to be learning about what's known as a super key. Now, for some reason, I think this is one word with no space. And I don't know why there's no space there. So if someone knows if like, what do you guys want to know, let me know. Because I really want to know, it's driving me crazy. The other one that we're going to be learning is candidate key. Alright, so before, before we learn about the candidate key, let's just focus on the super key. So I'm just going to erase this. Alright, so what is a super key? Well, a super key is any number of columns, that forces every row to be unique. Now, if for some reason your table is structured to where you can't have every single row be unique, you need to consider adding some kind of column to enforce uniqueness, such as an ID, or some, you need to make sure that everyone using the database understands how that table works, if it's a special case, but generally every table, you will be able to make unique. Now, when you work with databases, you'll often see IDs, so like user ID, this is a type of key, then you also sometimes find keys such as like username. The difference between these is that this one's kind of just made up for the sake of the database, user ID of 748. What does that mean? Absolutely nothing. It's just a number. username actually has some real world connection. We'll be getting into the difference between these in an upcoming video. And which one will be best to use or which situations. But as for now, we'll just be using kind of natural practical columns that are already in the database rather than IDs. But the same thing can apply with just like an ID field or a call. So let's learn about the super key. Super key, like Superman, but it's a key, just like a giant key that just goes and breaks through doors doesn't even use the handle. It just breaks through doors. Anyways, let's learn. Okay, let's say we have a table here. Big table right. Now I'm going to put some columns in this table. That's not straight enough for me. Close enough now it just looks worse, whatever. Let's say we have a user table because I always think that's the absolute easiest thing to use. And we're going to have a username, email, password. Then we're going to have first name, middle name, last name and then we'll have birthday so every single row Within this table is going to need a value for each one of these columns. I know this is really bad handwriting, but it doesn't really matter how it looks just understand the concept. There now looks a lot better now. You're welcome. Now you can actually understand what's going on. Okay? some instances, for databases, you will allow people to put in blank values. So for example, if someone doesn't have an email, for example, we could make it optional. That way, some people have no or no value at all other people have values. But for this this case, I'm going to say everything is required. Because let's say you use your email to sign in, or to verify your account. How do we make sure that every single row is talking about one entity, and every single entity only has one row, everything is unique? If we have a person and draw a little guy over here, right? And he wants to sign up for this website. So you guys this website, here's his website, you put this information in, and the information goes into a table. Right? he does, he probably won't actually have access to the SQL, because it's just some random user on your website has access to your database. Can I got some security problems, but anyways, what's going to happen is he will be entered into this and all of his values will be put into these values box. So we have username, like it, fi, email, DC at blah, blah, blah, calm pass, it's pi. First name is Caleb, real name Daniel. Last name, curry birthday, oh, to 1390. So this guy, inserted all that information within this database. If you can't read, it doesn't really matter. Now, when another guy comes, or let's say gal, so this lady comes, and she's like, Oh, I want to sign up for this website. So she signs up this website. And she puts her information in. And this also goes in the database as a new row. So this is just the column headers. This is separate from the actual data. The rows, though, they have individual values, she can put her values in, I'm not going to write them all out, because you guys get the point. How do we know that each individual row is unique? And that talks about one individual entity, in this case, a user? Typically, whatever the entity is, is what the table was titled. So the entity will be the user. How do we know that every single row is talking about only one user, and every user only has one row the rule of one. We don't want Caleb to be able to come in here, create a new account and make his username, cc and all the same stuff again. Now, in some instances, we may be able to allow them to make more than one account. Like I could make an extra YouTube account if I really wanted to. That would be okay. So I could have the same different username, different email, same password, same first name, same middle name, same last name, the same birthday, the only thing that changed is a username and email. Now, the default, like what the table is trying to define has kind of changed. We're no longer trying to define the person. We're trying to define the user account. Because now there's two accounts and they're completely separate entities over that kind of makes sense. But you don't really need to worry about all the specifics. I'm getting completely off topic. So what were we talking about 10 minutes ago, I think we're talking about super keys. It's any number of columns that forces every single row to be unique. If you take all this information and compare it with another let's let's create a new row. Let's say this lady here actually ended up putting your information in her username is a her emails B passwords seat. First Name is Hi, middle name is yo. And her last name is Hello. And she was born. Oh 316 72. That is her birthday. Well, now, if you compare these values it is in is there they're different. Not every single individual value has to be different. For example, this girl's last name could be curry and that'd be perfectly okay. But all of the values as a whole have to be different. So as for example, if someone came in here and put all of this information up here, down here, cc ccof, blah, blah, blah, pi, Caleb Daniel Curie Oh to 1390, that is not a unique row, and it would cause an error. So the super key could be all of these columns right here. Now, super keys are usually not defined within the database. That's because they're so broad, you have all this information. Of course, it's going to be unique, simply because you can only use a username one time, typically the emails only one time. So if someone comes down here and tries to use an MCC, well, then you already got an error, it's not going to work. So all this extra columns are here. Kind of unnecessary, because we can simply do that same thing with just the username. So a super key is any number of columns, that creates a unique row, unique rows. A candidate key is the least number of columns. That Can you see how new is a candidate? I'll explain why it's called a candidate. And second, or in the next video or something. The candidate key is the least number of columns. So for this example, we're not going to allow two people to have the same username. We may allow them to have the same email, we may allow them that same password first name, middle name, last name, and birthday, but not the same username, because then how do you tell them apart? You can't because all of their information is the same. They're exact clones to the database. And we do not ever want that within our database. Typically, I can't really think of any examples where you would want duplicate data. But okay, anyways, the candidate key is the least number of columns needed to force every row to be unique. In this example, the username is enough to make it unique. So we can take the candidate key and have it to be just the username, or I guess we could certainly call because all throughout the, the rows are going to be unique. So the can the candidate is least, and super key is any x. Now when it comes to database programming, let me clear this up. So your brain doesn't explode anymore. It takes forever. I want to tell you guys, stop watch. Alright. And I just got my chalkboard to listen to me and erase that stuff. When we're talking about database programming, there's not really a time where we're ever going to tell your database that this is a super key, because super keys are not practical. They are for designing your database. Only. When I looked at that column, first thing I do is I think super key. Yes, every single row can be unique. That's the first check. So you ask, can it can each row the unique? That's the question. That's the Define. That's what super key is questioning, can each row be unique? And even better word? Can every row be unique, because every single row within a table has to be unique. That's what the super key is asking. Once we figured that out, if it can be unique, then we move on to figuring out the candidate key. So now we ask how many columns are needed? Can you read that? Hopefully? So the first question, can every row being be unique? Yes, that means we have super key. That's all we have to worry about. We have to have to figure out which columns or anything like that, can every row be unique? Always think about all the possible values. Is it possible for a duplicate? If it is you got to figure something out? If it's not good, so every row can be unique. Yes, check move on, how many columns are needed. Now you figure out the least number of columns needed in order to create unlimited uniqueness forever so no possibility of uniqueness and we are allowed to put some kind of like limitations. We can put a unique we can use. It might be different from for different relational databases. But if we use a unique index, for example, every single column, I'm sorry, every single value within a column has to be unique. And that's, that's a requirement for people to put in. So if it's not unique, we give it an error. That's how we force values to be unique. So how many columns are needed? Once they figure that out, we have what's known as a candidate key. So in the last, the last example, it was just one because we could use the username. Now, the next thing I like to figure out is, how many candidate keys do I have? I'm just going to use ck for candidate keys. How many ck do I have? What does this mean? Well, we can actually have more than one candidate key. And the last example, we had a username, we also had an email. Okay, well, email could also work as a candidate key, because you could put a requirement that every account has a unique email, some websites are cool with people using the same email more than once. Some websites aren't usually they're not. So there might be a chance where we have more than one candidate key. Or if rather than using the username, you want to use something else, you could have it to where we can have, let's say, you can use your email one once, we could have first name, plus, last name, plus middle name, or middle initial. So that eliminates most possibilities of uniqueness. But there's still possibility that people have the same name. So then you add, birthday. So now you have to find someone with the same first name, last name, middle name, and the birthday. Very rare, still possible. So then we could add a third. I guess one to a fifth column, we could make it. email, or we could do something such as a dress. Alright, let's see, let's go with a dress. This could be a candidate key. Because we have first name, last name, middle name, birthday, and address, they all have to be unique as a whole. What are the what are the chances of a person living at a certain location born on a certain day? Also, the UK what are the chances of two people having the same name same birthday and living at the same place? Unless you had twins, and you named them the same thing? If that's the case, shame on you. It's gonna confuse people. That could be a candidate key now isn't the best candidate key? Probably not because we're using five columns. And it's still technically possible to have a duplicate, but very, very, very rare. Right, so how many candidate keys do I have? You figure them out, I like to write out the possible candidate keys. And then we move on to the next step, which is defining the key that we will use for our table. Once we have the candidate keys, we can choose one of them. They're called candidate. Because there's there's an options like you could have, you could have your first candidate, you could have candidate key one, candidate key two, and you've got candidate key three, this could be the username, this could be the email. This could be that long, first name, last name, password, blah, blah, whatever you want it to be. Now, you have to decide which one will be your primary key. We're going to be picking up from where we stopped last video, which we were talking about candidate keys. So this video, we're going to be talking about primary keys, which is the next step. So what is a primary key? Well, in the last video, we we define some possibilities for candidate keys, something we could use as the main key it's a candidate because you have to pick it to be our primary key. So we have a list of possibilities. One of them was a username. Another was an email and then if you don't have something simple like this, you could do something like your full name, which that would be like first name, last name, middle name, and then plus an address. Which address would probably be broken up into like, Street, city. You know. Like all the parts have an address, and then possibly your birthday. Or just dog. So here are just some possibilities for candidate keys, we could have username, email, first name and last name. Well, now, once you have all your possible candidate keys, you want to pick one to be your primary key. So how do you do that? Well, once again, you got to make sure we look at the the rules or suggestions for keys. This is specifically talking about primary keys. We want our primary key a unique, never changing and never know. So username is never going to change, right. So that's pretty good. You always need a username when you sign up for an account. So it's never know. And it has to be unique because two people can't have the same username. So this would be a good primary key, check out email. Well, it's you can make it unique, you can require people to only have like, you can prevent multiple emails. So when you create an account, you can only use an email that has not been used, that's an option. So that's possible, never changing it Well, it depends depends on if it if your use your email, like a user account username, that you sign in with your email and you have a password to well, then the chances are, you're not going to be able to change it. Although now, you can probably update your email and stuff. So it's possible to be never changing if you prevent people from updating their main email. So possible, possible and never know you can require someone to have an email. So that's possible, all three of these are possible. So this is a possible loss for a dash, first name, last name, middle name, address, date of birth, well, probably the whole group, as a whole will be unique, the chances of finding someone with your same name, living at the same place as you with the same date of birth is almost impossible. so unique, practically Yes, but possible, you can possibly have a duplicate. So possible, never changing. Oh, well, people can get married, so their last name can change. First Name can even change your middle name can even change, you can move, you can't be born a different date, you can't be born again, like unless you become a Christian. But anyways, never know, well, some people might not have an address. Either that, or they might not have a middle name, or something like that. So this is probably not the best primary key. So in this example, I'd probably go with the user name. And then you define that as your primary key. And then you can use that to make connections between tables. So you have a comment by a username, that's the connection. That is a natural key. And we'll be talking about surrogate keys in the next video, if you know anything about that. But basically, you can use your username as an Id rather like as a primary key. So you could have, let's say, this is my username. This can be the connection for all of my rows by me. So if I have a table for comments, table for users, a table for sales, will if I buy something, Caleb curry connection. If I post, if I'm a user will connection, if I post a comment, connection, it all kind of points back to the user table, because that's going to be the parent of the relationship here. So this could be a primary key. So that's a primary key. Now, this will also be indexed, because primary key is a source of is a type of index. So you can do SELECT statements really easily with that and everything like that. So that's going to be how you connect most of your data. Now, the other candidate keys that we did not choose, they're known as alternate keys. An alternate key. It works. I mean, it can be it can be it could be the primary key. But basically, it's all the candidate keys that were not selected as the primary key. You do not you're not required to define all of your alternate keys in your database. In fact, you don't even need any alternate keys. But you may want to because oftentimes they will be something you're doing searches for, or connecting tables or whatever it is. Alternate keys might be useful so you can create an index on the alternate key. So I could create an index on the email For example, because I said that was a possibility. Well, now, that can, that'll be an index, and it will probably be used because you might want to do data by the emails like selecting everyone by their email, you know what I mean? Maybe possible, basically, you can, you can do a select. And then you can choose the columns you want to choose and then the table. And then you can say, where. And then you could say, like, email, has the value, legit at awesome calm, because you have an index on that it's going to work better. So if you have a really good alternate key, I recommend you index it simply for good design. And it will help your database run faster. But if you don't really want to index on it, then don't do it. Because that's just another thing your database is going to have to maintain, it's going to have to update the index as well as book sort of how if I have a book here, and I change the insides of the book, well, then the index in the back is gonna change too. So alternate keys, you can set them as an index, if they are good and useful if you will be searching for that kind of information. So that is introduction to primary keys and alternate keys. Sir, I think I'm falling sorry, by the way, sir, good. And then we have a natural. Alright. So we talked about primary keys in the last video. That's as far as we'll get for defining a key. These are categories of primary keys. So we define a primary key. And this is kind of like a descriptive term, we don't have to define in our database surrogate primary key or natural primary key, this is more for database design. To know like the difference between types of keys, hopefully that makes clear sense. And if not, just stick with me. Okay, so Up, up to like all these key videos I've been doing, I've been using natural keys, let's define a natural key first, it's very natural definition. It's something that's naturally in the table, something you naturally want to store. When you have a user's table, you naturally want to store the user name and the email, those can be used as natural keys, because they fit all the requirements for a primary key. And it's already in there, you don't have to make up something, you don't have to think of a column to add just for the sake of having a key, we want to be able to have a key for every single table. But sometimes, the natural keys not very obvious or not so natural. So we want to try find a natural key when we're using natural keys. Generally, when you create a database, so let's say this huge box is our database. Within this database, we have a bunch of tables. And then within each table, we have a bunch of columns and rows. See, I'm saying these are all connected in some funky way, with relationships? Well, when we define a database, typically, I mean, maybe not always, we want to try to either always use natural keys, or always use surrogate keys, which are just made up keys. We don't want to kind of flip flop switching, I can't think of any other words. Because we want to try keep it consistent. Otherwise, people using the database, they're gonna be confused. Is this a natural key? Or is this a surrogate key? Is this something that actually has real world value? Or is it just a made up number? So let's say we are using natural keys, there are a couple problems with natural keys. And this video will be explaining the difference between these two. And the next video, we'll be talking about pros and cons of each basically. So natural keys are what's naturally already in the table, something we wanted to store to begin with, right? The thing is they have real world value. So your database might adapt, and the meaning of your keys kind of adapt to which, once again, I'll explain that on the next video surrogate keys. They are a column that's just added to your database, no matter what even if you have a good natural key because remember, you want to either use all natural or all surrogate usually. So if I decide to use surrogate keys, I'm going to give an ID to every single row within every single table. So we have let's say we have a user table and then we have a sale table. Then we have a common table, all we do is we give every single table, an ID column. So we have a user ID column and the say in the user table, we have a sale ID column and the sale table, we have a comment ID in the comment table. So now when we have like, let's look at the comment table in the comments, we're going to have a column, we're gonna have a column for the comments ID. And then we'll need a reference to the user that posted the comments. So we'll have the user ID as a foreign key, which points back to the user table. So that's kind of how that works. Now we just have random numbers, the ID is just a big number that has a feature known as auto increment. means every single time you make a new row within this table, it's going to raise the ID by one, then when you like, delete columns or something and you have gaps within your numbers, it doesn't really matter because the number has no real world significance. Typically, surrogate keys are kept completely private. No one knows the surrogate key, except for the people working with the database, you're not told your surrogate key. So if I created an account on a website, and then I put my username and my password, and then I go to like, I sign up, register and do my email junk, and then I sign in and go to my user settings, it's not going to be like user ID equals 45, or whatever, it's going to be private. The reason for that is, is because the world doesn't have I mean, the, the number doesn't have real world, meaning, therefore, it's private to the database only, if I started giving that number out to everybody who has one of these IDs, or, or something like that, or if I started putting it on sales reports, well, then I'm giving that number real meaning in the world, and it's starting to become a natural key. So hopefully, that kind of makes sense. Natural is something that's already in the database. surrogate is something we just add. The thing that's kind of cool about surrogate is if you're struggling to find a really good natural key, you can always use a surrogate key. The problem with that is you always want to try to be able to naturally make everything unique. Like any user table, we have a username, that's going to keep things unique. So even if we use a surrogate key, we can still index this username. And that can be used. Sorta as if we assign it as a natural key. The only thing is we're making connections between tables using the surrogate key. So for this table, we could have a user ID, and then we could have a user name. For just an example. The user ID would be the surrogate key. This would not be the natural key because we can only have one primary key. And the primary key is going to be the type surrogate. So the user ID is a surrogate key random number generator number, the username, we can index it and use it like a natural key with when we search the database and everything. But we're not going to use it to make connections between tables and references with foreign keys. So hopefully that makes things pretty clear. In this video, I wanted to talk about all the pros and cons of surrogate or natural keys. So basically, we're going to compare and contrast them here. So over here on the left side, we got surrogate and then over here on the right side, we got natural All right, we're gonna have a little battle here and we're going to figure out which one's the best. Me as boxes to keep this up. Here. Alright, so what are some good things about natural keys? Alright, well, first one of the first good things about what's really weak. One of the first good things about natural keys is that you don't have to define any new data. If you don't remember from the last video, let me just redefine them here. Natural keys are used, or columns used as your key that are already defined within your table. They're natural to what you want to store. surrogate keys are just kind of added like a user ID or a car ID or a credit card ID or a comment ID, their surrogate. They have no real world meaning so natural, not natural. Cool. So some good things on naturals, you don't actually have to define a new data, any new data, I don't know if you could hear me, they're not kind of stuttered on that. But anyways, you don't really have to define any new data because it's already within your table. And you can use that as your natural key. So you're required to store less information in your database. So you get smaller data, you get it, you have a smaller database. Right? So that's one good thing. So what's the downside to natural keys? Well, the downside is that sometimes you're not going to be able to find a really good natural key. Because if you look over here, we want our keys to be unique, never changing and never empty. Or No. Well, if we have a natural key, we have to find one that fits all three of these. And it has to be a good one that would make sense to use as a as a natural key, like, you might not want to use a combination of 20 columns to make natural key. But that's one of the downsides to natural keys, you have to figure out what natural key to use, and there's not always a great option. Now, optimally, you should have a natural key, or at least something that can be used as a natural key within every table, to best had to have the best design to ensure uniqueness, not just by a surrogate key if you use one, but by naturally what's being stored. So like this example, if we have a comments table, we could have a comment by a certain user on a certain web page or whatever. And what what forces it to be unique? Well, the combination of that as well as the date it's posted, or the time it's posted, that's going to be unique, because you can't have two comments from the same user at the same time, on the same web page. It's not possible, right? I mean, unless you like glitch the system somehow, which shouldn't happen. Well, that's an example of using natural keys. The downside? Like I said, you might not always be able to figure out what to use for your natural key. That one was a good example. Because naturally, we could figure out a natural key, right? Follow me. The other thing is, natural keys have real world values real world meaning, right. So you have connections to real world, meaning. So for example, in the comments, the actual comment in combination with the user who posted it in combination with the time it was posted, and the page it was posted on, or any of those four columns to be used as a natural key, well, those have real world meaning. And as your database develops over time, the database application might change or be updated, or new meaning is given to the database and what it's supposed to do. That means that your natural keys could change over time and have new, new meaning. Or the actual values could change in some instances. And we do not want that, obviously, because it has to be unique, never changing and never know. If for example, we make it to where we do have to change the natural keys. Well, that would kind of be like crossing this one out. And that's possible, like we could update the values of all the keys. But that's going to require us to update all connections between tables. And that's going to require a lot of resources from our server. And that's just a bad design in general. So those are most of the cons of natural keys. So yeah, let's move on to surrogate right. So surrogate keys, well, obviously, the first downside is you have to add a column to your table, no matter what even if you have a a, for example, if it's naturally unique. Well, you might like if you have a comment from a username, well, that username is gonna help ensure uniqueness. And you still add a surrogate key when you're adding unnecessary data, because you're adding a new column and you have to store all these numbers. One of the pros of surrogate keys or is that they're typically numbers were oftentimes natural keys can be words or anything like that. That's good, because typically, numbers are easy to work with, but not always. Now, a couple other things. Besides the fact that you have to add a new data new column, which requires you to store more data. It's that it can be confusing sometimes when all you're working with is just combinations of numbers. And I have like, we have a column let's say like komentar for a comments table, and we just have an eye we have the value like 700 407,462. Well, this kind of clear, it's talking about the user with the ID of 747,462. Sometimes columns won't be very descriptive though. And that would be full into designers. So make sure you name your columns, right. But when working with other people, you can sometimes get confused exactly what References What, and you have to do a little bit more thinking and check the actual database structure the way it's programmed, or however else like that. So that's a couple things that are confusing. Okay, so now you're probably wondering, okay, so which 1am I supposed to use? Well, actually, it just depends on what you want to do if you prefer to use natural or surrogate keys. But typically, you'll want to pick one and use that throughout all of the database. As a database really weird. There's like database, data base rather than database. So it's like database anyways. So like, set that aside to use surrogate keys, well, that means every column should get a ID, or at least a combination of IDs. Natural if I decide to use natural ones, every column should try to be used as a natural key, because if we mix and match, sometimes we're gonna have a surrogate key, sometimes we're gonna have a natural key, and it's going to be super confusing, and we're not going to know what's going on. So typically, you want to kind of keep one or the other. The other thing is, I don't know what the other thing was. But yeah, typically one or the other. Now, for my sake, I personally am going to use surrogate keys, although that's not saying that they're better in any way. I just prefer surrogate keys, because it's simple for me, because every time I make a table, I all I got to do is make an ID with it. Like if it's a user table, all I got to do is make a column user ID, and boom, we're done. That's all I got to worry about. And I don't really have to worry about ever changing or ever becoming something different or anything, it's just a random numbers 7254. Well, that has no real world meaning. Now keep in mind, though, that if you do use surrogate keys, you shouldn't, you shouldn't have them have any real world value. So for example, a student ID if we give these student IDs out to the students, that's not really a good idea to use as the surrogate key, because it's actually an actual key, it has real world meaning. So those are just some things and you can find tons and tons of stuff on the internet about natural or surrogate keys. It's basically like a war between two database sides. Were the natural keys. Oh, were the surrogate keys. You know what I mean? So yeah, don't freak out. Just figure out what works best for you. There's some minor performance differences. But I mean, I don't really know all of them. So yeah, believe it or not, I don't know something. I'm kidding. But anyways, try them both out, if you want, look up stuff, look up, which one performs better, or which one will work best for your application, or your database, because that's really what matters. So this is a big subject. And there's a lot of important things. So I'm going to be separating a lot of these concepts up into separate videos. So the next two or three videos, this video and two others I think will be over foreign keys, and a lot of things that have to do with foreign keys. Sorry, anyways, this video we'll be talking about introduction to foreign keys, and exactly what exactly they do. Alright, so foreign key is a reference. And what does it reference? It references a primary key. So that's it. That's the most basic definition of a foreign key. Now, this can be a primary key in the same table or a primary key in a separate table. So let's kind of see how this would play out. If we do do some drawing, right? Let's, let's think of a class for a college database. So this database has call has all the information for a specific college or multiple colleges. And one table is for classes. We also have a table for instructors and for users and anything else that's obvious. That's an obvious entity. So let's, let's look at the class table. So we have a class table with this, draw it out here. Within here, we could have a class ID which, like I said, we're just going to be using surrogate keys for most of this. You can do the same thing with natural keys. And if you have no idea what I'm talking about, just don't pretend I never say anything. So yeah, we have a class ID. That's an example of a surrogate primary key We can also have a an instructor. You see, so we can have an instructor ID, right. And we could have, let's say a room number, because we could have, we could have a table for every single room, which talks about more information about the room, or I don't know if that's a really good example. Or if they would say we have a bigger building, or a bigger college, and we have a table for each building. So that's probably that's probably a better example. So we can say building ID to say which, which building is in. Alright, so here, here are three IDs. So you can assume that they're all keys because almost all IDs are going to be keys of some sort. Which one is the primary key? Well, this one right here, the class ID, this is the primary key because it's the class table, every single row within this table is going to have a class ID, whether it be 1234 712-800-4364, whatever it is, it's going to have a class ID every single row. So every single class is going to have its own ID. We also have an instructor ID and we also have a building ID. Well, since these are kind of unrelated to the class, you can assume that they're foreign keys in this example, because we're not going to have an instructor ID for a class necessarily by itself. This is going to be an instructor ID for an instructor. So that means we have another table for instructor. So this is our instructors table. I think I'm spelling instructor right. And we have an instructor in here. Instructor ID. Well, this you can tell these are a reference here, we can also have a building ID. So this references another table. And that would be the building. And then we have a building ID. So I'm taking forever to write and I can't even write nice. But that says building ID right there. And this references that there. So you can see there's a connection between these tables. Now we'll get more into drawing relationships better like rather than just using wines, we'll talk about more of that in the future. But for now, we can see that these reference these columns, so the columns point to each other. And also each individual row is going to point to each other. So let me explain that a little bit more. When we have a row within our class table, let's draw this table a little bit better. So I can kind of see how it would work. So let's say this is an actual example of the values. Within this table, we have a class ID. And instructor I'm just going to put I and then the building Id just for simplicity, heck, I don't know. Because I don't feel like writing all that out. So we can have a class ID. And just just to make things more, just to be more accurate to what a normal database would be, we could say like the name the name of the class like biology. So we can have a class ID, we could have seven for value. We could have the instructor ID could be 63, we could have the building ID of 16. And then the name could be biology. Well, every single column is going to have column rules or requirements. For example, these two columns here. Individually, this one is going to have a rule that says every single value within this column. So as we go to the next row, the next value, every single one is going to have to reference something from the instructor table because it's an instructor ID. So that's one of the column rules. Now same for the building ID, we could say that every single row has to reference a value within the building ID. All right, so that means the row right here, let's say this row individually, so right here, the class ID with seven biology references, the instructor with the ID of 63 and the building with the ID of 16. So you can see that the column has a rule set to say that every single value within this column, every single row is going to have to reference an instructor and a building now Each individual row, we could say, Oh, this specific row references, the building and 16, and the instructor 63. You see. So that's kind of how foreign keys work. Let me clear this out, kind of draw it out a different way. Let's take a look at the, the building table. Okay, so let's just draw little squares to represent it. And let's say within this table, we're not going to structure the columns or anything, we're just going to say we have the building, 716, and 14. So these are three separate buildings. And we could have names for the buildings and all kinds of other things. Now, we have within the class, let's say this is the class table, right? And we have the class ID. So the class ID would be like 6438 123, it doesn't really matter what number they are. And eight, right. And now we're going to have a reference to the building ID. So let's say we have 777 14, right. So here's an example. So over here, we got the class ID. I got the building ID. And then we got the building ID. And then we got more information about the building. Here, if we wanted, well, you can see that there's a connection between every single value, every single row within this class table has a connection with a building ID. So that the entire column building ID has the rules that every single value within that column has to point to a value within the building ID of the building table. Now, each individual row has a value that points to a specific row, within the building ID. As you can see, these are all individual rows. So let's get a little crowded here. We can see that three of these classes in this class table, reference one building, and that's the building with the ID seven, and that can be the room legit building 74, or whatever you want to call your building, it doesn't really matter. So that's, that's an example of a relationship. Using foreign keys. foreign keys are what keeps things connected. So let's say the name of building seven was legit, because I really don't know what else to name it. Well, now we have the value legit as its name. And we connect with the ID. So rather than putting legit, legit legit, and having repeating data that could be easily messed up, for example, if legit, was changed to legit with two T's or changes change the name completely? Well, now with this foreign key, it's connected, so updates automatically. So yeah, that's kind of how foreign keys work. Just keep in mind that every table has one primary key. Now that primary can primary key can be a combination of multiple columns if you want, but it only is defined as one primary key. For example, we can have the combination of first name plus last name, this group could be the primary key. But we're not going to have the first name primary key. And we're not going to have also the last name primary key, because that's to my two primary keys. And we can do that we can have the combination of the two. But with foreign keys, you can have multiple columns having foreign key relations to different tables. But keep in mind, each column can only have one reference. So in the class example, we had a teacher ID, well that teacher ID, here's the class table. And let's say the teacher ID is seven, six and five. Well, this can be a reference and we could also have another foreign key we could have the building ID which can be 612, and eight. So although we only have one primary key in a table, we can have multiple foreign keys, referencing different tables, if we wanted it to be where the the, the instructor, I think it's supposed to be an AI. So the instructor to where the class can have multiple instructors. Well, that's a many to many relationship and we're going to have to design our tables with an intermediary table. So hopefully that's kind of making sense, I just want you to realize that the primary keys are what sort each individual tables, the foreign keys are what Connect tables, and that you protect that integrity with foreign key constraints, which we'll be talking about. In the upcoming next few videos. In this video, I wanted to talk a little bit more about foreign keys, and a characteristic known as not null. Alright, so this this keyword here might vary from relational database to relational database, but it's probably something similar to this not know, which we learned about No, and we talked about this over here, it's basically saying you, the database will not accept someone not answering that column, for example. So if it's not know, you have to give that call and the value every single row. So the rows are each individual entry within the table. If you have the column characteristic of not know, for every row, it's going to be required that it has a value. So think of like a fax number. Does everyone have a fax machine? No, most people don't. So if I set it as not, no, and I didn't give it a fax number, then it would throw an error and cause problems. That's, that's an example of how not all works. Now, this is important when it comes to foreign keys. Because it has to do with what's known as cardinality, which we'll talk about that when we start designing databases a little more. But it's basically whether or not a relationship is required. So this is going to dictate what kind of values we can put within our table back to the fax machine, if we have it to where it's not null. That means every single value, I mean, every single row has to have a value. That also means that if someone doesn't have a value, they might not be able to put any information in this table. So by putting this restriction, you're eliminating all rows that aren't going to have a value, you're preventing them from happening. You see what I mean? Yeah, so if the cases where you do want people who don't have fax machine numbers to be able to enter data into your table, well, then you would not want to use not no as a characteristic, because that's going to prevent them from doing that. Now, this is important when it comes to foreign keys. Because if we require a relationship, we said as not No. And that means every foreign key value for that column, every single row is going to have to have a relationship for that. So one of the biggest differences between primary keys and foreign keys is that primary keys are required all the time, never changing. Never know. Right? Drop munchak, dang. Well, foreign keys are a little different. Because we can have them to where they're not required, they can be empty, because sometimes that relationship is not there. Or they could update like we could, we could have a new relationship in this table, which we want. We don't we want to eliminate as much possible updates as possible. But sometime that's required. And that's why we use foreign keys to keep things connected. So when things are updated, they update across the platform. Now, we don't want the primary key values to change. But we could have the foreign key references change. For example, let's say we have a table over here. And we'll just have it will be will have to be carbs, okay? Or whoops, have the car, right. And we have two rows and have a car with the ID of six and a car with the idea with the ID of 10. Right. And we have this guy over here. And we have a table for him. What is making the user table and let's say for some reason, we wanted to make a connection of what car he owned. Well, he might currently own six, right? And we don't ever want the value of six to change and still be talking about the same car. Let's say this car is this one over here. This is some flames. Right and this one over here. We got a legit car. All right. So these are two separate cars. 10 talks about this car six talks about this car. We don't ever want it to happen to where this number gets up. Unit eight, or we change the primary key, and it's still referring to the same exact car, because then we have a changing primary key. But with foreign keys, it can change in the sense that, oh, I no longer own this car. Now, I own this car. So we didn't have any primary keys updated. But the foreign keys, which would be this connection, right here, did update. Alright. So that kind of is a little bit of a difference between primary keys and foreign keys, primary key value should never change, foreign key values can change, because references change. Hopefully, I didn't just lose my train of thought, I think I think I'm going but kind of hit a blank here for a sec. Okay, so let me just illustrate this in a college right. One of the important things between these tables that have a primary key and a foreign key connection to that primary key is that they have to be the same data type, which is the type of data whether it be int, or character or date. And basically, it has to have all of the same characteristics like the collation or the character set or the storage engine. Now, the only thing that can vary is the nominal characteristic. Because if we have the primary key over here, well, we know that the primary key is never know. So we always have not know, for the primary key, I'll just put an N for sure. The foreign key on the other hand, it's not always required to have a connection. For example, in that previous example, we had the person the user table and had this guy over here. And then we have this car table of the cars that the user owned, right? He could have one car, right? Well, he might not currently own a car, therefore, we don't have a connection to this table. Therefore, this connection is optional. Or not required. Yeah, that doesn't even look like English, but close enough. So let's think of a another example. If we have a college, well, we might be in early in the semester where every single class before people have registered, every single class doesn't have an instructor yet, and we're still trying to find an instructor. Although we know that's the class that we want to have. So that means we have an instructors table, we'll have the instructors over here, and we have a class table. A class table is going to have a foreign key pointing to the instructor for the instructor of that class. Unless the case of a many to many where multiple instructors can teach multiple classes, then we would need an intermediary table. But that is not the case here. You need to learn more about that check my relationships video. But for now, we would have a class ID and an instructor ID. Right. And then over here, we have an instructor at this it says ins instructor close enough right? Now this is going to be a reference to this over here. Now what kind of values is this going to accept? Let's give some examples. Let's say we have here's a row, we have the class ID seven, the instructor ID at three, and we'll add a call and we'll make the name of the class. We have a call minutes math. Right? And then we can have another value within here. I think I'm off screen now. Then we can have an instructor over here. His name, his ID could be seven. And his name could be shake. Right? So this class ID is seven yummy changes in the gonna get confusing. Let's make this 16. Alright, so this class ID seven. And it's by an instructor eight, which we don't actually have over here yet. So that means we either have another row or we have an error which because we never want that to happen. So let's say we have another class. Got any room. Okay, we have another class over here. And it's it's numbers eight and it's taught by Caleb right? So now we have an instance of a class. We got the class ID seven, the instructor ID eight, and the name math. So the instructor ID points to this Walker, this instructor over here. This is the parent I'm sorry, this is the parent This is the child because this is a foreign key reference to the parent. So that means This relationship is a foreign key. Now, with the not normal thing, if we made it to where it's not know, if we added the characteristic, not know, or that means this row is going to have to have a value for the instructor ID, that means every single value for instructor ID is required. What does that get rid of, that means it's impossible to put a class in this table that does not have an instructor ID, that could cause a problem if you have a class created, but you don't have an instructor yet, or if the instructor drops out, and you have to find a new one. So this one might be a bad time use not know, if we got rid of that. Well, now we could have a class with a primary key, but no foreign key connection. Now we have a class with the idea seven and it's met. And the the current teacher, we don't have one. And we still have this guy over here. He's not teaching that class, but he's still in the system. Jake's still in the system, the class is still on the system. That is a good example. So for this one, not an old might not be the best idea. Other examples, having nano would be a good idea because it forces that relationship to be there. You only think of another quick example. Real quick, quick example real quick. If we had two, two entities, we had a card for a credit card company. So we had the card. And then we had the person holding the card. So the person who bought the car or getting the card from the bank, and he's using it to buy their groceries and their gas, and they're breaking up 1000s of dollars in debt. Well, that person would have relationship with this card, because he's the card holder. That's the car. Now, if if we had a table for this, we probably have a card table, and we'd have a person ID, which would reference the person table, right? If we set this to not know, that means every single value or every single row within this card table is going to require a card owner. If we had if we got rid of this, that means we could have cards in here that were outdated, that don't currently have an owner because there's no connection, we broke that owner, or new cards that haven't been given given out yet. So if you want this table to just be for cards that currently have an owner, so we can have our own cards. Well, then we could have a a nominal column with the same data type as person ID. Now every single value is going to have a person ID Does that make sense? So that's kind of the introduction of not knowing how that works with foreign keys and how sometimes you will want them to be not known. Other times you want them to be nullable or not required. I got a bunch of writing on the board. But don't look at it yet. Because I just want to talk to you guys for a minute. I just wrote it. So you won't have to watch me write like 50 100 words is that to be boring. So instead, I'm just going to talk instead. So much better, right? Anyways, last video throughout this series, we've talked about foreign keys. And as we know, they keep connections between tables in a database. So table one, table two, boom, foreign key, connect from this foreign key references the primary key of this table over here, that's kind of how foreign keys work, they protect the integrity of our database. They keep everything consistent. Think about if if there was no foreign key constraints, which keeps keeps it updated and everything, then we would have to manually update all connections between tables. And that'd be a big waste of time. And it just be easier just to store everything in one giant table, which both solutions. either manually doing it or storing everything in one giant table is not going to work. So we need foreign key constraints to protect the integrity of our database, right? It's called referential, referential integrity, it's big words. Alright, so now this concepts these words might be specific to a certain relational database management system. So these are from MySQL, but if you're using a different relational database management system, yours might be a little bit different. But in general, the concepts are pretty similar. So you're not wasting your time if you don't use MySQL, because you need to understand how foreign key constraints work. All right, cool. So first of all foreign key constraints, make sure that if you update the parent, the children are going to update or cause an error or something like that, it's up to you, you can make the decision. And it's also going to burn prevent you from creating children, when there's not a parent that are that with that primary key. For example, if we have five users with the user's ID, 2468, and 10. And then I make a comments table, and I add a row that's like, from the user 745, well, that person doesn't even exist, so that's not going to work. So that's why foreign key constraints are important, because they once again, protect our database, and make it pretty much self managed, rather than us having to manually update everything. So let's begin. Alright, let me get a piece of chalk. This one looks nice, getting pretty low. This is the biggest piece of chalk I have left. Alright, so we when we create SQL statement, we can add these two key keyword phrases here on delete and on update. This refers to the foreign key constraint. So when we, but it's talking about the parent, so when we update the parent, we want the kid we Superfly. When we update the parent, we want the children to do something. When we delete the parent, we want the children to do something, that is what these keywords mean. So these keywords, it's talking about the foreign key constraint. And it refers to the parent. All right, so let's just draw some basic tables just to kind of keep things I'm not going to give any specific examples yet, I'll just kind of draw things out. So let's say we have a parent here. And as you learned in previous videos, the parent is the primary key, right? And then let's reference this primary key with a child. Let's not draw this line yet. And then we have a foreign key. This, this primary key is a column within the parent. And every single row has a unique value, the foreign key is a column within the parent, and it references the primary key. So the say the value 400 in a row could reference the value 400 in this row, and a row of this table, you get I've explained it enough. Now let's just make another child table. So these are both foreign keys that reference the same primary key. But they're two separate tables. So yeah, we can have a user table and then we could have a, a comments table and an Orders table, for example. They both are bought or commented by a user. So these foreign keys reference primary keys, so every single value within the table of the children within the foreign key column needs to be a value that reference that is present in the parent column. Now, these these values right here, these options refer to the parent. So when we delete the parent, or when we update the parent, we want the same thing to happen to the child or whatever the options are, and we'll talk about the options in a second. So that's kind of how these work now just to kind of clear things up I'm going to erase these words here just because it's getting a little crowded you know, I want it to be peaceful. Okay, so this is kind of how it's working foreign keys pulling back to the primary key these options refer to that what what happens to the parent? Now these these options right here these refer what happens refer to what happens to the child Okay, now it's getting real confusing. There's arrows everywhere, like graphs and diagrams and but you'll get it Don't worry restrict. You might also hear as no action is well at least for my SQL in different database management systems. These might be different, but anyways, restrict is going to throw an error. So if we try to like, we can give we can give one of these for each option. So on delete, we can either give restrict, cascade or set no or No. on on delete, we can have it restrict cascade or said no to To add more arrows, you know, but let's say we're working with delete right now. And we wanted to, we want to use restrict. So within our SQL statement, we could say on delete, restrict the on delete, restrict, and that is going to give us an error whenever the parent is deleted. So if I tried to delete the user, with the ID, I don't know I'm writing all that out, let's just say the ID of 504. And then this child has a row that has 504, within the foreign key column, well, then it's going to throw an error, and it's not going to happen, and the parent will not be deleted, not gonna happen. So that's how cascade on Delete works. I'm sorry, restrict on delete, now restrict on update if we updated that parent and change their primary key. So let's say we change 504. And these, these children both reference 504 with the foreign key, and then we change this to 508. And it's not going to work. And it will go back to what it was originally. Because we have it restrict, we're not going to let that happen. You see. All right, what about the next option, cascade cascade, we'll do whatever we do to the parent to the child. Let me clear this up a bit is confusing me. smearing chalk all around, you know, totally helping. Alright, so now if we cascade on delete, that means if we delete the parent, we're going to delete the child as well. So we delete 504. And both children are deleted, boom, child deleted. So now 504 doesn't exist at all. So like, for example, if this was a comments table, and we had a comment from the user with the user ID of 752, and we deleted 752, the user will then obviously, all of his comments by him should be deleted, too, because we'd have a comment by user that doesn't exist, which does not, which isn't supposed to happen. So cascade on Delete will do that on update is, if for some reason, we updated the primary key, which obviously shouldn't happen, because primary key should be never changing. But you still want that protection on your database. On update, if we have cascade, if we change like 504, let's say these people exist again. Skinny, right? So it's because, okay, so these people exist, again, 500, for 500 for these references, 504. And if we update this to 508, well, that means all of the children will update as well. So now, it connects within within different tables. Finally, we have Sentinel. And what that'll do is it will basically just set the children as normal for that specific value. So if we deleted the parent, that this is just going to get rid of the value, the row should still exist. But the actual foreign key reference, it no longer references anything, it's empty. So there's no it's not referencing anything. So the column for that row. So like the, let's say, we have a comment. And it's a comment from Billy Bob with the ID of 504. If we delete that, whenever we're going to have a comment, with no poster, or no comment, or at all, it just no value. That's all said no one will do for us on delete an update, same thing, if we change the ID of the parent changed it from 504 to 508. Well, all of the children are going to be set to no value as well. All right, now there's one important thing when he was set know that children have to have, they can't have I should say, the not no characteristic. We talked about that in previous videos. I'm just gonna put an end for sure. If this is not no, and we update, if we have on update set, no. And we update 504 to 508. Can you even I've ever gotten this off this whole time? reset five or four or 508? I think you can see that one. You can we change five or four or five away? And this is not No, it's gonna be like, No, we can't do it because we can't set it to No, because it has the nominal characteristic, which means every single row has to have a value for that attribute or that column. So I'll give an error. And I'll go back to how it was. So that's an introduction of foreign key constraints. Sorry if I talk fast, but yeah, cool stuff. It's important stuff. So learn it, study and learn it for your specific relational database management system. In this video, we'll be talking about simple and composite and compound keys. These are categories of keys. So like when you define a database, you can say, this is a primary key, or you can say its foreign key. But you don't have to say, Oh, this is a natural, simple. Primary Key, like those are just categories, you have to define that within our database. So these these terms that we're learning or more for design sake, so you know, the best design for your database, but you don't actually have to define it, right? Like, for example, this piece of chalk right here, you can define this as a white piece of chalk. Now, what kind of white piece of chalk? Is it? Well, it's x brand or whatever, you don't have to define that when you're telling someone Oh, here's a piece of chalk. That was the worst example ever. Moving on. Alright. Simple and composite. Okay, so simple means that the key consists of one called composite means it consists of two or more columns. This is most common with natural keys, because natural keys can be a combination of multiple columns, whereas a surrogate key is just a random number one column. So surrogate keys are simple composite keys or multiple columns. So for example, we can have a first name plus last name, plus email. This could be an example of a composite key because this whole group as a whole is the primary key. So we could have Jim, Jake, and legit are awesome at whatever's, blah, blah, blah. And then we can have another another one with the same exact name, Jim, Jake. And we can have a different email, these two are still separate, they're unique. Because the combination of the three are unique, Jim and Jake, repeat, but the email is different. So these as a whole, are different values for primary keys. So that's an example of a composite primary key. The reason is, is because it consists of three columns 123. This is an example of a composite key, a composite natural key to be specific, because we're just we're using things that are already naturally being stored within our database. Now let's think of a simple key, natural one, or username would be a good one is the neck username can be something that can be used by itself as a key, it's simple, because it's only one column. Pretty simple. I know. Alright, so I want to talk about another term you might run into. And that is compound. Okay, now, I want to clarify that some people and some relational database management systems don't clarify the difference between these two composite and compound terms. And they just use them interchangeably as meaning a key with multiple columns, some people go a step farther, and break these up into two separate terms. So we're going to be defining both of the terms, but you might see them used interchangeably. So don't be so super technical about the details. And there'll be like correcting people, Oh, hold up, that ain't a composite key. That's a compound key, the because most people are just gonna look at you like you're an idiot, and not really care. Alright, so anyways, moving on. A compound key is basically, when it's a call, it's a key that has multiple columns, and they're all keys themselves. So this is the most common example. This is for intermediary tables. So think about like this. We have a table used to store information about videos, we could say it's a video table. We could also have another table here, we can have a user table. Actually, here, let's let's do it this way. We're gonna have a comment state. Okay. And then within this comment state, we're gonna have a column of what video they're commenting on. You see, well, this is a many to many relationship, right? So this this might not work out well with database design, because one user can comment on multiple videos. And then one video can have comments from multiple people. So the best way to do this rather than having two separate tables like this We would create a video table. And then we'd have an intermediary table. And this would be video comments, or user comments. You see what I'm saying. So we get a connection between the user and the video to create an intermediary table. And it's split up with two keys. So we have the user ID, and then the video ID and then we could have another column for the message, or the date or whatever else. Well, you can see, if you don't know what I'm talking about, go see my videos over designing many to many relationships. But otherwise, if you know I'm talking about moving on, you can see that the entire key itself, every user and video ID combination has to be unique. That might, that might be a bad example, because a user could comment on the same video more than once. So this is probably not the best example. Let me think of a good one. But anyways, this would be a an example of a, we could add another column in here to make a unique, we could have the date and the message. So the message has to be unique, and the user ID has to be unique. And the video ID has to be unique. That'd be a composite key, a compound, a compound key, missing, for example, real quick. So a good example of this would be for college, we could have user we can have a student, student table. And we could have a class table, right? So this table holds information about the student, this information table holds information about the class. Now, how do we show enrollment, should we make columns for every student here now, because it's a many to many relationship, that means we need an intermediary table between the two. So this is a good example, sorry about my bad example in the last one, but that's a good example of a composite key, because the two columns themselves wouldn't be enough. So we'd have to add another one, which is not a key in itself. But for this one, we could have a column, we'll name the table, student classes, or the student class, whatever you want to call it, student classes. And within here, we could have the IDs, student ID, and class ID. Now, this combination between student ID and class ID must be unique. So let's say we have the value seven and four, seven and six, seven, and seven, that would work. But if we had something like seven and four, again, we'd get an error, because that's already in there. And that's not going to work out so well. So the combination has to be unique. And these are also both foreign keys, which point to another table. So both of them are keys. And the primary key is compounded. Because we have the student ID, and the class ID as one giant ID or one giant key, I should say, because the combination has to be unique. We can repeat seven on this side. And we can repeat like six on this side. But we can't repeat the combination of seven and six, they have to be unique as a whole. So composite keys. That's when you have at least one is a is not a key in itself. For compound, they all have to be keys. So for the composite. Let's look back at the the example we gave in the last one with the videos, the video comments, let's say this is the intermediary table blown up just so we can see it real big. This is the user ID. And then we have the video ID. Well, this isn't sufficient to make uniqueness because a user could comment on the same video more than once. Thank you for replying to comments or something like that. So we need another column, we could add the timestamp or we can just say the time column. And this is going to keep a unique time because you can't comment on the same video at exactly the same time twice. Right. So that would ensure uniqueness so we can ensure these three as a whole are unique. That's an example of a composite key, because at least one column is not a key in itself. This is a key because it points to a user's table. This is a key because it points to a videos table. This is not a key because it's just a column for That intermediary table. So that those are the three terms I want to talk about in this video. One other thing I want to talk about real quick, I'm not going to say what the best technique is. For this, I'm just going to say what some people do with an intermediary table, some people will give it a surrogate key on top of the compound key. So for example, with the college, keep switching back and forth between examples, I'm probably confusing you, they might add something, such as enrollment ID. And then we have the student ID. And then we have the class ID. These are both foreign keys. And they might use this enrollment ID as a, just a surrogate key for this table. Although it's not a requirement. For certain relational database management systems, some relational database management systems, or frameworks or something, are not going to be able to work well with composite or compound keys. So you might need to have a simple key which in that case, you could still have the idea of a compound key. This group right here, the student in the class, didn't have that index or whatever you want, or you can just enforce uniqueness with that. But then you can use an enrollment ID, so you can have like seven. And then you could have student six in the class four, and then you get eight, with the student four in class 12. And then you can have nine, with the students 16 in class four, and then we could have 10, with student 16, in class, 12, and so forth. That's just an example of what some people do. This video is going to be a quick review, I'll actually be using my notes here just to make sure I get the main everything because I'm not going to try to memorize like 10 videos or content has this crazy, crazy eye. But anyways, first thing we started with was super keys. So a super key was just any number of columns that make ensure uniqueness, but then a table. So we have super key. Which I still don't know why that's only one word, which just confuses the heck out of me. So someone please explain that, then we had a candidate key. Now the difference between a super key and a candidate key key is that a candidate key is the least number of columns used to enforce uniqueness. Now you cannot you can have multiple candidate keys if you like a certain group. So if that's the case, you have to pick one as your primary key. The primary key is the one you select as the the main key for your table, all the other ones you can assign as alternate keys. Which, if that's the case, they still can be unique, you can still enforce uniqueness upon those, but they're not used as the main key within your table. What else? Alright, well, then there's, that's pretty much the all the ideas with finding a primary key. And then when you reference that in another table, you have what's known as a foreign key. foreign keys reference, a primary key is a connection. That's how you make connections between tables. If you have a user user table, and then you have a is a comments table. Well, the comment is posted by a user, so you can reference that user in that comments table. And we talked a lot about that. So I'm pretty sure you guys got that pretty well. Now, this is pretty much all you got to worry about four primary keys and foreign keys, you just got to worry mainly about these two. These ones are less important about finding and figuring out. These two are essential. I'm not saying these aren't important, they're important. But really defining a super key is kind of pointless when you know there's the candidate key you don't have to say, Okay, well all of these define uniqueness. That's one. That's one super key. All of these define uniqueness. That's another super key. Oh, finally, we're down to the correct size for a candidate key. That's our candidate key. And then I'm check marketing By the way, just so you know what that is. And then oh, okay, we got a total of 10 candidate keys, five super keys. Now let's find a good primary key. Oh, this one looks nice. We'll use that one we'll set the rest is alternate keys. That's just complexity that is unnecessary when it's easier just to say, oh, there's a candidate key. It's already the smallest that we need. We'll use that as our primary key. And now let's see if we have any other alternate keys. We don't have any good ones aren't too bad. It's fine. who really cares? Oh, we do have some good ones. Oh, yeah. Let's set those an index and make sure they're unique. And we'll use this for some searching later on. That's an example of how you would find a primary key. You don't have to make this manual labor for like 20 hours to figure out which primary key to use. Alright? Then you reference that primary key in other tables foreign key. So that's why these two are the most important foreign and primary, those are the two you should focus on the most and memorize, okay, the alternate candidate super key, and they're kind of important, but not as important. Alright, so let's take a look at primary for. let's just, let's clear this out for a second. Alright, Rob Merritt. Now, there's some categories, dude, I dusted now it's like I can breathe. There's some categories of keys that we can talk about. So the first one was surrogate and natural. So surrogate natural keys. a surrogate key is just a random number that has no real world value, or no real world meaning, just like an ID, user ID, student ID, caller ID classroom ID, store ID, sale ID, record ID, song ID, picture, ID, whatever you want it to be, I don't really care. So long as it's an ID, and it's a surrogate key, then, yeah, yeah. I don't know what I just did there. But anyways, natural something that's already naturally in the database. And you just define it as your key. Now, typically, when you want to use keys in your database, which you will, because it's your relational database, you will pick either surrogate or natural keys, and you will use them for throughout the rest of the database, you're not going to switch from surrogate and natural, you're not going to be like, Oh, this one has a natural key, we'll use natural, but this one doesn't, and we'll make it surrogate. That's a bad design, you should be able to enforce uniqueness by the columns that are already there. And then you can add a surrogate key if that's what you want to do. And if by any chance, you cannot possibly define uniqueness without a surrogate key, you're pretty much required to use the surrogate key. But that situation should try to be avoided. But there's times when you literally cannot find a natural key that will be sufficient. So you guys kind of follow it. I mean, surrogate keys, random numbers, natural keys already in the table. Cool. Simple. Speaking of simple, we'll talk about the simple key. Alright, so we got simple. We got a composite compound. Okay, so simple is basically a one column key, composite and compound, our multiple column keys. We talked about those in the previous video, I believe. Yeah, so that's pretty much all the categories of keys. The when you define your primary or foreign keys, you're not gonna have to specify if it's a surrogate natural, simple composite or compound. This is for our sake, to design the best way, because it's, for example, do you want to use composite keys? Or do you want to use simple keys? That's a good question. If you decide you want to use simple keys, which by the way, surrogate naturals kind of thing you do throughout your entire database, you can kind of switch back and forth from simple and composite. For example, you could have a user's table, and we could have a look, I sorry, students table, a class table, and then we're gonna have a, an enrollment a with the students taking certain classes, we could use a week, I guess we can use a compound for this table. And then we could use simple for these tables. That's something you could do. But you should still use either surrogate or natural, just pick one and use it throughout your database. Hopefully, I'm not talking too fast, because I'm going pretty quickly. If I am just let me know as I see a column A couple inches, come down a couple decibels, because kind of hyper, okay. Now, the other thing we talked about was foreign keys and foreign key constraints. Well, the main kind of foreign keys, the main kind of foreign key constraints, were on update, and on delete, these protect our integrity of our database. You define these, whenever you make a table, when you when you use the CREATE TABLE statement or create or whatever your relational database management system does to create tables, the chances are, you're going to say that you have a foreign key. I'm just gonna put fk for short, and then you have a constraint and you can usually name that constraint. So you can be like blah, blah, blah, and name it whatever. And then we could say on update, or on delete, and the options were the ones I thought you were do nothing or basically thrown air or we could update the children or we could basically get rid of the value for the children. So that would be restrict cascade and set. No. But your relational database management system might use something a little bit different, which is fine. Yeah, that's, that's about it. Yeah. Yeah. That's pretty much it. Cool. Cool. So yeah, that's a summary of keys. So yeah, well, that was a ton of stuff. So there's a couple different names for this. So basically, it's just a standard. It's a standard for drawing databases. And the couple terms you might hear, or acronyms I guess, e AR, model, E, AR, D, or E AR model, or just database design, whichever one's easiest. But basically, these are some terms used to refer to a method used to draw out your entire database structure. These videos have been drawn tables and lines and drama, drawn tables and lines and stuff, and all kinds of cool things like that, but actually haven't officially designed any database. So what we're going to do now that we've learned most of the concepts, the only thing we haven't really covered is normal forms, which we'll be getting into those. But anyways, and I'll make them easy, don't worry. So what we're going to do is we're going to learn the standard for drawing these databases, we're going to practice with that. And then we're going to learn the normal forms. And then we're going to design databases for a lot of stuff oriented. So we're going to be getting into the standards of drawing databases. And obviously, the standards kind of rough. I mean, there's, there's not a specific way, you have to do everything, because that's annoying and dumb. So what we're going to do is we're going to be drawing pretty simple things. And I'll just be kind of illustrating kind of how that works in this video. So what this means is enhanced Entity Relationship model, enhance relationship diagram, enhanced relationship model. Yeah, I think that's what they stand for. I don't think it stands for relation. I think it stands for relationship, which there's a difference, as you should probably know, by now. A relation is a table, a relationship is a connection between two tables. Yeah, so anyways, yeah, so we're gonna be talking about these design techniques are probably this call and like, er, D er, er, er? Or, who knows? But yeah, the way you do it is you define your database structure. So anything that is DDL pretty much data Definition Language part of SQL, we talked about that a long time ago. We're not going to define the actual values. For example, if we have a table for users, we're not going to put in a row. Oh, Caleb curry with the password of sweet pie is delicious. 24. And the email of Yeah, yeah, yeah, yeah. yeah.com and the telephone number of 100 you wish or anything like that. We're not going to put anything like that in there. We're just going to put the columns username, password, email, telephone, three d that is so cool. Okay, anyways, let's get started. So to draw a table, database table, you just draw square, it's like super easy, okay. Well, how he would evolve and so easy, right. And obviously, doing this on a chalkboard is not very convenient, because for one, you're limited in space. And for two, most of you guys don't have chalkboards. And for three, I don't know a third reason. But most of this is actually done on a computer. And we will be doing that hopefully soon. So stick with these concepts we'll be getting into on the computer design for whatever reasons. And obviously, as I make new videos in the future, that's what I'll be doing. So for now, we'll learn the concepts. And you can download Entity Relationship design programs, such as MySQL Workbench, I made other ones like verta bello or something, I don't remember what it's called. And a couple others, which you can design for any relational database management system on that one. So go check out my other videos, or my playlist, I think it's under reviews or something. But I don't know, look up database design on my channel, see what I got. You can also just look up er, er, er D relationship, or design. And you can get all kinds of apps and programs to help you design your databases. And then some of them have what's known as engineering. So you could forward engineer your database design, and it will basically just program your database for you. And it's super easy. Yeah, but I'm getting way off topic. So let's get back on topic. Okay. So here's how we draw a table. What we do is we give it a table name. So you put the table name either right here in the in a complete column like that. or miss right above it. So let's put it up above it right now. So this is a user table. And then what we can do within here is we put the columns this way, vertically, when we write them horizontally, but we list them this way, we're not putting them across here. Because if we put them across here, we will assume that we're going to have row value, like Caleb, password, and his awesome email is legit, or cool, and then the password already set password. But we're not doing that, we're just going to put the columns. So the first column we can have is user ID, because we're just going to use a surrogate primary key in this example. So we're gonna have user ID. Now, when you get into it, you can also put other things like data types in here, and any other restrictions you have, such as not normal if you wanted. But for now, and for most of the design for design concepts, I'm not going to do that, because I'm going to be writing the same thing over and over again. And you kind of only do that when you're finalizing your database. So you can kind of worry about the data types later on. But for now, we're just going to get the columns figure that out. Good. We'll go from there. So user ID, username, the password, and so forth. Hopefully, you guys can read that. Probably not. And then when we do if we want to create another table, guess what we do? We just draw another box, right? And we connect these with lines. And we can, there are certain ways to draw lines that represent different linear relationships. Like that's one way to do it. Or we're gonna have like, yeah, I mean, you get the point. Oh, I found a nice big piece of chalk. Cool. Alright, so that's how you do that. Now, that's pretty much all the basics of entity relationship diagram drawing, just remember, we put these in here, the columns. And what we can also do if we wanted, we could put indexes down here, we could put like index types. And then we could also draw our relationship types with different lines. And we can draw our foreign keys by either putting them down here or doing some other kind of thing. Now, there's also other conventions you might have heard of, or have used, like UML, and other design techniques. And I'm not saying those are bad, I'm just saying this one super easy for beginners and for experts. We actually talked about this a little bit, but we haven't really designed it or anything, because now we're talking about actually drawing out databases. cardinality is basically the relationship type between a row of one table and rows or a row of another table. So I thought of the the best example I have for this is a credit card database. So like, basically, a credit card company has a table for credit cards, and then a table for card holders or the people who order the credit cards and use them to buy things and basically their cut the credit card companies customers, I guess. So I will draw those tables out, just so you can see this, I'm not going to draw them or vague or anything. But just so you see how the connection works. So we have a card holder. And then over here we have the card. And there's some kind of relationship between here between these two tables. Now, the only two possibilities for cardinality are one, or many. Obviously, for both sides, you can have one too many, one, too many, or many, too many, those are the only three possibilities. Now, another thing to think about when it comes to many to many, that might even you might even consider that as not a possibility because you'd have to have an intermediary table, which we talked about that when we talked about designing many to many relationships, when when we're talking about many to many relationships here. It's logical, which which What I mean by that is, we're we're drawing it as a just like this, and we would say many, to many. And then when we actually implement that in a database, we would have to have a cardholder card holding table and then the cards table that would be how to draw a many to many relationship. So hopefully that kind of makes sense for that. So in reality, when you're actually drawing out finalized products of databases, the only possibilities are one to many, or one to many, because even if you have an intermediary table this is good. Going to be a one to many relationship and then a one to many relationship. And there's certain way to draw those out. That's what I'm going to be drawing right now. So I wonder many, it's pretty simple. The one side, all you have is a straight up and down line like this, the many side you have this, this thing that we're doing right now is known as crows foot notation. Because it looks like crows foot, I think that's why they call it that I don't really know. But this is a one, this isn't many. So when we're drawing it, one, many, we could also have one to one, of course. So that's another possibility. I forgot to mention one to one. Now, before we started drawing on both sides, I think it's easy just to think about one side, like think of a cardholder relationship to a card, we would have one card holder. Now that we have that side figured out we got, we just think about the card holder, then we can add in the card, and we have many cards, so one card holder can have many cards. And that's how you would design the line between the tables. Like that. Now there's another thing we need to talk about known as modality, which we'll talk about in the next video. But for this, we're just going to keep it simple, just drawing these things here. But you might see also like circles, you might see a circle here, for example. We'll talk about that in the next video. So now you know how to draw a one to many relationship. So what does this exactly mean? That means one row can be connected with many rows and this table. It's not necessarily the table but the rows within the table. So we can think of a specific card holder, we can have a card holder with the name jack, Id of two. And we can have many rows over here cards, we can add a card with the ID of ID six, Id eight, Id 12. And you can see this guy has a lot of credit cards. So all of these rows can be connected to an individual row within the cardholder table. Now let's look at it the other way. We could have a many to one relationship, or basically a one to many going the other direction. So we could add this. So now one card can be connected to multiple people. But multiple people can be connected to one card, I'm sorry, one card can be connected to multiple people. But one card holder can only be connected to one card. So that means one person cannot own more than one card. So now we think of it the other way around, we have a card. With the ID three, we have a card with the ID for the ID of six doesn't really matter what the numbers are. And then we have a card holder, we just got got a guy named jack, which ID is three. And then we got bill ID two, well, we have we can have one card attached to many people. So this card can be attached to many people, which means these people co own a car, they're both the owner of one card. But now you have to realize an individual cardholder can't be connected to multiple cards. So jack cannot own another car, that's not going to work out. If you wanted that to work out, then you'd have to have an intermediary table, which we talked about and designing many to many relationships. And this video is less to be a review of relationships, but more of how to draw them. So you have to realize the difference between all of these possibilities on one, one to many, or one to many, it just depends on the actual application modality, it's basically whether or not the child is a required child the relationship is required. We talked about that when we talked about the column characteristic. Not No. So, think of it like this, we have a credit card company, we have a table for the cardholder. And then we have a table for the card. Obviously, the card holder, the ID for the card holder, the person who owns the card or or buys the card from the credit card company, they're going to have an ID, and that's going to be a not null column, meaning it has to have a value. That is obvious because it's going to be the primary key surrogate primary key, and it has to have a value. So we have like this guy, and he has an ID of seven, we have another guy with the ID of 12, and then 368, and so forth. Those are ID cc card, that is going to have a column, which is a foreign key, referencing the cardholder because the card is the child in this relationship. So the cardholder has the primary key, you sat down there, if you can, card is going to have a foreign key, which connects to these. So this is going to be a foreign key. Meaning, these are the people that own certain cards. So we have a card with the ID of 12, we have a card, a card with the ID of 48, I'm just making up numbers are here, in a card with the idea of 98, we have a card of the ID of 112. Well, they each have an owner, you see this is the owner. And this is a foreign key, which references the primary key. So seven, this card is going to be owned by that person, seven, this card is going to be owned by that person. Seven, this card is going to be owned by that person. So all of these three cards are owned by that person 368, that's going to be owned by somebody else. Now, if we gave this column right here, the foreign key the basically the cardholder column, because it's a child, if we gave it the column characteristic, not know, it would be required that every single row has an owner. That means we can't have a card in here that is either not activated in doesn't have an owner, or we can't have a card in there that is not being currently possessed by a cardholder. Because you think if you have a credit card, somebody owns that credit card. But when you first make a credit card or something or if you like, if if you get rid of the credit card or something, it might be a credit card that currently does not have an owner. Does that make sense? So basically just a credit card that's not being actively used, or is disabled or something like that, and does not have a cardholder? Well, if we wanted us to if we wanted this database to be able to store cards that don't have owners, we would not want the nominal characteristic, because that's going to prevent it. So that's kind of how that works. So if we take that off, then we could add a card down here with the ID of 101. That doesn't have a card. Well, I know we already explained this all but now let's talk about designing. Let's just make this simple. Pay attention to the Say, say one car, well, this obviously is going we're going to talk about a a one to many relationship, meaning one card holder owns many cards. And we're also going to be talking about a one to one relationship, meaning one card owner owns one card for one too many too many. We just need an intermediary table, the same concept kind of applies. But we don't necessarily have to go through that just break it into a one to many, two to one to many relationships. Alright, so let's try this. So we have over here we have the table. That's going to be the card owner. Over here, we have the table for the cards. And there's a relationship between these two as a really big gap but whatever. So let's start over here. We can say one card holder because that's what we're just going to say one card holder for each time. Let me bring this in so they can actually see. One card holder can have one card. So what we just did is cardinality. We talked about this in the last video. Now let's do another example was the same thing and you'll see why in a second. Let's do another example. And then finally, one more example, run out of room here. So we have the basic relationships. Now we can add one more thing to say if it's nullable or not nullable, basically, does the child. This is the child. Which is the card. We're asking, does it does the foreign key column have the nominal characteristic, that's so we have the possible relationships, one to one, one to many. Now, what we're going to do is we're able to add the modality in and the way we do that is we add either a little circle, or a little bash, or you can think of it as a zero or a one, which actually will make sense. And we'll see why in a second, zero, or one. So, over here, we have the cardholder, one cardholder for all these examples. And then over here, we have the card. zero means the column does not have the not null characteristic, meaning it accepts no value. So this is saying it can accept no value. So we have a one to one relationship, that's the maximum, the card holder can have up to one card, but it's not required. So zero or one cards, one or one, cards, zero, at least zero cards, because you can have up to many, or at least one card. So this means not No. Not No. Now when we design this, we're talking generally from table to table. So we would have card holder, and then card. And why do I have to say that, basically, because if we have, if we have zero here, that means there's not necessarily going to be relationship between a certain row. So when we're talking about rows, we might not actually draw a relationship between certain rows. When we draw relationships, we talk from table to table, and those relationships, explain what rows can have relationships with different rows. So in this example, let's just focus on the top one, just to make it easier, we would draw this within our database application, we would say we have the table cardholder. And then that has some kind of relationship with the table card. Now we could say one card order can have zero up to one relationships with one individual card. So we have a card holder with the ID of seven. And then we have an ID card with the ID of 68 owned by the card holder seven, so that points back and that's acceptable. We can also have a card we got a card 69. And we could have a person eight, and we could say oh it doesn't have an owner, that's fine, we could have another card 70 and it can have the owner eight, which points back to eight. And then we can have another person which could be a seven, we're gonna have a card, we're gonna have a card 63 which the has the owner 87 and that points back to the owner 87. So that's how those relationships work. Now, at first, this can be kind of confusing, so just practice. Make sure you fully understand how each one works. Just review we could have, there's four possibilities we could have. And if the down up there makes more sense. One card holder can have zero or one card, one card holder can have one, only one card because you can have one the least and then up to one and then you have one card holder zero or more than one card holder one or more. These two are required, meaning every single child has to have an owner. Or every single card has to have an owner, every single child has to have a parent. So hopefully that all makes good sense. I just want to introduce a topic. So in the next three or so videos, when we go over it, it's gonna be like kind of not all new to you, you'll kind of have an idea of what's going on. So we're going over what's known as normalization. So, normalization is a process where we go through our, our database plan that we have, and we start correcting things that may cause data integrity problems, or just repeating data, etc, etc, whatever you want to say. The normal forms are basically like a checklist that you're going to follow. And when you get to the end, you will have a pretty good design database. There's three main forms. And these are named, very conveniently, I have to admit, first, normal form. Second, normal form. And guess what the third one is? You guessed it. Third, normal frame. I'm kidding, it's form. That'd be weird, though. Alright, so God, first of all, form second normal form. third normal form. These are the three steps of normalization for databases. Now, if you really, really, really want to get into database design, you just like, man, I love this stuff a guy get more? Well, there's tons of more steps, you can add and make things all complicated if you really want. But for our practical purposes, these are the only three that you're going to really, really need. And I know there's like a glare right here, because these lights, and I know, I'm just working on it. So give me a couple videos, and I'll try to work on my lighting a bit. But for now, first of all form second Normal Form third normal form. Why do we have these? Well, it's because we want a systematic way to follow to produce a normalized or a good structure database. Alright, there's a couple things you should know when it comes to normalization, we still have to talk about the idea of things being atomic, we want everything to be atomic, you also have to kind of think about data depending on other data. So if you think of like, let's say a primary key, yes, 608, just a number that could refer to anything that could be a person, an order an event, whatever it is, is just the primary key. And that also is going to have different columns in the table, which have different attributes about whatever this thing is. So let's say it's a person. And we have a column for the name. And let's say this dude's name is Caleb, what a dumb name, right? This depends upon the primary key. So what I mean by that is, if we change the primary key, we're going to also change the person, right, so let's say we have six or nine, well, this is going to be a completely different person, right? Because they're two separate entities, this one's referring to some guy over here. And this one is referring to some guy over here, two separate entities. You can also think, oh, by the way, that that process or that thing, this is called dependency. So the next three videos will be going over the first normal form, second normal form. And the third normal form. The other main thing is that these are kind of comprehensive, I guess, you could say. So what that means is once you have first normal form, now you can go on to the second normal form, but you can't go jump in right here, you have to have it in first normal form before you can get into second normal form, same thing for second normal form into the third normal form. So that means if you have a database that is in third normal form, it is also in second normal form, and in first normal form. So I know I'm like getting all these technical terms in junk. So if it's freaking me out, don't worry. It's really simple. Guy, get my hair out of my face. Alright. But basically, it's, it's a list of steps. list of things to do. And basically, you can do this step two until you do step one. So this year second normal form, that's step two. Can't do this until you do that. So you got to go through the process and then once you're at third normal form, things are good to go. This video will be over the first normal form, which is the first step in database normalization this normal form deals with data being atomic or atomicity. Alright, so let's begin. Let's first look at three examples where there are problems. And then I'm going to explain a solution. If we take a look at this first example, we have a user ID, first name, last name, email, and address, I have some values in here. But those aren't really important. In this example, the main thing you need to pay attention to is the address, because an address if you think about actually consists of a street address, state, country, and so forth. So this column is not atomic, we're asking for multiple things within one column. So that is the problem in this example, the solution for this one would be to break it up into multiple columns, you know, a state, and so forth. That way, we don't have a bunch of things within one column as an individual value. This example, you can see, we have one row with a user ID, first name, last name, and email. But if you look closely, we have two emails, these emails were entered together as one individual value. So you can think of maybe on a website or ask you for your email, imagine putting one email in and then putting your second email in the same box and clicking Submit. Well, it's entered into the database as a single value, maybe with the comma or something, but it's still entered in as the email for this user. So two values within one, which is not atomic, in this example, the actual values given for the email are atomic, because they're only one email, but we put two entries within our database. So now we have this in here twice, this in here twice, and it's in here twice, and the primary key should never be in there twice. The only time you can see the primary key in there twice is if it's a reference as a foreign key, because the foreign key can be put in there multiple times. But the primary key of five should only be in there one time. So we have all that duplicate data. Now what if this Caleb curry goes and gets a name change? Because that's such a dumb name right now. Just kidding. That's, that's my name. And I actually like that name. But anyways, if he gets a name change, and he updates his information, what if only this one's updated. So now we have conflicting data. And see that's the problem with data integrity, we do make sure that doesn't happen. So in this example, two column is the problem. And this example, the values for the row is the problem. And in this example, the values for the row is the problem. So these basically have the same problem. How do we figure out the solution? Let's take over here, let's take a look over here. And as you can see, there is no solution. So yeah, guys, thanks for watching. I'm just kidding, let's, let's do this, the best solution for this problem is to have a user table. And I'm going to write the columns that are within this user table. So we would have a user ID, then we are going to have a first name. And we're going to have a last name. If you wanted the address in there, you could have street address, state country, all that other good stuff, then we're also going to have an email table. So the solution is to break off the problem column. You can see in this example, the email is the thing causing the problem, the address that we had earlier, that's fixable, if it's a problem in the column, we could just break it up into multiple columns. So we wouldn't have to make a new table for that. But since this one, even though the columns correct, it's the values within it that's causing the problem, we would need to create a new table for that. So that's why we have this email table. And now within here, we're going to have an email id. And then we're going to have a email, which is going to be the actual email address, which in this case will be these. And then we're going to have a user ID. Now this is going to be a foreign key. That references this user ID, the primary key of the user table. Now if we wanted to put two individual emails, we could have rows within this email table, let's let's spread out the email table and see what it might look like. Here's what the email table might look like. We would have an email ID which is just a randomly generated number for each email 600 dates fine. And then the email that's going to be the actual address of the email. So we could have this I try to make it just as bad as handwriting. Perfect, then the user ID of who owns that, which in this case would have been five, because five was the ID that was used for this user, this will be your first row, then we're gonna have another row, and it'll be a new email. So we'd have a new email id, it'd be a different email, we would use this one this time. And I'm going to make that beautiful handwriting you guys love. There we go. And then the user ID for that would also be five. Now this repeating value is okay. This, this repeat right here. This is okay, because this is a foreign key. And multiple emails are allowed in this situation. This email id, this is not not okay to repeat, because each individual number should be a separate email because this is the primary key of the email table. Now, when we look back at the user table, we don't have any repeating data, because we're only going to have a user ID such as five first name, Caleb, and the last name, curry. And there's no need to make a repeating row. Because we have all the information we need without having any problems. If we change the name, we will just update this keep the same ID change this to whatever we name our name ourselves, or whatever the person names themselves. So there you go, everybody, that's first normal format. Now, second, normal form deals, deals with what's known as a partial dependency, that's when a column only depends on part of the primary key. So in order for it to depend on only part of it, you have to have a compound or composite key. So basically, the primary key has to be multiple columns. Because if you just have one column as the primary key, a column can partially depend on half of it. That kind of makes sense. I don't know, maybe not. Let's say we just have a random table for people. So we have you know, the person. So let's go with let's go. Let's just use a person ID. And we have some attributes about this person, we have their name, you know, maybe their phone, or their pH level, and maybe their email. Let's first talk about what a dependency is, before we go and talk about partial dependencies. These columns depend upon the primary key. So in this case, this is the primary key. And there's a dependency here. Another way you can think about is if we have a another table about cars, we can have a car ID, which is also a surrogate key. Now this person's name, it does not have a dependency here, you can see this name is only dependent on the person ID not upon the car. So this doesn't make any sense here. Another thing is that the person ID doesn't have an unrelated entity or an unrelated attribute about the car ID so such as the car color. The car color has a dependency on the car, not the person, this isn't going to make any sense. That's what a dependency is. Now, what is a partial dependency. So let's see what a partial dependency looks like using surrogate keys, you'll see this when you have a many to many relationship, broken up with an intermediary table, which is the correct way to design a many to many relationship. So let's think of the example of books and authors because if you think about it, one author can write many books and one book can be written by many authors. So conceptually, it's a many to many relationship, we're going to store that in the database is one to many relationship on one side, and then one to many relationship on the other side. So we get this look at the table, intermediary table, and then a table. So over here, let's put the author's so this is the author. Then over here we have the book. So this is the book table. And then in between we have the intermediate intermediary table of book authors, or book author. So this is the correct way to design it. Now when it comes to attributes, all the attributes about The author are going to go over here, all the attributes about the book are going to go over here, and all the attributes that have to do with both the app, the book, and the author's connected, those will go in this table. So first, let's give them some surrogate keys will give us one author ID, we'll give this one a book ID. And this one's going to have two foreign keys of the author ID, and the book ID. And those foreign keys together will be the key for this table. So that's kind of how you would set this up. Now things about the author go over here. So you know, their first name, maybe their last name, maybe their birth date. Whatever you want to put about the author, the book would have stuff about, you know, the ISDN, which is, the ISDN is the code on the back of a book, you can look up, it tries to uniquely define that book. So the exact addition and everything else, so you can just use that to define it. You know, we will make maybe have the page numbers or the publisher, the publisher could be a primary key or a foreign key to another publisher table, or you could just have it in that table. If, if that's how it worked, but likely to go to another table. But we're going off topic. Anyways, it looks something like this. So author information goes over here, book information goes over here. This is the correct way to design this table. Because when it comes to dependency, the first name of the author has only to do with the author. And it has to do all about the author. Now let's look at things that have to do with both the book and the author, we could have something such as the author position. Now what that is, when when you write books, often there's a person who has like the first author position, which is what the big name is. And then you have the second and the third, and the fourth, people often compete to try to get first author, this has to do with both the book and the author. Because if you think about it, if we put author position here, that's going to depend on what book we're talking about, we can't just put author position one, and say he's first on every single book he ever is going to write, that doesn't make sense. We can't just say author position one on the book, because it doesn't say which author we're talking about which authors one, it doesn't make sense. That's why we have to have it in this table, because it has to do with the book and the author. So we could say book ID is 17. Author ID is 22. And the author position is one. So that would say the book with the ID of 17. And the author with the ID of 22 is in the second or the first position on that book. It's a little complicated, but we combine that with joins to make it make more sense for the actual viewing of the data. But anyways, this relies on the book and the author ID, that's why it's in this table, this is correct. Now an incorrect thing would be something like the ISDN. Because the ISDN has to do with the book only. So when you look at this, it relies upon the book ID it has a dependency on the book ID, but it doesn't have a dependency on the author ID. This is an example of a partial dependency. Now the correct way to fix this, in this case would just to be take isbm and put it in the book table, which we obviously already did, because I put that in there first when we started. But if we didn't have this table and you're working with to say like one table, maybe well, then the correct way to do it is to take the book ID the iasb and drag them to another table, and then use a foreign key to connect to that table, which we already have it structured correctly because we understood how to design that many to many relationship, which is why relationships are useful to understand what if you don't have it already set up correctly, you'll have to take the partial dependency and move it to another table and reference it with a foreign key for a second normal form, you want to first be in first normal form. And second, remove all partial dependencies by moving the columns as we did here we took that is being put in the correct table. Now another thing you can think about is if you have a table where there's a primary key of only one column, you're already in second normal form for that, like imagine it for this. How can birthday be dependent on only part of the primary key which is author ID because there's only one you can't you can't depend on only part of An individual column. Thermal form in order to get there, you have to first do first normal form, and second normal form. So it's kind of like a ladder, you can't just start a third normal form, you have to work your way up there. So this one deals with what's known as a transitive dependency, which is when a column depends upon a column, which depends upon the primary key. So I know that's kind of like confusing. So let me just kind of draw it out on a table, make a lot more sense. So here we have a review table. As you can see, I'm kidding, you can't see it's just a square. So we could say, review. So like, this could be on a website, it could be, you know, like a shopping website where you review a product, or it can be a review on a video or anything really, it could be review on a book, most likely, it'll be on a shopping website, where you buy something, whether it be a book or anything, or cars or whatever, it doesn't matter. But in this, you would have something like a review ID, obviously, which will be the primary key, you would also likely have what that review is, like in a text form. So like a comment. So you would have the actual review. And then you would have maybe a star rating, or multiple star ratings. So you could have a star rating, so maybe one through five, one being the worst five being awesome. And then for maybe being like really sort of Okay, you know, and then we might have something that has a star star, meaning. So that would be like a sort of okay, right. And finally, we might have something like either a user or the poster, I'll just put the user because likely, you will have some kind of user account on this website where you can post comments. So this will likely be a user ID. And it will be a foreign key to a user table. But we're not going to put that in here right now. Because that's unrelated to the actual point. So foreign key primary key. Now, let's look at right here we have an issue. This is what's known as a transitive dependency, because the star meaning depends upon the star. And a way you can kind of try and figure out if it depends on it is, okay, well, if we change the star is the star meaning going to change? Well, if we take the star, and we change it from a four to a three, and might go for, it might go from really super okay to really average. So the star meaning is going to change. So that's a dependency. But then the star actually depends upon the actual review ID. Because if you change the actual review, that star is able to change. So this is known as a transitive dependency, and it is bad. Now you can also have a multiple transitive dependency, which I don't even know what those are called. But that would be where I call them depends upon a column, which depends upon a column, which depends upon a column, which depends upon the actual primary key, it can go on forever, for as long as you want. But you don't want it because it's bad, which lever going to remove. And so to remove this, you take the columns that are causing a problem, you move them to a separate table, and then you create a foreign key and replace of them which references the actual other column. So let's do this. So since these columns right here are the ones causing the main problem, we take them, we put them in their own table, we have a star table, within the star table, we're going to have a star I make. And then we're going to have the actual star, whether it be 12345, or whatever kind of rating system you're using. And then we're going to have the star meaning. And then over here, we're going to replace these columns, let me make this a little prettier. And I just made it five times worse. But we're gonna replace that with a star ID, which is going to reference this table over here. So now we don't have that problem, because the only column over here is star ID, we don't have the star meaning, which was the real column that was causing the problem. Now over here, we could have if we drew out this table, we would have you know, we could have a star ID of three. And that star could be a three star rating. And it could be meaning it could mean Average. Now, we're likely to only have five individual rows in this table. So it's kind of like a reference table. So over here, we can have a store ID of 12345, or 01234, or 56789, it doesn't really matter what the ID is, because the ID is just a number used by the database that has no real world value. So we'd have five rows in here, and then we would reference them in the reviews table. So we could have, okay, this is too ugly, let's let me just explain it, we can have one review with the star rating ID of five, which could correlate to a certain rating from this ratings table. That's the main idea of the third normal form, is to take those transitive dependencies, move them to a new table, create a foreign key to reference that table. So under view, first normal form is making everything atomic second normal form is removing any partial dependencies. And then third normal form is removing any transitive dependencies. This video, we will be discussing indexes, the best way to think about it is to take a book, look in the back, there is often an index here, which basically has a huge list of topics. And it says what pages are on. So you'd be like, oh, wow, that looks fun queries, go to page 2078. Wow, it's right there, I found it so fast, I have to go search through the entire book to find it. That's an example of an index. It's basically a list of where certain data points are another kind of indexes, you can think of a phonebook, you go to a phonebook, and you search through the phonebook, you find someone's name, G, we go to Grandma, and it'll have the phone number right there. That's another type of index, it's where the data is sorted in a way that you can easily find it and the data is right there, versus an index in the back, or it's a list of data pointing to where it's at. The one we first discussed, is known as a non clustered index. What that means is that the data is not actually where the index is, the index is a separate thing. And it basically just sorts the data. And it's basically a point that tells you how to get to the data. So the way it works in a book is you find the topic, and it'll tell you where to go 365, for example, go to page 365. That's a non clustered index. A clustered index is like the phonebook, where it actually reorganizes the actual data in a way that's easy to use non clustered, points to the data clustered organizes the actual data. So we can have multiple non clustered indexes. That's because a non clustered index is basically just a list of references that point into the data. So we could organize it in different ways. But as for the clustered index, we can only have one of those. That's because it actually organizes the data that way. So think of a phone book, it's organized a to z, or however, you go through there and you find it. That's a clustered index, it organizes the data. Well, what if I wanted to put the data in by the phone number, so the smallest phone number like 0000001, which would be an awful way to organize a phonebook at the beginning, and then 99999989 at the end, that's an awful way to design a phonebook, like I just said, but that is another example of a clustered index. That's because it actually reorganizes the data. That's why you can only have one of those because a phonebook can't be listed A to Z, and also, by the size of the phone number, see what I mean. It's conflicting, you can have both at the same time, you can only have, okay, you can only have, okay, this is sorted by name, or Oh, this is sorted by phone number. You can't have both. Now, you could make the phone number size or whatever, as a non clustered index, where it would be like this, to have the actual phone numbers in the back, and it would point to where it's located. So at the beginning of the index, it would be like, okay, the phone number that's 0000000001 would be on page 743. The phone number with 0000000002 is on page 242. You see that the actual data isn't organized. By the, the phone numbers, but you can use that non clustered index that sorts it that way, and has a reference to the actual data where it's located. Now, you guys are probably sick and tired of me rambling about this. So let me just explain how this makes sense in databases. Well, when we're working, we're working with a database, often, we will use a certain column for certain things. And that column will be used frequently. And we want that to be like super fast. So we want the database to know how to use it the best way. So we create an index, the database understands the index so that way, when you tell it to do something, it can do it faster. Rather than having to go through all of the data, which is called a table scan, when you're working with a database with millions of rows that can take forever. If you have an index, it'll use an index seek, which will basically know where to start searching for that data. That's awesome, because it makes our queries tons faster, and saves a lot of resources. There's downsides to it, though, because when you create an index, not only does the actual table have to update whenever you update it, but you also have to update the indexes. Think of like a book, when you add something to this book, or you change something within this book, not only does the actual book content in the beginning of the book change, but also when you flip to the back, the index must also be updated. Otherwise, all this index information is going to be out of date and useless. So that's a downside to indexes is when you update your information, it might take a little longer in a database query, we might say something like select the first name, last name, and the phone number and the email of the user with the user ID of 72. For example, well, I kind of said that in English, but when you're actually working with a database, it might be more like select first name, last name, email, phone number, where user ID equals 72. That would be an example of a database query. Now the where clause is using the primary key. So basically, the database is going to go through the table, find the person with the user ID of 72, and then display their first name, last name, phone number, an email. Well, that where is using a column that has an index, the primary key is usually the one that's going to have the clustered index. So that's the way it actually sorts the data. So now, it doesn't have to go through the entire entire entire entire table to find the person with the ID of 72. Because imagine if, if it was just random data thrown in there with no organization, you have 2 billion rows, right? And the database has to go through and find the one individual row with the user ID of 72. Go to the first one, nope, not 72. Second one, nope, not 72. There, nope, not 72. And keep going on, on non non to finally found the person with the user ID of 72. Basically, that will take ever with a clustered index, it knows well, 72 is going to be in this part of the database, it comes after 60 and comes for 80. Because that's the way it's sorted. So it's gonna be like, boom, got done. That is how indexes kind of work. So clustered indexes are going to be faster and better. But you can only have one, that's usually the primary key. But it doesn't always have to be depending on the database, and what your goal for the databases, the non clustered indexes, which are like the index and the back of a book, those are still good. And they are recommended, if you're doing something where you're like, where first name equals john. Now, it has a list of all the John's and where they're located on the database are on the table, I guess, that's also a good thing. But you don't want to have an index on something you're not going to be using a lot, because you'll just have to have another thing updated. Another thing you don't often want to do where clauses, which is where like where first name is Caleb, without that column that you're searching, being a indexed column, because it can take forever. Another reason for indexes is to increase the speed of joins. So the way the join works is it takes data from one table and another table and combines it together by the primary key and foreign key connection. So think of a users, users and comments. So a user post a comment. So you'd have a user table. And you would have a comment table. And let's just draw a rose on here. So we have this guy I'm just gonna draw pictures. This because it's funner. So we have three people over here, we have four comments on a website or review, or it doesn't really matter. Well, this one comment is posted by one person and one person can posts multiple comments, it's a one to many relationship. Well, when we want to output this to where we can review the data, we want to use a join. So it's not across multiple tables. Because if you're thinking like, this comment was posted by the user with the user ID of 72, you're not going to know who that is right off the top of your head, because that's a foreign key connection. So rather, you want to replace that 72 with the actual name of that person or the username. That's a join. And it's all done after the database is created. It's all for the query side. So basically, this is what is defined in our database. And we want to output with a join a new table that looks more prettier and easier to read. When we do this, we're going to do something that says, the user ID of this call, the user ID of this table, is the same as the user ID in this table. So if the use if a comment has the user ID of 72, we know that it's done by the user with the user ID of 72. That's the foreign key connection with which we've talked about like some tire series, so I'm not going to explain that anymore. But now down here, in this table, it's going to say, Caleb curry, then oh, put the comment, you know, join that. Well, this is also done with a where clause, in some cases, for different databases of the terms might be a little different. But basically, that column that you're connecting by, which in this case would be the primary key needs to be indexed to make it faster. So whenever you're joining a certain column, the two tables, the column you're joining should be indexed. You can also have something that's known as a composite index, which is an index that is an index on two or more columns. So think of like a first name, and a last name, or last name, first name. Now, the thing about these is when you use the indexes, so let's say you select where first name equals this and last name equals that you have to do them both, or it's not going to use that index. Now, there are exceptions to this, think it for MySQL, actually, if you order them in a certain way, like like this, you put an index on the last name, and then also the first name, you can, you can do a where clause for both the last name and first name. So finding everybody with the last name, curry and the first name, Caleb would be really fast. But the way this is set up, you can also do the leftmost one, too. So you can say, find the person with the last name, Curry, that's going to work. But in this case, you couldn't do it with the first name, because it's not on the left, that's just the way MySQL is set up. So different database systems might have it set up differently, where you can search with them individually, or you can. But most likely, when you make a composite index, you should expect to always use that index by itself, unless you have it set up to where you make a pretty awesome index that can be optimized to do multiple different queries that you need to do like if I'm going to make a ton of queries where I put the last name as the where, and then I'm going to make a ton of queries where I put last name and first name for the where this will be a good way to set that up. Because I'm making one index index for last name, first name that can also be used for just last name. So yeah, that's just the basic introduction do indexes. Yeah, I'm not going to be giving you tons of syntax actually making them because like, you should know by now, this is a database design course, we're just trying to learn how to design our indexes on our tables, so that our database works good. So that's something you may have to learn more once you actually program a database, because you can test the speed of the actual indexes to see this for speed optimization to make the database as fast as possible. You can't really do that. If you're only designing a database, you can only get so far when designing indexes with the database, but you can get pretty far because I mean, obviously, you'll likely be using the primary key for where clauses and joins. So that's something that's good do avid clustered index on. What a data type is, is when you create a column, every single value in that column has to be of a certain data type. So you need to tell that column when you create the database, what data type it is to be. Also, when you're designing, you usually put the data types there as Well, so that way, you have an idea of how your database is going to work. So there's three main categories of data types. And the naming conventions, or the names of these data types from one database management system to another database management system might vary a little bit. But in general, they're organized into three main kinds. That's date, numeric, and string. So basically, we're going to be going over these in a little bit of depth, we're not going to go super in depth, because right now you just need to understand what these are. And just understand that your database application might be slightly different from others. Let's start from the bottom up just to make things fun string, a string is any characters or like letters. So basically, you put these within quotes. So hey, this is an example of a string. Even a number within quotes can be used as a string, in this case, this will be used as a string, basically, the the three doesn't have any mathematical value unless it's converted to numeric. Within string, you often have some subcategories, you will have something that's either char or var char, which does this mean character or variable character. What that means is, it's just like a list of characters, the character or chart, usually is stuff that's set in length, so like a phone number, or a zip code, something that doesn't change, size wise. So basically, with a char, you'll have something like, sharp eight, and every single value within that char column is going to be stored as a individual characters. Now even if you don't put eight characters, let's say you put six, then there's going to be two blank characters in there. So they're all stored as eight. var. char, on the other hand, is variable, basically, meaning that the size can change, you can have anywhere from zero up to eight. There's also often the subcategory or sub data type of text, which is often used for larger things such as comments, messages, and so forth. That's basic summary of the string data types. As for numeric data types, these are numbers. That's different than this. This is a string that has a number in it that's used differently than the numeric data type. The numeric data type doesn't have quotes. It's just the numbers that you're using. Just like string, there's often some subcategories of decimal, and then floating point or double decimal works in base 10. Now, I know we haven't really gotten much into basis. And I'm just gonna put date back up here. So we haven't got there yet. But I think it's really important to know and in fact, I've made a lot of videos over binary, hexadecimal and stuff, basically different numbering systems that use different bases. Well, decimal uses base 10 is the number system we're used to counting, you know, 0123456789, and then it goes back to 10 1112 1314. It's base 10. That's what deaths it means in decimal. Well, there's also binary. So we have decimal and then we have binary decimals, base 10, binaries, and base two. So binary, we only have the options of zero, and one. Now, you don't really need to understand how to like calculate in binary. But if you want to know that I do have videos for that, but just know when we're working with decimal is working with a base 10 system. When we're working with floating points. We're working with binary, which is stored differently. It's stored in binary, and then you can have it presented as decimal for math or whatever. So what's really the difference? Do you use decimal or do you use floating points? It just kind of really depends. decimal generally is more accurate on mathematics to a point. There are problems that can come up with decimal, but you'll just have to research those. But basically, this one has some errors binary it has problems with certain mathematical equations or numbers. For example, when we want to store 1/4, it's stored like this, we have 0.01 The way this kind of works is we have anything after the point that divides the number, basically, in half, so we would have half, but we don't have, so we have half of a half 1/4 kind of makes sense. So the place values would be, this one right here would be, we'd have half, this place value would be 1/4. And then anything after that would be 1/8. And anything after that would be 116. So right now we're where we have it, we would take one half, multiply it by zero, to get zero, because we don't have anything there. We take 1/4, multiply it by one, since we have it there, we'd get 1400. We add this up, we get 1/4. Basic adding and multiplying to understand how it works. But when we have a number such as point one, and decimal, that is basically impossible to store in binary, simply because we're working in a base two system. And we can't get to point one very easily. Because 1/8, that doesn't quite cut it, you have to basically have a combination of a bunch of different fractions to get as close as possible to point one, but you can never really get there perfectly. So that's problem with the floating points. So if you needed to source something like that, you would if it needs to have accuracy, you would have to have it stored as a decimal, which stores it differently. Sorry, if it completely overwhelmed you with that number stuff. And if it did, don't even worry about it. That was a little too fast. You need it slower, please check out my introduction to binary video. It's a super long video where I basically explain it step by step. So you understand. And then I have a couple other videos adding and subtracting. And also fractions. in binary, you will also have an integer type, which will only store numbers that don't have any decimals after them. So only whole numbers like 512 28, negative 52, so forth with numbers, we also have to worry about whether they are unsigned or signed. If they are unsigned numbers, that means that they can't be negative. If they're signed, that means it can be positive or negative. probably wondering why would you ever do that? Well, if you're working with something that doesn't go negative, well, then you wouldn't need to have that negative. When you have a signed number and it can be positive or negative, the range of potential values stays the same. But since you include negatives, the maximum value is different. It's not as high think about if you have a range from zero to 1000. But then you include negatives, now you have from negative 500 to 500. That is going to be told with the data type. So you might say something like, double, unsigned, that would work as a number. That's enough for numbers. I don't want to talk about this anymore. Let's talk about dates, dates, well, there's a couple different things you need to know about dates, there's date, then there's time, there's date, time, and then there's timestamp. So let's talk about date time first, because it makes sense the best. It basically just stores a date such as you know, oh two for February, then you might have a colon you might have 22 for the date and 22nd and then a colon maybe 1998 for 1998. And the year, then you have the time, which could be whatever it could be 12 3022, which would be the seconds and you might even be would have some milliseconds or microseconds. That's how you would store date time. The way it actually looks might vary from database management system to database management system. Some might have colons between the numbers, some some might have hyphens or commas or whatever, doesn't really matter. I don't need to think about that. Right now, I just want you to think about the date being a date. And the time being a time, date, time being the combination of a date and a time. That's pretty much that for dates. The only other thing is the timestamp. The timestamp is an exact moment in time that's recorded to establish when something was created or done. It can be in milliseconds or it can be in the month, the day and then the second hour and then the seconds and then the milliseconds. Or it can just be a number of seconds, so forth. It varies. But anyways, that's going to give you a number when something was done. It's usually for something like when an account was created. Either when a transaction was done or when something happened, when you update a column, that timestamp will often update. And that's going to keep it to like when that row was entered into the database. So if you enter a new row, and one of the columns is a timestamp, it's going to give you a value as soon as you submit that, and it's put into the database, and timestamps created. And it says, Okay, this row was entered at this time. That's the main thing with timestamps. So basically, we got three types of data types, we got string, snow, numeric, and then date. All of those classifications have different data types for each one. But in general, as long as you know, like, the gist of what is what, you'll be fine, you don't have to worry too much. Because when you get an actually programming a database, you will have more specifics on what data types to use. So this video, we'll be talking about joins. And in fact, the next whole group of videos will be about joins. So it's going to be a lot. Now, joins are a complex subject there, they can be confusing and tricky, and very syntactical, I don't know if that's a word. But basically, the syntax from relational database management to another relational database management system can be confusing and complex. But rather than diving in and just learning all that stuff, we're going to be talking about the concepts of what the types of joins are. And I know you might not even know what a join is yet. So you're probably thinking, I don't know what you're talking about. Well, that's what we're talking about today. So thank you for joining me, let's begin. So a join is when you take something that's structured in a database in a confusing, non user friendly way, and you're presenting it in a user friendly way. Now this is done so that way, we can store our data in the database in a way that's the best structured, to where we protect our data integrity, and it's normalized and separated across tables with the proper relationships and data types and keys and indexes and everything. But when we actually want to present that data, for example, on a webpage, or in a program, or to the analyzer person who goes in analyzes, and does calculations on our data, or just views those calculations after the database does all the work with the analyzation. Well, that end result is often in a way, it looks in a way that is organized and structured, it makes sense to us. That's the purpose of a join. So it takes a mess, and puts it out in a way that looks beautiful. And this can be done over multiple tables. So for example, we're gonna have a table here, a table here, and a table here. And these could all be three different tables, with relationships between the tables, we all get the specifics of what kind of tables but basically, a join is going to take all of this. And it's going to present us with a generated table, that is more pretty. It's more beautiful, and more structured. And it makes more sense. Now, the way this is done, is using foreign key primary key connections. So when we have these relationships between these tables, where I drew these arrows here, that represents a foreign key connection, that references a primary key. So one of these tables is the parent and one of them's the child, and we want the end result and combine them together into one. Now, when you do this, these columns that are connected are going to be indexed to make it much faster. Now, when you join tables, often you will replace names of things with more user friendly names. So for example, we could have you know, let's say we had a comment on a website. And it says, Wow, then and then this is posted by Caleb. Well, you can see that if you have a username here, it might actually be a user ID in a comment table. So for example, we can say, This appears the comment table. This is the user table. And there is a foreign key from the user ID referencing the user ID in the user table. Well, we when we present that data on a database, we will want to replace that user ID with the actual username. So rather than having you know we can have it to where it's like User ID of 4582. And then you're like, Oh, I don't really know what that means. Well, since this is all within the comments table, now we're going to take information from the user table, and put that username instead of the user ID. That is how a join works conceptually. That's another important thing. All this, think of it conceptually, these videos are going to be helping you understand the concepts of the different kinds of joins. And throughout it, I'll be giving you some examples of how you would type that in SQL, maybe, but I'm not going to be getting into the specifics. That's because joins are done differently from database management system to database management system. So you just need to know is how the joins work, and the expected results. And then when you start working with MySQL, or SQL Server, or Oracle database or whatever else, then you can just figure out how to do their joins, you will already know what the results are going to be because you understand the joints. And then you just type it out. And boom, you got to join there. Now there's one other thing to know that's important, is all of this is data manipulation. Most of what we've been talking about before is data definition. So that was in basically DDL. But now we're going on to DML. So DDL data Definition Language, and data manipulation, language, these are two parts of SQL, we all we all talked about this and the older videos, but we're on like, way farther now. So you guys might have forgot this. But this is important to know. Because now we already have the definition, the structure of the database, the database might already be completely designed and structured with the columns and the rows. But now, we're manipulating that data to look a certain way. So by doing joins, you're not actually changing the structure of the database, you are changing the presentation of that database, that is extremely important to know. And you may be wondering, why exactly do I need to know this, because if I'm a database designer, or whatever, really, all I need to do is design the structure of the database, and then the database administrator or the software engineer person, or whoever can go in there and use my database that I designed and created. And they can do all the data manipulation stuff to fit their application, right? Well, maybe, but not likely. That's because the joins are also very important to understand. Because when you design your database, you need to think of it in a user friendly way. And then normalize it, make it confusing, and then present it in a user friendly way. It's kind of confusing, because you start with let's say you are really bad at database design. And you're structuring a table, and you have it to where it's like comments comment table. And within this, you have a user name. And then you have the comment. And then maybe the blog post or video, whatever, it doesn't really matter. You can have some IDs in there. Basically, you don't know what you're doing. And you put the username, what is Caleb See? Comment lame. There got another one from Caleb See? Cool. Caleb see off. When you look at this, and you show it to your grandma, she understands it. She's like, okay, she might not understand it, but she gets if you explained it, she might. Caleb c posted the comment lane. Caleb c also posted the comment cool. Caleb also posted the comment upscales. He also posted the comment, awesome. But this isn't the best way to do this because we learned about data integrity, and also repeating data because now we have this in here three times. So instead, we use a user ID and get rid of Caleb C. And we get rid of the username. We have a user ID. And then in here we have the user ID six. Let's not use that number 777. And then over here we have a user table with the user ID column, which has all have our user IDs for each individual person. So we started off with this friendly design. But in reality, it doesn't work because it's not normalized, there's repeating data. But it looks pretty. Once we have that, then we can break it up and normalize it. So that way, we have multiple tables. So we have this table. And then we have this table over here. But that's not the only thing we do. The next thing we have to do is you have to take up those pieces of the puzzle, and put them back together in a join to get the final result. So we basically want to replicate the original table we had here by using a join of our normalized database, and recreating it. So what we're going to do is we're going to take the user ID column, and we're going to take the user ID, you guys see that all all good, it's everything. This This will likely also have a username column, so I kind of ran out of room here, but username. And that username, user ID seven will be connected to Caleb See, using a join, we are going to connect these two tables by basically using this foreign key which points to seven and finding the associated username with that user ID. And our join will look like this. username, comment. And the name of the table can be what mean whatever you want, it can be generated table. Or it doesn't even need a name, if you're just putting it on a website or something. And then under the username column, it'll be Caleb See, and then Caleb See, and then Caleb see. And then under the comment, you're going to have lame and cool and awesome. But this wasn't just like this to begin with, we got the username from this table. And we got the comment from this table. So not only do you have to design the structure of the database, but you have to structure the joins, to basically put our database back into simple language or grandma readable tables. Okay? That that's basically what we want to do. So let's say you get a job, or you're working for a developer or something, and they say, okay, on my website, I want a table on this Users page, where I can sign in with my username, password, and I can see their username, their password, their email, their date, they signed up and everything like that. And if I want, I can go in there and edit it. That way. They don't have to go in and type SQL, you're you're going to have that programmed in the in the actual application to send that SQL. But we need to structure the tables, we need to make a join, that's going to be able to combine those all to one pretty table, which that's the job of the database designer, likely. I mean, if the applications or the database is huge, you might have someone who does the, the designing, and then someone who does the implementation, and then someone who does the joins and then someone who does this other junk. But I mean, if you're just working on your own, and you got to do everything, you're also gonna have to design the joints. And I highly, highly, highly recommend you do that before you go in and programming your application. And then you're like, Oh, crap, I don't want to do this, or Oh, my database is structured wrong. I want to do this all beforehand. Don't rush into things because then you're going to be like, dude, dawg, my application ain't working. Right. And the reason is, because you didn't take the time to structure your database the right way, and develop your joins. Okay, once you got that, you're good to go. And yeah, so there are a couple different kinds of joins. There's Inner Joins, outer joins, cross joins, unions, there's all kinds of different things that we're going to be talking about. Okay, so, when energyne. The way an inner join works, is it's going to take one table and another table. And then wherever there is rows from this table and this table that are connected, it's going to present them in a new generated table. So let's just break this down and see how it works. Let's go with an example. Because you gotta use examples. We're gonna have a customer table We're also going to have a card table. So think of like a bank, for example, you're going to have a customer table where you're going to put all your customers in there. And then you have a card table where you're going to put every single card that's been given out, or will be given out to people who have credit cards, where you could make to where you can even include debit cards in there, but we'll just worry about credit cards for now. So basically, we're going to draw this out. And we're not going to have it required that a card has a customer, a customer has a card. So neither of these columns are going to be labeled as not null. So basically, the card does not need to have a customer. And a customer does not need to have a card. So we'd have one table over here. And another table over here. And generally this will be designed as a one to many relationship, because one person can have many cards. And usually one card is owned by one person, you could design it as a many to many relationship where you can join a card or multiple people can own a card. But yeah, you can modify the examples as needed. So let's say this is the customer, customer. And this is the card table. Now we might have something like of course, we have an ID for the customer. So we're gonna have the customer ID, you know, their first name, last name, their address, their date of birth, all their personal information that we need for their data for the bank. That's where that's going to go. Over here, we're going to have a card ID. And then we're also going to have a customer ID. And because this is a one to many relationship, we have the child and the parent. If this is a one to one relationship, we could have it to where there is also a card ID and the customer table if we wanted. But I'm not going to do that. And we can get into one to one relationship design best practices in another video. But for now, the customer ID is going to reference this customer ID. and here we can also have more information about the car, we could have the the total amount or the limit max amount, I guess I can say, max amount, you can have monthly bill. And because this is a really simplified example, we don't really have to worry about other tables such as, you know, bills and invoices, I mean, and all that other junk, we're just keeping it simple here. And let's say this is how we structured our database. And we want to make it more user friendly with a join to where we join information across the table. Because we don't want to have to have the customer ID for the card. So we create a join. And we make this new generated table. I don't know why I made that like circle, but and within here, we could say we could say it's the card customer. For example, it doesn't really matter what we give the title of the actual table. What we're going to do in here is what we decide which columns we want to take. Well, the first thing we we always start with the ID because that's how things are going to be joined. And we don't necessarily have to return that Id within our table. But we need to think about it because that's how we connect the tables. So generally the the ID is just for the databases sake, it's usually not something you use outside of the database, it doesn't have any real world value. So we usually don't need to record it in the join for a view the application or whatever we're using this for. Now, if you're the database administrator and you want some more presentable tables like this, but you want to keep the IDS in there, that's fine too. But we're not going to we're not going to return the customer ID. There's no need to it's just a number. But since it's in both tables, if we did join it down here, it would only be in one column. So we would have you know, cost a customer ID. And that's going to take the combination of these. We're gonna have the customer id like let's say 62 and then card. I Be like seven and then customer ID 62 and current ID eight. So that's kind of how that would work. But we're not going to return those. So what we're going to do is we're going to return just the first name, last name. And we're also going to return the Let's add another column amount paid. So basically, it's saying like how far they paid off their credit card bill, let's say you borrowed 10,000, and you owe 6000, you've paid back 4000. So far, just to see how far people are getting to paying off their bills. And like I said, this is very simplified example, you can make it to where it gives a percentage. And you could have more tables for bills and invoices and all that junk. But just so you guys understand the joint concept, it works the same for basically any columns you use. So within here, we could have first name, last name, and the amount paid. These are our columns within our table. I'm just gonna open this up, so we can write some. And we could say that we have individual people we could say, okay, okay. And then his amount paid on one of his cards is 2200. And we also have Caleb curry. And the amount paid is 720. And, if you wanted, you could go in and edit the join adding maximum out. So that's basically, or we could have a column on here such as amount owed. And we'd have the hat in here too. And that would basically say, how much they paid out of how much so look like. Then here, we got amount of, like 3000, we're just basically saying 2200 out of 3000. If you wanted to do that, you could do that. But for simplicity sake, let's keep it like this. And then we could have Jimmy john. And the amount paid could be 2800, and so forth. You can make more rows if you wanted. Now you're wondering, okay, what's an inner join, we talked about that earlier? Well, basically, an inner join is going to take only the rows that either a card has a customer and a customer has a card where they intersect. So this new table is joined table is going to eliminate any customers that do not have a card. And it's also going to eliminate any cards that do not have a customer. So I'm going to erase this, I'm just going to draw a little illustration to show you how this works. Now, you can find more pictures like this online, which I'll link to in the description, if I can remember, think of it like this, here is the left table, which was the customer. And here is the card table. And this part in the middle. This is kind of like the join table or where they intersect. So here are the customers that have a card. And here are the cards that have a customer. So within here, this is going to contain all of the rows that have a customer that also owns a card or a card and is owned by a customer. So if you have a card with no customer, it will stay out of the end result. And if you have a customer with no card, it'll stay out of the end result. So this is what you can think of it as all of these things within this part of the Venn diagram or whatever, that's going to be returned. Now, there's different kinds of joins where it'll only where it will only return, for example, this side or this side. But that's where we're going to be getting in, in future videos. So if you want to do a left join, if you want to a left join and go on here, right join and go over here. But an inner join is just the intersection of both of the tables. I have this habit of just like wiping it off with my hands and then my hands are like dark green. So I'm trying to be a little more civilized, you know. So how do you do this? Well, it's done within a select statement because remember we are in we're using data manipulation language which is done through selects. So it would be something like this. Now to warn you, the syntax is going to vary a lot from relational database management system from relational database management system, but the concept stays the same. So you just got to replace a couple things, insert a couple things, boom, Bada, bang, you got a new join. So we're going to say select. And then whatever comes after the Select, I'm going to switch over. So everything that comes after the Select is going to be the columns that we want in the joined, generated table. So in the last example, we had first name, and then we also had the last name. And we also had, what was it amount paid? I don't remember. But something like that. So that's going to be the columns we want return, then we are going to say, what table are we getting these columns from? So we do from? And then we say, Okay, well, you may be wondering, well, there's multiple tables. So how do we know which one we're going to use, or we're going to put the left one first, which, in our drawing, we had the user or the customer, and then we had the card. So we're going to pull it from customer. And then you're going to do a join and join the other table to the customer. So we're going to say, enter join. And then you're going to say the other table, which was card. Now what you're going to do is say how are these tables connected? So the way we have them connected, or the way we want to join these tables, is by the foreign key connecting to the primary key. And the way you signify that in syntax would be with this little statement here on which is basically saying what are we joining it by. And you're going to say, customer dot customer ID, and what that what that.is is basically saying the first part of this is the table. The second part is the column. The reason we have to do that is because both tables have a customer ID. And then you're just going to connect that to the other table by saying equals card dot customer. Id. Now that's basically all you have to do for this, it's going to connect, if you look at the syntax, it's going to connect and get a different marker here. Colors makes things so useful for us. Okay, we've got the customer joined with the card. Those are the tables. And then we're going to say what columns we want to return and the new join. And that's first name, last name and amount paid. And then we have say how they're related. That's the customer ID of the customer table is equal to the customer ID of the card table. And that's how you get your join results. So the end result would of course look like this. And you're going to have first name, last name, and then mount paid. There you go. So that is how an inner join works. The next thing we're going to be talking about is left joins and right joins and all the other cool kinds of joins. Main thing though, don't worry too much about the syntax. If it's overwhelming, just worry about the concept. Understand that if you do an inner join, it's going to exclude all of the rows from either the left or the right table that are in both tables, it has to be in both tables in order for it to work. That's the most important thing to understand when it comes to these joints. Because if you understand that, all you got to do literally is look up. Let's say you're using my SQL, just look up my SQL syntax, inner join. And then you're going to get a little syntax code, replace the columns they gave you with the columns D one and the tables with the tables that you want. It's really simple. So you don't really need to like yeah, obviously you should learn this in the long run. But if you're new to this, don't worry about diving in studying that for 20 years. Worry about the concepts because once you get the concepts, it's easier to pick up on this junk. So in this video we'll be discussing Inner Joins crossed multiple tables. So you're probably wondering, okay, what does that mean? Well, when we have a join, we join multiple tables. But what if we want to join that join a multiple tables with more tables? So basically, we're going to have a join over three or more tables trying to find the green marker. There it is. Okay, how does this work? Well, let's just draw some boxes to kind of get an idea of how this might look. Okay, so you might be like, okay, cool, we figured out the Inner Joins and how those work. But how do they work when you have three tables? Let's, let's find out. Here we have three tables. And when we do an inner join, let's say in the last example, we join these two tables, we had some kind of join condition, which was something like customer dot, customer ID equals card customer ID, you know, well, it doesn't always have to be that it just depends on what you're doing with it. But basically, you're going to have a join condition right here. And you're also going to have a join condition right here, if you want to join in another table, these tables don't necessarily have to be connected, like a many to many relationship, because when we design many to many relationships, it looked like this with like the intermediary table, it's not always the case, for example, this table could have relationship with this table. And this table could have relationship with this table. But this table doesn't necessarily have to have a relationship with this table. Think of it like this, when you do the join on these two tables, you're going to get a result set. Right, then if you want to know what your next results are going to be when you have this table and think of joining these, and you're going to get a final result set. So if you do an inner join across multiple tables, the end result set is going to be smaller than this, because you think about it. If you join these two tables, you get a result set. But now, you have to take all of these rows, and you're only going to display the ones that meet the requirements of this join condition, which is going to decrease the size, or the amount of rows in this table. And it's going to be shorter. Most cases. Now, let's talk about this a little more. Let's go through the example we had before with the cards and the customer. But we're going to add a new table, right. So we're just going to draw a style. But this is the customer table. And we had a customer ID, a first name, last name, you can also have any other junk you want to store about the customer in there. Then we have a card table or the card, card ID and then the customer ID and then the like maximum out or whatever else you want to store about the car, and the bill or whatever the price or the interest rate. Now, let's throw in another table in this equation, let's throw another table into this equation. It's going to be a card type table. So you know when you get like a credit card, because all you college students are like, yeah, I need pizza. Well, there's the card is a type such as Visa, MasterCard, or whatever else American Express. Well, that's kind of like the card type. Generally, a bank would just give out like one card type. But we could say that they can, like we could just say what kind of card type the card in this table is. Now when you structure it that way, how many different card types are there, four or five main ones, Visa, MasterCard, American Express, maybe two other ones that are somewhat popular, there's not that many. So if you were to store that in a column, and this one, you're going to get a lot of repeating data. And then for example, with MasterCard decides to change their name or they get bought out by somebody else. And all the cards now are named ultra card, then that's going to cause problems because all of the you're gonna have, you're gonna have a MasterCard and then ultra card, it's gonna be confusing. So what you can do is use a lookup table which we discussed in a long video ago. And that's just going to be basically a reference to another table with all the possible options. And that way if you need to add a new option, you can do that easily and then just reference that or if you need to update an option, or if you want to put extra information about that option. For example, if MasterCard has a standard fee or something you could put that within the card type table, rather than within the card table. Because if you have the card table, like, Look, if you have card, ID, and then you have card type, and then you have something about that card type. So card type interest or something. Well, this is a transitive dependency, which is a violation of the normal forms, I believe, second normal form. Don't forget, third normal form, I'm sorry. Anyways, that's not good. So we want to take these and put them in a new table. So it lookup table is a defense against problems like that. And it also reduces integrity problems. And it can also reduce repeating values. So now let's design this, we're going to have a lookup table, because card type. And this is going to have a card type ID, and then the card type, which is going to be like the name of it. And then you could have any other rows about the card type in there. And then in the card, we're going to reference that with the foreign key. So it says looking up the values through the through the lookup table. So we're going to have card type ID, that is a foreign key. Now let's draw these connections. Customer ID is a foreign key to the customer. And the card type ID is a foreign key to the card type. So you can see we have all this information spread out over three tables. And we want to add age, we want to join these and create a new generated table that contains information from all three, we want it to be an inner join in this case, but you can do the same thing with outer joins and other types of joins. We'll be discussing those soon. But for this, we want it to look like this, this is what we want our generated view or a new table to look like, we want to select stuff about the customer, just add some columns in there just to put them in here, we can have the first name, last name, email, phone, now paid. And then we could also have the card type right here, it's more, because we're gonna do some cool while we're completely off topic just for a second. I always like drew comics as a kid and I always like drew a bubble around like either a thought or a talk. And then I tried to fit all the words into that and never worked. And then my master drawing friend was like, dude, draw the words first and then draw a bubble around, it was like the best idea ever. But I still don't do it. I mean, I always have to erase the size of my database tables, fit more junk in there should always write the words first and then draw a square around them. But this work, these are all from this table. This is from this table, the card type is from this table. It's a join across three tables. And you could of course add more. Now, the join conditions, well, it would be for this, the customer ID is the same as the customer ID. And the card type ID is the same as the card type ID. So this table versus going to exclude here, I'll just write out what it's going to exclude. This isn't part of the table, but I'm just putting it here, it's going to exclude customers with no card, or cards have no customers, that's going to be the first joint condition, because we're going from here. And then the second join condition, it's going to remove cards with no card type. And also card types with no cards. And that means basically if you have MasterCard as a card type, and you haven't created any card rows that also have the foreign key that references MasterCard, or whatever I said, then that's not going to be included because there would be nothing to put with it. And just say MasterCard, no normal normal. That's because no cards have it and no people have cards that have it. So that's the one INNER JOIN is going to do. It's going to do all this. So the end result is going to basically get smaller and smaller and smaller and smaller for every single table you add. Now, a couple other things to think about here. It has to do with things being no or not No. Okay, let's talk about the way we have a set up is the customers. They can have a card but they don't have have to have a cart. That's because there's no required column in here that says you have to have a card ID. So you could have a customer that has multiple cards, for example. We also have a card table. And well, the way we have set up the card doesn't necessarily have to have a customer. If you want to change that, you could take this customer ID right here. And you could make it not not, that is going to change the end result. Because now, every single card is going to have to have a customer. That means cards with no customers. Although, functionally, it's removing all the cards with no customers, every single card has a customer. So the end result doesn't eliminate cards with no customers. Functionally it does, though, because that's what the inner join does. But since there's nothing to remove, it doesn't do anything. So you can kind of think of it as not doing this, even though functionally does remember that it's the difference. All right. Now, we also have it where the card, I would say should have a card ID or a card type ID, I should say, because almost every card you're going to get ever is going to have a card type, right? Unless for some reason they wouldn't, you'd likely have that as not know. So functionally, this is going to remove cards without card types. But since it's labeled as not null, every single card is going to have a card type. Therefore, in the end result, it doesn't really remove cards with no card type. Because every card has a card type. So there's nothing to remove. Now card type, there might be a possibility, you're offering a new card type such as, let's say, so I can't think of any cool names like legit card type 12. And like a new company comes out in there, you get a legit card type 12 card and all through your new bank account. And that might you when you first get that there might not be a card that has been issued that card type yet. So in the end result, you're only really removing the customers with no cards. And I guess, the card types with no cards, I'm going to keep that. So that's going to stay so these two are going to be removed. And this is the end result here. We're going to keep card types of no cards and customers with no cards. That was a big one man with our hearts. We talked about him last year, but never actually talked about the join conditions very much. So I'm going to go over a new example in this video. Let's do this. So we're gonna have a user table, not a customer table user table. And then we're going to have a comment table and a video table. So think of like a video sharing website like that one called, starts with the you know, but anyways, we have a table here, we also have a table here. We also have a table, right there. All right, these are all connected, right? Boom, boom, boom, I'm just gonna write the name of the table. I'm not gonna really write any comments in here, just low on space. So got the user got the comment. And then we got the video. So the the only way the user has any connection with the video is through the comment table because a user post a comment on a video. So it makes sense, hopefully does. I don't even know ask you still, I'm trying to stop screaming crazy. What are the joint conditions for this, if we want to do an inner join, let's discuss the conditions and also the results we should expect. Okay, so let's just define the keys that are going to connect these tables. So we have a user ID. And then we're gonna have a common ID, and then a user ID, and then a video ID. And then this video ID will be referenced in this table by a video ID. Okay, now, if you think of this as a many to many relationship between user and the video, you could potentially get rid of the comment ID and just use the combination of user ID and video ID. But that is not the best way to do it. One user could post multiple comments on one video. So you see, like, if we had this common table, we could have user ID and then we could have video ID and then we could have a Comment, we have the user ID with seven video ID was 106. And the comment was lame, and then the user ID of seven, video ID of 106 could be JK. So this is a bad idea, you wouldn't want to use those combined. Unless you are working on some kind of thing where you can only leave like one review on a product to say instead of just use a review product or user review, whatever. In that case, you could use user ID combined with the product ID or the video ID to allow that, but I'm going to use the comment ID in this. And this is going to be a surrogate ID that's going to be used as the primary key. And these are only going to be foreign keys, the user ID and the video ID, they're only going to be foreign keys, and they're not going to be part of a compounds are a compound key in any sort of way, they are just another column, that's a foreign key referencing a table to see you guys understand the structure of how I set this up. Alright, now there's going to be a join condition right here, which basically a joint condition is what columns are in both tables, right? You don't always have to say user ID equals user ID, or customer ID equals customer ID or comment equals common ID, you can do other stuff with like greater than less than and all those fancy things, but we're just going to be using equals because it's the most common one and the easiest to understand the one here, it's going to be the user ID. So we're gonna have user dot user ID is equal to comment dot user ID. This is the first joint condition. Now for this, we're gonna have a second join condition between these two. And the only thing we can use is the video ID because it's the only thing in both tables. So we're basically going to have comment, dot video ID equals video, dot video ID. We got to say what table there and because they're in multiple tables, so we have to use the dot syntax, table dot column equal to table.com. I know you guys probably can't read that I wrote it so small, but it's okay. As long as you understand this is the john, this is the joint condition for these two, and this is the joint condition for these two. When you join these, you're going to join them by those columns. And now what about not knowing no columns? What are they going to be returned as? Well, the way we have a structured user can comment on as many videos as they want. So they can be in the comment table multiple times, so got one too many there. Now, a comment can only be posted by one user, and can only be posted on one video, because each individual comment is going to be a separate row in the comment table. So for video ID, we're going to have that set to not know. And also for a user ID, we're going to have that set to not know that's because there should never be a comment that doesn't exist on a video, there should never be a comment on a video that wasn't posted by a user. Right? Now for video. We don't necessarily have to have comments on it. So there's nothing we need to worry about that. Just like a user, we don't have to worry about having to have a comment. So the only time we need to put not know is on these two columns. To make sure the comments exist with everything they need. You see, the user ID would be not normal. That's because every user is a user and every video is a video. But we don't have to say every user has to comment that would make it impractical. So that's the setup for the nominal and the relationship types. So then when we do an inner join, in this case, what are we going to get? Well, I just want to think about that for a little so we can kind of understand what our end result is going to be. So what do we want? Well, we want a table that looks like this up a little bit. We want a table that looks like this. And that's because we wouldn't want it any other shape, right? Um, Steven, but basically, we're going to take the user with the user ID and likely we'll have more information about them, such as their username. So if you wanted, you could also include that in the table, which would likely be done from the beginning, because you'll likely do your joins and stuff after that. Make sure you put your user name in there. You don't have to add that in there because then half your users won't have a username and half of them will which isn't good. So username from the beginning, likely include your username, and then will likely include the comment. So we will need to include the actual comment in the comment table. The reason I didn't put these in there earlier because I just wanted to think about the keys for now. But now we can put those in there that says we're designing this right. And then we also want the video title. Maybe a link. But we'll just stick with the title for now. So these are the three columns that we want to include. And we could have set it up in a table, and each individual row is going to be a new comment. Well, let's see, let's think about what can be excluded from this. So then we can decide which ones actually will be excluded? Well, let's just ask questions, can a user be excluded? Well, if we're using an inner join, we're only going to return users that posted a comment on a video. So yes, any user that did not post a comment will not be included. Now, what about videos? Will videos be excluded? or sensitive INNER JOIN, we're not going to include any videos that do not have a comment from a user. So yes, videos can be excluded. And they likely will. So videos, where it's going to put x by to say that it can be excluded, users can be excluded. Now, can comments be excluded? Well, let's think about each individual key, we have the user or the foreign keys, user ID, which is labeled as not know. So that means every single comment is going to have a user and also, the video ID is not. And also every single comment is going to have a video. so in this situation, all of the comments will be returned. So we can't remove those. So within here, every single comment is going to be included. So if we had 10 comments, we're going to have them labeled one through 10, we're actually going to label them but we would have all 10 of them there. And then only some users would show up here. And all their comments, all the comments would show up in this column. So it'd be like, like, it'd be like, like, black and lame. And then the video title, only some videos are gonna be included here. It's a scrap chance. That's how it's gonna work. And let's just see how it goes. So yeah. Let's see this in action. Here I have some snapshots of some tables I created. And I'll just be illustrating this and I'll be drawing on the screen as needed, and have one of them walk them tablets. And if I am drawing all over the place, it's because I haven't installed the software, not because I can't write better than a kindergartener. Alright, so let's begin. Here we have what would be like a user table or something. We have the user ID email username and the password, which is like just encrypted, basically. And this is going to be joined with the comment table over here. I mean, sorry, that's the video table here. Me. This is the comment table. Alright, so here, we got each comment, and foreign keys to the user ID and the video ID as well as the actual comment. And then we have the video table. Video table has the title of the video, and also the description, which would go like under the video or on the side or appended to the title, whatever you want to do for your website. And this is just what the video is about, right. And as you can see, I'm advertising my database design course, which is pretty awesome, right? And what we want to do is we just want to join these. So we have the username, the title, and the comment, all from three different tables. So the username comes from the user table right here. And then the title comes from the video title right here. And then finally, the comment comes from the common table right here. So let's go back to the join table. What we want to do is, is if we're using let's say the comment table, we want to replace the user ID with the username and the video ID with the title So let's look at the comment table, we want to replace all these numbers in the user ID column with the username, which means we have to reference the user table. And then the video ID, we want to replace with the video name, which means we'll have to reference the video table. And then the comment ID, that's just the primary key for this table. Alright, so let's just kind of look at how this works. So yeah, this is how it works, the username from the user table, the title from the video table and the comment from the comment table. Now let's go through how the join looks. I have all the tables here. Now, it's kind of small. So sorry, if you can't see well, but we have the join table right here. This is the joint table. That's the join. Then we have the comment, the user, and we have the video. And we're going to basically just borrow some columns. So the title that's going to come from the video, so I'm just going to go over here to do to do, it's going to come from over here. And then the username is going to come from appear. And then finally we have the comment. And the comment is going to come from right over here. You can even see that probably not. Okay. So there's a couple things though, not all of the columns of values are included. So not every single row. That's because this is an inner join. Now, if you look at the username, we've got all guide pizza, let's follow this user. So let's go to the user table. They got all sky pizza, right there. Then we got if we keep going down this table, we got ha 12 which of course is he's my mouse now right there. Then we got Pablo is all dot dot, which is Paulo was awesome. Then we got milk and COO, which is milk and cookies. And then we got yellow swag, which is yellow swag, hashtag yolo swag. And then we got that I don't even know what that says cifs which is this person, right? Best. Best speller in the world. Okay, I made these A while ago. So you can see these two people, though they're not included. That's because it's an inner join, and they are not part of the end result. Now let's look at the video. So we got the videos, we got database design 1234567. where's where's eight, though I don't see an eight on here. Well, you can see eight. On the video side right here, it's not included. So eight is not included. Basically, we're taking different columns from different tables, but not every single row is included. That's because it's they don't meet the qualifications for the inner join. So the syntax or something like this would be like this. Said the Select and then the username, title and the comment. These are all from different three tables. But it might vary on how it looks for your database management system. But it looks something like this. And then from the user, comment in video table and we just do the inner join inner join the comment by the user ID and the comment user ID and then the inner join a video by the video video ID being the same as the comment video ID. So what's the difference between an inner join and an outer join, that's what we will be discussing in this entire video. It's going to be about 20 videos. So hold on a notch. It won't take too long. Once you understand what an inner join is picking up outer joins and the other junk is like super easy. Okay, kind of how like once you learn a one programming language, the second one is only somewhat less hard. Right? See I have some junk written behind me. Let's look at it. Alright, so we have what's known as a join condition. That is at the end of the joint statement. Do you remember doing this? I can't hear you. I'm just kidding. Okay, so we got customer customer ID and card customer ID. That means there's two tables, the customer table and the card table. And then both of these have a column customer ID to basically just simple dot syntax The table dot column. And this is making an association. This is saying that we want to join two tables. And the rows that need to be connected, are connected by the column of the customer ID and customer ID. So the two columns here, this is going to make a connection. And our join is going to build is going to be built upon that so we know where to connect things. So the way this will work with an inner join me to show Yeah, we got, let me just redraw those two tables real quick, we have a customer. And then we have the card. So this is like a bank, and people can get credit cards. And a person doesn't necessarily have to have a credit card because they can go to that bank without having a credit card. And also, a card doesn't necessarily have to have a current owner because it could be a disabled card, or it could be a card that hasn't yet been activated, or something else could happen to where a person doesn't own that card. So there's no required columns for this. Now, when we join these, the join table is going to only include customers who own a card, that's going to be the end result. That is with a inner join. So we have the customer card, join table of card and the customer that is exclusive, pushing away all of the customers with no cards, and all of the cards without an associated customer who owns that card. The Outer Join works a little differently, in that it's still going to return all of the rows of from one of these tables, you can decide which one whether or not, they have an associated entity on the other side. So that joint statement comes in here, it's important. So we had it to where it was the customer ID, I'm just going to put C for short. If we have it, where we have it what's known as a left join, I'm just going to write it, it's going to basically ignore that statement. And it's going to return all of the rows from the left table. The right one, though, is still going to be a Bane that join statement. So the joint statement on the right out was a cause customer. So with this left join, all of the rows are going to ignore this on the left side. So it's going to return back every single row. On the right side, though, it's going to only return. The ones that have a customer ID that has a customer ID have the same value within the customer table. So for example, we could have a card with the ID, the customer ID of seven. So that's customer ID not card ID. So that's referring to the person who actually owns the car. All right, we can have that person over here. And since if you basically insert this into the join condition, we have seven equals seven, because this card customer ID has an equal value in this table, which means this card is owned by that person. It's kind of how it works. But with an outer join, we are going to ignore that. And basically say, we're going to return all of the rows for either the left or right side. So there are made that are and there are three main classifications of joins, outer joins, that are I'm going to write them all and then next video is we're going to discuss them, there is a left outer join, Right Outer Join. And that seems to be an outer OUTER JOIN. I'm so dumb. I meant to write. Full OUTER JOIN, excuse me. But basically, we're going to be discussing these two right off hand and full, that can be a little more complicated for some database management systems. So we're gonna probably discuss that a little farther down the road, hopefully Now the right outer joins, we're basically the same way as the left outer joins. But instead of the left table, it's going to be the right table that keeps all of the rows. Now, if you don't know what I'm talking about, please watch the last video or two, that's going to solidify your knowledge. And you'll be like, ready to tackle this, because I'm not trying to get you to watch my other videos, I'm trying to teach you something that's a requirement to understand this. And I don't wanna have to reteach it all, for the sake of one little concept. So let's draw this out a little bit. We have it to where we have two tables, a user, our customer table, and a card table. So this would be like for a bank or something. And let's first see how would a left join would do a left join would take all of these customers. And then it will take only the cards that have an associated customer and return those. So we returned, we basically connect the dots. And then we would get a joint view, which is going to list all the customers, but only the cards that have a customer. So this is going to be like customer listed card they have. And if they have another card is going to list that customer again, list the card they have we discussed that in depth in the last video. Well, for a right join, it's going to work the same way, except now it's going to do it for the other team. So we're gonna have the same exact setup, we're gonna have a card, I mean, a customer and card table. And now, if we set it up with a Right Outer Join, it's going to take all of the cards, put them in the join table, so every single card, and then it will list the associated owner of that card. So card, and then the customer. Now there's something to consider with this, if we set it to where within the card table, we have customer ID and we give that a column characteristic of not No, that's gonna, that might confuse you a little bit because then every single card here is going to have an associated customer, which kind of that's the whole point of the Right Outer Join, or the left outer join for that sake, is to include the rows that don't have that association. So if it's set to not know, it's going to return them all anyways, whether you use a Right Outer Join, and so forth, you guys understand, because basically, if we use the Right Outer Join for this, it's going to return them all. But it's going to return them all anyways, because every single one has an associated card, we'll discuss that more in the future, don't worry about it. Now there's something else to consider. This basically works the same exact way as the left outer join, except is switched around. So when we discussed in the last video, how you could say which table was the left, right where the froms the basically the form be like select, maybe and then you'd have like the columns first name, last name, and card amount, or you can have your card amount of us good. And then you have something that says like from Well, the column that you put right here, the table you put right there is considered left when we're talking about this design stuff. So in theory, you could switch it around to where a left join works the same way as a right join, you're just switching the direction of the tables. So in practice, most people don't even use right joins, I'm only teaching you so you understand when you see one. Instead of using a right join, they'll often use a left join. And then they'll just flip the column the tables I'm sorry. So like this, you have your two tables. And I'll give you two possibilities that are going to give the same result. You could have your customer and then the card. And you can use a Right Outer Join. Or you could have it to where you have over here you have the card and the customer with the same exact columns in same exact foreign keys. Nothing changed except in design, you just switch the things and when you make the SQL statement, you're going to switch the location of the call of the tables within that statement. But overall the function works exactly the same. And now you're going to use a left outer join. These are going to return the same exact thing. Because if you do this one it's going to return All these rows and the associated columns with those rows, if you do this, it's going to return all of the rows, and all the associated columns, those rows. So either way, you're getting all the cards in part of the customers z. So in practice, a lot of people will only use left outer joins for consistency, and clarity. So if someone else sees it, they already know it's the left outer join, they don't have to worry about left or right, and then draw it out and get all confused. Most people just use the left outer joins, but you can always use right outer joins, if you really wish that is up to you. Well, I've been talking a little bit about not no columns, basically columns that aren't allowed to have no value. And that can cause some confusion when it comes to joins, because you might get results you don't expect. And I'm just going to be explaining that just a tiny bit more, this video will be quick. So if you already understand it, go ahead and skip it if that's what you wish. Alright, so I'm going to be going over an example of a user comment system. So basically, you're going to have a table with the users, and then you're going to have something such as a common table, and you might have a table such as video, or blog, or whatever you're commenting on. But that's out of the illustration for now, don't worry about that. Well, in this example, we have some columns that are labeled not know. And what this means is that you're not allowed to have them to where there's no value in that row for that column. So for example, if in comment, we had comment ID, and then we had user ID. And we gave it a column characteristic of nominal, well, what is that going to do? That's going to basically say that every single row within this comment table has to have a user ID associated with that comment. And that would be the person who posted that comment. Now, the user, on the other hand, doesn't have to post a comment. It's a one to many relationship, and the user does not have to post a comment. But a comment has to be owned by a user. That's important to know. So there's three real possible options of joins that we can do on this table. The first is an inner join, I'm gonna put these in caps, because it's probably better to do that. Now the inner join does not matter which one is left or right, because it's going to take the middle of both of them. So the positioning of your tables within your query of the select statement does not really matter. All it matters is that you're using the inner join, what is the inner join going to give you and the inner join will give you all of the users who have comments. That's because if a user doesn't have a comment, it's not going to be in his ID won't be in the comment table. And he'll be left out. There's a comment that doesn't have a user, well, then it would be left out too. But that's not the case. Because it's not No. All right, the other option is a left outer join. And I'm going to go into place places in a second left, outer join. And then we could put the comment on the left. So if you need help position, if you need help thinking about that, in your head, switch these two tables around. But I'm not going to do that, because that's part of the illustration already. We're good. Right? The other option will actually let's talk about what that's going to do first, left outer join is going to give you all of the comments, and then give you the associated users who have posted those comments. So every single comment will be included. So the return result will be all comments, and associated users, okay? Now think about this, though, these things are actually going to give you the same exact thing. That's because if you do an inner join, it's going to give you all the users who have comments. Okay? So it's going to take you all of the users who have a comment and put it in there. And then it's going to take all the comments that have a user and put it in there, ensure that all the comments that have a user, all the comments that have a user. So these are going to give you the same result. So just understand that. The third option is to flip the left join the tables that is so got a left outer join with the user table and left. Okay, now what is this going to do? It's going to give you all of the users and the associated comments from those users. Now, let's talk about so we're going to have it's either it either includes them all or exclude some of them. So basically, if it includes With all of the users, I'm going to label it with a star. If it only includes some users, I'm going to label it with an X. Same for the comments, all the comments, it's going to be a star, some of the comments is going to be an X. So users do have comments will only some users are going to post comments. So that's going to be x on the user end. But all of the users who have comments, that's going to include all of the comments, because all of the comments are labeled as not know. So every user posts a comment has a row within the comment table. So it's going to return them all. So it's going to turn them all on the user end. That's a star. Now, left outer join with the comments on left returns, all comments and associated users. All the comments. So all on the comment side, it's going to return them all. But the users, not all of them just part of them. Now, left outer join is going to return all users and associated comments. So the first one is some users all comments. The second one is some users all comments. The third one is all users and all comments, this, that's what you kind of need to do when you're trying to plan out which join you need to use, go through all the possibilities with that join in is right amount. So you see that these two are actually the same, this one is just a tiny bit different. And that includes all of the users. So you may be wondering, well, which of these two should you use? Well, to be honest, it doesn't really matter. If if you wanted to return all of the comments, don't worry about whether it's not normal or not just use a left outer join with the comments on the left. And then you're guaranteed to have that you don't have to worry about whether it's not normal or not. So that's what I would do if you only want the users and the comments included. And you don't want to include any comments that don't have a user, which is never going to happen, then you do a inner join. So what you do is you just think pretend all the comments, all the columns are knowable, or they don't have to have a value. And then you plan your joints around that. And it can still work the same way. If that's what you'd like to do. Otherwise, you just plan out like this. And realize you can use either one of these, and it's going to work out fine for what you want to do. This video we will be discussing using outer joins. When there are three or more tables, we will be discussing the example using three tables. But you can basically apply it to any example as long as you understand the concepts, I really recommend that you watch the video about using Inner Joins when there are three or more tables, because the concepts are going to be basically the same except different results will be returned. So for this, the best way to think about it is let's say we have three tables, ignore these two right now, we have a user comment and video table. So basically, it's a system where you can comment on videos, each video that does get a comment will be in this table. And every user that post a comment will also be in this table connected with that video, right? What I'm going to do is I'm going to use a left outer join on this side and a Right Outer Join on this side. And we will just see what those results are like. Alright, so first thing is first, let's join the user and comment table. That's the best way to think about it is to imagine these two tables being joined here to make a new temporary join table. And then imagine this new temporary table being joined with the third table to give a final result join table. So if we do a left outer join here, every single user is going to be brought down to this table in this situation of a left outer join. It will only take the comments that have users because there has to be an association of the user ID in both tables for that. But it's likely that every comment is going to have an associated user because that that user ID column will likely be set to not know that way you don't have comments that aren't posted by anybody because that would be weird. So this will also return all comments. Great. Now that we have that new table. You see we basically just took everything from up here. Let's imagine taking this and joining it with the videos and we're going to use a Right Outer Join. So this is going to be Right Outer Join for Outer Join, we're going to take all of the videos and bring them down here. That's basically how right outer joins work, they will automatically take every single row from the right table. And then we're going to take any row in this table that has an association with the video table. So that means the comments will have to have a video, which all of them will. Alright, so what what will be removed here? Well, some of the users will be removed. The reason that is is because some of the users aren't going to be in the comment table. And if they're not in the comment table, they can't have an association with the video table. So think about the join statements here, we'll likely have something where it's like, user ID over here is equal to the user ID over here. And then we'll have it to where it's done, we'll have it to where it's a video ID over here. And that's equal to a video ID over here. So when we grab all of these, we have all the users and all the comments, because all the users will have a user ID. And we don't have to have those users in the comment table. Because it's a left outer join, which automatically takes all of the users, all of the comments will be returned, because every comment will have an associated user. But once we get over here, it gets a little different. Now it's about the video, not the user. So we're going to take all of the comments that have the video ID, and all of the videos that have a video ID. Because it's a Right Outer Join, we're going to take all the videos whether or not they have a video ID in the comment table. So the first thing, take all the videos, we're also going to take all of the comments are only going to take some users. The reason that is is because if a user, let's give some example data, let's say we have a user over here with the ID of seven. And he he doesn't leave a comment, but because we do a left outer join. He comes down here anyway. So now he's in this new generated table. Well, he has a user ID. But does he have a video ID in the comment table to be associated with the video ID in the video table? No, he does not. So he wouldn't make it on past this stage. And he wouldn't be in this final outer table as joined table. So the final result is going to be all videos, all comments, and some users. That's just an example of using right and left outer joins when there's three tables. Or you could do the same thing with inner join. So if you did an inner join, it's only going to take the users you have comments. So I mean, you can basically apply the same example to a lot of different situations to get the results you want. As long as you understand how a left join works, how a right join works, and how an inner join works, you can get essentially everything you really need to get. The only one we haven't really discussed is the full OUTER JOIN, which you might need sometimes. And that's a little different. So we'll be talking about that soon. In this video, we'll be taking a quick break from joins that we've been discussing for the last eight or so videos. And we will be discussing something that's a little unrelated, but kind of related, if that makes sense. But anyways, we'll be discussing aliases. So an alias is when you rename something and you give it kind of like a nickname or more of a user friendly name, or a name that you are more used to seeing. And you can do this when you're writing SELECT statements and SQL and you can do that to either make it easier to write or easier to read. So when you want to give a table an alias, you will use the A S or as keyword. And that's going to say anything that follows as we which is what we want to call the table. So let's say we had that user table. And instead of using user you wanted to say a customer, well then when you use the table and the rest of the query and you use your joins, rather than using user, you can use the customer. And often you won't take a short name and make it into something longer. You will do something like replace user with a U and then instead of you doing user dot user ID, you would be you die user ID. That's an example of an alias. When you do it on tables. It's often to make the select statement easier. But you can also do it for columns. Now when you do it on a column, it's going to work a little differently. So when I select statement, you put all your columns at the top. So here's just an example of what you might say to a database, you're going to select the email, first name, and last name. When you use an alias with columns, not only do you change what the column is called, but it's going to change what it's presented as. So when you get that new structured view, that generated table, and it looks something like this, rather than saying email, first name or last name, you can make it say, email, fn. Whoops, fn, Ln fOr first name, last name, or you can take out that underscore, and you can replace it with you know, first space name. This works similar in the way that the other alias works for the tables, except you put it right after the column. So you can say select Email, as contact, comma, first name, as, and then you put in quotes got a quoted, that's because you're telling it this is what you want it to be, you want to put it in quotes, so that way, it's understood to be a string. You can see here, I put that space there. And without the quotes that can just cause a problem. Comma, and then underscore last last name. Okay, see all that. That's basically how an alias works, you can take the alias to make things easier to write easier to present. And also, it can change the way the view looks. So it looks more pretty, right? In the next video, we're going to be talking about joins, that you reference the same table or a self join. And basically, you're going to take the table you have and basically tell them last Do you want to make a copy of it essentially, and join these two tables which are the same. And the only way that's going to work is if you give it an alias, so this table will be called T one and this one V T two, for example. So that's an example of when you would use an alias. Other than renaming your columns or your tables. The way a self join works is you take a table, and you are essentially joining it with itself. And that's what kind of makes self joins confusing. And a lot of people can't find practical purposes for them, I'm going to be giving you a drop my marker, I just got white marker, all sub joints are complicated. And a lot of people can't find practical purposes for them. In this video, I'll be giving you at least one good reason for them. So hopefully, this video is helpful. So when you have a self join, think of basically duplicating the table you have and making an exact copy. And then joining those two tables together. That's a good way you can think about it. So for this example, I'm going to be using a user account system on a website. And you know how some websites have like a referral option. Basically, if you refer somebody you get like 20% of their money and or their points or 10% of their money, so forth. That's like referral marketing. And when you do that you are assigned a person who referred you, right, and it can become a chain like this person or for this person or for this person or for this person. So within a user table, you might have a user ID. And then on top of that, you're going to have, you know, email, first name, last name, and the person who referred you, I'm putting the person who referred you within a column known as referred by, and this is actually a foreign key to the same table. So within here, you're going to have numbers such as 12, which is going to point back to the user ID of 12. So for example, the user ID 11 might be a person who was referred by the user with the user ID of 12. Now, the only way you can make this really work for a join, is if you use a self join. Basically, what we want to do is we want to replace referred by with the ID, we want to use a join to replace this number with that user's email, which I also forgot to put in here. So you know, just throw an email in there. So we'll take this, and then our outputted result will look something like this. And this is the table that we're going to have. And then we're going to insert data into that table. So you might have a user such as Caleb curry, and his emails is going to be swag@yolo.com. And then it's referred by that and we're going to replace that number with a user's email such as Howard South Yeah. dot org. So that's just an example of what we can do with this self join. Now in order for this to work, we have to tell the database that we're going to be using the same table in basically two different tables. And that can kind of be confusing. And in order for that to work, you have to use what's known as an alias. I talked about those in the last video, so be sure to check those on out. So to begin, we just kind of think of a name of what we could call each table, we can have the first table, which would also be the user table. And then the second table, which would be another copy of the user table. Now we're not actually like copying the data. This is just for illustration point, it's going to use that data that same table to withdraw the data from, and we could say this one is v one and v two, for example. Those are our aliases. And in order to define those, we put as after our data after our table and put v1. So we can say, user as v1, and then join user as v2. This is something kind of like what you would do. So now we have a user as v1. And then we're joining user as v2. And it kind of gets the idea that oh, this person wants to use the same table for this situation. Let's try to figure this out. Let's begin, we're going to put this within a select statement. And I'm going to give you some specific syntax just so this can kind of make sense. So we have a select statement. Then right after the select statement, we always put the columns we want to take our data from. Now this can get kind of confusing, though, because we have two basically two different tables that we're going to be taking data from. So we have to think which table we want to take which columns from the way we had it set up before is we have this table and this table. This was v one and this was v two. Well, that's kind of in our imaginary brains, which we don't have, we're gonna say this table is for the user. And this table is for the referrals. So we're going to take from this table, we'll take the the user ID, maybe for the first name, the last name, and the email. And then from this table, we're going to take the email of the person who referred you, so we'll take the referred by person's email. So the way this will work is we're going to have the user IDs such as seven and Caleb curry, and then their email, and then it'll have an ID of seven or something around eight, for the referred by, and then that's going to go to the second table to get that person's email, you got to keep this in mind. And you have to figure out what you're going to name your tables before you start the Select. Because now that we are going to say the columns, we have to qualify those columns by putting a dot before them and saying what table they're from. And we can just use the Alias Name that we've had or made up. So we could say, select v one, period or dot. And we could say first name, comma, and then we can say V one dot last name, comma v one dot email. Comma. Finally, we're going to take v two and we're going to take that person's email or you can do their first name or last name, whatever you want to put in that referred by thing, if you wanted to say their first name, you can say v2 dot first name, if you want to be their last name, it can be v2 dot last name. Or if you want a combination of both, you can use a concatenation function or something. So we're just going to use the email. So we're going to take the v two dot email. Now that we know what columns and what tables that we want to take the data from, we actually have to put the from statement still and say something like from and then put the table we're taking this data from. This is where we are going to define our alias. So we could say, from user as v one that is going to basically tell the database that the user table is known as v one. Now we can do a join. Join, this is going to default to an inner join. If you just use join like that. We could say join user as the two that is going to basically tell them that user is also going to be known as v2 but it's going to be considered a different table in this situation. Then you're going to do the on which is going to say where we are going To join these things. Now this is where it can get really confusing, because there's so many possibilities that you can put right here. For this situation, we are going to take v1, and we're going to take the referred by column, and that is going to be the same as V to u user ID column. So if you can think about this, imagine literally two tables with the same exact data. Here's the one, here's v2, and we have a user ID. And then within here, we also have a referred to by or referred by guess that's going to reference the user ID of the v2 table. So that means we need to basically make this connection and say that if there's a six over here, and they're referred by table, it needs to be the same as the user ID over here that has six. If you need more practical data under to understand this, imagine, we have this guy over here. And his name is Samuel. And we have this guy over here. And his name is. Yeah, that's his name. And also refer Samuel, that means it's going to have the user ID of six, within this user row of the user ID of let's say, 12. So Samuel has the ID of 12, when he was referred by the person with the user ID of six, which is I've heard of, and then this other table, which is also the same user table, it's going to refer to the person with the ID of six, which is this guy. Hopefully, that makes sense. But in reality, you're not going to have two exact copies of the data, it's smart enough to understand that these are all one table. So it's basically going to take that referred by, and then go back to the user column, and join that into what would look like taking two tables and putting them together. That's why I get kind of confusing. But when you take this query as it is, and you put it in your database, you will get something that looks like a normal join table, there'll be one really big table, and it will have the columns that we selected, v1 dot fn. On top of this, we're also going to have the V one class name and the v1 email, and the v2 email. Then if we were to put example data into this, it would look something like Caleb, Curry, and email and then the v2 email would be like another guy's email. So such as so calm, that's what it would look like. And it would just give you all the rows for that. So yeah, you can mess around, do all kinds of cool joins and all kinds of stuff. But there's one more thing that I want to tell you. And that has to do with aliases. Well, you know, this might not be too pretty how you have right now. And it might be confusing having an email and then right beside it another email, like what is the what are those mean? Well, you might want to say that one column is the user, and another is the person who referred them. So you can use an alias for that, too. So for that you also use as, but instead of putting in the from, you're just going to put it right after the column that you want to change. So you can put here, here, here, and here. So if I wanted the table to email, or the person who referred, you could make it say like, v two dot email, as referred by, and this you're going to want to put in quotes to say you'll want that string as the title for that column. And then when you get that new table, join, instead of saying email, it's going to say referred by, and then a list of data. Yo, what's up my homie homies, this video are the section of the video, I guess, we will just be discussing this in a little more depth by showing this real example that I explained on the board. So you can see I have a user table. And within my user table, you know, I got user ID pass email for first name, last name, and a referred by this referred by is a foreign key referencing the user ID. It's kind of interesting how that works. So you can see here is this guy allama swag at YOLO calm is referred by to, which in this case would be Hi mom at low level calm, which he basically referred everybody that's why has a zero here. And if you scroll down, we also have some people who are referred by one. And basically what we want to do is we want to replace this referred by with the person's email. And to do that we can use a self join. And who knows there might be a way easier way to do this. So if there is just be sure to let me know. Alright, so here's a query that would work to do that. So let me just refresh this. And here's what it's going to do. It's going to select the first name, last name, email. Mail of the person who was referred, was also going to take the email of the referral person, the person who referred the other person, the person within this preferred by column. Okay, so you can see we got first name, last name, email. That's one thing, and then the referred by email. So how does this work, we can take the columns, and we also replaced the email of the referred by and put referred by, that's why it shows up right here, which is what we want, that's good. Then we say what table we're going to take the user table and call it v1, which you can basically name it whatever, as long as you update right here, then it'll still work. That's just the name I made up, I couldn't think of something cool. But base, basically, a version one, you know, this inner join is taken our other version of the table and just say, no, this inner is optional. So you can take that away, and it's gonna work exactly the same way. Then we're going to say how it's related. So basically, this referred by email, that needs to have an ID in that table that refers to a person. So the referred by is going to have an ID, and that needs to match a user ID. So within the first table, let's go back to the user table. This two, for example, that needs to reference an actual person, it needs to be actual data within that second version of the user table. And it's going to go to the second version of the user table, find that person with the ID of two. And it's going to take that email and display it right here. So hopefully, that makes sense, guys, and of course, you can mess this mess up the square, you can just mess around with it and put your own stuff in it to get your own results. And you can even change this statement right here to get even more crazy results. So yeah, guys, thanks for watching peace on earth, and catch you in the next video. And subscribe, as well as v one dot last name and V Ah, it's a one night now moron. For the aliases of the table. We're going to tell my SQL ah crap nuggets. It's not my SQL databases.