In this video we will deal with the topic
of normalization from the field of databases. Normalization is necessary to avoid anomalies
in relational databases. I have already made a video on this topic,
or rather, I have already shown what anomalies are and what problems they lead to. I link the video above. Again and again this leads to problems or
difficulties. But it is enough to remember the rules and
just apply them. I will explain these rules to you step by
step. And it should be said that the normalization
is a database design and has the sense to prevent anomalies and redundancies. We will start with the zeroth normal form
and for this purpose we will we will look at the database of the computer
company hyperEDV, which has a database, but it is still renormalized. In a database of the zeroth normal form, all
the data is unnormalized in a single table. And this is exactly the case here. Our database consists of a single table with
order number, name, date of birth, etc.. And as you can see, this doesn't look very
good and shouldn't stay that way, which is why we convert the tables to the first into the first normal form. I already said, there are always quite rigid
rules for normalization that have to be followed and the same applies here. The rule to convert tables into the first
normal form is: "All attribute values must be atomic". By an attribute itself we mean a column like OrderNo
or the name. And by atomic we mean that we cannot break
down these attributes further. This is not the case here at the moment, because
our table is still in the zeroth normal form and we are now looking at the name, for example. Our customers have a first and a surname and this means that we have
to break down these columns as far as possible. possible. We make two new columns out of the name column,
namely first name and last name. Because only in this way we can sort by last
name at the end and display all customers who begin with an S in the last name, because
our computer would not know if Ludger Sievert is in a cell, how it should divide the whole
thing. where does the last name begin here, or what
is the first name or what is the last name. The same still applies to the address. also the address we would have to divide still
into the attributes street, place or postal zip code, because also the computer does not
know, how can one distinguish here between street, place and postal zip code. For example, if we display all customers from
Münster and they are in one field, this is not very convenient. Means; we must divide everything so WIDE that
everything in a single column is atomically in a single column. And then I would say, let's do that and take
a look at the result. And yes the attentive ones among you will
notice now the house number one could divide nevertheless also still. Of course you could and that would be absolutely
correct according to the rule. But we neither want to sort by the house number
nor do anything with the house number. That's why it doesn't give us a big advantage. Here it is of course always a question of
weighing up whether it makes sense or not, but if we were to apply the rule in this way. Then we would have to transfer the house number
of course also in an own column. So far clear? The rule is relatively easy to apply. And we have, if we don't see it strictly,
a table in the first normal form. We continue this with the second normal form
and for this we look as usual first at the definition which is needed for the second normal form. The rule is as follows: A table is in second
normal form if it is in first normal form and every non-key attribute
is fully functionally dependent on each key candidate. This sounds a bit more complicated than it
actually is. Because it is
not that complicated at all. We will notice that now. As first information is important that each
normal form requires the previous normal form, as with the second one. Means, if we want to have a table in the second
normal form or transfer it to the second the table must first be in the first
first normal form. And then we can we can bring the table into
the second normal form. This rule is also valid for the third normal
form. Means, there the table must be first in the
in the second normal form. Now
let's look further, because for the second normal form we make a small excursion in
relational databases. In relational databases we work with primary
keys. The purpose of a key in a relational database,
a key is used to identify a data set without identified without any doubt and this value
may therefore only occur once. Our primary keys can be recognized by the fact
that the columns above are underlined like customer
number and order number are underlined. For this
one or maybe more columns are defined, which take over this task. For an order for example we can we can well imagine the order number for this. An order has a number and this number is unique. It does not exist more than once. There is nothing to do with it. The same applies to the customer and to all articles. Here, too, the customer number or article
number the article number is unique. Each customer
has its own number and each article has its own number. And now we turn our attention to the
to the story of the fully functional dependencies. We now get a total of
four tables as you can see, which have been created one table at the beginning. These are Order, Customer, Article and OrdersArticle. We will now look step by step at why these tables have been created in this
form and how we deal with them. Each of these tables in a relational database has
a primary key and non-key attributes which in the second normal form are fully
functionally depend on primary keys. We look at at the customer, for example. Since
the first name is fully functionally dependent to the customer ID. We can see this even better in
the table OrderArticle. Here we use we use a composite primary key
of order number and article number. So both columns form the primary key here. Order and article must not appear together
in this must not occur again in this table. In addition there is the column number, which
and there one can you can see it relatively well, equally
from the order number as well as from the article number as well. So we always need both
information. So the order number and the article number together so that the number
is meaningful. That means, the number is fully functionally dependent to the whole
primary key. So we have each column that was not fully functionally dependent
on a key kanidata from a key channel data in the previous table
into a sub-table. We remember,
we had our own key candidates, respectively the article number, customer number and order
number in one table and now we have moved them to into their own tables, which is normal in
the second normal form. normal form. Now you ask yourselves how do we get the second customer number and
the second we get the second customer number and the
second article number and the second order number. We will now look at this in detail. Let's have a look at the relationship
between the Customer table and the Order table and the relationship between
the table Order, OrderArticle and Article and the table OrderArticle
table. Let's start first with the order and the customer and we can see that the customer
number appears in the Order table. Here is a relationship between the
the Customer table and the Purchase Order table and
it is a 1 to n relationship. The 1 means in this case, that each order can have one or no customer
customer. And the n means that each customer has not
made one or more orders. orders. This is resolved in relational
databases so that the primary key on the on the customer side acts as a foreign key
in the Order table. Thus each order number can only appear once, since the
Order number column is the primary key and the customer can appear multiple times
and can have multiple purchase orders. Thus we solve only one to n relationship in relational databases. databases. now here we have a second relationship and this one is a bit
more complicated. Here we are dealing with an n to n relationship. If we have a 1 to n relationship,
as we have just seen, the primary key as a foreign key in the other table
we need a new table for an n to n relationship we need a new table in
any case. The n to n relationship says that an item
can have can have several orders and an order can have several articles