hello and welcome to everybody on cloud fitness so now again this video is in continuation to my delta table play series and this part we will discuss regarding the merging in data tables we will see how merge command works in the delta table we will see how easy it has become to do a merging when you have delta tables or when you are working in delta tables so let's directly jump on to the example that we have so uh talking about uh you know the merging we already know what merging is right so when we have to update or insert whenever we have to do any kind of upsearch operations in that case merging comes into picture correct so in delta tables you have something called as merge into statements to do that so let's say you want to take data from one table and add it into the second table now in this case what happens is if you have new rows in the table 1 and you want to move those new rows into the table 2 that will be called as an insert operation and when you have and you know rows which are in table 1 but are also present in a table number 2 where you want to merge them so if there is any update on those rows if there is new data coming in in the table 1 for the rows which are already present in the table which you want to merge into in that case you do an update based on certain expression so this operation is called absurd or this operation is also called as merge in the delta tables this merging was really very painful before the onset of this delta table but with delta table it has become really very easy and we will see how we can do merging in the delta tables so here basically i am trying to read the data read a delta table which i have also showed you in my previous uh videos on the delta table so this is a by default this is a delta table you know delta format which is present in the databricks data set by default databricks give you some set of data sets so it is present over there even you people can also go ahead and try the same thing and you can see that you will be able to read this delta table in in the form of ah you know data frame so right now i am reading the delta table in in a data frame named as people and then i'm again writing that particular data frame as a default delta table and you know talking about the default managed unmanaged delta tables i have already explained in my first video on delta tables i do recommend watching that if you are not clear on those concepts so now taking that demo data i am writing that demo data in form in the delta format and i'm saving it as a table named as people 10 m so this is my table right which i'm trying to write it as so this is another delta table so i'll read the command run and in the meantime i will also show you so okay so this is the uh you know the table is created and this is the select star from people 10 m the table which we created the delta table that we created from the default data set now we are also creating one more table because we need two tables for merging right so i'm creating another table with the same schema right id first name middle name last name gender date birth ssn number and the salary so same thing i'm again creating i'm creating a uh again a delta table now this is a managed delta table i have explained this in my first video do watch that and understand what is you know a managed delta table and what is an unmanaged delta table now this tilde table is also created now what i'll do i'll insert few records in my delta table right i will insert few records in the table which i have created just now okay this is the table that i have created just now i will insert few records and those these records are actually already present in the first table which i created so these records i have taken from this particular table the first table which i created right first default table which i created from the databricks data set so i have taken few records from there and then i am adding those records into the table number two so my table number one and table number two both have some matching records right so they have three matching records now what i'll do i'll start my merge statement so let me in fact zoom it a bit more so now uh if you see i have two tables table one and table two so table one uh you know is people ten m and then the table two is people ten m upload zero one so these are the two tables one this is the table one so this table 1 was created by the default data set and this table 2 i have created manually and i have inserted 3 records which are already present in the first table right now if i try to do a merge operation right if i try to do a merge operation it should update it should update right now let's see how does it update so we we we are going to see uh it you know step by step now i'll start my merge statement merge into so i want to merge into my people 10m my table number one right using this particular table so this is a syntax actually of your merge statement on so you have to merge it on some particular condition right so you have to you know merge it on some particular condition it's it's like a join right so you need to define a join condition on people10m.id so i am merging it on id this is the id right this is the id so i'm merging it on id and then i'm saying when matched then update you know if not matched then insert if it matches then update otherwise just insert it so let me just execute this command now you can see that the merge operation is executing we'll just let it execute now you will see the merge statement is complete now and you will see it has given the statistics as well right it has given how many rows are affected so we have three rows which should go as an update so those three rows are you know it is mentioned that you know number of affected rows is three the number of updated rows are three deleted rows are none because we did not have any condition on delete and then we had a number of inserted row as 0 right and now in the similar case what we will do is if you see i have merged my data into this table right now what data i have merged if i go back so these are the rows that i have updated right these are this is these are the ids which i have updated now what i'm going to do is i am going to do a select star from the table which i have merged to and i am filtering it based on the ids basically i'm trying to get the data whether i just want to see the data what i have merged right so these are the ids that i have merged right and you have this data over here so you see this data is reflected in the target table right so let's let's go above and match it so these the and okay let's just check the salaries 55 48 and 90 right so let's see whether do we have this or not 55 48 and 90 and we have same ids over here so you can see that the data has been merged right now let's do another thing in this case we will uh you know insert into table number two which is people10m upload01 in this table i'm going to insert new records which are not present in the final target table right these are the new records these are not present in the final target table so when i run my merge statement what it will do it will insert these records into the final target table correct so or in fact let me uh you know check these ids as well so let me go back to this particular statement let me filter out these ids and check whether these ids are present or not right so this is the id right so i am doing a select star from the target table using these ids because i am inserting i want to insert these rows so if i did a filter i could see that there are no rows present in this table with these ids so i have insert i have you know fired the statement already now uh we have people10m upload01 this delta table has these three new records now if i go ahead and run a merge statement now i'll run this merge statement so my table name is upload01 i'll just yes and now if i run this merge statement now you will see that those new rows gets inserted into my final target table now you can see this the number of affected rows is six the number of updated rows is three number of deleted rows is zero the number of inserted rows is three now what the why did this happen so uh we okay so let me tell you why this number of affected rows is three uh i mean six why the number of affected rows is six because i am merging my one table into another my one table had three rows which i marked as an update earlier right and then i inserted three more rows which were inserts right and now those three rows also got updated and the three rows which i marked as insert those three rows also got inserted right so this is what has happened here now uh if i do select star on the same table and try to find out these uh values for the ids which i have inserted let's see whether these ids got inserted into my final table or not and then you can see that these got inserted right so this is how your merge statement works in your delta table and this has become really easy with the onset of the delta tables so thank you so much for being till here and do let me know in the comment section if you have any queries any doubts thank you so much for being till here