Transcript for:
ETL Pipeline Setup for Sales Data

hello everyone we have the following spreadsheet which is sales data let's create an ETL pipeline those we could load uh this data into Microsoft SQL Server okay let's get started for that purpose we are going to use Microsoft Visual Studio we need to use ssis we are going to do that with integration Services let's create a project create let's create a data flow task okay now let's add source and destination source is going to be Excel because our file is Excel file now we are going to select this spreadsheet and do okay click ok now we are going to add destination destination is going to be SQL Server please look it here we created here we have my shop database but this database is empty we are going to load the spreadsheets data into this my shop database my server name is Acer we are going to load to my shop okay then we're going to connect this to this let's take a look at errors at first we need to mention um the tab then we are going to check columns okay we see here our columns okay click on ok now we are going to set settings for SQL Server here we should mention table name because we don't have any table in this database we're going to create new create table let's let's rename this with sales say sales and I know that transaction ID should be integer this is a primary key transaction Day date time uh we can write just date payment card virtual okay month here key month year key is going to be integer the other is employee name we can right here reduce this number and write 100 payment card also we can write 30 but we will normally have only 16 numbers employee address customer customer this is customer name customer name let's reduce to 50 customer email let's produce uh 50 customer address industry industry cannot be I uh no let's let's Maybe 70 here product name product name let's write 50 for this product category let's write 50 for this product cost float okay price float quantity now quantity can be integer why I change this integer to integer because integer takes um less space because of that uh throat not for fraud uh we have here only yes no um okay let's let's put 10 characters okay keep nulls mappings pink looks good okay okay now we don't have here any errors this is not an error okay let's start we see the process is completed successfully uh by the way we might face with some errors like with data types and we might need to do data conversion because this is not uh educational video because of that I don't show the way how to resolve that just we have completed process and that's fine now let's let's go and check if we have if we have this table we see we have here a new table sales table this looks good okay those we could create an ETL Pipeline and load data from Excel file to Microsoft SQL Server now let's do it more tables into my shop database let's create a flow task okay Source assistant we have Excel files and also we have csvs let's take Excel files it's in here so first one is category okay okay let's put it here let's add a destination destination is going to be asql server but for this table for this spreadsheet we don't have um we don't have a table because of that we are going to create a new table let's connect okay so table name we don't have a table for this we are going to create a new table for this okay now let's select the tab on Excel category ID okay that looks good for this now for this table we don't have created an empty table for that now we create table create table this is going to be category category ID category ID should be integer and this is also a primary okay of course we can add keys after when we load the data but let's go ahead and add it here now category name let's put it 50 and status status uh is showing is it active or no so let's put 30 for this I think looks good okay let's add another source this is Excel again new okay this time we are going to add customer okay let's add the destination SQL Server okay server name is okay sir database is my shop okay let's add another source we also have one more spreadsheet um that's a CSV file let's add let me show also the way how we could add a CSV file CSV file is not Excel so we choose flat file new okay let's browse that that's CSV open this looks good columns comma okay ah data Rose to skip one foreign about that no sorry header rows we should choose okay now let's add a destination the destination is SQL Server eraser oh my show okay now for this spreadsheet for this spreadsheet we should create a table this we can name marketing expenses okay here for Char month voucher now let's put integer because this is just a year for example 2014 or Etc per month also it's going to be integer because here also only number of months a month year key this is also integer and marketing this is marketing this is marketing expense okay this is going to be integer this is going to be integer okay keep notes pings okay looks good now let's start the process we see that the data from all spreadsheets has been loaded to our database let's refresh this let's go to tables we see now new tables let's open for example this one okay this looks good okay so we loaded all the necessary spreadsheets