hi guys good morning good afternoon good evening okay so interesting thing today which we are going to do it okay uh today we are going to conduct a mock interview with one of my student who already succeeded in interviews as well as he's having more than three years of experience in currently which is uh he's working in uh very good company so uh for uh one of the important thing let me go with the my in section today so my name is ravindra I'm a trainer here at Sr technologist before we dive into the interview let me give you a background of uh me I I'm ravindra I'm having uh initial uh overall 7.8 years of experience in IT industry initially I have worked uh in ETL testing four year 3 years three and a half year and post that I completely jumped into the data engineer so the agenda of today today's video to give it a mock interview in ETL testing the interview will last approximately 30 to 40 minutes before uh and we'll cover topics such as in ETL testing what are the content which we have to discuss right like those those only we are going to ask you so in ETL testing so we are going to cover Oracle data warehousing Concepts Informatica and units these are the concepts according to the according to the three to four years of experience and other than that uh very very important thing is these are the questions which is asked in uh to my students in different different interviews so I have collected which is more important questions in terms of interview those questions only I'm going to ask you so let me introduce my uh student his name is Faruk okay he's having four plus years of experience in IT industry hi Faruk good good morning hi r hi good morning hi sir yeah uh let me go with the so I encourage you to ask any questions uh you may have through throughout the conversations before we We Begin do you have any initial questions for me no R thank you okay let me go with the questions okay uh let's go with the uh first question very common question tell me about yourself Faruk yeah hi RI uh I am Faruk having four plus years of experience inil testing and data warehouse testing currently I'm working at Rockwell Automation my textt consist majorly of ETL and data warehousing Concepts only in my project I have also got chance to work with Azure data components uh like Azu data Factory uh we are using Informatica as integration tool and our database will be Oracle I also know Unix as well uh regarding efforts and defects purpose we are using J and a devop tools uh so this is what uh I can brief about myself yeah uh can you please uh elaborate your project like data flow as well as your roles and responsibilities yeah sure actually uh coming to my roles and responsibilities uh primarily as I'm a e ETL tester uh so I will the main responsibility of me should be uh guaranteeing the integrity and quality of the data as it moves through the different layers I am having strong background in testing and validating the data my Prime responsibility would be ensuring the data is extracted accurately uh from Source systems and it is transferred to business requirements and and it is loaded into the target systems so I am very with writing the writing and executing the test cases performing data validation identifying reporting any data quality issues uh throughout my career I have collaborated closely with ETL developers business stakeholders to ensure the data is accurately transformed and loaded into the data warehouses enabling the accurate reporting and data analytics actually uh in our project We are following aile methodology where we have a sp specula Sprint cycle of two weeks uh usually our CSM or project product owner is going to create the user stories for a feature which we will have after after which we will be having a grooming on where product owner explains us the user stories uh here we can get any clarifications required according to the user stories uh if you have any doubts we can ask them uh we should have a clear understanding of all the requirements because we have to do the validations upon upon that uh post this we will be having a Sprint call where we have to give out the story points story points is nothing but the like uh uh the timing which we take to our effort which we take to complete those user stories like testing and all those things so we usually follow T-shirt size model uh giving the story points for giving the story points uh as a individual my work starts here I have to analyze the requirements create the test cases uh Tex test execution has to be done test test results has to be documented and then all the test test results which we have documented has to be uh reviewed from the team and then accepting of the user story is done and coming to the data flow in my project uh I have been working with the healthcare project of Ro uh in my project we are having mainly four layers Landing layer staging layer Foundation layer and access layer prior to this we have to do few pre-checks uh such as availability of the requirement document mapping document test case document which we create by analyzing the above uh then we start translating test cases into the SQL scripts actually so uh in landing layer we will dump all the tables from different sources we can call it as a henus sources uh where we have to check all the tables are required uh and also uh and we have to check all the load all the tables which we are which we require are loaded success sucessfully or not uh then all the entities will be loaded into the staging layer here we do uh actually here we do schema validation metadata validation data validation and constraint check is also done n validations Etc are also done then we have to run the workflows uh we have check we have to check if the workflow already running and test and test data availability also we have to check if everything looks fine then we run the job using the workflow manager uh and it will navigate us to the workflow after running the job in the workflow manager it will navigate to the workflow monitor here we have the job status if the job status fails we have to check the session logs and analyze the issue and raise the defects in the a devops tool uh we can after raising the defect we have to assign it to particular developer uh it will automatically send email to the notification to the assigned developer uh if everything is fine then the data is moved to the foundation layer uh with all the business Logics here we mainly focus on the data Integrity uh completeness and also let me stop you okay I understand completely okay you are giving detailed explanation uh let me jump into technical area okay so it's a good uh good uh thing which you have given all the introduction and in detail about the how the your project works as well as what are the tools and Technologies which you have used oh okay let me jump into the technical area like let let's go with the SQL okay uh first thing is let me share my screen okay let me share my screen and then uh we'll go ahead and we'll discuss so sure here okay let's say here we have a table one okay and then here we have a table two okay here in table one we have a column one okay here also column one so we have a we have a rows are 1 1 2 2 33 and n n and four and 1 2 3 n n and six so here I want you to solve okay so inner join okay inner join yeah okay and then left join yeah so left out join okay then we'll go with the left out join okay and then go with the right outer yeah right outer join and then full join nothing but a full outage okay so these are the things which I'm expecting from you okay let me uh give the access okay just give me one sec so you can request the access so that uh yeah once again yeah you can request the access I can provide you the access yeah I have I have asked yeah you can you can prepare this actually so uh this question is essential for even ETL testing so let's go let's let's go with this particular question yeah sure we start with the inner join yeah sure yeah uh here uh for inner join first of all we will be having only the matching columns okay far let's prepare the data set then we'll discuss about the solution you can explain after that that would be [Music] good sorry it is as it is remote it is making no ises take time yeah that's it uh actually here we are having 1 2 3 only are the matching columns between two the two tables and we are having in the left table as we I'm considering table one as the left table and table two as the right table so in the left table we are having uh 2 * 2 1 2 * two and 2 * 3 so we are getting this 1 1 2 2 3 3 as uh the inner joint output good please go ahead and then I will go with the left joint [Music] yeah uh left join will be just an extension to the inner join so I using the same things uh so left Jo will also be taking all the considering uh like uh all the all the matching records and also from the left table what are all the records which are not matched that will that will be coming as a combination with the null in the right table yeah go ahead with that here 13 is missed I think sorry we are having two threes there for yeah so here right outer joint uh opposite to the left outer joint uh here we will be matching the record with the right right table and also what are all the records left out we will be giving as the null okay uh so what is the combination of write out a choice can you please explain uh here uh T table to having we I'm taking table two as the right right table so one is matching with the one two is matching with the two three is matching with the three and null is match null I null doesn't match with anyone so I taking null to null and then null to null and then six it is not matching with any others so I'm taking as a null and four the it doesn't matches with any of the right table so I'm leaving out it as a it is a left left table okay see to me to me make sure right so whenever you are explaining left order right order right both are extension of inner joint which means whatever it is coming in a join right all the records which is going to be present in the uh in right join and left left join both even here also six records which is going to be present in the right join as well yeah so you have given 1 2 3 right that is incorrect okay so when I say incorrect in the sense it's kind of you know you whenever inner join right here also it it matches 1 equals to 1 1 equals to 1 okay two records which is going to come in the right joint 2al to 2 2 equals to two so two records so you have to copy this okay by default you can in joint okay yeah yeah so to make it clear I'm explaining this but in real time it won't happen they will just end okay so correct correct let's go here this is the output okay let's ignore the full outage let's go with this row number rank dense rank okay how it works so let's consider it I have arranged the data already okay th000 2,000 2,000 and 4,000 4,000 now I want you to be giving the row number rank rank can you please it's a ascending order okay which means it's smallest to largest yeah sure uh so as it is a ascending order so row number just directly gives the gives us the uh serial number simply uh so I am assigning it directly the serial number and then coming to rank uh rank will give the uh ranks according to the number and we as we are considering from lowest to highest so uh lowest will be the rank one uh and we are having 2,000 as two so it will give same rank to both the numbers but coming to the next next consecutive row it will take uh as the uh it will not take as it is rank it doesn't take the next consecutive number it will take the next number uh that is coming after this so as we have completed 1 22 instead of taking three it will go for the four and also for the next next record also we will be having four okay good yeah for the dense rank instead of this we will be having one two and two it will take the next consecutive number instead of skipping the number three and three so this is how DSE rank Rank and row number works okay good now I have one more question okay how to identify the duplicate records you can write the query here uh yeah we can write the duplicate query actually uh you can write here okay let me let me do this okay just let me match it okay you can write it up here yeah yeah let's say you have a employee table employee ID okay and uh employee name okay and yeah salary these are the three columns are there employee name is the one which employee ID is the one which is consider as a primary key here okay now you can write to identify the duplicate records yeah then here uh we can take the count of the employee ID okay uh where is it uh here it is right employ ID employe name and salary correct correct uh one minute for let me make it bigger so that it will be visible to others yeah actually for me it is uh remote right so it is taking bit time yeah yeah it's okay just give me yeah yeah you go ahead please yeah you can take control now yeah I have taken it is gone I think okay just you can request again or you can share your screen okay that would be good already we have done this exercise right you can share your screen and write it up that would be great because uh I'm seeing it's kind of know delay in the network so you can share your screen and we'll go further yeah one second I think now you are able to see my screen yeah yeah I can see your screen yeah please go ahead and uh it will be bit fast I think now yeah uh where is [Music] it [Music] that down okay yeah this will give us a count of star uh here count of star one greater than one gives us the uh Records which are having uh greater than one which gives the duplicate records okay good now okay let's go let's get into another area okay now uh I have uh email ID okay can you please provide me the domain name of it yeah uh email ID uh so domain name you want yeah uh yes so here I will take uh select email column I'm taking I'm considering email email email column so select email comma substring of so to to the audience okay uh let me give you can you please write one thing uh to the audience email name means Ram rindra mention that okay ram rindra ram rindra at gmail.com at gmail.com okay to answer this right so to make it the question should be correct so when I say domain name after at symbol gmail.com is the domain name okay so the expected output is gmail.com okay that is the expected you can write the query now yeah so to get to the uh ad theate point I am using in string you complete it and explain me okay make it faster bit yeah so uh here uh this query gives us the only gmail.com how it will go means uh so select sorry this will give email comma gmail.com so so this will copy this in one column and in another column it will give gmail.com so here I taking email so it will directly take the total uh email ID in the domain name it will take the uh substring email in string it will place up to the at the date and it will give us the uh gmail.com okay great okay let's jump into the other area okay so we have covered skl now I would ask you how to get the top 10 records from Unix file so let's get into Unix okay I have a requirement I wanted to get okay top 10 records in Unix I have a file ABC file. ABC file.txt I wanted to get top 10 records can you please help me out with the command yeah then I can use head command uh I will take n uh n will give you a number of Records which you which we require then I will I will type 10 and then I will I will give the file name uh I will give take it as file pH okay great okay now let's go with the other question okay suppose I wanted to have a I wanted I have a ravindra in a file I wanted to replace with ram okay for example let's say how can you do that yeah then I will go for S command uh s uh uh so you want ra ri to be replaced by Ram right yeah exactly yeah uh so in this way I can do that uh so in uh if uh if I have to like uh like I have to give the into the same file then I will directly give the file name file.txt or if I have to create another output with the same then I can give another file okay to correct that right if you wanted to make changes within the same file you have to make mention after SD right you have to mention iph I it is going to be sorry yeah I forgot yeah in place okay so let's go with the other other thing okay now uh can you get the wherever Sia is present consider it you have a file which is present thousand records okay in only two rows which present SAA okay so to make it to get that right which command you are going to use uh there I can use G command uh okay G uh so using GP command uh I will give the uh name uh in the hyon then uh I will give the file name okay that's good uh can you please to close this unit right can you please help me out how what are the commands which you have used can you give me some commments at least five to six commments yeah like I have used LS command to list out uh remove directory RM to remove the directory C command uh make directory command grip head tail a sort Etc Great okay let's get into Data weing okay uh let's go with the first thing very very important thing what is fact table and how it is what are the types of fact tables uh typically the tables having the measurable data are called fact tables and contains the foreign Keys generally uh for which the dimension tables are mapped uh add uh coming to the types of fact tables we are having additive fact TBL semi additive non additive cumulative uh snapshot fact tables factless fact tables and transactional fact tables uh in this if you want uh explanation about uh you can ask me any okay let's go with the additive facts okay yeah additive facts are the tables contain measures uh generally uh they can be aggregated across all the dimensions uh for example if you want to some any of the uh Dimensions uh so for example if you have sales or uh like a Prof we are having so we can we can do that by performing aggregations upon any any of the dimension tables okay to summarize it it works with the all the dimensions and you can calculate it key performance indicators across the indication across the dimensions so what is dimension actually or just go with the dimension yeah Dimension table uh generally has the measurable data actually uh where of the fact tables will be have sorry fact tables will have the measurable data and also Dimension tables will be mapping the data and giving the context of the fact table okay you mean uh descriptive information about that yeah correct correct correct okay got you then let's get into what is uh SD types of SDS uh SD uh we can abbreviate it as slowly changing Dimension uh is a concept of data warehousing that deals with uh uh deals with managing the changes in the dimension data uh actually the period of time the data changes generally if a if a new product comes in or if a new student joins we will be having the changing in that or a student is getting migrated to other other location then there will be uh changes in the dimension table so his location has to be updated so similarly we will be following a pattern called SD where we will be having different types of SDS uh we will be having uh SD type 1 SD type 2D type 3 and there are other forms of SD uh which I have less knowledge about so SD type one generally has data which doesn't have any historical records uh it will update the data whenever it uh a new record comes and it will override the older data uh then coming Tod type 2 it will have the all the historical data and it will be having the current data as the active flag having it is it will be having active flag and whatever the old data will be it will be marked as expired or inactive uh so it will create a new row whenever a new record is uh uh coming uh so if I am if we are having two uh number two employee having uh having working in Bangalore if he is shifted to Hyderabad then we will have another record with the number two as having shifted to uh from Bang from Bangalore to Hyderabad so location will be extra and uh there will be a active flag inactive flag coming uh to Side by and then we will be having ACD type 3 where we will be having a limited historical data and for every new record coming uh we will be having a uh generally we'll be having a partial historical data here we will be having another uh column beside the column uh which we are having ACD so in that it will be having the previous data okay great uh that's good okay uh let's go with the uh because of time limitation okay let me limit the things okay have you worked in Informatica uh when I say have you worked in Informatica so being a tester what level you have worked done okay can you please explain me okay how can you see the logs as well as suppose if if any one of the job is failing okay how can you navigate through Informatica yeah uh actually uh uh clearly stating uh I have worked in Informatica uh Informatica Power Center actually so using this uh we used to run the jobs so whenever uh we run the jobs we have to monitor those jobs as a tester so for running those jobs we first of all we will be connecting to the server by giving uh connecting to the repository by giving the server port and username and password uh so after logging into it uh after connecting to the repository uh in the uh we will be go we will be having a go to menu in that we will be going to the workflow monitor in the workflow monitor window uh so uh it will allow us to view and manage the workflows uh where we can see like uh what is the current status and also whether if it it has already run we will be having completed our failed status or uh successful status so uh for viewing the workflows uh to view the logs associated with the workflow F uh we have to like first uh uh right click on that and uh we will be having the view log so uh if you want to check the log we will be be choosing the workflow log there uh in the workflow log window a new window the workflow log will open uh this window displays detailed logs including session logs transformation lcks and error messages which whichever have occurred in the session logs uh if you click on the session logs we can see uh like uh in the workflow in the workflow log window uh we can see the specific sessions which which are which are all failing and also we can have the review review logs there if you click on the review logs for the errors that's a good thing okay so final question which I have okay uh yeah how did you write test cases in your project what basis you write the test cases and another important thing what are the tools which you are using for your defect and defects and effort strugging purpose yeah uh coming to the test cases I I I have been scripting is using the mapping document mainly uh so first of all uh we will be having a connect as I explained before uh the PO usually explains us the uh user stories based on the user stories we will be having mapping documents based on the mapping documents we have to scri basing on the mapping documents and also the business Logics which they provide in that uh so Bas based on these two things we will be creating the uh test case documents uh sorry test case document and the logic which we have to execute so through this we prepare the test cases and coming to the defects and efforts we have been using Z Tool uh in which we create the tickets uh through the uh Sprint basis uh we will be having epics and based on that we will be creating the stories okay uh thank you for it's a nice talking to you do uh do you have any questions for me uh no thank you okay really nice discussion with you and uh thank you so this is the end of the mock interview so uh thanks Faruk for joining with us and giving you uh time and uh as well as you know it's really nice discussion with you okay couple of things which I wanted to iterate it okay when I when I say here in mock interview right so uh whenever we are going ahead and we are going to discuss about the questions right so thing is actually he is very keen to listen and that is the reason he's able to answer all the questions and moreover whenever we are going ahead and we are going to discuss about skl right most of the people will struggle skl stuff when whenever we see scenario based questions okay so we have to clear with the basics if you are clear with the basics then uh whenever you have any question right related to SQL you will be able to answer easily you'll be able to answer easily see conceptual knowledge which you should have completely at the same time you have to practice as much as you can SQL SQL by looking it is easy but whenever you are translating from business requirement to or interview questions to technical ter then that would be the quality right the the important thing which you are going to face is the important issue which you are going to face is so how can you translate these Concepts into real time that is very very important by practice you can get it okay so uh thank you far thank you thanks for your time uh really nice talking to you thank you guys thank you thank you thank you and and one more important thing we are going to start new batch if you are willing to join please join with us thank you thank you have a great day