📚

Mock Interview on ETL Testing with Ravindra

Jul 11, 2024

Mock Interview on ETL Testing with Ravindra

Trainer: Ravindra

Background of Ravindra:

  • Trainer at SR Technologies
  • 7.8 years of experience in IT
    • 3.5 years in ETL testing
    • Data engineer afterwards

Agenda:

  • Mock interview in ETL Testing
  • Interview duration: ~30-40 minutes
  • Focus areas: Oracle Data Warehousing, Informatica, Unix
  • Questions based on commonly asked interview questions

Introduction of Interviewee:

  • Interviewee: Faruk
  • 4+ years of experience in IT
  • Current role: ETL Tester at Rockwell Automation
  • Experience: ETL and Data Warehousing, Azure Data Factory, Informatica, Oracle, Unix

Interview Questions & Responses:

General Questions:

  1. **Tell me about yourself: (Faruk) **

    • 4+ years in ETL Testing & Data Warehousing
    • Knowledge in Azure Data Factory, Informatica, Oracle
    • Familiar with Unix and project management tools like JIRA, DevOps
  2. **Project Roles and Responsibilities: (Faruk) **

    • Ensuring data quality and integrity as it moves through layers
    • Writing and executing test cases, data validation, identifying & reporting data quality issues
    • Collaborating with ETL developers and business stakeholders
    • Working in Agile methodology, handling sprints, assigning user stories, providing story points
    • Handling Landing, Staging, Foundation, and Access layers

Technical Questions:

SQL Queries:

  1. SQL Joins

    • Inner Join: Matching columns from both tables
    • Left Join: Matching columns + all records from left table
    • Right Join: Matching columns + all records from right table
    • Full Join: All records from both tables
  2. Row Number, Rank, Dense Rank:

    • Row Number: Sequential numbering
    • Rank: Gaps in numbering for duplicate values
    • Dense Rank: Sequential numbering without gaps for duplicates
  3. Identify Duplicate Records

    SELECT employe_id, employe_name, salary FROM employee GROUP BY employe_id, employe_name, salary HAVING COUNT(*) > 1;
  4. Extract Domain from Email

    SELECT email, SUBSTRING(email, instr(email, '@') + 1) AS domain_name FROM table_name;

Unix Commands:

  1. Top 10 records: head -n 10 abc.txt
  2. Replace Text with 'sed': sed -i 's/ravindra/ram/g' file.txt
  3. Find lines containing 'SAA': grep 'SAA' file.txt
  4. Common Unix Commands:
    • ls, rm, mkdir, grep, head, tail, sort

Data Warehousing:

  1. Fact Table:

    • Tables with measurable data, usually containing foreign keys
    • Types: Additive, Semi-Additive, Non-Additive, Cumulative, Snapshot, Factless, Transactional
  2. Slowly Changing Dimensions (SCD):

    • SCD1: No historical data, overwrites data
    • SCD2: Historical data with active flags
    • SCD3: Limited historical data with additional columns
  3. Dimension Table:

    • Descriptive information related to fact tables

Informatica:

  • Usage: Running and monitoring jobs, navigation to log files, error analysis
  • Logs: Right-click -> View Log to see detailed session, transformation logs

Test Cases & Tools:

  1. **Test Case Creation: (Faruk) **
    • Based on user stories and mapping documents
    • Writing scripts, validation based on business logic
  2. **Defect & Effort Tracking: (Faruk) **
    • Tools: JIRA for defect management, tracking efforts

Conclusion

  • Importance of conceptual knowledge and practice to translate business requirements to technical solutions.
  • Encouragement to join the new batch for further learning.

End of Mock Interview.