Power Query Workshop Recap

Jul 15, 2024

Power Query Workshop Recap

Introductions and Setup

  • Participants from various global locations
  • Questions posed:
    • Where are you joining from?
    • Do you use Power Query via UI or write M code?

Key Questions from the Audience

  • Mixed responses: Some use UI, some M code, some use both
  • Pertinent to go slow addressing both UI and M code users
  • Complexity of today's session, ensuring clarity

Problem Statement

  • Cleaning and restructuring data in Power Query
  • Given data with two columns: Transaction Number and Payment Column
  • Goal: Create multiple calculated columns (Total, Past Due, Discount, etc.)
  • Complex last transaction data including additional columns
  • Aimed to automate process rather than manually delimiting columns

Solution Approach

General Steps

  1. Group Data by Transaction Number
    • Combined individual rows into a single row by transaction number
  2. Extract Position of Column Labels
    • Used list.accumulate to find positions of columns within text
    • Used Excel to list column headers for reference
  3. Insert Delimiter at Column Positions
    • Insert unique delimiter (e.g., ||) at identified positions
  4. Split Data Based on Delimiter
    • Split data into multiple columns using Power Query UI option
  5. Clean Up Column Values
    • Removed header labels and any other junk characters from data

Detailed Execution

  1. Grouping Data
    • Used Table.Group to consolidate rows by transaction number
    • Concatenated text entries into single line for each transaction
  2. Finding Column Positions
    • Utilized Text.PositionOf with list.accumulate
    • Created list of column headers to dynamically find positions
    • Cleaned up list to remove zeroes and -1 values (nonexistent columns)
  3. Inserting Delimiters
    • Applied Text.Insert method within list.accumulate to not overwrite processed text
    • Adjusted positions to account for newly added delimiters
  4. Splitting Based on Delimiters
    • Utilized Power Query's 'Split Column by Delimiter' feature
    • Ensured columns were named dynamically using list of column headers
  5. Cleaning Column Data
    • Implemented Table.TransformColumns and nested List.Transform
    • List of possible junk characters dynamically applied to clean all required columns
    • Utilized text.replace to remove unnecessary text and junk characters

Questions from Audience

  • Covered questions related to providing depth of explanations in M code
  • Addressed efficient methodologies for querying and transforming large datasets
  • Shared practical tips on learning sequence for Power BI, emphasizing data modeling, Power Query, and DAX

Final Recommendations and Insights

  • Understand function inputs and outputs while learning M code
  • Focus on effective data modeling as primary skill
  • Suggested practical approaches to problem-solving in Power Query

Additional Information

  • Provided personal insights into transitioning career, staying motivated, and continuous learning
  • Answered queries on Power BI modeling, handling large datasets, and dynamic list management

Thank you for participation!