Orchestration Techniques in Microsoft Fabric

Mar 11, 2025

DP700 Exam Preparation Course: Orchestration in Microsoft Fabric

Overview

  • Focus on orchestration in Microsoft Fabric
  • Main orchestration tool: Data Pipelines
  • Patterns and use cases for orchestration
  • Using Fabric Notebooks for orchestration
  • Triggers in Microsoft Fabric

Data Pipelines

  • Purpose: Orchestrate and trigger execution of items within Microsoft Fabric
  • Activities:
    • Notebooks
    • Data Warehouse
    • KQL activities (run queries against a KQL database)
    • Spark Jobs
    • Semantic Model Refreshing
  • Invoke Pipeline Activity: Allows running another pipeline from within a pipeline
  • External Service Integration:
    • Trigger Azure Databricks Notebook runs
    • Trigger Azure Functions
    • Use Webhooks
  • Notifications: Use Office 365 and Teams activities for notifications

Copy Data Activity

  • Key for data ingestion in data pipelines
  • Connections:
    • Wide variety of sources, including Azure services, SQL Server databases, REST APIs
    • On-premises data via the On-Premises Data Gateway
  • Destinations:
    • Load into the Lakehouse Files area or external data stores such as Azure SQL Database
  • Activity Dependencies: Use condition flags (On Skip, On Success, On Failure, On Completion) to link activities
  • Active/Inactive Activities: Deactivate activities for debugging purposes

Orchestration Patterns

  • Metadata-Driven Pipelines: Avoid hardcoding to scale pipelines for larger data sets
    • Store connection details in a metadata table
    • Use Lookup Activity to read metadata and iterate using ForEach
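The Lookup + ForEach pattern can be sketched in plain Python. This is a conceptual stand-in only, not pipeline code: the metadata rows, column names, and the `copy_table` helper are hypothetical, playing the roles of the Lookup output, the metadata table, and a parameterized Copy Data activity.

```python
# Conceptual sketch of a metadata-driven pipeline (all names are hypothetical).
# In a real pipeline a Lookup activity reads this table and a ForEach
# activity iterates it; plain Python stands in for those activities here.

# Metadata table: one row per source to ingest (in practice stored in a
# Lakehouse or Warehouse table, not hardcoded).
metadata = [
    {"source_schema": "sales", "source_table": "orders",    "sink_path": "Files/raw/orders"},
    {"source_schema": "sales", "source_table": "customers", "sink_path": "Files/raw/customers"},
]

def copy_table(schema: str, table: str, sink: str) -> str:
    """Stand-in for a Copy Data activity whose source/sink are set
    via dynamic content (e.g. @item().source_table)."""
    return f"copied {schema}.{table} -> {sink}"

# The "ForEach" loop over the "Lookup" output.
results = [copy_table(row["source_schema"], row["source_table"], row["sink_path"])
           for row in metadata]
```

The point of the pattern is that adding a new source means adding a row to the metadata table, not editing the pipeline.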

Parent and Child Architecture

  • Invoke Pipeline Activity: Used to call child pipelines
  • Pipeline Parameters: Pass parameters from parent to child pipelines
  • Dynamic Content: Use dynamic parameters in child pipelines
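Parameter passing from parent to child can also be done programmatically by starting a pipeline run through the Fabric REST API's job scheduler endpoint. The endpoint shape and `executionData` payload below are assumptions based on that API; the workspace ID, pipeline ID, token, and parameter names are all placeholders.

```python
import json
from urllib import request

def run_child_pipeline(workspace_id: str, pipeline_id: str,
                       token: str, parameters: dict) -> request.Request:
    """Build (but do not send) a request that starts a pipeline run with
    parameters. Endpoint shape is an assumption; IDs/token are placeholders."""
    url = (f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}"
           f"/items/{pipeline_id}/jobs/instances?jobType=Pipeline")
    body = json.dumps({"executionData": {"parameters": parameters}}).encode()
    return request.Request(url, data=body, method="POST",
                           headers={"Authorization": f"Bearer {token}",
                                    "Content-Type": "application/json"})

# Hypothetical usage: pass a table name down to the child pipeline.
req = run_child_pipeline("ws-id", "pipe-id", "token",
                         {"source_table": "orders"})
```

Inside a pipeline, the Invoke Pipeline activity does this for you; the parameters map onto the child pipeline's declared parameters the same way.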

Notebook Orchestration

  • Use the Notebook Utils (`notebookutils`) library for orchestration
  • Running Notebooks:
    • Parallel execution: pass a Python list to `notebookutils.notebook.runMultiple`
    • Ordered execution: pass a Directed Acyclic Graph (DAG) definition instead
  • DAG Structure:
    • An `activities` list; each activity has a notebook path and its dependencies
    • Validate the DAG with `notebookutils.notebook.validateDAG`
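A DAG for `runMultiple` is a plain Python dict. The sketch below uses made-up notebook names, and the top-level `timeoutInSeconds`/`concurrency` settings are my reading of the `notebookutils` docs; since `notebookutils` only exists inside a Fabric notebook, the run/validate calls are shown as comments.

```python
# Hypothetical DAG for notebookutils.notebook.runMultiple
# (notebook names NB_* are placeholders).
dag = {
    "activities": [
        {"name": "ingest",    "path": "NB_Ingest"},    # no dependencies: runs first
        {"name": "transform", "path": "NB_Transform",
         "dependencies": ["ingest"]},                  # waits for ingest
        {"name": "load",      "path": "NB_Load",
         "dependencies": ["transform"]},               # waits for transform
    ],
    "timeoutInSeconds": 3600,  # overall timeout for the whole DAG
    "concurrency": 2,          # max notebooks running at once
}

# Inside a Fabric notebook:
# notebookutils.notebook.validateDAG(dag)              # check structure first
# notebookutils.notebook.runMultiple(dag)              # ordered execution
# notebookutils.notebook.runMultiple(["NB_A", "NB_B"]) # plain list = parallel
```

Every name listed in a `dependencies` entry must match the `name` of another activity in the list, which is the kind of error `validateDAG` catches before anything runs.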

Triggers in Microsoft Fabric

  • Schedule Trigger: Set start/end times and time zone for scheduled runs
  • Event-Based Triggers: Trigger based on Azure Blob Storage events
  • Real-Time Hub: Preview feature for real-time event triggers
    • Job events, OneLake events, Workspace item events

Semantic Model Triggers

  • Auto Refresh: Keep Direct Lake data up to date automatically
  • Scheduled Refresh: Use data pipeline for synchronized refresh
  • Semantic Link: Use the `refresh_dataset` function in SemPy (`sempy.fabric`) for programmatic refreshes

Conclusion

  • Overview of orchestration and triggers in Microsoft Fabric
  • Next video will cover data ingestion and data stores