
Azure Databricks Notebooks Overview

Jul 10, 2025

Overview

This lecture provides a detailed walkthrough of working with notebooks in Azure Databricks, covering creation, import/export, permissions, and language usage.

Navigating the Databricks Workspace

  • Access the Databricks workspace from the Azure portal after deployment.
  • The workspace overview page shows details such as the pricing tier and workspace URL, and provides a delete option for removing the workspace.
  • Always delete unused resources to avoid unnecessary billing.

Working with Notebooks

  • Notebooks are the main interface for coding and data analysis in Databricks.
  • Create new notebooks by selecting a default language (Python, Scala, SQL, or R) and attaching a cluster.
  • Notebooks are organized under users and shared folders within the workspace.

Importing & Exporting Notebooks

  • Import notebooks from files (formats: .py, .scala, .sql, .r, .dbc, .ipynb, .html) or by URL.
  • Export notebooks as a source file, DBC archive, IPython (Jupyter) notebook, or HTML.
  • Use export to download notebooks for sharing or backup (a hedged API sketch follows this list).
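
The same export can be scripted against the Workspace API. A minimal sketch, assuming a personal access token and a hypothetical workspace URL and notebook path (replace all three with your own values):

```python
import base64
import requests

# Hypothetical values for illustration only.
HOST = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "<personal-access-token>"
NOTEBOOK_PATH = "/Users/me@example.com/MyNotebook"

# Export the notebook as a Jupyter (.ipynb) file via the Workspace export API.
resp = requests.get(
    f"{HOST}/api/2.0/workspace/export",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"path": NOTEBOOK_PATH, "format": "JUPYTER"},  # SOURCE, HTML, or DBC also work
)
resp.raise_for_status()

# The API returns the notebook content base64-encoded.
with open("MyNotebook.ipynb", "wb") as f:
    f.write(base64.b64decode(resp.json()["content"]))
```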

Managing Notebooks and Folders

  • Options include: create, import, export, rename, move, clone (duplicate), and move to trash (delete).
  • The Copy File Path option copies the notebook's path within the workspace, which is handy when referencing it from other notebooks (see the %run sketch after this list).
  • Organize notebooks into folders for structured collaboration.
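
A copied path can be pasted into a %run cell to pull another notebook's definitions into the current one. A minimal sketch with a made-up relative path (the %run magic must be the only content of its cell; relative ./ paths and absolute /Users/... paths both work):

```python
%run ./utils/shared_helpers
```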

Permissions & Collaboration

  • Set permissions at the folder or notebook level to control access for users or groups (Can Read, Can Run, Can Edit, Can Manage).
  • Useful for managing projects with multiple team members (see the sketch below for setting these programmatically).
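
The same permission levels can be assigned through the Permissions API. A minimal sketch, assuming a personal access token, a hypothetical workspace URL, a hypothetical user email, and the notebook's numeric object ID:

```python
import requests

# Hypothetical values for illustration only.
HOST = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "<personal-access-token>"
NOTEBOOK_ID = "123456789"  # numeric object ID of the notebook

# Grant a teammate Can Run on this one notebook.
resp = requests.patch(
    f"{HOST}/api/2.0/permissions/notebooks/{NOTEBOOK_ID}",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "access_control_list": [
            {"user_name": "teammate@example.com", "permission_level": "CAN_RUN"}
        ]
    },
)
resp.raise_for_status()
print(resp.json())
```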

Using Multiple Languages & Magic Commands

  • The default notebook language is chosen at creation; it can be changed later for the whole notebook or overridden in individual cells.
  • Use magic commands (e.g., %sql, %scala, %python, %r) on the first line of a cell to execute that cell in a different language (see the sketch below).
  • It is recommended to keep notebooks primarily in one language for readability.
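
A minimal sketch of how this looks in practice (the DataFrame and the temp view name numbers are made up for illustration): a Python-default notebook hands data to a SQL cell via a temp view.

```python
# Cell 1 – runs in the notebook's default language (Python)
df = spark.range(5).toDF("n")          # tiny example DataFrame
df.createOrReplaceTempView("numbers")  # expose it to SQL as a temp view
```

```sql
%sql
-- Cell 2 – the %sql magic on the first line switches only this cell to SQL
SELECT n, n * n AS n_squared FROM numbers
```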

Notebook Cells & Execution

  • Code is written in cells; each cell holds a segment of code that can be run independently.
  • Add new cells using the plus button.
  • Run cells individually (Shift+Enter) or run all cells using the 'Run All' option.
  • Output is displayed below the cell; for tabular results, only the first 1,000 rows are rendered by default (see the sketch below).
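
A minimal sketch of a single cell and its output (the column name value is made up for illustration): display() renders the result as a table below the cell, and only the first 1,000 rows appear by default even though the DataFrame has 2,000.

```python
# One cell: build a small DataFrame and render it below the cell
df = spark.range(1, 2001).toDF("value")
display(df)  # tabular output; only the first 1,000 of the 2,000 rows are shown by default
```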

Other Features

  • Use dark or light themes for the notebook interface.
  • Clear output removes displayed results; clear state resets the notebook's variables and execution state.
  • Add comments to code using a hash (#) in Python.

Key Terms & Definitions

  • Notebook — Interactive document in Databricks for code, output, and visualizations.
  • Magic Command — Special notation (e.g., %sql) to specify language for a cell.
  • Cell — Section in a notebook to write and run code.
  • Cluster — Group of compute resources for running notebook commands.
  • Permissions — Access controls for users/groups to view/edit/run notebooks.

Action Items / Next Steps

  • Practice creating, importing, exporting, and organizing notebooks in your Databricks workspace.
  • Ensure unused resources are deleted to avoid extra costs.
  • Experiment with magic commands and multi-language support in notebooks.