Use the Notebook activity to run notebooks you create in Microsoft Fabric as part of your Data Factory pipelines. Notebooks let you run Apache Spark jobs to bring in, clean up, or transform your data as part of your data workflows. It’s easy to add a Notebook activity to your data pipelines in Fabric, and this guide walks you through each step.
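For context, the kind of work a Notebook activity runs is a short Spark job. Here is a minimal, hypothetical PySpark sketch of a cleanup step a notebook might execute; the table names (raw_sales, clean_sales) and column names are assumptions, and the spark session object is predefined in Fabric notebooks:

```python
# Minimal cleanup sketch for a Fabric notebook cell. The `spark` session is
# predefined in Fabric notebooks; the table and column names are hypothetical.
from pyspark.sql import functions as F

df = spark.read.table("raw_sales")           # hypothetical Lakehouse table
cleaned = (
    df.dropDuplicates(["order_id"])          # remove duplicate orders
      .withColumn("amount", F.col("amount").cast("double"))
      .filter(F.col("amount") > 0)           # drop invalid rows
)
cleaned.write.mode("overwrite").saveAsTable("clean_sales")
```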
Prerequisites
To get started, you must complete the following prerequisites:
- A Microsoft Fabric tenant account with an active subscription. Create an account for free.
- A workspace.
- A notebook in your workspace. To create a new notebook, see How to create Microsoft Fabric notebooks.
Create a notebook activity
1. Create a new pipeline in your workspace.
2. Search for Notebook in the pipeline Activities pane, and select it to add it to the pipeline canvas.
3. Select the new Notebook activity on the canvas if it isn't already selected.
4. Refer to the General settings guidance to configure the General settings tab.
Configure notebook settings
Select the Settings tab, select an existing notebook from the Notebook dropdown, and optionally specify any parameters to pass to the notebook.
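If you pass parameters from the pipeline, the notebook needs a cell that defines matching defaults and is marked as a parameter cell (using the cell's Toggle parameter cell option). A minimal sketch, assuming hypothetical parameters named table_name and run_date:

```python
# Parameter cell: mark this cell as a parameter cell in the notebook.
# The Notebook activity overrides these defaults at run time; the names
# (table_name, run_date) are hypothetical and must match the parameters
# configured on the activity's Settings tab.
table_name = "sales"      # default used when the notebook runs standalone
run_date = "2024-01-01"
```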
Set session tag
To minimize the time it takes to execute your notebook job, you can optionally set a session tag. Setting a session tag instructs Spark to reuse an existing Spark session, reducing startup time. Any arbitrary string value can be used for the tag; if no session with that tag exists, a new one is created.
Note
To use the session tag, the High concurrency mode for pipeline running multiple notebooks option must be turned on. You can find this option under High concurrency mode in the Spark settings of the Workspace settings.
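One way to check whether two Notebook activities tagged with the same value actually shared a session is to print the Spark application ID from each notebook; a minimal sketch (the spark session object is predefined in Fabric notebooks):

```python
# If both notebooks ran in the same high-concurrency session, they print
# the same application ID; different IDs mean separate sessions were used.
print(spark.sparkContext.applicationId)
```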
Save and run or schedule the pipeline
Switch to the Home tab at the top of the pipeline editor and select Save to save your pipeline. Select Run to run it immediately, or Schedule to schedule it. You can also view the run history or configure other settings from here.