Quickstart: Create your first pipeline to copy data

In this quickstart, you build a data pipeline that moves a sample dataset into a Lakehouse. It's a short demonstration of how to use the pipeline Copy activity to load data into a Lakehouse.

Prerequisites

To get started, you must complete the following prerequisites:

Create a data pipeline

  1. Navigate to Power BI.

  2. Select the Power BI icon in the bottom left of the screen, then select Data factory to open the Data Factory homepage.

  3. Navigate to your Microsoft Fabric workspace. If you created a new workspace in the Prerequisites section, use that one.

    Screenshot of the workspaces window where you navigate to your workspace.

  4. Select Data pipeline, then enter a pipeline name to create a new pipeline.

    Screenshot showing the new data pipeline button in the newly created workspace.

    Screenshot showing the name being entered for the new pipeline.

Copy data using pipeline

In this section, you build your first pipeline by following the steps below to copy a sample dataset provided by the pipeline into a Lakehouse.

Step 1: Start with the Copy data assistant

  1. On the canvas, select Copy data assistant to open the Copy assistant tool.

    Screenshot showing the Copy data button.

Step 2: Configure your source

  1. Choose the Sample data tab at the top of the data source browser page, select the Public Holidays sample data, and then select Next.

    Screenshot showing the Choose data source page of the Copy data assistant with the Public Holidays sample data selected.

  2. On the Connect to data source page of the assistant, review the preview of the Public Holidays sample data, and then select Next.

    Screenshot showing the preview of the Public Holidays sample data.

Step 3: Configure your destination

  1. Select Lakehouse and then Next.

    Screenshot showing the selection of the Lakehouse destination in the Copy data assistant.

  2. Enter a Lakehouse name, then select Create and connect.

    Screenshot showing the Create new Lakehouse button selected on the Choose data destination page of the Copy data assistant.

  3. Configure and map your source data to the destination Lakehouse table. Select Tables for the Root folder and Load to a new table for Load settings. Provide a Table name and select Next.

    Screenshot showing the Connect to data destination page of the Copy data assistant with Tables selected and a table name for the sample data provided.

Step 4: Review and create your copy activity

  1. Review the copy activity settings from the previous steps and select Save + run to finish. If needed, you can return to the previous steps in the tool to edit your settings. To save the pipeline without running it, clear the Start data transfer immediately checkbox.

    Screenshot of the Review + create page of the Copy data assistant highlighting source and destination.

  2. The Copy activity is added to your new data pipeline canvas. When you select the created Copy data activity, all of its settings, including advanced settings, are available in the tabs below the pipeline canvas.

    Screenshot showing the completed Copy activity with the Copy activity settings tabs highlighted.
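
The choices made in Steps 2 through 4 can be summarized as a small settings record. The following Python sketch is only an illustration of those choices: the CopySettings class, its field names, and the Lakehouse and table names are hypothetical and are not part of any Microsoft Fabric API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CopySettings:
    """Choices made in the Copy data assistant.

    Hypothetical structure for illustration only; not a Fabric API.
    """
    source: str
    destination_type: str
    lakehouse_name: str
    root_folder: str
    load_setting: str
    table_name: str
    start_immediately: bool = True  # mirrors the "Start data transfer immediately" checkbox

    def problems(self) -> list[str]:
        """Return human-readable problems; an empty list means nothing is missing."""
        issues = []
        if not self.lakehouse_name.strip():
            issues.append("Lakehouse name is empty")
        if not self.table_name.strip():
            issues.append("Table name is empty")
        if self.root_folder != "Tables":
            issues.append("This quickstart selects Tables as the root folder")
        return issues


settings = CopySettings(
    source="Public Holidays",
    destination_type="Lakehouse",
    lakehouse_name="MyLakehouse",      # assumed name, chosen for the example
    root_folder="Tables",
    load_setting="Load to a new table",
    table_name="public_holidays",      # assumed name, chosen for the example
)
print(settings.problems())  # prints []
```

Writing the selections down this way can make it easier to compare them against the Review page before you select Save + run.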

Run and schedule your data pipeline

  1. If you didn't select Save + run on the Review + save page of the Copy data assistant, switch to the Home tab and select Run. In the confirmation dialog that appears, select Save and run to start the activity.

    Screenshot showing the Run button on the Home tab, and the Save and run prompt displayed.

  2. You can monitor the run and check the results on the Output tab below the pipeline canvas. Select the link for the activity name in the output to view the run details.

    Screenshot showing the Output tab of the pipeline run in-progress with the Details button highlighted in the run status.

  3. The run details show how much data was read and written, along with other details about the run.

    Screenshot showing the run details window.

  4. You can also schedule the pipeline to run at a specific frequency. The following example shows a schedule set to run the pipeline every 15 minutes.

    Screenshot showing the schedule dialog for the pipeline with a 15-minute recurring schedule.
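
To make the 15-minute recurrence concrete, the following Python sketch lists the run times that a fixed-interval schedule would produce. It is a minimal illustration, not how the scheduler is implemented: the next_run_times helper is hypothetical, the start time is made up, and it assumes the first run happens at the start time itself.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def next_run_times(start: datetime, interval_minutes: int, count: int) -> list[datetime]:
    """Return the first `count` run times of a fixed-interval schedule.

    Hypothetical helper for illustration; it is not part of any Fabric API.
    Assumes the first run occurs exactly at `start`.
    """
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    if count < 0:
        raise ValueError("count must not be negative")
    step = timedelta(minutes=interval_minutes)
    return [start + i * step for i in range(count)]


if __name__ == "__main__":
    # Assumed start time, chosen only for the example.
    start = datetime(2024, 1, 1, 9, 0)
    for run in next_run_times(start, interval_minutes=15, count=4):
        print(run.isoformat())
```

With the assumed start time of 09:00, the sketch prints four run times spaced 15 minutes apart, ending at 09:45.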

The pipeline in this sample shows you how to copy sample data to a Lakehouse. You learned how to:

  • Create a data pipeline.
  • Copy data with the Copy data assistant.
  • Run and schedule your data pipeline.

Next, advance to learn more about monitoring your pipeline runs.