Tutorial: Ingest data into a Warehouse

Applies to: ✅ Warehouse in Microsoft Fabric

In this tutorial, learn how to ingest data from Microsoft Azure Storage into a Warehouse to create tables.

Note

This tutorial forms part of an end-to-end scenario. To complete this tutorial, you must first complete these tutorials:

  1. Create a workspace
  2. Create a Warehouse

Ingest data

In this task, learn how to ingest data into the warehouse to create tables.

  1. Ensure that the workspace you created in the first tutorial is open.

  2. In the workspace landing pane, select + New Item to display the full list of available item types.

  3. From the list, in the Get data section, select the Data pipeline item type.

  4. In the New pipeline window, in the Name box, enter Load Customer Data.

    Screenshot of the New pipeline dialog, highlighting the entered name.

  5. To provision the pipeline, select Create. Provisioning is complete when the Build a data pipeline landing page appears.

  6. On the data pipeline landing page, select Pipeline activity.

    Screenshot of the Build a data pipeline landing page, highlighting the Pipeline activity option.

  7. In the menu, from inside the Move and transform section, select Copy data.

    Screenshot of the Move and transform section, showing where to select Copy data.

  8. On the pipeline design canvas, select the Copy data activity.

    Screenshot of the Copy data located on the design canvas.

  9. To set up the activity, on the General page, in the Name box, replace the default text with CD Load dimension_customer.

    Screenshot of the General tab, showing where to enter the copy activity name.

  10. On the Source page, in the Connection dropdown, select More to reveal all of the available data sources, including data sources in the OneLake catalog.

  11. Select + New to create a new data source.

  12. Search for, and then select, Azure Blobs.

  13. On the Connect data source page, in the Account name or URL box, enter https://fabrictutorialdata.blob.core.windows.net/sampledata/.

  14. Notice that the Connection name dropdown is automatically populated and that the authentication kind is set to Anonymous.

    Screenshot of the Connect to data source window showing all settings done.

  15. Select Connect.

  16. On the Source page, to access the Parquet files in the data source, complete the following settings:

    1. In the File path boxes, enter:

      1. File path - Container: sampledata

      2. File path - Directory: WideWorldImportersDW/tables

      3. File path - File name: dimension_customer.parquet

    2. In the File format dropdown, select Parquet.

  17. To preview the data and test that there are no errors, select Preview data.

    Screenshot of the Source page, highlighting the changes made in the previous steps, and the Preview data function.

  18. On the Destination page, in the Connection dropdown, select the Wide World Importers warehouse.

  19. For Table option, select Auto create table.

  20. In the first Table box, enter dbo.

  21. In the second box, enter dimension_customer.

    Screenshot of the Destination page, highlighting the changes made in the previous steps.
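Tip

The copy activity you configured in the previous steps can also be expressed in T-SQL with a `COPY INTO` statement, run from the warehouse's SQL query editor. The following is a sketch only: unlike the pipeline's Auto create table option, `COPY INTO` requires the target table to already exist in the warehouse, and the file URL below is simply the container, directory, and file name from the earlier steps joined into one path.

```sql
-- Sketch: load the Parquet file directly into an existing table.
-- Assumes dbo.dimension_customer has already been created with a matching schema.
COPY INTO dbo.dimension_customer
FROM 'https://fabrictutorialdata.blob.core.windows.net/sampledata/WideWorldImportersDW/tables/dimension_customer.parquet'
WITH (
    FILE_TYPE = 'PARQUET'  -- matches the File format selected on the Source page
);
```

A pipeline remains the better choice when you want monitoring, scheduling, or automatic table creation; `COPY INTO` suits ad hoc or scripted loads.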

  22. On the Home ribbon, select Run.

  23. In the Save and run? dialog, select Save and run to have the pipeline load the dimension_customer table.

    Screenshot of the Save and run dialog, highlighting the Save and run button.

  24. To monitor the progress of the copy activity, review the pipeline run activities in the Output page (wait for it to complete with a Succeeded status).

    Screenshot of the Output page, highlighting the Succeeded status.
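  25. Optionally, to verify that the load succeeded, you can query the new table from the warehouse's SQL query editor. This is an illustrative check, not part of the pipeline itself:

```sql
-- Confirm the pipeline created and populated the table.
SELECT COUNT(*) AS row_count
FROM dbo.dimension_customer;
```

A nonzero row count indicates the copy activity wrote data to the table.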

Next step