Connect to Azure Data Lake Storage

Connecting to Microsoft Azure Data Lake Storage makes your finance and operations data available for analytics and reporting at scale. Azure Data Lake Storage organizes data by using the Common Data Model (CDM) metadata system in an Azure storage account.

Using Azure Data Lake Storage has advantages compared to using a bring your own database (BYOD) system, including:

  • Data is already present, so you don’t need to conduct an export. Azure Data Lake Storage integration manages continuous data export.
  • The cost of data storage is reduced.
  • You can reuse existing downstream and consumption pipelines after the export process occurs.

Azure Data Lake Storage stores data as comma-separated values (CSV) files in an Azure storage account.
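
Because the exported data is stored as CSV files with CDM metadata alongside them, you can browse it with any tool that reads from an Azure storage account. The following sketch assumes the azure-identity and azure-storage-filedatalake Python packages; the account, container, and folder names are placeholders, and your actual folder layout depends on the export configuration.

```python
# A minimal sketch of listing the exported CSV files in the data lake.
# The account, container, and folder names below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

ACCOUNT_URL = "https://<storageaccount>.dfs.core.windows.net"  # placeholder
CONTAINER = "<container>"                                      # placeholder

service = DataLakeServiceClient(account_url=ACCOUNT_URL,
                                credential=DefaultAzureCredential())
filesystem = service.get_file_system_client(CONTAINER)

# Entity data lands as CSV files; CDM metadata (*.cdm.json) sits alongside it.
for path in filesystem.get_paths(path="<folder>", recursive=True):  # placeholder folder
    if path.name.endswith(".csv"):
        print(path.name)
```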

Currently, the export to Azure Data Lake Storage feature isn't available in Tier-1 (developer) environments. You need to have a cloud-based, Tier-2 or higher sandbox environment to enable this feature. However, you can use a prototype from FastTrack to enable Azure Data Lake Storage for Tier-1 (developer) environments.

Connect a cloud-hosted development environment to Azure Data Lake Storage in CDM format

GitHub provides a solution for connecting a cloud-hosted development environment to Azure Data Lake Storage by using the CDM format, a standardized metadata system that you can use to integrate data between systems.

Unlike the standard export to Azure Data Lake Storage, where you select the entities to export, the FastTrack prototype conducts a full export.

A benefit of using export to Azure Data Lake Storage is that the system exports entities in near real time, and you can export denormalized tables.

To use the FastTrack prototype that’s on GitHub, you must meet several prerequisites, including:

  • An Azure subscription.
  • An Azure storage account.
  • A Microsoft Azure Synapse Analytics workspace.
  • Connection to a SQL on-demand endpoint by using a supported tool (a connection sketch follows this list).
  • Access to Microsoft Azure Data Factory.
  • Microsoft Visual Studio 2019, to build the export project.
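
For the SQL on-demand prerequisite, any supported client works. As one example, the following sketch assumes pyodbc with ODBC Driver 17 for SQL Server; the endpoint, database, credentials, and view names are placeholders.

```python
# A minimal sketch of connecting to a Synapse SQL on-demand (serverless)
# endpoint with pyodbc. All names and credentials below are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>-ondemand.sql.azuresynapse.net;"  # placeholder endpoint
    "DATABASE=<database>;"                               # placeholder
    "UID=<user>;PWD=<password>"                          # placeholder credentials
)

cursor = conn.cursor()
cursor.execute("SELECT TOP 10 * FROM <schema>.<view>")   # placeholder view
for row in cursor.fetchall():
    print(row)
conn.close()
```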

When you’re ready to deploy, complete the following steps:

  1. Clone the repository from GitHub.
  2. Open the project in Microsoft Visual Studio 2019 and build it.
  3. Deploy it as an Azure Function.
  4. Enable a managed service identity (MSI) for the Azure Function.
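
The export project itself is built in Visual Studio, but the idea behind step 4 is worth illustrating: with a managed service identity enabled, code that runs in the Azure Function can authenticate to Azure resources without stored secrets. The following Python sketch is illustrative only and assumes the azure-identity and azure-storage-blob packages; the storage URL and container name are placeholders.

```python
# Illustrative only: with a managed identity enabled, code running in the
# Azure Function can authenticate to storage without connection strings.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()  # picks up the function's managed identity
blob_service = BlobServiceClient(
    account_url="https://<storageaccount>.blob.core.windows.net",  # placeholder
    credential=credential,
)
container = blob_service.get_container_client("<container>")       # placeholder
container.upload_blob("healthcheck.txt", b"ok", overwrite=True)
```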

To prepare an Azure storage account, Azure Synapse Analytics, and Azure Data Factory for deployment, complete the following steps:

  1. Set up the storage account.
  2. Set up Azure Synapse Analytics and SQL on-demand.
  3. Collect Azure Data Factory deployment parameters.
  4. Deploy the Azure Data Factory template from the cloned repository on GitHub.
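
If you prefer to script step 4 instead of deploying the template through the portal, the following sketch shows one way to deploy an ARM template with the azure-mgmt-resource Python package. The subscription, resource group, template path, and parameter names are placeholders; the actual parameter names depend on the template in the cloned repository.

```python
# A minimal sketch of deploying the Azure Data Factory ARM template from the
# cloned repository. All names and the template path are placeholders.
import json
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.resource.resources.models import (
    Deployment, DeploymentMode, DeploymentProperties)

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
RESOURCE_GROUP = "<resource-group>"     # placeholder

client = ResourceManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

with open("<path-to-cloned-repo>/arm_template.json") as f:  # placeholder path
    template = json.load(f)

# The parameters collected in step 3 go here; names depend on the template.
parameters = {"factoryName": {"value": "<factory-name>"}}   # placeholder

poller = client.deployments.begin_create_or_update(
    RESOURCE_GROUP,
    "adf-export-deployment",
    Deployment(properties=DeploymentProperties(
        mode=DeploymentMode.INCREMENTAL,
        template=template,
        parameters=parameters,
    )),
)
print(poller.result().properties.provisioning_state)
```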

To connect to the finance and operations cloud-hosted development environment, make sure that you complete the following actions:

  1. Use Microsoft Dynamics 365 Lifecycle Services to open the Environment details page for your environment. You can find the SQL Server sign-in information in the Manage environment section.
  2. Connect Azure Data Factory to the finance and operations cloud-hosted environment by creating a self-hosted integration runtime.
  3. After you deploy the Azure Data Factory template, go to Azure Data Factory, and run the pipelines. Notice that the data structure and metadata are built in CDM format. You can also create reports by using this format and any reporting and business intelligence (BI) tools.
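
You can also trigger and monitor the pipelines programmatically instead of from the portal. The following sketch assumes the azure-mgmt-datafactory Python package; the subscription, resource group, factory, and pipeline names are placeholders.

```python
# A minimal sketch of triggering one of the deployed pipelines and polling
# its status. Resource names and the pipeline name are placeholders.
import time
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
RESOURCE_GROUP = "<resource-group>"     # placeholder
FACTORY_NAME = "<factory-name>"         # placeholder

adf = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

run = adf.pipelines.create_run(RESOURCE_GROUP, FACTORY_NAME,
                               "<pipeline-name>")  # placeholder pipeline
while True:
    status = adf.pipeline_runs.get(RESOURCE_GROUP, FACTORY_NAME,
                                   run.run_id).status
    if status not in ("Queued", "InProgress"):
        break
    time.sleep(15)
print(f"Pipeline finished with status: {status}")
```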