Import data sources using Power Query
Before you begin ingesting data into Customer Insights - Data, you need to decide the best method to use for collecting the data. This varies depending on factors such as the volume of data being ingested, and the amount of data transformation required. In scenarios that require a large amount of data transformation or when you're not going to be connecting to an Azure data lake or Dataverse sources, importing data using Microsoft Power Query is often the best option.
This option provides the most flexibility. Power Query is Microsoft's Data Connectivity and Data Preparation technology used to access and reshape data from hundreds of data sources. It includes around 40 different connectors to connect to data sources such as Excel, Oracle, OData, Azure, and more.
For more information on using Power Query, see What is Power Query?.
It's important to note that when data is ingested using the Power Query option, it's copied into your Customer Insights data lake. Depending on the volume of data, this could require extra storage to support. You need to make sure you can access the data you want to import as different connectors might require different configuration and authentication.
Note
Although Customer Insights - Data uses Power Query, not all connectors Power Query supports allow you to bring data into Customer Insights - Data. You should be familiar with the different options available based on the connector you select. There's a complete list of Power Query connectors and capabilities available.
Transforming data
One of the main advantages of importing data using Power Query is the ability to perform data transformation. Data transformation is used to change data to the appropriate form so it can be used for a statistical test or method. From a Customer Insights - Data standpoint, data typically needs to be transformed to ensure application features such as activities and measures can use it appropriately. For example, activities rely heavily on dates, so if you intend to use a data set for activities, you need to ensure the data set has at least one date field in it. You need numeric formats for fields that you might use in calculations or as measures such as calculating the total cost of something.
During the import process, you can modify configuration settings such as the delimiter used (based on the data source selected). The preview area allows you to modify some of these settings before you begin the transformation process. While each situation is unique, there are a few standard transformation steps that you should consider.
These transformations are highly recommended:
Use headers as first row - If you're ingesting data from a CSV file, you can use the first row of the CSV file as the header information. Select Transform data, and you see the option to switch between using the first row as headers or using it as data.
Map your data to a standard Data Format - Customer Insights - Data allows you to map your data to the Azure Data Lake. The Azure Data Lake is a standard and extensible collection of schemas (tables, attributes, relationships). It represents business concepts and activities with well-defined semantics, to facilitate data interoperability. Examples of tables include Account, Contact, Lead, Opportunity, Product, etc. To map data, go to Map to table, and then map fields from your source data to Azure Data Lake fields.
Modify Field Data Types - Each field with a data set has a data type associated to it.
We examine these transformations in more detail in the next unit.
For more information on Importing using Power Query, see Import Using Power Query.