Data quality for Microsoft Fabric shortcut databases
Shortcuts are objects in OneLake that point to other storage locations. The location can be internal or external to OneLake. The location that a shortcut points to is known as the target path of the shortcut. The location where the shortcut appears is known as the shortcut path. Shortcuts appear as folders in OneLake and any workload or service that has access to OneLake can use them.
Shortcuts in Microsoft OneLake allow you to unify your data across domains, clouds, and accounts by creating a single virtual data lake for your entire enterprise. All Microsoft Fabric experiences and analytical engines can directly connect to your existing data sources such as Azure, Amazon Web Services (AWS), and OneLake through a unified namespace. OneLake manages all permissions and credentials so you don't need to separately configure each Fabric workload to connect to each data source.
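Because a shortcut appears as an ordinary folder in OneLake, engines address it through the shortcut path and never see the target path directly. The following minimal sketch illustrates that distinction. The `Shortcut` class and all names in it are hypothetical, and the URI layout assumes the standard OneLake `abfss` addressing scheme with workspace and lakehouse names that contain no special characters.

```python
from dataclasses import dataclass

# OneLake's DFS endpoint host, used in abfss-style URIs.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


@dataclass(frozen=True)
class Shortcut:
    """Hypothetical model of a shortcut: where it appears vs. where it points."""
    workspace: str      # Fabric workspace name (placeholder)
    lakehouse: str      # Lakehouse item name (placeholder)
    shortcut_path: str  # where the shortcut appears, e.g. "Tables/sales_orders"
    target_path: str    # the location the shortcut points to

    def onelake_uri(self) -> str:
        # Consumers use the shortcut path; the target path is resolved by OneLake.
        path = self.shortcut_path.strip("/")
        return f"abfss://{self.workspace}@{ONELAKE_HOST}/{self.lakehouse}.Lakehouse/{path}"


sc = Shortcut(
    workspace="ContosoWorkspace",
    lakehouse="SalesLakehouse",
    shortcut_path="Tables/sales_orders",
    target_path="https://contosostorage.dfs.core.windows.net/raw/sales_orders",
)
print(sc.onelake_uri())
```

The URI that a notebook or engine would use contains only the shortcut path, which is why the same query works whether the data lives in OneLake or in external storage.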
For more details about Microsoft Fabric shortcuts, review the Fabric documentation.
Configure data quality for Fabric shortcut databases
Sign in to your Microsoft Fabric workspace. Select the ellipsis button under Tables, and then select New shortcut. From here you can create any of the following shortcut types:
Azure Data Lake Storage Gen2 shortcut
Select the Azure Data Lake Storage Gen2 shortcut on the New shortcut page of your Fabric workspace.
Select ADLS Gen2 SAS authentication.
Generate a shared access signature (SAS) token and connection string for your ADLS Gen2 resource in the Azure portal.
Copy the endpoint of the data lake.
Add storage details for the shortcut storage.
Navigate to and choose the correct delta folder.
Preview the shortcut delta table in your Fabric workspace.
Start a scan of your Azure Data Lake Storage Gen2 resource in the Microsoft Purview Data Map using service principal authentication.
Once the scan is finished, your data asset should appear in Unified Catalog as a lakehouse table.
Associate the asset with a data product for curation and data quality assessment.
Open the Microsoft Purview Data Quality solution and run a data quality scan or profile your data as usual.
Amazon S3 shortcut
Select New shortcut in the Microsoft Fabric workspace.
Select Amazon S3, and then enter the URL, access key ID, and secret access key for the shortcut.
Add the connection URL and storage details.
Preview the shortcut in your Fabric workspace.
Start a scan of your Amazon S3 resource in the Microsoft Purview Data Map using service principal authentication.
Once the scan is finished, your data asset should appear in Unified Catalog.
Associate the asset with a data product for curation and data quality assessment.
Open the Microsoft Purview Data Quality solution and run a data quality scan or profile your data as usual.
Google Cloud Storage (GCS) shortcut
Select New shortcut in the Microsoft Fabric workspace.
Select Google Cloud Storage, and then enter the URL, access key ID, and access key for the shortcut.
Add the connection URL and storage details.
Preview the shortcut in your Fabric workspace.
Start a scan of your Google Cloud Storage resource in the Microsoft Purview Data Map using service principal authentication.
Once the scan is finished, your data asset should appear in Unified Catalog.
Associate the asset with a data product for curation and data quality assessment.
Open the Microsoft Purview Data Quality solution and run a data quality scan or profile your data as usual.
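The three portal procedures above collect the same kind of information: a target location, a path within it, and a stored connection. As a rough sketch, assuming the shortcut is created through the Fabric REST API instead of the portal, the request could be assembled as shown here. The endpoint shape and the `adlsGen2`, `amazonS3`, and `googleCloudStorage` target keys are assumptions based on the OneLake Shortcuts REST API and should be verified against the current API reference. The helper `build_shortcut_request` and every ID, name, and URL are hypothetical placeholders. The sketch only builds the request and doesn't send it.

```python
import json

# Assumed base URL of the Fabric REST API; verify against the API reference.
FABRIC_API = "https://api.fabric.microsoft.com/v1"

# Assumed mapping from a short label to the target key in the request body.
TARGET_KEYS = {"adls": "adlsGen2", "s3": "amazonS3", "gcs": "googleCloudStorage"}


def build_shortcut_request(workspace_id, lakehouse_id, name, kind,
                           location, subpath, connection_id,
                           parent_path="Tables"):
    """Return (url, body) for a create-shortcut call. Nothing is sent."""
    if kind not in TARGET_KEYS:
        raise ValueError(f"unsupported shortcut kind: {kind!r}")
    url = f"{FABRIC_API}/workspaces/{workspace_id}/items/{lakehouse_id}/shortcuts"
    body = {
        "path": parent_path,  # folder in the lakehouse where the shortcut appears
        "name": name,         # shortcut name shown in the workspace
        "target": {
            TARGET_KEYS[kind]: {
                "location": location,           # storage endpoint URL
                "subpath": subpath,             # container/bucket path to the delta folder
                "connectionId": connection_id,  # ID of a connection that holds the credentials
            }
        },
    }
    return url, body


# Placeholder values for illustration only.
url, body = build_shortcut_request(
    workspace_id="<workspace-id>",
    lakehouse_id="<lakehouse-id>",
    name="sales_orders",
    kind="adls",
    location="https://contosostorage.dfs.core.windows.net",
    subpath="/raw/sales_orders",
    connection_id="<connection-id>",
)
print(url)
print(json.dumps(body, indent=2))
```

Passing `kind="s3"` or `kind="gcs"` changes only the target key. Credentials such as the SAS token or access keys aren't part of this body because they're assumed to live in the referenced connection.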
Important
- Use a service principal for data map scans and managed identity for data quality scans.
- Any data sourced through a shortcut will be processed in the same region.
- Distinguishing shortcut items from native items among Lakehouse subartifacts depends on a pending update to the OneLake SDK from the Fabric team. For now, scans treat all shortcut items (tables and files) as native items.