Automating Export of Parquet Schema Elements from Purview Using PyApacheAtlas

Janvi 0 Reputation points
2025-01-14T18:15:17.4566667+00:00

How can the fully qualified name, classifications, sensitivity labels, glossary terms, and column descriptions for each column be exported from an Azure Data Lake Storage Gen2 Resource Set within a scanned collection?

After the scan completes, the goal is to download all attributes for specific assets into an Excel file, add further information there, and later upload the file back using PyApacheAtlas.

The get_entity method has been tried, but it requires a GUID for each asset, and retrieving those GUIDs manually is tedious and inefficient.

Is there a method to automate this process to export the required details for all assets into an Excel file for seamless updates and re-uploading?
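To make the request concrete, here is a minimal sketch of the kind of automation being asked about. It is not a confirmed solution. It assumes an already authenticated PyApacheAtlas `PurviewClient` is passed in as `client`, that search hits expose the asset GUID under an `id` key, and that descriptions live in `description` or `userDescription`. The helper names (`find_guids`, `fetch_entities`, `entity_to_row`, `write_rows`) are hypothetical. Sensitivity labels are left out because where they appear in the entity payload is not assumed here. The output is a CSV file, which Excel opens directly.

```python
import csv

FIELDS = ["guid", "typeName", "qualifiedName", "name",
          "description", "classifications", "glossaryTerms"]


def find_guids(client, entity_type):
    """Collect GUIDs by searching the catalog instead of looking them up by hand.

    Assumption: each search hit is a dict carrying the GUID under "id".
    """
    hits = client.discovery.search_entities(
        query="*", search_filter={"entityType": entity_type}
    )
    return [hit["id"] for hit in hits]


def fetch_entities(client, guids, batch_size=50):
    """Fetch full entity payloads in batches using get_entity."""
    entities = []
    for start in range(0, len(guids), batch_size):
        response = client.get_entity(guid=guids[start:start + batch_size])
        entities.extend(response.get("entities", []))
    return entities


def entity_to_row(entity):
    """Flatten one Atlas entity dict into a single spreadsheet row."""
    attrs = entity.get("attributes") or {}
    rels = entity.get("relationshipAttributes") or {}
    return {
        "guid": entity.get("guid", ""),
        "typeName": entity.get("typeName", ""),
        "qualifiedName": attrs.get("qualifiedName", ""),
        "name": attrs.get("name", ""),
        # Assumption: user-edited text may be stored as userDescription.
        "description": attrs.get("description") or attrs.get("userDescription") or "",
        "classifications": ";".join(
            c.get("typeName", "") for c in entity.get("classifications") or []
        ),
        # Glossary terms are exposed through the "meanings" relationship.
        "glossaryTerms": ";".join(
            m.get("displayText", "") for m in rels.get("meanings") or []
        ),
    }


def write_rows(rows, path):
    """Write the flattened rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

A run would chain the helpers, for example `write_rows([entity_to_row(e) for e in fetch_entities(client, find_guids(client, "azure_datalake_gen2_resource_set"))], "export.csv")`. Column-level entities are typically reached through the resource set's schema relationships, so a second `fetch_entities` pass over the column GUIDs found in `relationshipAttributes` may be needed. The exact relationship names are not assumed here.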

Microsoft Purview
A Microsoft data governance service that helps manage and govern on-premises, multicloud, and software-as-a-service data. Previously known as Azure Purview.