Get started with the new data governance experience in Microsoft Purview

Note

The Microsoft Purview Data Catalog is changing its name to Microsoft Purview Unified Catalog. All the features will stay the same. You'll see the name change when the new Microsoft Purview Data Governance experience is generally available in your region. Check the name in your region.

This guide takes you through the technical steps to start building your data governance solution in Microsoft Purview. You'll integrate data governance with your day-to-day business operations, using the Microsoft Purview Data Map and your existing data experts, not only to harness the value of your data but also to make data governance part of your team's flow. For strategic best practices, see our data catalog best practices article. For a step-by-step process to set up your environment, follow our tutorial.

Prerequisites

  1. You need a Microsoft Purview Enterprise instance, which you can get by either:
    1. Upgrading an existing account to the new experience
    2. Upgrading from the free version of Microsoft Purview to enterprise
  2. Data sources registered and scanned in Microsoft Purview. (Here's the list of supported sources.)

Get started checklist

Basic setup

The data governance admin delegates the first level of access for data catalog users.

A governance domain is a boundary that enables the common governance, ownership, and discovery of data products, and business concepts like glossary terms and OKRs. The goal is to empower a governance domain owner to manage their data products, and establish rules for their access, usage, and distribution. Governance domains can be aligned per the following examples:

  • Corporate / business areas (Human Resources, Sales, Finance, Supply Chain, etc.)
  • Overarching subject areas (Product, Parties)
  • Regulatory domains (SOX, PCI, Anti-Money Laundering)
  • Boundaries based on organizational functions (Customer Experience, Cloud Supply Chain, Business Intelligence).

A data product is a kit of data assets (such as tables, files, and Power BI reports) offered to the enterprise with proper use cases, so that the assets can be shared with data consumers. A governance domain can house many data products, but each data product is managed by a single governance domain and can be discovered across many governance domains.
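The relationship between governance domains and data products can be sketched in a few lines of Python. This is only an illustrative model, not the Microsoft Purview API: the class names, the `discover` helper, and the sample domains and assets are all hypothetical. It shows one domain managing each product while search runs across every domain.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    name: str
    assets: list[str]      # e.g. tables, files, Power BI reports
    owning_domain: str     # exactly one managing governance domain


@dataclass
class GovernanceDomain:
    name: str
    products: dict[str, DataProduct] = field(default_factory=dict)

    def add_product(self, name: str, assets: list[str]) -> DataProduct:
        # The product records this domain as its single managing domain.
        product = DataProduct(name=name, assets=list(assets), owning_domain=self.name)
        self.products[name] = product
        return product


def discover(domains: list[GovernanceDomain], keyword: str) -> list[DataProduct]:
    """Search products in every domain, regardless of which one manages them."""
    keyword = keyword.lower()
    return [
        product
        for domain in domains
        for product in domain.products.values()
        if keyword in product.name.lower()
    ]


# Hypothetical sample data.
sales = GovernanceDomain("Sales")
finance = GovernanceDomain("Finance")
sales.add_product("Quarterly sales pipeline", ["crm.opportunities", "pipeline_report.pbix"])
finance.add_product("Sales revenue forecast", ["ledger.revenue"])

matches = discover([sales, finance], "sales")
for product in matches:
    print(f"{product.name} (managed by {product.owning_domain})")
```

In this sketch, a search for "sales" returns products from both domains, while each product still names a single managing domain.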

  1. Assign users to the Data Governance Admin role. This role gives the admin permissions to grant users application-level permissions, like governance domain creation.
  2. Build governance domains:
    1. Assign users the governance domain creator role
    2. Create at least one governance domain

      Tip

      Use our guide to strategize your governance domain structure.

    3. Assign at least one governance domain owner on each governance domain. This user will be the point of support and decision authority for data being consumed in this governance domain.
  3. Build data products:
    1. Assign data product owners to create data products in your governance domains. These should be business and data experts who can pair data with day-to-day scenarios in their governance domain.
    2. Create at least one data product in your governance domains.

      Tip

      Use our guide for tips on how to build good data products.

  4. Assign users data catalog reader permissions in your governance domains so they can view and explore your data products in the data catalog.
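If you want to track progress against the basic setup steps above outside the portal, a small script can list what is still open. The following is a minimal sketch under stated assumptions: the `DomainSetup` structure, the `remaining_steps` function, and the user names are hypothetical, and nothing here calls Microsoft Purview. You would fill in the values by hand from what you see in your environment.

```python
from dataclasses import dataclass, field


@dataclass
class DomainSetup:
    """Hand-maintained record of one governance domain's setup state."""
    name: str
    owners: list[str] = field(default_factory=list)
    data_product_owners: list[str] = field(default_factory=list)
    data_products: list[str] = field(default_factory=list)
    catalog_readers: list[str] = field(default_factory=list)


def remaining_steps(admins, creators, domains):
    """Return the basic-setup steps that are still open, in checklist order."""
    todo = []
    if not admins:
        todo.append("Assign a user to the Data Governance Admin role")
    if not creators:
        todo.append("Assign a user the governance domain creator role")
    if not domains:
        todo.append("Create at least one governance domain")
    for domain in domains:
        if not domain.owners:
            todo.append(f"{domain.name}: assign a governance domain owner")
        if not domain.data_product_owners:
            todo.append(f"{domain.name}: assign a data product owner")
        if not domain.data_products:
            todo.append(f"{domain.name}: create at least one data product")
        if not domain.catalog_readers:
            todo.append(f"{domain.name}: assign data catalog reader permissions")
    return todo


# Hypothetical state: the domain exists and has owners, but no products or readers yet.
sales = DomainSetup(name="Sales", owners=["avery"], data_product_owners=["blake"])
todo = remaining_steps(admins=["casey"], creators=["casey"], domains=[sales])
for step in todo:
    print(step)
```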

Connect your data with business concepts

Glossary terms provide vocabulary for business users. These terms allow users to discover and work with data in the vocabulary that is familiar to them versus using abstract technical terminology inherited from physical data sources.

Objectives and key results (OKRs) are the goals or desired outcomes of a governance domain (for example, a 10% rise in sales or a 3% reduction in support cases). Objectives should relate to everything an organization does and should define how it achieves its outcomes.

Health management actions give you and your users steps to improve data health and governance across your data estate. These actions correspond to the checks made to calculate a data product's data governance health control score. Addressing these actions raises your health score and promotes a more usable and discoverable data catalog overall. Understanding the value of your data products improves the trust others place in that data and helps you prioritize which data to improve first.
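To make the link between checks, actions, and a score concrete, here is a minimal Python sketch. It is an assumption-heavy illustration, not how Microsoft Purview calculates health control scores: the four checks, the equal weighting, and the `health_report` function are all hypothetical. It only shows the general idea that each failed check becomes an action, and resolving actions raises the score.

```python
# Hypothetical checks; the real health controls are defined in Microsoft Purview.
CHECKS = {
    "has an owner": lambda p: bool(p.get("owners")),
    "has a description": lambda p: bool(p.get("description")),
    "linked to a glossary term": lambda p: bool(p.get("glossary_terms")),
    "linked to an OKR": lambda p: bool(p.get("okrs")),
}


def health_report(product):
    """Return (score out of 100, list of failed checks to act on)."""
    failed = [name for name, check in CHECKS.items() if not check(product)]
    # Assumed formula: every check carries equal weight.
    score = 100 * (len(CHECKS) - len(failed)) / len(CHECKS)
    return score, failed


# Hypothetical data product with no glossary term linked yet.
product = {
    "name": "Quarterly sales pipeline",
    "owners": ["blake"],
    "description": "Open opportunities by region and stage.",
    "glossary_terms": [],
    "okrs": ["10% rise in sales"],
}

score, actions = health_report(product)
print(score, actions)
```

Linking a glossary term to this sample product would clear the one open action and bring the sketch's score to 100.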

Connect action steps

  1. Assign users data steward permission in your governance domains to create and manage glossary terms. These users should be data and business experts. They'll increase the value of data products by curating the information and attaching glossary terms to make the data understandable.
  2. Create at least one OKR in your governance domain.
    1. Link your OKR to a data product to connect your data to your business goals.
  3. Assign user(s) data health reader permissions in your data catalog. These users will monitor data catalog use and current governance scores, and take actions to build trust in your data products.
  4. Publish some glossary terms in your governance domain.
    1. Link a glossary term to a data product to improve discoverability.
  5. Assign users data steward permission in your governance domains to create and manage OKRs. These users should be business strategy experts to ensure business leaders appreciate the value of their data, and the importance of data governance. They'll drive prioritization and strategize how teams build, maintain, and govern their data to create insights.
  6. Review your health controls to get a baseline for your current health management.
  7. Review health actions to start considering next steps for your data governance journey.

Improve data quality and remove data issues

Data quality is the measurement of the quality of data in an organization, based on data quality rules that are configured and defined in the data catalog.

Data quality rules provide a description of the state of the data with dimensions like: accuracy, completeness, conformity, consistency, timeliness, and uniqueness. Each rule, when it runs, produces a score that describes how close the data is to its desired state.
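As a rough illustration of how a rule can turn a column into a score, the sketch below computes two of the dimensions named above, completeness and uniqueness, as fractions between 0 and 1. The formulas, function names, and sample values are assumptions made for this example. They aren't the rule definitions Microsoft Purview uses.

```python
def _present(values):
    """Values that are neither None nor an empty string."""
    return [v for v in values if v is not None and v != ""]


def completeness(values):
    """Fraction of values that are present."""
    if not values:
        return 0.0
    return len(_present(values)) / len(values)


def uniqueness(values):
    """Fraction of present values that are distinct."""
    present = _present(values)
    if not present:
        return 0.0
    return len(set(present)) / len(present)


# Hypothetical column with one missing value, one empty string, and one duplicate.
emails = ["a@contoso.com", "b@contoso.com", None, "a@contoso.com", ""]

scores = {
    "completeness": completeness(emails),
    "uniqueness": uniqueness(emails),
}

# Assumed target: each rule should reach at least 0.9 to count as passing.
passing = {rule: value >= 0.9 for rule, value in scores.items()}
print(scores, passing)
```

A score of 1.0 means the column fully meets the rule, so the distance from 1.0 describes how far the data is from its desired state.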

Data profiling is the process of examining the data available in your data sources and collecting statistics and information, and assessing the quality level of the data according to a defined set of goals. If data is of poor quality or managed in structures that can't be integrated to meet the needs of the organization, it can affect business processes and decision-making.
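The kind of statistics a profiling run collects can be sketched with plain Python. This example is illustrative only and doesn't reproduce Microsoft Purview's profiling output: the `profile_column` function, the chosen statistics, and the sample rows are hypothetical.

```python
from collections import Counter


def profile_column(values):
    """Collect basic statistics for one column of values."""
    present = [v for v in values if v is not None]
    stats = {
        "row_count": len(values),
        "null_count": len(values) - len(present),
        "distinct_count": len(set(present)),
        "most_common": Counter(present).most_common(1)[0][0] if present else None,
    }
    # Add numeric statistics only when every present value is a number.
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric and len(numeric) == len(present):
        stats["min"] = min(numeric)
        stats["max"] = max(numeric)
        stats["mean"] = sum(numeric) / len(numeric)
    return stats


# Hypothetical rows from a data asset.
rows = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0},
    {"order_id": 2, "region": "EMEA", "amount": None},
    {"order_id": 3, "region": "APAC", "amount": 80.0},
]

profile = {
    column: profile_column([row[column] for row in rows])
    for column in rows[0]
}
for column, stats in profile.items():
    print(column, stats)
```

Results like the null count for `amount` are the sort of finding that can inform which data quality rules to set up next.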

Quality action steps

  1. Assign user(s) data quality steward permissions to use all data quality features.
  2. Set up a data source connection to prepare your source for data quality assessment.
  3. Configure and run data profiling for an asset in your data source.
    1. When profiling is complete, browse the results for each column in the data asset to understand your data's current structure and state.
  4. Set up data quality rules based on the profiling results, and apply them to your data asset.
  5. Configure and run a data quality scan on a data product to assess the quality of all supported assets in the data product.
  6. Review your scan results to evaluate your data product's current data quality.

Reference model for planning

The following is a reference example to assist with planning the new Microsoft Purview data governance solution areas, scenarios, tasks, and personas with key stakeholders.

Week 1-2

| Area | Scenario | Task | Description/Outcomes | Persona |
| --- | --- | --- | --- | --- |
| Data management | Catalog setup | Set up first governance domain | Identify governance domain scope, usage, and owners. Assign accountability to governance domain owner, define/create your first governance domain, description, and assign data owners. Capture feedback establishing the governance domain. | Governance domain owner |
| Data management | Catalog curation | Create data products in the governance domain | Identify scope of data to manage, publish, and owners. Create data products, descriptions, use cases, assign ownership, create and assign glossary terms to help increase usability for data consumers. Map data assets to data products, create access policies for data consumers to attest to when requesting access – capture feedback (ease of use for business unit to manage/understand/own curation) | Data product owner/data steward |
| Data management | Publication | Publish the governance domain and data products | Publish the governance domain and associated data products to make available for discovery, understanding, and access through the data catalog experience. Assign data consumers permissions to access and view the first governance domain by adding them to the data catalog reader role and capture feedback with publication. | Governance domain owner and data product owner |
| Data management | Operations | Data governance and management operations | Assess operational tasks, stakeholders, processes, and procedures to enable data governance and management, evaluate against current state data governance policies, practice, and culture to identify potential areas for improvement/change. | Data governance office |

Week 2-3

| Area | Scenario | Task | Description/Outcomes | Persona |
| --- | --- | --- | --- | --- |
| Data discovery, understanding, and access | Discover and access | Data catalog product search | Exercise the data catalog product search experience to help data consumers and users discover and understand data products that are curated and developed for a specific business purpose. Assess data product metadata to determine proper usage, data quality, and applicability to data consumer business outcomes, and then request access. Assess ease of use to data products the user recently reviewed, and data products subscribed to – capture feedback on full data consumer experience. | Data consumer |
| Data management | Access management | Access request management | Review access requests for data products in the first governance domain and approve or reject. Engage with IT owners for approvals (as appropriate) to data assets. | Data product owner |
| Data management | Catalog curation | Review data product discoverability | Review discoverability and usability of published data products along with data consumer feedback to inform semantic knowledge improvement opportunities (for example, glossary terms, attention to attention items in the action center, etc.). | Data product owner/data steward |

Week 3-4

| Area | Scenario | Task | Description/Outcomes | Persona |
| --- | --- | --- | --- | --- |
| Data management | Data quality | Improve data quality and reduce data issues | Assess top-level data quality by the first governance domain, and evaluate/set up data quality for associated data assets by data product (via connections). Use data quality profiling data to inform quality rules and dimensions to establish key data assets in the data product. Run data quality scans (ad hoc or scheduled), monitor data quality activity and scans, and set up alerts to be informed of changes in data asset health (via target thresholds). Capture feedback on overall data quality experience. | Data quality steward |
| Data management | Operations | Data governance and management operations | Assess operational tasks, stakeholders, processes, and procedures to enable data quality in the context of data governance and management. Evaluate against current state data governance policies, practice, and culture to identify potential areas for improvement or change. | Data governance office |

Week 4-5

| Area | Scenario | Task | Description/Outcomes | Persona |
| --- | --- | --- | --- | --- |
| Health management | Reports | Manage data governance | Review the controls with business data domain owners and set up a regular review of reporting on those controls. The goal of the meeting is to review issues and prioritize solutions or data products that are needed to meet business needs. | Data governance office |
| Health management | Health actions | Improve data governance | Take actions based on the controls to improve data governance and ensure standards are being met. | Data stewards/data product owners |

Overview page

The Overview page in the Microsoft Purview Data Catalog helps users in an organization get started with their data governance journey. It uses step-by-step instructions and video demos to help them understand and navigate the steps outlined in this document.

Next steps