Governance domains in Microsoft Purview
A governance domain is a new concept in Microsoft Purview that provides context for your data assets and makes it easier to scale data governance practices.
Data ownership is the most important aspect of data governance. Controlling access and data use are the core tenets of any successful data governance solution. Currently, IT infrastructure stores and maintains data assets, even though IT doesn't own or use the data. There's a disconnect between how data needs to be discovered and maintained within the business, and the teams that maintain it. We introduced Unified Catalog to begin the process of bridging the gap between business owners and their physical data. But business owners needed more information about their data assets to maintain health and security, and they needed a way to efficiently scale management for a growing data estate.
To continue to address both these issues, we're introducing the governance domain.
What's a governance domain?
At its simplest, a governance domain is a boundary that enables the common governance, ownership, and discovery of data products and business concepts like glossary terms, OKRs, or critical data. The goal is to empower a governance domain owner to manage their data products and concepts, and establish rules for their access, use and distribution. With this goal in mind, you could establish many kinds of governance domains:
- Fundamental business areas - human resources, sales, finance, supply chain, etc.
- Overarching subject areas - product, parties, etc.
- Boundaries based on organizational functions - customer experience, cloud supply chain, business intelligence, etc.
In Microsoft Purview, governance domains are flexible so your organization can build the boundaries that make the most sense for your data. But once you've created them, how do they provide context or assist with data governance?
The parts of a governance domain
A governance domain has a name and a description, so users can quickly understand what parts of your organization the governance domain represents. A governance domain also has owners who govern and maintain the governance domain and the data assets it represents.
Governance domains also contain the following business concepts today:
- Data products - kits of data assets: a packaged offering to an enterprise that groups data assets (tables, files, Power BI reports) into a single product for users to discover and employ.
- Glossary terms - active values that provide context, but also apply policies that determine how your data should be managed, governed, and made discoverable for use.
- Objectives and key results (OKRs) - metrics that describe the value of your data with measurable values and goals.
- Critical data - a logical grouping of columns across tables that are necessary for decision making and need to be governed with the highest importance. For example: The "Customer ID" critical data element can map "CustID" from one table and "CID" from another table into the same logical container.
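As a rough illustration of how these parts fit together, the following Python sketch models a governance domain as a plain data structure. It isn't a Microsoft Purview API: the class names, field names, table names, and owner address are all hypothetical and chosen only to mirror the concepts listed above, including the "Customer ID" critical data example.

```python
from dataclasses import dataclass, field


@dataclass
class CriticalDataElement:
    """Hypothetical logical container mapping physical columns to one business name."""
    name: str
    # Each entry is a (table, column) pair; the table names are invented.
    columns: list[tuple[str, str]] = field(default_factory=list)


@dataclass
class GovernanceDomain:
    """Illustrative stand-in for a governance domain, not a Purview type."""
    name: str
    description: str
    owners: list[str]
    data_products: list[str] = field(default_factory=list)
    glossary_terms: list[str] = field(default_factory=list)
    okrs: list[str] = field(default_factory=list)
    critical_data: list[CriticalDataElement] = field(default_factory=list)


# "CustID" and "CID" live in different tables but share one logical element.
customer_id = CriticalDataElement(
    name="Customer ID",
    columns=[("orders", "CustID"), ("support_tickets", "CID")],
)

sales = GovernanceDomain(
    name="Sales",
    description="Data products and concepts owned by the sales organization.",
    owners=["sales-data-owner@contoso.example"],
    critical_data=[customer_id],
)

# List the physical column names that resolve to the logical element.
physical_names = [column for _table, column in customer_id.columns]
print(physical_names)
```

The point of the sketch is the shape: one named, owned boundary that holds the business concepts, with critical data acting as a many-to-one mapping from physical columns to a single logical name.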
Types of governance domains
The following out-of-the-box governance domain types are supported within the product today:
- Functional unit – organizations or business units such as Sales, Marketing, or Finance
- Line of business – products or services being sold, such as Xbox, Office, or Azure, as well as different markets or subsidiaries
- Data domain – key organization-wide entities such as customers or employees
- Regulatory – compliance-related areas such as GDPR, SOX, or HIPAA
- Project – collaborative programs across the organization
How does it provide business context?
Governance domains provide context for your data products (combinations of files, tables, reports, etc.) by defining an overarching category for your assets. If a user wants to look at supply delivery schedules to evaluate efficiency, rather than searching the entire Unified Catalog for potentially relevant data, they can immediately narrow the scope of their search to the most relevant governance domain. The information they're looking for isn't likely to be in human resources or customer experience, so they can begin their search in supply chain.
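To make the narrowing effect concrete, here's a minimal Python sketch of a keyword search with an optional governance domain filter. The catalog contents, the `DataProduct` class, and the `search` function are assumptions made up for this example; they don't represent how Unified Catalog search is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical data product record used only for this example."""
    name: str
    governance_domain: str


# Invented catalog contents.
CATALOG = [
    DataProduct("Supply delivery schedules", "Supply chain"),
    DataProduct("Carrier delivery performance", "Supply chain"),
    DataProduct("Employee onboarding schedules", "Human resources"),
    DataProduct("Customer delivery feedback", "Customer experience"),
]


def search(catalog, keyword, governance_domain=None):
    """Return products whose name contains keyword, optionally within one domain."""
    keyword = keyword.lower()
    return [
        product
        for product in catalog
        if keyword in product.name.lower()
        and (governance_domain is None
             or product.governance_domain == governance_domain)
    ]


# Searching the whole catalog returns matches from unrelated domains too.
everything = search(CATALOG, "delivery")
# Scoping to one governance domain keeps only the relevant products.
scoped = search(CATALOG, "delivery", governance_domain="Supply chain")
print(len(everything), len(scoped))  # 3 2
```

In this toy catalog, the unscoped search also returns a customer experience product, while the scoped search returns only the two supply chain products.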
Each governance domain can also develop its own objectives and key results (OKRs) that describe the business value of the available data products, and set business goals to track how well the data is meeting those goals. Governance domain owners can prioritize data products based on how they map to their business goals, and users understand how data fits into their organization's strategy.
This structure not only provides familiar landmarks for business users to navigate while searching for meaningful information, but it also allows governance domain owners to customize data governance to their governance domain, and scale that governance as their data estate grows.
How does it scale data governance?
In a flat Unified Catalog, where each data asset is listed as an individual piece of information, it's difficult for data owners and data stewards to fully govern each individual asset. And data isn't static; a healthy data estate is growing and changing along with its organization, so data owners and stewards need tools to be able to scale with increasing data assets. Governance domains provide boundaries where governance policies trickle down to data products.
The key to the trickle-down of governance policies is glossary terms. Glossary terms are in part as you knew them before: terms that provide business context, and can be applied to data assets to improve discovery. Terms now also contain access policies that are applied alongside the term itself. (For more information about glossary terms, see the glossary terms article.) Glossary terms are defined within a governance domain, and can be applied to any data product (a group of data assets) in that governance domain.
Data owners and data stewards don't have to traverse an entire data estate to maintain governance. They can apply terms across a governance domain they understand and know that when these terms are attached to a data product, the right policies will automatically fall into place. As an example: In the Human Resources governance domain, there's a Feedback Results term. Feedback Results are defined as any returned information from company-wide feedback queries, and contain sensitive information. For that reason, a specific team needs to review any request to access that data and what the request is being made for. A data access policy is defined in the Feedback Results term, and any data product that is labeled with Feedback Results will now automatically apply this access policy when a user requests to access data assets.
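The Feedback Results example can be sketched in a few lines of Python. This is only a conceptual model of the trickle-down behavior, assuming a policy is stored on the term and looked up through whatever terms a data product carries. The `AccessPolicy`, `GlossaryTerm`, and `DataProduct` classes, the `policies_for` function, the reviewer name, and the product names are all hypothetical and aren't part of any Purview API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical policy: who reviews a request and whether a purpose is needed."""
    reviewer: str
    requires_purpose: bool


@dataclass
class GlossaryTerm:
    """Hypothetical term that carries business context and an optional policy."""
    name: str
    definition: str
    access_policy: Optional[AccessPolicy] = None


@dataclass
class DataProduct:
    """Hypothetical data product labeled with zero or more glossary terms."""
    name: str
    terms: list = field(default_factory=list)


def policies_for(product):
    """Collect the access policies a product inherits from its glossary terms."""
    return [
        term.access_policy
        for term in product.terms
        if term.access_policy is not None
    ]


# The policy is defined once, on the term.
feedback_results = GlossaryTerm(
    name="Feedback Results",
    definition="Returned information from company-wide feedback queries.",
    access_policy=AccessPolicy(reviewer="HR review team", requires_purpose=True),
)

# Labeling a product with the term is enough for the policy to apply.
survey = DataProduct("Annual employee survey", terms=[feedback_results])
headcount = DataProduct("Headcount by region")

print(policies_for(survey))
print(policies_for(headcount))  # []
```

Because the policy lives on the term rather than on each product, any product labeled with Feedback Results later inherits the same review requirement without further setup.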
Governance domain owners and data stewards only need to develop these best practices once, and they can easily be applied across the data estate using recognizable business terms. They might not know every instance that the term could be applied to in the future, but they can be sure that when it's applied, their access best practices are used.
Governance domains use business context that already exists to make sure that your data isn't only more discoverable, but also well governed even as it grows.