ESG data estate

The ESG data estate capability centralizes and harmonizes Environmental, Social, and Governance (ESG) data from various data sources into datasets based on a standardized sustainability analytical schema. ESG data estate helps eliminate data silos, and it normalizes and harmonizes data for various sustainability requirements.

ESG data estate includes the following functionalities:

  • Ingest - This functionality unifies and standardizes ESG datasets from disparate data sources, such as Microsoft Sustainability Manager and non-Microsoft sources, for requirements like disclosures, analytics, and reduction insights. The system ingests data from multiple source systems and standardizes it by using the ESG data schema and lakehouses.

  • Compute - This functionality computes metrics and generates analytical datasets. The system calculates ESG metrics with prebuilt or custom data processing artifacts.

  • Visualize - This functionality uses the aggregated datasets to help you visualize computed metrics in built-in and custom dashboards.
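As a rough illustration of what the Compute step produces, the following sketch aggregates harmonized emission records into a per-facility metric. It is not the prebuilt notebook logic: the record layout, the field names (`facility`, `scope`, `co2e_tonnes`), and the function name are assumptions made for this example only.

```python
from collections import defaultdict

# Hypothetical harmonized records; the field names are illustrative and
# are not taken from the actual ESG data schema.
harmonized = [
    {"facility": "Plant A", "scope": "Scope 1", "co2e_tonnes": 120.0},
    {"facility": "Plant A", "scope": "Scope 2", "co2e_tonnes": 80.5},
    {"facility": "Plant B", "scope": "Scope 1", "co2e_tonnes": 40.0},
]


def compute_emissions_by_facility(records):
    """Sum CO2e per facility, producing one aggregated metric row each."""
    totals = defaultdict(float)
    for record in records:
        totals[record["facility"]] += record["co2e_tonnes"]
    return dict(totals)


metrics = compute_emissions_by_facility(harmonized)
print(metrics)
```

An aggregated dataset of this shape is the kind of input that a dashboard in the Visualize step could consume.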

ESG data estate resources

ESG data estate includes notebooks and data lakes that facilitate the transformation, computation, and storage of data from its raw form to computed ESG metrics. This transformation is based on standardized ESG data models.

ESG data estate deploys data lakes that include the following resources:

  • IngestedRawData - Stores raw data from external data sources.

  • ProcessedESGData - Stores harmonized data that conforms to a standardized ESG data model.

  • ComputedESGMetrics - Stores computed ESG metrics and aggregated analytical datasets.

  • ConfigAndDemoData - Stores transformation libraries, reference data, and demo data.
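To make the division of responsibilities concrete, the sketch below maps each processing stage to the resource listed above and builds a OneLake-style table path for it. The stage keys, the workspace name, and the table name are hypothetical, and the item names in a deployed workspace might differ from the short names used here, so treat the resulting path as an assumption to verify against your own workspace.

```python
# Stage-to-resource mapping based on the list above. The stage keys
# ("raw", "harmonized", ...) are invented for this example.
LAKEHOUSE_BY_STAGE = {
    "raw": "IngestedRawData",
    "harmonized": "ProcessedESGData",
    "metrics": "ComputedESGMetrics",
    "config": "ConfigAndDemoData",
}


def onelake_table_path(workspace: str, stage: str, table: str) -> str:
    """Build an abfss path to a lakehouse table (assumed OneLake URI layout)."""
    lakehouse = LAKEHOUSE_BY_STAGE[stage]
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )


# "MyWorkspace" and "emissions" are placeholder names.
print(onelake_table_path("MyWorkspace", "raw", "emissions"))
```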

All resources that ESG data estate deploys are prebuilt and integrated into your Fabric workspace. These resources are open and allow you to customize them according to your specific needs. For more information, see ESG data estate.

Data ingestion and transformation

You can integrate data from disparate sources into your ESG data estate. This functionality deploys the IngestedRawData lakehouse, or data lake, in your Fabric workspace, preserving the source data. After you ingest the source data into the IngestedRawData lakehouse, you can unify and harmonize the data into the sustainability analytical schema. You can also connect select data sources to a lakehouse without ingesting the data by using Fabric shortcuts.

Diagram of external as-is data with the industry data template that transforms it into standardized ESG data.

Sustainability analytical schema

The sustainability analytical schema is a purpose-built schema with entities to store Environmental, Social, and Governance (ESG) data. It also covers business operations tables, such as finance and human resources (HR). The schema can store data at multiple asset granularities and organizational hierarchies.
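One consequence of storing data against an organizational hierarchy is that a value recorded at a fine granularity can be rolled up to every level above it. The following sketch shows that idea with a simple parent lookup. The hierarchy, the unit names, and the `roll_up` helper are hypothetical and do not reflect the actual schema entities.

```python
# Hypothetical hierarchy: each unit points to its parent, the root to None.
PARENT = {
    "Plant A": "Europe",
    "Plant B": "Europe",
    "Europe": "Contoso",
    "Contoso": None,
}


def roll_up(values, parent):
    """Add each leaf value to the leaf itself and to all of its ancestors."""
    totals = {}
    for unit, value in values.items():
        node = unit
        while node is not None:
            totals[node] = totals.get(node, 0.0) + value
            node = parent[node]
    return totals


# Water use recorded per plant, in an arbitrary unit.
water_use = {"Plant A": 10.0, "Plant B": 5.0}
rolled = roll_up(water_use, PARENT)
print(rolled)
```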

The following steps outline the process of integrating emissions, water, and waste data from Microsoft Sustainability Manager and then transforming it into the sustainability analytical schema.

  1. Set up Microsoft Azure Synapse Link - Set up Azure Synapse Link to allow the flow of data from the Microsoft Sustainability Manager environment into your ESG data estate.

  2. Link the Microsoft Azure Data Lake Storage container - Use the Fabric shortcut functionality to link the Data Lake Storage container that holds the Microsoft Sustainability Manager data to the IngestedRawData lakehouse of the deployed capability.

  3. Transform data - Use the data transformation notebooks to transform data into the sustainability analytical schema.
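The transformation in step 3 is performed by the provided notebooks. As a simplified picture of what harmonization involves, the sketch below cleans raw waste rows and converts their quantities to a single unit. The raw column names (`facility_name`, `quantity`, `unit`), the target fields, and the conversion table are assumptions for illustration and are not the notebook implementation.

```python
# Conversion factors to kilograms; the set of supported units is an assumption.
TO_KG = {"kg": 1.0, "t": 1000.0, "g": 0.001}


def harmonize_waste(raw_rows):
    """Map raw rows to a uniform shape with quantities expressed in kg."""
    harmonized = []
    for row in raw_rows:
        factor = TO_KG[row["unit"].lower()]  # raises KeyError on unknown units
        harmonized.append(
            {
                "facility": row["facility_name"].strip(),
                "waste_kg": float(row["quantity"]) * factor,
            }
        )
    return harmonized


# Hypothetical raw rows as they might arrive from a source system.
raw = [
    {"facility_name": " Plant A ", "quantity": "2", "unit": "T"},
    {"facility_name": "Plant B", "quantity": "350", "unit": "kg"},
]
print(harmonize_waste(raw))
```

Failing loudly on an unknown unit, rather than passing the value through, keeps unconverted quantities out of the harmonized data.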

For more information, see Import and transform Sustainability Manager data.