Understanding the data observability capability
Note
Data playbook capabilities: The data playbook defines a set of capabilities that represent conceptual building blocks that are used to build data-related solutions. See Defining data capabilities to see the full set of capabilities defined in the playbook.
Observability is a measure of how well the internal states of a system can be inferred from its external outputs. It plays a vital role in improving the performance of distributed IT systems and is built on three pillars: metrics, logs, and traces.
Understanding data observability characteristics
Metrics, Logs, Traces: Comprehensive observability involves collecting and analyzing metrics, logs, and traces from data systems.
Platform Monitoring: Monitoring infrastructure is crucial for detecting and addressing system outages and performance bottlenecks in data and analytics pipelines. Two main components of platform monitoring are:
- Platform logs provide detailed diagnostic and auditing information for Azure resources and the Azure platform they depend on. Although they're generated automatically, certain platform logs need to be forwarded to one or more destinations for retention.
- Platform metrics are created by Azure resources and give visibility into their health and performance. Each type of resource creates a distinct set of metrics without any configuration required. Platform metrics are collected from Azure resources at a one-minute frequency unless specified otherwise in the metric's definition.
Data observability maturity: An organization's current state of observability can be assessed using the Data Observability Maturity Model.
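To make the three signal types described above more concrete, the following is a minimal sketch of a single pipeline run that emits a log, a few metrics, and two trace spans using only the Python standard library. The `PipelineTelemetry` class, the `run_pipeline` function, the `daily_load` pipeline name, and the metric names are all hypothetical and chosen for illustration. They aren't part of any Azure or Microsoft Fabric API, and a real solution would more likely forward these signals to a monitoring backend than keep them in memory.

```python
import json
import logging
import time
import uuid
from contextlib import contextmanager

logger = logging.getLogger("pipeline.observability")


class PipelineTelemetry:
    """Hypothetical helper that collects the three signal types for one run."""

    def __init__(self, pipeline_name):
        self.pipeline_name = pipeline_name
        self.trace_id = uuid.uuid4().hex  # correlates every signal from this run
        self.metrics = {}  # metric name -> latest value
        self.spans = []    # finished spans, in completion order

    def log(self, message, **fields):
        """Log: a discrete event with structured context."""
        record = {
            "trace_id": self.trace_id,
            "pipeline": self.pipeline_name,
            "message": message,
            **fields,
        }
        logger.info(json.dumps(record))
        return record

    def record_metric(self, name, value):
        """Metric: a numeric measurement that can be aggregated over time."""
        self.metrics[name] = value

    @contextmanager
    def span(self, name):
        """Trace: a timed unit of work tied to the run's trace ID."""
        start = time.perf_counter()
        status = "ok"
        try:
            yield
        except Exception:
            status = "error"
            raise
        finally:
            # Record the span whether the step succeeded or failed.
            self.spans.append({
                "trace_id": self.trace_id,
                "name": name,
                "status": status,
                "duration_s": time.perf_counter() - start,
            })


def run_pipeline(rows):
    """Validate and 'load' rows (illustrative only), emitting all three signals."""
    telemetry = PipelineTelemetry("daily_load")
    with telemetry.span("validate"):
        valid = [r for r in rows if r.get("id") is not None]
    with telemetry.span("load"):
        telemetry.record_metric("rows_read", len(rows))
        telemetry.record_metric("rows_loaded", len(valid))
        telemetry.record_metric("rows_rejected", len(rows) - len(valid))
    telemetry.log("run finished", rows_loaded=len(valid))
    return telemetry


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    result = run_pipeline([{"id": 1}, {"id": None}, {"id": 3}])
    print(result.metrics)
```

Sharing one trace ID across the log record and every span is the design choice that matters here: it lets a rejected-row count in the metrics be tied back to the specific run and step that produced it.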
Learn more about data observability in Microsoft Fabric
The following resources describe a few options for observability in Microsoft Fabric:
- Monitoring hub - Microsoft Fabric
- Track user activities in Microsoft Fabric
- Logging in Microsoft Fabric MLflow
- Service Admin access usage - Microsoft Fabric
- Data Activator
Implementations
- Azure Synapse Analytics: MDW repo: Parking Sensors Synapse
- Logging library: Python Logging Library
- Microsoft Fabric: Logging your workload using Microsoft Fabric Notebooks
For more information
- Microsoft: Monitoring and diagnostics guidance
- Data Observability in analytics
- Azure Monitor
- Azure Database for MySQL: Single Server Monitoring, Flexible Server Monitoring
- Azure Database for MariaDB Monitoring
- Azure Database for PostgreSQL: Single Server, Flexible Server, Hyperscale (Citus), and Azure Cosmos DB for PostgreSQL
- Apache Spark applications: metrics using APIs, metrics with Prometheus and Grafana
- Azure Databricks metrics dashboards
- Data: Databricks observability summary
- Data quality monitoring - Great Expectations
- Data quality monitoring - Apache Griffin