Send Azure Databricks application logs to Azure Monitor

Note

This article relies on an open source library hosted on GitHub at: https://github.com/mspnp/spark-monitoring.

The original library supports Azure Databricks Runtimes 10.x (Spark 3.2.x) and earlier.

Databricks has contributed an updated version to support Azure Databricks Runtimes 11.0 (Spark 3.3.x) and above on the l4jv2 branch at: https://github.com/mspnp/spark-monitoring/tree/l4jv2.

Please note that the 11.0 release is not backward compatible due to the different logging systems used in the Databricks Runtimes. Be sure to use the correct build for your Databricks Runtime.

The library and GitHub repository are in maintenance mode. There are no plans for further releases, and issue support will be best-effort only. For any additional questions regarding the library, or about the roadmap for monitoring and logging of your Azure Databricks environments, contact azure-spark-monitoring-help@databricks.com.

This article shows how to send application logs and metrics from Azure Databricks to a Log Analytics workspace. It uses the Azure Databricks Monitoring Library, which is available on GitHub.

Prerequisites

Configure your Azure Databricks cluster to use the monitoring library, as described in the GitHub readme.

Note

The monitoring library streams Apache Spark level events and Spark Structured Streaming metrics from your jobs to Azure Monitor. You don't need to make any changes to your application code for these events and metrics.

Send application metrics using Dropwizard

Spark uses a configurable metrics system based on the Dropwizard Metrics Library. For more information, see Metrics in the Spark documentation.
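At its core, the Dropwizard Metrics Library is a registry of named metrics such as counters, gauges, and histograms. The following minimal sketch shows the plain Dropwizard API on its own, independent of Spark and the monitoring library; the metric and object names are illustrative only:

    import com.codahale.metrics.{Counter, MetricRegistry}

    object DropwizardSketch {
      def main(args: Array[String]): Unit = {
        // A registry holds every named metric for an application.
        val registry = new MetricRegistry()

        // Counters are created (or looked up) by name and incremented as work happens.
        val counter: Counter = registry.counter("samplejob.counter1")
        counter.inc()
        counter.inc(5)

        println(s"counter1 = ${counter.getCount}") // counter1 = 6
      }
    }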

To send application metrics from Azure Databricks application code to Azure Monitor, follow these steps:

  1. Build the spark-listeners-loganalytics-1.0-SNAPSHOT.jar JAR file as described in the GitHub readme.

  2. Create Dropwizard gauges or counters in your application code. You can use the UserMetricsSystem class defined in the monitoring library. The following example creates a counter named counter1.

    import org.apache.spark.metrics.UserMetricsSystems
    import org.apache.spark.sql.SparkSession
    
    object StreamingQueryListenerSampleJob {
    
      private final val METRICS_NAMESPACE = "samplejob"
      private final val COUNTER_NAME = "counter1"
    
      def main(args: Array[String]): Unit = {
    
        val spark = SparkSession
          .builder
          .getOrCreate
    
        // Register the counter with the metrics system under the given namespace.
        val driverMetricsSystem = UserMetricsSystems
            .getMetricSystem(METRICS_NAMESPACE, builder => {
              builder.registerCounter(COUNTER_NAME)
            })
    
        // Increment the counter by 5. The metrics system forwards the value
        // to the configured Log Analytics workspace.
        driverMetricsSystem.counter(COUNTER_NAME).inc(5)
      }
    }
    

    The monitoring library includes a sample application that demonstrates how to use the UserMetricsSystem class.
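    For instance, to tie the counter to the rows a streaming query actually processes, you could increment it from a foreachBatch sink, which runs on the driver where the metrics system is available. The following hypothetical sketch builds on the example above (spark, driverMetricsSystem, and COUNTER_NAME are defined there); the rate source is only a stand-in for your real input:

    import org.apache.spark.sql.DataFrame

    // Increment counter1 by the number of rows in each micro-batch.
    val query = spark.readStream
      .format("rate") // built-in test source that generates timestamped rows
      .load()
      .writeStream
      .foreachBatch { (batch: DataFrame, batchId: Long) =>
        driverMetricsSystem.counter(COUNTER_NAME).inc(batch.count())
      }
      .start()

    query.awaitTermination() // block while the query runs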

Send application logs using Log4j

To send your Azure Databricks application logs to Azure Log Analytics using the Log4j appender in the library, follow these steps:

  1. Build the spark-listeners-1.0-SNAPSHOT.jar and the spark-listeners-loganalytics-1.0-SNAPSHOT.jar JAR files as described in the GitHub readme.

  2. Create a log4j.properties configuration file for your application. Include the following configuration properties. Substitute your application package name and log level where indicated:

    log4j.appender.A1=com.microsoft.pnp.logging.loganalytics.LogAnalyticsAppender
    log4j.appender.A1.layout=com.microsoft.pnp.logging.JSONLayout
    log4j.appender.A1.layout.LocationInfo=false
    log4j.additivity.<your application package name>=false
    log4j.logger.<your application package name>=<log level>, A1
    

    You can find a sample configuration file in the monitoring library's GitHub repository.
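    For example, with a hypothetical application package name of com.contoso.myapp and the INFO log level, the last two lines of the template would read:

    log4j.additivity.com.contoso.myapp=false
    log4j.logger.com.contoso.myapp=INFO, A1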

  3. In your application code, include the spark-listeners-loganalytics project, and import com.microsoft.pnp.logging.Log4jConfiguration:

    import com.microsoft.pnp.logging.Log4jConfiguration
    
  4. Configure Log4j using the log4j.properties file you created in step 2:

    val stream = getClass.getResourceAsStream("<path to file in your JAR file>/log4j.properties")
    try {
      // Apply the packaged log4j.properties before any log messages are emitted.
      Log4jConfiguration.configure(stream)
    } finally {
      stream.close()
    }
    
  5. Add Apache Spark log messages at the appropriate level in your code as required. For example, use the logDebug method to send a debug log message. For more information, see Logging in the Spark documentation.

    logTrace("Trace message")
    logDebug("Debug message")
    logInfo("Info message")
    logWarning("Warning message")
    logError("Error message")
    
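    The log* methods above come from Spark's Logging trait, so the class or object that emits the messages must mix that trait in, which is what the monitoring library's sample job does. A minimal sketch, assuming your build already depends on spark-core:

    import org.apache.spark.internal.Logging

    // Mixing in Spark's Logging trait provides logTrace, logDebug, logInfo,
    // logWarning, and logError, backed by the Log4j configuration applied above.
    object LoggingSketch extends Logging {
      def run(): Unit = {
        logInfo("Starting work")
        logWarning("Something looks off")
      }
    }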

Note

If you use the library in Apache Spark notebooks, any logs that Spark generates during execution of the notebook are automatically sent to Log Analytics.

Python support for custom logging messages through the Spark-configured Log4j is limited: logs can be sent only from the driver node, because executor nodes don't have access to the Java Virtual Machine from Python.

Run the sample application

The monitoring library includes a sample application that demonstrates how to send both application metrics and application logs to Azure Monitor. To run the sample:

  1. Build the spark-jobs project in the monitoring library, as described in the GitHub readme.

  2. Navigate to your Databricks workspace and create a new job, as described in Create and run Azure Databricks Jobs.

  3. In the job detail page, select Set JAR.

  4. Upload the JAR file from /src/spark-jobs/target/spark-jobs-1.0-SNAPSHOT.jar.

  5. For Main class, enter com.microsoft.pnp.samplejob.StreamingQueryListenerSampleJob.

  6. Select a cluster that is already configured to use the monitoring library. See Configure Azure Databricks to send metrics to Azure Monitor.

When the job runs, you can view the application logs and metrics in your Log Analytics workspace.

Application logs appear under SparkLoggingEvent_CL:

SparkLoggingEvent_CL | where logger_name_s contains "com.microsoft.pnp"

Application metrics appear under SparkMetric_CL:

SparkMetric_CL | where name_s contains "rowcounter" | limit 50

Important

After you verify the metrics appear, stop the sample application job.

Next steps

Deploy the performance monitoring dashboard that accompanies this code library to troubleshoot performance issues in your production Azure Databricks workloads.