Adjusting Lateness Tolerance in Azure Stream Analytics

Annie Zhou 0 Reputation points Microsoft Employee
2025-01-16T02:56:18.8833333+00:00

A query is being run using Azure Stream Analytics to send data binned by hour into an event hub, which subsequently sends it into a Kusto database. However, there appears to be an issue, as approximately 1,000 messages are missed each day. The logs indicate the following error:

{"Source":"BlobInputAdapter","Type":"DataError","DataErrorType":"LateInputEvent","BriefMessage":"Input event with application timestamp '2025-01-15T19:59:25.0320000' and arrival time '2025-01-15T20:01:00.0000000' was sent later than configured tolerance.","ErrorCode":"InputEventLateBeyondThreshold","ErrorCategory":"DataError","Message":"Input event with application timestamp '2025-01-15T19:59:25.0320000' and arrival time '2025-01-15T20:01:00.0000000' was sent later than configured tolerance.","EventCount":239}
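For reference, the gap between the application timestamp and the arrival time in this error can be computed directly; it is roughly 95 seconds, so any tolerance below that (the portal default is 5 seconds) will reject these events. A quick check in Python:

```python
from datetime import datetime

def parse_ts(ts: str) -> datetime:
    # Trim the 7-digit fractional seconds to the 6 digits %f accepts.
    head, frac = ts.split(".")
    return datetime.strptime(f"{head}.{frac[:6]}", "%Y-%m-%dT%H:%M:%S.%f")

app_time = parse_ts("2025-01-15T19:59:25.0320000")
arrival  = parse_ts("2025-01-15T20:01:00.0000000")
delay = (arrival - app_time).total_seconds()
print(delay)  # 94.968 seconds late
```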

A) How can Stream Analytics be configured to increase the lateness tolerance to prevent data loss?

B) Could this issue be the cause of the dropped data, or might there be another underlying issue?

Azure Stream Analytics
An Azure real-time analytics service designed for mission-critical workloads.

2 answers

  1. Vinodh247 27,201 Reputation points MVP
    2025-01-16T05:35:27.2866667+00:00

    Hi,

    Thanks for reaching out to Microsoft Q&A.

    A) Configuring Lateness Tolerance in Azure Stream Analytics

    To increase lateness tolerance in Azure Stream Analytics:

    1. Adjust the late arrival tolerance:
      • Open the Stream Analytics job in the Azure portal.
      • Navigate to the job's Event ordering settings (the lateness tolerance is configured at the job level, not per input).
      • Increase the "Events that arrive late" window. This is the time span during which events whose application timestamp trails their arrival time are still accepted.
    2. Change the event ordering configuration:
      • In the same Event ordering settings, set the action taken on events that fall outside the tolerance to one of:
        - Drop: discards the event (this is what produces the LateInputEvent error above).
        - Adjust: moves the event's timestamp forward so the event is still processed.
      • Increase the "Out of order events" tolerance as well if events can arrive out of sequence.
    3. Specify the correct timestamp in the query:
      • Use `TIMESTAMP BY <your timestamp column>` in the FROM clause of the Stream Analytics query so that the application timestamp, rather than the arrival time, drives windowing and lateness checks.
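The steps above can be illustrated with a minimal sketch of how a late-arrival tolerance behaves (a simplified model for illustration only, not Stream Analytics' exact internal semantics):

```python
from datetime import datetime, timedelta

def apply_late_policy(app_ts, arrival_ts, tolerance, policy="Drop"):
    """Simplified model of a late-arrival tolerance.

    If the event arrives more than `tolerance` after its application
    timestamp, it is either dropped or has its timestamp adjusted
    forward, mirroring the Drop/Adjust options in the portal.
    """
    lateness = arrival_ts - app_ts
    if lateness <= tolerance:
        return app_ts                      # on time: keep timestamp as-is
    if policy == "Drop":
        return None                        # event is discarded
    return arrival_ts - tolerance          # "Adjust": clamp the timestamp

app = datetime(2025, 1, 15, 19, 59, 25)   # timestamps from the error above
arr = datetime(2025, 1, 15, 20, 1, 0)

# With a 5-second tolerance the event (95 s late) is dropped...
print(apply_late_policy(app, arr, timedelta(seconds=5)))    # None
# ...but a 120-second tolerance keeps it with its original timestamp.
print(apply_late_policy(app, arr, timedelta(seconds=120)))
```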

    B) Understanding the Cause of Dropped Data

    The LateInputEvent error indicates that messages are arriving later than the configured lateness tolerance. This can be a primary cause of data loss, but it's important to consider other possibilities as well:

    1. Latency in the blob input source:
      • Events might be delayed in being written to, or read from, Blob storage.
    2. Clock drift or inconsistent timestamps:
      • If the application timestamps and the producers' clocks are misaligned, events can be classified as late even when they are not. Ensure clocks are synchronized across the systems producing the data.
    3. Processing delays downstream:
      • Check for delays between the Stream Analytics output to the event hub and its ingestion into the Kusto database. Note that this affects end-to-end latency rather than the LateInputEvent error, which occurs at the input.
    4. Data volume overload:
      • Verify the scale of the Stream Analytics job. If the streaming units (SUs) are insufficient for the workload, processing can lag, causing events to appear late or be dropped.
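Once a lateness distribution has been collected from the logs, a tolerance can be chosen empirically rather than by guesswork. A hypothetical sketch (the sample values and the `suggest_tolerance` helper are made up for illustration; real values would come from your own diagnostics):

```python
# Observed lateness values in seconds (arrival time minus application
# timestamp), gathered from diagnostic logs. These numbers are invented.
latenesses = [2.1, 3.5, 95.0, 4.2, 88.7, 1.0, 92.3, 3.3, 2.8, 90.1]

def suggest_tolerance(latenesses, percentile=0.999, headroom=1.5):
    """Return a tolerance (seconds) covering `percentile` of observed
    lateness values, with multiplicative headroom on top."""
    ordered = sorted(latenesses)
    idx = min(int(len(ordered) * percentile), len(ordered) - 1)
    return ordered[idx] * headroom

print(suggest_tolerance(latenesses))  # 142.5 (95.0 * 1.5)
```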

    Recommendations:

    1. Incrementally adjust the late arrival tolerance window:
      • Start by doubling the current value, monitor the logs, and increase further if needed.
    2. Monitor metrics:
      • Use diagnostic logs in Stream Analytics to monitor event processing metrics. Specifically, look for:
        - Late Input Events
        - Dropped Input Events
        - Output Errors
    3. Increase streaming units (SUs):
      • If the current workload exceeds the capacity of the Stream Analytics job, consider scaling up the number of SUs.
    4. Debug the input and output streams:
      • Analyze each pipeline stage (Blob storage to Event Hubs to Kusto) to identify bottlenecks or misconfigurations causing delays.

    By addressing lateness tolerance and ensuring the overall pipeline is performant and synchronized, you can prevent or minimize data loss.

    Please feel free to click the 'Upvote' (Thumbs-up) button and 'Accept as Answer'. This helps the community by allowing others with similar queries to easily find the solution.


  2. Deepanshu katara 12,960 Reputation points
    2025-01-16T05:36:02.09+00:00

    Hello,

    The error message you provided indicates that the input event was sent later than the configured tolerance for lateness in Azure Stream Analytics. This is likely the cause of the dropped data. To address this issue, you can increase the lateness tolerance in your Stream Analytics job.

    By configuring the "Events that arrive late" setting to the maximum limit of 20 days, you can allow for greater tolerance of late-arriving events. This adjustment helps ensure that events arriving late are processed rather than dropped, thus reducing the risk of data loss. However, be mindful that increasing the lateness tolerance may also lead to delays in output if events are infrequent or sparse.
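    If the job is managed through ARM templates or the REST API rather than the portal, the equivalent job-level properties can be set there. The fragment below reflects the Microsoft.StreamAnalytics streaming job schema as best recalled (verify the property names and the allowed range against the current schema reference before use); 1,728,000 seconds corresponds to 20 days:

    ```json
    {
      "properties": {
        "eventsLateArrivalMaxDelayInSeconds": 1728000,
        "eventsOutOfOrderPolicy": "Adjust",
        "eventsOutOfOrderMaxDelayInSeconds": 10
      }
    }
    ```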

    To answer your second question: this appears to be the cause of the dropped data rather than a separate underlying issue.

    Please let us know if you have any further questions.

    Kindly accept the answer if it helps.

    Thanks

    Deepanshu

