Поделиться через


Stream Analytics: batching events and trigger on "last"

For a side project, I had to write a few Azure Stream Analytics queries. Not being a SQL -- let alone Stream Analytics -- person, it required some effort and substantial assistance. Hope it helps. 

The goal of this first query is to collect all incoming events from an eventhub over a duration of 10 seconds and then forward the collection to another eventhub as output. 

 SELECT i1.Id, Collect() as messages
INTO output
FROM input 
group by i1.id, SessionWindow(second, 10, 15)

The second one is a bit more interesting. The purpose is to wait for a “last” event in a stream. If no event with a specific ID arrives within 15 seconds, the query needs to trigger an output. Here it is:

 SELECT i1.Id, i1.EventEnqueuedUtcTime as t
into output
FROM Input as i1 TIMESTAMP BY EventEnqueuedUtcTime
LEFT outer JOIN Input as i2 TIMESTAMP BY EventEnqueuedUtcTime
ON i1.Id = i2.Id 
AND DATEDIFF(second,i1,i2) BETWEEN 0 AND 15
and i2.EventEnqueuedUtcTime > i1.EventEnqueuedUtcTime
WHERE i2 IS NULL

There is something counter intuitive about the last line, but it works beautifully. It is important to put the time comparison in the ON clause and not in the WHERE clause. In the last case, you won't get any output.