WATERMARK 子句
適用於: Databricks SQL Databricks Runtime 12.0 和更新版本
將浮浮浮至 select 語句中的關聯。 子 WATERMARK
句僅適用於具狀態串流數據的查詢,其中包括數據流聯結和匯總。
語法
from_item
{ table_name [ TABLESAMPLE clause ] [ watermark_clause ] [ table_alias ] |
JOIN clause |
[ LATERAL ] table_valued_function [ table_alias ] |
VALUE clause |
[ LATERAL ] ( query ) [ TABLESAMPLE clause ] [ watermark_clause ] [ table_alias ] }
watermark_clause
WATERMARK named_expression DELAY OF interval
參數
-
提供型
timestamp
別 值的表達式。 表達式必須是現有數據行的參考,或是針對現有數據行的決定性轉換。 表達式會新增時間戳類型的數據行,用來追蹤浮水印。 新增的數據行可供查詢。 -
定義浮浮浮浮水印延遲閾值的間隔常值。 必須是小於一個月的正值。
範例
-- Creating a streaming table performing time window row count, with defining watermark from existing column
> CREATE OR REFRESH STREAMING TABLE window_agg_1
AS SELECT window(ts, '10 seconds') as w, count(*) as CNT
FROM
STREAM stream_source WATERMARK ts DELAY OF INTERVAL 10 SECONDS AS stream
GROUP BY window(ts, '10 seconds');
-- Creating a streaming table performing time window row count, with deriving a new timestamp column to define watermark
> CREATE OR REFRESH STREAMING TABLE window_agg_2
AS SELECT window(ts, '10 seconds') as w, count(*) as CNT
FROM
STREAM stream_source WATERMARK to_timestamp(ts_str) AS ts DELAY OF INTERVAL 10 SECONDS AS stream
GROUP BY window(ts, '10 seconds');