table reference
Applies to: Databricks SQL Databricks Runtime
A table reference is an intermediate result table within SQL. It can be derived from other operators, such as functions, joins or a subquery, reference a base table directly, or be constructed as an inline table.
Syntax
table_reference
{ table_name [ TABLESAMPLE clause ] [ table_alias ] |
{ STREAM table_name [ table_alias ] | STREAM ( table_name ) [ table_alias ] } |
view_name [ table_alias ] |
JOIN clause |
PIVOT clause |
UNPIVOT clause |
[ STREAM ] table_valued_function [ table_alias ] |
[ LATERAL ] table_valued_function [ table_alias ] |
VALUES clause |
[ LATERAL ] ( query ) [ TABLESAMPLE clause ] [ table_alias ] }
Parameters
-
Identifies a table that may contain a temporal specification. See Work with Delta Lake table history for details.
If the table cannot be found Azure Databricks raises a TABLE_OR_VIEW_NOT_FOUND error.
See Column, field, parameter, and variable resolution for more information on name resolution.
Applies to: ** Databricks SQL Databricks Runtime 16.0 and later
table_name
may include an options specification. -
Identifies a view or a common table expression (CTE). If the view cannot be found Azure Databricks raises a TABLE_OR_VIEW_NOT_FOUND error.
See Column, field, parameter, and variable resolution for more information on name resolution.
STREAM
Returns a table or a table valued function as a streaming source. Tables cannot be provided with a temporal specification when used with the
STREAM
keyword. Streaming sources are most commonly used in the definitions of streaming tables.-
Combines two or more relations using a join.
-
Applies to: Databricks SQL Databricks Runtime 12.2 LTS and above.
Used for data perspective; you can get the aggregated values based on specific column value.
Prior to Databricks Runtime 12.0 PIVOT is limited to SELECT following the
FROM
clause. -
Applies to: Databricks SQL Databricks Runtime 12.2 LTS and above.
Used for data perspective; you can split multiple column groups into rows.
[LATERAL] table_valued_function_invocation
Invokes a table valued function. To refer to columns exposed by a preceding
table_reference
in the sameFROM
clause you must specifyLATERAL
.-
Defines an inline table.
[LATERAL] ( query )
Computes a table reference using a query. A query prefixed by
LATERAL
may reference columns exposed by a precedingtable_reference
in the sameFROM
clause. Such a construct is called a correlated or dependent query.-
Optionally reduce the size of the result set by only sampling a fraction of the rows.
-
Optionally specifies a label for the
table_reference
. If thetable_alias
includescolumn_identifier
s their number must match the number of columns in thetable_reference
.
Select on Delta table
In addition to the standard SELECT
options, Delta tables support the time travel options described in this section. For details, see Work with Delta Lake table history.
AS OF
syntax
table_identifier TIMESTAMP AS OF timestamp_expression
table_identifier VERSION AS OF version
timestamp_expression
can be any one of:'2018-10-18T22:15:12.013Z'
, that is, a string that can be cast to a timestampcast('2018-10-18 13:36:32 CEST' as timestamp)
'2018-10-18'
, that is, a date stringcurrent_timestamp() - interval 12 hours
date_sub(current_date(), 1)
- Any other expression that is or can be cast to a timestamp
version
is a long value that can be obtained from the output ofDESCRIBE HISTORY table_spec
.
Neither timestamp_expression
nor version
can be subqueries.
Example
> SELECT * FROM events TIMESTAMP AS OF '2018-10-18T22:15:12.013Z'
> SELECT * FROM events VERSION AS OF 123
@
syntax
Use the @
syntax to specify the timestamp or version. The timestamp must be in yyyyMMddHHmmssSSS
format. You can specify a version after @
by prepending a v
to the version. For example, to query version 123
for the table events
, specify events@v123
.
Example
> SELECT * FROM events@20190101000000000
> SELECT * FROM events@v123
Examples
-- Return a data set from a storage location using a credential.
> SELECT * FROM `csv`.`spreadsheets/data.csv` WITH(CREDENTIAL some_credential);
-- Select from a jdbc data source while controlling the fetch size.
> SELECT * FROM jdbcTable WITH(fetchSize = 0);
-- select all referencable columns from all tables
> SELECT * FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
1 2 3 4
-- select all referencable columns from one table
> SELECT t2.* FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
3 4
-- select all referencable columns from all tables except t2.c4
> SELECT * EXCEPT(c4) FROM VALUES(1, 2) AS t1(c1, c2), VALUES(3, 4) AS t2(c3, c4);
1 2 3
-- select all referencable columns from a table, except a nested field.
> SELECT * EXCEPT(c2.b) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
1 { "a" : 2 }
-- Removing all fields results in an empty struct
> SELECT * EXCEPT(c2.b, c2.a) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
1 { }
-- Overlapping names result in an error
> SELECT * EXCEPT(c2, c2.a) FROM VALUES(1, named_struct('a', 2, 'b', 3)) AS t(c1, c2);
Error: EXCEPT_OVERLAPPING_COLUMNS