CACHE SELECT
Applies to: Databricks Runtime
Note
On SQL warehouses or in Databricks Runtime 14.2 and above, the CACHE SELECT statement is ignored.
Caches the data accessed by the specified simple SELECT query in the disk cache.
You can cache a subset of columns by providing a list of column names, and a subset of rows by providing a predicate.
Subsequent queries that read the same data can then avoid scanning the original files as much as possible.
This construct applies only to Delta tables and Parquet tables.
Views are also supported, but the expanded query must itself be a simple query of the form described above.
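As a minimal sketch of the view case, assume a hypothetical view named small_boxes defined over the boxes table used in the Examples section. The view name and its definition are illustrative only and do not come from this page. Because the view expands to a column list, a single table, and a predicate, it should stay within the simple-query restriction:

```sql
-- Hypothetical view (illustration only): a simple projection plus a filter over boxes.
CREATE OR REPLACE VIEW small_boxes AS
  SELECT width, length, height FROM boxes WHERE height = 3;

-- Cache the data the view reads; the expanded query is still a simple SELECT.
CACHE SELECT width, length FROM small_boxes;
```

A view whose definition includes joins or aggregations would likely fall outside the simple-query form and is not a good candidate for this statement.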
Syntax
CACHE SELECT column_name [, ...] FROM table_name [ WHERE boolean_expression ]
See Disk cache vs. Spark cache for the differences between disk caching and the Apache Spark cache.
Parameters
- table_name
Identifies an existing table. The name must not include a temporal specification or options specification.
Examples
CACHE SELECT * FROM boxes
CACHE SELECT width, length FROM boxes WHERE height=3
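To illustrate the intended effect, the following sketch shows a follow-up query that reads only columns and rows covered by the second CACHE SELECT above. The extra width filter is an assumption made for illustration. Whether the query is actually served from the disk cache depends on the disk cache being enabled on the cluster and on the runtime honoring CACHE SELECT (see the note at the top of this page):

```sql
-- Reads a subset of what was cached above: columns width and length, rows where height = 3.
-- The width > 10 filter is illustrative; it narrows the cached rows further.
SELECT width, length
FROM boxes
WHERE height = 3 AND width > 10;
```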