REORG TABLE
Applies to: Databricks SQL Databricks Runtime 11.3 LTS and above
Reorganize a Delta Lake table by rewriting files to purge soft-deleted data, such as the column data dropped by ALTER TABLE DROP COLUMN.
Syntax
REORG [ TABLE ] table_name { [ WHERE predicate ] APPLY ( PURGE ) |
                             APPLY ( UPGRADE UNIFORM ( ICEBERG_COMPAT_VERSION = version ) ) }
For Databricks Runtime versions before 15.4, TABLE is a mandatory keyword.
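As a brief illustration of the optional keyword, the two statements below should be equivalent on a runtime where TABLE may be omitted. The table name events is borrowed from the Examples section; the second form is assumed to fail on runtimes before 15.4.

```sql
-- Form accepted on all supported runtimes
REORG TABLE events APPLY (PURGE);

-- Assumes Databricks Runtime 15.4 or above, where the TABLE keyword is optional
REORG events APPLY (PURGE);
```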
Note
- APPLY (PURGE) only rewrites files that contain soft-deleted data.
- APPLY (UPGRADE) may rewrite all files.
- REORG TABLE is idempotent, meaning that if it is run twice on the same dataset, the second run has no effect.
- After running APPLY (PURGE), the soft-deleted data may still exist in the old files. You can run VACUUM to physically delete the old files.
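The purge-then-vacuum sequence described in the note can be sketched as follows. This is a minimal sketch, not a prescribed workflow: the table events comes from the Examples section, the column legacy_col is hypothetical, and the table is assumed to already have column mapping enabled so that DROP COLUMN is allowed.

```sql
-- Soft-delete a column: this is a metadata-only change, the data stays in the files.
-- legacy_col is a hypothetical column name; assumes column mapping is enabled.
ALTER TABLE events DROP COLUMN legacy_col;

-- Rewrite only the files that still contain the dropped column's data.
REORG TABLE events APPLY (PURGE);

-- Running the same statement again should be a no-op, since REORG TABLE is idempotent.
REORG TABLE events APPLY (PURGE);

-- Physically delete the old, no-longer-referenced files.
VACUUM events;
```

Note that VACUUM only removes files older than the table's retention threshold, so the old files may remain on storage until that period has passed.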
Parameters
- table_name
  Identifies an existing Delta table. The name must not include a temporal specification or options specification.
- WHERE predicate
  For APPLY (PURGE), reorganizes the files that match the given partition predicate. Only filters involving partition key attributes are supported.
- APPLY (PURGE)
  Specifies that the purpose of file rewriting is to purge soft-deleted data. See Purge metadata-only deletes to force data rewrite.
- APPLY (UPGRADE UNIFORM ( ICEBERG_COMPAT_VERSION = version ))
  Applies to: Databricks SQL Databricks Runtime 14.3 and above
  Specifies that the purpose of file rewriting is to upgrade the table to the given Iceberg version.
  version must be either 1 or 2.
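To make the partition-predicate restriction concrete, here is a small sketch. The table events_by_day and its columns are hypothetical and exist only for illustration; the point is that the WHERE clause may reference the partition column (date) but not ordinary data columns.

```sql
-- Hypothetical Delta table partitioned by the date column
CREATE TABLE events_by_day (
  event_id BIGINT,
  payload  STRING,
  date     DATE
)
USING DELTA
PARTITIONED BY (date);

-- Supported: the predicate involves only the partition key
REORG TABLE events_by_day WHERE date = '2022-01-01' APPLY (PURGE);

-- Not supported according to the parameter description above,
-- because event_id is not a partition key (left commented out on purpose):
-- REORG TABLE events_by_day WHERE event_id > 100 APPLY (PURGE);
```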
Examples
> REORG TABLE events APPLY (PURGE);
> REORG TABLE events WHERE date >= '2022-01-01' APPLY (PURGE);
> REORG TABLE events
    WHERE date >= current_timestamp() - INTERVAL '1' DAY
    APPLY (PURGE);
> REORG TABLE events APPLY (UPGRADE UNIFORM(ICEBERG_COMPAT_VERSION=2));