DESCRIBE TABLE

Applies to: Databricks SQL, Databricks Runtime

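Returns the basic metadata information of a table. The metadata includes column names, column types, and column comments. Optionally, you can specify a partition spec or a column name to return the metadata for that partition or column, respectively. With Delta tables, not all fields are returned.

The metadata is returned either as a report or as a JSON document.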

Important

Use DESCRIBE AS JSON to parse the describe output programmatically. The non-JSON report format is subject to change.

Syntax

{ DESC | DESCRIBE } [ TABLE ] [ EXTENDED ] table_name { [ PARTITION clause ] | [ column_name ] } [ AS JSON ]

For compatibility, FORMATTED can be specified as a synonym for EXTENDED.

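Parameters

  • EXTENDED

    If specified, displays detailed information about the specified columns, including the column statistics collected by the command, and additional metadata (such as schema qualifier, owner, and access time).

  • table_name

    Identifies the table to describe. The name must not use a temporal specification or an options specification. If the table cannot be found, Azure Databricks raises a TABLE_OR_VIEW_NOT_FOUND error.

  • PARTITION clause

    An optional parameter directing Databricks SQL to return additional metadata for the named partitions.

  • column_name

    An optional parameter with the name of the column to describe. Nested columns currently cannot be specified.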

    JSON format is not supported for individual columns.

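  • AS JSON

    Applies to: Databricks Runtime 16.2 and above

    Optionally returns the table metadata as a JSON string instead of a human-readable report. Use this format when parsing the result with a program.

    Supported only when EXTENDED is specified.

The parameters partition_spec and column_name are mutually exclusive and cannot be specified together.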
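
The constraints in this section can be checked before a statement is sent. The following is a minimal Python sketch, not part of any Databricks API: `build_describe` and its parameter names are hypothetical, and the helper inserts the table name, partition clause, and column name verbatim without quoting or escaping them.

```python
from typing import Optional


def build_describe(
    table_name: str,
    extended: bool = False,
    partition_clause: Optional[str] = None,
    column_name: Optional[str] = None,
    as_json: bool = False,
) -> str:
    """Assemble a DESCRIBE TABLE statement (hypothetical helper for illustration)."""
    # A partition clause and a column name cannot be specified together.
    if partition_clause and column_name:
        raise ValueError("PARTITION clause and column_name are mutually exclusive")
    # AS JSON is supported only when EXTENDED is specified.
    if as_json and not extended:
        raise ValueError("AS JSON is supported only together with EXTENDED")

    parts = ["DESCRIBE", "TABLE"]
    if extended:
        parts.append("EXTENDED")
    parts.append(table_name)  # inserted as-is; the caller must pass a valid name
    if partition_clause:
        parts.append(partition_clause)
    if column_name:
        parts.append(column_name)
    if as_json:
        parts.append("AS JSON")
    return " ".join(parts)


print(build_describe("customer", extended=True, as_json=True))
print(build_describe("customer", extended=True,
                     partition_clause="PARTITION (state = 'AR')"))
```

Rejecting invalid combinations on the client side is only a convenience; the server remains the authority on what it accepts.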

JSON formatted output

When AS JSON is specified, the output is returned as a JSON string. The following schema is supported:

{
  "table_name": "<table_name>",
  "catalog_name": "<catalog_name>",
  "schema_name": "<schema_name>",
  "namespace": ["<schema_name>"],
  "type": "<table_type>",
  "provider": "<provider>",
  "columns": [
    {
      "name": "<name>",
      "type": <type_json>,
      "comment": "<comment>",
      "nullable": <boolean>,
      "default": "<default_val>"
    }
  ],
  "partition_values": {
    "<col_name>": "<val>"
  },
  "location": "<path>",
  "view_text": "<view_text>",
  "view_original_text": "<view_original_text>",
  "view_schema_mode": "<view_schema_mode>",
  "view_catalog_and_namespace": "<view_catalog_and_namespace>",
  "view_query_output_columns": ["<col_name>"],
  "comment": "<comment>",
  "table_properties": {
    "property1": "<property1>",
    "property2": "<property2>"
  },
  "statistics": {
    "num_rows": <count>,
    "size_in_bytes": <bytes>,
    "table_change_stats": {
      "inserted": <count>,
      "deleted": <count>,
      "updated": <count>,
      "change_percent": <percent_changed_float>
    }
  },
  "storage_properties": {
    "property1": "<property1>",
    "property2": "<property2>"
  },
  "serde_library": "<serde_library>",
  "input_format": "<input_format>",
  "output_format": "<output_format>",
  "num_buckets": <num_buckets>,
  "bucket_columns": ["<col_name>"],
  "sort_columns": ["<col_name>"],
  "created_time": "<timestamp_ISO-8601>",
  "created_by": "<created_by>",
  "last_access": "<timestamp_ISO-8601>",
  "partition_provider": "<partition_provider>"
}
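
As a sketch of how a program might consume a document with this schema, the Python snippet below parses a JSON string and reads a few fields. The sample document is hand-written to follow the schema and was not captured from a real workspace; the `summarize` helper is hypothetical. The sketch also assumes that any field may be absent, so it reads optional fields with `dict.get`. Obtaining the JSON string from the query result is outside its scope.

```python
import json

# Hypothetical sample that follows the schema above (not real output).
raw = """
{
  "table_name": "customer",
  "catalog_name": "spark_catalog",
  "schema_name": "salesdb",
  "type": "MANAGED",
  "columns": [
    {"name": "cust_id", "type": {"name": "int"}, "nullable": true},
    {"name": "name", "type": {"name": "string"}, "comment": "Short name", "nullable": true}
  ],
  "statistics": {"num_rows": 1, "size_in_bytes": 659}
}
"""


def summarize(doc: dict) -> dict:
    """Pick a few fields out of a parsed DESCRIBE ... AS JSON document."""
    stats = doc.get("statistics", {})
    name_parts = [doc[k] for k in ("catalog_name", "schema_name", "table_name") if k in doc]
    return {
        "full_name": ".".join(name_parts),
        # (column name, type name, comment or None) for every column
        "columns": [
            (c["name"], c["type"]["name"], c.get("comment"))
            for c in doc.get("columns", [])
        ],
        "num_rows": stats.get("num_rows"),
        "location": doc.get("location"),  # None when the field is missing
    }


summary = summarize(json.loads(raw))
print(summary["full_name"])
for col_name, type_name, comment in summary["columns"]:
    print(col_name, type_name, comment)
```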

The following are the schema definitions for <type_json>:

SQL type                         JSON representation
TINYINT                          { "name" : "tinyint" }
SMALLINT                         { "name" : "smallint" }
INT                              { "name" : "int" }
BIGINT                           { "name" : "bigint" }
FLOAT                            { "name" : "float" }
DOUBLE                           { "name" : "double" }
DECIMAL(p, s)                    { "name" : "decimal", "precision": p, "scale": s }
STRING                           { "name" : "string" }
VARCHAR(n)                       { "name" : "varchar", "length": n }
CHAR(n)                          { "name" : "char", "length": n }
BINARY                           { "name" : "binary" }
BOOLEAN                          { "name" : "boolean" }
DATE                             { "name" : "date" }
TIMESTAMP                        { "name" : "timestamp_ltz" }
TIMESTAMP_NTZ                    { "name" : "timestamp_ntz" }
INTERVAL start_unit TO end_unit  { "name" : "interval", "start_unit": "<start_unit>", "end_unit": "<end_unit>" }
ARRAY<element_type>              { "name" : "array", "element_type": <type_json>, "element_nullable": <boolean_val> }
MAP<key_type, value_type>        { "name" : "map", "key_type": <type_json>, "value_type": <type_json>, "element_nullable": <boolean_val> }
STRUCT<field_name …, …>          { "name" : "struct", "fields": [ {"name" : "<field_name>", "type" : <type_json>, "nullable": <boolean_val>, "comment": "<field_comment>", "default": "<default_val>"}] }
VARIANT                          { "name" : "variant" }
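
Because <type_json> is recursive, a consumer typically walks it recursively. The Python sketch below turns a parsed <type_json> value back into a SQL type string. The `render_type` function is hypothetical, and several details are assumptions rather than documented behavior: interval units are treated as plain strings such as "year", `end_unit` is treated as optional, and struct fields are rendered as `name: TYPE`.

```python
def render_type(t: dict) -> str:
    """Render a parsed <type_json> object as a SQL type string (illustrative only)."""
    name = t["name"]
    if name == "decimal":
        return f"DECIMAL({t['precision']}, {t['scale']})"
    if name in ("varchar", "char"):
        return f"{name.upper()}({t['length']})"
    if name == "interval":
        # Assumption: units are plain strings and end_unit may be missing.
        start, end = t["start_unit"], t.get("end_unit")
        if end is None or end == start:
            return f"INTERVAL {start.upper()}"
        return f"INTERVAL {start.upper()} TO {end.upper()}"
    if name == "array":
        return f"ARRAY<{render_type(t['element_type'])}>"
    if name == "map":
        return f"MAP<{render_type(t['key_type'])}, {render_type(t['value_type'])}>"
    if name == "struct":
        fields = ", ".join(
            f"{f['name']}: {render_type(f['type'])}" for f in t["fields"]
        )
        return f"STRUCT<{fields}>"
    if name == "timestamp_ltz":
        return "TIMESTAMP"
    # Remaining simple types map to their upper-cased name, e.g. "bigint" -> BIGINT.
    return name.upper()


nested = {
    "name": "map",
    "key_type": {"name": "string"},
    "value_type": {
        "name": "array",
        "element_type": {"name": "decimal", "precision": 10, "scale": 2},
        "element_nullable": True,
    },
    "element_nullable": True,
}
print(render_type(nested))  # MAP<STRING, ARRAY<DECIMAL(10, 2)>>
```

The sketch ignores nullability flags, comments, and defaults; a fuller renderer would need to carry those through as well.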

Examples

-- Creates a table `customer`. Assumes current schema is `salesdb`.
> CREATE TABLE customer(
        cust_id INT,
        state VARCHAR(20),
        name STRING COMMENT 'Short name'
    )
    USING parquet
    PARTITIONED BY (state);

> INSERT INTO customer PARTITION (state = 'AR') VALUES (100, 'Mike');

-- Returns basic metadata information for unqualified table `customer`
> DESCRIBE TABLE customer;
                col_name data_type    comment
 ----------------------- --------- ----------
                 cust_id       int       null
                    name    string Short name
                   state    string       null
 # Partition Information
              # col_name data_type    comment
                   state    string       null

-- Returns basic metadata information for qualified table `customer`
> DESCRIBE TABLE salesdb.customer;
                col_name data_type    comment
 ----------------------- --------- ----------
                 cust_id       int       null
                    name    string Short name
                   state    string       null
 # Partition Information
              # col_name data_type    comment
                   state    string       null

-- Returns additional metadata such as parent schema, owner, access time etc.
> DESCRIBE TABLE EXTENDED customer;
                     col_name                      data_type    comment
 ---------------------------- ------------------------------ ----------
                      cust_id                            int       null
                         name                         string Short name
                        state                         string       null
      # Partition Information
                   # col_name                      data_type    comment
                        state                         string       null

 # Detailed Table Information
                     Database                        default
                        Table                       customer
                        Owner                  <TABLE OWNER>
                 Created Time   Tue Apr 07 22:56:34 JST 2020
                  Last Access                        UNKNOWN
                   Created By                <SPARK VERSION>
                         Type                        MANAGED
                     Provider                        parquet
                     Location file:/tmp/salesdb.db/custom...
                Serde Library org.apache.hadoop.hive.ql.i...
                  InputFormat org.apache.hadoop.hive.ql.i...
                 OutputFormat org.apache.hadoop.hive.ql.i...
           Partition Provider                        Catalog

-- Returns partition metadata such as partitioning column name, column type and comment.
> DESCRIBE TABLE EXTENDED customer PARTITION (state = 'AR');
                       col_name                      data_type    comment
 ------------------------------ ------------------------------ ----------
                         cust_id                            int       null
                           name                         string Short name
                          state                         string       null
        # Partition Information
                     # col_name                      data_type    comment
                          state                         string       null

 # Detailed Partition Inform...
                       Database                        default
                          Table                       customer
               Partition Values                     [state=AR]
                       Location file:/tmp/salesdb.db/custom...
                  Serde Library org.apache.hadoop.hive.ql.i...
                    InputFormat org.apache.hadoop.hive.ql.i...
                   OutputFormat org.apache.hadoop.hive.ql.i...
             Storage Properties [serialization.format=1, pa...
           Partition Parameters {transient_lastDdlTime=1586...
                   Created Time   Tue Apr 07 23:05:43 JST 2020
                    Last Access                        UNKNOWN
           Partition Statistics                      659 bytes

          # Storage Information
                       Location file:/tmp/salesdb.db/custom...
                  Serde Library org.apache.hadoop.hive.ql.i...
                    InputFormat org.apache.hadoop.hive.ql.i...
                   OutputFormat org.apache.hadoop.hive.ql.i...
 ------------------------------ ------------------------------ ----------

-- Returns the metadata for `name` column.
-- Optional `TABLE` clause is omitted and column is fully qualified.
> DESCRIBE customer salesdb.customer.name;
 info_name info_value
 --------- ----------
  col_name       name
 data_type     string
   comment Short name

-- Returns the table metadata in JSON format.
> DESCRIBE EXTENDED customer AS JSON;
{
  "table_name":"customer",
  "catalog_name":"spark_catalog",
  "schema_name":"default",
  "namespace":["default"],
  "columns":[
    {"name":"cust_id","type":{"name":"int"},"nullable":true},
    {"name":"name","type":{"name":"string"},"comment":"Short name","nullable":true},
    {"name":"state","type":{"name":"varchar","length":20},"nullable":true}],
  "location": "file:/tmp/salesdb.db/custom...",
  "created_time":"2020-04-07T14:05:43Z",
  "last_access":"UNKNOWN",
  "created_by":"None",
  "type":"MANAGED",
  "provider":"parquet",
  "partition_provider":"Catalog",
  "partition_columns":["state"]}

DESCRIBE DETAIL

DESCRIBE DETAIL [schema_name.]table_name

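Returns information about the schema, partitioning, table size, and so on. For example, for Delta tables, you can see the current reader and writer versions of a table. See Review Delta Lake table details with describe detail for the detail schema.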