to_avro function

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime 15.4 and later

Returns a Avro binary value with the specified input value.

Syntax

to_avro(expr [, avroSchemaSpec] )

Arguments

  • expr: An expression.
  • avroSchemaSpec: An optional target schema in JSON format. If specified it must match the expr type as specified in Notes.

Returns

An Avro encoded BINARY.

Notes

The mapping of SQL types to Avro types is as follows:

SQL type Avro schema
VOID { "type" : "null" }
BOOLEAN { "type" : "boolean" }
TINYINT { "type" : "int" }
SMALLINT { "type" : "int" }
INT { "type" : "int" }
BIGINT { "type" : "long" }
DECIMAL(p, s) { "type": "fixed", "name": "a", logicalType": "decimal", "size": (p+1)/2, precision": p, "scale": s }
FLOAT { "type" : "float" }
DOUBLE { "type" : "float" }
STRING { "type" : "string" }
DATE { "type" : "int", "logicalType" : "date" }
TIMESTAMP { "type" : "long" } as microseconds since 1970-01-01 00:00:00.000000
TIMESTAMP_NTZ { "type" : "long" } as microseconds since 1970-01-01 00:00:00.000000
YEAR MONTH INTERVAL { "type" : "long" } as months
DAY TIME INTERVAL { "type" : "long" } as microseconds
BINARY { "type" : "bytes" }
STRUCT<field1 type1, ...> { "type" : "record", "name": "struct_name", "fields": [ { "name" : "field1", "type" : ... }, ... ] }
ARRAY<type> { "type" : "array", "items": { "type" : ... }
MAP<STRING, valueType> { "type" : "map", "keyType": { "type" : ... }, "valueType": { "type" : ... } }
MAP<nonStringType, valueType> Not supported
VARIANT Not supported

Examples

> SELECT from_avro(to_avro(5), '{ "type" : "int" }');
  5

> SELECT from_avro(to_avro(5, '{ "type" : "int" }'), '{ "type" : "int" }');
  5

> SELECT from_avro(to_avro(named_struct('num', 5, 'txt', 'hello')), '{ "type" : "record", "name": "my_record", "fields": [{ "name": "num", "type": "int"}, { "name": "txt", "type": "string"}]}');
  {"num":5,"txt":"hello"}