STRING type

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime

The type supports character sequences of any length greater or equal to 0.

Syntax

STRING

Literals

[r|R]'c [ ... ]'
  • r or R

    Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime 10.4 LTS and above

    Optional prefix denoting a raw-literal.

  • c

    Any character from the Unicode character set.

The following escape sequences are recognized in regular string literals (without the r prefix) and replaced according to the following rules:

  • \0 -> \u0000, unicode character with the code 0;
  • \b -> \u0008, backspace;
  • \n -> \u000a, linefeed;
  • \r -> \u000d, carriage return;
  • \t -> \u0009, horizontal tab;
  • \Z -> \u001A, substitute;
  • \% -> \%;
  • \_ -> \_;
  • \<other char> -> <other char>, skip the slash and leave the character as is.

If the string is prefixed with r there is no escape character.

The inherent collation of a STRING type is UTF8_BINARY.

See the collation precedence rules for more information.

Note

While you can use double quotes (") instead of single quotes (') to delimit a string literal, this is discouraged as it is not standard SQL.

Warning

Some client interfaces perform macro substitutions on strings before sending them to the SQL parser. For example in a Databricks notebook the $ (dollar) sign introduces a widget and needs to be escaped with \$ to be preserved in a string literal.

Examples

> SELECT 'Spark';
  Spark

> SELECT CAST(5 AS STRING);
  5

> SELECT 'O\'Connell'
  O'Connell

> SELECT 'Some\nText'
  Some
  Text

> SELECT r'Some\nText'
  Some\nText

> SELECT '서울시'
  서울시

> SELECT ''

> SELECT '\\'
  \

> SELECT r'\\'
  \\

-- From a notbook cell (without a widget 'e' being defined)
> SELECT 'Hou$e', 'Hou\$e', r'Hou$e', r'Hou\$e'
  Hou  Hou$e  Hou  Hou$e

> SELECT COLLATION(‘hello’);
  UTF8_BINARY