DocumentWordOutput interface

A word object consisting of a contiguous sequence of characters. For languages that do not delimit words with spaces, such as Chinese, Japanese, and Korean, each character is represented as its own word.

Properties

confidence

Confidence of correctly extracting the word.

content

Text content of the word.

polygon

Bounding polygon of the word, with coordinates specified relative to the top-left corner of the page. The array is a flat list of alternating x and y values, one pair per polygon vertex, listed clockwise starting from the left (-180 degrees inclusive) relative to the element orientation.

span

Location of the word in the reading-order concatenated content.

Property Details

confidence

Confidence of correctly extracting the word.

confidence: number

Property Value

number
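
A common use of this value is to discard words the service was unsure about. The following is a minimal sketch, not part of the SDK: `WordLike` is a local stand-in for the two fields it needs, `filterByConfidence` is a hypothetical helper, and the sample values and the 0.8 threshold are illustrative assumptions (the sketch also assumes confidence is reported on a 0 to 1 scale).

```typescript
// Local stand-in for the DocumentWordOutput fields used in this sketch.
interface WordLike {
  content: string;
  confidence: number;
}

// Hypothetical helper: keep only words whose confidence meets the threshold.
function filterByConfidence<T extends WordLike>(words: T[], threshold: number): T[] {
  return words.filter((word) => word.confidence >= threshold);
}

// Illustrative data, not real service output.
const sampleWords: WordLike[] = [
  { content: "Invoice", confidence: 0.98 },
  { content: "t0tal", confidence: 0.41 },
  { content: "2024", confidence: 0.87 },
];

const kept = filterByConfidence(sampleWords, 0.8);
console.log(kept.map((word) => word.content).join(" "));
```

A suitable threshold depends on the documents being processed, so it is usually worth tuning against a sample of real results.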

content

Text content of the word.

content: string

Property Value

string

polygon

Bounding polygon of the word, with coordinates specified relative to the top-left corner of the page. The array is a flat list of alternating x and y values, one pair per polygon vertex, listed clockwise starting from the left (-180 degrees inclusive) relative to the element orientation.

polygon?: number[]

Property Value

number[]
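
Because the polygon arrives as a flat array of numbers, it is often easier to work with once the values are paired into vertices. The sketch below shows one way to do that and to derive an axis-aligned bounding box. `Point`, `polygonToPoints`, and `boundingBox` are hypothetical names introduced for illustration, and the sample coordinates are made up.

```typescript
interface Point {
  x: number;
  y: number;
}

// Hypothetical helper: pair the flat [x1, y1, x2, y2, ...] array into vertices.
// The property is optional, so a missing polygon yields an empty list.
function polygonToPoints(polygon?: number[]): Point[] {
  if (!polygon) return [];
  if (polygon.length % 2 !== 0) {
    throw new Error("polygon must contain an even number of values");
  }
  const points: Point[] = [];
  for (let i = 0; i < polygon.length; i += 2) {
    points.push({ x: polygon[i], y: polygon[i + 1] });
  }
  return points;
}

// Hypothetical helper: axis-aligned box enclosing all vertices.
function boundingBox(points: Point[]) {
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  return {
    minX: Math.min(...xs),
    minY: Math.min(...ys),
    maxX: Math.max(...xs),
    maxY: Math.max(...ys),
  };
}

// Illustrative quadrilateral, not real service output.
const samplePolygon = [1, 1, 3, 1, 3, 2, 1, 2];
const vertices = polygonToPoints(samplePolygon);
const box = boundingBox(vertices);
console.log(vertices.length, box);
```

The bounding box is only an approximation for rotated or skewed words; the original vertices should be kept when exact outlines matter.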

span

Location of the word in the reading-order concatenated content.

span: DocumentSpanOutput

Property Value