Recognize Printed Text

Reference

Service: Azure AI Services

API Version: 2.1

Optical Character Recognition (OCR) detects text in an image and extracts the recognized characters into a machine-usable character stream. On success, the service returns the OCR results. On failure, it returns an error code together with an error message. The error code can be one of InvalidImageUrl, InvalidImageFormat, InvalidImageSize, NotSupportedImage, NotSupportedLanguage, or InternalServerError.

POST {Endpoint}/vision/v2.1/ocr?detectOrientation={detectOrientation}

With optional parameters:

POST {Endpoint}/vision/v2.1/ocr?detectOrientation={detectOrientation}&language={language}

URI Parameters

Name	In	Required	Type	Description
Endpoint	path	True	string	Supported Cognitive Services endpoints.
detectOrientation	query	True	boolean	Whether to detect the text orientation in the image. With detectOrientation=true, the OCR service tries to detect the image orientation and correct it before further processing (for example, if the image is upside down).
language	query		OcrLanguages	The BCP-47 language code of the text to be detected in the image. The default value is 'unk'.

Request Header

Name	Required	Type	Description
Ocp-Apim-Subscription-Key	True	string

Request Body

Name	Required	Type	Description
url	True	string	Publicly reachable URL of an image.
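
As a rough illustration of how the URI parameters, request header, and request body above fit together, here is a minimal Python sketch that builds (but does not send) the request with the standard library only. The function name build_ocr_request and its arguments are hypothetical, and the Content-Type: application/json header is an assumption that this page does not state.

```python
import json
import urllib.parse
import urllib.request


def build_ocr_request(endpoint, subscription_key, image_url,
                      detect_orientation=True, language=None):
    """Build (without sending) a POST request for the v2.1 OCR operation.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # detectOrientation is required; language is optional (service default: 'unk').
    query = {"detectOrientation": "true" if detect_orientation else "false"}
    if language is not None:
        query["language"] = language

    url = endpoint.rstrip("/") + "/vision/v2.1/ocr?" + urllib.parse.urlencode(query)

    # Request body: an object with the required "url" field.
    body = json.dumps({"url": image_url}).encode("utf-8")

    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        # Assumption: a JSON body is sent as application/json.
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

The returned object could then be passed to urllib.request.urlopen, with the response body decoded as JSON.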

Responses

Name	Type	Description
200 OK	OcrResult	The OCR results in the hierarchy of region/line/word. The results include the text and the bounding boxes for regions, lines, and words. The textAngle property gives the angle, in radians, of the detected text with respect to the closest horizontal or vertical direction. After rotating the input image clockwise by this angle, the recognized text lines become horizontal or vertical. In combination with the orientation property it can be used to overlay recognition results correctly on the original image, by rotating either the original image or recognition results by a suitable angle around the center of the original image. If the angle cannot be confidently detected, this property is not present. If the image contains text at different angles, only part of the text will be recognized correctly.
Other Status Codes	ComputerVisionError	Error response.

Security

Ocp-Apim-Subscription-Key

Type: apiKey
In: header

Examples

Successful RecognizePrintedText request

Sample request

HTTP

POST https://westus.api.cognitive.microsoft.com/vision/v2.1/ocr?detectOrientation=true&language=en

{
  "url": "{url}"
}

Sample response

Status code: 200

{
  "language": "en",
  "textAngle": -2.0000000000000338,
  "orientation": "Up",
  "regions": [
    {
      "boundingBox": "462,379,497,258",
      "lines": [
        {
          "boundingBox": "462,379,497,74",
          "words": [
            {
              "boundingBox": "462,379,41,73",
              "text": "A"
            },
            {
              "boundingBox": "523,379,153,73",
              "text": "GOAL"
            },
            {
              "boundingBox": "694,379,265,74",
              "text": "WITHOUT"
            }
          ]
        },
        {
          "boundingBox": "565,471,289,74",
          "words": [
            {
              "boundingBox": "565,471,41,73",
              "text": "A"
            },
            {
              "boundingBox": "626,471,150,73",
              "text": "PLAN"
            },
            {
              "boundingBox": "801,472,53,73",
              "text": "IS"
            }
          ]
        },
        {
          "boundingBox": "519,563,375,74",
          "words": [
            {
              "boundingBox": "519,563,149,74",
              "text": "JUST"
            },
            {
              "boundingBox": "683,564,41,72",
              "text": "A"
            },
            {
              "boundingBox": "741,564,153,73",
              "text": "WISH"
            }
          ]
        }
      ]
    }
  ]
}
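
To show how the region/line/word hierarchy in a response like the one above might be consumed, the following sketch flattens a parsed result into one string per recognized line. The helper name extract_lines is hypothetical, and joining words with a single space is an assumption that suits space-delimited languages such as English.

```python
def extract_lines(ocr_result):
    """Return the recognized text, one string per OCR line.

    `ocr_result` is the response body already parsed into a dict
    (for example with json.loads). Hypothetical helper for illustration.
    """
    lines = []
    for region in ocr_result.get("regions", []):
        for line in region.get("lines", []):
            # Assumption: words are joined with a single space.
            words = [word["text"] for word in line.get("words", [])]
            lines.append(" ".join(words))
    return lines
```

Applied to the sample response, this would yield the three lines "A GOAL WITHOUT", "A PLAN IS", and "JUST A WISH".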

Definitions

Name	Description
ComputerVisionError	Details about the API request error.
ComputerVisionErrorCodes	The error code.
ImageUrl
OcrLanguages	The BCP-47 language code of the text to be detected in the image. The default value is 'unk'.
OcrLine	An object describing a single recognized line of text.
OcrRegion	A region consists of multiple lines (e.g. a column of text in a multi-column document).
OcrResult
OcrWord	Information on a recognized word.

ComputerVisionError

Details about the API request error.

Name	Type	Description
code	ComputerVisionErrorCodes	The error code.
message	string	A message explaining the error reported by the service.
requestId	string	A unique request identifier.
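
A client will typically want to turn an error response into a structured value. The sketch below assumes the error body is a flat JSON object with exactly the three fields listed in the table above; the OcrError class and parse_error function are hypothetical names, and falling back to "Unspecified" for a missing code is a choice made here, not documented service behavior.

```python
import json
from dataclasses import dataclass


@dataclass
class OcrError:
    """Illustrative container mirroring the ComputerVisionError fields."""
    code: str
    message: str
    request_id: str


def parse_error(body):
    """Parse an error response body (a JSON string) into an OcrError.

    Assumption: the body is a flat object with code, message, and requestId.
    """
    data = json.loads(body)
    return OcrError(
        # "Unspecified" is one of the ComputerVisionErrorCodes values;
        # using it as a fallback is a choice made in this sketch.
        code=data.get("code", "Unspecified"),
        message=data.get("message", ""),
        request_id=data.get("requestId", ""),
    )
```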

ComputerVisionErrorCodes

The error code.

Name	Type	Description
BadArgument	string
CancelledRequest	string
DetectFaceError	string
FailedToProcess	string
InternalServerError	string
InvalidDetails	string
InvalidImageFormat	string
InvalidImageSize	string
InvalidImageUrl	string
InvalidModel	string
InvalidThumbnailSize	string
NotSupportedFeature	string
NotSupportedImage	string
NotSupportedLanguage	string
NotSupportedVisualFeature	string
StorageException	string
Timeout	string
Unspecified	string
UnsupportedMediaType	string

ImageUrl

Name	Type	Description
url	string	Publicly reachable URL of an image.

OcrLanguages

The BCP-47 language code of the text to be detected in the image. The default value is 'unk'.

Name	Type	Description
ar	string
cs	string
da	string
de	string
el	string
en	string
es	string
fi	string
fr	string
hu	string
it	string
ja	string
ko	string
nb	string
nl	string
pl	string
pt	string
ro	string
ru	string
sk	string
sr-Cyrl	string
sr-Latn	string
sv	string
tr	string
unk	string
zh-Hans	string
zh-Hant	string
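
Because an unsupported code leads to a NotSupportedLanguage error, a client may prefer to check the language value before sending a request. This is a minimal sketch using the codes listed above; the names SUPPORTED_OCR_LANGUAGES and resolve_language are hypothetical, and the exact, case-sensitive match is an assumption.

```python
# Codes copied from the OcrLanguages table on this page.
SUPPORTED_OCR_LANGUAGES = frozenset({
    "ar", "cs", "da", "de", "el", "en", "es", "fi", "fr", "hu", "it",
    "ja", "ko", "nb", "nl", "pl", "pt", "ro", "ru", "sk", "sr-Cyrl",
    "sr-Latn", "sv", "tr", "unk", "zh-Hans", "zh-Hant",
})


def resolve_language(code=None):
    """Return a language code to send, defaulting to 'unk'.

    Raises ValueError for codes that are not in the table.
    Assumption: codes are matched exactly, including letter case.
    """
    if code is None:
        return "unk"
    if code not in SUPPORTED_OCR_LANGUAGES:
        raise ValueError("unsupported OCR language code: %r" % (code,))
    return code
```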

OcrLine

An object describing a single recognized line of text.

Name	Type	Description
boundingBox	string	Bounding box of a recognized line. The four integers represent the x-coordinate of the left edge, the y-coordinate of the top edge, width, and height of the bounding box, in the coordinate system of the input image, after it has been rotated around its center according to the detected text angle (see textAngle property), with the origin at the top-left corner, and the y-axis pointing down.
words	OcrWord[]	An array of objects, where each object represents a recognized word.
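
The boundingBox value arrives as a comma-separated string rather than as numbers, so it usually needs parsing before use. The sketch below converts it into a named tuple in the order described above (left, top, width, height); BoundingBox and parse_bounding_box are hypothetical names. The same format is used for regions, lines, and words.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """Left edge, top edge, width, and height, in pixels."""
    left: int
    top: int
    width: int
    height: int

    @property
    def right(self):
        # x-coordinate of the right edge
        return self.left + self.width

    @property
    def bottom(self):
        # y-coordinate of the bottom edge (the y-axis points down)
        return self.top + self.height


def parse_bounding_box(value):
    """Parse a string such as '462,379,497,74' into a BoundingBox."""
    parts = value.split(",")
    if len(parts) != 4:
        raise ValueError("expected four comma-separated integers: %r" % (value,))
    return BoundingBox(*(int(part) for part in parts))
```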

OcrRegion

A region consists of multiple lines (e.g. a column of text in a multi-column document).

Name	Type	Description
boundingBox	string	Bounding box of a recognized region. The four integers represent the x-coordinate of the left edge, the y-coordinate of the top edge, width, and height of the bounding box, in the coordinate system of the input image, after it has been rotated around its center according to the detected text angle (see textAngle property), with the origin at the top-left corner, and the y-axis pointing down.
lines	OcrLine[]	An array of recognized lines of text.

OcrResult

Name	Type	Description
language	string	The BCP-47 language code of the text in the image.
orientation	string	Orientation of the text recognized in the image, if requested. The value (Up, Down, Left, or Right) refers to the direction that the top of the recognized text is facing, after the image has been rotated around its center according to the detected text angle (see textAngle property). If detection of the orientation was not requested, or no text is detected, the value is 'NotDetected'.
regions	OcrRegion[]	An array of objects, where each object represents a region of recognized text.
textAngle	number	The angle, in radians, of the detected text with respect to the closest horizontal or vertical direction. After rotating the input image clockwise by this angle, the recognized text lines become horizontal or vertical. In combination with the orientation property it can be used to overlay recognition results correctly on the original image, by rotating either the original image or recognition results by a suitable angle around the center of the original image. If the angle cannot be confidently detected, this property is not present. If the image contains text at different angles, only part of the text will be recognized correctly.
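
One way to read the textAngle description is as a rotation about the image center. The sketch below maps a point from the rotated coordinate system (in which bounding boxes are reported) back to the original image by applying the inverse rotation. It rests on several assumptions not stated on this page: the rotated image keeps the same size and center, a positive angle is a clockwise rotation on screen with the y-axis pointing down, and the orientation property is handled separately. The function name to_original_coords is hypothetical.

```python
import math


def to_original_coords(x, y, text_angle, image_width, image_height):
    """Map a point from the rotated image back to the original image.

    Assumptions (not confirmed by the service documentation):
    - rotation is about the image center (width / 2, height / 2);
    - with the y-axis pointing down, rotating by +text_angle is clockwise,
      so the inverse mapping rotates by -text_angle;
    - the orientation property (Up/Down/Left/Right) is not applied here.
    """
    cx, cy = image_width / 2.0, image_height / 2.0
    dx, dy = x - cx, y - cy
    cos_a = math.cos(-text_angle)
    sin_a = math.sin(-text_angle)
    # Standard 2D rotation of the offset from the center.
    return (cx + dx * cos_a - dy * sin_a,
            cy + dx * sin_a + dy * cos_a)
```

With text_angle equal to 0 the point is returned unchanged, which is a quick sanity check for any overlay code built on it.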

OcrWord

Information on a recognized word.

Name	Type	Description
boundingBox	string	Bounding box of a recognized word. The four integers represent the x-coordinate of the left edge, the y-coordinate of the top edge, width, and height of the bounding box, in the coordinate system of the input image, after it has been rotated around its center according to the detected text angle (see textAngle property), with the origin at the top-left corner, and the y-axis pointing down.
text	string	String value of a recognized word.
