Analyze - Image

Reference

Service: Azure AI Services

API Version: 2024-02-01

Analyze the input image. The request contains either an image stream with a content type of 'image/*' or 'application/octet-stream', or a JSON payload with a url property that the service uses to retrieve the image stream.

POST /imageanalysis:analyze?api-version=2024-02-01

With optional parameters:

POST /imageanalysis:analyze?features={features}&language={language}&model-version={model-version}&smartcrops-aspect-ratios={smartcrops-aspect-ratios}&gender-neutral-caption={gender-neutral-caption}&api-version=2024-02-01

URI Parameters

Name	In	Required	Type	Description
api-version	query	True	string	Requested API version.
features	query		VisualFeature[]	The visual features requested. At least one visual feature must be specified.
gender-neutral-caption	query		boolean	Boolean flag for enabling gender-neutral captioning for caption and denseCaptions features. If this parameter is not specified, the default value is "false".
language	query		string	The desired language for output generation. If this parameter is not specified, the default value is "en". See https://aka.ms/cv-languages for a list of supported languages.
model-version	query		string	Model version.
smartcrops-aspect-ratios	query		array[]	A list of aspect ratios to use for smartCrops feature. Aspect ratios are calculated by dividing the target crop width by the height. Supported values are between 0.75 and 1.8 (inclusive). Multiple values should be comma-separated. If this parameter is not specified, the service will return one crop suggestion with an aspect ratio it sees fit between 0.5 and 2.0 (inclusive).
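The query parameters above can be assembled as in the following minimal sketch. The function name `build_analyze_url` and the `endpoint` argument are hypothetical (the reference lists only the path and the parameters, not a host). Joining multiple features with commas and sending the boolean as lowercase `true`/`false` are assumptions. The range check mirrors the supported values stated for `smartcrops-aspect-ratios`.

```python
from urllib.parse import urlencode

API_VERSION = "2024-02-01"


def build_analyze_url(endpoint, features, language=None, model_version=None,
                      smartcrops_aspect_ratios=None, gender_neutral_caption=None):
    """Return the request URL for the Analyze operation (hypothetical helper)."""
    if not features:
        # Mirrors the description of `features`: at least one must be specified.
        raise ValueError("at least one visual feature must be specified")
    # Assumption: multiple features are sent as one comma-separated value.
    params = {"features": ",".join(features)}
    if language is not None:
        params["language"] = language
    if model_version is not None:
        params["model-version"] = model_version
    if smartcrops_aspect_ratios:
        for ratio in smartcrops_aspect_ratios:
            # Supported values are between 0.75 and 1.8 (inclusive).
            if not 0.75 <= ratio <= 1.8:
                raise ValueError(f"aspect ratio {ratio} is outside 0.75 to 1.8")
        params["smartcrops-aspect-ratios"] = ",".join(
            str(r) for r in smartcrops_aspect_ratios)
    if gender_neutral_caption is not None:
        # Assumption: the boolean is serialized in lowercase.
        params["gender-neutral-caption"] = "true" if gender_neutral_caption else "false"
    params["api-version"] = API_VERSION
    # safe="," keeps list separators as literal commas instead of %2C.
    query = urlencode(params, safe=",")
    return f"{endpoint.rstrip('/')}/imageanalysis:analyze?{query}"
```

Omitted optional parameters are left out of the query string so that the service defaults described in the table apply.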

Request Body

Name	Required	Type	Description
url	True	string	Publicly reachable URL of an image.
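The two request forms described for this operation (a JSON document with a url property, or a raw image stream) could be prepared as in this sketch. The helper name `make_request_body` is hypothetical. It returns only the content type and the body bytes. Authentication headers are outside the scope of this reference and are not shown.

```python
import json


def make_request_body(image_url=None, image_bytes=None):
    """Return (content_type, body) for exactly one of the two input forms."""
    if (image_url is None) == (image_bytes is None):
        raise ValueError("provide exactly one of image_url or image_bytes")
    if image_url is not None:
        # JSON form: the service retrieves the image from this URL.
        body = json.dumps({"url": image_url}).encode("utf-8")
        return "application/json", body
    # Stream form: the image content itself is the request body.
    return "application/octet-stream", bytes(image_bytes)
```

For the stream form, a specific 'image/*' content type could be returned instead when the image format is known.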

Responses

Name	Type	Description
200 OK	ImageAnalysisResult	Success
Other Status Codes	ErrorResponse	Error. Headers: x-ms-error-code (string)

Examples

ImageAnalysis_Analyze_MaximumSet_Gen

Sample request

HTTP

POST /imageanalysis:analyze?features=tags&language=hduryxtlvjjvwnmpjiojibvjy&model-version=kkblitshktun&smartcrops-aspect-ratios=&gender-neutral-caption=True&api-version=2024-02-01

{
  "url": "https://microsoft.com/a"
}

Sample response

Status code: 200

{
  "captionResult": {
    "text": "azcggjzjuvbytsq",
    "confidence": 0
  },
  "objectsResult": {
    "values": [
      {
        "id": "iaofvdltgfjrsffgltupmo",
        "boundingBox": {
          "x": 0,
          "y": 0,
          "w": 27,
          "h": 13
        },
        "tags": [
          {
            "name": "expoctetvqe",
            "confidence": 0
          }
        ]
      }
    ]
  },
  "readResult": {
    "blocks": [
      {
        "lines": [
          {
            "text": "npk",
            "boundingPolygon": [
              {
                "x": 0,
                "y": 0
              },
              {
                "x": 0,
                "y": 0
              },
              {
                "x": 0,
                "y": 0
              },
              {
                "x": 0,
                "y": 0
              }
            ],
            "words": [
              {
                "text": "wljuxeeadklupdpxgcinka",
                "boundingPolygon": [
                  {
                    "x": 0,
                    "y": 0
                  },
                  {
                    "x": 0,
                    "y": 0
                  },
                  {
                    "x": 0,
                    "y": 0
                  },
                  {
                    "x": 0,
                    "y": 0
                  }
                ],
                "confidence": 0
              }
            ]
          }
        ]
      }
    ]
  },
  "denseCaptionsResult": {
    "values": [
      {
        "text": "pqrcyrtz",
        "confidence": 0,
        "boundingBox": {
          "x": 0,
          "y": 0,
          "w": 27,
          "h": 13
        }
      }
    ]
  },
  "modelVersion": "hslbdtpcuyabri",
  "metadata": {
    "width": 10,
    "height": 27
  },
  "tagsResult": {
    "values": [
      {
        "name": "expoctetvqe",
        "confidence": 0
      }
    ]
  },
  "smartCropsResult": {
    "values": [
      {
        "aspectRatio": 23,
        "boundingBox": {
          "x": 0,
          "y": 0,
          "w": 27,
          "h": 13
        }
      }
    ]
  },
  "peopleResult": {
    "values": [
      {
        "boundingBox": {
          "x": 0,
          "y": 0,
          "w": 27,
          "h": 13
        },
        "confidence": 0
      }
    ]
  }
}

ImageAnalysis_Analyze_MinimumSet_Gen

Sample request

HTTP

POST /imageanalysis:analyze?api-version=2024-02-01

{
  "url": "https://www.abc.com"
}

Sample response

Status code: 200

{
  "modelVersion": "cvhbhwpfswz",
  "metadata": {
    "width": 10,
    "height": 23
  }
}

Definitions

Name	Description
BoundingBox	A bounding box for an area inside an image.
CaptionResult	A brief description of what the image depicts.
ContentTag	An entity observation in the image, along with the confidence score.
CropRegion	A region identified for smart cropping. There will be one region returned for each requested aspect ratio.
DenseCaption	A brief description of what the image depicts.
DenseCaptionsResult	A list of captions.
DetectedObject	Describes a detected object in an image.
DetectedPerson	A person detected in an image.
DetectedTextBlock	A detected text block.
DetectedTextLine	A detected text line.
DetectedTextWord	A detected word consisting of a contiguous sequence of characters. For non-space delimited languages, such as Chinese, Japanese, and Korean, each character is represented as its own word.
ErrorResponse	Response returned when an error occurs.
ErrorResponseDetails	Error info.
ErrorResponseInnerError	Detailed error.
ImageAnalysisResult	Describes the combined results of different types of image analysis.
ImageMetadata	The image metadata information such as height and width.
ImagePoint	An object representing a point in the image.
ImageUrl	A JSON document with a URL pointing to the publicly accessible image to be analyzed.
ObjectsResult	Describes detected objects in an image.
PeopleResult	An object describing whether the image contains people.
ReadResult	The results of a Read operation.
SmartCropsResult	Smart cropping result.
TagsResult	A list of tags with confidence level.
VisualFeature	The visual features requested. At least one visual feature must be specified.

BoundingBox

A bounding box for an area inside an image.

Name	Type	Description
h	integer	Height measured from the top-left point of the area, in pixels.
w	integer	Width measured from the top-left point of the area, in pixels.
x	integer	Left-coordinate of the top left point of the area, in pixels.
y	integer	Top-coordinate of the top left point of the area, in pixels.
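Many drawing and cropping libraries expect corner coordinates rather than an origin plus a size. A minimal sketch of the conversion follows. The function name `box_corners` is hypothetical, and treating the right and bottom values as exclusive pixel edges is an assumption, since the reference gives only the origin, width, and height.

```python
def box_corners(box):
    """Convert an {x, y, w, h} BoundingBox into (left, top, right, bottom)."""
    left, top = box["x"], box["y"]
    # Assumption: right/bottom are the exclusive edges, i.e. origin plus size.
    return left, top, left + box["w"], top + box["h"]
```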

CaptionResult

A brief description of what the image depicts.

Name	Type	Description
confidence	number	The level of confidence the service has in the caption. Confidence scores span the range of 0.0 to 1.0 (inclusive), with higher values indicating a higher confidence of a match.
text	string	The text of the caption.

ContentTag

An entity observation in the image, along with the confidence score.

Name	Type	Description
confidence	number	The level of confidence that the entity was observed. Confidence scores span the range of 0.0 to 1.0 (inclusive), with higher values indicating a higher confidence of a match.
name	string	Name of the entity.

CropRegion

A region identified for smart cropping. There will be one region returned for each requested aspect ratio.

Name	Type	Description
aspectRatio	number	The aspect ratio of the crop region.
boundingBox	BoundingBox	A bounding box for an area inside an image.

DenseCaption

A brief description of what the image depicts.

Name	Type	Description
boundingBox	BoundingBox	A bounding box for an area inside an image.
confidence	number	The level of confidence the service has in the caption. Confidence scores span the range of 0.0 to 1.0 (inclusive), with higher values indicating a higher confidence of a match.
text	string	The text of the caption.

DenseCaptionsResult

A list of captions.

Name	Type	Description
values	DenseCaption[]	A list of captions.

DetectedObject

Describes a detected object in an image.

Name	Type	Description
boundingBox	BoundingBox	A bounding box for an area inside an image.
id	string	Id of the detected object.
tags	ContentTag[]	Classification confidences of the detected object.

DetectedPerson

A person detected in an image.

Name	Type	Description
boundingBox	BoundingBox	A bounding box for an area inside an image.
confidence	number	Confidence score of having observed the person in the image. Confidence scores span the range of 0.0 to 1.0 (inclusive), with higher values indicating a higher confidence of a match.

DetectedTextBlock

A detected text block.

Name	Type	Description
lines	DetectedTextLine[]	List of text lines in the text block.

DetectedTextLine

A detected text line.

Name	Type	Description
boundingPolygon	ImagePoint[]	Bounding polygon of the text line.
text	string	Text content of the detected text line.
words	DetectedTextWord[]	List of words in the text line.

DetectedTextWord

A detected word consisting of a contiguous sequence of characters. For non-space delimited languages, such as Chinese, Japanese, and Korean, each character is represented as its own word.

Name	Type	Description
boundingPolygon	ImagePoint[]	Bounding polygon of the word.
confidence	number	The level of confidence that the word was detected. Confidence scores span the range of 0.0 to 1.0 (inclusive), with higher values indicating a higher confidence of a match.
text	string	Text content of the word.

ErrorResponse

Response returned when an error occurs.

Name	Type	Description
error	ErrorResponseDetails	Error info.

ErrorResponseDetails

Error info.

Name	Type	Description
code	string	Error code.
details	ErrorResponseDetails[]	List of detailed errors.
innererror	ErrorResponseInnerError	Detailed error.
message	string	Error message.
target	string	Target of the error.

ErrorResponseInnerError

Detailed error.

Name	Type	Description
code	string	Error code.
innererror	ErrorResponseInnerError	Detailed error.
message	string	Error message.

ImageAnalysisResult

Describes the combined results of different types of image analysis.

Name	Type	Description
captionResult	CaptionResult	A brief description of what the image depicts.
denseCaptionsResult	DenseCaptionsResult	A list of captions.
metadata	ImageMetadata	The image metadata information such as height and width.
modelVersion	string	Model Version.
objectsResult	ObjectsResult	Describes detected objects in an image.
peopleResult	PeopleResult	An object describing whether the image contains people.
readResult	ReadResult	The results of a Read operation.
smartCropsResult	SmartCropsResult	Smart cropping result.
tagsResult	TagsResult	A list of tags with confidence level.
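As the sample responses show, a result may carry only some of these properties. The following sketch reports which analysis results are present in a parsed response. The mapping from result property to VisualFeature name is inferred from the naming in this reference and should be treated as an assumption. `returned_features` is a hypothetical helper.

```python
# Assumed correspondence between result properties and VisualFeature values.
RESULT_TO_FEATURE = {
    "captionResult": "caption",
    "denseCaptionsResult": "denseCaptions",
    "objectsResult": "objects",
    "peopleResult": "people",
    "readResult": "read",
    "smartCropsResult": "smartCrops",
    "tagsResult": "tags",
}


def returned_features(result):
    """List the feature names whose result property appears in the response."""
    return sorted(feature for key, feature in RESULT_TO_FEATURE.items()
                  if key in result)
```

Applied to the minimum-set sample response, which contains only modelVersion and metadata, this returns an empty list.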

ImageMetadata

The image metadata information such as height and width.

Name	Type	Description
height	integer	The height of the image in pixels.
width	integer	The width of the image in pixels.

ImagePoint

An object representing a point in the image.

Name	Type	Description
x	integer	The x-coordinate of this point.
y	integer	The y-coordinate of this point.

ImageUrl

A JSON document with a URL pointing to the publicly accessible image to be analyzed.

Name	Type	Description
url	string	Publicly reachable URL of an image.

ObjectsResult

Describes detected objects in an image.

Name	Type	Description
values	DetectedObject[]	An array of detected objects.

PeopleResult

An object describing whether the image contains people.

Name	Type	Description
values	DetectedPerson[]	An array of detected people.

ReadResult

The results of a Read operation.

Name	Type	Description
blocks	DetectedTextBlock[]	A list of text blocks.
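The nesting of blocks, lines, and words can be flattened when only the text is needed. A minimal sketch, assuming the response has already been parsed into a dictionary. Both function names and the default threshold of 0.5 are hypothetical.

```python
def read_lines(read_result):
    """Return the text of every detected line, in response order."""
    return [line["text"]
            for block in read_result.get("blocks", [])
            for line in block.get("lines", [])]


def low_confidence_words(read_result, threshold=0.5):
    """Return the text of words whose confidence is below the threshold."""
    return [word["text"]
            for block in read_result.get("blocks", [])
            for line in block.get("lines", [])
            for word in line.get("words", [])
            if word["confidence"] < threshold]
```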

SmartCropsResult

Smart cropping result.

Name	Type	Description
values	CropRegion[]	Recommended regions for cropping the image.
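Since one region is returned for each requested aspect ratio, a caller may want to look up the region for a particular ratio. The sketch below picks the region whose aspectRatio is closest to the requested value. Matching by nearest value rather than exact equality is an assumption made to tolerate rounding, and `region_for_ratio` is a hypothetical name.

```python
def region_for_ratio(smart_crops_result, requested_ratio):
    """Return the CropRegion closest to the requested ratio, or None if empty."""
    regions = smart_crops_result.get("values", [])
    if not regions:
        return None
    # Aspect ratio is width divided by height, as for the request parameter.
    return min(regions, key=lambda r: abs(r["aspectRatio"] - requested_ratio))
```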

TagsResult

A list of tags with confidence level.

Name	Type	Description
values	ContentTag[]	A list of tags with confidence level.
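Confidence scores range from 0.0 to 1.0, so a common client-side step is to keep only tags above a cutoff. A minimal sketch follows. The function name `confident_tags` and the default threshold of 0.5 are illustrative assumptions, not values recommended by the service.

```python
def confident_tags(tags_result, threshold=0.5):
    """Return tag names with confidence at or above the threshold, best first."""
    kept = [tag for tag in tags_result.get("values", [])
            if tag["confidence"] >= threshold]
    kept.sort(key=lambda tag: tag["confidence"], reverse=True)
    return [tag["name"] for tag in kept]
```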

VisualFeature

The visual features requested. At least one visual feature must be specified.

Name	Type	Description
caption	string	A description or a caption summarizing the content of the image.
denseCaptions	string	Detailed captions providing in-depth descriptions of the image content.
objects	string	Specific objects recognized and labeled in the image.
people	string	Detection and analysis of people in the image.
read	string	Textual content extracted from the image, such as signs or labels.
smartCrops	string	Automatically generated cropped versions of the image focusing on important content.
tags	string	Visual tags representing objects detected in the image.
