Azure OpenAI Service REST API preview reference

This article provides details on the inference REST API endpoints for Azure OpenAI.

API specs

Managing and interacting with Azure OpenAI models and resources is divided across three primary API surfaces:

  • Control plane
  • Data plane - authoring
  • Data plane - inference

Each API surface/specification encapsulates a different set of Azure OpenAI capabilities. Each API has its own unique set of preview and stable/generally available (GA) API releases. Preview releases currently tend to follow a monthly cadence.

API Latest preview release Latest GA release Specifications Description
Control plane 2024-06-01-preview 2024-10-01 Spec files Azure OpenAI shares a common control plane with all other Azure AI Services. The control plane API is used for things like creating Azure OpenAI resources, model deployment, and other higher level resource management tasks. The control plane also governs what is possible to do with capabilities like Azure Resource Manager, Bicep, Terraform, and Azure CLI.
Data plane - authoring 2024-10-01-preview 2024-10-21 Spec files The data plane authoring API controls fine-tuning, file upload, ingestion jobs, batch, and certain model-level queries.
Data plane - inference 2024-12-01-preview 2024-10-21 Spec files The data plane inference API provides the inference capabilities/endpoints for features like completions, chat completions, embeddings, speech/Whisper, On Your Data, DALL-E, assistants, etc.

Authentication

Azure OpenAI provides two methods for authentication. You can use either API keys or Microsoft Entra ID.

  • API key authentication: For this type of authentication, all API requests must include the API key in the api-key HTTP header. The Quickstart provides guidance for how to make calls with this type of authentication.

  • Microsoft Entra ID authentication: You can authenticate an API call using a Microsoft Entra token. Authentication tokens are included in a request as the Authorization header. The token provided must be preceded by Bearer, for example Bearer YOUR_AUTH_TOKEN. You can read our how-to guide on authenticating with Microsoft Entra ID.
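The two authentication methods differ only in which HTTP header carries the credential. The following is a minimal sketch of that choice; the helper name `build_auth_headers` and its parameters are hypothetical and are not part of any Azure SDK, and obtaining the key or token is out of scope here.

```python
def build_auth_headers(api_key=None, entra_token=None):
    """Return request headers for exactly one of the two authentication methods."""
    if (api_key is None) == (entra_token is None):
        raise ValueError("Provide exactly one of api_key or entra_token")
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # API key authentication: the key goes in the api-key header.
        headers["api-key"] = api_key
    else:
        # Microsoft Entra ID authentication: the token is preceded by "Bearer".
        headers["Authorization"] = f"Bearer {entra_token}"
    return headers
```

For example, `build_auth_headers(entra_token="YOUR_AUTH_TOKEN")` produces an `Authorization` header with the value `Bearer YOUR_AUTH_TOKEN`.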

REST API versioning

The service APIs are versioned using the api-version query parameter. All versions follow the YYYY-MM-DD date structure. For example:

POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2024-06-01
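As an illustration of how the api-version query parameter fits into a request URL, here is a small sketch that assembles the URL shown above. The function name `chat_completions_url` is hypothetical; only the URL layout comes from this article.

```python
from urllib.parse import quote, urlencode


def chat_completions_url(resource_name, deployment_name, api_version):
    """Build the chat completions URL with the api-version query parameter."""
    base = f"https://{resource_name}.openai.azure.com"
    # Percent-encode the deployment name so it is safe as a single path segment.
    path = f"/openai/deployments/{quote(deployment_name, safe='')}/chat/completions"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"
```

Calling it with `"YOUR_RESOURCE_NAME"`, `"YOUR_DEPLOYMENT_NAME"`, and `"2024-06-01"` reproduces the example URL above.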

Data plane inference

The rest of the article covers the latest preview release of the Azure OpenAI data plane inference specification, 2024-12-01-preview. This article includes documentation for the latest preview capabilities like assistants, threads, and vector stores.

If you're looking for documentation on the latest GA API release, refer to the latest GA data plane inference API.

Completions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2024-12-01-preview

Creates a completion for the provided prompt, parameters and chosen model.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), for example https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name. Pattern: https://{your-resource-name}.openai.azure.com
deployment-id path Yes string Deployment id of the model which was deployed.
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
prompt string or array The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays.

Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt isn't specified the model will generate as if from the beginning of a new document.
Yes
best_of integer Generates best_of completions server-side and returns the "best" (the one with the highest log probability per token). Results can't be streamed.

When used with n, best_of controls the number of candidate completions and n specifies how many to return – best_of must be greater than n.

Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.
No 1
echo boolean Echo back the prompt in addition to the completion
No False
frequency_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
No 0
logit_bias object Modify the likelihood of specified tokens appearing in the completion.

Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.

As an example, you can pass {"50256": -100} to prevent the <|endoftext|> token from being generated.
No None
logprobs integer Include the log probabilities on the logprobs most likely output tokens, as well as the chosen tokens. For example, if logprobs is 5, the API will return a list of the five most likely tokens. The API will always return the logprob of the sampled token, so there may be up to logprobs+1 elements in the response.

The maximum value for logprobs is 5.
No None
max_tokens integer The maximum number of tokens that can be generated in the completion.

The token count of your prompt plus max_tokens can't exceed the model's context length.
No 16
n integer How many completions to generate for each prompt.

Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.
No 1
presence_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
No 0
seed integer If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

Determinism isn't guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
No
stop string or array Up to four sequences where the API will stop generating further tokens. The returned text won't contain the stop sequence.
No
stream boolean Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.
No False
suffix string The suffix that comes after a completion of inserted text.

This parameter is only supported for gpt-3.5-turbo-instruct.
No None
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

We generally recommend altering this or top_p but not both.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
user string A unique identifier representing your end-user, which can help to monitor and detect abuse.
No
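Several of the constraints in the table above can be checked on the client before a request is sent. The sketch below is one possible way to do that; `check_completion_params` is a hypothetical helper, it covers only the limits stated in this table, and the service remains the authority on what it accepts.

```python
def check_completion_params(body):
    """Return a list of problems found in a completions request body (empty if none)."""
    problems = []
    if "prompt" not in body:
        problems.append("prompt is required")
    # When used together, best_of must be greater than n.
    if "best_of" in body and "n" in body and body["best_of"] <= body["n"]:
        problems.append("best_of must be greater than n")
    logprobs = body.get("logprobs")
    if logprobs is not None and logprobs > 5:
        problems.append("logprobs can't exceed 5")
    stop = body.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts up to four sequences")
    if not 0 <= body.get("temperature", 1) <= 2:
        problems.append("temperature must be between 0 and 2")
    for name in ("frequency_penalty", "presence_penalty"):
        if not -2.0 <= body.get(name, 0) <= 2.0:
            problems.append(f"{name} must be between -2.0 and 2.0")
    # logit_bias maps token IDs to a bias value from -100 to 100.
    for token_id, bias in (body.get("logit_bias") or {}).items():
        if not -100 <= bias <= 100:
            problems.append(f"logit_bias for token {token_id} must be between -100 and 100")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.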

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json createCompletionResponse Represents a completion response from the API. Note: both the streamed and non-streamed response objects share the same shape (unlike the chat endpoint).
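When stream is set, the response arrives as data-only server-sent events terminated by a data: [DONE] message. The following sketch shows one way to turn such lines into JSON objects. It assumes the caller already has the response split into text lines; `iter_stream_events` is a hypothetical name, and the sample payloads in the usage note are invented for illustration.

```python
import json


def iter_stream_events(lines):
    """Yield decoded JSON payloads from data-only server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # the stream is terminated by a data: [DONE] message
        yield json.loads(payload)
```

A caller could, for instance, concatenate `event["choices"][0]["text"]` across the yielded events to rebuild the completion text, since streamed and non-streamed completion objects share the same shape.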

Status Code: default

Description: Service unavailable

Content-Type Type Description
application/json errorResponse

Examples

Example

Creates a completion for the provided prompt, parameters and chosen model.

POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2024-12-01-preview

{
 "prompt": [
  "tell me a joke about mango"
 ],
 "max_tokens": 32,
 "temperature": 1.0,
 "n": 1
}

Responses: Status Code: 200

{
  "body": {
    "id": "cmpl-7QmVI15qgYVllxK0FtxVGG6ywfzaq",
    "created": 1686617332,
    "choices": [
      {
        "text": "es\n\nWhat do you call a mango who's in charge?\n\nThe head mango.",
        "index": 0,
        "finish_reason": "stop",
        "logprobs": null
      }
    ],
    "usage": {
      "completion_tokens": 20,
      "prompt_tokens": 6,
      "total_tokens": 26
    }
  }
}

Embeddings - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2024-12-01-preview

Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), for example https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name. Pattern: https://{your-resource-name}.openai.azure.com
deployment-id path Yes string Deployment id of the model which was deployed.
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
input string or array Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays. The input must not exceed the max input tokens for the model (8,192 tokens for text-embedding-ada-002), can't be an empty string, and any array must be 2,048 dimensions or less. Yes
user string A unique identifier representing your end-user, which can help to monitor and detect abuse. No
input_type string Input type of embedding search to use. No
encoding_format string The format to return the embeddings in. Can be either float or base64. Defaults to float. No
dimensions integer The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models. No
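The input rules above can be sketched as a small request-body builder. This is an illustration under assumptions: `build_embeddings_body` is a hypothetical helper, it handles only string inputs (not token arrays), and it reads the "2,048 dimensions or less" limit as a cap on the number of items in the input array.

```python
def build_embeddings_body(inputs, dimensions=None, encoding_format="float"):
    """Build an embeddings request body from one string or a list of strings."""
    if isinstance(inputs, str):
        inputs = [inputs]
    inputs = list(inputs)
    if not inputs or any(item == "" for item in inputs):
        raise ValueError("input can't be empty or contain an empty string")
    if len(inputs) > 2048:
        # Assumption: the 2,048 limit applies to the length of the input array.
        raise ValueError("input array must have 2,048 items or fewer")
    if encoding_format not in ("float", "base64"):
        raise ValueError("encoding_format must be float or base64")
    body = {"input": inputs, "encoding_format": encoding_format}
    if dimensions is not None:
        # Only supported in text-embedding-3 and later models.
        body["dimensions"] = dimensions
    return body
```

Token-count limits, such as the 8,192-token maximum for text-embedding-ada-002, are not checked here because that would require a tokenizer.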

Responses

Name Type Description Required Default
object string Yes
model string Yes
data array Yes
usage object Yes

Properties for usage

prompt_tokens

Name Type Description Default
prompt_tokens integer

total_tokens

Name Type Description Default
total_tokens integer

Status Code: 200

Description: OK

Content-Type Type Description
application/json object

Examples

Example

Return the embeddings for a given prompt.

POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2024-12-01-preview

{
 "input": [
  "this is a test"
 ]
}

Responses: Status Code: 200

{
  "body": {
    "data": [
      {
        "index": 0,
        "embedding": [
          -0.012838088,
          -0.007421397,
          -0.017617522,
          -0.028278312,
          -0.018666342,
          0.01737855,
          -0.01821495,
          -0.006950092,
          -0.009937238,
          -0.038580645,
          0.010674067,
          0.02412286,
          -0.013647936,
          0.013189907,
          0.0021125758,
          0.012406612,
          0.020790534,
          0.00074595667,
          0.008397198,
          -0.00535031,
          0.008968075,
          0.014351576,
          -0.014086051,
          0.015055214,
          -0.022211088,
          -0.025198232,
          0.0065186154,
          -0.036350243,
          0.009180495,
          -0.009698266,
          0.009446018,
          -0.008463579,
          -0.0040426035,
          -0.03443847,
          -0.00091273896,
          -0.0019217303,
          0.002349888,
          -0.021560553,
          0.016515596,
          -0.015572986,
          0.0038666942,
          -8.432463e-05
        ]
      }
    ],
    "usage": {
      "prompt_tokens": 4,
      "total_tokens": 4
    }
  }
}
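Once the embeddings are returned, a common next step is to compare vectors. The sketch below assumes `response` is the object that holds `data` and `usage` (the value under "body" in the example above); both function names are hypothetical.

```python
import math


def embeddings_in_input_order(response):
    """Return the embedding vectors sorted by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Sorting by `index` keeps each vector aligned with the position of its input when several inputs are embedded in one request.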

Chat completions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview

Creates a completion for the chat message.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), for example https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name. Pattern: https://{your-resource-name}.openai.azure.com
deployment-id path Yes string Deployment id of the model which was deployed.
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

We generally recommend altering this or top_p but not both.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
stream boolean If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.
No False
stop string or array Up to four sequences where the API will stop generating further tokens.
No
max_tokens integer The maximum number of tokens that can be generated in the chat completion.

The total length of input tokens and generated tokens is limited by the model's context length.
No
max_completion_tokens integer An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is supported only in o1 series models; support is expected to expand to other models in a future API release. No
presence_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
No 0
frequency_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
No 0
logit_bias object Modify the likelihood of specified tokens appearing in the completion.

Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.
No None
store boolean Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products. No
metadata object Developer-defined tags and values used for filtering completions in the stored completions dashboard. No
user string A unique identifier representing your end-user, which can help to monitor and detect abuse.
No
messages array A list of messages comprising the conversation so far. Yes
data_sources array The configuration entries for Azure OpenAI chat extensions that use them.
This additional specification is only compatible with Azure OpenAI.
No
reasoning_effort enum o1 models only.

Constrains effort on reasoning for reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Possible values: low, medium, high
No
logprobs boolean Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message. No False
top_logprobs integer An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used. No
n integer How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. No 1
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
response_format ResponseFormatText or ResponseFormatJsonObject or ResponseFormatJsonSchema An object specifying the format that the model must output. Compatible with GPT-4o, GPT-4o mini, GPT-4 Turbo and all GPT-3.5 Turbo models newer than gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which guarantees the model will match your supplied JSON schema.

Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
seed integer This feature is in Beta.
If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Determinism isn't guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
No
stream_options chatCompletionStreamOptions Options for streaming response. Only set this when you set stream: true.
No None
tools array A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
No
tool_choice chatCompletionToolChoiceOption Controls which (if any) tool is called by the model. none means the model won't call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present. No
function_call string or chatCompletionFunctionCallOption Deprecated in favor of tool_choice.

Controls which (if any) function is called by the model.
none means the model won't call a function and instead generates a message.
auto means the model can pick between generating a message or calling a function.
Specifying a particular function via {"name": "my_function"} forces the model to call that function.

none is the default when no functions are present. auto is the default if functions are present.
No
functions array Deprecated in favor of tools.

A list of functions the model may generate JSON inputs for.
No
user_security_context userSecurityContext User security context contains several parameters that describe the AI application itself, and the end user that interacts with the AI application. These fields assist your security operations teams to investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud. No
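A few of the parameters above depend on each other: top_logprobs needs logprobs, stream_options needs stream, and the tool_choice default depends on whether tools are present. The following sketch expresses those rules as client-side checks. Both helper names are hypothetical, only rules stated in this table are covered, and the service performs its own validation.

```python
def check_chat_params(body):
    """Return a list of problems found in a chat completions request body."""
    problems = []
    if not body.get("messages"):
        problems.append("messages is required")
    top_logprobs = body.get("top_logprobs")
    if top_logprobs is not None:
        if not body.get("logprobs"):
            problems.append("top_logprobs requires logprobs to be true")
        if not 0 <= top_logprobs <= 20:
            problems.append("top_logprobs must be between 0 and 20")
    if body.get("stream_options") is not None and not body.get("stream"):
        problems.append("stream_options should only be set when stream is true")
    if len(body.get("tools") or []) > 128:
        problems.append("a max of 128 functions are supported in tools")
    if body.get("reasoning_effort", "medium") not in ("low", "medium", "high"):
        problems.append("reasoning_effort must be low, medium, or high")
    return problems


def default_tool_choice(body):
    """none is the default when no tools are present; auto is the default otherwise."""
    return body.get("tool_choice", "auto" if body.get("tools") else "none")
```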

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json createChatCompletionResponse or createChatCompletionStreamResponse

Status Code: default

Description: Service unavailable

Content-Type Type Description
application/json errorResponse

Examples

Example

Creates a completion for the provided prompt, parameters and chosen model.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview

{
 "messages": [
  {
   "role": "system",
   "content": "you are a helpful assistant that talks like a pirate"
  },
  {
   "role": "user",
   "content": "can you tell me how to care for a parrot?"
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Ahoy matey! So ye be wantin' to care for a fine squawkin' parrot, eh? Well, shiver me timbers, let ol' Cap'n Assistant share some wisdom with ye! Here be the steps to keepin' yer parrot happy 'n healthy:\n\n1. Secure a sturdy cage: Yer parrot be needin' a comfortable place to lay anchor! Be sure ye get a sturdy cage, at least double the size of the bird's wingspan, with enough space to spread their wings, yarrrr!\n\n2. Perches 'n toys: Aye, parrots need perches of different sizes, shapes, 'n textures to keep their feet healthy. Also, a few toys be helpin' to keep them entertained 'n their minds stimulated, arrrh!\n\n3. Proper grub: Feed yer feathered friend a balanced diet of high-quality pellets, fruits, 'n veggies to keep 'em strong 'n healthy. Give 'em fresh water every day, or ye'll have a scurvy bird on yer hands!\n\n4. Cleanliness: Swab their cage deck! Clean their cage on a regular basis: fresh water 'n food daily, the floor every couple of days, 'n a thorough scrubbing ev'ry few weeks, so the bird be livin' in a tidy haven, arrhh!\n\n5. Socialize 'n train: Parrots be a sociable lot, arrr! Exercise 'n interact with 'em daily to create a bond 'n maintain their mental 'n physical health. Train 'em with positive reinforcement, treat 'em kindly, yarrr!\n\n6. Proper rest: Yer parrot be needin' 'bout 10-12 hours o' sleep each night. Cover their cage 'n let them slumber in a dim, quiet quarter for a proper night's rest, ye scallywag!\n\n7. Keep a weather eye open for illness: Birds be hidin' their ailments, arrr! Be watchful for signs of sickness, such as lethargy, loss of appetite, puffin' up, or change in droppings, and make haste to a vet if need be.\n\n8. Provide fresh air 'n avoid toxins: Parrots be sensitive to draft and pollutants. Keep yer quarters well ventilated, but no drafts, arrr! Be mindful of toxins like Teflon fumes, candles, or air fresheners.\n\nSo there ye have it, me hearty! With proper care 'n commitment, yer parrot will be squawkin' \"Yo-ho-ho\" for many years to come! Good luck, sailor, and may the wind be at yer back!"
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion based on Azure Search data and system-assigned managed identity.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "index_name": "{index name}",
    "authentication": {
     "type": "system_assigned_managed_identity"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
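When a data source is attached, the assistant message can carry a context object with citations, as in the response above. Here is a minimal sketch for reading it, assuming `response` is the object that holds `choices` (the value under "body" in the example); the function name is hypothetical.

```python
def message_and_citations(response):
    """Return the first choice's content and a list of (title, url) citation pairs."""
    message = response["choices"][0]["message"]
    # context is absent when no data source contributed to the answer.
    citations = message.get("context", {}).get("citations", [])
    return message["content"], [(c.get("title"), c.get("url")) for c in citations]
```

Using `.get` with defaults keeps the helper usable for plain chat responses that have no context at all.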

Example

Creates a completion based on Azure Search image vector data.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "index_name": "{index name}",
    "query_type": "vector",
    "fields_mapping": {
     "image_vector_fields": [
      "image_vector"
     ]
    },
    "authentication": {
     "type": "api_key",
     "key": "{api key}"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion."
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion based on Azure Search vector data, previous assistant message and user-assigned managed identity.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a cat?"
  },
  {
   "role": "assistant",
   "content": "Content of the completion [doc1].",
   "context": {
    "intent": "cat care"
   }
  },
  {
   "role": "user",
   "content": "how about dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "authentication": {
     "type": "user_assigned_managed_identity",
     "managed_identity_resource_id": "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{resource-name}"
    },
    "index_name": "{index name}",
    "query_type": "vector",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "in_scope": true,
    "top_n_documents": 5,
    "strictness": 3,
    "role_information": "You are an AI assistant that helps people find information.",
    "fields_mapping": {
     "content_fields_separator": "\\n",
     "content_fields": [
      "content"
     ],
     "filepath_field": "filepath",
     "title_field": "title",
     "url_field": "url",
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}
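The Azure Search examples above differ mainly in their authentication block and optional parameters. The sketch below shows one way to assemble such a data_sources entry in code. All function names are hypothetical helpers; the field names and type values are taken from the example request bodies in this article.

```python
def system_assigned_identity():
    return {"type": "system_assigned_managed_identity"}


def user_assigned_identity(managed_identity_resource_id):
    return {
        "type": "user_assigned_managed_identity",
        "managed_identity_resource_id": managed_identity_resource_id,
    }


def api_key_authentication(key):
    return {"type": "api_key", "key": key}


def azure_search_data_source(endpoint, index_name, authentication, **extra_parameters):
    """Build one data_sources entry of type azure_search."""
    parameters = {
        "endpoint": endpoint,
        "index_name": index_name,
        "authentication": authentication,
    }
    # Optional settings such as query_type, top_n_documents, or fields_mapping.
    parameters.update(extra_parameters)
    return {"type": "azure_search", "parameters": parameters}
```

For example, passing `query_type="vector"` and `top_n_documents=5` as keyword arguments adds those keys alongside the required ones.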

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content 2.",
                "title": "Citation Title 2",
                "filepath": "contoso2.txt",
                "url": "https://contoso.blob.windows.net/container/contoso2.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion based on the provided Azure Cosmos DB data source.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_cosmos_db",
   "parameters": {
    "authentication": {
     "type": "connection_string",
     "connection_string": "mongodb+srv://rawantest:{password}$@{cluster-name}.mongocluster.cosmos.azure.com/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
    },
    "database_name": "vectordb",
    "container_name": "azuredocs",
    "index_name": "azuredocindex",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "content_fields": [
      "content"
     ],
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion based on the provided MongoDB data source.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "mongo_db",
   "parameters": {
    "authentication": {
     "type": "username_and_password",
     "username": "<username>",
     "password": "<password>"
    },
    "endpoint": "<endpoint_name>",
    "app_name": "<application name>",
    "database_name": "sampledb",
    "collection_name": "samplecollection",
    "index_name": "sampleindex",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "content_fields": [
      "content"
     ],
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion using the provided Elasticsearch data source.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "elasticsearch",
   "parameters": {
    "endpoint": "https://your-elasticsearch-endpoint.eastus.azurecontainer.io",
    "index_name": "{index name}",
    "authentication": {
     "type": "key_and_key_id",
     "key": "{key}",
     "key_id": "{key id}"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion for the provided Pinecone resource.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-12-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "pinecone",
   "parameters": {
    "authentication": {
     "type": "api_key",
     "key": "{api key}"
    },
    "environment": "{environment name}",
    "index_name": "{index name}",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "title_field": "title",
     "url_field": "url",
     "filepath_field": "filepath",
     "content_fields": [
      "content"
     ],
     "content_fields_separator": "\n"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
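
The responses above return citations under `choices[0].message.context.citations`, and the message content refers to them with markers such as `[doc1]`. The following is a minimal sketch of pairing the two. It assumes that `[docN]` is a 1-based index into the `citations` array, which is consistent with the examples here but not stated by them; `resolve_citations` is a hypothetical helper name, not part of the API.

```python
import re

def resolve_citations(response_body):
    """Return (marker, citation) pairs for each [docN] marker in the first choice."""
    message = response_body["choices"][0]["message"]
    citations = message.get("context", {}).get("citations", [])
    resolved = []
    for match in re.finditer(r"\[doc(\d+)\]", message.get("content", "")):
        index = int(match.group(1)) - 1  # assumption: markers are 1-based
        if 0 <= index < len(citations):
            resolved.append((match.group(0), citations[index]))
    return resolved

# Sample shaped like the "body" object in the responses above.
sample = {
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": "Content of the completion [doc1].",
                "context": {
                    "citations": [
                        {"title": "Citation Title", "filepath": "contoso.txt", "chunk_id": "0"}
                    ],
                    "intent": "dog care",
                },
            },
        }
    ]
}

for marker, citation in resolve_citations(sample):
    print(marker, citation["filepath"])
```

Markers that point past the end of the citation list are skipped rather than raising, since the sketch cannot know how the service numbers them in every case.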

Transcriptions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2024-12-01-preview

Transcribes audio into the input language.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id path Yes string Deployment id of the whisper model.
api-version query Yes string API version
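
The URI template and the parameters above can be combined as in the following sketch. The `endpoint` description gives a value that already includes the protocol, while the template writes `https://{endpoint}`, so this hypothetical helper (`build_inference_url` is not part of any SDK) accepts either form. The default API version is simply the one used throughout this article.

```python
from urllib.parse import quote, urlencode

def build_inference_url(endpoint, deployment_id, operation,
                        api_version="2024-12-01-preview"):
    """Build a deployment-scoped inference URL from its documented parts."""
    # Accept the endpoint with or without a scheme (assumption: HTTPS otherwise).
    base = endpoint if "://" in endpoint else f"https://{endpoint}"
    base = base.rstrip("/")
    path = f"/openai/deployments/{quote(deployment_id, safe='')}/{operation.strip('/')}"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"

print(build_inference_url("https://aoairesource.openai.azure.com",
                          "my-whisper-deployment", "audio/transcriptions"))
```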

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: multipart/form-data

Name Type Description Required Default
file string The audio file object to transcribe. Yes
prompt string An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language. No
response_format audioResponseFormat Defines the format of the output. No
temperature number The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. No 0
language string The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency. No
timestamp_granularities[] array The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: word and segment. Note: There's no additional latency for segment timestamps, but generating word timestamps incurs additional latency. No ['segment']
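
The constraints in this table can be checked before a request is sent. The sketch below collects the non-file form fields as name/value pairs and enforces the two rules stated above: `temperature` lies between 0 and 1, and `timestamp_granularities[]` requires `response_format` to be `verbose_json`. The helper name `transcription_fields` is hypothetical, and repeating the `timestamp_granularities[]` field once per value is an assumption about how the array is encoded in the form.

```python
ALLOWED_GRANULARITIES = {"word", "segment"}

def transcription_fields(response_format=None, temperature=None, language=None,
                         prompt=None, timestamp_granularities=None):
    """Return the optional form fields as a list of (name, value) string pairs."""
    fields = []
    if prompt is not None:
        fields.append(("prompt", prompt))
    if response_format is not None:
        fields.append(("response_format", response_format))
    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        fields.append(("temperature", str(temperature)))
    if language is not None:
        fields.append(("language", language))
    if timestamp_granularities:
        if response_format != "verbose_json":
            raise ValueError("timestamp granularities require response_format=verbose_json")
        unknown = set(timestamp_granularities) - ALLOWED_GRANULARITIES
        if unknown:
            raise ValueError(f"unsupported granularities: {sorted(unknown)}")
        # Assumption: one form field per array element.
        for granularity in timestamp_granularities:
            fields.append(("timestamp_granularities[]", granularity))
    return fields

print(transcription_fields(response_format="verbose_json", language="en",
                           timestamp_granularities=["word", "segment"]))
```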

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json audioResponse or audioVerboseResponse
text/plain string Transcribed text in the output format (when response_format was one of text, vtt or srt).
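
Because the response is either JSON or plain text depending on `response_format`, a client can branch on the returned `Content-Type`. The following is a minimal sketch of that branch; `parse_audio_response` is a hypothetical helper, and UTF-8 decoding of the body is an assumption.

```python
import json

def parse_audio_response(content_type, raw_body):
    """Decode a transcription response according to its Content-Type header."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    text = raw_body.decode("utf-8")  # assumption: the body is UTF-8 encoded
    if media_type == "application/json":
        return json.loads(text)      # audioResponse or audioVerboseResponse
    if media_type == "text/plain":
        return text                  # text, vtt, or srt output
    raise ValueError(f"unexpected content type: {content_type!r}")

print(parse_audio_response("application/json", b'{"text": "hello"}'))
print(parse_audio_response("text/plain", b"hello"))
```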

Examples

Example

Gets transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "text": "A structured object when requesting json or verbose_json"
  }
}

Example

Gets transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2024-12-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "type": "string",
  "example": "plain text when requesting text, srt, or vtt"
}
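
The request body in the example above is an abbreviated multipart/form-data payload. As an illustration of how such a body is assembled, the sketch below builds one with only the standard library. The helper name `encode_multipart` is hypothetical. The line endings are CRLF and the boundary is random, as is conventional for multipart/form-data, whereas the example above abbreviates both.

```python
import uuid

def encode_multipart(fields, filename, file_bytes, boundary=None):
    """Return (content_type, body) for text fields plus one "file" part."""
    boundary = boundary or uuid.uuid4().hex
    delimiter = f"--{boundary}\r\n".encode("ascii")
    parts = []
    for name, value in fields:
        parts.append(delimiter)
        parts.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode("utf-8"))
        parts.append(str(value).encode("utf-8") + b"\r\n")
    # The audio file goes in the part named "file", as in the request body table.
    parts.append(delimiter)
    parts.append(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode("utf-8")
    )
    parts.append(b"Content-Type: application/octet-stream\r\n\r\n")
    parts.append(file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("ascii"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)

content_type, body = encode_multipart([("response_format", "text")],
                                      "file.wav", b"RIFF..audio.data.omitted")
print(content_type)
print(len(body), "bytes")
```

The returned `content_type` string is what would be sent as the `Content-Type` request header, alongside the `api-key` header described earlier.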

Translations - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2024-12-01-preview

Transcribes and translates input audio into English text.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id path Yes string Deployment id of the whisper model which was deployed.
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: multipart/form-data

Name Type Description Required Default
file string The audio file to translate. Yes
prompt string An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English. No
response_format audioResponseFormat Defines the format of the output. No
temperature number The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. No 0

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json audioResponse or audioVerboseResponse
text/plain string Transcribed text in the output format (when response_format was one of text, vtt or srt).

Examples

Example

Gets English language transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2024-12-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "body": {
    "text": "A structured object when requesting json or verbose_json"
  }
}

Example

Gets English language transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2024-12-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "type": "string",
  "example": "plain text when requesting text, srt, or vtt"
}

Speech - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2024-12-01-preview

Generates audio from the input text.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id path Yes string Deployment id of the tts model which was deployed.
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: multipart/form-data

Name Type Description Required Default
input string The text to synthesize audio for. The maximum length is 4,096 characters. Yes
voice enum The voice to use for speech synthesis. Possible values: alloy, echo, fable, onyx, nova, shimmer Yes
response_format enum The format to synthesize the audio in. Possible values: mp3, opus, aac, flac, wav, pcm No
speed number The speed of the synthesized audio. Select a value from 0.25 to 4.0. 1.0 is the default. No 1.0
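
The limits in this table lend themselves to a client-side check. The sketch below builds the request payload and rejects values outside the documented ranges; `build_speech_request` is a hypothetical helper, and omitting `speed` when it equals the default is a design choice of the sketch, not a requirement of the API.

```python
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}

def build_speech_request(text, voice, response_format=None, speed=1.0):
    """Validate the documented limits and return the request payload as a dict."""
    if len(text) > 4096:
        raise ValueError("input is limited to 4,096 characters")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if response_format is not None and response_format not in FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    payload = {"input": text, "voice": voice}
    if response_format is not None:
        payload["response_format"] = response_format
    if speed != 1.0:  # leave the default implicit
        payload["speed"] = speed
    return payload

print(build_speech_request("Hi! What are you going to make?", "fable", "mp3"))
```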

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/octet-stream string

Examples

Example

Synthesizes audio from the provided text.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2024-12-01-preview

{
 "input": "Hi! What are you going to make?",
 "voice": "fable",
 "response_format": "mp3"
}

Responses: Status Code: 200

{
  "body": "101010101"
}

Image generations - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2024-12-01-preview

Generates a batch of images from a text caption on a given DALL-E model deployment.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id path Yes string Deployment id of the DALL-E model which was deployed.
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
prompt string A text description of the desired image(s). The maximum length is 4,000 characters. Yes
n integer The number of images to generate. No 1
size imageSize The size of the generated images. No 1024x1024
response_format imagesResponseFormat The format in which the generated images are returned. No url
user string A unique identifier representing your end-user, which can help to monitor and detect abuse. No
quality imageQuality The quality of the image that will be generated. No standard
style imageStyle The style of the generated images. No vivid

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json generateImagesResponse

Status Code: default

Description: An error occurred.

Content-Type Type Description
application/json dalleErrorResponse

Examples

Example

Creates images given a prompt.

POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2024-12-01-preview

{
 "prompt": "In the style of WordArt, Microsoft Clippy wearing a cowboy hat.",
 "n": 1,
 "style": "natural",
 "quality": "standard"
}

Responses: Status Code: 200

{
  "body": {
    "created": 1698342300,
    "data": [
      {
        "revised_prompt": "A vivid, natural representation of Microsoft Clippy wearing a cowboy hat.",
        "prompt_filter_results": {
          "sexual": {
            "severity": "safe",
            "filtered": false
          },
          "violence": {
            "severity": "safe",
            "filtered": false
          },
          "hate": {
            "severity": "safe",
            "filtered": false
          },
          "self_harm": {
            "severity": "safe",
            "filtered": false
          },
          "profanity": {
            "detected": false,
            "filtered": false
          },
          "custom_blocklists": {
            "filtered": false,
            "details": []
          }
        },
        "url": "https://dalletipusw2.blob.core.windows.net/private/images/e5451cc6-b1ad-4747-bd46-b89a3a3b8bc3/generated_00.png?se=2023-10-27T17%3A45%3A09Z&...",
        "content_filter_results": {
          "sexual": {
            "severity": "safe",
            "filtered": false
          },
          "violence": {
            "severity": "safe",
            "filtered": false
          },
          "hate": {
            "severity": "safe",
            "filtered": false
          },
          "self_harm": {
            "severity": "safe",
            "filtered": false
          }
        }
      }
    ]
  }
}
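
Each generated image in the response above carries `prompt_filter_results` and `content_filter_results`, with a `filtered` flag per category. The following sketch lists the categories that were filtered for one entry of `data`. The helper name `filtered_categories` is hypothetical, and the sample is a shortened version of the response shown above.

```python
def filtered_categories(image_entry):
    """Return the filtered category names for the prompt and for the image."""
    summary = {}
    for key in ("prompt_filter_results", "content_filter_results"):
        results = image_entry.get(key, {})
        summary[key] = sorted(
            name for name, result in results.items() if result.get("filtered")
        )
    return summary

entry = {
    "prompt_filter_results": {
        "sexual": {"severity": "safe", "filtered": False},
        "profanity": {"detected": False, "filtered": False},
        "custom_blocklists": {"filtered": False, "details": []},
    },
    "content_filter_results": {
        "violence": {"severity": "safe", "filtered": False},
    },
}

print(filtered_categories(entry))
```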

List - Assistants

GET https://{endpoint}/openai/assistants?api-version=2024-12-01-preview

Returns a list of assistants.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
limit query No integer
order query No string
after query No string
before query No string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listAssistantsResponse

Examples

Example

Returns a list of assistants.

GET https://{endpoint}/openai/assistants?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "asst_abc123",
        "object": "assistant",
        "created_at": 1707257477,
        "name": "Stock Analyst",
        "description": null,
        "model": "gpt-4-1106-preview",
        "instructions": "You are a financial analyst that analyzes stock market prices and other financial data present on user uploaded files or by calling external APIs.",
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      },
      {
        "id": "asst_abc456",
        "object": "assistant",
        "created_at": 1698982718,
        "name": "My Assistant",
        "description": null,
        "model": "gpt-4-turbo",
        "instructions": "You are a helpful assistant designed to make me better at coding!",
        "tools": [],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      },
      {
        "id": "asst_abc789",
        "object": "assistant",
        "created_at": 1698982643,
        "name": null,
        "description": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "tools": [],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      }
    ],
    "first_id": "asst_abc123",
    "last_id": "asst_abc789",
    "has_more": false
  }
}

Create - Assistant

POST https://{endpoint}/openai/assistants?api-version=2024-12-01-preview

Create an assistant with a model and instructions.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
model Yes
name string The name of the assistant. The maximum length is 256 characters.
No
description string The description of the assistant. The maximum length is 512 characters.
No
instructions string The system instructions that the assistant uses. The maximum length is 256,000 characters.
No
tools array A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, file_search, or function.
No []
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
[]

file_search

Name Type Description Default
vector_store_ids array The vector store attached to this assistant. There can be a maximum of one vector store attached to the assistant.
vector_stores array A helper to create a vector store with file_ids and attach it to this assistant. There can be a maximum of one vector store attached to the assistant.
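
The size limits listed for the request body and for `tool_resources` can be verified before the request is sent. The sketch below returns a list of problems rather than raising, so that all violations are reported at once; `check_assistant_body` is a hypothetical helper. The `instructions` length is deliberately not checked, because the create and modify tables in this article state different maximums.

```python
def check_assistant_body(body):
    """Return a list of human-readable violations of the documented limits."""
    problems = []
    if not body.get("model"):
        problems.append("model is required")
    if len(body.get("name") or "") > 256:
        problems.append("name exceeds 256 characters")
    if len(body.get("description") or "") > 512:
        problems.append("description exceeds 512 characters")
    if len(body.get("tools") or []) > 128:
        problems.append("more than 128 tools")
    metadata = body.get("metadata") or {}
    if len(metadata) > 16:
        problems.append("more than 16 metadata pairs")
    for key, value in metadata.items():
        if len(key) > 64:
            problems.append(f"metadata key too long: {key[:16]}...")
        if len(str(value)) > 512:
            problems.append(f"metadata value too long for key: {key[:16]}")
    resources = body.get("tool_resources") or {}
    if len(resources.get("code_interpreter", {}).get("file_ids", [])) > 20:
        problems.append("more than 20 code_interpreter files")
    file_search = resources.get("file_search", {})
    if len(file_search.get("vector_store_ids", [])) > 1 or len(file_search.get("vector_stores", [])) > 1:
        problems.append("more than one vector store")
    return problems

print(check_assistant_body({
    "name": "Math Tutor",
    "tools": [{"type": "code_interpreter"}],
    "model": "gpt-4-1106-preview",
}))
```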

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json assistantObject Represents an assistant that can call the model and use tools.

Examples

Example

Create an assistant with a model and instructions.

POST https://{endpoint}/openai/assistants?api-version=2024-12-01-preview

{
 "name": "Math Tutor",
 "instructions": "When a customer asks about a specific math problem, use Python to evaluate their query.",
 "tools": [
  {
   "type": "code_interpreter"
  }
 ],
 "model": "gpt-4-1106-preview"
}

Responses: Status Code: 200

{
  "body": {
    "id": "asst_4nsG2qgNzimRPE7MazXTXbU7",
    "object": "assistant",
    "created_at": 1707295707,
    "name": "Math Tutor",
    "description": null,
    "model": "gpt-4-1106-preview",
    "instructions": "When a customer asks about a specific math problem, use Python to evaluate their query.",
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}

Get - Assistant

GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview

Retrieves an assistant.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json assistantObject Represents an assistant that can call the model and use tools.

Examples

Example

Retrieves an assistant.

GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "asst_abc123",
    "object": "assistant",
    "created_at": 1699009709,
    "name": "HR Helper",
    "description": null,
    "model": "gpt-4-turbo",
    "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}

Modify - Assistant

POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview

Modifies an assistant.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
model No
name string The name of the assistant. The maximum length is 256 characters.
No
description string The description of the assistant. The maximum length is 512 characters.
No
instructions string The system instructions that the assistant uses. The maximum length is 32,768 characters.
No
tools array A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, file_search, or function.
No []
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array Overrides the list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
[]

file_search

Name Type Description Default
vector_store_ids array Overrides the vector store attached to this assistant. There can be a maximum of one vector store attached to the assistant.

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json assistantObject Represents an assistant that can call the model and use tools.

Examples

Example

Modifies an assistant.

POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview

{
 "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always respond with info from either of the files.",
 "tools": [
  {
   "type": "file_search"
  }
 ],
 "model": "gpt-4-turbo"
}

Responses: Status Code: 200

{
  "body": {
    "id": "asst_123",
    "object": "assistant",
    "created_at": 1699009709,
    "name": "HR Helper",
    "description": null,
    "model": "gpt-4-turbo",
    "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always respond with info from either of the files.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "tool_resources": {
      "file_search": {
        "vector_store_ids": []
      }
    },
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}

Delete - Assistant

DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview

Delete an assistant.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json deleteAssistantResponse

Examples

Example

Deletes an assistant.

DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "asst_4nsG2qgNzimRPE7MazXTXbU7",
    "object": "assistant.deleted",
    "deleted": true
  }
}

Create - Thread

POST https://{endpoint}/openai/threads?api-version=2024-12-01-preview

Create a thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
messages array A list of messages to start the thread with. No
tool_resources object A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
[]

file_search

Name Type Description Default
vector_store_ids array The vector store attached to this thread. There can be a maximum of one vector store attached to the thread.
vector_stores array A helper to create a vector store with file_ids and attach it to this thread. There can be a maximum of one vector store attached to the thread.

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json threadObject Represents a thread that contains messages.

Examples

Example

Creates a thread.

POST https://{endpoint}/openai/threads?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {}
  }
}

Get - Thread

GET https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview

Retrieves a thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json threadObject Represents a thread that contains messages.

Examples

Example

Retrieves a thread.

GET https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {},
    "tool_resources": {
      "code_interpreter": {
        "file_ids": []
      }
    }
  }
}

Modify - Thread

POST https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview

Modifies a thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
tool_resources object A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. []

file_search

Name Type Description Default
vector_store_ids array The vector store attached to this thread. There can be a maximum of one vector store attached to the thread.

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json threadObject Represents a thread that contains messages.

Examples

Example

Modifies a thread.

POST https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview

{
 "metadata": {
  "modified": "true",
  "user": "abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {
      "modified": "true",
      "user": "abc123"
    },
    "tool_resources": {}
  }
}
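The metadata limits in the request body table (16 pairs, 64-character keys, 512-character values) can be checked on the client before the request is sent. The following is a minimal sketch; `validate_metadata` is a hypothetical helper, and the service remains the authority on what it accepts.

```python
import json

MAX_PAIRS = 16
MAX_KEY_CHARS = 64
MAX_VALUE_CHARS = 512


def validate_metadata(metadata: dict) -> dict:
    """Check metadata against the documented size limits and return it unchanged."""
    if len(metadata) > MAX_PAIRS:
        raise ValueError(f"metadata has {len(metadata)} pairs; the limit is {MAX_PAIRS}")
    for key, value in metadata.items():
        if len(key) > MAX_KEY_CHARS:
            raise ValueError(f"key {key[:20]!r} exceeds {MAX_KEY_CHARS} characters")
        if len(str(value)) > MAX_VALUE_CHARS:
            raise ValueError(f"value for {key!r} exceeds {MAX_VALUE_CHARS} characters")
    return metadata


# Same payload as the Modify - Thread example request.
body = json.dumps({"metadata": validate_metadata({"modified": "true", "user": "abc123"})})
print(body)
```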

Delete - Thread

DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview

Deletes a thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), in the form https://{your-resource-name}.openai.azure.com. For example: https://aoairesource.openai.azure.com, replacing "aoairesource" with your Azure OpenAI resource name.
thread_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json deleteThreadResponse

Examples

Example

Deletes a thread.

DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread.deleted",
    "deleted": true
  }
}

List - Messages

GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-12-01-preview

Returns a list of messages for a given thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), in the form https://{your-resource-name}.openai.azure.com. For example: https://aoairesource.openai.azure.com, replacing "aoairesource" with your Azure OpenAI resource name.
thread_id path Yes string
limit query No integer
order query No string
after query No string
before query No string
run_id query No string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listMessagesResponse

Examples

Example

Returns a list of messages for a given thread.

GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "msg_abc123",
        "object": "thread.message",
        "created_at": 1699016383,
        "assistant_id": null,
        "thread_id": "thread_abc123",
        "run_id": null,
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": {
              "value": "How does AI work? Explain it in simple terms.",
              "annotations": []
            }
          }
        ],
        "attachments": [],
        "metadata": {}
      },
      {
        "id": "msg_abc456",
        "object": "thread.message",
        "created_at": 1699016383,
        "assistant_id": null,
        "thread_id": "thread_abc123",
        "run_id": null,
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": {
              "value": "Hello, what is AI?",
              "annotations": []
            }
          }
        ],
        "attachments": [],
        "metadata": {}
      }
    ],
    "first_id": "msg_abc123",
    "last_id": "msg_abc456",
    "has_more": false
  }
}
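The list response carries `first_id`, `last_id`, and `has_more`, which suggests cursor-style paging through the `after` query parameter. The sketch below rests on assumptions this article does not state: that `after` takes the `last_id` of the previous page, that `order` accepts `asc` or `desc`, and that the default `limit` of 20 is only an illustrative value. `fetch_page` is a hypothetical caller-supplied function that returns the parsed list object (the content shown under "body" in the example).

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode

API_VERSION = "2024-12-01-preview"


def messages_url(host: str, thread_id: str, limit: int = 20,
                 order: str = "desc", after: Optional[str] = None) -> str:
    """Build the List - Messages URL, including optional paging parameters."""
    params = {"limit": limit, "order": order}
    if after is not None:
        params["after"] = after
    params["api-version"] = API_VERSION
    return f"https://{host}/openai/threads/{thread_id}/messages?{urlencode(params)}"


def iter_messages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every message, following has_more and last_id until exhausted."""
    after = None
    while True:
        page = fetch_page(after)
        yield from page["data"]
        if not page.get("has_more"):
            break
        after = page["last_id"]


# Two canned pages stand in for real HTTP responses.
pages = {
    None: {"data": [{"id": "msg_abc123"}], "last_id": "msg_abc123", "has_more": True},
    "msg_abc123": {"data": [{"id": "msg_abc456"}], "last_id": "msg_abc456", "has_more": False},
}
ids = [message["id"] for message in iter_messages(lambda after: pages[after])]
print(ids)
```

In real use, `fetch_page` would request `messages_url(..., after=after)` with the `api-key` header and parse the JSON response.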

Create - Message

POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-12-01-preview

Create a message.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), in the form https://{your-resource-name}.openai.azure.com. For example: https://aoairesource.openai.azure.com, replacing "aoairesource" with your Azure OpenAI resource name.
thread_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
role string The role of the entity that is creating the message. Allowed values include:
- user: Indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages.
- assistant: Indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation.
Yes
content string The content of the message. Yes
attachments array A list of files attached to the message, and the tools they should be added to. No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json messageObject Represents a message within a thread.

Examples

Example

Create a message.

POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-12-01-preview

{
 "role": "user",
 "content": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces."
}

Responses: Status Code: 200

{
  "body": {
    "id": "msg_as3XIk1tpVP3hdHjWBGg3uG4",
    "object": "thread.message",
    "created_at": 1707298421,
    "assistant_id": null,
    "thread_id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "run_id": null,
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces.",
          "annotations": []
        }
      }
    ],
    "attachments": [],
    "metadata": {}
  }
}
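A request body for this operation needs only `role` and `content`. The following minimal sketch serializes one after checking the two allowed role values; `build_message_body` is a hypothetical helper and it ignores the optional `attachments` and `metadata` fields.

```python
import json

ALLOWED_ROLES = ("user", "assistant")


def build_message_body(role: str, content: str) -> str:
    """Serialize a Create - Message request body after basic checks."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"role must be one of {ALLOWED_ROLES}, got {role!r}")
    if not content:
        raise ValueError("content is required")
    return json.dumps({"role": role, "content": content})


body = build_message_body("user", "Hello, what is AI?")
print(body)
```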

Get - Message

GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-12-01-preview

Retrieve a message.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), in the form https://{your-resource-name}.openai.azure.com. For example: https://aoairesource.openai.azure.com, replacing "aoairesource" with your Azure OpenAI resource name.
thread_id path Yes string
message_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json messageObject Represents a message within a thread.

Examples

Example

Retrieve a message.

GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "msg_as3XIk1tpVP3hdHjWBGg3uG4",
    "object": "thread.message",
    "created_at": 1707298421,
    "thread_id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces.",
          "annotations": []
        }
      }
    ],
    "file_ids": [],
    "assistant_id": null,
    "run_id": null,
    "metadata": {}
  }
}

Modify - Message

POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-12-01-preview

Modifies a message.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), in the form https://{your-resource-name}.openai.azure.com. For example: https://aoairesource.openai.azure.com, replacing "aoairesource" with your Azure OpenAI resource name.
thread_id path Yes string
message_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json messageObject Represents a message within a thread.

Examples

Example

Modify a message.

POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-12-01-preview

{
 "metadata": {
  "modified": "true",
  "user": "abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "msg_abc123",
    "object": "thread.message",
    "created_at": 1699017614,
    "assistant_id": null,
    "thread_id": "thread_abc123",
    "run_id": null,
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "How does AI work? Explain it in simple terms.",
          "annotations": []
        }
      }
    ],
    "file_ids": [],
    "metadata": {
      "modified": "true",
      "user": "abc123"
    }
  }
}

Create - Thread And Run

POST https://{endpoint}/openai/threads/runs?api-version=2024-12-01-preview

Create a thread and run it in one request.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), in the form https://{your-resource-name}.openai.azure.com. For example: https://aoairesource.openai.azure.com, replacing "aoairesource" with your Azure OpenAI resource name.
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
assistant_id string The ID of the assistant to use to execute this run. Yes
thread createThreadRequest No
model string The ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. No
instructions string Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. No
tools array Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. No
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No
temperature number The sampling temperature to use, between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens within the top_p probability mass. For example, 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both. No 1
stream boolean If true, returns a stream of events that happen during the run as server-sent events, terminating with a data: [DONE] message when the run enters a terminal state. No
stream_options chatCompletionStreamOptions Options for the streaming response. Only set this when you set stream: true. No None
max_prompt_tokens integer The maximum number of prompt tokens that may be used over the course of the run. The run makes a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run ends with status incomplete. See incomplete_details for more info. No
max_completion_tokens integer The maximum number of completion tokens that may be used over the course of the run. The run makes a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run ends with status incomplete. See incomplete_details for more info. No
truncation_strategy truncationObject Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. No
tool_choice assistantsApiToolChoiceOption Controls which (if any) tool is called by the model.
none means the model won't call any tools and instead generates a message.
auto is the default value and means the model can pick between generating a message or calling a tool.
Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
No
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. []

file_search

Name Type Description Default
vector_store_ids array The ID of the vector store attached to this assistant. There can be a maximum of one vector store attached to the assistant.

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject Represents an execution run on a thread.

Examples

Example

Create a thread and run it in one request.

POST https://{endpoint}/openai/threads/runs?api-version=2024-12-01-preview

{
 "assistant_id": "asst_abc123",
 "thread": {
  "messages": [
   {
    "role": "user",
    "content": "Explain deep learning to a 5 year old."
   }
  ]
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699076792,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "queued",
    "started_at": null,
    "expires_at": 1699077392,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "required_action": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": "You are a helpful assistant.",
    "tools": [],
    "tool_resources": {},
    "metadata": {},
    "temperature": 1.0,
    "top_p": 1.0,
    "max_completion_tokens": null,
    "max_prompt_tokens": null,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "incomplete_details": null,
    "usage": null,
    "response_format": "auto",
    "tool_choice": "auto"
  }
}
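The example request body above can be produced programmatically. This is a minimal sketch, not a complete client: `build_thread_and_run_body` is a hypothetical helper that covers only `assistant_id`, the initial user messages, and two optional parameters, and it validates `temperature` against the 0 to 2 range given in the request body table.

```python
from typing import List, Optional


def build_thread_and_run_body(assistant_id: str, user_messages: List[str],
                              temperature: Optional[float] = None,
                              stream: bool = False) -> dict:
    """Build a Create - Thread And Run request body as a plain dict."""
    if temperature is not None and not (0 <= temperature <= 2):
        raise ValueError("temperature must be between 0 and 2")
    body = {
        "assistant_id": assistant_id,
        "thread": {
            "messages": [{"role": "user", "content": text} for text in user_messages]
        },
    }
    # Optional fields are omitted unless set, so the service defaults apply.
    if temperature is not None:
        body["temperature"] = temperature
    if stream:
        body["stream"] = True
    return body


body = build_thread_and_run_body("asst_abc123", ["Explain deep learning to a 5 year old."])
print(body)
```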

List - Runs

GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-12-01-preview

Returns a list of runs belonging to a thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), in the form https://{your-resource-name}.openai.azure.com. For example: https://aoairesource.openai.azure.com, replacing "aoairesource" with your Azure OpenAI resource name.
thread_id path Yes string
limit query No integer
order query No string
after query No string
before query No string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listRunsResponse

Examples

Example

Returns a list of runs belonging to a thread.

GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "run_abc123",
        "object": "thread.run",
        "created_at": 1699075072,
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "status": "completed",
        "started_at": 1699075072,
        "expires_at": null,
        "cancelled_at": null,
        "failed_at": null,
        "completed_at": 1699075073,
        "last_error": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "incomplete_details": null,
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {
          "code_interpreter": {
            "file_ids": [
              "file-abc123",
              "file-abc456"
            ]
          }
        },
        "metadata": {},
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        },
        "temperature": 1.0,
        "top_p": 1.0,
        "max_prompt_tokens": 1000,
        "max_completion_tokens": 1000,
        "truncation_strategy": {
          "type": "auto",
          "last_messages": null
        },
        "response_format": "auto",
        "tool_choice": "auto"
      },
      {
        "id": "run_abc456",
        "object": "thread.run",
        "created_at": 1699063290,
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "status": "completed",
        "started_at": 1699063290,
        "expires_at": null,
        "cancelled_at": null,
        "failed_at": null,
        "completed_at": 1699063291,
        "last_error": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "incomplete_details": null,
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {
          "code_interpreter": {
            "file_ids": [
              "file-abc123",
              "file-abc456"
            ]
          }
        },
        "metadata": {},
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        },
        "temperature": 1.0,
        "top_p": 1.0,
        "max_prompt_tokens": 1000,
        "max_completion_tokens": 1000,
        "truncation_strategy": {
          "type": "auto",
          "last_messages": null
        },
        "response_format": "auto",
        "tool_choice": "auto"
      }
    ],
    "first_id": "run_abc123",
    "last_id": "run_abc456",
    "has_more": false
  }
}

Create - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-12-01-preview

Create a run.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), in the form https://{your-resource-name}.openai.azure.com. For example: https://aoairesource.openai.azure.com, replacing "aoairesource" with your Azure OpenAI resource name.
thread_id path Yes string
include[] query No array
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
assistant_id string The ID of the assistant to use to execute this run. Yes
model string The ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. No
instructions string Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. No
additional_instructions string Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions. No
additional_messages array Adds additional messages to the thread before creating the run. No
tools array Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No
temperature number The sampling temperature to use, between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens within the top_p probability mass. For example, 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature, but not both. No 1
stream boolean If true, returns a stream of events that happen during the run as server-sent events, terminating with a data: [DONE] message when the run enters a terminal state. No
max_prompt_tokens integer The maximum number of prompt tokens that may be used over the course of the run. The run makes a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run ends with status incomplete. See incomplete_details for more info. No
max_completion_tokens integer The maximum number of completion tokens that may be used over the course of the run. The run makes a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run ends with status incomplete. See incomplete_details for more info. No
truncation_strategy truncationObject Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. No
tool_choice assistantsApiToolChoiceOption Controls which (if any) tool is called by the model.
none means the model won't call any tools and instead generates a message.
auto is the default value and means the model can pick between generating a message or calling a tool.
Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
No
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
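The JSON mode warning above means that setting `response_format` is not enough on its own. One possible way to satisfy it, sketched below, is to pair `response_format` with the `additional_instructions` field from the same request body. Whether that field is sufficient for a given assistant is an assumption, and `build_json_mode_run_body` is a hypothetical helper.

```python
import json


def build_json_mode_run_body(assistant_id: str) -> dict:
    """Create - Run body that enables JSON mode and also asks for JSON explicitly."""
    return {
        "assistant_id": assistant_id,
        "response_format": {"type": "json_object"},
        # JSON mode alone is not enough: the model must also be told to emit JSON.
        "additional_instructions": "Respond only with a single valid JSON object.",
    }


body = build_json_mode_run_body("asst_abc123")
print(json.dumps(body, indent=2))
```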

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject Represents an execution run on a thread.

Examples

Example

Create a run.

POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-12-01-preview

{
 "assistant_id": "asst_abc123"
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699063290,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "queued",
    "started_at": 1699063290,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1699063291,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "incomplete_details": null,
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}

Get - Run

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-12-01-preview

Retrieves a run.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), in the form https://{your-resource-name}.openai.azure.com. For example: https://aoairesource.openai.azure.com, replacing "aoairesource" with your Azure OpenAI resource name.
thread_id path Yes string
run_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject Represents an execution run on a thread.

Examples

Example

Gets a run.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "run_HsO8tYM4K5AAMAHgK0J3om8Q",
    "object": "thread.run",
    "created_at": 1707303196,
    "assistant_id": "asst_JtTwHk28cIocgFXZPCBxhOzl",
    "thread_id": "thread_eRNwflE3ncDYak1np6MdMHJh",
    "status": "completed",
    "started_at": 1707303197,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1707303201,
    "last_error": null,
    "model": "gpt-4-1106-preview",
    "instructions": "You are an AI model that empowers every person and every organization on the planet to achieve more.",
    "tools": [],
    "file_ids": [],
    "metadata": {}
  }
}
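Because a new run starts as `queued`, a client typically calls Get - Run repeatedly until the status settles. The sketch below shows one way to do that. The set of stopping statuses is an assumption (this article itself only shows `queued`, `completed`, `incomplete`, and `requires_action`), and `get_run` is a hypothetical caller-supplied function that returns the parsed run object.

```python
import time
from typing import Callable

# Assumed statuses at which polling should stop. requires_action is included
# because the run then waits for the caller to submit tool outputs.
STOP_STATUSES = {"completed", "failed", "cancelled", "expired", "incomplete", "requires_action"}


def wait_for_run(get_run: Callable[[], dict], poll_seconds: float = 1.0,
                 max_polls: int = 60, sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll get_run() until the run reaches a stopping status or polls run out."""
    for _ in range(max_polls):
        run = get_run()
        if run["status"] in STOP_STATUSES:
            return run
        sleep(poll_seconds)
    raise TimeoutError(f"run did not settle after {max_polls} polls")


# Canned statuses stand in for successive Get - Run responses.
statuses = iter(["queued", "in_progress", "completed"])
naps = []  # records sleep calls instead of actually waiting
run = wait_for_run(lambda: {"id": "run_abc123", "status": next(statuses)}, sleep=naps.append)
print(run["status"], len(naps))
```

Injecting `sleep` keeps the sketch fast to run; in production code the default `time.sleep` applies, ideally with a longer interval or backoff.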

Modify - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-12-01-preview

Modifies a run.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname), in the form https://{your-resource-name}.openai.azure.com. For example: https://aoairesource.openai.azure.com, replacing "aoairesource" with your Azure OpenAI resource name.
thread_id path Yes string
run_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject Represents an execution run on a thread.

Examples

Example

Modifies a run.

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-12-01-preview

{
 "metadata": {
  "user_id": "user_abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699075072,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "completed",
    "started_at": 1699075072,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1699075073,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "incomplete_details": null,
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "tool_resources": {
      "code_interpreter": {
        "file_ids": [
          "file-abc123",
          "file-abc456"
        ]
      }
    },
    "metadata": {
      "user_id": "user_abc123"
    },
    "usage": {
      "prompt_tokens": 123,
      "completion_tokens": 456,
      "total_tokens": 579
    },
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}

Submit - Tool Outputs To Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2024-12-01-preview

When a run has the status: "requires_action" and required_action.type is submit_tool_outputs, this endpoint can be used to submit the outputs from the tool calls once they're all completed. All outputs must be submitted in a single request.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
thread_id path Yes string
run_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
tool_outputs array A list of tools for which the outputs are being submitted. Yes
stream boolean If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject Represents an execution run on a thread.

Examples

Example

When a run has the status: "requires_action" and required_action.type is submit_tool_outputs, this endpoint can be used to submit the outputs from the tool calls once they're all completed. All outputs must be submitted in a single request.

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2024-12-01-preview

{
 "tool_outputs": [
  {
   "tool_call_id": "call_001",
   "output": "70 degrees and sunny."
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_123",
    "object": "thread.run",
    "created_at": 1699075592,
    "assistant_id": "asst_123",
    "thread_id": "thread_123",
    "status": "queued",
    "started_at": 1699075592,
    "expires_at": 1699076192,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              },
              "unit": {
                "type": "string",
                "enum": [
                  "celsius",
                  "fahrenheit"
                ]
              }
            },
            "required": [
              "location"
            ]
          }
        }
      }
    ],
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}
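
Because all outputs must be submitted in a single request, it can help to build the whole `tool_outputs` list from the run object before calling this endpoint. The sketch below is illustrative only. `build_tool_outputs` and the `handlers` mapping are hypothetical names, and it assumes the run object lists pending calls under `required_action.submit_tool_outputs.tool_calls`, each with an `id` and a `function` holding `name` and JSON-encoded `arguments`.

```python
import json
from typing import Any, Callable, Dict


def build_tool_outputs(run: Dict[str, Any], handlers: Dict[str, Callable[..., Any]]) -> Dict[str, Any]:
    """Run a local handler for every pending tool call and collect one request body."""
    action = run.get("required_action") or {}
    if run.get("status") != "requires_action" or action.get("type") != "submit_tool_outputs":
        raise ValueError("run is not waiting for tool outputs")

    outputs = []
    # Assumed shape: required_action.submit_tool_outputs.tool_calls[]
    for call in action["submit_tool_outputs"]["tool_calls"]:
        function = call["function"]
        arguments = json.loads(function.get("arguments") or "{}")
        result = handlers[function["name"]](**arguments)
        outputs.append({"tool_call_id": call["id"], "output": str(result)})
    return {"tool_outputs": outputs}


# Hypothetical run object and handler, mirroring the weather example above.
sample_run = {
    "status": "requires_action",
    "required_action": {
        "type": "submit_tool_outputs",
        "submit_tool_outputs": {
            "tool_calls": [
                {
                    "id": "call_001",
                    "type": "function",
                    "function": {
                        "name": "get_current_weather",
                        "arguments": "{\"location\": \"San Francisco, CA\"}",
                    },
                }
            ]
        },
    },
}


def get_current_weather(location: str, unit: str = "fahrenheit") -> str:
    return "70 degrees and sunny."


print(json.dumps(build_tool_outputs(sample_run, {"get_current_weather": get_current_weather})))
```

The returned dictionary can be serialized as the JSON request body for the POST call.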

Cancel - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2024-12-01-preview

Cancels a run that is in_progress.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
thread_id path Yes string
run_id path Yes string
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject Represents an execution run on a thread.

Examples

Example

Cancels a run that is in_progress.

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699076126,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "cancelling",
    "started_at": 1699076126,
    "expires_at": 1699076726,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": "You summarize books.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "tool_resources": {
      "file_search": {
        "vector_store_ids": [
          "vs_123"
        ]
      }
    },
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "response_format": "auto"
  }
}

List - Run Steps

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2024-12-01-preview

Returns a list of run steps belonging to a run.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
thread_id path Yes string
run_id path Yes string
limit query No integer
order query No string
after query No string
before query No string
api-version query Yes string API version
include[] query No array

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listRunStepsResponse

Examples

Example

Returns a list of run steps belonging to a run.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "step_abc123",
        "object": "thread.run.step",
        "created_at": 1699063291,
        "run_id": "run_abc123",
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "type": "message_creation",
        "status": "completed",
        "cancelled_at": null,
        "completed_at": 1699063291,
        "expired_at": null,
        "failed_at": null,
        "last_error": null,
        "step_details": {
          "type": "message_creation",
          "message_creation": {
            "message_id": "msg_abc123"
          }
        },
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        }
      }
    ],
    "first_id": "step_abc123",
    "last_id": "step_abc456",
    "has_more": false
  }
}
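
The list response carries `has_more` and `last_id`, which together with the `after` query parameter allow cursor-style paging. The following is a minimal sketch under stated assumptions: `iter_all_items` and `fetch_page` are hypothetical names, `fetch_page` is expected to perform the GET request with the given `after` value, and each page is assumed to be the list object itself (the content shown under "body" in the example).

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_all_items(fetch_page: Callable[[Optional[str]], Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every item across pages, following the last_id cursor while has_more is true."""
    after = None
    while True:
        page = fetch_page(after)
        items = page.get("data", [])
        yield from items
        # Stop when the service reports no further pages, or a page is empty.
        if not page.get("has_more") or not items:
            return
        after = page["last_id"]


# Stand-in for the HTTP call: two fake pages keyed by the `after` cursor.
fake_pages = {
    None: {"object": "list", "data": [{"id": "step_abc123"}], "last_id": "step_abc123", "has_more": True},
    "step_abc123": {"object": "list", "data": [{"id": "step_abc456"}], "last_id": "step_abc456", "has_more": False},
}

step_ids = [step["id"] for step in iter_all_items(lambda after: fake_pages[after])]
print(step_ids)
```

The same generator works for any list endpoint in this reference that returns `data`, `last_id`, and `has_more`.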

Get - Run Step

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2024-12-01-preview

Retrieves a run step.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
thread_id path Yes string
run_id path Yes string
step_id path Yes string
include[] query No array
api-version query Yes string API version

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runStepObject Represents a step in execution of a run.

Examples

Example

Retrieves a run step.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "step_abc123",
    "object": "thread.run.step",
    "created_at": 1699063291,
    "run_id": "run_abc123",
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "type": "message_creation",
    "status": "completed",
    "cancelled_at": null,
    "completed_at": 1699063291,
    "expired_at": null,
    "failed_at": null,
    "last_error": null,
    "step_details": {
      "type": "message_creation",
      "message_creation": {
        "message_id": "msg_abc123"
      }
    },
    "usage": {
      "prompt_tokens": 123,
      "completion_tokens": 456,
      "total_tokens": 579
    }
  }
}

List - Vector Stores

GET https://{endpoint}/openai/vector_stores?api-version=2024-12-01-preview

Returns a list of vector stores.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
limit query No integer
order query No string
after query No string
before query No string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listVectorStoresResponse

Examples

Example

Returns a list of vector stores.

GET https://{endpoint}/openai/vector_stores?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "vs_abc123",
        "object": "vector_store",
        "created_at": 1699061776,
        "name": "Support FAQ",
        "bytes": 139920,
        "file_counts": {
          "in_progress": 0,
          "completed": 3,
          "failed": 0,
          "cancelled": 0,
          "total": 3
        }
      },
      {
        "id": "vs_abc456",
        "object": "vector_store",
        "created_at": 1699061776,
        "name": "Support FAQ v2",
        "bytes": 139920,
        "file_counts": {
          "in_progress": 0,
          "completed": 3,
          "failed": 0,
          "cancelled": 0,
          "total": 3
        }
      }
    ],
    "first_id": "vs_abc123",
    "last_id": "vs_abc456",
    "has_more": false
  }
}

Create - Vector Store

POST https://{endpoint}/openai/vector_stores?api-version=2024-12-01-preview

Create a vector store.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
file_ids array A list of file IDs that the vector store should use. Useful for tools like file_search that can access files. No
name string The name of the vector store. No
expires_after vectorStoreExpirationAfter The expiration policy for a vector store. No
chunking_strategy autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. Only applicable if file_ids is nonempty. No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreObject A vector store is a collection of processed files that can be used by the file_search tool.

Examples

Example

Creates a vector store.

POST https://{endpoint}/openai/vector_stores?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776,
    "name": "Support FAQ",
    "bytes": 139920,
    "file_counts": {
      "in_progress": 0,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 3
    }
  }
}
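
The example above doesn't show a request body. As a rough sketch, a body could be assembled as follows, honoring the rule that `chunking_strategy` is only applicable when `file_ids` is nonempty. `build_vector_store_body` is a hypothetical helper, and the `{"type": "auto"}` shape used for the auto chunking strategy is an assumption that this section doesn't spell out.

```python
from typing import Any, Dict, Optional, Sequence


def build_vector_store_body(
    name: Optional[str] = None,
    file_ids: Optional[Sequence[str]] = None,
    chunking_strategy: Optional[Dict[str, Any]] = None,
    metadata: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Assemble a JSON-serializable body, omitting every optional field that is unset."""
    body: Dict[str, Any] = {}
    if name is not None:
        body["name"] = name
    if file_ids:
        body["file_ids"] = list(file_ids)
        if chunking_strategy is not None:
            body["chunking_strategy"] = chunking_strategy
    elif chunking_strategy is not None:
        # The reference says the strategy only applies when file_ids is nonempty.
        raise ValueError("chunking_strategy requires a nonempty file_ids list")
    if metadata:
        body["metadata"] = dict(metadata)
    return body


# Assumed shape for the auto strategy: {"type": "auto"}.
print(build_vector_store_body(name="Support FAQ", file_ids=["file-abc123"], chunking_strategy={"type": "auto"}))
```

Rejecting a strategy without files on the client is a design choice made for this sketch; the service may handle that case differently.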

Get - Vector Store

GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview

Retrieves a vector store.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreObject A vector store is a collection of processed files that can be used by the file_search tool.

Examples

Example

Retrieves a vector store.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776
  }
}

Modify - Vector Store

POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview

Modifies a vector store.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
name string The name of the vector store. No
expires_after vectorStoreExpirationAfter The expiration policy for a vector store. No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreObject A vector store is a collection of processed files that can be used by the file_search tool.

Examples

Example

Modifies a vector store.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview

{
 "name": "Support FAQ"
}

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776,
    "name": "Support FAQ",
    "bytes": 139920,
    "file_counts": {
      "in_progress": 0,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 3
    }
  }
}

Delete - Vector Store

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview

Delete a vector store.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json deleteVectorStoreResponse

Examples

Example

Deletes a vector store.

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store.deleted",
    "deleted": true
  }
}

List - Vector Store Files

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-12-01-preview

Returns a list of vector store files.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string
limit query No integer
order query No string
after query No string
before query No string
filter query No string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listVectorStoreFilesResponse

Examples

Example

Returns a list of vector store files.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "file-abc123",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      },
      {
        "id": "file-abc456",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      }
    ],
    "first_id": "file-abc123",
    "last_id": "file-abc456",
    "has_more": false
  }
}
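
The optional query parameters listed for this operation (`limit`, `order`, `after`, `before`, `filter`) can be combined into the request URL with the standard library. This is a minimal sketch: `vector_store_files_url` is a hypothetical helper, and the `filter="completed"` value in the usage line is an assumed example, since this section doesn't enumerate the accepted filter values.

```python
from typing import Any
from urllib.parse import quote, urlencode

# Query parameters documented for this operation, besides api-version.
ALLOWED_QUERY = ("limit", "order", "after", "before", "filter")


def vector_store_files_url(
    endpoint: str,
    vector_store_id: str,
    api_version: str = "2024-12-01-preview",
    **query: Any,
) -> str:
    """Build the List - Vector Store Files URL, skipping parameters that are None."""
    unknown = sorted(set(query) - set(ALLOWED_QUERY))
    if unknown:
        raise ValueError(f"unsupported query parameters: {unknown}")
    params = {key: value for key, value in query.items() if value is not None}
    params["api-version"] = api_version
    path = f"/openai/vector_stores/{quote(vector_store_id, safe='')}/files"
    return f"{endpoint.rstrip('/')}{path}?{urlencode(params)}"


# "completed" is an assumed filter value used only for illustration.
print(vector_store_files_url("https://aoairesource.openai.azure.com", "vs_abc123", limit=20, filter="completed"))
```

The API key still has to be sent separately in the `api-key` request header.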

Create - Vector Store File

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-12-01-preview

Create a vector store file by attaching a File to a vector store.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
file_id string A File ID that the vector store should use. Useful for tools like file_search that can access files. Yes
chunking_strategy chunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreFileObject A list of files attached to a vector store.

Examples

Example

Create a vector store file by attaching a File to a vector store.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-12-01-preview

{
 "file_id": "file-abc123"
}

Responses: Status Code: 200

{
  "body": {
    "id": "file-abc123",
    "object": "vector_store.file",
    "created_at": 1699061776,
    "usage_bytes": 1234,
    "vector_store_id": "vs_abcd",
    "status": "completed",
    "last_error": null
  }
}

Get - Vector Store File

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-12-01-preview

Retrieves a vector store file.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string
file_id path Yes string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreFileObject A list of files attached to a vector store.

Examples

Example

Retrieves a vector store file.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "file-abc123",
    "object": "vector_store.file",
    "created_at": 1699061776,
    "vector_store_id": "vs_abcd",
    "status": "completed",
    "last_error": null
  }
}

Delete - Vector Store File

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-12-01-preview

Delete a vector store file. This will remove the file from the vector store but the file itself won't be deleted. To delete the file, use the delete file endpoint.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string
file_id path Yes string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json deleteVectorStoreFileResponse

Examples

Example

Delete a vector store file. This will remove the file from the vector store but the file itself won't be deleted. To delete the file, use the delete file endpoint.

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "file-abc123",
    "object": "vector_store.file.deleted",
    "deleted": true
  }
}

Create - Vector Store File Batch

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2024-12-01-preview

Create a vector store file batch.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
file_ids array A list of File IDs that the vector store should use. Useful for tools like file_search that can access files. Yes
chunking_strategy chunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreFileBatchObject A batch of files attached to a vector store.

Examples

Example

Create a vector store file batch.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2024-12-01-preview

{
 "file_ids": [
  "file-abc123",
  "file-abc456"
 ]
}

Responses: Status Code: 200

{
  "id": "vsfb_abc123",
  "object": "vector_store.file_batch",
  "created_at": 1699061776,
  "vector_store_id": "vs_abc123",
  "status": "in_progress",
  "file_counts": {
    "in_progress": 1,
    "completed": 1,
    "failed": 0,
    "cancelled": 0,
    "total": 2
  }
}

Get - Vector Store File Batch

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2024-12-01-preview

Retrieves a vector store file batch.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string
batch_id path Yes string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreFileBatchObject A batch of files attached to a vector store.

Examples

Example

Retrieves a vector store file batch.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vsfb_abc123",
    "object": "vector_store.file_batch",
    "created_at": 1699061776,
    "vector_store_id": "vs_abc123",
    "status": "in_progress",
    "file_counts": {
      "in_progress": 1,
      "completed": 1,
      "failed": 0,
      "cancelled": 0,
      "total": 2
    }
  }
}
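
The `file_counts` object in a batch response can be reduced to a single progress figure when polling this endpoint. The sketch below is illustrative: `batch_progress` is a hypothetical helper, it treats completed, failed, and cancelled files as finished, and it falls back to summing the individual counts if `total` is reported as 0.

```python
from typing import Any, Dict


def batch_progress(batch: Dict[str, Any]) -> float:
    """Return the fraction of files in the batch that are no longer in progress (0.0 to 1.0)."""
    counts = batch["file_counts"]
    finished = counts["completed"] + counts["failed"] + counts["cancelled"]
    # Fall back to the sum of the individual counts when total is missing or 0.
    total = counts.get("total") or (finished + counts["in_progress"])
    return finished / total if total else 0.0


sample_batch = {
    "id": "vsfb_abc123",
    "status": "in_progress",
    "file_counts": {"in_progress": 12, "completed": 3, "failed": 0, "cancelled": 0, "total": 15},
}
print(f"{batch_progress(sample_batch):.0%} of files processed")
```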

Cancel - Vector Store File Batch

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2024-12-01-preview

Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string
batch_id path Yes string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreFileBatchObject A batch of files attached to a vector store.

Examples

Example

Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vsfb_abc123",
    "object": "vector_store.file_batch",
    "created_at": 1699061776,
    "vector_store_id": "vs_abc123",
    "status": "cancelling",
    "file_counts": {
      "in_progress": 12,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 15
    }
  }
}

List - Vector Store File Batch Files

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2024-12-01-preview

Returns a list of vector store files in a batch.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com). Replace "aoairesource" with your Azure OpenAI resource name. Format: https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string
batch_id path Yes string
limit query No integer
order query No string
after query No string
before query No string
filter query No string
api-version query Yes string

Request Header

Name Required Type Description
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listVectorStoreFilesResponse

Examples

Example

Returns a list of vector store files in a batch.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2024-12-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "file-abc123",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      },
      {
        "id": "file-abc456",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      }
    ],
    "first_id": "file-abc123",
    "last_id": "file-abc456",
    "has_more": false
  }
}

Components

errorResponse

Name Type Description Required Default
error error No

errorBase

Name Type Description Required Default
code string No
message string No

error

Name Type Description Required Default
param string No
type string No
inner_error innerError Inner error with additional details. No

innerError

Inner error with additional details.

Name Type Description Required Default
code innerErrorCode Error codes for the inner error object. No
content_filter_results contentFilterPromptResults Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, high: a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Information about the customer blocklist, if it has been filtered, and its ID. No

innerErrorCode

Error codes for the inner error object.

Description: Error codes for the inner error object.

Type: string

Default:

Enum Name: InnerErrorCode

Enum Values:

Value Description
ResponsibleAIPolicyViolation The prompt violated one or more content filter rules.
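
A client can use the nesting described above (errorResponse, then error, then inner_error) to tell content filter rejections apart from other failures. The following is a minimal sketch: `is_content_filter_violation` is a hypothetical helper, and the sample payload is invented for illustration.

```python
from typing import Any, Dict


def is_content_filter_violation(payload: Dict[str, Any]) -> bool:
    """Return True when error.inner_error.code is ResponsibleAIPolicyViolation."""
    error = payload.get("error") or {}
    inner_error = error.get("inner_error") or {}
    return inner_error.get("code") == "ResponsibleAIPolicyViolation"


# Invented payload that follows the errorResponse shape.
sample_error = {
    "error": {
        "type": None,
        "param": "prompt",
        "inner_error": {
            "code": "ResponsibleAIPolicyViolation",
            "content_filter_results": {"violence": {"filtered": True, "severity": "medium"}},
        },
    }
}
print(is_content_filter_violation(sample_error))
```

Every field in these objects is optional, so the helper tolerates missing or null levels instead of raising.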

dalleErrorResponse

Name Type Description Required Default
error dalleError No

dalleError

Name Type Description Required Default
param string No
type string No
inner_error dalleInnerError Inner error with additional details. No

dalleInnerError

Inner error with additional details.

Name Type Description Required Default
code innerErrorCode Error codes for the inner error object. No
content_filter_results dalleFilterResults Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, high: a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Information about the customer blocklist, if it has been filtered, and its ID. No
revised_prompt string The prompt that was used to generate the image, if there was any revision to the prompt. No

contentFilterCompletionTextSpan

Describes a span within generated completion text. Offset 0 is the first UTF-32 code point of the completion text.

Name Type Description Required Default
completion_start_offset integer Offset of the UTF-32 code point that begins the span. Yes
completion_end_offset integer Offset of the first UTF-32 code point that is excluded from the span. This field is always equal to completion_start_offset for empty spans. This field is always larger than completion_start_offset for nonempty spans. Yes
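
The two offsets form a half-open interval over code points. Python strings are indexed by code point, so the offsets map directly onto a slice. A minimal sketch, where `extract_span` is a hypothetical helper and the sample text is invented:

```python
from typing import Dict


def extract_span(completion_text: str, span: Dict[str, int]) -> str:
    """Return the text covered by a span; the end offset is excluded."""
    start = span["completion_start_offset"]
    end = span["completion_end_offset"]
    if not 0 <= start <= end <= len(completion_text):
        raise ValueError("span offsets fall outside the completion text")
    # Python str indices count code points, which matches UTF-32 offsets.
    return completion_text[start:end]


text = "ok \U0001F600 done"
print(extract_span(text, {"completion_start_offset": 5, "completion_end_offset": 9}))
```

In languages whose strings are indexed by UTF-16 code units or bytes, the offsets would need converting before slicing.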

contentFilterResultBase

Name Type Description Required Default
filtered boolean Yes

contentFilterSeverityResult

Name Type Description Required Default
filtered boolean Yes
severity string No

contentFilterDetectedResult

Name Type Description Required Default
filtered boolean Yes
detected boolean No

contentFilterDetectedWithCitationResult

Name Type Description Required Default
citation object No

Properties for citation

URL

Name Type Description Default
URL string

license

Name Type Description Default
license string

contentFilterDetectedWithCompletionTextSpansResult

Name Type Description Required Default
details array No

contentFilterIdResult

Name Type Description Required Default
filtered boolean Yes
id string No

contentFilterResultsBase

Information about the content filtering results.

Name Type Description Required Default
sexual contentFilterSeverityResult No
violence contentFilterSeverityResult No
hate contentFilterSeverityResult No
self_harm contentFilterSeverityResult No
profanity contentFilterDetectedResult No
custom_blocklists contentFilterDetailedResults Content filtering results with a detail of content filter ids for the filtered segments. No
error errorBase No

contentFilterPromptResults

Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, high: a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Information about the customer blocklist, if it has been filtered, and its ID.

Name Type Description Required Default
sexual contentFilterSeverityResult No
violence contentFilterSeverityResult No
hate contentFilterSeverityResult No
self_harm contentFilterSeverityResult No
profanity contentFilterDetectedResult No
custom_blocklists contentFilterDetailedResults Content filtering results with a detail of content filter ids for the filtered segments. No
error errorBase No
jailbreak contentFilterDetectedResult No
indirect_attack contentFilterDetectedResult No

contentFilterChoiceResults

Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content) and if it has been filtered or not. Information about third-party text and profanity, if it has been detected, and if it has been filtered or not. Also includes information about the customer blocklist: whether it has been filtered, and its ID.

Name Type Description Required Default
sexual contentFilterSeverityResult No
violence contentFilterSeverityResult No
hate contentFilterSeverityResult No
self_harm contentFilterSeverityResult No
profanity contentFilterDetectedResult No
custom_blocklists contentFilterDetailedResults Content filtering results with a detail of content filter ids for the filtered segments. No
error errorBase No
protected_material_text contentFilterDetectedResult No
protected_material_code contentFilterDetectedWithCitationResult No
ungrounded_material contentFilterDetectedWithCompletionTextSpansResult No
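
A content filter result object like the ones above can be reduced to two lists: the categories that were filtered and the categories that were only detected. The sketch below assumes the result has already been decoded from JSON into a dictionary. The helper name and the sample values are illustrative and not taken from a real response.

```python
from typing import Any


def summarize_filter_results(results: dict[str, Any]) -> dict[str, list[str]]:
    """List which categories report filtered=True and which report detected=True."""
    summary: dict[str, list[str]] = {"filtered": [], "detected": []}
    for category, result in results.items():
        # The optional "error" entry is an errorBase object, not a filter result.
        if category == "error" or not isinstance(result, dict):
            continue
        if result.get("filtered"):
            summary["filtered"].append(category)
        if result.get("detected"):
            summary["detected"].append(category)
    return summary


# Hypothetical sample shaped like contentFilterChoiceResults.
sample = {
    "hate": {"filtered": False, "severity": "low"},
    "violence": {"filtered": True, "severity": "medium"},
    "profanity": {"filtered": False, "detected": True},
}
print(summarize_filter_results(sample))
```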

contentFilterDetailedResults

Content filtering results with a detail of content filter ids for the filtered segments.

Name Type Description Required Default
filtered boolean Yes
details array No

promptFilterResult

Content filtering results for a single prompt in the request.

Name Type Description Required Default
prompt_index integer No
content_filter_results contentFilterPromptResults Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content) and if it has been filtered or not. Information about jailbreak content and profanity, if it has been detected, and if it has been filtered or not. Also includes information about the customer blocklist: whether it has been filtered, and its ID. No

promptFilterResults

Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders.

No properties defined for this component.
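
Since streamed results for different prompts may arrive at different times or in a different order, a client usually has to key them by `prompt_index` before using them. The sketch below assumes each streamed chunk has been decoded into a dictionary that may carry a `prompt_filter_results` list. It treats a missing `prompt_index` as 0, which is an assumption made here rather than documented behavior.

```python
from typing import Any, Iterable


def order_prompt_filter_results(
    chunks: Iterable[dict[str, Any]],
) -> list[tuple[int, dict[str, Any]]]:
    """Collect prompt filter results from all chunks and sort them by prompt_index."""
    by_index: dict[int, dict[str, Any]] = {}
    for chunk in chunks:
        for entry in chunk.get("prompt_filter_results", []):
            # Assumption: a missing prompt_index refers to the first prompt.
            index = entry.get("prompt_index", 0)
            by_index[index] = entry.get("content_filter_results", {})
    return sorted(by_index.items())
```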

dalleContentFilterResults

Information about the content filtering results.

Name Type Description Required Default
sexual contentFilterSeverityResult No
violence contentFilterSeverityResult No
hate contentFilterSeverityResult No
self_harm contentFilterSeverityResult No

dalleFilterResults

Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content) and if it has been filtered or not. Information about jailbreak content and profanity, if it has been detected, and if it has been filtered or not. Also includes information about the customer blocklist: whether it has been filtered, and its ID.

Name Type Description Required Default
sexual contentFilterSeverityResult No
violence contentFilterSeverityResult No
hate contentFilterSeverityResult No
self_harm contentFilterSeverityResult No
profanity contentFilterDetectedResult No
jailbreak contentFilterDetectedResult No
custom_blocklists contentFilterDetailedResults Content filtering results with a detail of content filter ids for the filtered segments. No

chatCompletionsRequestCommon

Name Type Description Required Default
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
No 1
stream boolean If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. No False
stop string or array Up to four sequences where the API will stop generating further tokens. No
max_tokens integer The maximum number of tokens allowed for the generated answer. By default, the number of tokens the model can return will be (4,096 - prompt tokens). This isn't compatible with o1 series models. No 4,096
max_completion_tokens integer An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Support will expand to other models in a future API release. No
presence_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. No 0
frequency_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. No 0
logit_bias object Modify the likelihood of specified tokens appearing in the completion. Accepts a json object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. No
store boolean Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products. No
metadata object Developer-defined tags and values used for filtering completions in the stored completions dashboard. No
user string A unique identifier representing your end-user, which can help Azure OpenAI to monitor and detect abuse. No
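
When `stream` is set, the response body is a sequence of data-only server-sent events that ends with `data: [DONE]`. The sketch below reads such a stream from any iterable of text lines and joins the message deltas. The chunk shape it assumes (`choices[].delta.content`) is not described in this table, and the function names are hypothetical.

```python
import json
from typing import Any, Iterable, Iterator


def iter_stream_events(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield the decoded JSON payload of each data event until [DONE] is seen."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data field
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # the terminator carries no JSON
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the partial message content carried by the streamed chunks."""
    parts: list[str] = []
    for event in iter_stream_events(lines):
        for choice in event.get("choices", []):
            # Assumed chunk shape: choices[].delta.content holds the partial text.
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)
```

In practice, the `lines` argument would come from the HTTP client's line iterator for the response body.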

createCompletionRequest

Name Type Description Required Default
prompt string or array The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays.

Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt isn't specified the model will generate as if from the beginning of a new document.
Yes
best_of integer Generates best_of completions server-side and returns the "best" (the one with the highest log probability per token). Results can't be streamed.

When used with n, best_of controls the number of candidate completions and n specifies how many to return – best_of must be greater than n.

Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.
No 1
echo boolean Echo back the prompt in addition to the completion
No False
frequency_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
No 0
logit_bias object Modify the likelihood of specified tokens appearing in the completion.

Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.

As an example, you can pass {"50256": -100} to prevent the <|endoftext|> token from being generated.
No None
logprobs integer Include the log probabilities on the logprobs most likely output tokens, as well as the chosen tokens. For example, if logprobs is 5, the API will return a list of the five most likely tokens. The API will always return the logprob of the sampled token, so there may be up to logprobs+1 elements in the response.

The maximum value for logprobs is 5.
No None
max_tokens integer The maximum number of tokens that can be generated in the completion.

The token count of your prompt plus max_tokens can't exceed the model's context length.
No 16
n integer How many completions to generate for each prompt.

Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.
No 1
presence_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
No 0
seed integer If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

Determinism isn't guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
No
stop string or array Up to four sequences where the API will stop generating further tokens. The returned text won't contain the stop sequence.
No
stream boolean Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.
No False
suffix string The suffix that comes after a completion of inserted text.

This parameter is only supported for gpt-3.5-turbo-instruct.
No None
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

We generally recommend altering this or top_p but not both.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
user string A unique identifier representing your end-user, which can help to monitor and detect abuse.
No
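
Several of the constraints in this table can be checked on the client before a request is sent. These include the rule that `best_of` must be greater than `n` and can't be streamed, the `logprobs` maximum of 5, the limit of four `stop` sequences, and the `logit_bias` range. The following is a minimal sketch of such a check. The builder function is hypothetical, covers only a subset of the parameters, and encodes the rules exactly as worded above.

```python
from typing import Any, Optional, Union


def build_completion_request(
    prompt: Union[str, list],
    *,
    max_tokens: int = 16,
    n: int = 1,
    best_of: Optional[int] = None,
    logprobs: Optional[int] = None,
    stop: Union[str, list, None] = None,
    stream: bool = False,
    logit_bias: Optional[dict] = None,
) -> dict[str, Any]:
    """Assemble a createCompletionRequest body, rejecting values the table rules out."""
    body: dict[str, Any] = {"prompt": prompt, "max_tokens": max_tokens, "n": n}
    if best_of is not None:
        if stream:
            raise ValueError("results can't be streamed when best_of is set")
        if best_of <= n:
            raise ValueError("best_of must be greater than n")
        body["best_of"] = best_of
    if logprobs is not None:
        if not 0 <= logprobs <= 5:
            raise ValueError("the maximum value for logprobs is 5")
        body["logprobs"] = logprobs
    if stop is not None:
        sequences = [stop] if isinstance(stop, str) else list(stop)
        if len(sequences) > 4:
            raise ValueError("at most four stop sequences are allowed")
        body["stop"] = stop
    if logit_bias:
        for bias in logit_bias.values():
            if not -100 <= bias <= 100:
                raise ValueError("logit_bias values must be between -100 and 100")
        # JSON object keys are strings, so token IDs are serialized as strings.
        body["logit_bias"] = {str(token): bias for token, bias in logit_bias.items()}
    if stream:
        body["stream"] = True
    return body
```

For example, `build_completion_request("Say hi", n=2, best_of=3, logit_bias={50256: -100})` produces a body that suppresses the `<|endoftext|>` token, as in the example in the table.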

createCompletionResponse

Represents a completion response from the API. Note: both the streamed and non-streamed response objects share the same shape (unlike the chat endpoint).

Name Type Description Required Default
id string A unique identifier for the completion. Yes
choices array The list of completion choices the model generated for the input prompt. Yes
created integer The Unix timestamp (in seconds) of when the completion was created. Yes
model string The model used for completion. Yes
prompt_filter_results promptFilterResults Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders. No
system_fingerprint string This fingerprint represents the backend configuration that the model runs with.

Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism.
No
object enum The object type, which is always "text_completion"
Possible values: text_completion
Yes
usage completionUsage Usage statistics for the completion request. No
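
One way to use `system_fingerprint` together with `seed` is to compare the fingerprints reported across repeated requests: if more than one distinct value shows up, the backend configuration changed and outputs may differ even with the same seed. The following is a small sketch, assuming responses have been decoded into dictionaries. The helper name and fingerprint values are made up for illustration.

```python
from typing import Any, Iterable


def fingerprint_changed(responses: Iterable[dict[str, Any]]) -> bool:
    """Return True when the responses report more than one system_fingerprint."""
    seen = {
        response["system_fingerprint"]
        for response in responses
        # system_fingerprint is optional, so responses without it are ignored.
        if response.get("system_fingerprint")
    }
    return len(seen) > 1


# Hypothetical responses from two runs with the same seed and parameters.
runs = [
    {"id": "cmpl-1", "object": "text_completion", "system_fingerprint": "fp_a"},
    {"id": "cmpl-2", "object": "text_completion", "system_fingerprint": "fp_b"},
]
print(fingerprint_changed(runs))
```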

createChatCompletionRequest

Name Type Description Required Default
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

We generally recommend altering this or top_p but not both.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
stream boolean If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.
No False
stop string or array Up to four sequences where the API will stop generating further tokens.
No
max_tokens integer The maximum number of tokens that can be generated in the chat completion.

The total length of input tokens and generated tokens is limited by the model's context length.
No
max_completion_tokens integer An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Support will expand to other models in a future API release. No
presence_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
No 0
frequency_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
No 0
logit_bias object Modify the likelihood of specified tokens appearing in the completion.

Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.
No None
store boolean Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products. No
metadata object Developer-defined tags and values used for filtering completions in the stored completions dashboard. No
user string A unique identifier representing your end-user, which can help to monitor and detect abuse.
No
messages array A list of messages comprising the conversation so far. Yes
data_sources array The configuration entries for Azure OpenAI chat extensions that use them.
This additional specification is only compatible with Azure OpenAI.
No
reasoning_effort enum o1 models only

Constrains effort on reasoning for reasoning models.

Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Possible values: low, medium, high
No
logprobs boolean Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message. No False
top_logprobs integer An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used. No
n integer How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. No 1
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
response_format ResponseFormatText or ResponseFormatJsonObject or ResponseFormatJsonSchema An object specifying the format that the model must output. Compatible with GPT-4o, GPT-4o mini, GPT-4 Turbo and all GPT-3.5 Turbo models newer than gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which guarantees the model will match your supplied JSON schema.

Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
seed integer This feature is in Beta.
If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Determinism isn't guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
No
stream_options chatCompletionStreamOptions Options for streaming response. Only set this when you set stream: true.
No None
tools array A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
No
tool_choice chatCompletionToolChoiceOption Controls which (if any) tool is called by the model. none means the model won't call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present. No
function_call string or chatCompletionFunctionCallOption Deprecated in favor of tool_choice.

Controls which (if any) function is called by the model.
none means the model won't call a function and instead generates a message.
auto means the model can pick between generating a message or calling a function.
Specifying a particular function via {"name": "my_function"} forces the model to call that function.

none is the default when no functions are present. auto is the default if functions are present.
No
functions array Deprecated in favor of tools.

A list of functions the model may generate JSON inputs for.
No
user_security_context userSecurityContext User security context contains several parameters that describe the AI application itself, and the end user that interacts with the AI application. These fields assist your security operations teams to investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud. No
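
The JSON mode requirement described under `response_format` (the conversation itself must ask for JSON, and output may be cut off when `finish_reason` is `length`) can be guarded on the client. The sketch below is one possible guard. The function names are hypothetical, and the check for the word "json" in a system or user message is only a heuristic, not something the service itself performs.

```python
from typing import Any, Optional


def build_json_mode_request(
    messages: list[dict[str, Any]], *, max_tokens: Optional[int] = None
) -> dict[str, Any]:
    """Build a chat completions body with JSON mode, refusing prompts that never ask for JSON."""
    asks_for_json = any(
        message.get("role") in ("system", "user")
        and isinstance(message.get("content"), str)
        and "json" in message["content"].lower()  # heuristic only
        for message in messages
    )
    if not asks_for_json:
        raise ValueError("JSON mode needs a system or user message that asks for JSON")
    body: dict[str, Any] = {
        "messages": messages,
        "response_format": {"type": "json_object"},
    }
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    return body


def is_truncated(choice: dict[str, Any]) -> bool:
    """True when generation stopped at the token limit, so the JSON may be cut off."""
    return choice.get("finish_reason") == "length"
```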

userSecurityContext

User security context contains several parameters that describe the AI application itself, and the end user that interacts with the AI application. These fields assist your security operations teams to investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud.

Name Type Description Required Default
application_name string The name of the application. Sensitive personal information should not be included in this field. No
end_user_id string This identifier is the Microsoft Entra ID (formerly Azure Active Directory) user object ID used to authenticate end-users within the generative AI application. Sensitive personal information should not be included in this field. No
end_user_tenant_id string The Microsoft 365 tenant ID the end user belongs to. It's required when the generative AI application is multitenant. No
source_ip string Captures the original client's IP address, accepting both IPv4 and IPv6 formats. No
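
A `user_security_context` object can be assembled from whichever of the four optional fields are known. The sketch below uses Python's standard `ipaddress` module to confirm that `source_ip` is a valid IPv4 or IPv6 address before it's included. The builder function is hypothetical, and the values in the usage line are placeholders.

```python
import ipaddress
from typing import Optional


def build_user_security_context(
    application_name: Optional[str] = None,
    end_user_id: Optional[str] = None,
    end_user_tenant_id: Optional[str] = None,
    source_ip: Optional[str] = None,
) -> dict[str, str]:
    """Return a userSecurityContext dict containing only the fields that were supplied."""
    context: dict[str, str] = {}
    if application_name is not None:
        context["application_name"] = application_name
    if end_user_id is not None:
        context["end_user_id"] = end_user_id
    if end_user_tenant_id is not None:
        context["end_user_tenant_id"] = end_user_tenant_id
    if source_ip is not None:
        # ip_address() accepts IPv4 and IPv6 and raises ValueError for anything else.
        ipaddress.ip_address(source_ip)
        context["source_ip"] = source_ip
    return context


print(build_user_security_context(application_name="contoso-chat", source_ip="2001:db8::1"))
```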

chatCompletionFunctions

Name Type Description Required Default
description string A description of what the function does, used by the model to choose when and how to call the function. No
name string The name of the function to be called. The name can contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64 characters. Yes
parameters FunctionParameters The parameters the function accepts, described as a JSON Schema object. See the guide (https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the JSON Schema reference for documentation about the format.

Omitting parameters defines a function with an empty parameter list.
No
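
The naming rule for functions can be expressed as a regular expression and checked before a definition is sent. The following is a minimal sketch. The helper and the `get_weather` example are hypothetical, and the pattern reflects one reading of the rule above (letters, digits, underscores, and dashes, 1 to 64 characters).

```python
import re
from typing import Any, Optional

# Letters, digits, underscores, and dashes; between 1 and 64 characters.
_FUNCTION_NAME = re.compile(r"[A-Za-z0-9_-]{1,64}")


def make_function_definition(
    name: str,
    description: Optional[str] = None,
    parameters: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build a chatCompletionFunctions object after checking the function name."""
    if not _FUNCTION_NAME.fullmatch(name):
        raise ValueError(f"invalid function name: {name!r}")
    definition: dict[str, Any] = {"name": name}
    if description is not None:
        definition["description"] = description
    # Leaving parameters out defines a function with an empty parameter list.
    if parameters is not None:
        definition["parameters"] = parameters
    return definition


weather = make_function_definition(
    "get_weather",
    description="Look up the current weather for a city.",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
```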

chatCompletionFunctionCallOption

Specifying a particular function via {"name": "my_function"} forces the model to call that function.

Name Type Description Required Default
name string The name of the function to call. Yes

chatCompletionFunctionParameters

The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.

No properties defined for this component.

chatCompletionRequestMessage

This component can be one of the following:

ChatCompletionRequestDeveloperMessage

Developer-provided instructions that the model should follow, regardless of messages sent by the user. With o1 models and newer, developer messages replace the previous system messages.

Name Type Description Required Default
content string or array The contents of the developer message. Yes
role enum The role of the message's author, in this case developer.
Possible values: developer
Yes
name string An optional name for the participant. Provides the model information to differentiate between participants of the same role. No

chatCompletionRequestSystemMessage

Name Type Description Required Default
content string or array The contents of the system message. Yes
role enum The role of the message's author, in this case system.
Possible values: system
Yes
name string An optional name for the participant. Provides the model information to differentiate between participants of the same role. No

chatCompletionRequestUserMessage

Name Type Description Required Default
content string or array The contents of the user message.
Yes
role enum The role of the message's author, in this case user.
Possible values: user
Yes
name string An optional name for the participant. Provides the model information to differentiate between participants of the same role. No

chatCompletionRequestAssistantMessage

Name Type Description Required Default
content string or array The contents of the assistant message. Required unless tool_calls or function_call is specified.
No
refusal string The refusal message by the assistant. No
role enum The role of the message's author, in this case assistant.
Possible values: assistant
Yes
name string An optional name for the participant. Provides the model information to differentiate between participants of the same role. No
tool_calls chatCompletionMessageToolCalls The tool calls generated by the model, such as function calls. No
function_call object Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model. No

Properties for function_call

arguments

Name Type Description Default
arguments string The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

name

Name Type Description Default
name string The name of the function to call.
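
Because the model doesn't always generate valid JSON and may invent parameters, the `arguments` string should be validated before the function is called. The sketch below does a deliberately shallow check against the top-level `properties` and `required` keys of the function's JSON Schema. It's not a full JSON Schema validator, and the helper name is hypothetical.

```python
import json
from typing import Any


def parse_function_arguments(raw_arguments: str, schema: dict[str, Any]) -> dict[str, Any]:
    """Decode model-generated arguments and reject unknown or missing parameters."""
    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        raise ValueError("arguments are not valid JSON") from exc
    if not isinstance(arguments, dict):
        raise ValueError("arguments must decode to a JSON object")
    # Shallow check only: top-level property names and required keys.
    unknown = sorted(set(arguments) - set(schema.get("properties", {})))
    missing = sorted(set(schema.get("required", [])) - set(arguments))
    if unknown or missing:
        raise ValueError(f"unknown parameters: {unknown}; missing parameters: {missing}")
    return arguments
```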

chatCompletionRequestToolMessage

Name Type Description Required Default
role enum The role of the message's author, in this case tool.
Possible values: tool
Yes
content string or array The contents of the tool message. Yes
tool_call_id string Tool call that this message is responding to. Yes
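
A tool message answers one specific tool call, so each reply has to carry the ID of the call it responds to. The sketch below builds the tool messages for an assistant message that contains `tool_calls`. The shape it assumes for each call (`id`, plus `function.name` and `function.arguments`) is not shown in this table, and `handlers` is a hypothetical mapping from function names to local Python callables.

```python
import json
from typing import Any, Callable


def tool_result_messages(
    assistant_message: dict[str, Any],
    handlers: dict[str, Callable[..., Any]],
) -> list[dict[str, str]]:
    """Run each requested tool locally and wrap its result in a tool message."""
    replies: list[dict[str, str]] = []
    for call in assistant_message.get("tool_calls", []):
        function = call["function"]  # assumed shape: {"name": ..., "arguments": "<JSON>"}
        arguments = json.loads(function["arguments"])
        result = handlers[function["name"]](**arguments)
        replies.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],  # ties the reply to the originating call
                "content": json.dumps(result),
            }
        )
    return replies
```

The returned messages would be appended to the conversation after the assistant message before the next request is sent. Argument validation is omitted here for brevity.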

chatCompletionRequestFunctionMessage

Name Type Description Required Default
role enum The role of the message's author, in this case function.
Possible values: function
Yes
content string The contents of the function message. Yes
name string The name of the function to call. Yes

chatCompletionRequestDeveloperMessageContentPart

This component can be one of the following:

chatCompletionRequestSystemMessageContentPart

This component can be one of the following:

chatCompletionRequestUserMessageContentPart

This component can be one of the following:

chatCompletionRequestAssistantMessageContentPart

This component can be one of the following:

chatCompletionRequestToolMessageContentPart

This component can be one of the following:

chatCompletionRequestMessageContentPartText

Name Type Description Required Default
type enum The type of the content part.
Possible values: text
Yes
text string The text content. Yes

chatCompletionRequestMessageContentPartImage

Name Type Description Required Default
type enum The type of the content part.
Possible values: image_url
Yes
image_url object Yes

Properties for image_url

url

Name Type Description Default
url string Either a URL of the image or the base64 encoded image data.

detail

Name Type Description Default
detail string Specifies the detail level of the image. Learn more in the Vision guide. auto
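
A user message that combines text with an image is an array of content parts: one of type `text` and one of type `image_url`. The sketch below embeds local image bytes as base64. Wrapping the base64 data in a `data:` URL with a MIME type is a common convention assumed here and isn't spelled out in the table above. The function name is hypothetical.

```python
import base64
from typing import Any


def image_user_message(
    text: str,
    image_bytes: bytes,
    mime_type: str = "image/png",
    detail: str = "auto",
) -> dict[str, Any]:
    """Build a user message with one text part and one base64-encoded image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {
                "type": "image_url",
                "image_url": {
                    # Assumed convention: base64 payload carried in a data URL.
                    "url": f"data:{mime_type};base64,{encoded}",
                    "detail": detail,
                },
            },
        ],
    }
```

For an image that is already hosted, the `url` field would hold the image's URL instead.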

chatCompletionRequestMessageContentPartRefusal

Name Type Description Required Default
type enum The type of the content part.
Possible values: refusal
Yes
refusal string The refusal message generated by the model. Yes

azureChatExtensionConfiguration

A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.

Name Type Description Required Default
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.
Yes

azureChatExtensionType

A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.

Description: A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.

Type: string

Default:

Enum Name: AzureChatExtensionType

Enum Values:

Value Description
azure_search Represents the use of Azure Search as an Azure OpenAI chat extension.
azure_cosmos_db Represents the use of Azure Cosmos DB as an Azure OpenAI chat extension.
elasticsearch Represents the use of Elasticsearch® index as an Azure OpenAI chat extension.
mongo_db Represents the use of Mongo DB as an Azure OpenAI chat extension.
pinecone Represents the use of Pinecone index as an Azure OpenAI chat extension.

azureSearchChatExtensionConfiguration

A specific representation of configurable options for Azure Search when using it as an Azure OpenAI chat extension.

Name Type Description Required Default
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.
Yes
parameters azureSearchChatExtensionParameters Parameters for Azure Search when used as an Azure OpenAI chat extension. No

azureSearchChatExtensionParameters

Parameters for Azure Search when used as an Azure OpenAI chat extension.

Name Type Description Required Default
authentication onYourDataApiKeyAuthenticationOptions or onYourDataSystemAssignedManagedIdentityAuthenticationOptions or onYourDataUserAssignedManagedIdentityAuthenticationOptions or onYourDataAccessTokenAuthenticationOptions Yes
top_n_documents integer The configured top number of documents to feature for the configured query. No
max_search_queries integer The max number of rewritten queries that should be sent to search provider for one user message. If not specified, the system will decide the number of queries to send. No
allow_partial_result boolean If specified as true, the system will allow partial search results to be used, and the request fails only if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails. No False
in_scope boolean Whether queries should be restricted to use of indexed data. No
strictness integer The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. No
endpoint string The absolute endpoint path for the Azure Search resource to use. Yes
index_name string The name of the index to use as available in the referenced Azure Search resource. Yes
fields_mapping azureSearchIndexFieldMappingOptions Optional settings to control how fields are processed when using a configured Azure Search resource. No
query_type azureSearchQueryType The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension. No
semantic_configuration string The additional semantic configuration for the query. No
filter string Search filter. No
embedding_dependency onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource or onYourDataIntegratedVectorizationSource No
include_contexts array The included properties of the output context. If not specified, the default value is citations and intent. No

azureSearchIndexFieldMappingOptions

Optional settings to control how fields are processed when using a configured Azure Search resource.

Name Type Description Required Default
title_field string The name of the index field to use as a title. No
url_field string The name of the index field to use as a URL. No
filepath_field string The name of the index field to use as a filepath. No
content_fields array The names of index fields that should be treated as content. No
content_fields_separator string The separator pattern that content fields should use. No
vector_fields array The names of fields that represent vector data. No
image_vector_fields array The names of fields that represent image vector data. No

azureSearchQueryType

The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension.

Description: The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension.

Type: string

Default:

Enum Name: AzureSearchQueryType

Enum Values:

Value Description
simple Represents the default, simple query parser.
semantic Represents the semantic query parser for advanced semantic modeling.
vector Represents vector search over computed data.
vector_simple_hybrid Represents a combination of the simple query strategy with vector data.
vector_semantic_hybrid Represents a combination of semantic search and vector data querying.
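
The query types above combine with the azureSearchChatExtensionParameters fields to form one entry of the `data_sources` array in a chat completions request. The sketch below builds such an entry with API key authentication. The `{"type": "api_key", "key": ...}` shape of the authentication object is an assumption, because onYourDataApiKeyAuthenticationOptions isn't defined in this part of the reference. The function name and the endpoint in the usage note are placeholders.

```python
from typing import Any, Optional

QUERY_TYPES = {
    "simple",
    "semantic",
    "vector",
    "vector_simple_hybrid",
    "vector_semantic_hybrid",
}


def azure_search_data_source(
    endpoint: str,
    index_name: str,
    api_key: str,
    *,
    query_type: str = "simple",
    top_n_documents: Optional[int] = None,
    strictness: Optional[int] = None,
    in_scope: Optional[bool] = None,
) -> dict[str, Any]:
    """Build an azureSearchChatExtensionConfiguration entry for data_sources."""
    if query_type not in QUERY_TYPES:
        raise ValueError(f"unsupported query_type: {query_type!r}")
    parameters: dict[str, Any] = {
        "endpoint": endpoint,
        "index_name": index_name,
        # Assumed shape for API key authentication options.
        "authentication": {"type": "api_key", "key": api_key},
        "query_type": query_type,
    }
    # Optional settings are sent only when the caller supplies them.
    optional = {
        "top_n_documents": top_n_documents,
        "strictness": strictness,
        "in_scope": in_scope,
    }
    parameters.update({key: value for key, value in optional.items() if value is not None})
    return {"type": "azure_search", "parameters": parameters}
```

A call such as `azure_search_data_source("https://example.search.windows.net", "docs", "<key>", top_n_documents=5)` returns a dictionary that can be placed in the `data_sources` list of the request body.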

azureCosmosDBChatExtensionConfiguration

A specific representation of configurable options for Azure Cosmos DB when using it as an Azure OpenAI chat extension.

Name Type Description Required Default
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat
completions request that should use Azure OpenAI chat extensions to augment the response behavior.
The use of this configuration is compatible only with Azure OpenAI.
Yes
parameters azureCosmosDBChatExtensionParameters Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Azure Cosmos DB for
MongoDB vCore.
No

azureCosmosDBChatExtensionParameters

Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Azure Cosmos DB for MongoDB vCore.

Name Type Description Required Default
authentication onYourDataConnectionStringAuthenticationOptions The authentication options for Azure OpenAI On Your Data when using a connection string. Yes
top_n_documents integer The configured top number of documents to feature for the configured query. No
max_search_queries integer The max number of rewritten queries that should be sent to search provider for one user message. If not specified, the system will decide the number of queries to send. No
allow_partial_result boolean If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails. No False
in_scope boolean Whether queries should be restricted to use of indexed data. No
strictness integer The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. No
database_name string The MongoDB vCore database name to use with Azure Cosmos DB. Yes
container_name string The name of the Azure Cosmos DB resource container. Yes
index_name string The MongoDB vCore index name to use with Azure Cosmos DB. Yes
fields_mapping azureCosmosDBFieldMappingOptions Optional settings to control how fields are processed when using a configured Azure Cosmos DB resource. Yes
embedding_dependency onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource Yes
include_contexts array The included properties of the output context. If not specified, the default value is citations and intent. No

azureCosmosDBFieldMappingOptions

Optional settings to control how fields are processed when using a configured Azure Cosmos DB resource.

Name Type Description Required Default
title_field string The name of the index field to use as a title. No
url_field string The name of the index field to use as a URL. No
filepath_field string The name of the index field to use as a filepath. No
content_fields array The names of index fields that should be treated as content. Yes
content_fields_separator string The separator pattern that content fields should use. No
vector_fields array The names of fields that represent vector data. Yes
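
The required properties in the two Azure Cosmos DB tables above can be illustrated with a minimal data source object. This is a sketch under stated assumptions: the `type` value `azure_cosmos_db` is assumed (the azureChatExtensionType values are not listed in this section), `missing_required` is a hypothetical helper, and every name and placeholder value is invented for illustration.

```python
# Properties marked "Yes" in the Required column of the
# azureCosmosDBChatExtensionParameters table.
REQUIRED_COSMOS_PARAMETERS = (
    "authentication",
    "database_name",
    "container_name",
    "index_name",
    "fields_mapping",
    "embedding_dependency",
)


def missing_required(parameters):
    """List required parameter names that are absent (hypothetical helper)."""
    return [name for name in REQUIRED_COSMOS_PARAMETERS if name not in parameters]


cosmos_data_source = {
    "type": "azure_cosmos_db",  # assumed azureChatExtensionType value
    "parameters": {
        "authentication": {
            "type": "connection_string",
            "connection_string": "<YOUR_CONNECTION_STRING>",  # placeholder
        },
        "database_name": "products",  # hypothetical names
        "container_name": "catalog",
        "index_name": "catalog-vector-index",
        "fields_mapping": {
            # content_fields and vector_fields are the required mappings.
            "content_fields": ["description"],
            "vector_fields": ["description_vector"],
        },
        "embedding_dependency": {
            "type": "deployment_name",
            "deployment_name": "my-embedding-deployment",
        },
        "top_n_documents": 5,
    },
}

print(missing_required(cosmos_data_source["parameters"]))
```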

elasticsearchChatExtensionConfiguration

A specific representation of configurable options for Elasticsearch when using it as an Azure OpenAI chat extension.

Name Type Description Required Default
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat
completions request that should use Azure OpenAI chat extensions to augment the response behavior.
The use of this configuration is compatible only with Azure OpenAI.
Yes
parameters elasticsearchChatExtensionParameters Parameters to use when configuring Elasticsearch® as an Azure OpenAI chat extension. No

elasticsearchChatExtensionParameters

Parameters to use when configuring Elasticsearch® as an Azure OpenAI chat extension.

Name Type Description Required Default
authentication onYourDataKeyAndKeyIdAuthenticationOptions or onYourDataEncodedApiKeyAuthenticationOptions Yes
top_n_documents integer The configured top number of documents to feature for the configured query. No
max_search_queries integer The max number of rewritten queries that should be sent to search provider for one user message. If not specified, the system will decide the number of queries to send. No
allow_partial_result boolean If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails. No False
in_scope boolean Whether queries should be restricted to use of indexed data. No
strictness integer The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. No
endpoint string The endpoint of Elasticsearch®. Yes
index_name string The index name of Elasticsearch®. Yes
fields_mapping elasticsearchIndexFieldMappingOptions Optional settings to control how fields are processed when using a configured Elasticsearch® resource. No
query_type elasticsearchQueryType The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension. No
embedding_dependency onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource or onYourDataModelIdVectorizationSource No
include_contexts array The included properties of the output context. If not specified, the default value is citations and intent. No

elasticsearchIndexFieldMappingOptions

Optional settings to control how fields are processed when using a configured Elasticsearch® resource.

Name Type Description Required Default
title_field string The name of the index field to use as a title. No
url_field string The name of the index field to use as a URL. No
filepath_field string The name of the index field to use as a filepath. No
content_fields array The names of index fields that should be treated as content. No
content_fields_separator string The separator pattern that content fields should use. No
vector_fields array The names of fields that represent vector data. No

elasticsearchQueryType

The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension.

Description: The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension.

Type: string

Default:

Enum Name: ElasticsearchQueryType

Enum Values:

Value Description
simple Represents the default, simple query parser.
vector Represents vector search over computed data.

mongoDBChatExtensionConfiguration

A specific representation of configurable options for Mongo DB when using it as an Azure OpenAI chat extension.

Name Type Description Required Default
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat
completions request that should use Azure OpenAI chat extensions to augment the response behavior.
The use of this configuration is compatible only with Azure OpenAI.
Yes
parameters mongoDBChatExtensionParameters Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Mongo DB. No

mongoDBChatExtensionParameters

Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Mongo DB.

Name Type Description Required Default
authentication onYourDataUsernameAndPasswordAuthenticationOptions The authentication options for Azure OpenAI On Your Data when using a username and a password. Yes
top_n_documents integer The configured top number of documents to feature for the configured query. No
max_search_queries integer The max number of rewritten queries that should be sent to search provider for one user message. If not specified, the system will decide the number of queries to send. No
allow_partial_result boolean If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails. No False
in_scope boolean Whether queries should be restricted to use of indexed data. No
strictness integer The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. No
endpoint string The name of the Mongo DB cluster endpoint. Yes
database_name string The name of the Mongo DB database. Yes
collection_name string The name of the Mongo DB Collection. Yes
app_name string The name of the Mongo DB Application. Yes
index_name string The name of the Mongo DB index. Yes
fields_mapping mongoDBFieldMappingOptions Optional settings to control how fields are processed when using a configured Mongo DB resource. Yes
embedding_dependency onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource Yes
include_contexts array The included properties of the output context. If not specified, the default value is citations and intent. No

mongoDBFieldMappingOptions

Optional settings to control how fields are processed when using a configured Mongo DB resource.

Name Type Description Required Default
title_field string The name of the index field to use as a title. No
url_field string The name of the index field to use as a URL. No
filepath_field string The name of the index field to use as a filepath. No
content_fields array The names of index fields that should be treated as content. Yes
content_fields_separator string The separator pattern that content fields should use. No
vector_fields array The names of fields that represent vector data. Yes

pineconeChatExtensionConfiguration

A specific representation of configurable options for Pinecone when using it as an Azure OpenAI chat extension.

Name Type Description Required Default
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat
completions request that should use Azure OpenAI chat extensions to augment the response behavior.
The use of this configuration is compatible only with Azure OpenAI.
Yes
parameters pineconeChatExtensionParameters Parameters for configuring Azure OpenAI Pinecone chat extensions. No

pineconeChatExtensionParameters

Parameters for configuring Azure OpenAI Pinecone chat extensions.

Name Type Description Required Default
authentication onYourDataApiKeyAuthenticationOptions The authentication options for Azure OpenAI On Your Data when using an API key. Yes
top_n_documents integer The configured top number of documents to feature for the configured query. No
max_search_queries integer The max number of rewritten queries that should be sent to search provider for one user message. If not specified, the system will decide the number of queries to send. No
allow_partial_result boolean If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails. No False
in_scope boolean Whether queries should be restricted to use of indexed data. No
strictness integer The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. No
environment string The environment name of Pinecone. Yes
index_name string The name of the Pinecone database index. Yes
fields_mapping pineconeFieldMappingOptions Optional settings to control how fields are processed when using a configured Pinecone resource. Yes
embedding_dependency onYourDataDeploymentNameVectorizationSource The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based
on an internal embeddings model deployment name in the same Azure OpenAI resource.
Yes
include_contexts array The included properties of the output context. If not specified, the default value is citations and intent. No

pineconeFieldMappingOptions

Optional settings to control how fields are processed when using a configured Pinecone resource.

Name Type Description Required Default
title_field string The name of the index field to use as a title. No
url_field string The name of the index field to use as a URL. No
filepath_field string The name of the index field to use as a filepath. No
content_fields array The names of index fields that should be treated as content. Yes
content_fields_separator string The separator pattern that content fields should use. No

onYourDataAuthenticationOptions

The authentication options for Azure OpenAI On Your Data.

Name Type Description Required Default
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes

onYourDataContextProperty

The context property.

Description: The context property.

Type: string

Default:

Enum Name: OnYourDataContextProperty

Enum Values:

Value Description
citations The citations property.
intent The intent property.
all_retrieved_documents The all_retrieved_documents property.

onYourDataAuthenticationType

The authentication types supported with Azure OpenAI On Your Data.

Description: The authentication types supported with Azure OpenAI On Your Data.

Type: string

Default:

Enum Name: OnYourDataAuthenticationType

Enum Values:

Value Description
api_key Authentication via API key.
connection_string Authentication via connection string.
key_and_key_id Authentication via key and key ID pair.
encoded_api_key Authentication via encoded API key.
access_token Authentication via access token.
system_assigned_managed_identity Authentication via system-assigned managed identity.
user_assigned_managed_identity Authentication via user-assigned managed identity.
username_and_password Authentication via username and password.
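
Each enum value above pairs with one of the authentication option objects described in the sections that follow. One way to keep that pairing straight in client code is a small lookup table. The sketch below assumes a hypothetical helper named `make_auth`; the field names are taken from the option tables, and the credential values are placeholders.

```python
# Extra properties that accompany each authentication type,
# as listed in the onYourData...AuthenticationOptions tables.
AUTH_FIELDS = {
    "api_key": ("key",),
    "connection_string": ("connection_string",),
    "key_and_key_id": ("key", "key_id"),
    "encoded_api_key": ("encoded_api_key",),
    "access_token": ("access_token",),
    "system_assigned_managed_identity": (),
    "user_assigned_managed_identity": ("managed_identity_resource_id",),
    "username_and_password": ("username", "password"),
}


def make_auth(auth_type, **fields):
    """Build an authentication object, rejecting fields that don't belong to the type."""
    if auth_type not in AUTH_FIELDS:
        raise ValueError(f"unknown authentication type: {auth_type!r}")
    unexpected = sorted(set(fields) - set(AUTH_FIELDS[auth_type]))
    if unexpected:
        raise ValueError(f"{auth_type} does not accept: {unexpected}")
    return {"type": auth_type, **fields}


print(make_auth("system_assigned_managed_identity"))
print(make_auth("key_and_key_id", key="<KEY>", key_id="<KEY_ID>"))
```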

onYourDataApiKeyAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an API key.

Name Type Description Required Default
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes
key string The API key to use for authentication. No

onYourDataConnectionStringAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a connection string.

Name Type Description Required Default
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes
connection_string string The connection string to use for authentication. No

onYourDataKeyAndKeyIdAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an Elasticsearch key and key ID pair.

Name Type Description Required Default
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes
key string The Elasticsearch key to use for authentication. No
key_id string The Elasticsearch key ID to use for authentication. No

onYourDataEncodedApiKeyAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an Elasticsearch encoded API key.

Name Type Description Required Default
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes
encoded_api_key string The Elasticsearch encoded API key to use for authentication. No

onYourDataAccessTokenAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an access token.

Name Type Description Required Default
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes
access_token string The access token to use for authentication. No

onYourDataSystemAssignedManagedIdentityAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a system-assigned managed identity.

Name Type Description Required Default
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes

onYourDataUserAssignedManagedIdentityAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a user-assigned managed identity.

Name Type Description Required Default
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes
managed_identity_resource_id string The resource ID of the user-assigned managed identity to use for authentication. No

onYourDataUsernameAndPasswordAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a username and a password.

Name Type Description Required Default
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes
username string The username to use for authentication. No
password string The password to use for authentication. No

onYourDataVectorizationSource

An abstract representation of a vectorization source for Azure OpenAI On Your Data with vector search.

Name Type Description Required Default
type onYourDataVectorizationSourceType Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.
Yes

onYourDataVectorizationSourceType

Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.

Description: Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.

Type: string

Default:

Enum Name: OnYourDataVectorizationSourceType

Enum Values:

Value Description
endpoint Represents vectorization performed by public service calls to an Azure OpenAI embedding model.
deployment_name Represents an Ada model deployment name to use. This model deployment must be in the same Azure OpenAI resource, but
On Your Data will use this model deployment via an internal call rather than a public one, which enables vector
search even in private networks.
integrated Represents the integrated vectorizer defined within the search resource.
model_id Represents a specific embedding model ID as defined in the search service.
Currently only supported by Elasticsearch®.

onYourDataEndpointVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on a public Azure OpenAI endpoint call for embeddings.

Name Type Description Required Default
type onYourDataVectorizationSourceType Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.
Yes
endpoint string Specifies the resource endpoint URL from which embeddings should be retrieved. It should be in the format of https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings. The api-version query parameter isn't allowed. No
authentication onYourDataApiKeyAuthenticationOptions or onYourDataAccessTokenAuthenticationOptions No
dimensions integer The number of dimensions the embeddings should have. Only supported in text-embedding-3 and later models. No

onYourDataDeploymentNameVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on an internal embeddings model deployment name in the same Azure OpenAI resource.

Name Type Description Required Default
type onYourDataVectorizationSourceType Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.
Yes
deployment_name string Specifies the name of the model deployment to use for vectorization. This model deployment must be in the same Azure OpenAI resource, but On Your Data will use this model deployment via an internal call rather than a public one, which enables vector search even in private networks. No
dimensions integer The number of dimensions the embeddings should have. Only supported in text-embedding-3 and later models. No

onYourDataIntegratedVectorizationSource

Represents the integrated vectorizer defined within the search resource.

Name Type Description Required Default
type onYourDataVectorizationSourceType Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.
Yes

onYourDataModelIdVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on a search service model ID. Currently only supported by Elasticsearch®.

Name Type Description Required Default
type onYourDataVectorizationSourceType Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.
Yes
model_id string Specifies the model ID to use for vectorization. This model ID must be defined in the search service. No
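
To make the differences between the vectorization sources concrete, the following sketch builds an `endpoint` source and a `deployment_name` source. Both helper functions are hypothetical, and the resource and deployment names are placeholders. The check on `api-version` mirrors the note in the onYourDataEndpointVectorizationSource table that the query parameter isn't allowed in the endpoint URL.

```python
from urllib.parse import parse_qs, urlsplit


def endpoint_vectorization_source(endpoint, authentication=None, dimensions=None):
    """Build an `endpoint` vectorization source (hypothetical helper)."""
    if "api-version" in parse_qs(urlsplit(endpoint).query):
        raise ValueError("endpoint must not include the api-version query parameter")
    source = {"type": "endpoint", "endpoint": endpoint}
    if authentication is not None:
        source["authentication"] = authentication
    if dimensions is not None:
        source["dimensions"] = dimensions
    return source


def deployment_vectorization_source(deployment_name, dimensions=None):
    """Build a `deployment_name` vectorization source (hypothetical helper)."""
    source = {"type": "deployment_name", "deployment_name": deployment_name}
    if dimensions is not None:
        source["dimensions"] = dimensions
    return source


public_source = endpoint_vectorization_source(
    "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings",
    authentication={"type": "api_key", "key": "<YOUR_KEY>"},  # placeholder key
)
internal_source = deployment_vectorization_source("YOUR_DEPLOYMENT_NAME", dimensions=256)
print(public_source["type"], internal_source["type"])
```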

azureChatExtensionsMessageContext

A representation of the additional context information available when Azure OpenAI chat extensions are involved in the generation of a corresponding chat completions response. This context information is only populated when using an Azure OpenAI request configured to use a matching extension.

Name Type Description Required Default
citations array The data source retrieval result, used to generate the assistant message in the response. No
intent string The detected intent from the chat history, used to pass to the next turn to carry over the context. No
all_retrieved_documents array All the retrieved documents. No

citation

Citation information for a chat completions response message.

Name Type Description Required Default
content string The content of the citation. Yes
title string The title of the citation. No
url string The URL of the citation. No
filepath string The file path of the citation. No
chunk_id string The chunk ID of the citation. No
rerank_score number The rerank score of the retrieved document. No

retrievedDocument

The retrieved document.

Name Type Description Required Default
content string The content of the citation. Yes
title string The title of the citation. No
url string The URL of the citation. No
filepath string The file path of the citation. No
chunk_id string The chunk ID of the citation. No
rerank_score number The rerank score of the retrieved document. No
search_queries array The search queries used to retrieve the document. No
data_source_index integer The index of the data source. No
original_search_score number The original search score of the retrieved document. No
filter_reason filterReason The filtering reason of the retrieved document. No
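
When `all_retrieved_documents` is requested through `include_contexts`, the message context can be inspected to see which documents were filtered and why. The sketch below groups retrieved documents by `filter_reason` (the values `score` and `rerank` are the ones listed under filterReason). It assumes that documents which were not filtered simply omit the property, which the table implies only by marking it optional. The sample context is invented.

```python
from collections import defaultdict


def group_by_filter_reason(context):
    """Group retrieved documents by filter_reason; None means no reason was reported."""
    groups = defaultdict(list)
    for document in context.get("all_retrieved_documents", []):
        groups[document.get("filter_reason")].append(document)
    return dict(groups)


# Hypothetical context object from a response message.
sample_context = {
    "citations": [{"content": "Returns are accepted within 30 days.", "title": "Policy"}],
    "intent": "return policy",
    "all_retrieved_documents": [
        {"content": "Returns are accepted within 30 days.", "data_source_index": 0,
         "original_search_score": 0.91},
        {"content": "Shipping takes 3 days.", "data_source_index": 0,
         "original_search_score": 0.12, "filter_reason": "score"},
        {"content": "Gift cards are non-refundable.", "data_source_index": 0,
         "original_search_score": 0.55, "rerank_score": 0.2, "filter_reason": "rerank"},
    ],
}

groups = group_by_filter_reason(sample_context)
for reason, documents in groups.items():
    print(reason, len(documents))
```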

filterReason

The filtering reason of the retrieved document.

Description: The filtering reason of the retrieved document.

Type: string

Default:

Enum Name: FilterReason

Enum Values:

Value Description
score The document is filtered by the original search score threshold defined by the strictness configuration.
rerank The document isn't filtered by the original search score threshold, but is filtered by the rerank score and the top_n_documents configuration.

chatCompletionMessageToolCall

Name Type Description Required Default
id string The ID of the tool call. Yes
type toolCallType The type of the tool call, in this case function. Yes
function object The function that the model called. Yes

Properties for function

name

Name Type Description Default
name string The name of the function to call.

arguments

Name Type Description Default
arguments string The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
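
Because the `arguments` string isn't guaranteed to be valid JSON or to match your schema, a validation step before dispatch is worthwhile. The following is a minimal sketch: `parse_tool_arguments` is a hypothetical helper, and the `get_weather` tool call is invented sample data.

```python
import json


def parse_tool_arguments(tool_call, allowed, required=()):
    """Parse and validate the model-generated arguments of one tool call."""
    raw = tool_call["function"]["arguments"]
    try:
        arguments = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    if not isinstance(arguments, dict):
        raise ValueError("arguments must decode to a JSON object")
    unknown = sorted(set(arguments) - set(allowed))
    if unknown:
        raise ValueError(f"parameters not defined by the schema: {unknown}")
    missing = sorted(set(required) - set(arguments))
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    return arguments


# Invented sample tool call shaped like chatCompletionMessageToolCall.
tool_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'},
}
print(parse_tool_arguments(tool_call, allowed={"city", "unit"}, required={"city"}))
```

For stricter checks, the same place in the code could run a full JSON Schema validator against the function's declared parameters.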

toolCallType

The type of the tool call, in this case function.

Description: The type of the tool call, in this case function.

Type: string

Default:

Enum Name: ToolCallType

Enum Values:

Value Description
function The tool call type is function.

chatCompletionRequestMessageTool

Name Type Description Required Default
tool_call_id string Tool call that this message is responding to. No
content string The contents of the message. No

chatCompletionRequestMessageFunction

Name Type Description Required Default
role enum The role of the message's author, in this case function.
Possible values: function
No
name string The name of the function to call. No
content string The contents of the message. No

createChatCompletionResponse

Represents a chat completion response returned by the model, based on the provided input.

Name Type Description Required Default
id string A unique identifier for the chat completion. Yes
prompt_filter_results promptFilterResults Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders. No
choices array A list of chat completion choices. Can be more than one if n is greater than 1. Yes
created integer The Unix timestamp (in seconds) of when the chat completion was created. Yes
model string The model used for the chat completion. Yes
system_fingerprint string This fingerprint represents the backend configuration that the model runs with.
Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism.
No
object enum The object type, which is always chat.completion.
Possible values: chat.completion
Yes
usage completionUsage Usage statistics for the completion request. No

createChatCompletionStreamResponse

Represents a streamed chunk of a chat completion response returned by the model, based on the provided input.

Name Type Description Required Default
id string A unique identifier for the chat completion. Each chunk has the same ID. Yes
choices array A list of chat completion choices. Can contain more than one element if n is greater than 1.
Yes
created integer The Unix timestamp (in seconds) of when the chat completion was created. Each chunk has the same timestamp. Yes
model string The model to generate the completion. Yes
system_fingerprint string This fingerprint represents the backend configuration that the model runs with.
Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism.
No
object enum The object type, which is always chat.completion.chunk.
Possible values: chat.completion.chunk
Yes

chatCompletionStreamResponseDelta

A chat completion delta generated by streamed model responses.

Name Type Description Required Default
content string The contents of the chunk message. No
function_call object Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model. No
tool_calls array No
role enum The role of the author of this message.
Possible values: system, user, assistant, tool
No
refusal string The refusal message generated by the model. No

Properties for function_call

arguments

Name Type Description Default
arguments string The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

name

Name Type Description Default
name string The name of the function to call.

chatCompletionMessageToolCallChunk

Name Type Description Required Default
index integer Yes
id string The ID of the tool call. No
type enum The type of the tool. Currently, only function is supported.
Possible values: function
No
function object No

Properties for function

name

Name Type Description Default
name string The name of the function to call.

arguments

Name Type Description Default
arguments string The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
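
In a streamed response, a tool call typically arrives as several chunks. The sketch below reassembles them, assuming that chunks sharing the same `index` belong to one tool call and that `arguments` fragments should be concatenated in arrival order; the tables above don't spell this out, so treat it as an assumption. `merge_tool_call_chunks` is a hypothetical helper and the chunks are invented.

```python
import json


def merge_tool_call_chunks(chunks):
    """Combine chatCompletionMessageToolCallChunk dicts into complete tool calls."""
    calls = {}
    for chunk in chunks:
        call = calls.setdefault(
            chunk["index"],
            {"id": None, "type": None, "function": {"name": "", "arguments": ""}},
        )
        if chunk.get("id"):
            call["id"] = chunk["id"]
        if chunk.get("type"):
            call["type"] = chunk["type"]
        function = chunk.get("function") or {}
        call["function"]["name"] += function.get("name") or ""
        call["function"]["arguments"] += function.get("arguments") or ""
    return [calls[index] for index in sorted(calls)]


# Invented sample chunks for a single tool call.
chunks = [
    {"index": 0, "id": "call_1", "type": "function",
     "function": {"name": "get_weather", "arguments": ""}},
    {"index": 0, "function": {"arguments": '{"city":'}},
    {"index": 0, "function": {"arguments": ' "Oslo"}'}},
]
merged = merge_tool_call_chunks(chunks)
print(merged[0]["function"]["name"], json.loads(merged[0]["function"]["arguments"]))
```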

chatCompletionStreamOptions

Options for streaming response. Only set this when you set stream: true.

Name Type Description Required Default
include_usage boolean If set, an additional chunk will be streamed before the data: [DONE] message. The usage field on this chunk shows the token usage statistics for the entire request, and the choices field will always be an empty array. All other chunks will also include a usage field, but with a null value.
No
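
The effect of `include_usage` on the chunk sequence can be sketched as follows. The example assumes the stream has already been parsed from its `data:` lines into dictionaries and that each choice carries a `delta` object; both are assumptions about the surrounding client code, and the chunk contents are invented.

```python
def collect_stream(chunks):
    """Return the concatenated content and the usage object, if one was streamed."""
    parts = []
    usage = None
    for chunk in chunks:
        # Ordinary chunks carry usage: null; the extra final chunk carries the totals.
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
        # The final usage chunk has an empty choices array, so this loop skips it.
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts), usage


# Invented chunks, as they might look with stream_options = {"include_usage": true}.
stream = [
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hel"}}], "usage": None},
    {"choices": [{"index": 0, "delta": {"content": "lo"}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 9, "completion_tokens": 2, "total_tokens": 11}},
]
text, usage = collect_stream(stream)
print(text, usage)
```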

chatCompletionChoiceLogProbs

Log probability information for the choice.

Name Type Description Required Default
content array A list of message content tokens with log probability information. Yes
refusal array A list of message refusal tokens with log probability information. No

chatCompletionTokenLogprob

Name Type Description Required Default
token string The token. Yes
logprob number The log probability of this token. Yes
bytes array A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there's no bytes representation for the token. Yes
top_logprobs array List of the most likely tokens and their log probability, at this token position. In rare cases, there may be fewer than the number of requested top_logprobs returned. Yes
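
Two common uses of these fields are converting `logprob` to a probability and combining `bytes` across tokens when a character is split over several tokens. The following sketch shows both; the helper names and the token entries are invented for illustration.

```python
import math


def token_probability(entry):
    """Convert a log probability to a linear probability."""
    return math.exp(entry["logprob"])


def decode_tokens(entries):
    """Join the UTF-8 bytes of consecutive tokens, then decode them as one string."""
    raw = bytearray()
    for entry in entries:
        raw.extend(entry["bytes"] or [])  # bytes can be null for some tokens
    return raw.decode("utf-8")


# Invented entries: the character "é" (bytes 195, 169) is split across two tokens.
entries = [
    {"token": "caf", "logprob": -0.1, "bytes": [99, 97, 102], "top_logprobs": []},
    {"token": "\\xc3", "logprob": -0.5, "bytes": [195], "top_logprobs": []},
    {"token": "\\xa9", "logprob": 0.0, "bytes": [169], "top_logprobs": []},
]
print(decode_tokens(entries))
print([round(token_probability(entry), 3) for entry in entries])
```

Decoding each token's bytes separately would fail here, because neither 195 nor 169 is a complete UTF-8 sequence on its own.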

chatCompletionResponseMessage

A chat completion message generated by the model.

Name Type Description Required Default
role chatCompletionResponseMessageRole The role of the author of the response message. Yes
refusal string The refusal message generated by the model. Yes
content string The contents of the message. Yes
tool_calls array The tool calls generated by the model, such as function calls. No
function_call chatCompletionFunctionCall Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model. No
context azureChatExtensionsMessageContext A representation of the additional context information available when Azure OpenAI chat extensions are involved
in the generation of a corresponding chat completions response. This context information is only populated when
using an Azure OpenAI request configured to use a matching extension.
No

chatCompletionResponseMessageRole

The role of the author of the response message.

Description: The role of the author of the response message.

Type: string

Default:

Enum Values:

  • assistant

chatCompletionToolChoiceOption

Controls which (if any) tool is called by the model. none means the model won't call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

This component can be one of the following:

chatCompletionNamedToolChoice

Specifies a tool the model should use. Use to force the model to call a specific function.

Name Type Description Required Default
type enum The type of the tool. Currently, only function is supported.
Possible values: function
Yes
function object Yes

Properties for function

name

Name Type Description Default
name string The name of the function to call.
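
The forms that `tool_choice` can take, as described under chatCompletionToolChoiceOption, can be sketched in a few lines. The helper names are hypothetical; `default_tool_choice` simply restates the documented defaults and is not something a client needs to send.

```python
def named_tool_choice(function_name):
    """Build a chatCompletionNamedToolChoice that forces a specific function."""
    return {"type": "function", "function": {"name": function_name}}


def default_tool_choice(tools):
    """Mirror the documented default: auto when tools are present, otherwise none."""
    return "auto" if tools else "none"


print(named_tool_choice("my_function"))
print(default_tool_choice([]))
print(default_tool_choice([{"type": "function", "function": {"name": "my_function"}}]))
```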

ParallelToolCalls

Whether to enable parallel function calling during tool use.

No properties defined for this component.

chatCompletionMessageToolCalls

The tool calls generated by the model, such as function calls.

No properties defined for this component.

chatCompletionFunctionCall

Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model.

Name Type Description Required Default
name string The name of the function to call. Yes
arguments string The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function. Yes
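
Because the arguments string is not guaranteed to be valid JSON or to match your schema, a defensive parse step is usually worthwhile. The following is a minimal sketch; `parse_function_arguments` and the `get_weather` payload are illustrative names, not part of the API.

```python
import json
from typing import Optional


def parse_function_arguments(raw: str, allowed: set) -> Optional[dict]:
    """Parse model-generated arguments; return None if they are unusable."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        return None  # the model produced invalid JSON
    if not isinstance(args, dict):
        return None  # arguments are expected to be a JSON object
    if set(args) - allowed:
        return None  # the model invented parameters not in the schema
    return args


# Hypothetical payload shaped like chatCompletionFunctionCall.
call = {"name": "get_weather", "arguments": '{"city": "Oslo"}'}
print(parse_function_arguments(call["arguments"], {"city", "unit"}))
```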

completionUsage

Usage statistics for the completion request.

Name Type Description Required Default
prompt_tokens integer Number of tokens in the prompt. Yes
completion_tokens integer Number of tokens in the generated completion. Yes
total_tokens integer Total number of tokens used in the request (prompt + completion). Yes
prompt_tokens_details object Details of the prompt tokens. No
completion_tokens_details object Breakdown of tokens used in a completion. No

Properties for prompt_tokens_details

cached_tokens

Name Type Description Default
cached_tokens integer The number of cached prompt tokens.

Properties for completion_tokens_details

reasoning_tokens

Name Type Description Default
reasoning_tokens integer Tokens generated by the model for reasoning.
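
To show how these fields relate, here is a small sketch that derives a few figures from a usage object. It assumes that cached_tokens is a subset of prompt_tokens and that reasoning_tokens is counted within completion_tokens; the tables above do not state this explicitly. The function name and sample numbers are made up.

```python
def summarize_usage(usage: dict) -> dict:
    """Derive a few convenience figures from a completionUsage-shaped dict."""
    prompt_details = usage.get("prompt_tokens_details") or {}
    completion_details = usage.get("completion_tokens_details") or {}
    cached = prompt_details.get("cached_tokens", 0)
    reasoning = completion_details.get("reasoning_tokens", 0)
    return {
        # Assumption: cached tokens are part of prompt_tokens.
        "uncached_prompt_tokens": usage["prompt_tokens"] - cached,
        # Assumption: reasoning tokens are part of completion_tokens.
        "non_reasoning_completion_tokens": usage["completion_tokens"] - reasoning,
        "total_is_consistent": usage["total_tokens"]
        == usage["prompt_tokens"] + usage["completion_tokens"],
    }


sample = {
    "prompt_tokens": 120,
    "completion_tokens": 80,
    "total_tokens": 200,
    "prompt_tokens_details": {"cached_tokens": 100},
    "completion_tokens_details": {"reasoning_tokens": 30},
}
print(summarize_usage(sample))
```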

chatCompletionTool

Name Type Description Required Default
type enum The type of the tool. Currently, only function is supported. Possible values: function Yes
function FunctionObject Yes

FunctionParameters

The parameters the function accepts, described as a JSON Schema object. See the guide (https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the JSON Schema reference for documentation about the format.

Omitting parameters defines a function with an empty parameter list.

No properties defined for this component.

FunctionObject

Name Type Description Required Default
description string A description of what the function does, used by the model to choose when and how to call the function. No
name string The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. Yes
parameters FunctionParameters The parameters the function accepts, described as a JSON Schema object. See the guide (https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the JSON Schema reference for documentation about the format. Omitting parameters defines a function with an empty parameter list. No
strict boolean Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. No False
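
A chatCompletionTool wrapping a FunctionObject might be assembled as sketched below. The helper `make_function_tool` and the `get_weather` example are hypothetical; the regular expression is one reading of the name rule above (letters, digits, underscores, and dashes, up to 64 characters).

```python
import re
from typing import Optional

# One interpretation of the naming rule in the table above.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def make_function_tool(
    name: str,
    description: str = "",
    parameters: Optional[dict] = None,
    strict: bool = False,
) -> dict:
    """Build a chatCompletionTool-shaped dict (hypothetical helper)."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid function name: {name!r}")
    function = {"name": name, "strict": strict}
    if description:
        function["description"] = description
    if parameters is not None:
        # Omitting parameters defines a function with an empty parameter list.
        function["parameters"] = parameters
    return {"type": "function", "function": function}


tool = make_function_tool(
    "get_weather",
    description="Look up the current weather for a city.",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
print(tool)
```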

ResponseFormatText

Name Type Description Required Default
type enum The type of response format being defined: text. Possible values: text Yes

ResponseFormatJsonObject

Name Type Description Required Default
type enum The type of response format being defined: json_object. Possible values: json_object Yes

ResponseFormatJsonSchemaSchema

The schema for the response format, described as a JSON Schema object.

No properties defined for this component.

ResponseFormatJsonSchema

Name Type Description Required Default
type enum The type of response format being defined: json_schema. Possible values: json_schema Yes
json_schema object Yes

Properties for json_schema

description

Name Type Description Default
description string A description of what the response format is for, used by the model to determine how to respond in the format.

name

Name Type Description Default
name string The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

schema

Name Type Description Default
schema ResponseFormatJsonSchemaSchema The schema for the response format, described as a JSON Schema object.

strict

Name Type Description Default
strict boolean Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. False
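
Putting the json_schema properties together, a response_format value could look like the sketch below. The schema name `weather_report` and its fields are invented for illustration; which JSON Schema keywords are accepted when strict is true is not covered here.

```python
import json

# Illustrative response_format value following ResponseFormatJsonSchema.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",  # hypothetical name
        "description": "A short structured weather report.",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temp_c": {"type": "number"},
            },
            "required": ["city", "temp_c"],
            "additionalProperties": False,
        },
    },
}

print(json.dumps(response_format, indent=2))
```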

chatCompletionChoiceCommon

Name Type Description Required Default
index integer No
finish_reason string No

createTranslationRequest

Translation request.

Name Type Description Required Default
file string The audio file to translate. Yes
prompt string An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English. No
response_format audioResponseFormat Defines the format of the output. No
temperature number The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. No 0

audioResponse

Translation or transcription response when response_format was json

Name Type Description Required Default
text string Translated or transcribed text. Yes

audioVerboseResponse

Translation or transcription response when response_format was verbose_json

Name Type Description Required Default
text string Translated or transcribed text. Yes
task string Type of audio task. No
language string Language. No
duration number Duration. No
segments array No
words array No

audioResponseFormat

Defines the format of the output.

Type: string

Enum Values:

  • json
  • text
  • srt
  • verbose_json
  • vtt

createTranscriptionRequest

Transcription request.

Name Type Description Required Default
file string The audio file object to transcribe. Yes
prompt string An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language. No
response_format audioResponseFormat Defines the format of the output. No
temperature number The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. No 0
language string The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency. No
timestamp_granularities[] array The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: word and segment. Note: There's no additional latency for segment timestamps, but generating word timestamps incurs additional latency. No ['segment']
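
The dependency between timestamp_granularities[] and response_format is easy to miss, so here is a sketch that checks it while assembling the non-file fields. It assumes the request is sent as form fields, which this section does not state; `transcription_form_fields` is a hypothetical helper.

```python
from typing import List, Optional, Tuple

GRANULARITIES = {"word", "segment"}


def transcription_form_fields(
    response_format: str = "json",
    granularities: Optional[List[str]] = None,
    language: Optional[str] = None,
    temperature: float = 0.0,
) -> List[Tuple[str, str]]:
    """Return (name, value) pairs for the non-file fields (hypothetical helper)."""
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    fields = [("response_format", response_format), ("temperature", str(temperature))]
    if language:
        fields.append(("language", language))
    for granularity in granularities or []:
        if granularity not in GRANULARITIES:
            raise ValueError(f"unsupported granularity: {granularity}")
        if response_format != "verbose_json":
            raise ValueError("timestamp granularities need response_format=verbose_json")
        # Assumption: the array is expressed by repeating the field name.
        fields.append(("timestamp_granularities[]", granularity))
    return fields


print(transcription_form_fields("verbose_json", ["word", "segment"], language="en"))
```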

audioSegment

Transcription or translation segment.

Name Type Description Required Default
id integer Segment identifier. No
seek number Offset of the segment. No
start number Segment start offset. No
end number Segment end offset. No
text string Segment text. No
tokens array Tokens of the text. No
temperature number Temperature. No
avg_logprob number Average log probability. No
compression_ratio number Compression ratio. No
no_speech_prob number Probability of 'no speech'. No

audioWord

Transcription or translation word.

Name Type Description Required Default
word string Word No
start number Word start offset. No
end number Word end offset. No

createSpeechRequest

Speech request.

Name Type Description Required Default
input string The text to synthesize audio for. The maximum length is 4,096 characters. Yes
voice enum The voice to use for speech synthesis. Possible values: alloy, echo, fable, onyx, nova, shimmer Yes
response_format enum The format to synthesize the audio in. Possible values: mp3, opus, aac, flac, wav, pcm No
speed number The speed of the synthesized audio. Select a value from 0.25 to 4.0. 1.0 is the default. No 1.0
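
The constraints in this table can be checked client-side before sending a request. The sketch below does so in Python; `make_speech_request` is a hypothetical helper, and the limits are copied from the table above.

```python
from typing import Optional

VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def make_speech_request(
    text: str,
    voice: str,
    response_format: Optional[str] = None,
    speed: float = 1.0,
) -> dict:
    """Validate inputs and build a createSpeechRequest-shaped body."""
    if len(text) > 4096:
        raise ValueError("input exceeds 4,096 characters")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice}")
    if response_format is not None and response_format not in FORMATS:
        raise ValueError(f"unknown response_format: {response_format}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    body = {"input": text, "voice": voice, "speed": speed}
    if response_format is not None:
        body["response_format"] = response_format
    return body


print(make_speech_request("Hello there.", "alloy", response_format="mp3"))
```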

imageQuality

The quality of the image that will be generated.

Type: string

Default: standard

Enum Name: Quality

Enum Values:

Value Description
standard Standard quality creates images with standard quality.
hd HD quality creates images with finer details and greater consistency across the image.

imagesResponseFormat

The format in which the generated images are returned.

Type: string

Default: url

Enum Name: ImagesResponseFormat

Enum Values:

Value Description
url The URL that provides temporary access to download the generated images.
b64_json The generated images are returned as a base64-encoded string.

imageSize

The size of the generated images.

Type: string

Default: 1024x1024

Enum Name: Size

Enum Values:

Value Description
256x256 The desired size of the generated image is 256x256 pixels. Only supported for dall-e-2.
512x512 The desired size of the generated image is 512x512 pixels. Only supported for dall-e-2.
1792x1024 The desired size of the generated image is 1792x1024 pixels. Only supported for dall-e-3.
1024x1792 The desired size of the generated image is 1024x1792 pixels. Only supported for dall-e-3.
1024x1024 The desired size of the generated image is 1024x1024 pixels.

imageStyle

The style of the generated images.

Type: string

Default: vivid

Enum Name: Style

Enum Values:

Value Description
vivid Vivid creates images that are hyper-realistic and dramatic.
natural Natural creates images that are more natural and less hyper-realistic.

imageGenerationsRequest

Name Type Description Required Default
prompt string A text description of the desired image(s). The maximum length is 4,000 characters. Yes
n integer The number of images to generate. No 1
size imageSize The size of the generated images. No 1024x1024
response_format imagesResponseFormat The format in which the generated images are returned. No url
user string A unique identifier representing your end-user, which can help to monitor and detect abuse. No
quality imageQuality The quality of the image that will be generated. No standard
style imageStyle The style of the generated images. No vivid
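
Since some sizes are tied to a particular model, a client may want to validate the combination up front. In the sketch below, `model_family` is a hypothetical local argument (it is not a field of imageGenerationsRequest), and treating 1024x1024 as valid for both models is an assumption based on the absence of a stated restriction.

```python
# Size support as listed in the imageSize table above.
SIZE_SUPPORT = {
    "256x256": {"dall-e-2"},
    "512x512": {"dall-e-2"},
    "1792x1024": {"dall-e-3"},
    "1024x1792": {"dall-e-3"},
    "1024x1024": {"dall-e-2", "dall-e-3"},  # assumption: no restriction is listed
}


def make_image_request(
    prompt: str,
    model_family: str,
    size: str = "1024x1024",
    n: int = 1,
    quality: str = "standard",
    style: str = "vivid",
) -> dict:
    """Validate inputs and build an imageGenerationsRequest-shaped body."""
    if len(prompt) > 4000:
        raise ValueError("prompt exceeds 4,000 characters")
    if model_family not in SIZE_SUPPORT.get(size, set()):
        raise ValueError(f"size {size} is not listed for {model_family}")
    return {"prompt": prompt, "n": n, "size": size, "quality": quality, "style": style}


print(make_image_request("A lighthouse at dusk", "dall-e-3", size="1792x1024"))
```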

generateImagesResponse

Name Type Description Required Default
created integer The unix timestamp when the operation was created. Yes
data array The result data of the operation, if successful Yes

imageResult

The image url or encoded image if successful, and an error otherwise.

Name Type Description Required Default
url string The image url. No
b64_json string The base64 encoded image No
content_filter_results dalleContentFilterResults Information about the content filtering results. No
revised_prompt string The prompt that was used to generate the image, if there was any revision to the prompt. No
prompt_filter_results dalleFilterResults Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Also includes information about jailbreak content and profanity, if detected, and whether it has been filtered, plus information about the customer blocklist, if it has been filtered, and its id. No

line

A content line object consisting of an adjacent sequence of content elements, such as words and selection marks.

Name Type Description Required Default
text string Yes
spans array An array of spans that represent detected objects and their bounding box information. Yes

span

A span object that represents a detected object and its bounding box information.

Name Type Description Required Default
text string The text content of the span that represents the detected object. Yes
offset integer The character offset within the text where the span begins. This offset is defined as the position of the first character of the span, counting from the start of the text as Unicode codepoints. Yes
length integer The length of the span in characters, measured in Unicode codepoints. Yes
polygon array An array of objects representing points in the polygon that encloses the detected object. Yes
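
Because offset and length are measured in Unicode codepoints, recovering a span's text is a plain slice in languages whose strings index by codepoint, such as Python. The sketch below assumes the offset is relative to the text string passed in; the sample line and span are invented.

```python
def span_text(text: str, span: dict) -> str:
    """Slice the span out of text using codepoint offset and length."""
    start = span["offset"]
    return text[start:start + span["length"]]


# The leading emoji is a single codepoint, so Python counts it as one character.
line_text = "\U0001F600 Total 42"
span = {"text": "42", "offset": 8, "length": 2, "polygon": []}

print(span_text(line_text, span))  # 42
```

In languages whose strings are indexed by UTF-16 code units or bytes, the same offsets would need converting first.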

runCompletionUsage

Usage statistics related to the run. This value will be null if the run isn't in a terminal state (e.g. in_progress, queued, etc.).

Name Type Description Required Default
completion_tokens integer Number of completion tokens used over the course of the run. Yes
prompt_tokens integer Number of prompt tokens used over the course of the run. Yes
total_tokens integer Total number of tokens used (prompt + completion). Yes

runStepCompletionUsage

Usage statistics related to the run step. This value will be null while the run step's status is in_progress.

Name Type Description Required Default
completion_tokens integer Number of completion tokens used over the course of the run step. Yes
prompt_tokens integer Number of prompt tokens used over the course of the run step. Yes
total_tokens integer Total number of tokens used (prompt + completion). Yes

assistantsApiResponseFormatOption

Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.

This component can be one of the following:

assistantsApiResponseFormat

An object describing the expected output of the model. If json_object, only function type tools are allowed to be passed to the Run. If text, the model can return text or any value needed.

Name Type Description Required Default
type string Must be one of text or json_object. No text

type Enum: AssistantsApiResponseFormat

Value Description
text
json_object
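
The JSON mode caveats above suggest two client-side checks: confirm that a system or user message actually asks for JSON, and treat output with finish_reason="length" as possibly truncated. A minimal sketch follows; both helper names are hypothetical, and the substring test for "json" is a deliberately crude heuristic.

```python
import json
from typing import Any, Optional


def mentions_json(messages: list) -> bool:
    """True if any system or user message mentions JSON (crude heuristic)."""
    return any(
        m.get("role") in ("system", "user")
        and "json" in str(m.get("content", "")).lower()
        for m in messages
    )


def parse_json_reply(content: str, finish_reason: str) -> Optional[Any]:
    """Parse a JSON-mode reply, returning None if it may be cut off or invalid."""
    if finish_reason == "length":
        return None  # the message content may be partially cut off
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        return None


messages = [{"role": "system", "content": "Reply with a JSON object."}]
print(mentions_json(messages))
print(parse_json_reply('{"ok": true}', "stop"))
```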

assistantObject

Represents an assistant that can call the model and use tools.

Name Type Description Required Default
id string The identifier, which can be referenced in API endpoints. Yes
object string The object type, which is always assistant. Yes
created_at integer The Unix timestamp (in seconds) for when the assistant was created. Yes
name string The name of the assistant. The maximum length is 256 characters.
Yes
description string The description of the assistant. The maximum length is 512 characters.
Yes
model string ID of the model to use. You can use the List models API to see all of your available models, or see our Model overview for descriptions of them.
Yes
instructions string The system instructions that the assistant uses. The maximum length is 256,000 characters.
Yes
tools array A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, file_search, or function. Yes []
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. Yes
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
[]

file_search

Name Type Description Default
vector_store_ids array The ID of the vector store attached to this assistant. There can be a maximum of one vector store attached to the assistant.

object Enum: AssistantObjectType

Value Description
assistant The object type, which is always assistant

createAssistantRequest

Name Type Description Required Default
model Yes
name string The name of the assistant. The maximum length is 256 characters.
No
description string The description of the assistant. The maximum length is 512 characters.
No
instructions string The system instructions that the assistant uses. The maximum length is 256,000 characters.
No
tools array A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, file_search, or function. No []
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
[]

file_search

Name Type Description Default
vector_store_ids array The vector store attached to this assistant. There can be a maximum of one vector store attached to the assistant.
vector_stores array A helper to create a vector store with file_ids and attach it to this assistant. There can be a maximum of one vector store attached to the assistant.
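
The limits listed for createAssistantRequest can be checked before sending a request. The sketch below collects violations rather than raising; `validate_assistant_request` and the deployment name `my-gpt-4o-deployment` are hypothetical, and the numbers are taken from the table above.

```python
def validate_assistant_request(body: dict) -> list:
    """Return a list of problems found in a createAssistantRequest-shaped dict."""
    problems = []
    if not body.get("model"):
        problems.append("model is required")
    if len(body.get("name") or "") > 256:
        problems.append("name exceeds 256 characters")
    if len(body.get("description") or "") > 512:
        problems.append("description exceeds 512 characters")
    if len(body.get("instructions") or "") > 256000:
        problems.append("instructions exceed 256,000 characters")
    if len(body.get("tools") or []) > 128:
        problems.append("more than 128 tools")
    metadata = body.get("metadata") or {}
    if len(metadata) > 16:
        problems.append("more than 16 metadata pairs")
    for key, value in metadata.items():
        if len(key) > 64 or len(str(value)) > 512:
            problems.append(f"metadata entry too long: {key[:20]}")
    return problems


request = {
    "model": "my-gpt-4o-deployment",  # hypothetical deployment name
    "name": "Report helper",
    "tools": [{"type": "code_interpreter"}],
    "metadata": {"team": "docs"},
}
print(validate_assistant_request(request))
```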

modifyAssistantRequest

Name Type Description Required Default
model No
name string The name of the assistant. The maximum length is 256 characters.
No
description string The description of the assistant. The maximum length is 512 characters.
No
instructions string The system instructions that the assistant uses. The maximum length is 32,768 characters.
No
tools array A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, file_search, or function. No []
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array Overrides the list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
[]

file_search

Name Type Description Default
vector_store_ids array Overrides the vector store attached to this assistant. There can be a maximum of one vector store attached to the assistant.

deleteAssistantResponse

Name Type Description Required Default
id string Yes
deleted boolean Yes
object string Yes

object Enum: DeleteAssistantResponseState

Value Description
assistant.deleted

listAssistantsResponse

Name Type Description Required Default
object string Yes
data array Yes
first_id string Yes
last_id string Yes
has_more boolean Yes

assistantToolsCode

Name Type Description Required Default
type string The type of tool being defined: code_interpreter Yes

type Enum: assistantToolsCodeType

Value Description
code_interpreter

assistantToolsFileSearch

Name Type Description Required Default
type string The type of tool being defined: file_search Yes
file_search object Overrides for the file search tool. No

max_num_results

Name Type Description Default
max_num_results integer The maximum number of results the file search tool should output. The default is 20 for gpt-4* models and 5 for gpt-3.5-turbo. This number should be between 1 and 50 inclusive.

Note that the file search tool may output fewer than max_num_results results.

type Enum: assistantToolsFileSearchType

Value Description
file_search

assistantToolsFileSearchTypeOnly

Name Type Description Required Default
type string The type of tool being defined: file_search Yes

type Enum: assistantToolsFileSearchType

Value Description
file_search

assistantToolsFunction

Name Type Description Required Default
type string The type of tool being defined: function Yes
function object The function definition. Yes

Properties for function

description

Name Type Description Default
description string A description of what the function does, used by the model to choose when and how to call the function.

name

Name Type Description Default
name string The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

parameters

Name Type Description Default
parameters chatCompletionFunctionParameters The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.

type Enum: assistantToolsFunction

Value Description
function

truncationObject

Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.

Name Type Description Required Default
type string The truncation strategy to use for the thread. The default is auto. If set to last_messages, the thread will be truncated to the n most recent messages in the thread. When set to auto, messages in the middle of the thread will be dropped to fit the context length of the model, max_prompt_tokens. Yes
last_messages integer The number of most recent messages from the thread when constructing the context for the run. No

type Enum: TruncationType

Value Description
auto
last_messages
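
The last_messages strategy can be pictured as keeping the tail of the thread. The sketch below models only that case on a local list; the auto strategy is applied by the service and is not reproduced here, so the function returns the thread unchanged for it. `apply_truncation` is a hypothetical name.

```python
def apply_truncation(messages: list, strategy: dict) -> list:
    """Illustrate the last_messages strategy on a local list of messages."""
    if strategy.get("type") == "last_messages":
        n = strategy["last_messages"]
        # messages[-0:] would return everything, so handle n <= 0 explicitly.
        return messages[-n:] if n > 0 else []
    # auto: the service decides what to drop; this sketch does not model it.
    return list(messages)


thread = ["m1", "m2", "m3", "m4"]
print(apply_truncation(thread, {"type": "last_messages", "last_messages": 2}))
```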

assistantsApiToolChoiceOption

Controls which (if any) tool is called by the model. none means the model won't call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.

This component can be one of the following:

assistantsNamedToolChoice

Specifies a tool the model should use. Use to force the model to call a specific tool.

Name Type Description Required Default
type string The type of the tool. If type is function, the function name must be set Yes
function object No

Properties for function

name

Name Type Description Default
name string The name of the function to call.

type Enum: AssistantsNamedToolChoiceType

Value Description
function
code_interpreter
file_search

runObject

Represents an execution run on a thread.

Name Type Description Required Default
id string The identifier, which can be referenced in API endpoints. Yes
object string The object type, which is always thread.run. Yes
created_at integer The Unix timestamp (in seconds) for when the run was created. Yes
thread_id string The ID of the thread that was executed on as a part of this run. Yes
assistant_id string The ID of the assistant used for execution of this run. Yes
status string The status of the run, which can be one of queued, in_progress, requires_action, cancelling, cancelled, failed, completed, or expired. Yes
required_action object Details on the action required to continue the run. Will be null if no action is required. Yes
last_error object The last error associated with this run. Will be null if there are no errors. Yes
expires_at integer The Unix timestamp (in seconds) for when the run will expire. Yes
started_at integer The Unix timestamp (in seconds) for when the run was started. Yes
cancelled_at integer The Unix timestamp (in seconds) for when the run was cancelled. Yes
failed_at integer The Unix timestamp (in seconds) for when the run failed. Yes
completed_at integer The Unix timestamp (in seconds) for when the run was completed. Yes
incomplete_details object Details on why the run is incomplete. Will be null if the run isn't incomplete. Yes
model string The model that the assistant used for this run. Yes
instructions string The instructions that the assistant used for this run. Yes
tools array The list of tools that the assistant used for this run. Yes []
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. Yes
usage runCompletionUsage Usage statistics related to the run. This value will be null if the run isn't in a terminal state (e.g. in_progress, queued, etc.). Yes
temperature number The sampling temperature used for this run. If not set, defaults to 1. No
top_p number The nucleus sampling value used for this run. If not set, defaults to 1. No
max_prompt_tokens integer The maximum number of prompt tokens specified to have been used over the course of the run.
Yes
max_completion_tokens integer The maximum number of completion tokens specified to have been used over the course of the run.
Yes
truncation_strategy truncationObject Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. Yes
tool_choice assistantsApiToolChoiceOption Controls which (if any) tool is called by the model.
none means the model won't call any tools and instead generates a message.
auto is the default value and means the model can pick between generating a message or calling a tool.
Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
Yes
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
Yes

Properties for required_action

type

Name Type Description Default
type string For now, this is always submit_tool_outputs.

submit_tool_outputs

Name Type Description Default
tool_calls array A list of the relevant tool calls.

Properties for last_error

code

Name Type Description Default
code string One of server_error or rate_limit_exceeded.

message

Name Type Description Default
message string A human-readable description of the error.

Properties for incomplete_details

reason

Name Type Description Default
reason string The reason why the run is incomplete. This will point to which specific token limit was reached over the course of the run.

object Enum: runObjectType

Value Description
thread.run The run object type which is always thread.run

status Enum: RunObjectStatus

Value Description
queued The queued state
in_progress The in_progress state
requires_action The requires_action state
cancelling The cancelling state
cancelled The cancelled state
failed The failed state
completed The completed state
expired The expired state
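
A client that polls a run usually needs to map each status to a next step. The following is a minimal sketch, not part of the API: the function name `next_action` and the split into terminal and non-terminal states are assumptions made for illustration. The `incomplete` status does not appear in the enum table above but is described under `max_prompt_tokens`, so it is treated here as terminal.

```python
# Statuses listed in the RunObjectStatus table above.
RUN_STATUSES = {
    "queued", "in_progress", "requires_action", "cancelling",
    "cancelled", "failed", "completed", "expired",
}

# Assumption: a run in one of these states will not change any further.
TERMINAL_STATUSES = {"cancelled", "failed", "completed", "expired", "incomplete"}


def next_action(status: str) -> str:
    """Return what a polling client should do for a given run status."""
    if status in TERMINAL_STATUSES:
        return "stop"
    if status == "requires_action":
        return "submit_tool_outputs"
    if status in RUN_STATUSES:
        return "wait"
    raise ValueError(f"unknown run status: {status!r}")
```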

createRunRequest

Name Type Description Required Default
assistant_id string The ID of the assistant to use to execute this run. Yes
model string The ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. No
instructions string Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. No
additional_instructions string Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions. No
additional_messages array Adds additional messages to the thread before creating the run. No
tools array Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
stream boolean If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message.
No
max_prompt_tokens integer The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete. See incomplete_details for more info.
No
max_completion_tokens integer The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete. See incomplete_details for more info.
No
truncation_strategy truncationObject Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. No
tool_choice assistantsApiToolChoiceOption Controls which (if any) tool is called by the model.
none means the model won't call any tools and instead generates a message.
auto is the default value and means the model can pick between generating a message or calling a tool.
Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
No
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
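
The limits in the createRunRequest table can be checked on the client before a request is sent. The sketch below builds a request body as a plain dictionary and applies only the limits stated above (temperature between 0 and 2, and the metadata limits of 16 pairs, 64-character keys, and 512-character values). The helper names `validate_metadata` and `build_create_run_request` are hypothetical, and the service remains the authority on what is accepted.

```python
from typing import Any, Optional


def validate_metadata(metadata: dict[str, str]) -> None:
    """Check the documented metadata limits."""
    if len(metadata) > 16:
        raise ValueError("metadata allows at most 16 key-value pairs")
    for key, value in metadata.items():
        if len(key) > 64:
            raise ValueError(f"metadata key too long: {key!r}")
        if len(value) > 512:
            raise ValueError(f"metadata value too long for key {key!r}")


def build_create_run_request(
    assistant_id: str,
    *,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    max_prompt_tokens: Optional[int] = None,
    max_completion_tokens: Optional[int] = None,
    metadata: Optional[dict[str, str]] = None,
) -> dict[str, Any]:
    """Build a createRunRequest body, omitting fields left unset."""
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if metadata is not None:
        validate_metadata(metadata)
    body = {
        "assistant_id": assistant_id,
        "temperature": temperature,
        "top_p": top_p,
        "max_prompt_tokens": max_prompt_tokens,
        "max_completion_tokens": max_completion_tokens,
        "metadata": metadata,
    }
    # Unset fields are dropped so that the service defaults apply.
    return {key: value for key, value in body.items() if value is not None}
```

Only `assistant_id` is required, so every other field is dropped when it is not set.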

listRunsResponse

Name Type Description Required Default
object string Yes
data array Yes
first_id string Yes
last_id string Yes
has_more boolean Yes
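
The `has_more` and `last_id` fields support cursor-style paging. The sketch below walks every page of a list response. It assumes that the caller supplies a `fetch_page` function (hypothetical) that takes the previous page's `last_id` as a cursor and returns the next list response as a dictionary. How the cursor is passed to the service, for example as an `after` query parameter, is an assumption and is left to that function.

```python
from typing import Any, Callable, Iterator, Optional

Page = dict[str, Any]


def iter_items(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Any]:
    """Yield every item across all pages of a list response."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        if not page["has_more"]:
            return
        # The last ID of this page becomes the cursor for the next request.
        cursor = page["last_id"]
```

The same loop applies to the other list responses in this reference that share these fields.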

modifyRunRequest

Name Type Description Required Default
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No

submitToolOutputsRunRequest

Name Type Description Required Default
tool_outputs array A list of tools for which the outputs are being submitted. Yes
stream boolean If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message.
No
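
When `stream` is true, the response is a sequence of server-sent events that ends with `data: [DONE]`. The sketch below parses such a stream from an iterable of text lines. It follows the general server-sent events line format (`event:` and `data:` fields, with a blank line ending each event) and assumes that each `data` payload other than `[DONE]` is JSON. The function name `iter_sse_events` and the event name used in the test are illustrative only.

```python
import json
from typing import Any, Iterable, Iterator, Optional


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[Optional[str], Any]]:
    """Yield (event_name, parsed_data) pairs until the [DONE] sentinel."""
    event: Optional[str] = None
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                return  # End-of-stream sentinel described above.
            data_parts.append(payload)
        elif line == "" and data_parts:
            # A blank line terminates the current event.
            yield event, json.loads("\n".join(data_parts))
            event, data_parts = None, []
```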

runToolCallObject

Tool call objects

Name Type Description Required Default
id string The ID of the tool call. This ID must be referenced when you submit the tool outputs using the Submit tool outputs to run endpoint. Yes
type string The type of tool call the output is required for. For now, this is always function. Yes
function object The function definition. Yes

Properties for function

name

Name Type Description Default
name string The name of the function.

arguments

Name Type Description Default
arguments string The arguments that the model expects you to pass to the function.

type Enum: RunToolCallObjectType

Value Description
function
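
Because `arguments` is delivered as a string, a client typically decodes it, calls the named function, and returns the result keyed by the tool call ID. The sketch below shows one way to do that. It assumes that `arguments` holds a JSON object and that each entry in `tool_outputs` has the shape `{"tool_call_id": ..., "output": ...}`. That shape is not spelled out in this section, so treat it as an assumption. The `handlers` mapping and the function name `build_tool_outputs` are hypothetical.

```python
import json
from typing import Any, Callable


def build_tool_outputs(
    tool_calls: list[dict[str, Any]],
    handlers: dict[str, Callable[..., Any]],
) -> dict[str, Any]:
    """Run local handlers for each tool call and build a submit request body."""
    outputs = []
    for call in tool_calls:
        if call["type"] != "function":
            raise ValueError(f"unsupported tool call type: {call['type']!r}")
        function = call["function"]
        # Assumption: arguments is a JSON-encoded object of keyword arguments.
        kwargs = json.loads(function["arguments"] or "{}")
        result = handlers[function["name"]](**kwargs)
        outputs.append({"tool_call_id": call["id"], "output": json.dumps(result)})
    return {"tool_outputs": outputs}
```

The model produces the arguments, so validate them before calling a handler with side effects.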

createThreadAndRunRequest

Name Type Description Required Default
assistant_id string The ID of the assistant to use to execute this run. Yes
thread createThreadRequest No
model string The ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. No
instructions string Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. No
tools array Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. No
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
stream boolean If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message.
No
stream_options chatCompletionStreamOptions Options for streaming response. Only set this when you set stream: true.
No None
max_prompt_tokens integer The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete. See incomplete_details for more info.
No
max_completion_tokens integer The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete. See incomplete_details for more info.
No
truncation_strategy truncationObject Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. No
tool_choice assistantsApiToolChoiceOption Controls which (if any) tool is called by the model.
none means the model won't call any tools and instead generates a message.
auto is the default value and means the model can pick between generating a message or calling a tool.
Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
No
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
[]

file_search

Name Type Description Default
vector_store_ids array The ID of the vector store attached to this assistant. There can be a maximum of one vector store attached to the assistant.

threadObject

Represents a thread that contains messages.

Name Type Description Required Default
id string The identifier, which can be referenced in API endpoints. Yes
object string The object type, which is always thread. Yes
created_at integer The Unix timestamp (in seconds) for when the thread was created. Yes
tool_resources object A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
Yes
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
Yes

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
[]

file_search

Name Type Description Default
vector_store_ids array The vector store attached to this thread. There can be a maximum of one vector store attached to the thread.

object Enum: ThreadObjectType

Value Description
thread The type of thread object which is always thread

createThreadRequest

Name Type Description Required Default
messages array A list of messages to start the thread with. No
tool_resources object A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
[]

file_search

Name Type Description Default
vector_store_ids array The vector store attached to this thread. There can be a maximum of one vector store attached to the thread.
vector_stores array A helper to create a vector store with file_ids and attach it to this thread. There can be a maximum of one vector store attached to the thread.
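
The `tool_resources` limits above (at most 20 files for `code_interpreter` and at most one vector store for `file_search`) are easy to check before a thread is created. The following sketch does that. The function name `build_tool_resources` is hypothetical, and the IDs in the usage line are placeholders.

```python
from typing import Any, Sequence


def build_tool_resources(
    file_ids: Sequence[str] = (),
    vector_store_ids: Sequence[str] = (),
) -> dict[str, Any]:
    """Build a tool_resources object, enforcing the documented limits."""
    if len(file_ids) > 20:
        raise ValueError("code_interpreter accepts at most 20 file IDs")
    if len(vector_store_ids) > 1:
        raise ValueError("file_search accepts at most one vector store")
    resources: dict[str, Any] = {}
    if file_ids:
        resources["code_interpreter"] = {"file_ids": list(file_ids)}
    if vector_store_ids:
        resources["file_search"] = {"vector_store_ids": list(vector_store_ids)}
    return resources


# Example createThreadRequest body with placeholder IDs.
thread_request = {"tool_resources": build_tool_resources(["file-1"], ["vs-1"])}
```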

modifyThreadRequest

Name Type Description Required Default
tool_resources object A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No

Properties for tool_resources

code_interpreter

Name Type Description Default
file_ids array A list of File IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
[]

file_search

Name Type Description Default
vector_store_ids array The vector store attached to this thread. There can be a maximum of one vector store attached to the thread.

deleteThreadResponse

Name Type Description Required Default
id string Yes
deleted boolean Yes
object string Yes

object Enum: DeleteThreadResponseObjectState

Value Description
thread.deleted The delete thread response object state which is thread.deleted

listThreadsResponse

Name Type Description Required Default
object string Yes
data array Yes
first_id string Yes
last_id string Yes
has_more boolean Yes

messageObject

Represents a message within a thread.

Name Type Description Required Default
id string The identifier, which can be referenced in API endpoints. Yes
object string The object type, which is always thread.message. Yes
created_at integer The Unix timestamp (in seconds) for when the message was created. Yes
thread_id string The thread ID that this message belongs to. Yes
status string The status of the message, which can be either in_progress, incomplete, or completed. Yes
incomplete_details object On an incomplete message, details about why the message is incomplete. Yes
completed_at integer The Unix timestamp (in seconds) for when the message was completed. Yes
incomplete_at integer The Unix timestamp (in seconds) for when the message was marked as incomplete. Yes
role string The entity that produced the message. One of user or assistant. Yes
content array The content of the message, as an array of text and/or images. Yes
assistant_id string If applicable, the ID of the assistant that authored this message. Yes
run_id string If applicable, the ID of the run associated with the authoring of this message. Yes
attachments array A list of files attached to the message, and the tools they were added to. Yes
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
Yes

Properties for incomplete_details

reason

Name Type Description Default
reason string The reason the message is incomplete.

object Enum: MessageObjectType

Value Description
thread.message The message object type which is thread.message

status Enum: MessageObjectStatus

Value Description
in_progress
incomplete
completed

role Enum: MessageObjectRole

Value Description
user
assistant

messageDeltaObject

Represents a message delta i.e. any changed fields on a message during streaming.

Name Type Description Required Default
id string The identifier of the message, which can be referenced in API endpoints. Yes
object string The object type, which is always thread.message.delta. Yes
delta object The delta containing the fields that have changed on the Message. Yes

Properties for delta

role

Name Type Description Default
role string The entity that produced the message. One of user or assistant.

content

Name Type Description Default
content array The content of the message, as an array of text and/or images.

object Enum: MessageDeltaObjectType

Value Description
thread.message.delta

createMessageRequest

Name Type Description Required Default
role string The role of the entity that is creating the message. Allowed values include:
- user: Indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages.
- assistant: Indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation.
Yes
content string The content of the message. Yes
attachments array A list of files attached to the message, and the tools they should be added to. No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No

role Enum: CreateMessageRequestRole

Value Description
user
assistant

modifyMessageRequest

Name Type Description Required Default
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No

deleteMessageResponse

Name Type Description Required Default
id string Yes
deleted boolean Yes
object string Yes

object Enum: DeleteMessageResponseObject

Value Description
thread.message.deleted The delete message response object state

listMessagesResponse

Name Type Description Required Default
object string Yes
data array Yes
first_id string Yes
last_id string Yes
has_more boolean Yes

messageContentImageFileObject

References an image File in the content of a message.

Name Type Description Required Default
type string Always image_file. Yes
image_file object Yes

Properties for image_file

file_id

Name Type Description Default
file_id string The File ID of the image in the message content.

type Enum: MessageContentImageFileObjectType

Value Description
image_file The message content image file type

messageContentTextObject

The text content that is part of a message.

Name Type Description Required Default
type string Always text. Yes
text object Yes

Properties for text

value

Name Type Description Default
value string The data that makes up the text.

annotations

Name Type Description Default
annotations array

type Enum: messageContentTextObjectType

Value Description
text The message content text Object type
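
A message's `content` array can mix text and image parts, so a client that only wants the text has to filter by `type`. The sketch below does that for a message represented as a dictionary. The function name `message_text` is hypothetical, and joining the parts with newlines is a presentation choice, not something the API specifies.

```python
from typing import Any


def message_text(message: dict[str, Any]) -> str:
    """Concatenate the value of every text part in a message's content."""
    parts = [
        item["text"]["value"]
        for item in message.get("content", [])
        if item.get("type") == "text"
    ]
    return "\n".join(parts)
```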

messageContentTextAnnotationsFileCitationObject

A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "file_search" tool to search files.

Name Type Description Required Default
type string Always file_citation. Yes
text string The text in the message content that needs to be replaced. Yes
file_citation object Yes
start_index integer Yes
end_index integer Yes

Properties for file_citation

file_id

Name Type Description Default
file_id string The ID of the specific File the citation is from.

type Enum: FileCitationObjectType

Value Description
file_citation The file citation object type

messageContentTextAnnotationsFilePathObject

A URL for the file that's generated when the assistant used the code_interpreter tool to generate a file.

Name Type Description Required Default
type string Always file_path. Yes
text string The text in the message content that needs to be replaced. Yes
file_path object Yes
start_index integer Yes
end_index integer Yes

Properties for file_path

file_id

Name Type Description Default
file_id string The ID of the file that was generated.

type Enum: FilePathObjectType

Value Description
file_path The file path object type
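
Both annotation types mark a span of the message text "that needs to be replaced". The sketch below applies annotations to a text object and substitutes a label built from the file ID. It assumes that `start_index` is inclusive and `end_index` is exclusive, which this section does not state. The label format and the function name `resolve_annotations` are illustrative only.

```python
from typing import Any


def resolve_annotations(text_obj: dict[str, Any]) -> str:
    """Replace each annotated span in a text object with a file label."""
    value = text_obj["value"]
    annotations = sorted(
        text_obj.get("annotations", []),
        key=lambda ann: ann["start_index"],
        reverse=True,  # Work right to left so earlier indices stay valid.
    )
    for ann in annotations:
        if ann["type"] == "file_citation":
            label = f"[cite:{ann['file_citation']['file_id']}]"
        elif ann["type"] == "file_path":
            label = f"[file:{ann['file_path']['file_id']}]"
        else:
            continue  # Leave unknown annotation types untouched.
        value = value[:ann["start_index"]] + label + value[ann["end_index"]:]
    return value
```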

messageDeltaContentImageFileObject

References an image File in the content of a message.

Name Type Description Required Default
index integer The index of the content part in the message. Yes
type string Always image_file. Yes
image_file object No

Properties for image_file

file_id

Name Type Description Default
file_id string The File ID of the image in the message content.

type Enum: MessageDeltaContentImageFileObjectType

Value Description
image_file

messageDeltaContentTextObject

The text content that is part of a message.

Name Type Description Required Default
index integer The index of the content part in the message. Yes
type string Always text. Yes
text object No

Properties for text

value

Name Type Description Default
value string The data that makes up the text.

annotations

Name Type Description Default
annotations array

type Enum: MessageDeltaContentTextObjectType

Value Description
text
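
During streaming, each message delta carries content parts identified by `index`. The sketch below rebuilds the full text of a message by accumulating text fragments per index. It assumes that `text.value` in a delta holds a fragment to append rather than the full text so far. The function name `apply_message_delta` is hypothetical.

```python
from typing import Any


def apply_message_delta(parts: dict[int, str], delta_event: dict[str, Any]) -> None:
    """Append the text fragments of one messageDeltaObject to parts, in place."""
    for item in delta_event["delta"].get("content", []):
        if item["type"] != "text":
            continue  # Image parts carry no text to accumulate.
        # text is optional in a delta, so default to an empty fragment.
        fragment = item.get("text", {}).get("value", "")
        parts[item["index"]] = parts.get(item["index"], "") + fragment
```

After the stream ends, joining the values of `parts` in index order gives the message text.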

messageDeltaContentTextAnnotationsFileCitationObject

A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "file_search" tool to search files.

Name Type Description Required Default
index integer The index of the annotation in the text content part. Yes
type string Always file_citation. Yes
text string The text in the message content that needs to be replaced. No
file_citation object No
start_index integer No
end_index integer No

Properties for file_citation

file_id

Name Type Description Default
file_id string The ID of the specific File the citation is from.

quote

Name Type Description Default
quote string The specific quote in the file.

type Enum: MessageDeltaContentTextAnnotationsFileCitationObjectType

Value Description
file_citation

messageDeltaContentTextAnnotationsFilePathObject

A URL for the file that's generated when the assistant used the code_interpreter tool to generate a file.

Name Type Description Required Default
index integer The index of the annotation in the text content part. Yes
type string Always file_path. Yes
text string The text in the message content that needs to be replaced. No
file_path object No
start_index integer No
end_index integer No

Properties for file_path

file_id

Name Type Description Default
file_id string The ID of the file that was generated.

type Enum: MessageDeltaContentTextAnnotationsFilePathObjectType

Value Description
file_path

runStepObject

Represents a step in execution of a run.

Name Type Description Required Default
id string The identifier of the run step, which can be referenced in API endpoints. Yes
object string The object type, which is always assistant.run.step. Yes
created_at integer The Unix timestamp (in seconds) for when the run step was created. Yes
assistant_id string The ID of the assistant associated with the run step. Yes
thread_id string The ID of the thread that was run. Yes
run_id string The ID of the run that this run step is a part of. Yes
type string The type of run step, which can be either message_creation or tool_calls. Yes
status string The status of the run, which can be either in_progress, cancelled, failed, completed, or expired. Yes
step_details runStepDetailsMessageCreationObject or runStepDetailsToolCallsObject The details of the run step. Yes
last_error object The last error associated with this run step. Will be null if there are no errors. Yes
expired_at integer The Unix timestamp (in seconds) for when the run step expired. A step is considered expired if the parent run is expired. Yes
cancelled_at integer The Unix timestamp (in seconds) for when the run step was cancelled. Yes
failed_at integer The Unix timestamp (in seconds) for when the run step failed. Yes
completed_at integer The Unix timestamp (in seconds) for when the run step completed. Yes
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. Yes

Properties for last_error

code

Name Type Description Default
code string One of server_error or rate_limit_exceeded.

message

Name Type Description Default
message string A human-readable description of the error.

object Enum: RunStepObjectType

Value Description
assistant.run.step The object type, which is always assistant.run.step

type Enum: RunStepObjectType

Value Description
message_creation The message_creation run step
tool_calls The tool_calls run step

status Enum: RunStepObjectStatus

Value Description
in_progress The in_progress run status
cancelled The cancelled run status
failed The failed run status
completed The completed run status
expired The expired run status

runStepDeltaObject

Represents a run step delta i.e. any changed fields on a run step during streaming.

Name Type Description Required Default
id string The identifier of the run step, which can be referenced in API endpoints. Yes
object string The object type, which is always thread.run.step.delta. Yes
delta object The delta containing the fields that have changed on the run step. Yes

Properties for delta

step_details

Name Type Description Default
step_details runStepDeltaStepDetailsMessageCreationObject or runStepDeltaStepDetailsToolCallsObject The details of the run step.

object Enum: RunStepDeltaObjectType

Value Description
thread.run.step.delta

listRunStepsResponse

Name Type Description Required Default
object string Yes
data array Yes
first_id string Yes
last_id string Yes
has_more boolean Yes

runStepDetailsMessageCreationObject

Details of the message creation by the run step.

Name Type Description Required Default
type string Always message_creation. Yes
message_creation object Yes

Properties for message_creation

message_id

Name Type Description Default
message_id string The ID of the message that was created by this run step.

type Enum: RunStepDetailsMessageCreationObjectType

Value Description
message_creation

runStepDeltaStepDetailsMessageCreationObject

Details of the message creation by the run step.

Name Type Description Required Default
type string Always message_creation. Yes
message_creation object No

Properties for message_creation

message_id

Name Type Description Default
message_id string The ID of the message that was created by this run step.

type Enum: RunStepDeltaStepDetailsMessageCreationObjectType

Value Description
message_creation

runStepDetailsToolCallsObject

Details of the tool call.

Name Type Description Required Default
type string Always tool_calls. Yes
tool_calls array An array of tool calls the run step was involved in. These can be associated with one of three types of tools: code_interpreter, file_search or function.
Yes

type Enum: RunStepDetailsToolCallsObjectType

Value Description
tool_calls

runStepDeltaStepDetailsToolCallsObject

Details of the tool call.

Name Type Description Required Default
type string Always tool_calls. Yes
tool_calls array An array of tool calls the run step was involved in. These can be associated with one of three types of tools: code_interpreter, file_search or function.
No

type Enum: RunStepDeltaStepDetailsToolCallsObjectType

Value Description
tool_calls

runStepDetailsToolCallsCodeObject

Details of the Code Interpreter tool call the run step was involved in.

Name Type Description Required Default
id string The ID of the tool call. Yes
type string The type of tool call. This is always going to be code_interpreter for this type of tool call. Yes
code_interpreter object The Code Interpreter tool call definition. Yes

Properties for code_interpreter

input

Name Type Description Default
input string The input to the Code Interpreter tool call.

outputs

Name Type Description Default
outputs array The outputs from the Code Interpreter tool call. Code Interpreter can output one or more items, including text (logs) or images (image). Each of these is represented by a different object type.

type Enum: RunStepDetailsToolCallsCodeObjectType

Value Description
code_interpreter

runStepDeltaStepDetailsToolCallsCodeObject

Details of the Code Interpreter tool call the run step was involved in.

Name Type Description Required Default
index integer The index of the tool call in the tool calls array. Yes
id string The ID of the tool call. No
type string The type of tool call. This is always going to be code_interpreter for this type of tool call. Yes
code_interpreter object The Code Interpreter tool call definition. No

Properties for code_interpreter

input

Name Type Description Default
input string The input to the Code Interpreter tool call.

outputs

Name Type Description Default
outputs array The outputs from the Code Interpreter tool call. Code Interpreter can output one or more items, including text (logs) or images (image). Each of these is represented by a different object type.

type Enum: RunStepDeltaStepDetailsToolCallsCodeObjectType

Value Description
code_interpreter

runStepDetailsToolCallsCodeOutputLogsObject

Text output from the Code Interpreter tool call as part of a run step.

Name Type Description Required Default
type string Always logs. Yes
logs string The text output from the Code Interpreter tool call. Yes

type Enum: RunStepDetailsToolCallsCodeOutputLogsObjectType

Value Description
logs

runStepDeltaStepDetailsToolCallsCodeOutputLogsObject

Text output from the Code Interpreter tool call as part of a run step.

Name Type Description Required Default
index integer The index of the output in the outputs array. Yes
type string Always logs. Yes
logs string The text output from the Code Interpreter tool call. No

type Enum: RunStepDeltaStepDetailsToolCallsCodeOutputLogsObjectType

Value Description
logs

runStepDetailsToolCallsCodeOutputImageObject

Name Type Description Required Default
type string Always image. Yes
image object Yes

Properties for image

file_id

Name Type Description Default
file_id string The File ID of the image.

type Enum: RunStepDetailsToolCallsCodeOutputImageObjectType

Value Description
image
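
As an illustration of how the two output object types above might be consumed, the following sketch separates a Code Interpreter outputs array into log text and image file IDs by switching on the type field. The function name split_code_interpreter_outputs is hypothetical, and silently skipping unrecognized types is an assumption rather than documented behavior.

```python
from typing import Dict, List, Tuple


def split_code_interpreter_outputs(outputs: List[Dict]) -> Tuple[List[str], List[str]]:
    """Separate Code Interpreter outputs into log texts and image file IDs."""
    logs: List[str] = []
    image_file_ids: List[str] = []
    for output in outputs:
        kind = output.get("type")
        if kind == "logs":
            # Logs objects carry their text in the "logs" field.
            logs.append(output["logs"])
        elif kind == "image":
            # Image objects carry a nested "image" object with a "file_id".
            image_file_ids.append(output["image"]["file_id"])
        # Assumption: any other type is skipped instead of raising.
    return logs, image_file_ids
```

The returned file IDs are only references; retrieving the image content itself is outside the scope of this sketch.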

runStepDeltaStepDetailsToolCallsCodeOutputImageObject

Name Type Description Required Default
index integer The index of the output in the outputs array. Yes
type string Always image. Yes
image object No

Properties for image

file_id

Name Type Description Default
file_id string The file ID of the image.

type Enum: RunStepDeltaStepDetailsToolCallsCodeOutputImageObject

Value Description
image

runStepDetailsToolCallsFileSearchObject

Name Type Description Required Default
id string The ID of the tool call object. Yes
type string The type of tool call. This is always going to be file_search for this type of tool call. Yes
file_search object For now, this is always going to be an empty object. Yes

Properties for file_search

results

Name Type Description Default
results array The results of the file search.

type Enum: RunStepDetailsToolCallsFileSearchObjectType

Value Description
file_search

runStepDetailsToolCallsFileSearchResultObject

A result instance of the file search.

Name Type Description Required Default
file_id string The ID of the file that the result was found in. Yes
file_name string The name of the file that the result was found in. Yes
score number The score of the result. All values must be a floating point number between 0 and 1. Yes
content array The content of the result that was found. The content is only included if requested via the include query parameter. No
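
A minimal sketch of how a client might rank file search results by the score field described above. The helper name top_file_search_results and the min_score and limit parameters are hypothetical; only the file_id, file_name, and score fields come from this reference.

```python
from typing import Dict, List


def top_file_search_results(results: List[Dict], min_score: float = 0.5, limit: int = 3) -> List[Dict]:
    """Return the highest-scoring results at or above min_score."""
    for result in results:
        # The reference states that scores lie between 0 and 1.
        if not 0.0 <= result["score"] <= 1.0:
            raise ValueError(f"score out of range for {result['file_id']}")
    kept = [r for r in results if r["score"] >= min_score]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:limit]
```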

runStepDeltaStepDetailsToolCallsFileSearchObject

Name Type Description Required Default
index integer The index of the tool call in the tool calls array. Yes
id string The ID of the tool call object. No
type string The type of tool call. This is always going to be file_search for this type of tool call. Yes
file_search object For now, this is always going to be an empty object. Yes

type Enum: RunStepDeltaStepDetailsToolCallsFileSearchObjectType

Value Description
file_search

runStepDetailsToolCallsFunctionObject

Name Type Description Required Default
id string The ID of the tool call object. Yes
type string The type of tool call. This is always going to be function for this type of tool call. Yes
function object The definition of the function that was called. Yes

Properties for function

name

Name Type Description Default
name string The name of the function.

arguments

Name Type Description Default
arguments string The arguments passed to the function.

output

Name Type Description Default
output string The output of the function. This will be null if the outputs have not been submitted yet.

type Enum: RunStepDetailsToolCallsFunctionObjectType

Value Description
function

runStepDeltaStepDetailsToolCallsFunctionObject

Name Type Description Required Default
index integer The index of the tool call in the tool calls array. Yes
id string The ID of the tool call object. No
type string The type of tool call. This is always going to be function for this type of tool call. Yes
function object The definition of the function that was called. No

Properties for function

name

Name Type Description Default
name string The name of the function.

arguments

Name Type Description Default
arguments string The arguments passed to the function.

output

Name Type Description Default
output string The output of the function. This will be null if the outputs have not been submitted yet.

type Enum: RunStepDetailsToolCallsFunctionObjectType

Value Description
function

vectorStoreExpirationAfter

The expiration policy for a vector store.

Name Type Description Required Default
anchor string Anchor timestamp after which the expiration policy applies. Supported anchors: last_active_at. Yes
days integer The number of days after the anchor time that the vector store will expire. Yes

anchor Enum: VectorStoreExpirationAfterAnchor

Value Description
last_active_at The anchor timestamp after which the expiration policy applies.
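
To make the policy concrete, the sketch below estimates when a vector store would expire from its last_active_at timestamp and the days value. The function name expected_expiry is hypothetical, and treating one day as exactly 86,400 seconds and requiring days to be at least 1 are assumptions; the expires_at value returned by the service remains authoritative.

```python
SECONDS_PER_DAY = 86_400  # assumption: the policy counts whole 24-hour days


def expected_expiry(last_active_at: int, expires_after: dict) -> int:
    """Estimate the Unix timestamp (seconds) at which a vector store expires."""
    if expires_after.get("anchor") != "last_active_at":
        raise ValueError("only the last_active_at anchor is documented")
    days = expires_after["days"]
    if not isinstance(days, int) or days < 1:
        # Assumption: a non-positive day count is rejected client-side.
        raise ValueError("days must be a positive integer")
    return last_active_at + days * SECONDS_PER_DAY
```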

vectorStoreObject

A vector store is a collection of processed files that can be used by the file_search tool.

Name Type Description Required Default
id string The identifier, which can be referenced in API endpoints. Yes
object enum The object type, which is always vector_store. Possible values: vector_store Yes
created_at integer The Unix timestamp (in seconds) for when the vector store was created. Yes
name string The name of the vector store. Yes
usage_bytes integer The total number of bytes used by the files in the vector store. Yes
file_counts object Yes
status string The status of the vector store, which can be either expired, in_progress, or completed. A status of completed indicates that the vector store is ready for use. Yes
expires_after vectorStoreExpirationAfter The expiration policy for a vector store. No
expires_at integer The Unix timestamp (in seconds) for when the vector store will expire. No
last_active_at integer The Unix timestamp (in seconds) for when the vector store was last active. Yes
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. Yes

Properties for file_counts

in_progress

Name Type Description Default
in_progress integer The number of files that are currently being processed.

completed

Name Type Description Default
completed integer The number of files that have been successfully processed.

failed

Name Type Description Default
failed integer The number of files that have failed to process.

cancelled

Name Type Description Default
cancelled integer The number of files that were cancelled.

total

Name Type Description Default
total integer The total number of files.

status Enum: VectorStoreObjectStatus

Value Description
expired
in_progress
completed

createVectorStoreRequest

Name Type Description Required Default
file_ids array A list of file IDs that the vector store should use. Useful for tools like file_search that can access files. No
name string The name of the vector store. No
expires_after vectorStoreExpirationAfter The expiration policy for a vector store. No
chunking_strategy autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. Only applicable if file_ids is nonempty. No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No
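
The following sketch shows one way a client could assemble a createVectorStoreRequest body and check the metadata limits before sending it. The function names validate_metadata and build_create_vector_store_body are hypothetical, and rejecting a chunking_strategy without file_ids on the client side is a design assumption; the service may handle that case differently.

```python
import json
from typing import Dict, List, Optional

MAX_PAIRS = 16          # metadata: at most 16 key-value pairs
MAX_KEY_LENGTH = 64     # metadata: keys up to 64 characters
MAX_VALUE_LENGTH = 512  # metadata: values up to 512 characters


def validate_metadata(metadata: Dict[str, str]) -> None:
    """Raise ValueError if metadata exceeds the documented limits."""
    if len(metadata) > MAX_PAIRS:
        raise ValueError(f"metadata allows at most {MAX_PAIRS} key-value pairs")
    for key, value in metadata.items():
        if len(key) > MAX_KEY_LENGTH:
            raise ValueError(f"metadata key too long: {key[:20]}...")
        if len(value) > MAX_VALUE_LENGTH:
            raise ValueError(f"metadata value too long for key: {key}")


def build_create_vector_store_body(
    name: Optional[str] = None,
    file_ids: Optional[List[str]] = None,
    expires_after: Optional[dict] = None,
    chunking_strategy: Optional[dict] = None,
    metadata: Optional[Dict[str, str]] = None,
) -> str:
    """Serialize the optional request fields, omitting any that are unset."""
    if chunking_strategy is not None and not file_ids:
        # Assumption: fail early, since the strategy only applies with files.
        raise ValueError("chunking_strategy only applies when file_ids is nonempty")
    if metadata is not None:
        validate_metadata(metadata)
    candidates = {
        "name": name,
        "file_ids": file_ids,
        "expires_after": expires_after,
        "chunking_strategy": chunking_strategy,
        "metadata": metadata,
    }
    return json.dumps({k: v for k, v in candidates.items() if v is not None})
```

Because every field in the request is optional, the builder drops unset fields instead of sending explicit nulls.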

updateVectorStoreRequest

Name Type Description Required Default
name string The name of the vector store. No
expires_after vectorStoreExpirationAfter The expiration policy for a vector store. No
metadata object Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No

listVectorStoresResponse

Name Type Description Required Default
object string Yes
data array Yes
first_id string Yes
last_id string Yes
has_more boolean Yes
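
The has_more and last_id fields suggest cursor-style pagination. The sketch below walks all pages under the assumption that the last_id of one page can be passed back as a cursor to fetch the next page. The fetch_page callable is hypothetical and stands in for whatever HTTP call the client makes; this schema does not describe the request parameters.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_vector_stores(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[Dict]:
    """Yield every item across pages of a list response."""
    cursor: Optional[str] = None  # None requests the first page
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        if not page["has_more"]:
            break
        # Assumption: last_id is the cursor for the following page.
        cursor = page["last_id"]
```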

deleteVectorStoreResponse

Name Type Description Required Default
id string Yes
deleted boolean Yes
object string Yes

object Enum: DeleteVectorStoreResponseObject

Value Description
vector_store.deleted The delete vector store response object state

vectorStoreFileObject

Represents a file attached to a vector store.

Name Type Description Required Default
id string The identifier, which can be referenced in API endpoints. Yes
object string The object type, which is always vector_store.file. Yes
usage_bytes integer The total vector store usage in bytes. Note that this may be different from the original file size. Yes
created_at integer The Unix timestamp (in seconds) for when the vector store file was created. Yes
vector_store_id string The ID of the vector store that the file is attached to. Yes
status string The status of the vector store file, which can be either in_progress, completed, cancelled, or failed. The status completed indicates that the vector store file is ready for use. Yes
last_error object The last error associated with this vector store file. Will be null if there are no errors. Yes
chunking_strategy autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. Only applicable if file_ids is nonempty. No

Properties for last_error

code

Name Type Description Default
code string One of server_error, invalid_file, or unsupported_file.

message

Name Type Description Default
message string A human-readable description of the error.

object Enum: VectorStoreFileObjectType

Value Description
vector_store.file

status Enum: VectorStoreFileObjectStatus

Value Description
in_progress
completed
cancelled
failed

otherChunkingStrategyResponseParam

This is returned when the chunking strategy is unknown. Typically, this is because the file was indexed before the chunking_strategy concept was introduced in the API.

Name Type Description Required Default
type string Always other. Yes

type Enum: OtherChunkingStrategyResponseParamType

Value Description
other

staticChunkingStrategyResponseParam

Name Type Description Required Default
type string Always static. Yes
static staticChunkingStrategy Yes

type Enum: StaticChunkingStrategyResponseParamType

Value Description
static

staticChunkingStrategy

Name Type Description Required Default
max_chunk_size_tokens integer The maximum number of tokens in each chunk. The default value is 800. The minimum value is 100 and the maximum value is 4096. Yes
chunk_overlap_tokens integer The number of tokens that overlap between chunks. The default value is 400. The overlap must not exceed half of max_chunk_size_tokens. Yes
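
The constraints above can be checked on the client before a request is sent. This is a minimal sketch: the function name validate_static_chunking is hypothetical, and rejecting a negative overlap is an assumption not stated in this reference. It returns a dictionary in the shape of staticChunkingStrategyRequestParam.

```python
MIN_CHUNK_TOKENS = 100
MAX_CHUNK_TOKENS = 4096


def validate_static_chunking(max_chunk_size_tokens: int, chunk_overlap_tokens: int) -> dict:
    """Check the documented limits and build a static chunking strategy."""
    if not MIN_CHUNK_TOKENS <= max_chunk_size_tokens <= MAX_CHUNK_TOKENS:
        raise ValueError("max_chunk_size_tokens must be between 100 and 4096")
    if chunk_overlap_tokens < 0:
        # Assumption: a negative overlap is never meaningful.
        raise ValueError("chunk_overlap_tokens must not be negative")
    if chunk_overlap_tokens * 2 > max_chunk_size_tokens:
        raise ValueError("chunk_overlap_tokens must not exceed half of max_chunk_size_tokens")
    return {
        "type": "static",
        "static": {
            "max_chunk_size_tokens": max_chunk_size_tokens,
            "chunk_overlap_tokens": chunk_overlap_tokens,
        },
    }
```

With the documented defaults, validate_static_chunking(800, 400) passes because the overlap is exactly half of the chunk size, while an overlap of 401 would be rejected.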

autoChunkingStrategyRequestParam

The default strategy. This strategy currently uses a max_chunk_size_tokens of 800 and chunk_overlap_tokens of 400.

Name Type Description Required Default
type enum Always auto. Possible values: auto Yes

staticChunkingStrategyRequestParam

Name Type Description Required Default
type enum Always static. Possible values: static Yes
static staticChunkingStrategy Yes

chunkingStrategyRequestParam

The chunking strategy used to chunk the file(s). If not set, will use the auto strategy.

This component can be one of the following: autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam.

createVectorStoreFileRequest

Name Type Description Required Default
file_id string A File ID that the vector store should use. Useful for tools like file_search that can access files. Yes
chunking_strategy chunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. No

listVectorStoreFilesResponse

Name Type Description Required Default
object string Yes
data array Yes
first_id string Yes
last_id string Yes
has_more boolean Yes

deleteVectorStoreFileResponse

Name Type Description Required Default
id string Yes
deleted boolean Yes
object string Yes

object Enum: DeleteVectorStoreFileResponseObject

Value Description
vector_store.file.deleted

vectorStoreFileBatchObject

A batch of files attached to a vector store.

Name Type Description Required Default
id string The identifier, which can be referenced in API endpoints. Yes
object string The object type, which is always vector_store.file_batch. Yes
created_at integer The Unix timestamp (in seconds) for when the vector store files batch was created. Yes
vector_store_id string The ID of the vector store that the File is attached to. Yes
status string The status of the vector store files batch, which can be either in_progress, completed, cancelled or failed. Yes
file_counts object Yes

Properties for file_counts

in_progress

Name Type Description Default
in_progress integer The number of files that are currently being processed.

completed

Name Type Description Default
completed integer The number of files that have been processed.

failed

Name Type Description Default
failed integer The number of files that have failed to process.

cancelled

Name Type Description Default
cancelled integer The number of files that were cancelled.

total

Name Type Description Default
total integer The total number of files.

object Enum: VectorStoreFileBatchObjectType

Value Description
vector_store.files_batch

status Enum: VectorStoreFileBatchObjectStatus

Value Description
in_progress
completed
cancelled
failed
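
As a sketch of how the status and file_counts fields might be read together, the helper below reports whether a batch has reached a final state and how much of it has been processed. The name summarize_batch and the returned keys are hypothetical, and treating completed, cancelled, and failed as final states is an assumption drawn from the status values listed above.

```python
FINAL_STATUSES = {"completed", "cancelled", "failed"}  # assumption: no further transitions


def summarize_batch(batch: dict) -> dict:
    """Summarize progress of a vector store file batch."""
    counts = batch["file_counts"]
    finished = counts["completed"] + counts["failed"] + counts["cancelled"]
    total = counts["total"]
    return {
        "is_final": batch["status"] in FINAL_STATUSES,
        "finished_fraction": finished / total if total else 0.0,
        "has_failures": counts["failed"] > 0,
    }
```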

createVectorStoreFileBatchRequest

Name Type Description Required Default
file_ids array A list of File IDs that the vector store should use. Useful for tools like file_search that can access files. Yes
chunking_strategy chunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. No

assistantStreamEvent

Represents an event emitted when streaming a Run.

Each event in a server-sent events stream has an event and data property:

event: thread.created
data: {"id": "thread_123", "object": "thread", ...}

We emit events whenever a new object is created, transitions to a new state, or is being streamed in parts (deltas). For example, we emit thread.run.created when a new run is created, thread.run.completed when a run completes, and so on. When an Assistant chooses to create a message during a run, we emit a thread.message.created event, a thread.message.in_progress event, many thread.message.delta events, and finally a thread.message.completed event.

We may add additional events over time, so we recommend handling unknown events gracefully in your code.
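
The following sketch shows one way to read such a stream and tolerate unknown events. It parses event and data lines separated by blank lines, dispatches known events to handlers, and records unrecognized event names instead of failing. The names parse_sse and handle_stream are hypothetical, the parser is a simplification of the server-sent events format (for example, it trims all surrounding whitespace from field values), and it assumes the stream ends with the done event whose data is [DONE] rather than JSON.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from lines of a server-sent events stream."""
    event, data_parts = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if event is not None or data_parts:
                yield event or "message", "\n".join(data_parts)
            event, data_parts = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
    if event is not None or data_parts:
        yield event or "message", "\n".join(data_parts)


def handle_stream(lines: Iterable[str], handlers: Dict[str, Callable[[dict], None]]) -> List[str]:
    """Dispatch events to handlers and return the names of unknown events."""
    unknown: List[str] = []
    for event, data in parse_sse(lines):
        if event == "done":
            break  # the done event carries "[DONE]", which is not JSON
        handler = handlers.get(event)
        if handler is None:
            unknown.append(event)  # skip gracefully instead of raising
            continue
        handler(json.loads(data))
    return unknown
```

A caller might register a handler only for the events it cares about, such as thread.message.delta, and log the returned list of unknown event names for later review.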

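This component can be one of the following: threadStreamEvent, runStreamEvent, runStepStreamEvent, messageStreamEvent, errorEvent, or doneEvent.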

threadStreamEvent

This component can be one of the following:

thread.created

Occurs when a new thread is created.

Name Type Description Required Default
event string Yes
data threadObject Represents a thread that contains messages. Yes

Data: threadObject

Event Enum: ThreadStreamEventEnum

Value Description
thread.created The thread created event

runStreamEvent

This component can be one of the following:

thread.run.created

Occurs when a new run is created.

Name Type Description Required Default
event string Yes
data runObject Represents an execution run on a thread. Yes

Data: runObject

Event Enum: RunStreamEventCreated

Value Description
thread.run.created

thread.run.queued

Occurs when a run moves to a queued status.

Name Type Description Required Default
event string Yes
data runObject Represents an execution run on a thread. Yes

Data: runObject

Event Enum: RunStreamEventQueued

Value Description
thread.run.queued

thread.run.in_progress

Occurs when a run moves to an in_progress status.

Name Type Description Required Default
event string Yes
data runObject Represents an execution run on a thread. Yes

Data: runObject

Event Enum: RunStreamEventInProgress

Value Description
thread.run.in_progress

thread.run.requires_action

Occurs when a run moves to a requires_action status.

Name Type Description Required Default
event string Yes
data runObject Represents an execution run on a thread. Yes

Data: runObject

Event Enum: RunStreamEventRequiresAction

Value Description
thread.run.requires_action

thread.run.completed

Occurs when a run is completed.

Name Type Description Required Default
event string Yes
data runObject Represents an execution run on a thread. Yes

Data: runObject

Event Enum: RunStreamEventCompleted

Value Description
thread.run.completed

thread.run.failed

Occurs when a run fails.

Name Type Description Required Default
event string Yes
data runObject Represents an execution run on a thread. Yes

Data: runObject

Event Enum: RunStreamEventFailed

Value Description
thread.run.failed

thread.run.cancelling

Occurs when a run moves to a cancelling status.

Name Type Description Required Default
event string Yes
data runObject Represents an execution run on a thread. Yes

Data: runObject

Event Enum: RunStreamEventCancelling

Value Description
thread.run.cancelling

thread.run.cancelled

Occurs when a run is cancelled.

Name Type Description Required Default
event string Yes
data runObject Represents an execution run on a thread. Yes

Data: runObject

Event Enum: RunStreamEventCancelled

Value Description
thread.run.cancelled

thread.run.expired

Occurs when a run expires.

Name Type Description Required Default
event string Yes
data runObject Represents an execution run on a thread. Yes

Data: runObject

Event Enum: RunStreamEventExpired

Value Description
thread.run.expired

runStepStreamEvent

This component can be one of the following:

thread.run.step.created

Occurs when a run step is created.

Name Type Description Required Default
event string Yes
data runStepObject Represents a step in execution of a run. Yes

Data: runStepObject

Event Enum: RunStepStreamEventCreated

Value Description
thread.run.step.created

thread.run.step.in_progress

Occurs when a run step moves to an in_progress state.

Name Type Description Required Default
event string Yes
data runStepObject Represents a step in execution of a run. Yes

Data: runStepObject

Event Enum: RunStepStreamEventInProgress

Value Description
thread.run.step.in_progress

thread.run.step.delta

Occurs when parts of a run step are being streamed.

Name Type Description Required Default
event string Yes
data runStepDeltaObject Represents a run step delta, that is, any changed fields on a run step during streaming. Yes

Data: runStepDeltaObject

Event Enum: RunStepStreamEventDelta

Value Description
thread.run.step.delta

thread.run.step.completed

Occurs when a run step is completed.

Name Type Description Required Default
event string Yes
data runStepObject Represents a step in execution of a run. Yes

Data: runStepObject

Event Enum: RunStepStreamEventCompleted

Value Description
thread.run.step.completed

thread.run.step.failed

Occurs when a run step fails.

Name Type Description Required Default
event string Yes
data runStepObject Represents a step in execution of a run. Yes

Data: runStepObject

Event Enum: RunStepStreamEventFailed

Value Description
thread.run.step.failed

thread.run.step.cancelled

Occurs when a run step is cancelled.

Name Type Description Required Default
event string Yes
data runStepObject Represents a step in execution of a run. Yes

Data: runStepObject

Event Enum: RunStepStreamEventCancelled

Value Description
thread.run.step.cancelled

thread.run.step.expired

Occurs when a run step expires.

Name Type Description Required Default
event string Yes
data runStepObject Represents a step in execution of a run. Yes

Data: runStepObject

Event Enum: RunStepStreamEventExpired

Value Description
thread.run.step.expired

messageStreamEvent

This component can be one of the following:

thread.message.created

Occurs when a message is created.

Name Type Description Required Default
event string Yes
data messageObject Represents a message within a thread. Yes

Data: messageObject

Event Enum: MessageStreamEventCreated

Value Description
thread.message.created

thread.message.in_progress

Occurs when a message moves to an in_progress state.

Name Type Description Required Default
event string Yes
data messageObject Represents a message within a thread. Yes

Data: messageObject

Event Enum: MessageStreamEventInProgress

Value Description
thread.message.in_progress

thread.message.delta

Occurs when parts of a message are being streamed.

Name Type Description Required Default
event string Yes
data messageDeltaObject Represents a message delta, that is, any changed fields on a message during streaming. Yes

Data: messageDeltaObject

Event Enum: MessageStreamEventDelta

Value Description
thread.message.delta

thread.message.completed

Occurs when a message is completed.

Name Type Description Required Default
event string Yes
data messageObject Represents a message within a thread. Yes

Data: messageObject

Event Enum: MessageStreamEventCompleted

Value Description
thread.message.completed

thread.message.incomplete

Occurs when a message ends before it is completed.

Name Type Description Required Default
event string Yes
data messageObject Represents a message within a thread. Yes

Data: messageObject

Event Enum: MessageStreamEventIncomplete

Value Description
thread.message.incomplete

errorEvent

Emitted when an error occurs. This can happen due to an internal server error or a timeout.

Name Type Description Required Default
event string Yes
data error Yes

event Enum: ErrorEventEnum

Value Description
error

doneEvent

Occurs when a stream ends.

Name Type Description Required Default
event string Yes
data string Yes

event Enum: DoneEventEnum

Value Description
done

data Enum: DoneEventDataEnum

Value Description
[DONE]

Next steps

Learn about models and fine-tuning with the REST API. Learn more about the underlying models that power Azure OpenAI.