ChatCompletionsOptions Class

Definition

The configuration information for a chat completions request. Completions support a wide variety of tasks and generate text that continues from or "completes" provided prompt data.

public class ChatCompletionsOptions : System.ClientModel.Primitives.IJsonModel<Azure.AI.Inference.ChatCompletionsOptions>, System.ClientModel.Primitives.IPersistableModel<Azure.AI.Inference.ChatCompletionsOptions>
type ChatCompletionsOptions = class
    interface IJsonModel<ChatCompletionsOptions>
    interface IPersistableModel<ChatCompletionsOptions>
Public Class ChatCompletionsOptions
Implements IJsonModel(Of ChatCompletionsOptions), IPersistableModel(Of ChatCompletionsOptions)
Inheritance
Object → ChatCompletionsOptions
Implements
IJsonModel<ChatCompletionsOptions>, IPersistableModel<ChatCompletionsOptions>

Constructors

ChatCompletionsOptions()

Initializes a new instance of ChatCompletionsOptions.

ChatCompletionsOptions(IEnumerable<ChatRequestMessage>)

Initializes a new instance of ChatCompletionsOptions with the specified messages.

Properties

AdditionalProperties

Additional properties to include in the request.

To assign an object to the value of this property, use BinaryData.FromObjectAsJson<T>(T, JsonSerializerOptions).

To assign an already-formatted JSON string to this property, use BinaryData.FromString(String).

Examples:

  • BinaryData.FromObjectAsJson("foo"): Creates a payload of "foo".
  • BinaryData.FromString("\"foo\""): Creates a payload of "foo".
  • BinaryData.FromObjectAsJson(new { key = "value" }): Creates a payload of { "key": "value" }.
  • BinaryData.FromString("{\"key\": \"value\"}"): Creates a payload of { "key": "value" }.
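
The payloads listed above can be reproduced with System.Text.Json alone, which is the serializer BinaryData.FromObjectAsJson relies on. The following is a minimal sketch rather than the library's own sample; the variable names are illustrative assumptions.

```csharp
using System;
using System.Text.Json;

// Serializing an object yields the JSON text that FromObjectAsJson would wrap.
string stringPayload = JsonSerializer.Serialize("foo");
string objectPayload = JsonSerializer.Serialize(new { key = "value" });

Console.WriteLine(stringPayload); // "foo" (the quotes are part of the JSON payload)
Console.WriteLine(objectPayload); // {"key":"value"}

// An already-formatted JSON string carries the same data, even if its
// whitespace differs from the serializer's output.
using JsonDocument preformatted = JsonDocument.Parse("{\"key\": \"value\"}");
string keyValue = preformatted.RootElement.GetProperty("key").GetString() ?? string.Empty;
Console.WriteLine(keyValue); // value
```

Note that a plain string must be quoted to be valid JSON, which is why the FromString form above passes "\"foo\"" rather than "foo".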

FrequencyPenalty

A value that influences the probability of generated tokens appearing based on their cumulative frequency in generated text. Positive values will make tokens less likely to appear as their frequency increases and decrease the likelihood of the model repeating the same statements verbatim. Supported range is [-2, 2].

MaxTokens

The maximum number of tokens to generate.

Messages

The collection of context messages associated with this chat completions request. Typical usage begins with a chat message for the System role that provides instructions for the behavior of the assistant, followed by alternating messages between the User and Assistant roles.

ChatRequestMessage is the base class. Depending on the scenario, assign an instance of a derived class here, or cast an element of this property to one of the derived classes. The available derived classes include ChatRequestAssistantMessage, ChatRequestSystemMessage, ChatRequestToolMessage, and ChatRequestUserMessage.

Model

ID of the specific AI model to use, if more than one model is available on the endpoint.

NucleusSamplingFactor

An alternative to sampling with temperature, called nucleus sampling. This value causes the model to consider only the tokens within the provided probability mass. As an example, a value of 0.15 will cause only the tokens comprising the top 15% of probability mass to be considered. It is not recommended to modify both Temperature and NucleusSamplingFactor (top_p) for the same completions request, as the interaction of these two settings is difficult to predict. Supported range is [0, 1].
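
To make the "probability mass" idea concrete, here is a small sketch of how a nucleus could be selected from a token distribution. It illustrates the general technique only and is not the service's implementation; the distribution and the NucleusCandidates helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical next-token distribution (token -> probability), for illustration only.
var probabilities = new Dictionary<string, double>
{
    ["the"] = 0.50,
    ["a"] = 0.25,
    ["an"] = 0.15,
    ["this"] = 0.10,
};

// Keeps the most probable tokens until their cumulative mass reaches topP.
static List<string> NucleusCandidates(IReadOnlyDictionary<string, double> probs, double topP)
{
    var kept = new List<string>();
    double cumulative = 0.0;
    foreach (var pair in probs.OrderByDescending(p => p.Value))
    {
        kept.Add(pair.Key);
        cumulative += pair.Value;
        if (cumulative >= topP) break;
    }
    return kept;
}

List<string> narrow = NucleusCandidates(probabilities, 0.15);
List<string> wide = NucleusCandidates(probabilities, 0.80);

Console.WriteLine(string.Join(", ", narrow)); // the
Console.WriteLine(string.Join(", ", wide));   // the, a, an
```

A smaller factor leaves fewer candidate tokens, so sampling becomes more focused; a factor of 1 keeps the whole distribution.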

PresencePenalty

A value that influences the probability of generated tokens appearing based on their existing presence in generated text. Positive values will make tokens less likely to appear when they already exist and increase the model's likelihood to output new topics. Supported range is [-2, 2].
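
The difference between a frequency penalty and a presence penalty can be sketched with one common formulation, in which a token's score is reduced in proportion to how often it has appeared (frequency) and by a flat amount once it has appeared at all (presence). This is an assumption for illustration; the model behind a given endpoint may apply penalties differently, and PenalizedLogit is a hypothetical helper.

```csharp
using System;

// Lowers a token's logit based on how often the token was already generated.
// count: number of prior occurrences of the token in the generated text.
static double PenalizedLogit(double logit, int count, double frequencyPenalty, double presencePenalty)
{
    double presence = count > 0 ? presencePenalty : 0.0; // flat, applied once
    return logit - count * frequencyPenalty - presence;  // frequency part scales with count
}

double unseen = PenalizedLogit(2.0, 0, 0.5, 0.5);
double seenOnce = PenalizedLogit(2.0, 1, 0.5, 0.5);
double seenThrice = PenalizedLogit(2.0, 3, 0.5, 0.5);

Console.WriteLine(unseen);     // 2
Console.WriteLine(seenOnce);   // 1
Console.WriteLine(seenThrice); // 0
```

Under this formulation, negative penalty values raise the score of tokens that have already appeared, which makes repetition more likely.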

ResponseFormat

An object specifying the format that the model must output.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.

ChatCompletionsResponseFormat is the base class. Depending on the scenario, assign an instance of a derived class here, or cast this property to one of the derived classes. The available derived classes include ChatCompletionsResponseFormatJsonObject and ChatCompletionsResponseFormatText.
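
Because JSON-mode content can be cut off when the token limit is reached, a caller may want to confirm that the content parses before deserializing it. The sketch below uses only System.Text.Json; IsCompleteJson is a hypothetical helper and the sample strings are made up for illustration.

```csharp
using System;
using System.Text.Json;

// Returns true only when the content parses as a complete JSON document.
static bool IsCompleteJson(string content)
{
    try
    {
        using JsonDocument document = JsonDocument.Parse(content);
        return true;
    }
    catch (JsonException)
    {
        // Truncated or otherwise malformed JSON ends up here.
        return false;
    }
}

bool complete = IsCompleteJson("{\"city\": \"Paris\", \"population\": 2100000}");
bool truncated = IsCompleteJson("{\"city\": \"Paris\", \"popul");

Console.WriteLine(complete);  // True
Console.WriteLine(truncated); // False
```

A failed check is a signal to inspect the finish reason and, if appropriate, retry with a larger MaxTokens value or a shorter conversation.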

Seed

If specified, the system will make a best effort to sample deterministically such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed.

StopSequences

A collection of textual sequences that will end completions generation.
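
The effect of a stop sequence can be pictured as cutting the generated text at the first place any stop sequence would appear. The service applies this during generation; the sketch below only imitates the outcome on a finished string, assuming the stop sequence itself is not returned. TruncateAtStop and the sample text are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Cuts the text at the earliest occurrence of any stop sequence.
static string TruncateAtStop(string text, IEnumerable<string> stopSequences)
{
    int cut = text.Length;
    foreach (string stop in stopSequences)
    {
        if (stop.Length == 0) continue; // an empty sequence would match everywhere
        int index = text.IndexOf(stop, StringComparison.Ordinal);
        if (index >= 0 && index < cut) cut = index;
    }
    return text.Substring(0, cut);
}

string generated = "Step 1: mix.\nStep 2: bake.\nEND\nStep 3: ignored";
string visible = TruncateAtStop(generated, new[] { "END", "\n\n" });

Console.WriteLine(visible);
```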

Temperature

The sampling temperature to use that controls the apparent creativity of generated completions. Higher values will make output more random, while lower values will make results more focused and deterministic. It is not recommended to modify both Temperature and NucleusSamplingFactor (top_p) for the same completions request, as the interaction of these two settings is difficult to predict. Supported range is [0, 1].
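
The usual way temperature shapes randomness is by dividing token scores by the temperature before they are turned into probabilities. The following sketch shows that general technique under stated assumptions; it is not the service's implementation, and SoftmaxWithTemperature and the example scores are hypothetical.

```csharp
using System;
using System.Linq;

// Converts logits to probabilities after dividing them by the temperature.
static double[] SoftmaxWithTemperature(double[] logits, double temperature)
{
    if (temperature <= 0.0)
    {
        // Treat zero as greedy selection: all probability on the largest logit.
        int best = Array.IndexOf(logits, logits.Max());
        return logits.Select((value, i) => i == best ? 1.0 : 0.0).ToArray();
    }

    double max = logits.Max(); // subtracting the maximum keeps Math.Exp stable
    double[] weights = logits.Select(l => Math.Exp((l - max) / temperature)).ToArray();
    double total = weights.Sum();
    return weights.Select(w => w / total).ToArray();
}

double[] exampleLogits = { 2.0, 1.0, 0.0 };
double[] sharp = SoftmaxWithTemperature(exampleLogits, 0.2);
double[] flat = SoftmaxWithTemperature(exampleLogits, 1.0);

// The top token takes a larger share of the probability at the lower temperature.
Console.WriteLine($"{sharp[0]:F3} vs {flat[0]:F3}");
```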

ToolChoice

If specified, controls which of the provided tools the model can use for the chat completions response.

Tools

A list of tools the model may request to call. Currently, only functions are supported as a tool. The model may respond with a function call request and provide the input arguments in JSON format for that function.
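
Since the input arguments arrive as JSON, the caller is responsible for parsing them and invoking the matching function. The sketch below shows one way to do that with System.Text.Json; the get_weather function, its arguments, and the GetWeather implementation are hypothetical and are not part of this library.

```csharp
using System;
using System.Text.Json;

// Hypothetical function call request: a function name plus JSON-encoded arguments.
string functionName = "get_weather";
string argumentsJson = "{\"city\": \"Seattle\", \"unit\": \"celsius\"}";

// Hypothetical local implementation of the requested function.
static string GetWeather(string city, string unit) => $"22 degrees {unit} in {city}";

using JsonDocument arguments = JsonDocument.Parse(argumentsJson);
JsonElement root = arguments.RootElement;

// Dispatch on the requested name and pass the parsed arguments along.
string result = functionName switch
{
    "get_weather" => GetWeather(
        root.GetProperty("city").GetString() ?? string.Empty,
        root.GetProperty("unit").GetString() ?? string.Empty),
    _ => throw new InvalidOperationException($"Unknown function: {functionName}"),
};

Console.WriteLine(result); // 22 degrees celsius in Seattle
```

In a real exchange, the result would typically be returned to the model in a follow-up message such as ChatRequestToolMessage.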

Methods

JsonModelWriteCore(Utf8JsonWriter, ModelReaderWriterOptions)

Explicit Interface Implementations

IJsonModel<ChatCompletionsOptions>.Create(Utf8JsonReader, ModelReaderWriterOptions)

Reads one JSON value (including objects or arrays) from the provided reader and converts it to a model.

IJsonModel<ChatCompletionsOptions>.Write(Utf8JsonWriter, ModelReaderWriterOptions)

Writes the model to the provided Utf8JsonWriter.

IPersistableModel<ChatCompletionsOptions>.Create(BinaryData, ModelReaderWriterOptions)

Converts the provided BinaryData into a model.

IPersistableModel<ChatCompletionsOptions>.GetFormatFromOptions(ModelReaderWriterOptions)

Gets the data interchange format (JSON, XML, etc.) that the model uses when communicating with the service.

IPersistableModel<ChatCompletionsOptions>.Write(ModelReaderWriterOptions)

Writes the model into a BinaryData.
