Learn how to use reproducible output (preview)
By default, if you ask an Azure OpenAI chat completion model the same question multiple times, you're likely to get a different response each time. The responses are therefore considered nondeterministic. Reproducible output is a new preview feature that allows you to selectively change the default behavior to help produce more deterministic outputs.
Reproducible output support
Reproducible output is currently only supported with the following:
Supported models
gpt-35-turbo (1106)
gpt-35-turbo (0125)
gpt-4 (1106-Preview)
gpt-4 (0125-Preview)
gpt-4 (turbo-2024-04-09)
gpt-4o-mini (2024-07-18)
gpt-4o (2024-05-13)
For the latest information on model regional availability, refer to the models page.
API versions
Support for reproducible output was first added in API version 2023-12-01-preview.
Example
First we'll generate three responses to the same question to demonstrate the variability that is common to chat completion responses even when other parameters are the same:
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-02-01"
)

for i in range(3):
    print(f'Story Version {i + 1}\n---')

    response = client.chat.completions.create(
        model="gpt-35-turbo-0125",  # Model = should match the deployment name you chose for your 0125 model deployment
        #seed=42,
        temperature=0.7,
        max_tokens=50,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Tell me a story about how the universe began?"}
        ]
    )

    print(response.choices[0].message.content)
    print("---\n")

del response
Output
Story Version 1
---
Once upon a time, before there was time, there was nothing but a vast emptiness. In this emptiness, there existed a tiny, infinitely dense point of energy. This point contained all the potential for the universe as we know it. And
---
Story Version 2
---
Once upon a time, long before the existence of time itself, there was nothing but darkness and silence. The universe lay dormant, a vast expanse of emptiness waiting to be awakened. And then, in a moment that defies comprehension, there
---
Story Version 3
---
Once upon a time, before time even existed, there was nothing but darkness and stillness. In this vast emptiness, there was a tiny speck of unimaginable energy and potential. This speck held within it all the elements that would come
Notice that while each story might have similar elements and some verbatim repetition, the longer the response goes on, the more they tend to diverge.
Now we'll run the same code as before, but this time uncomment the line for the parameter that says seed=42.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-02-01"
)

for i in range(3):
    print(f'Story Version {i + 1}\n---')

    response = client.chat.completions.create(
        model="gpt-35-turbo-0125",  # Model = should match the deployment name you chose for your 0125 model deployment
        seed=42,
        temperature=0.7,
        max_tokens=50,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Tell me a story about how the universe began?"}
        ]
    )

    print(response.choices[0].message.content)
    print("---\n")

del response
Output
Story Version 1
---
In the beginning, there was nothing but darkness and silence. Then, suddenly, a tiny point of light appeared. This point of light contained all the energy and matter that would eventually form the entire universe. With a massive explosion known as the Big Bang
---
Story Version 2
---
In the beginning, there was nothing but darkness and silence. Then, suddenly, a tiny point of light appeared. This point of light contained all the energy and matter that would eventually form the entire universe. With a massive explosion known as the Big Bang
---
Story Version 3
---
In the beginning, there was nothing but darkness and silence. Then, suddenly, a tiny point of light appeared. This was the moment when the universe was born.
The point of light began to expand rapidly, creating space and time as it grew.
---
By using the same seed parameter of 42 for each of our three requests, while keeping all other parameters the same, we're able to produce much more consistent results.
Important
Determinism isn't guaranteed with reproducible output. Even in cases where the seed parameter and system_fingerprint are the same across API calls, it's currently not uncommon to still observe a degree of variability in responses. Identical API calls with larger max_tokens values will generally result in less deterministic responses even when the seed parameter is set.
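One lightweight way to gauge the residual variability described above is to collect the completion text from several seeded runs and count how many distinct responses you actually received. A minimal sketch (the `responses` list stands in for text captured from real API calls, shaped like the output shown earlier):

```python
# Count how many distinct completions came back across repeated seeded runs.
# These strings stand in for response.choices[0].message.content values
# collected from real API calls with the same seed and parameters.
responses = [
    "In the beginning, there was nothing but darkness and silence.",
    "In the beginning, there was nothing but darkness and silence.",
    "In the beginning, there was nothing but darkness. Then light appeared.",
]

distinct = len(set(responses))
print(f"{distinct} distinct completion(s) out of {len(responses)} runs")
# → 2 distinct completion(s) out of 3 runs
```

With the seed set you should see the distinct count drop toward 1, but as the note above explains, a count greater than 1 is still possible.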
Parameter details
seed is an optional parameter, which can be set to an integer or null.
This feature is in preview. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism isn't guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
system_fingerprint is a string and is part of the chat completion object.
This fingerprint represents the backend configuration that the model runs with.
It can be used with the seed request parameter to understand when backend changes have been made that might affect determinism.
To view the full chat completion object with system_fingerprint, you could add print(response.model_dump_json(indent=2)) to the previous Python code next to the existing print statement, or add $response | convertto-json -depth 5 at the end of the PowerShell example. This change results in the following additional information being part of the output:
Output
{
"id": "chatcmpl-8LmLRatZxp8wsx07KGLKQF0b8Zez3",
"choices": [
{
"finish_reason": "length",
"index": 0,
"message": {
"content": "In the beginning, there was nothing but a vast emptiness, a void without form or substance. Then, from this nothingness, a singular event occurred that would change the course of existence forever—The Big Bang.\n\nAround 13.8 billion years ago, an infinitely hot and dense point, no larger than a single atom, began to expand at an inconceivable speed. This was the birth of our universe, a moment where time and space came into being. As this primordial fireball grew, it cooled, and the fundamental forces that govern the cosmos—gravity, electromagnetism, and the strong and weak nuclear forces—began to take shape.\n\nMatter coalesced into the simplest elements, hydrogen and helium, which later formed vast clouds in the expanding universe. These clouds, driven by the force of gravity, began to collapse in on themselves, creating the first stars. The stars were crucibles of nuclear fusion, forging heavier elements like carbon, nitrogen, and oxygen",
"role": "assistant",
"function_call": null,
"tool_calls": null
},
"content_filter_results": {
"hate": {
"filtered": false,
"severity": "safe"
},
"self_harm": {
"filtered": false,
"severity": "safe"
},
"sexual": {
"filtered": false,
"severity": "safe"
},
"violence": {
"filtered": false,
"severity": "safe"
}
}
}
],
"created": 1700201417,
"model": "gpt-4",
"object": "chat.completion",
"system_fingerprint": "fp_50a4261de5",
"usage": {
"completion_tokens": 200,
"prompt_tokens": 27,
"total_tokens": 227
},
"prompt_filter_results": [
{
"prompt_index": 0,
"content_filter_results": {
"hate": {
"filtered": false,
"severity": "safe"
},
"self_harm": {
"filtered": false,
"severity": "safe"
},
"sexual": {
"filtered": false,
"severity": "safe"
},
"violence": {
"filtered": false,
"severity": "safe"
}
}
}
]
}
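Since system_fingerprint is the signal for backend configuration changes, one approach is to record it on each call and flag when it differs from the previous run. A minimal sketch; the helper name fingerprint_changed is illustrative, not part of the SDK, and the fingerprint values mirror the output above:

```python
def fingerprint_changed(previous, current):
    """Return True when the backend fingerprint differs from the last call,
    which signals that determinism may be affected."""
    return previous is not None and previous != current

# In practice you'd read response.system_fingerprint after each call;
# these sample values are shaped like the output above.
last = None
for fp in ["fp_50a4261de5", "fp_50a4261de5", "fp_1234567890"]:
    if fingerprint_changed(last, fp):
        print(f"Backend configuration changed: {last} -> {fp}")
    last = fp
```

A change in fingerprint doesn't invalidate your responses, but it does mean seeded results from before and after the change shouldn't be expected to match.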
Other considerations
When you want to use reproducible outputs, you need to set the seed to the same integer across chat completions calls. You should also match any other parameters like temperature, max_tokens, etc.
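A simple way to keep the seed and the other parameters aligned across calls is to define them once and reuse them for every request. A sketch, assuming the same client setup as the examples above (the SHARED_PARAMS dictionary and build_request helper are illustrative names, not part of the SDK):

```python
# Define the reproducibility-sensitive parameters once so every call matches.
SHARED_PARAMS = {
    "model": "gpt-35-turbo-0125",  # should match your deployment name
    "seed": 42,
    "temperature": 0.7,
    "max_tokens": 50,
}

def build_request(messages):
    """Merge the fixed parameters with the per-call messages."""
    return {**SHARED_PARAMS, "messages": messages}

# Each call then becomes:
# response = client.chat.completions.create(**build_request(messages))
```

Centralizing the parameters this way makes it harder to accidentally drift on temperature or max_tokens between calls, which would undermine the seeded reproducibility.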