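Azure OpenAI reasoning models

Article
01/18/2025

The Azure OpenAI o1 and o1-mini models are designed to tackle reasoning and problem-solving tasks with increased focus and capability. They spend more time processing and understanding the user's request, which makes them stronger than previous iterations in areas such as science, coding, and math.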

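Key capabilities of the o1 series:

Complex code generation: Capable of generating algorithms and handling advanced coding tasks to support developers.
Advanced problem solving: Well suited to comprehensive brainstorming sessions and to multifaceted challenges.
Complex document comparison: Well suited to analyzing contracts, case files, or legal documents to identify subtle differences.
Instruction following and workflow management: Particularly effective for managing workflows that require shorter contexts.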

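Availability

The o1 series models are now available for API access and model deployment. Registration is required, and access is granted based on Microsoft's eligibility criteria. Customers who previously applied for and received access to o1-preview don't need to reapply, because they are automatically placed on the waitlist for the latest model.

Request access: limited access model application

Once access has been granted, you need to create a deployment for each model. If you have an existing o1-preview deployment, in-place upgrade is currently not supported; you need to create a new deployment.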

Region availability

Model	Region
`o1`	East US 2 (Global Standard), Sweden Central (Global Standard)
`o1-preview`	See the models page.
`o1-mini`	See the models page.

API and feature support

Feature	o1, 2024-12-17	o1-preview, 2024-09-12	o1-mini, 2024-09-12
API version	`2024-12-01-preview`	`2024-09-01-preview`, `2024-10-01-preview`, `2024-12-01-preview`	`2024-09-01-preview`, `2024-10-01-preview`, `2024-12-01-preview`
Developer messages	✅	-	-
Structured outputs	✅	-	-
Context window	Input: 200,000 Output: 100,000	Input: 128,000 Output: 32,768	Input: 128,000 Output: 65,536
Reasoning effort	✅	-	-
System messages	-	-	-
Functions/Tools	✅	-	-
`max_completion_tokens`	✅	✅	✅

o1 series models only work with the `max_completion_tokens` parameter.
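As a rough illustration of the limits in the table, the sketch below caps a requested `max_completion_tokens` value at the output limit listed for each model. The dictionary and function names are hypothetical and not part of any SDK; only the numbers come from the table.

```python
# Token limits copied from the feature table; the names here are illustrative only.
CONTEXT_LIMITS = {
    "o1": {"input": 200_000, "output": 100_000},
    "o1-preview": {"input": 128_000, "output": 32_768},
    "o1-mini": {"input": 128_000, "output": 65_536},
}


def clamp_max_completion_tokens(model_family: str, requested: int) -> int:
    """Return `requested`, capped at the output limit listed for the model family."""
    if requested <= 0:
        raise ValueError("requested must be a positive number of tokens")
    limit = CONTEXT_LIMITS[model_family]["output"]
    return min(requested, limit)


print(clamp_max_completion_tokens("o1-preview", 50_000))  # prints 32768
```

The returned value could then be passed as `max_completion_tokens` in a chat completions request.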

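Important

There is a known issue with the `o1` model and the `tool_choice` parameter. Currently, function calls that include the optional `tool_choice` parameter fail. This page will be updated once the issue is resolved.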

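Not supported

The following are currently unsupported with o1 series models:

System messages
Streaming
Parallel tool calling
`temperature`, `top_p`, `presence_penalty`, `frequency_penalty`, `logprobs`, `top_logprobs`, `logit_bias`, `max_tokens`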
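When reusing request settings written for other chat completions models, one option is to filter them before calling an o1 series deployment. The following is a minimal sketch under that assumption: `adapt_params_for_o1` is a hypothetical helper, and carrying `max_tokens` over to `max_completion_tokens` is an illustrative choice rather than documented behavior.

```python
# Parameters that the list above names as unsupported by o1 series models.
UNSUPPORTED_PARAMS = {
    "temperature", "top_p", "presence_penalty", "frequency_penalty",
    "logprobs", "top_logprobs", "logit_bias", "max_tokens",
}


def adapt_params_for_o1(params: dict) -> dict:
    """Return a copy of `params` with unsupported settings removed.

    Assumption: if `max_tokens` was set and `max_completion_tokens` was not,
    the old value is reused for `max_completion_tokens`.
    """
    adapted = {k: v for k, v in params.items() if k not in UNSUPPORTED_PARAMS}
    if "max_tokens" in params and "max_completion_tokens" not in params:
        adapted["max_completion_tokens"] = params["max_tokens"]
    # Streaming and parallel tool calls are also listed as unsupported.
    adapted.pop("stream", None)
    adapted.pop("parallel_tool_calls", None)
    return adapted


legacy = {"temperature": 0.2, "max_tokens": 1000, "stream": True}
print(adapt_params_for_o1(legacy))  # prints {'max_completion_tokens': 1000}
```

Because these models also spend tokens on hidden reasoning, a limit carried over from `max_tokens` may need to be raised to leave room for the visible answer.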

Usage

These models don't currently support the same set of parameters as other models that use the chat completions API.

Python (Microsoft Entra ID)
Python (key-based authentication)

You need to upgrade your OpenAI client library to access the latest parameters.

pip install openai --upgrade

If you are new to using Microsoft Entra ID for authentication, see How to configure Azure OpenAI Service with Microsoft Entra ID authentication.

import os  # needed for os.getenv below

from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
  azure_ad_token_provider=token_provider,
  api_version="2024-12-01-preview"
)

response = client.chat.completions.create(
    model="o1-new", # replace with the model deployment name of your o1-preview, or o1-mini model
    messages=[
        {"role": "user", "content": "What steps should I think about when writing my first Python API?"},
    ],
    max_completion_tokens = 5000

)

print(response.model_dump_json(indent=2))

You might need to upgrade your version of the OpenAI Python library to take advantage of new parameters such as `max_completion_tokens`.

pip install openai --upgrade


import os  # needed for os.getenv below

from openai import AzureOpenAI

client = AzureOpenAI(
  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
  api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
  api_version="2024-12-01-preview"
)

response = client.chat.completions.create(
    model="o1-new", # replace with the model deployment name of your o1 deployment.
    messages=[
        {"role": "user", "content": "What steps should I think about when writing my first Python API?"},
    ],
    max_completion_tokens = 5000

)

print(response.model_dump_json(indent=2))

Output:

{
  "id": "chatcmpl-AEj7pKFoiTqDPHuxOcirA9KIvf3yz",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Writing your first Python API is an exciting step in developing software that can communicate with other applications. An API (Application Programming Interface) allows different software systems to interact with each other, enabling data exchange and functionality sharing. Here are the steps you should consider when creating your first Python API...truncated for brevity.",
        "refusal": null,
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      },
      "content_filter_results": {
        "hate": {
          "filtered": false,
          "severity": "safe"
        },
        "protected_material_code": {
          "filtered": false,
          "detected": false
        },
        "protected_material_text": {
          "filtered": false,
          "detected": false
        },
        "self_harm": {
          "filtered": false,
          "severity": "safe"
        },
        "sexual": {
          "filtered": false,
          "severity": "safe"
        },
        "violence": {
          "filtered": false,
          "severity": "safe"
        }
      }
    }
  ],
  "created": 1728073417,
  "model": "o1-2024-12-17",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": "fp_503a95a7d8",
  "usage": {
    "completion_tokens": 1843,
    "prompt_tokens": 20,
    "total_tokens": 1863,
    "completion_tokens_details": {
      "audio_tokens": null,
      "reasoning_tokens": 448
    },
    "prompt_tokens_details": {
      "audio_tokens": null,
      "cached_tokens": 0
    }
  },
  "prompt_filter_results": [
    {
      "prompt_index": 0,
      "content_filter_results": {
        "custom_blocklists": {
          "filtered": false
        },
        "hate": {
          "filtered": false,
          "severity": "safe"
        },
        "jailbreak": {
          "filtered": false,
          "detected": false
        },
        "self_harm": {
          "filtered": false,
          "severity": "safe"
        },
        "sexual": {
          "filtered": false,
          "severity": "safe"
        },
        "violence": {
          "filtered": false,
          "severity": "safe"
        }
      }
    }
  ]
}
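The output above carries content filter annotations for both the prompt and the completion. As a minimal sketch, assuming the response has been converted to a plain dictionary (for example with `response.model_dump()`), the hypothetical helper below lists any categories marked as filtered.

```python
def filtered_categories(response: dict) -> dict:
    """Collect content filter categories marked as filtered, for prompt and completion."""

    def flagged(results: dict) -> list:
        # Keep only the categories whose "filtered" flag is true.
        return sorted(name for name, r in results.items() if r.get("filtered"))

    prompt = []
    for entry in response.get("prompt_filter_results", []):
        prompt.extend(flagged(entry.get("content_filter_results", {})))

    completion = []
    for choice in response.get("choices", []):
        completion.extend(flagged(choice.get("content_filter_results", {})))

    return {"prompt": prompt, "completion": completion}


# Trimmed-down version of the sample output, used here only for illustration.
sample = {
    "choices": [
        {"content_filter_results": {
            "hate": {"filtered": False, "severity": "safe"},
            "violence": {"filtered": False, "severity": "safe"},
        }}
    ],
    "prompt_filter_results": [
        {"prompt_index": 0, "content_filter_results": {
            "jailbreak": {"filtered": False, "detected": False},
        }}
    ],
}
print(filtered_categories(sample))  # prints {'prompt': [], 'completion': []}
```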

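Reasoning effort

Note

Reasoning models report `reasoning_tokens` as part of `completion_tokens_details` in the model response. These are hidden tokens: they aren't returned as part of the message response content, but the model uses them to help generate a final answer to your request. API version `2024-12-01-preview` adds a new parameter, `reasoning_effort`, which can be set to `low`, `medium`, or `high` with the latest `o1` model. The higher the effort setting, the longer the model spends processing the request, which generally results in a larger number of `reasoning_tokens`.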
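A minimal sketch of how `reasoning_effort` might be set and how the hidden token count could be read back. The helper names are hypothetical, `o1-new` is the placeholder deployment name from the earlier examples, and the sketch assumes `reasoning_tokens` is counted within `completion_tokens`, as the sample output suggests.

```python
ALLOWED_EFFORTS = ("low", "medium", "high")


def build_request(deployment: str, prompt: str, reasoning_effort: str,
                  max_completion_tokens: int = 5000) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    if reasoning_effort not in ALLOWED_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {ALLOWED_EFFORTS}")
    return {
        "model": deployment,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": reasoning_effort,
        "max_completion_tokens": max_completion_tokens,
    }


def reasoning_share(usage: dict) -> float:
    """Fraction of completion tokens that were hidden reasoning tokens."""
    details = usage.get("completion_tokens_details") or {}
    reasoning = details.get("reasoning_tokens") or 0
    total = usage.get("completion_tokens") or 0
    return reasoning / total if total else 0.0


request = build_request("o1-new", "Summarize the trade-offs of REST vs. gRPC.", "low")
# With a configured client: response = client.chat.completions.create(**request)

# Usage figures taken from the sample output shown earlier.
usage = {"completion_tokens": 1843, "completion_tokens_details": {"reasoning_tokens": 448}}
print(f"{reasoning_share(usage):.1%} of completion tokens were reasoning tokens")
```

Comparing this share across `low`, `medium`, and `high` on the same prompt is one way to see how the setting affects token usage.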

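Developer messages

Functionally, developer messages (`"role": "developer"`) are the same as system messages.

System messages aren't supported with the o1 series reasoning models.
`o1-2024-12-17` adds support for developer messages when used with API version `2024-12-01-preview` and later.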
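One way to handle the difference between models is to build the message list conditionally. The sketch below is illustrative only: `build_messages` is a hypothetical helper, and folding the instructions into the user message for models without developer message support is an assumed workaround, not documented guidance.

```python
def build_messages(instructions: str, user_prompt: str, supports_developer: bool) -> list:
    """Return a chat messages list, using the developer role where it is supported."""
    if supports_developer:
        return [
            {"role": "developer", "content": instructions},
            {"role": "user", "content": user_prompt},
        ]
    # Assumed fallback: prepend the instructions to the user message instead.
    return [{"role": "user", "content": f"{instructions}\n\n{user_prompt}"}]


messages = build_messages(
    "You are a helpful assistant.",
    "What steps should I think about when writing my first Python API?",
    supports_developer=True,
)
print(messages[0]["role"])  # prints developer
```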

Adding a developer message to the previous code example would look as follows:

Python (Microsoft Entra ID)
Python (key-based authentication)