How to use reasoning models with Azure AI model inference

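Important

Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

This article explains how to use the reasoning capabilities of chat completions models deployed to Azure AI model inference in Azure AI services.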

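Reasoning models

Reasoning models can reach higher levels of performance in domains such as math, coding, science, strategy, and logistics. These models produce outputs by explicitly using chain of thought to explore possible paths before generating an answer. They verify their answers as they produce them, which helps them reach more accurate conclusions. This means that reasoning models may require less context in the prompt to produce effective results.

This way of scaling a model's performance is referred to as inference compute time, because it trades performance against higher latency and cost. It contrasts with other approaches that scale through training compute time.

Reasoning models then produce two types of outputs:

Reasoning completions
Output completions

Both of these completions count toward the content generated by the model and, therefore, toward the token limits and costs associated with the model. Some models, such as DeepSeek-R1, may output the reasoning content. Others, such as o1, output only the output piece of the completions.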
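As a rough illustration of how both parts count toward usage, the sketch below adds prompt tokens and completion tokens to get the total, using the figures from the sample output shown later in this article. The Usage class is a hypothetical stand-in for the usage statistics returned by the service and is not part of any SDK.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical stand-in for the usage statistics in a response."""
    prompt_tokens: int
    completion_tokens: int

    @property
    def total_tokens(self) -> int:
        # Completion tokens include everything the model generated:
        # the reasoning as well as the final answer.
        return self.prompt_tokens + self.completion_tokens


# Figures taken from the sample output in this article.
usage = Usage(prompt_tokens=11, completion_tokens=886)
print("Total tokens:", usage.total_tokens)
```

A long reasoning trace can therefore dominate the bill even when the final answer is short.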

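Prerequisites

To complete this tutorial, you need:

An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. If that's your case, read Upgrade from GitHub Models to Azure AI model inference.
An Azure AI services resource. For more information, see Create an Azure AI Services resource.
The endpoint URL and key.

A model deployment with reasoning capabilities. If you don't have one, read Add and configure models to Azure AI services to add a reasoning model.
- This example uses DeepSeek-R1.
Install the Azure AI inference package with the following command: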
```
pip install -U azure-ai-inference
```

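Use reasoning capabilities with chat

First, create the client to consume the model. The following code uses the endpoint URL of your resource and a key that is read from an environment variable.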

import os
from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<resource>.services.ai.azure.com/models",
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
    model="deepseek-r1"
)
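If you prefer to keep the endpoint out of the source code as well, a small helper can read both values from the environment and fail early when one is missing. This is only a sketch: the variable name AZURE_INFERENCE_ENDPOINT is an assumption borrowed from the JavaScript sample in this article, and load_inference_settings is a hypothetical helper, not an SDK function.

```python
import os


def load_inference_settings(env=os.environ):
    """Return (endpoint, key) read from environment variables.

    Both variable names are assumptions; rename them to match your setup.
    """
    try:
        endpoint = env["AZURE_INFERENCE_ENDPOINT"]
        key = env["AZURE_INFERENCE_CREDENTIAL"]
    except KeyError as missing:
        # Report which variable is absent instead of a bare KeyError.
        raise RuntimeError(
            f"Set the {missing.args[0]} environment variable"
        ) from None
    return endpoint, key
```

The returned values can then be passed as the endpoint and as the AzureKeyCredential argument when creating the client.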

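Tip

Verify that you have deployed the model to an Azure AI Services resource with the Azure AI model inference API. DeepSeek-R1 is also available as a serverless API endpoint. However, those endpoints don't take the model parameter as explained in this tutorial. You can verify this by going to the Azure AI Foundry portal > Models + endpoints and confirming that the model is listed under the Azure AI Services section.

If you have configured the resource with Microsoft Entra ID support, you can use the following code snippet to create a client.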

import os
from azure.ai.inference import ChatCompletionsClient
from azure.identity import DefaultAzureCredential

client = ChatCompletionsClient(
    endpoint="https://<resource>.services.ai.azure.com/models",
    credential=DefaultAzureCredential(),
    credential_scopes=["https://cognitiveservices.azure.com/.default"],
    model="deepseek-r1"
)

Create a chat completion request

The following example shows how to create a basic chat request to the model.

from azure.ai.inference.models import SystemMessage, UserMessage

response = client.complete(
    messages=[
        UserMessage(content="How many languages are in the world?"),
    ],
)

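When building prompts for reasoning models, take the following into consideration:

Use simple instructions and avoid using chain-of-thought techniques.
Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
When providing additional context or documents, as in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
When creating multi-turn applications, consider appending only the final answer from the model, without its reasoning content, as explained in the Reasoning content section.

The response is as follows, where you can see the model's usage statistics: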

print("Response:", response.choices[0].message.content)
print("Model:", response.model)
print("Usage:")
print("\tPrompt tokens:", response.usage.prompt_tokens)
print("\tTotal tokens:", response.usage.total_tokens)
print("\tCompletion tokens:", response.usage.completion_tokens)

Response: <think>Okay, the user is asking how many languages exist in the world. I need to provide a clear and accurate answer...</think>As of now, it's estimated that there are about 7,000 languages spoken around the world. However, this number can vary as some languages become extinct and new ones develop. It's also important to note that the number of speakers can greatly vary between languages, with some having millions of speakers and others only a few hundred.
Model: deepseek-r1
Usage: 
  Prompt tokens: 11
  Total tokens: 897
  Completion tokens: 886

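Reasoning content

Some reasoning models, like DeepSeek-R1, generate completions and include the reasoning behind them. The reasoning associated with the completion is included in the response's content within the tags <think> and </think>. The model may select the scenarios in which it generates reasoning content. You can extract the reasoning content from the response to understand the model's thought process as follows: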

import re

match = re.match(r"<think>(.*?)</think>(.*)", response.choices[0].message.content, re.DOTALL)

print("Response:")
if match:
    print("\tThinking:", match.group(1))
    print("\tAnswer:", match.group(2))
else:
    print("\tAnswer:", response.choices[0].message.content)
print("Model:", response.model)
print("Usage:")
print("\tPrompt tokens:", response.usage.prompt_tokens)
print("\tTotal tokens:", response.usage.total_tokens)
print("\tCompletion tokens:", response.usage.completion_tokens)

Thinking: Okay, the user is asking how many languages exist in the world. I need to provide a clear and accurate answer. Let's start by recalling the general consensus from linguistic sources. I remember that the number often cited is around 7,000, but maybe I should check some reputable organizations.\n\nEthnologue is a well-known resource for language data, and I think they list about 7,000 languages. But wait, do they update their numbers? It might be around 7,100 or so. Also, the exact count can vary because some sources might categorize dialects differently or have more recent data. \n\nAnother thing to consider is language endangerment. Many languages are endangered, with some having only a few speakers left. Organizations like UNESCO track endangered languages, so mentioning that adds context. Also, the distribution isn't even. Some countries have hundreds of languages, like Papua New Guinea with over 800, while others have just a few. \n\nA user might also wonder why the exact number is hard to pin down. It's because the distinction between a language and a dialect can be political or cultural. For example, Mandarin and Cantonese are considered dialects of Chinese by some, but they're mutually unintelligible, so others classify them as separate languages. Also, some regions are under-researched, making it hard to document all languages. \n\nI should also touch on language families. The 7,000 languages are grouped into families like Indo-European, Sino-Tibetan, Niger-Congo, etc. Maybe mention a few of the largest families. But wait, the question is just about the count, not the families. Still, it's good to provide a bit more context. \n\nI need to make sure the information is up-to-date. Let me think – recent estimates still hover around 7,000. However, languages are dying out rapidly, so the number decreases over time. Including that note about endangerment and language extinction rates could be helpful. 
For instance, it's often stated that a language dies every few weeks. \n\nAnother point is sign languages. Does the count include them? Ethnologue includes some, but not all sources might. If the user is including sign languages, that adds more to the count, but I think the 7,000 figure typically refers to spoken languages. For thoroughness, maybe mention that there are also over 300 sign languages. \n\nSummarizing, the answer should state around 7,000, mention Ethnologue's figure, explain why the exact number varies, touch on endangerment, and possibly note sign languages as a separate category. Also, a brief mention of Papua New Guinea as the most linguistically diverse country. \n\nWait, let me verify Ethnologue's current number. As of their latest edition (25th, 2022), they list 7,168 living languages. But I should check if that's the case. Some sources might round to 7,000. Also, SIL International publishes Ethnologue, so citing them as reference makes sense. \n\nOther sources, like Glottolog, might have a different count because they use different criteria. Glottolog might list around 7,000 as well, but exact numbers vary. It's important to highlight that the count isn't exact because of differing definitions and ongoing research. \n\nIn conclusion, the approximate number is 7,000, with Ethnologue being a key source, considerations of endangerment, and the challenges in counting due to dialect vs. language distinctions. I should make sure the answer is clear, acknowledges the variability, and provides key points succinctly.

Answer: The exact number of languages in the world is challenging to determine due to differences in definitions (e.g., distinguishing languages from dialects) and ongoing documentation efforts. However, widely cited estimates suggest there are approximately **7,000 languages** globally.
Model: DeepSeek-R1
Usage: 
  Prompt tokens: 11
  Total tokens: 897
  Completion tokens: 886

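When making multi-turn conversations, it's useful to avoid sending the reasoning content in the chat history, because reasoning tends to generate long explanations.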
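One way to follow this advice is to strip the <think> block before storing the assistant turn. The sketch below keeps the history as plain role/content dictionaries for illustration; strip_reasoning and append_turn are hypothetical helpers, and in a real application you would likely store the SDK's message objects instead.

```python
import re

# Matches a reasoning block, including newlines inside it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_reasoning(content: str) -> str:
    """Return the content without any <think>...</think> block."""
    return _THINK_BLOCK.sub("", content).strip()


def append_turn(history, user_text, assistant_content):
    """Append one user/assistant exchange, keeping only the final answer."""
    history.append({"role": "user", "content": user_text})
    history.append(
        {"role": "assistant", "content": strip_reasoning(assistant_content)}
    )
    return history


history = append_turn(
    [],
    "How many languages are in the world?",
    "<think>Okay, the user is asking...\nLet me recall.</think>About 7,000.",
)
print(history[-1]["content"])
```

Keeping only the final answer in the history avoids resending the reasoning tokens on every later turn.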

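Stream content

By default, the completions API returns the entire generated content in a single response. If you're generating long completions, waiting for the response can take many seconds.

You can stream the content to get it as it's being generated. Streaming content allows you to start processing the completion as content becomes available. This mode returns an object that streams back the response as data-only server-sent events. Extract chunks from the delta field, rather than the message field.

To stream completions, set stream=True when you call the model.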

result = client.complete(
    model="deepseek-r1",
    messages=[
        UserMessage(content="How many languages are in the world?"),
    ],
    max_tokens=2048,
    stream=True,
)

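To visualize the output, define a helper function to print the stream. The following example implements a routine that streams only the answer, without the reasoning content:

def print_stream(result):
    """
    Prints the chat completion with streaming, skipping the reasoning content.
    """
    is_thinking = False
    for event in result:
        if event.choices:
            content = event.choices[0].delta.content
            if content == "<think>":
                # The model started reasoning; stop printing content.
                is_thinking = True
                print("🧠 Thinking...", end="", flush=True)
            elif content == "</think>":
                # Reasoning is over; everything after this is the answer.
                is_thinking = False
                print("🛑\n\n")
            elif content and not is_thinking:
                print(content, end="", flush=True)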

You can visualize how streaming generates content:

print_stream(result)
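The helper above compares each chunk against the exact tag text, which only works if a tag arrives as a chunk of its own. As a more defensive sketch, the generator below buffers text until the closing tag has been seen and then passes the rest through. It works on plain strings, so you would feed it the delta content of each event; answer_only is a hypothetical name, and how tags are actually split across chunks is an assumption.

```python
OPEN_TAG, CLOSE_TAG = "<think>", "</think>"


def answer_only(chunks):
    """Yield only the text that follows the reasoning block.

    If the stream does not start with a <think> tag, everything is yielded.
    If the stream ends inside an unclosed reasoning block, nothing is yielded.
    """
    buffer = ""
    in_answer = False
    for chunk in chunks:
        if in_answer:
            if chunk:
                yield chunk
            continue
        buffer += chunk
        stripped = buffer.lstrip()
        if stripped.startswith(OPEN_TAG):
            if CLOSE_TAG in stripped:
                # Reasoning finished: emit whatever follows the closing tag.
                in_answer = True
                tail = stripped.split(CLOSE_TAG, 1)[1]
                if tail:
                    yield tail
        elif not OPEN_TAG.startswith(stripped):
            # The text cannot become "<think>", so there is no reasoning block.
            in_answer = True
            yield buffer


pieces = ["<th", "ink>Let me", " think</th", "ink>About", " 7,000."]
print("".join(answer_only(pieces)))
```

The trade-off is that nothing is shown until the reasoning block closes, so a progress indicator may still be worth printing while the buffer fills.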

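Parameters

In general, reasoning models don't support the following parameters that you can find in chat completion models:

Temperature
Presence penalty
Repetition penalty
Parameter top_p

Some models support the use of tools or structured outputs (including JSON schemas). Read the Models details page to understand each model's support.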
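If the same request settings are shared between reasoning and non-reasoning models, one option is to filter them before calling a reasoning model. The sketch below is illustrative only: the key names are assumptions that mirror the list above, so map them to whatever names your client actually uses, and drop_unsupported is a hypothetical helper.

```python
# Assumed key names for the parameters listed above; adjust to your client.
UNSUPPORTED_FOR_REASONING = {
    "temperature",
    "presence_penalty",
    "repetition_penalty",
    "top_p",
}


def drop_unsupported(params: dict) -> dict:
    """Return a copy of params without the generally unsupported settings."""
    return {
        name: value
        for name, value in params.items()
        if name not in UNSUPPORTED_FOR_REASONING
    }


shared = {"temperature": 0.2, "top_p": 0.9, "max_tokens": 2048}
print(drop_unsupported(shared))
```

Because support varies by model, check the model's details page before filtering anything out.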

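Apply content safety

The Azure AI model inference API supports Azure AI Content Safety. When you use deployments with Azure AI Content Safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions.

The following example shows how to handle events when the model detects harmful content in the input prompt and content safety is enabled.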

from azure.ai.inference.models import UserMessage
from azure.core.exceptions import HttpResponseError

try:
    response = client.complete(
        model="deepseek-r1",
        messages=[
            UserMessage(content="Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills."),
        ],
    )

    print(response.choices[0].message.content)

except HttpResponseError as ex:
    if ex.status_code == 400:
        response = ex.response.json()
        if isinstance(response, dict) and "error" in response:
            # The request was rejected; report the error instead of re-raising.
            print(f"Your request triggered an {response['error']['code']} error:\n\t {response['error']['message']}")
        else:
            raise
    else:
        # Any other status code is not handled here.
        raise
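The payload check in the handler above can be factored into a small function that is easy to test without calling the service. This is a sketch under the assumption that error payloads have the shape used above, a dictionary with an "error" entry that carries "code" and "message"; describe_error is a hypothetical helper and the sample payload is made up for illustration.

```python
def describe_error(payload):
    """Return a readable message for an error payload.

    Returns None when the payload does not have the expected shape, so the
    caller can decide to re-raise the original exception.
    """
    if isinstance(payload, dict) and isinstance(payload.get("error"), dict):
        error = payload["error"]
        code = error.get("code", "unknown")
        message = error.get("message", "")
        return f"Your request triggered an {code} error:\n\t {message}"
    return None


# Made-up payload, only to show the expected shape.
sample = {"error": {"code": "example_code", "message": "Example message."}}
print(describe_error(sample))
```

In the handler, you would call describe_error(ex.response.json()) and re-raise when it returns None.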

Tip

To learn more about how you can configure and control Azure AI Content Safety settings, check the Azure AI Content Safety documentation.

Install the Azure Inference library for JavaScript with the following command:
```
npm install @azure-rest/ai-inference
```

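Use reasoning capabilities with chat

First, create the client to consume the model. The following code uses an endpoint URL and key that are stored in environment variables.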

import ModelClient from "@azure-rest/ai-inference";
import { isUnexpected } from "@azure-rest/ai-inference";
import { AzureKeyCredential } from "@azure/core-auth";

const client = new ModelClient(
    process.env.AZURE_INFERENCE_ENDPOINT, 
    new AzureKeyCredential(process.env.AZURE_INFERENCE_CREDENTIAL)
);

If you have configured the resource with Microsoft Entra ID support, you can use the following code snippet to create a client.

import ModelClient from "@azure-rest/ai-inference";
import { isUnexpected } from "@azure-rest/ai-inference";
import { DefaultAzureCredential } from "@azure/identity";

const clientOptions = { credentials: { scopes: ["https://cognitiveservices.azure.com/.default"] } };

const client = new ModelClient(
    "https://<resource>.services.ai.azure.com/models", 
    new DefaultAzureCredential(),
    clientOptions,
);

Create a chat completion request

The following example shows how to create a basic chat request to the model.

var messages = [
    { role: "user", content: "How many languages are in the world?" },
];

var response = await client.path("/chat/completions").post({
    body: {
        model: "DeepSeek-R1",
        messages: messages,
    }
});

The response is as follows, where you can see the model's usage statistics:

if (isUnexpected(response)) {
    throw response.body.error;
}

console.log("Response: ", response.body.choices[0].message.content);
console.log("Model: ", response.body.model);
console.log("Usage:");
console.log("\tPrompt tokens:", response.body.usage.prompt_tokens);
console.log("\tTotal tokens:", response.body.usage.total_tokens);
console.log("\tCompletion tokens:", response.body.usage.completion_tokens);

Response: <think>Okay, the user is asking how many languages exist in the world. I need to provide a clear and accurate answer...</think>As of now, it's estimated that there are about 7,000 languages spoken around the world. However, this number can vary as some languages become extinct and new ones develop. It's also important to note that the number of speakers can greatly vary between languages, with some having millions of speakers and others only a few hundred.
Model: deepseek-r1
Usage: 
  Prompt tokens: 11
  Total tokens: 897
  Completion tokens: 886

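Reasoning content

Some reasoning models, like DeepSeek-R1, generate completions and include the reasoning behind them. The reasoning associated with the completion is included in the response's content within the tags <think> and </think>. The model may select the scenarios in which it generates reasoning content. You can extract the reasoning content from the response to understand the model's thought process as follows: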

var content = response.body.choices[0].message.content
var match = content.match(/<think>(.*?)<\/think>(.*)/s);

console.log("Response:");
if (match) {
    console.log("\tThinking:", match[1]);
    console.log("\tAnswer:", match[2]);
}
else {
    console.log("\tAnswer:", content);
}
console.log("Model: ", response.body.model);
console.log("Usage:");
console.log("\tPrompt tokens:", response.body.usage.prompt_tokens);
console.log("\tTotal tokens:", response.body.usage.total_tokens);
console.log("\tCompletion tokens:", response.body.usage.completion_tokens);

Thinking: Okay, the user is asking how many languages exist in the world. I need to provide a clear and accurate answer. Let's start by recalling the general consensus from linguistic sources. I remember that the number often cited is around 7,000, but maybe I should check some reputable organizations.\n\nEthnologue is a well-known resource for language data, and I think they list about 7,000 languages. But wait, do they update their numbers? It might be around 7,100 or so. Also, the exact count can vary because some sources might categorize dialects differently or have more recent data. \n\nAnother thing to consider is language endangerment. Many languages are endangered, with some having only a few speakers left. Organizations like UNESCO track endangered languages, so mentioning that adds context. Also, the distribution isn't even. Some countries have hundreds of languages, like Papua New Guinea with over 800, while others have just a few. \n\nA user might also wonder why the exact number is hard to pin down. It's because the distinction between a language and a dialect can be political or cultural. For example, Mandarin and Cantonese are considered dialects of Chinese by some, but they're mutually unintelligible, so others classify them as separate languages. Also, some regions are under-researched, making it hard to document all languages. \n\nI should also touch on language families. The 7,000 languages are grouped into families like Indo-European, Sino-Tibetan, Niger-Congo, etc. Maybe mention a few of the largest families. But wait, the question is just about the count, not the families. Still, it's good to provide a bit more context. \n\nI need to make sure the information is up-to-date. Let me think – recent estimates still hover around 7,000. However, languages are dying out rapidly, so the number decreases over time. Including that note about endangerment and language extinction rates could be helpful. 
For instance, it's often stated that a language dies every few weeks. \n\nAnother point is sign languages. Does the count include them? Ethnologue includes some, but not all sources might. If the user is including sign languages, that adds more to the count, but I think the 7,000 figure typically refers to spoken languages. For thoroughness, maybe mention that there are also over 300 sign languages. \n\nSummarizing, the answer should state around 7,000, mention Ethnologue's figure, explain why the exact number varies, touch on endangerment, and possibly note sign languages as a separate category. Also, a brief mention of Papua New Guinea as the most linguistically diverse country. \n\nWait, let me verify Ethnologue's current number. As of their latest edition (25th, 2022), they list 7,168 living languages. But I should check if that's the case. Some sources might round to 7,000. Also, SIL International publishes Ethnologue, so citing them as reference makes sense. \n\nOther sources, like Glottolog, might have a different count because they use different criteria. Glottolog might list around 7,000 as well, but exact numbers vary. It's important to highlight that the count isn't exact because of differing definitions and ongoing research. \n\nIn conclusion, the approximate number is 7,000, with Ethnologue being a key source, considerations of endangerment, and the challenges in counting due to dialect vs. language distinctions. I should make sure the answer is clear, acknowledges the variability, and provides key points succinctly.

Answer: The exact number of languages in the world is challenging to determine due to differences in definitions (e.g., distinguishing languages from dialects) and ongoing documentation efforts. However, widely cited estimates suggest there are approximately **7,000 languages** globally.
Model: DeepSeek-R1
Usage: 
  Prompt tokens: 11
  Total tokens: 897
  Completion tokens: 886

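Stream content

By default, the completions API returns the entire generated content in a single response. If you're generating long completions, waiting for the response can take many seconds.

You can stream the content to get it as it's being generated. Streaming content allows you to start processing the completion as content becomes available. This mode returns an object that streams back the response as data-only server-sent events. Extract chunks from the delta field, rather than the message field.

To stream completions, set stream: true in the request body and read the response with asNodeStream() when you call the model.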

var messages = [
    { role: "user", content: "How many languages are in the world?" },
];

var response = await client.path("/chat/completions").post({
    body: {
        model: "DeepSeek-R1",
        messages: messages,
        stream: true, // ask the service to send server-sent events
    }
}).asNodeStream();

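To visualize the output, define a helper function to print the stream. The following example implements a routine that streams only the answer, without the reasoning content:

// Declared async because it iterates the events with for await.
async function printStream(sses) {
    let isThinking = false;
    
    for await (const event of sses) {
        if (event.data === "[DONE]") {
            return;
        }
        for (const choice of (JSON.parse(event.data)).choices) {
            const content = choice.delta?.content ?? "";
            
            if (content === "<think>") {
                // The model started reasoning; stop printing content.
                isThinking = true;
                process.stdout.write("🧠 Thinking...");
            } else if (content === "</think>") {
                isThinking = false;
                console.log("🛑\n\n");
            } else if (content && !isThinking) {
                process.stdout.write(content);
            }
        }
    }
}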

You can visualize how streaming generates content:

import { createSseStream } from "@azure/core-sse";

var sses = createSseStream(response.body);
await printStream(sses);

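Apply content safety

The Azure AI model inference API supports Azure AI Content Safety. When you use deployments with Azure AI Content Safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions.

The following example shows how to handle events when the model detects harmful content in the input prompt and content safety is enabled.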

try {
    var messages = [
        { role: "system", content: "You are an AI assistant that helps people find information." },
        { role: "user", content: "Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills." },
    ];

    var response = await client.path("/chat/completions").post({
        body: {
            // The model name belongs in the request body.
            model: "DeepSeek-R1",
            messages: messages,
        }
    });

    console.log(response.body.choices[0].message.content);
}
catch (error) {
    if (error.status_code == 400) {
        var response = JSON.parse(error.response._content);
        if (response.error) {
            console.log(`Your request triggered an ${response.error.code} error:\n\t ${response.error.message}`);
        }
        else
        {
            throw error;
        }
    }
    else {
        // Don't swallow errors that aren't handled here.
        throw error;
    }
}

Add the Azure AI inference package to your project:

<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-ai-inference</artifactId>
    <version>1.0.0-beta.2</version>
</dependency>

If you're using Entra ID, you also need the following package:

<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-identity</artifactId>
    <version>1.13.3</version>
</dependency>

Import the following namespaces:

package com.azure.ai.inference.usage;

import com.azure.ai.inference.ChatCompletionsClient;
import com.azure.ai.inference.models.ChatCompletions;
import com.azure.ai.inference.models.ChatCompletionsOptions;
import com.azure.ai.inference.models.ChatRequestUserMessage;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.Configuration;

import java.util.Arrays;
import java.util.List;

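Use reasoning capabilities with chat

First, create the client to consume the model. The following code uses the endpoint URL of your resource and a key that is read from a system property.

ChatCompletionsClient client = new ChatCompletionsClient(
        new URI("https://<resource>.services.ai.azure.com/models"),
        new AzureKeyCredential(System.getProperty("AZURE_INFERENCE_CREDENTIAL"))
);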

If you have configured the resource with Microsoft Entra ID support, you can use the following code snippet to create a client.

client = new ChatCompletionsClient(
        new URI("https://<resource>.services.ai.azure.com/models"),
        new DefaultAzureCredentialBuilder().build()
);

Create a chat completion request

The following example shows how to create a basic chat request to the model.

ChatCompletionsOptions requestOptions = new ChatCompletionsOptions()
        .setModel("DeepSeek-R1")
        .setMessages(Arrays.asList(
                new ChatRequestUserMessage("How many languages are in the world?")
        ));

Response<ChatCompletions> response = client.complete(requestOptions);

The response is as follows, where you can see the model's usage statistics:

System.out.println("Response: " + response.getValue().getChoices().get(0).getMessage().getContent());
System.out.println("Model: " + response.getValue().getModel());
System.out.println("Usage:");
System.out.println("\tPrompt tokens: " + response.getValue().getUsage().getPromptTokens());
System.out.println("\tTotal tokens: " + response.getValue().getUsage().getTotalTokens());
System.out.println("\tCompletion tokens: " + response.getValue().getUsage().getCompletionTokens());

Response: <think>Okay, the user is asking how many languages exist in the world. I need to provide a clear and accurate...</think>The exact number of languages in the world is challenging to determine due to differences in definitions (e.g., distinguishing languages from dialects) and ongoing documentation efforts. However, widely cited estimates suggest there are approximately **7,000 languages** globally.
Model: deepseek-r1
Usage: 
  Prompt tokens: 11
  Total tokens: 897
  Completion tokens: 886

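Reasoning content

Some reasoning models, like DeepSeek-R1, generate completions and include the reasoning behind them. The reasoning associated with the completion is included in the response's content within the tags <think> and </think>. The model may select the scenarios in which it generates reasoning content. You can extract the reasoning content from the response to understand the model's thought process as follows: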

String content = response.getValue().getChoices().get(0).getMessage().getContent();
Pattern pattern = Pattern.compile("<think>(.*?)</think>(.*)", Pattern.DOTALL);
Matcher matcher = pattern.matcher(content);

System.out.println("Response:");
if (matcher.find()) {
    System.out.println("\tThinking: " + matcher.group(1));
    System.out.println("\tAnswer: " + matcher.group(2));
}
else {
    System.out.println("\tAnswer: " + content);
}
System.out.println("Model: " + response.getValue().getModel());
System.out.println("Usage:");
System.out.println("\tPrompt tokens: " + response.getValue().getUsage().getPromptTokens());
System.out.println("\tTotal tokens: " + response.getValue().getUsage().getTotalTokens());
System.out.println("\tCompletion tokens: " + response.getValue().getUsage().getCompletionTokens());

Thinking: Okay, the user is asking how many languages exist in the world. I need to provide a clear and accurate answer. Let's start by recalling the general consensus from linguistic sources. I remember that the number often cited is around 7,000, but maybe I should check some reputable organizations.\n\nEthnologue is a well-known resource for language data, and I think they list about 7,000 languages. But wait, do they update their numbers? It might be around 7,100 or so. Also, the exact count can vary because some sources might categorize dialects differently or have more recent data. \n\nAnother thing to consider is language endangerment. Many languages are endangered, with some having only a few speakers left. Organizations like UNESCO track endangered languages, so mentioning that adds context. Also, the distribution isn't even. Some countries have hundreds of languages, like Papua New Guinea with over 800, while others have just a few. \n\nA user might also wonder why the exact number is hard to pin down. It's because the distinction between a language and a dialect can be political or cultural. For example, Mandarin and Cantonese are considered dialects of Chinese by some, but they're mutually unintelligible, so others classify them as separate languages. Also, some regions are under-researched, making it hard to document all languages. \n\nI should also touch on language families. The 7,000 languages are grouped into families like Indo-European, Sino-Tibetan, Niger-Congo, etc. Maybe mention a few of the largest families. But wait, the question is just about the count, not the families. Still, it's good to provide a bit more context. \n\nI need to make sure the information is up-to-date. Let me think – recent estimates still hover around 7,000. However, languages are dying out rapidly, so the number decreases over time. Including that note about endangerment and language extinction rates could be helpful. 
For instance, it's often stated that a language dies every few weeks. \n\nAnother point is sign languages. Does the count include them? Ethnologue includes some, but not all sources might. If the user is including sign languages, that adds more to the count, but I think the 7,000 figure typically refers to spoken languages. For thoroughness, maybe mention that there are also over 300 sign languages. \n\nSummarizing, the answer should state around 7,000, mention Ethnologue's figure, explain why the exact number varies, touch on endangerment, and possibly note sign languages as a separate category. Also, a brief mention of Papua New Guinea as the most linguistically diverse country. \n\nWait, let me verify Ethnologue's current number. As of their latest edition (25th, 2022), they list 7,168 living languages. But I should check if that's the case. Some sources might round to 7,000. Also, SIL International publishes Ethnologue, so citing them as reference makes sense. \n\nOther sources, like Glottolog, might have a different count because they use different criteria. Glottolog might list around 7,000 as well, but exact numbers vary. It's important to highlight that the count isn't exact because of differing definitions and ongoing research. \n\nIn conclusion, the approximate number is 7,000, with Ethnologue being a key source, considerations of endangerment, and the challenges in counting due to dialect vs. language distinctions. I should make sure the answer is clear, acknowledges the variability, and provides key points succinctly.

Answer: The exact number of languages in the world is challenging to determine due to differences in definitions (e.g., distinguishing languages from dialects) and ongoing documentation efforts. However, widely cited estimates suggest there are approximately **7,000 languages** globally.
Model: DeepSeek-R1
Usage: 
  Prompt tokens: 11
  Total tokens: 897
  Completion tokens: 886

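Stream content

By default, the completions API returns the entire generated content in a single response. If you're generating long completions, waiting for the response can take many seconds.

You can stream the content to get it as it's being generated. Streaming content allows you to start processing the completion as content becomes available. This mode returns an object that streams back the response as data-only server-sent events. Extract chunks from the delta field, rather than the message field.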

ChatCompletionsOptions requestOptions = new ChatCompletionsOptions()
        .setModel("DeepSeek-R1")
        .setMessages(Arrays.asList(
                new ChatRequestUserMessage("How many languages are in the world? Write an essay about it.")
        ))
        .setMaxTokens(4096);

return client.completeStreamingAsync(requestOptions).thenAcceptAsync(response -> {
    try {
        printStream(response);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
});

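To visualize the output, define a helper function to print the stream. The following example implements a routine that streams only the answer, without the reasoning content: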

public void printStream(StreamingResponse<StreamingChatCompletionsUpdate> response) throws Exception {
    boolean isThinking = false;

    for (StreamingChatCompletionsUpdate chatUpdate : response) {
       if (chatUpdate.getContentUpdate() != null && !chatUpdate.getContentUpdate().isEmpty()) {
            String content = chatUpdate.getContentUpdate();

            if ("<think>".equals(content)) {
                isThinking = true;
                System.out.print("🧠 Thinking...");
                System.out.flush();
            } else if ("</think>".equals(content)) {
                isThinking = false;
                System.out.println("🛑\n\n");
            } else if (!isThinking) {
                System.out.print(content);
                System.out.flush();
            }
        }
    }
}

You can visualize how streaming generates content:

try {
    streamMessageAsync(client).get();
} catch (Exception e) {
    throw new RuntimeException(e);
}

Install the Azure AI inference package with the following command:

dotnet add package Azure.AI.Inference --prerelease

If you're using Entra ID, you also need the following package:
```
dotnet add package Azure.Identity
```

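Use reasoning capabilities with chat

First, create the client to consume the model. The following code uses the endpoint URL of your resource and a key that is read from an environment variable.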

ChatCompletionsClient client = new ChatCompletionsClient(
    new Uri("https://<resource>.services.ai.azure.com/models"),
    new AzureKeyCredential(Environment.GetEnvironmentVariable("AZURE_INFERENCE_CREDENTIAL"))
);

If you have configured the resource with Microsoft Entra ID support, you can use the following code snippet to create a client.

TokenCredential credential = new DefaultAzureCredential(includeInteractiveCredentials: true);
AzureAIInferenceClientOptions clientOptions = new AzureAIInferenceClientOptions();
BearerTokenAuthenticationPolicy tokenPolicy = new BearerTokenAuthenticationPolicy(credential, new string[] { "https://cognitiveservices.azure.com/.default" });

clientOptions.AddPolicy(tokenPolicy, HttpPipelinePosition.PerRetry);

client = new ChatCompletionsClient(
    new Uri("https://<resource>.services.ai.azure.com/models"),
    credential,
    clientOptions
);

Create a chat completion request

The following example shows how you can create a basic chat request to the model.

ChatCompletionsOptions requestOptions = new ChatCompletionsOptions()
{
    Messages = {
        new ChatRequestUserMessage("How many languages are in the world?")
    },
    Model = "deepseek-r1",
};

Response<ChatCompletions> response = client.Complete(requestOptions);

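When building prompts for reasoning models, take the following into consideration:

Use simple instructions and avoid using chain-of-thought techniques.
Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
When creating multi-turn applications, consider appending only the final answer from the model, without its reasoning content, as explained in the Reasoning content section.

The response is as follows, where you can see the model's usage statistics: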

Console.WriteLine($"Response: {response.Value.Content}");
Console.WriteLine($"Model: {response.Value.Model}");
Console.WriteLine("Usage:");
Console.WriteLine($"\tPrompt tokens: {response.Value.Usage.PromptTokens}");
Console.WriteLine($"\tTotal tokens: {response.Value.Usage.TotalTokens}");
Console.WriteLine($"\tCompletion tokens: {response.Value.Usage.CompletionTokens}");

Response: <think>Okay, the user is asking how many languages exist in the world. I need to provide a clear and accurate...</think>The exact number of languages in the world is challenging to determine due to differences in definitions (e.g., distinguishing languages from dialects) and ongoing documentation efforts. However, widely cited estimates suggest there are approximately **7,000 languages** globally.
Model: deepseek-r1
Usage: 
  Prompt tokens: 11
  Total tokens: 897
  Completion tokens: 886

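Reasoning content

Some reasoning models, like DeepSeek-R1, generate completions and include the reasoning behind them. The reasoning associated with the completion is included in the response's content within the tags <think> and </think>. The model may select the scenarios in which it generates reasoning content. You can extract the reasoning content from the response to understand the model's thought process as follows: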

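// Requires: using System.Text.RegularExpressions;
// Group 1 captures the reasoning inside <think>...</think>; group 2 captures the answer that follows.
string pattern = @"<think>(.*?)</think>(.*)";
Regex regex = new Regex(pattern, RegexOptions.Singleline);
Match match = regex.Match(response.Value.Content);

Console.WriteLine("Response:");
if (match.Success)
{
    Console.WriteLine($"\tThinking: {match.Groups[1].Value}");
    Console.WriteLine($"\tAnswer: {match.Groups[2].Value}");
}
else
{
    Console.WriteLine($"Response: {response.Value.Content}");
}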
Console.WriteLine($"Model: {response.Value.Model}");
Console.WriteLine("Usage:");
Console.WriteLine($"\tPrompt tokens: {response.Value.Usage.PromptTokens}");
Console.WriteLine($"\tTotal tokens: {response.Value.Usage.TotalTokens}");
Console.WriteLine($"\tCompletion tokens: {response.Value.Usage.CompletionTokens}");

Thinking: Okay, the user is asking how many languages exist in the world. I need to provide a clear and accurate answer. Let's start by recalling the general consensus from linguistic sources. I remember that the number often cited is around 7,000, but maybe I should check some reputable organizations.\n\nEthnologue is a well-known resource for language data, and I think they list about 7,000 languages. But wait, do they update their numbers? It might be around 7,100 or so. Also, the exact count can vary because some sources might categorize dialects differently or have more recent data. \n\nAnother thing to consider is language endangerment. Many languages are endangered, with some having only a few speakers left. Organizations like UNESCO track endangered languages, so mentioning that adds context. Also, the distribution isn't even. Some countries have hundreds of languages, like Papua New Guinea with over 800, while others have just a few. \n\nA user might also wonder why the exact number is hard to pin down. It's because the distinction between a language and a dialect can be political or cultural. For example, Mandarin and Cantonese are considered dialects of Chinese by some, but they're mutually unintelligible, so others classify them as separate languages. Also, some regions are under-researched, making it hard to document all languages. \n\nI should also touch on language families. The 7,000 languages are grouped into families like Indo-European, Sino-Tibetan, Niger-Congo, etc. Maybe mention a few of the largest families. But wait, the question is just about the count, not the families. Still, it's good to provide a bit more context. \n\nI need to make sure the information is up-to-date. Let me think – recent estimates still hover around 7,000. However, languages are dying out rapidly, so the number decreases over time. Including that note about endangerment and language extinction rates could be helpful. 
For instance, it's often stated that a language dies every few weeks. \n\nAnother point is sign languages. Does the count include them? Ethnologue includes some, but not all sources might. If the user is including sign languages, that adds more to the count, but I think the 7,000 figure typically refers to spoken languages. For thoroughness, maybe mention that there are also over 300 sign languages. \n\nSummarizing, the answer should state around 7,000, mention Ethnologue's figure, explain why the exact number varies, touch on endangerment, and possibly note sign languages as a separate category. Also, a brief mention of Papua New Guinea as the most linguistically diverse country. \n\nWait, let me verify Ethnologue's current number. As of their latest edition (25th, 2022), they list 7,168 living languages. But I should check if that's the case. Some sources might round to 7,000. Also, SIL International publishes Ethnologue, so citing them as reference makes sense. \n\nOther sources, like Glottolog, might have a different count because they use different criteria. Glottolog might list around 7,000 as well, but exact numbers vary. It's important to highlight that the count isn't exact because of differing definitions and ongoing research. \n\nIn conclusion, the approximate number is 7,000, with Ethnologue being a key source, considerations of endangerment, and the challenges in counting due to dialect vs. language distinctions. I should make sure the answer is clear, acknowledges the variability, and provides key points succinctly.

Answer: The exact number of languages in the world is challenging to determine due to differences in definitions (e.g., distinguishing languages from dialects) and ongoing documentation efforts. However, widely cited estimates suggest there are approximately **7,000 languages** globally.
Model: DeepSeek-R1
Usage: 
  Prompt tokens: 11
  Total tokens: 897
  Completion tokens: 886

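In multi-turn conversations, avoid sending the reasoning content in the chat history, because reasoning tends to produce long explanations.

Stream content

By default, the completions API returns the entire generated content in a single response. If you're generating long completions, waiting for the response can take many seconds.

You can stream the content to get it as it's being generated. Streaming content allows you to start processing the completion as content becomes available. This mode returns an object that streams back the response as data-only server-sent events. Extract chunks from the delta field, rather than the message field.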

static async Task StreamMessageAsync(ChatCompletionsClient client)
{
    ChatCompletionsOptions requestOptions = new ChatCompletionsOptions()
    {
        Messages = {
            new ChatRequestUserMessage("How many languages are in the world?")
        },
        MaxTokens=4096,
        Model = "deepseek-r1",
    };

    StreamingResponse<StreamingChatCompletionsUpdate> streamResponse = await client.CompleteStreamingAsync(requestOptions);

    await PrintStream(streamResponse);
}

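To visualize the output, define a helper function to print the stream. The following example implements a routine that streams only the answer without the reasoning content: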

static async Task PrintStream(StreamingResponse<StreamingChatCompletionsUpdate> response)
{
    bool isThinking = false;
    await foreach (StreamingChatCompletionsUpdate chatUpdate in response)
    {
        if (!string.IsNullOrEmpty(chatUpdate.ContentUpdate))
        {
            string content = chatUpdate.ContentUpdate;
            if (content == "<think>")
            {
                isThinking = true;
                Console.Write("🧠 Thinking...");
                Console.Out.Flush();
            }
            else if (content == "</think>")
            {
                isThinking = false;
                Console.WriteLine("🛑\n\n");
            }
            else if (!isThinking && !string.IsNullOrEmpty(content)) // skip chunks that arrive while the model is thinking
            {
                Console.Write(content);
                Console.Out.Flush();
            }
        }
    }
}

You can visualize how streaming generates content:

StreamMessageAsync(client).GetAwaiter().GetResult();

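Parameters

In general, reasoning models don't support the following parameters that you can find in chat completion models:

Temperature
Presence penalty
Repetition penalty
Parameter top_p

Some models support the use of tools or structured outputs (including JSON schemas). Read the Models details page to understand each model's support.

Apply content safety

The Azure AI model inference API supports Azure AI Content Safety. When you use deployments with Azure AI Content Safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions.

The following example shows how to handle events when the model detects harmful content in the input prompt and content safety is enabled.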

try
{
    requestOptions = new ChatCompletionsOptions()
    {
        Messages = {
            new ChatRequestSystemMessage("You are an AI assistant that helps people find information."),
            new ChatRequestUserMessage(
                "Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills."
            ),
        },
        Model = "deepseek-r1",
    };

    response = client.Complete(requestOptions);
    Console.WriteLine(response.Value.Content);
}
catch (RequestFailedException ex)
{
    if (ex.ErrorCode == "content_filter")
    {
        Console.WriteLine($"Your query has triggered Azure Content Safety: {ex.Message}");
    }
    else
    {
        throw;
    }
}

Tip

To learn more about how you can configure and control Azure AI Content Safety settings, check the Azure AI Content Safety documentation.

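Prerequisites

To complete this tutorial, you need:

An Azure subscription. If you're using GitHub Models, you can upgrade your experience and create an Azure subscription in the process. If that's your case, read Upgrade from GitHub Models to Azure AI model inference.
An Azure AI services resource. For more information, see Create an Azure AI Services resource.
The endpoint URL and key.

A model deployment with reasoning capabilities. If you don't have one, read Add and configure models to Azure AI services to add a reasoning model.
- This example uses DeepSeek-R1.

Use reasoning capabilities with chat

First, create the request to consume the model. The following request uses the endpoint URL of your resource and passes the key in the api-key header.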

POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
Content-Type: application/json
api-key: <key>

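Tip

Verify that you have deployed the model to an Azure AI Services resource with the Azure AI model inference API. DeepSeek-R1 is also available as serverless API endpoints. However, those endpoints don't take the parameter model as explained in this tutorial. You can verify that by going to the Azure AI Foundry portal > Models + endpoints, and checking that the model is listed under the section Azure AI Services.

If you have configured the resource with Microsoft Entra ID support, pass your token in the Authorization header: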

POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
Content-Type: application/json
Authorization: Bearer <token>
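
The two header variants above can be sketched in Python with only the standard library. This is a minimal illustration, not part of the service SDK: build_chat_request is a hypothetical helper name, and the route and api-version are copied from the request lines above.

```python
import json
import urllib.request

API_VERSION = "2024-05-01-preview"  # copied from the request line above


def build_chat_request(resource, body, key=None, token=None):
    """Return an unsent urllib Request for the chat completions route.

    Pass exactly one of `key` (sent as the api-key header) or `token`
    (a Microsoft Entra ID token sent as a bearer Authorization header).
    `build_chat_request` is a hypothetical helper used for illustration.
    """
    if (key is None) == (token is None):
        raise ValueError("pass exactly one of key or token")
    url = (
        f"https://{resource}.services.ai.azure.com/models/chat/completions"
        f"?api-version={API_VERSION}"
    )
    headers = {"Content-Type": "application/json"}
    if key is not None:
        headers["api-key"] = key
    else:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )
```

To send the request, pass the returned object to urllib.request.urlopen and read the JSON body from the response.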

Create a chat completion request

The following example shows how you can create a basic chat request to the model.

{
    "model": "deepseek-r1",
    "messages": [
        {
            "role": "user",
            "content": "How many languages are in the world?"
        }
    ]
}

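When building prompts for reasoning models, take the following into consideration:

Use simple instructions and avoid using chain-of-thought techniques.
Built-in reasoning capabilities make simple zero-shot prompts as effective as more complex methods.
When providing additional context or documents, like in RAG scenarios, including only the most relevant information may help prevent the model from over-complicating its response.
Reasoning models may support the use of system messages. However, they may not follow them as strictly as other non-reasoning models.
When creating multi-turn applications, consider appending only the final answer from the model, without its reasoning content, as explained in the Reasoning content section.

The response is as follows, where you can see the model's usage statistics: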

{
    "id": "0a1234b5de6789f01gh2i345j6789klm",
    "object": "chat.completion",
    "created": 1718726686,
    "model": "DeepSeek-R1",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "<think>\nOkay, the user is asking how many languages exist in the world. I need to provide a clear and accurate answer. Let's start by recalling the general consensus from linguistic sources. I remember that the number often cited is around 7,000, but maybe I should check some reputable organizations.\n\nEthnologue is a well-known resource for language data, and I think they list about 7,000 languages. But wait, do they update their numbers? It might be around 7,100 or so. Also, the exact count can vary because some sources might categorize dialects differently or have more recent data. \n\nAnother thing to consider is language endangerment. Many languages are endangered, with some having only a few speakers left. Organizations like UNESCO track endangered languages, so mentioning that adds context. Also, the distribution isn't even. Some countries have hundreds of languages, like Papua New Guinea with over 800, while others have just a few. \n\nA user might also wonder why the exact number is hard to pin down. It's because the distinction between a language and a dialect can be political or cultural. For example, Mandarin and Cantonese are considered dialects of Chinese by some, but they're mutually unintelligible, so others classify them as separate languages. Also, some regions are under-researched, making it hard to document all languages. \n\nI should also touch on language families. The 7,000 languages are grouped into families like Indo-European, Sino-Tibetan, Niger-Congo, etc. Maybe mention a few of the largest families. But wait, the question is just about the count, not the families. Still, it's good to provide a bit more context. \n\nI need to make sure the information is up-to-date. Let me think – recent estimates still hover around 7,000. However, languages are dying out rapidly, so the number decreases over time. Including that note about endangerment and language extinction rates could be helpful. 
For instance, it's often stated that a language dies every few weeks. \n\nAnother point is sign languages. Does the count include them? Ethnologue includes some, but not all sources might. If the user is including sign languages, that adds more to the count, but I think the 7,000 figure typically refers to spoken languages. For thoroughness, maybe mention that there are also over 300 sign languages. \n\nSummarizing, the answer should state around 7,000, mention Ethnologue's figure, explain why the exact number varies, touch on endangerment, and possibly note sign languages as a separate category. Also, a brief mention of Papua New Guinea as the most linguistically diverse country. \n\nWait, let me verify Ethnologue's current number. As of their latest edition (25th, 2022), they list 7,168 living languages. But I should check if that's the case. Some sources might round to 7,000. Also, SIL International publishes Ethnologue, so citing them as reference makes sense. \n\nOther sources, like Glottolog, might have a different count because they use different criteria. Glottolog might list around 7,000 as well, but exact numbers vary. It's important to highlight that the count isn't exact because of differing definitions and ongoing research. \n\nIn conclusion, the approximate number is 7,000, with Ethnologue being a key source, considerations of endangerment, and the challenges in counting due to dialect vs. language distinctions. I should make sure the answer is clear, acknowledges the variability, and provides key points succinctly.\n</think>\n\nThe exact number of languages in the world is challenging to determine due to differences in definitions (e.g., distinguishing languages from dialects) and ongoing documentation efforts. However, widely cited estimates suggest there are approximately **7,000 languages** globally.",
                "tool_calls": null
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 11,
        "total_tokens": 897,
        "completion_tokens": 886
    }
}

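Reasoning content

Some reasoning models, like DeepSeek-R1, generate completions and include the reasoning behind them. The reasoning associated with the completion is included in the response's content within the tags <think> and </think>. The model may select the scenarios in which it generates reasoning content.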
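
As a rough sketch of how a client could separate the two parts of the content field, the following Python function splits a response string on the <think> and </think> tags. The name split_reasoning is hypothetical, and the function assumes at most one reasoning block at the start of the content.

```python
import re

# Optional leading whitespace, one <think>...</think> block, then the answer.
THINK_PATTERN = re.compile(r"\s*<think>(.*?)</think>(.*)", re.DOTALL)


def split_reasoning(content):
    """Return (reasoning, answer); reasoning is None when no tags are present."""
    match = THINK_PATTERN.match(content)
    if match is None:
        return None, content.strip()
    return match.group(1).strip(), match.group(2).strip()
```

Apply it to choices[0].message.content from the response shown earlier to get the reasoning and the final answer as separate strings.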

In multi-turn conversations, avoid sending the reasoning content in the chat history, because reasoning tends to produce long explanations.
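
One way to follow this advice is to strip the reasoning block before the assistant reply is appended to the message list. The sketch below is illustrative only: append_assistant_turn is a hypothetical helper, and the sample reply is shortened for readability.

```python
import re

# Removes a <think>...</think> block wherever it appears in the content.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def append_assistant_turn(history, content):
    """Append the assistant reply to `history` without its reasoning content."""
    answer = THINK_BLOCK.sub("", content).strip()
    history.append({"role": "assistant", "content": answer})
    return history


history = [{"role": "user", "content": "How many languages are in the world?"}]
reply = "<think>\nRecall the usual estimate.\n</think>\n\nApproximately 7,000."
append_assistant_turn(history, reply)
# The next user turn is sent together with the trimmed history.
history.append({"role": "user", "content": "Which country has the most?"})
```

The resulting list can be used as the messages array of the next request, so only the final answers count toward the prompt tokens of later turns.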

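Stream content

By default, the completions API returns the entire generated content in a single response. If you're generating long completions, waiting for the response can take many seconds.

You can stream the content to get it as it's being generated. Streaming content allows you to start processing the completion as content becomes available. This mode returns an object that streams back the response as data-only server-sent events. Extract chunks from the delta field, rather than the message field.

To stream completions, set "stream": true when you call the model.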

{
    "model": "DeepSeek-R1",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "How many languages are in the world?"
        }
    ],
    "stream": true,
    "max_tokens": 2048
}

The response arrives as a sequence of chunks. A chunk in the stream looks like the following, with the incremental content in the delta field:

{
    "id": "23b54589eba14564ad8a2e6978775a39",
    "object": "chat.completion.chunk",
    "created": 1718726371,
    "model": "DeepSeek-R1",
    "choices": [
        {
            "index": 0,
            "delta": {
                "role": "assistant",
                "content": ""
            },
            "finish_reason": null,
            "logprobs": null
        }
    ]
}

The last message in the stream has finish_reason set, which indicates why the generation process stopped.

{
    "id": "23b54589eba14564ad8a2e6978775a39",
    "object": "chat.completion.chunk",
    "created": 1718726371,
    "model": "DeepSeek-R1",
    "choices": [
        {
            "index": 0,
            "delta": {
                "content": ""
            },
            "finish_reason": "stop",
            "logprobs": null
        }
    ],
    "usage": {
        "prompt_tokens": 11,
        "total_tokens": 897,
        "completion_tokens": 886
    }
}
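
A client could reassemble the answer from these chunks roughly as follows. This Python sketch makes several assumptions that the text above does not state: each event arrives as a line of the form `data: <json>`, the stream ends with `data: [DONE]`, and the <think> and </think> tags each arrive as their own chunk. The name collect_answer is hypothetical.

```python
import json


def collect_answer(sse_lines):
    """Accumulate the answer text from streamed chunks, skipping reasoning.

    Assumes `data: <json>` event lines and a final `data: [DONE]` line;
    both are assumptions about the wire format.
    """
    answer, finish_reason, thinking = [], None, False
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # ignore blank separators and comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content") or ""
            if text == "<think>":
                thinking = True
            elif text == "</think>":
                thinking = False
            elif not thinking:
                answer.append(text)
            # Keep the last non-null finish_reason seen in the stream.
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(answer), finish_reason
```

The function reads from the delta field of each choice, in line with the guidance above, and returns the finish_reason from the last chunk so the caller can tell why generation stopped.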

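Parameters

In general, reasoning models don't support the following parameters that you can find in chat completion models:

Temperature
Presence penalty
Repetition penalty
Parameter top_p

Some models support the use of tools or structured outputs (including JSON schemas). Read the Models details page to understand each model's support.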
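
If a request body is shared between reasoning and non-reasoning models, one option is to drop these parameters before calling a reasoning model. The following Python sketch assumes the JSON key names temperature, presence_penalty, frequency_penalty, and top_p. In particular, mapping the repetition penalty to frequency_penalty is an assumption, so adjust the set to match the model you deploy. The name drop_unsupported is hypothetical.

```python
# JSON keys assumed to correspond to the parameters listed above; mapping
# "repetition penalty" to frequency_penalty is an assumption.
UNSUPPORTED_KEYS = frozenset(
    {"temperature", "presence_penalty", "frequency_penalty", "top_p"}
)


def drop_unsupported(payload, unsupported=UNSUPPORTED_KEYS):
    """Return a copy of `payload` without keys a reasoning model may reject."""
    return {k: v for k, v in payload.items() if k not in unsupported}


payload = {
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "How many languages are in the world?"}],
    "temperature": 0.2,
    "top_p": 0.9,
}
clean = drop_unsupported(payload)
```

The original payload is left unchanged, so the same dictionary can still be sent to models that accept these parameters.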

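Apply content safety

The Azure AI model inference API supports Azure AI Content Safety. When you use deployments with Azure AI Content Safety turned on, inputs and outputs pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions.

The following example shows a request and the error response that is returned when the model detects harmful content in the input prompt and content safety is enabled.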

{
    "model": "DeepSeek-R1",
    "messages": [
        {
            "role": "user",
            "content": "Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills."
        }
    ]
}

{
    "error": {
        "message": "The response was filtered due to the prompt triggering Microsoft's content management policy. Please modify your prompt and retry.",
        "type": null,
        "param": "prompt",
        "code": "content_filter",
        "status": 400
    }
}
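
A client can tell this case apart from other failures by inspecting the error body. The Python sketch below assumes the HTTP status code matches the status field shown above (400) and uses a hypothetical helper name, is_content_filtered.

```python
import json


def is_content_filtered(status, body_text):
    """Return True when an error response reports a content filter hit."""
    if status != 400:
        return False
    try:
        body = json.loads(body_text)
    except ValueError:
        return False  # the body is not JSON, so it is some other failure
    error = body.get("error") if isinstance(body, dict) else None
    return isinstance(error, dict) and error.get("code") == "content_filter"
```

When the function returns True, the application can ask the user to rephrase the prompt instead of retrying the same request.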

Tip

To learn more about how you can configure and control Azure AI Content Safety settings, check the Azure AI Content Safety documentation.
