Use custom and local AI models with the Semantic Kernel SDK

This article shows how to integrate custom and local models into the Semantic Kernel SDK and use them for text generation and chat completion.

You can adapt these steps to any model you can access, regardless of where or how it's hosted. For example, you can integrate the codellama model with the Semantic Kernel SDK to enable code generation and discussion.

Custom and local models often provide access through a REST API; for example, see Ollama OpenAI compatibility. Before you integrate your model, it must be hosted and accessible to your .NET application over HTTPS.

Prerequisites

Implement text generation using a local model

The following section shows how you can integrate your model with the Semantic Kernel SDK and then use it to generate text completions.
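The service classes in the following steps reference MyModelRequest and MyModelResponse helper types that represent your model's request and response bodies. These types aren't part of the SDK; a minimal sketch might look like the following, with the JSON property names and shapes adjusted to whatever your model's REST API actually expects:

```csharp
using System.Text.Json.Serialization;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

// Hypothetical request contract; match the property names
// and shape to your model's actual REST API.
class MyModelRequest
{
    [JsonPropertyName("prompt")]
    public required string Prompt { get; set; }

    [JsonPropertyName("stream")]
    public bool Stream { get; set; }

    public static MyModelRequest FromPrompt(
        string prompt, PromptExecutionSettings? settings) =>
        new() { Prompt = prompt };

    public static MyModelRequest FromChatHistory(
        ChatHistory chatHistory, PromptExecutionSettings? settings) =>
        new()
        {
            // Flatten the chat history into a single prompt string;
            // replace this with your model's native message format if it has one
            Prompt = string.Join(
                Environment.NewLine,
                chatHistory.Select(message => $"{message.Role}: {message.Content}"))
        };
}

// Hypothetical response contract; the examples below read
// one or more completion strings from the response body.
class MyModelResponse
{
    [JsonPropertyName("completions")]
    public IReadOnlyList<string> Completions { get; set; } = [];
}
```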

  1. Create a service class that implements the ITextGenerationService interface. For example:

    class MyTextGenerationService : ITextGenerationService
    {
        private IReadOnlyDictionary<string, object?>? _attributes;
        public IReadOnlyDictionary<string, object?> Attributes =>
            _attributes ??= new Dictionary<string, object?>();
    
        public string ModelUrl { get; init; } = "<default url to your model's Chat API>";
        public required string ModelApiKey { get; init; }
    
        public async IAsyncEnumerable<StreamingTextContent> GetStreamingTextContentsAsync(
            string prompt,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            [EnumeratorCancellation] CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object, specify that streaming is requested
            MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);
            request.Stream = true;
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Read your model's response as a stream
            using StreamReader reader =
                new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));
    
            // Iteratively read a chunk of the response until the end of the stream
            // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
            // If the size of the internal buffer was unspecified when the stream was constructed, its default size is 4 kilobytes (2048 UTF-16 characters)
            char[] buffer = new char[2048];
            while (!reader.EndOfStream)
            {
                // Check the cancellation token with each iteration
                cancellationToken.ThrowIfCancellationRequested();
    
                // Fill the buffer with the next set of characters, track how many characters were read
                int readCount = reader.Read(buffer, 0, buffer.Length);
    
                // Convert the character buffer to a string, only include as many characters as were just read
                string chunk = new(buffer, 0, readCount);
    
                yield return new StreamingTextContent(chunk);
            }
        }
    
        public async Task<IReadOnlyList<TextContent>> GetTextContentsAsync(
            string prompt,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object
            MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Deserialize the response body to your model's response object
            // Handle when the deserialization fails and returns null
            MyModelResponse response =
                await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
                ?? throw new Exception("Failed to deserialize response from model");
    
            // Convert your model's response into a list of ChatMessageContent
            return response
                .Completions.Select<string, TextContent>(completion => new(completion))
                .ToImmutableList();
        }
    }
    
  2. Include your new service class when building the Kernel. For example:

    IKernelBuilder builder = Kernel.CreateBuilder();
    
    // Add your text generation service as a singleton instance
    builder.Services.AddKeyedSingleton<ITextGenerationService>(
        "myTextService1",
        new MyTextGenerationService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
    );
    
    // Alternatively, add your text generation service as a factory method
    builder.Services.AddKeyedSingleton<ITextGenerationService>(
        "myTextService2",
        (_, _) =>
            new MyTextGenerationService
            {
                // Specify any properties specific to your service, such as the url or API key
                ModelUrl = "https://localhost:38748",
                ModelApiKey = "myApiKey"
            }
    );
    
    // Add any other Kernel services or configurations
    // ...
    Kernel kernel = builder.Build();
    
  3. Send a text generation prompt to your model, either through the Kernel or directly by using the service class. For example:

    var executionSettings = new PromptExecutionSettings
    {
        // Add execution settings, such as the ModelID and ExtensionData
        ModelId = "MyModelId",
        ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
    };
    
    // Send a prompt to your model directly through the Kernel
    // The Kernel response will be null if the model can't be reached
    string prompt = "Please list three services offered by Azure";
    string? response = await kernel.InvokePromptAsync<string>(prompt);
    Console.WriteLine($"Output: {response}");
    
    // Alternatively, send a prompt to your model through the text generation service
    ITextGenerationService textService = kernel.GetRequiredService<ITextGenerationService>();
    TextContent responseContents = await textService.GetTextContentAsync(
        prompt,
        executionSettings
    );
    Console.WriteLine($"Output: {responseContents.Text}");
    

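The streaming variant, GetStreamingTextContentsAsync, can be consumed with await foreach. A minimal sketch, assuming the kernel, executionSettings, and "myTextService1" service key configured in the previous steps:

```csharp
// Stream a text completion chunk by chunk
// (assumes the kernel and executionSettings from the steps above)
ITextGenerationService streamingService =
    kernel.GetRequiredService<ITextGenerationService>("myTextService1");

await foreach (StreamingTextContent chunk in streamingService.GetStreamingTextContentsAsync(
    "Please list three services offered by Azure",
    executionSettings))
{
    // Write each chunk as it arrives from the model
    Console.Write(chunk.Text);
}
Console.WriteLine();
```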
Implement chat completion using a local model

The following section shows how you can integrate your model with the Semantic Kernel SDK and then use it for chat completion.

  1. Create a service class that implements the IChatCompletionService interface. For example:

    class MyChatCompletionService : IChatCompletionService
    {
        private IReadOnlyDictionary<string, object?>? _attributes;
        public IReadOnlyDictionary<string, object?> Attributes =>
            _attributes ??= new Dictionary<string, object?>();
    
        public string ModelUrl { get; init; } = "<default url to your model's Chat API>";
        public required string ModelApiKey { get; init; }
    
        public async Task<IReadOnlyList<ChatMessageContent>> GetChatMessageContentsAsync(
            ChatHistory chatHistory,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object
            MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Deserialize the response body to your model's response object
            // Handle when the deserialization fails and returns null
            MyModelResponse response =
                await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
                ?? throw new Exception("Failed to deserialize response from model");
    
            // Convert your model's response into a list of ChatMessageContent
            return response
                .Completions.Select<string, ChatMessageContent>(completion =>
                    new(AuthorRole.Assistant, completion)
                )
                .ToImmutableList();
        }
    
        public async IAsyncEnumerable<StreamingChatMessageContent> GetStreamingChatMessageContentsAsync(
            ChatHistory chatHistory,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            [EnumeratorCancellation] CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object, specify that streaming is requested
            MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);
            request.Stream = true;
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Read your model's response as a stream
            using StreamReader reader =
                new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));
    
            // Iteratively read a chunk of the response until the end of the stream
            // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
            // If the size of the internal buffer was unspecified when the stream was constructed, its default size is 4 kilobytes (2048 UTF-16 characters)
            char[] buffer = new char[2048];
            while (!reader.EndOfStream)
            {
                // Check the cancellation token with each iteration
                cancellationToken.ThrowIfCancellationRequested();
    
                // Fill the buffer with the next set of characters, track how many characters were read
                int readCount = reader.Read(buffer, 0, buffer.Length);
    
                // Convert the character buffer to a string, only include as many characters as were just read
                string chunk = new(buffer, 0, readCount);
    
                yield return new StreamingChatMessageContent(AuthorRole.Assistant, chunk);
            }
        }
    }
    
  2. Include your new service class when building the Kernel. For example:

    IKernelBuilder builder = Kernel.CreateBuilder();
    
    // Add your chat completion service as a singleton instance
    builder.Services.AddKeyedSingleton<IChatCompletionService>(
        "myChatService1",
        new MyChatCompletionService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
    );
    
    // Alternatively, add your chat completion service as a factory method
    builder.Services.AddKeyedSingleton<IChatCompletionService>(
        "myChatService2",
        (_, _) =>
            new MyChatCompletionService
            {
                // Specify any properties specific to your service, such as the url or API key
                ModelUrl = "https://localhost:38748",
                ModelApiKey = "myApiKey"
            }
    );
    
    // Add any other Kernel services or configurations
    // ...
    Kernel kernel = builder.Build();
    
  3. Send a chat completion prompt to your model, either through the Kernel or directly by using the service class. For example:

    var executionSettings = new PromptExecutionSettings
    {
        // Add execution settings, such as the ModelID and ExtensionData
        ModelId = "MyModelId",
        ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
    };
    
    // Send a string representation of the chat history to your model directly through the Kernel
    // This uses a special syntax to denote the role for each message
    // For more information on this syntax see:
    // https://learn.microsoft.com/en-us/semantic-kernel/prompts/your-first-prompt?tabs=Csharp
    string prompt = """
        <message role="system">the initial system message for your chat history</message>
        <message role="user">the user's initial message</message>
        """;
    
    string? response = await kernel.InvokePromptAsync<string>(prompt);
    Console.WriteLine($"Output: {response}");
    
    // Alternatively, send a prompt to your model through the chat completion service
    // First, initialize a chat history with your initial system message
    string systemMessage = "<the initial system message for your chat history>";
    Console.WriteLine($"System Prompt: {systemMessage}");
    var chatHistory = new ChatHistory(systemMessage);
    
    // Add the user's input to your chat history
    string userRequest = "<the user's initial message>";
    Console.WriteLine($"User: {userRequest}");
    chatHistory.AddUserMessage(userRequest);
    
    // Get the model's response and add it to the chat history
    IChatCompletionService service = kernel.GetRequiredService<IChatCompletionService>();
    ChatMessageContent responseMessage = await service.GetChatMessageContentAsync(
        chatHistory,
        executionSettings
    );
    Console.WriteLine($"Assistant: {responseMessage.Content}");
    chatHistory.Add(responseMessage);
    
    // Continue sending and receiving messages between the user and model
    // ...
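The final comment above hints at a conversation loop. One way to sketch it, assuming the service, chatHistory, and executionSettings from the previous step, is to stream each assistant reply while accumulating it so it can be appended to the history:

```csharp
// Sketch of a continuing conversation loop (assumes the service,
// chatHistory, and executionSettings from the example above)
while (true)
{
    // Read the next user message; an empty line ends the conversation
    Console.Write("User: ");
    string? userInput = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(userInput))
    {
        break;
    }
    chatHistory.AddUserMessage(userInput);

    // Stream the assistant's reply, accumulating it for the chat history
    string assistantReply = "";
    Console.Write("Assistant: ");
    await foreach (StreamingChatMessageContent chunk in service.GetStreamingChatMessageContentsAsync(
        chatHistory,
        executionSettings))
    {
        Console.Write(chunk.Content);
        assistantReply += chunk.Content;
    }
    Console.WriteLine();
    chatHistory.AddAssistantMessage(assistantReply);
}
```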