

Use custom and local AI models with the Semantic Kernel SDK

This article shows how to integrate custom and local models into the Semantic Kernel SDK and use them for text generation and chat completions.

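You can adapt the steps to work with any model you can access, regardless of where or how you access it. For example, you could integrate the codellama model with the Semantic Kernel SDK to enable code generation and discussion.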

Custom and local models often provide access through REST APIs. For an example, see Ollama's OpenAI compatibility page. Before you integrate your model, it must be hosted and accessible to your .NET application over HTTPS.
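
To make the REST access pattern concrete, here is a minimal sketch of the JSON body a .NET application might POST to an OpenAI-compatible chat endpoint. The `ChatRequest` and `ChatMessage` records, the `/v1/chat/completions` route, and the port are assumptions for illustration only; check your model host's documentation for the actual route and schema.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Build and print the request body; no network call is made in this sketch
var request = new ChatRequest(
    "codellama",
    new[] { new ChatMessage("user", "Write a method that reverses a string.") },
    Stream: false);
string json = ChatRequestJson.Serialize(request);
Console.WriteLine(json);

// To send it, POST the JSON to your host's endpoint, for example (assumed URL):
//   using var client = new HttpClient();
//   using var content = new StringContent(json, Encoding.UTF8, "application/json");
//   using var response = await client.PostAsync(
//       "https://localhost:38748/v1/chat/completions", content);

// Hypothetical request types that mirror an OpenAI-style chat schema
public record ChatMessage(
    [property: JsonPropertyName("role")] string Role,
    [property: JsonPropertyName("content")] string Content);

public record ChatRequest(
    [property: JsonPropertyName("model")] string Model,
    [property: JsonPropertyName("messages")] ChatMessage[] Messages,
    [property: JsonPropertyName("stream")] bool Stream);

public static class ChatRequestJson
{
    // Serialize with System.Text.Json, honoring the JsonPropertyName attributes
    public static string Serialize(ChatRequest request) => JsonSerializer.Serialize(request);
}
```

If your model exposes a different schema, only the record definitions need to change; the service classes later in this article wrap the same kind of request.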

Prerequisites

Implement text generation using a local model

The following section shows how to integrate your model with the Semantic Kernel SDK and then use it to generate text completions.

  1. Create a service class that implements the ITextGenerationService interface. For example:

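    // Assumes implicit usings are enabled (System, System.IO, System.Linq, System.Net.Http, and so on)
    using System.Collections.Immutable;
    using System.Net.Http.Json;
    using System.Runtime.CompilerServices;
    using Microsoft.SemanticKernel;
    using Microsoft.SemanticKernel.TextGeneration;
    
    class MyTextGenerationService : ITextGenerationService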
    {
        private IReadOnlyDictionary<string, object?>? _attributes;
        public IReadOnlyDictionary<string, object?> Attributes =>
            _attributes ??= new Dictionary<string, object?>();
    
        public string ModelUrl { get; init; } = "<default url to your model's text generation API>";
        public required string ModelApiKey { get; init; }
    
        public async IAsyncEnumerable<StreamingTextContent> GetStreamingTextContentsAsync(
            string prompt,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            [EnumeratorCancellation] CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object, specify that streaming is requested
            MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);
            request.Stream = true;
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
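            // Send a POST to your model with the serialized request in the body
            // ResponseHeadersRead returns once the headers arrive, so the body can be read as a stream
            using var httpRequest = new HttpRequestMessage(HttpMethod.Post, ModelUrl)
            {
                Content = JsonContent.Create(request)
            };
            using HttpResponseMessage httpResponse = await httpClient.SendAsync(
                httpRequest,
                HttpCompletionOption.ResponseHeadersRead,
                cancellationToken
            );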
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Read your model's response as a stream
            using StreamReader reader =
                new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));
    
            // Iteratively read a chunk of the response until the end of the stream
            // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
            // If the size of the internal buffer was unspecified when the stream was constructed, its default size is 4 kilobytes (2048 UTF-16 characters)
            char[] buffer = new char[2048];
            while (!reader.EndOfStream)
            {
                // Check the cancellation token with each iteration
                cancellationToken.ThrowIfCancellationRequested();
    
                // Fill the buffer with the next set of characters, track how many characters were read
                // Read asynchronously so the calling thread isn't blocked while waiting on the network
                int readCount = await reader.ReadAsync(buffer, 0, buffer.Length);
    
                // Convert the character buffer to a string, only include as many characters as were just read
                string chunk = new(buffer, 0, readCount);
    
                yield return new StreamingTextContent(chunk);
            }
        }
    
        public async Task<IReadOnlyList<TextContent>> GetTextContentsAsync(
            string prompt,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object
            MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Deserialize the response body to your model's response object
            // Handle when the deserialization fails and returns null
            MyModelResponse response =
                await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
                ?? throw new Exception("Failed to deserialize response from model");
    
            // Convert your model's response into a list of TextContent
            return response
                .Completions.Select<string, TextContent>(completion => new(completion))
                .ToImmutableList();
        }
    }
    
  2. Include the new service class when you build the Kernel. For example:

    IKernelBuilder builder = Kernel.CreateBuilder();
    
    // Add your text generation service as a singleton instance
    builder.Services.AddKeyedSingleton<ITextGenerationService>(
        "myTextService1",
        new MyTextGenerationService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
    );
    
    // Alternatively, add your text generation service as a factory method
    builder.Services.AddKeyedSingleton<ITextGenerationService>(
        "myTextService2",
        (_, _) =>
            new MyTextGenerationService
            {
                // Specify any properties specific to your service, such as the url or API key
                ModelUrl = "https://localhost:38748",
                ModelApiKey = "myApiKey"
            }
    );
    
    // Add any other Kernel services or configurations
    // ...
    Kernel kernel = builder.Build();
    
  3. Send a text generation prompt to your model directly through the Kernel or by using the service class. For example:

    var executionSettings = new PromptExecutionSettings
    {
        // Add execution settings, such as the ModelID and ExtensionData
        ModelId = "MyModelId",
        ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
    };
    
    // Send a prompt to your model directly through the Kernel
    // The Kernel response will be null if the model can't be reached
    string prompt = "Please list three services offered by Azure";
    string? response = await kernel.InvokePromptAsync<string>(prompt);
    Console.WriteLine($"Output: {response}");
    
    // Alternatively, send a prompt to your model through the text generation service
    ITextGenerationService textService = kernel.GetRequiredService<ITextGenerationService>();
    TextContent responseContents = await textService.GetTextContentAsync(
        prompt,
        executionSettings
    );
    Console.WriteLine($"Output: {responseContents.Text}");
    
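
The service class in step 1 refers to `MyModelRequest` and `MyModelResponse`, which this article doesn't define because their shape depends on your model's API. As a rough sketch, assuming a model that accepts either a prompt or a list of role-tagged messages and returns a list of completion strings, they might look like the following. Every property name here, along with the `MyModelMessage` record, is hypothetical and should be renamed to match your model's actual schema.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

// Hypothetical request shape; adjust the properties to match your model's API
public class MyModelRequest
{
    public string? Model { get; set; }
    public string? Prompt { get; set; }
    public List<MyModelMessage> Messages { get; set; } = new();
    public bool Stream { get; set; }
    public int? MaxTokens { get; set; }

    public static MyModelRequest FromPrompt(
        string prompt, PromptExecutionSettings? settings = null)
    {
        var request = new MyModelRequest { Prompt = prompt };
        request.ApplySettings(settings);
        return request;
    }

    public static MyModelRequest FromChatHistory(
        ChatHistory chatHistory, PromptExecutionSettings? settings = null)
    {
        var request = new MyModelRequest
        {
            // Map each chat message to its role label (such as "user") and its text
            Messages = chatHistory
                .Select(m => new MyModelMessage(m.Role.Label, m.Content ?? string.Empty))
                .ToList()
        };
        request.ApplySettings(settings);
        return request;
    }

    private void ApplySettings(PromptExecutionSettings? settings)
    {
        Model = settings?.ModelId;

        // "MaxTokens" matches the ExtensionData key used in this article's examples
        if (settings?.ExtensionData is { } data
            && data.TryGetValue("MaxTokens", out object? value)
            && value is int maxTokens)
        {
            MaxTokens = maxTokens;
        }
    }
}

// Hypothetical message shape for chat-style requests
public record MyModelMessage(string Role, string Content);

// Hypothetical response shape; the service classes read the Completions list
public class MyModelResponse
{
    public List<string> Completions { get; set; } = new();
}
```

Keeping the mapping from `PromptExecutionSettings` in one private method means the text generation and chat services interpret settings the same way.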

Implement chat completion using a local model

The following section shows how to integrate your model with the Semantic Kernel SDK and then use it for chat completions.

  1. Create a service class that implements the IChatCompletionService interface. For example:

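    // Assumes implicit usings are enabled (System, System.IO, System.Linq, System.Net.Http, and so on)
    using System.Collections.Immutable;
    using System.Net.Http.Json;
    using System.Runtime.CompilerServices;
    using Microsoft.SemanticKernel;
    using Microsoft.SemanticKernel.ChatCompletion;
    
    class MyChatCompletionService : IChatCompletionService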
    {
        private IReadOnlyDictionary<string, object?>? _attributes;
        public IReadOnlyDictionary<string, object?> Attributes =>
            _attributes ??= new Dictionary<string, object?>();
    
        public string ModelUrl { get; init; } = "<default url to your model's Chat API>";
        public required string ModelApiKey { get; init; }
    
        public async Task<IReadOnlyList<ChatMessageContent>> GetChatMessageContentsAsync(
            ChatHistory chatHistory,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object
            MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Deserialize the response body to your model's response object
            // Handle when the deserialization fails and returns null
            MyModelResponse response =
                await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
                ?? throw new Exception("Failed to deserialize response from model");
    
            // Convert your model's response into a list of ChatMessageContent
            return response
                .Completions.Select<string, ChatMessageContent>(completion =>
                    new(AuthorRole.Assistant, completion)
                )
                .ToImmutableList();
        }
    
        public async IAsyncEnumerable<StreamingChatMessageContent> GetStreamingChatMessageContentsAsync(
            ChatHistory chatHistory,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            [EnumeratorCancellation] CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object, specify that streaming is requested
            MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);
            request.Stream = true;
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
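            // Send a POST to your model with the serialized request in the body
            // ResponseHeadersRead returns once the headers arrive, so the body can be read as a stream
            using var httpRequest = new HttpRequestMessage(HttpMethod.Post, ModelUrl)
            {
                Content = JsonContent.Create(request)
            };
            using HttpResponseMessage httpResponse = await httpClient.SendAsync(
                httpRequest,
                HttpCompletionOption.ResponseHeadersRead,
                cancellationToken
            );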
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Read your model's response as a stream
            using StreamReader reader =
                new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));
    
            // Iteratively read a chunk of the response until the end of the stream
            // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
            // If the size of the internal buffer was unspecified when the stream was constructed, its default size is 4 kilobytes (2048 UTF-16 characters)
            char[] buffer = new char[2048];
            while (!reader.EndOfStream)
            {
                // Check the cancellation token with each iteration
                cancellationToken.ThrowIfCancellationRequested();
    
                // Fill the buffer with the next set of characters, track how many characters were read
                // Read asynchronously so the calling thread isn't blocked while waiting on the network
                int readCount = await reader.ReadAsync(buffer, 0, buffer.Length);
    
                // Convert the character buffer to a string, only include as many characters as were just read
                string chunk = new(buffer, 0, readCount);
    
                yield return new StreamingChatMessageContent(AuthorRole.Assistant, chunk);
            }
        }
    }
    
  2. Include the new service class when you build the Kernel. For example:

    IKernelBuilder builder = Kernel.CreateBuilder();
    
    // Add your chat completion service as a singleton instance
    builder.Services.AddKeyedSingleton<IChatCompletionService>(
        "myChatService1",
        new MyChatCompletionService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
    );
    
    // Alternatively, add your chat completion service as a factory method
    builder.Services.AddKeyedSingleton<IChatCompletionService>(
        "myChatService2",
        (_, _) =>
            new MyChatCompletionService
            {
                // Specify any properties specific to your service, such as the url or API key
                ModelUrl = "https://localhost:38748",
                ModelApiKey = "myApiKey"
            }
    );
    
    // Add any other Kernel services or configurations
    // ...
    Kernel kernel = builder.Build();
    
  3. Send a chat completion prompt to your model directly through the Kernel or by using the service class. For example:

    var executionSettings = new PromptExecutionSettings
    {
        // Add execution settings, such as the ModelID and ExtensionData
        ModelId = "MyModelId",
        ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
    };
    
    // Send a string representation of the chat history to your model directly through the Kernel
    // This uses a special syntax to denote the role for each message
    // For more information on this syntax see:
    // https://learn.microsoft.com/en-us/semantic-kernel/prompts/your-first-prompt?tabs=Csharp
    string prompt = """
        <message role="system">the initial system message for your chat history</message>
        <message role="user">the user's initial message</message>
        """;
    
    string? response = await kernel.InvokePromptAsync<string>(prompt);
    Console.WriteLine($"Output: {response}");
    
    // Alternatively, send a prompt to your model through the chat completion service
    // First, initialize a chat history with your initial system message
    string systemMessage = "<the initial system message for your chat history>";
    Console.WriteLine($"System Prompt: {systemMessage}");
    var chatHistory = new ChatHistory(systemMessage);
    
    // Add the user's input to your chat history
    string userRequest = "<the user's initial message>";
    Console.WriteLine($"User: {userRequest}");
    chatHistory.AddUserMessage(userRequest);
    
    // Get the model's response and add it to the chat history
    IChatCompletionService service = kernel.GetRequiredService<IChatCompletionService>();
    ChatMessageContent responseMessage = await service.GetChatMessageContentAsync(
        chatHistory,
        executionSettings
    );
    Console.WriteLine($"Assistant: {responseMessage.Content}");
    chatHistory.Add(responseMessage);
    
    // Continue sending and receiving messages between the user and model
    // ...
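
The service class also implements GetStreamingChatMessageContentsAsync, which step 3 doesn't exercise. A minimal sketch of consuming a streamed reply with `await foreach` follows. The `ChatStreamingExample` class and `CollectReplyAsync` helper are hypothetical names introduced for illustration, and the usage note assumes the `service`, `chatHistory`, and `executionSettings` variables from step 3.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

public static class ChatStreamingExample
{
    // Hypothetical helper: prints each chunk as it arrives, then stores the full reply
    public static async Task<string> CollectReplyAsync(
        IAsyncEnumerable<StreamingChatMessageContent> stream,
        ChatHistory chatHistory)
    {
        var reply = new StringBuilder();

        await foreach (StreamingChatMessageContent chunk in stream)
        {
            // Content can be null for chunks that carry no text
            Console.Write(chunk.Content);
            reply.Append(chunk.Content);
        }
        Console.WriteLine();

        // Record the assembled reply so later turns include it
        string fullReply = reply.ToString();
        chatHistory.AddAssistantMessage(fullReply);
        return fullReply;
    }
}
```

With the variables from step 3, a call might look like `await ChatStreamingExample.CollectReplyAsync(service.GetStreamingChatMessageContentsAsync(chatHistory, executionSettings), chatHistory);`. Passing the stream in, rather than the service, keeps the helper easy to exercise with a fake stream.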