Use custom and local AI models with the Semantic Kernel SDK

This article shows how to integrate custom and local models into the Semantic Kernel SDK and use them for text generation and chat completions.

You can adapt the steps to any model you can access, regardless of where or how you access it. For example, you can integrate the codellama model with the Semantic Kernel SDK to enable code generation and discussion.

Custom and local models often expose a REST API; for an example, see Ollama's OpenAI compatibility. Before you integrate your model, it must be hosted and reachable from your .NET application over HTTPS.
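
To make the REST access concrete, the sketch below builds the kind of JSON body that an OpenAI-compatible chat endpoint typically accepts. The `/v1/chat/completions` path, the property names, and the prompt text are assumptions modeled on the OpenAI chat format; check your model's API documentation for the exact shape. Nothing is sent over the network here; the body is only serialized and printed.

```csharp
using System;
using System.Text.Json;

// Assumed endpoint of a locally hosted, OpenAI-compatible server (hypothetical path; adjust to your deployment)
const string endpoint = "https://localhost:38748/v1/chat/completions";

// Assumed OpenAI-style payload; the property names your model expects may differ
var payload = new
{
    model = "codellama",
    stream = false,
    messages = new[]
    {
        new { role = "system", content = "You are a helpful coding assistant." },
        new { role = "user", content = "Write a method that reverses a string." }
    }
};

// Serialize the request body as it would appear in the POST
string json = JsonSerializer.Serialize(payload);
Console.WriteLine($"POST {endpoint}");
Console.WriteLine(json);
```

In the service classes later in this article, `PostAsJsonAsync` performs this serialization for you from your own request type.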

Prerequisites

Implement text generation using a local model

The following section shows how to integrate your model with the Semantic Kernel SDK and then use it to generate text completions.

  1. Create a service class that implements the ITextGenerationService interface. For example:

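    using System.Collections.Immutable;
    using System.Net.Http.Json;
    using System.Runtime.CompilerServices;
    using Microsoft.SemanticKernel;
    using Microsoft.SemanticKernel.TextGeneration;
    
    class MyTextGenerationService : ITextGenerationService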
    {
        private IReadOnlyDictionary<string, object?>? _attributes;
        public IReadOnlyDictionary<string, object?> Attributes =>
            _attributes ??= new Dictionary<string, object?>();
    
        public string ModelUrl { get; init; } = "<default url to your model's Chat API>";
        public required string ModelApiKey { get; init; }
    
        public async IAsyncEnumerable<StreamingTextContent> GetStreamingTextContentsAsync(
            string prompt,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            [EnumeratorCancellation] CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object, specify that streaming is requested
            MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);
            request.Stream = true;
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Read your model's response as a stream
            using StreamReader reader =
                new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));
    
            // Iteratively read a chunk of the response until the end of the stream
            // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
            // If the size of the internal buffer was unspecified when the stream was constructed, its default size is 4 kilobytes (2048 UTF-16 characters)
            char[] buffer = new char[2048];
            int readCount;
    
            // Read asynchronously so the calling thread is not blocked
            // ReadAsync returns 0 at the end of the stream and throws if the cancellation token is signaled
            while ((readCount = await reader.ReadAsync(buffer.AsMemory(), cancellationToken)) > 0)
            {
                // Convert the character buffer to a string, only include as many characters as were just read
                string chunk = new(buffer, 0, readCount);
    
                yield return new StreamingTextContent(chunk);
            }
        }
    
        public async Task<IReadOnlyList<TextContent>> GetTextContentsAsync(
            string prompt,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object
            MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Deserialize the response body to your model's response object
            // Handle when the deserialization fails and returns null
            MyModelResponse response =
                await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
                ?? throw new Exception("Failed to deserialize response from model");
    
            // Convert your model's response into a list of TextContent
            return response
                .Completions.Select<string, TextContent>(completion => new(completion))
                .ToImmutableList();
        }
    }
    
  2. Include the new service class when you build the Kernel. For example:

    IKernelBuilder builder = Kernel.CreateBuilder();
    
    // Add your text generation service as a singleton instance
    builder.Services.AddKeyedSingleton<ITextGenerationService>(
        "myTextService1",
        new MyTextGenerationService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
    );
    
    // Alternatively, add your text generation service as a factory method
    builder.Services.AddKeyedSingleton<ITextGenerationService>(
        "myTextService2",
        (_, _) =>
            new MyTextGenerationService
            {
                // Specify any properties specific to your service, such as the url or API key
                ModelUrl = "https://localhost:38748",
                ModelApiKey = "myApiKey"
            }
    );
    
    // Add any other Kernel services or configurations
    // ...
    Kernel kernel = builder.Build();
    
  3. Send a text generation prompt to your model directly through the Kernel or by using the service class. For example:

    var executionSettings = new PromptExecutionSettings
    {
        // Add execution settings, such as the ModelID and ExtensionData
        ModelId = "MyModelId",
        ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
    };
    
    // Send a prompt to your model directly through the Kernel
    // The Kernel response will be null if the model can't be reached
    string prompt = "Please list three services offered by Azure";
    string? response = await kernel.InvokePromptAsync<string>(prompt);
    Console.WriteLine($"Output: {response}");
    
    // Alternatively, send a prompt to your model through the text generation service
    ITextGenerationService textService = kernel.GetRequiredService<ITextGenerationService>();
    TextContent responseContents = await textService.GetTextContentAsync(
        prompt,
        executionSettings
    );
    Console.WriteLine($"Output: {responseContents.Text}");
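
The service class in step 1 relies on `MyModelRequest` and `MyModelResponse`, which you define to match your model's API. The following is a minimal sketch of what those types might look like, not a definitive implementation. The JSON property names (`prompt`, `model`, `stream`, `max_tokens`, `completions`) are hypothetical, and the `MaxTokens` lookup assumes the value was stored in `ExtensionData` as an `int`, as in step 3.

```csharp
using System.Collections.Generic;
using System.Text.Json.Serialization;
using Microsoft.SemanticKernel;

// Hypothetical request shape; rename the JSON properties to match your model's API
class MyModelRequest
{
    [JsonPropertyName("prompt")]
    public required string Prompt { get; init; }

    [JsonPropertyName("model")]
    public string? Model { get; init; }

    [JsonPropertyName("stream")]
    public bool Stream { get; set; }

    [JsonPropertyName("max_tokens")]
    public int? MaxTokens { get; init; }

    // Map the prompt and the optional execution settings onto the request
    public static MyModelRequest FromPrompt(string prompt, PromptExecutionSettings? settings)
    {
        int? maxTokens = null;

        // Assumes "MaxTokens" was stored as an int; other value types are ignored
        if (settings?.ExtensionData is { } data
            && data.TryGetValue("MaxTokens", out var value)
            && value is int tokens)
        {
            maxTokens = tokens;
        }

        return new MyModelRequest
        {
            Prompt = prompt,
            Model = settings?.ModelId,
            MaxTokens = maxTokens
        };
    }
}

// Hypothetical response shape; the service class reads one string per completion
class MyModelResponse
{
    [JsonPropertyName("completions")]
    public required IReadOnlyList<string> Completions { get; init; }
}
```

`Stream` has a setter because the streaming method assigns it after the request is built.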
    

Implement chat completion using a local model

The following section shows how to integrate your model with the Semantic Kernel SDK and then use it for chat completions.

  1. Create a service class that implements the IChatCompletionService interface. For example:

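    using System.Collections.Immutable;
    using System.Net.Http.Json;
    using System.Runtime.CompilerServices;
    using Microsoft.SemanticKernel;
    using Microsoft.SemanticKernel.ChatCompletion;
    
    class MyChatCompletionService : IChatCompletionService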
    {
        private IReadOnlyDictionary<string, object?>? _attributes;
        public IReadOnlyDictionary<string, object?> Attributes =>
            _attributes ??= new Dictionary<string, object?>();
    
        public string ModelUrl { get; init; } = "<default url to your model's Chat API>";
        public required string ModelApiKey { get; init; }
    
        public async Task<IReadOnlyList<ChatMessageContent>> GetChatMessageContentsAsync(
            ChatHistory chatHistory,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object
            MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Deserialize the response body to your model's response object
            // Handle when the deserialization fails and returns null
            MyModelResponse response =
                await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
                ?? throw new Exception("Failed to deserialize response from model");
    
            // Convert your model's response into a list of ChatMessageContent
            return response
                .Completions.Select<string, ChatMessageContent>(completion =>
                    new(AuthorRole.Assistant, completion)
                )
                .ToImmutableList();
        }
    
        public async IAsyncEnumerable<StreamingChatMessageContent> GetStreamingChatMessageContentsAsync(
            ChatHistory chatHistory,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            [EnumeratorCancellation] CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object, specify that streaming is requested
            MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);
            request.Stream = true;
    
            // Send the completion request via HTTP
            using var httpClient = new HttpClient();
    
            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );
    
            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();
    
            // Read your model's response as a stream
            using StreamReader reader =
                new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));
    
            // Iteratively read a chunk of the response until the end of the stream
            // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
            // If the size of the internal buffer was unspecified when the stream was constructed, its default size is 4 kilobytes (2048 UTF-16 characters)
            char[] buffer = new char[2048];
            int readCount;
    
            // Read asynchronously so the calling thread is not blocked
            // ReadAsync returns 0 at the end of the stream and throws if the cancellation token is signaled
            while ((readCount = await reader.ReadAsync(buffer.AsMemory(), cancellationToken)) > 0)
            {
                // Convert the character buffer to a string, only include as many characters as were just read
                string chunk = new(buffer, 0, readCount);
    
                yield return new StreamingChatMessageContent(AuthorRole.Assistant, chunk);
            }
        }
    }
    
  2. Include the new service class when you build the Kernel. For example:

    IKernelBuilder builder = Kernel.CreateBuilder();
    
    // Add your chat completion service as a singleton instance
    builder.Services.AddKeyedSingleton<IChatCompletionService>(
        "myChatService1",
        new MyChatCompletionService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
    );
    
    // Alternatively, add your chat completion service as a factory method
    builder.Services.AddKeyedSingleton<IChatCompletionService>(
        "myChatService2",
        (_, _) =>
            new MyChatCompletionService
            {
                // Specify any properties specific to your service, such as the url or API key
                ModelUrl = "https://localhost:38748",
                ModelApiKey = "myApiKey"
            }
    );
    
    // Add any other Kernel services or configurations
    // ...
    Kernel kernel = builder.Build();
    
  3. Send a chat completion prompt to your model directly through the Kernel or by using the service class. For example:

    var executionSettings = new PromptExecutionSettings
    {
        // Add execution settings, such as the ModelID and ExtensionData
        ModelId = "MyModelId",
        ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
    };
    
    // Send a string representation of the chat history to your model directly through the Kernel
    // This uses a special syntax to denote the role for each message
    // For more information on this syntax see:
    // https://learn.microsoft.com/en-us/semantic-kernel/prompts/your-first-prompt?tabs=Csharp
    string prompt = """
        <message role="system">the initial system message for your chat history</message>
        <message role="user">the user's initial message</message>
        """;
    
    string? response = await kernel.InvokePromptAsync<string>(prompt);
    Console.WriteLine($"Output: {response}");
    
    // Alternatively, send a prompt to your model through the chat completion service
    // First, initialize a chat history with your initial system message
    string systemMessage = "<the initial system message for your chat history>";
    Console.WriteLine($"System Prompt: {systemMessage}");
    var chatHistory = new ChatHistory(systemMessage);
    
    // Add the user's input to your chat history
    string userRequest = "<the user's initial message>";
    Console.WriteLine($"User: {userRequest}");
    chatHistory.AddUserMessage(userRequest);
    
    // Get the model's response and add it to the chat history
    IChatCompletionService service = kernel.GetRequiredService<IChatCompletionService>();
    ChatMessageContent responseMessage = await service.GetChatMessageContentAsync(
        chatHistory,
        executionSettings
    );
    Console.WriteLine($"Assistant: {responseMessage.Content}");
    chatHistory.Add(responseMessage);
    
    // Continue sending and receiving messages between the user and model
    // ...
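
The service class in step 1 also implements `GetStreamingChatMessageContentsAsync`, which the steps above do not call. The following sketch shows one way to consume it with `await foreach`, printing chunks as they arrive and recording the full reply in the chat history. `FakeStreamingChatService` is a hypothetical in-memory stand-in, used only so the example runs without a model endpoint; in your application you would resolve your own service from the Kernel instead.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

// Stand-in for MyChatCompletionService (hypothetical, see class below)
IChatCompletionService service = new FakeStreamingChatService();

var chatHistory = new ChatHistory("You are a helpful assistant.");
chatHistory.AddUserMessage("Say hello.");

// Print each chunk as it arrives and accumulate the full reply
var reply = new StringBuilder();
await foreach (StreamingChatMessageContent chunk in
    service.GetStreamingChatMessageContentsAsync(chatHistory))
{
    Console.Write(chunk.Content);
    reply.Append(chunk.Content);
}
Console.WriteLine();

// Record the completed reply so that later turns include it
chatHistory.AddAssistantMessage(reply.ToString());

// Hypothetical in-memory service that streams a fixed reply in three chunks
class FakeStreamingChatService : IChatCompletionService
{
    public IReadOnlyDictionary<string, object?> Attributes { get; } =
        new Dictionary<string, object?>();

    public Task<IReadOnlyList<ChatMessageContent>> GetChatMessageContentsAsync(
        ChatHistory chatHistory,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        CancellationToken cancellationToken = default)
    {
        IReadOnlyList<ChatMessageContent> result =
            new[] { new ChatMessageContent(AuthorRole.Assistant, "Hello there!") };
        return Task.FromResult(result);
    }

    public async IAsyncEnumerable<StreamingChatMessageContent> GetStreamingChatMessageContentsAsync(
        ChatHistory chatHistory,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        foreach (string piece in new[] { "Hello", " ", "there!" })
        {
            cancellationToken.ThrowIfCancellationRequested();
            await Task.Yield();
            yield return new StreamingChatMessageContent(AuthorRole.Assistant, piece);
        }
    }
}
```

Accumulating the chunks before calling `AddAssistantMessage` keeps the chat history to one assistant message per turn, which matches what the non-streaming path stores.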