Verwenden von benutzerdefinierten und lokalen KI-Modellen mit dem Semantic Kernel SDK
In diesem Artikel werden die Integration von benutzerdefinierten und lokalen Modellen in das Semantic Kernel SDK und ihre Verwendung für Textgenerierung und Chatvervollständigung erläutert.
Sie können die Schritte anpassen, um sie mit jedem Modell zu verwenden, auf das Sie zugreifen können, unabhängig davon, wo oder wie Sie darauf zugreifen. Sie können z. B. das Codellama-Modell in das Semantic Kernel SDK integrieren, um Codegenerierung und Diskussionen zu ermöglichen.
Benutzerdefinierte und lokale Modelle bieten häufig Zugriff über REST-APIs, siehe z. B. Ollama OpenAI-Kompatibilität. Bevor Sie Ihr Modell integrieren, muss es über HTTPS gehostet und für Ihre .NET-Anwendung zugänglich gemacht werden.
Voraussetzungen
- Ein Azure-Konto mit einem aktiven Abonnement. Sie können kostenlos ein Konto erstellen.
- .NET SDK
- NuGet-Paket „
Microsoft.SemanticKernel
“ - Ein benutzerdefiniertes oder lokales Modell, das für Ihre .NET-Anwendung bereitgestellt wurde und zugänglich ist
Implementieren der Textgenerierung mithilfe eines lokalen Modells
Im folgenden Abschnitt wird gezeigt, wie Sie Ihr Modell in das Semantic Kernel SDK integrieren und es dann verwenden können, um Textvervollständigungen zu generieren.
Erstellen Sie eine Dienstklasse, die die
ITextGenerationService
-Schnittstelle implementiert. Zum Beispiel:class MyTextGenerationService : ITextGenerationService { private IReadOnlyDictionary<string, object?>? _attributes; public IReadOnlyDictionary<string, object?> Attributes => _attributes ??= new Dictionary<string, object?>(); public string ModelUrl { get; init; } = "<default url to your model's Chat API>"; public required string ModelApiKey { get; init; } public async IAsyncEnumerable<StreamingTextContent> GetStreamingTextContentsAsync( string prompt, PromptExecutionSettings? executionSettings = null, Kernel? kernel = null, [EnumeratorCancellation] CancellationToken cancellationToken = default ) { // Build your model's request object, specify that streaming is requested MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings); request.Stream = true; // Send the completion request via HTTP using var httpClient = new HttpClient(); // Send a POST to your model with the serialized request in the body using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync( ModelUrl, request, cancellationToken ); // Verify the request was completed successfully httpResponse.EnsureSuccessStatusCode(); // Read your models response as a stream using StreamReader reader = new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken)); // Iteratively read a chunk of the response until the end of the stream // It is more efficient to use a buffer that is the same size as the internal buffer of the stream // If the size of the internal buffer was unspecified when the stream was constructed, its default size is 4 kilobytes (2048 UTF-16 characters) char[] buffer = new char[2048]; while (!reader.EndOfStream) { // Check the cancellation token with each iteration cancellationToken.ThrowIfCancellationRequested(); // Fill the buffer with the next set of characters, track how many characters were read int readCount = reader.Read(buffer, 0, buffer.Length); // Convert the character buffer to a string, only include as many characters as were just read string chunk = new(buffer, 0, readCount); yield return new StreamingTextContent(chunk); } } public async Task<IReadOnlyList<TextContent>> GetTextContentsAsync( string prompt, PromptExecutionSettings? executionSettings = null, Kernel? kernel = null, CancellationToken cancellationToken = default ) { // Build your model's request object MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings); // Send the completion request via HTTP using var httpClient = new HttpClient(); // Send a POST to your model with the serialized request in the body using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync( ModelUrl, request, cancellationToken ); // Verify the request was completed successfully httpResponse.EnsureSuccessStatusCode(); // Deserialize the response body to your model's response object // Handle when the deserialization fails and returns null MyModelResponse response = await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken) ?? throw new Exception("Failed to deserialize response from model"); // Convert your model's response into a list of ChatMessageContent return response .Completions.Select<string, TextContent>(completion => new(completion)) .ToImmutableList(); } }
Fügen Sie beim Erstellen von
Kernel
die neue Dienstklasse ein. Zum Beispiel:IKernelBuilder builder = Kernel.CreateBuilder(); // Add your text generation service as a singleton instance builder.Services.AddKeyedSingleton<ITextGenerationService>( "myTextService1", new MyTextGenerationService { // Specify any properties specific to your service, such as the url or API key ModelUrl = "https://localhost:38748", ModelApiKey = "myApiKey" } ); // Alternatively, add your text generation service as a factory method builder.Services.AddKeyedSingleton<ITextGenerationService>( "myTextService2", (_, _) => new MyTextGenerationService { // Specify any properties specific to your service, such as the url or API key ModelUrl = "https://localhost:38748", ModelApiKey = "myApiKey" } ); // Add any other Kernel services or configurations // ... Kernel kernel = builder.Build();
Senden Sie eine Textgenerierungsaufforderung direkt über
Kernel
oder die Dienstklasse an Ihr Modell. Zum Beispiel:var executionSettings = new PromptExecutionSettings { // Add execution settings, such as the ModelID and ExtensionData ModelId = "MyModelId", ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } } }; // Send a prompt to your model directly through the Kernel // The Kernel response will be null if the model can't be reached string prompt = "Please list three services offered by Azure"; string? response = await kernel.InvokePromptAsync<string>(prompt); Console.WriteLine($"Output: {response}"); // Alteratively, send a prompt to your model through the text generation service ITextGenerationService textService = kernel.GetRequiredService<ITextGenerationService>(); TextContent responseContents = await textService.GetTextContentAsync( prompt, executionSettings ); Console.WriteLine($"Output: {responseContents.Text}");
Implementieren der Chatvervollständigung mithilfe eines lokalen Modells
Im folgenden Abschnitt wird gezeigt, wie Sie Ihr Modell in das Semantic Kernel SDK integrieren und es dann für Chatvervollständigungen verwenden können.
Erstellen Sie eine Dienstklasse, die die
IChatCompletionService
-Schnittstelle implementiert. Zum Beispiel:class MyChatCompletionService : IChatCompletionService { private IReadOnlyDictionary<string, object?>? _attributes; public IReadOnlyDictionary<string, object?> Attributes => _attributes ??= new Dictionary<string, object?>(); public string ModelUrl { get; init; } = "<default url to your model's Chat API>"; public required string ModelApiKey { get; init; } public async Task<IReadOnlyList<ChatMessageContent>> GetChatMessageContentsAsync( ChatHistory chatHistory, PromptExecutionSettings? executionSettings = null, Kernel? kernel = null, CancellationToken cancellationToken = default ) { // Build your model's request object MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings); // Send the completion request via HTTP using var httpClient = new HttpClient(); // Send a POST to your model with the serialized request in the body using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync( ModelUrl, request, cancellationToken ); // Verify the request was completed successfully httpResponse.EnsureSuccessStatusCode(); // Deserialize the response body to your model's response object // Handle when the deserialization fails and returns null MyModelResponse response = await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken) ?? throw new Exception("Failed to deserialize response from model"); // Convert your model's response into a list of ChatMessageContent return response .Completions.Select<string, ChatMessageContent>(completion => new(AuthorRole.Assistant, completion) ) .ToImmutableList(); } public async IAsyncEnumerable<StreamingChatMessageContent> GetStreamingChatMessageContentsAsync( ChatHistory chatHistory, PromptExecutionSettings? executionSettings = null, Kernel? kernel = null, [EnumeratorCancellation] CancellationToken cancellationToken = default ) { // Build your model's request object, specify that streaming is requested MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings); request.Stream = true; // Send the completion request via HTTP using var httpClient = new HttpClient(); // Send a POST to your model with the serialized request in the body using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync( ModelUrl, request, cancellationToken ); // Verify the request was completed successfully httpResponse.EnsureSuccessStatusCode(); // Read your models response as a stream using StreamReader reader = new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken)); // Iteratively read a chunk of the response until the end of the stream // It is more efficient to use a buffer that is the same size as the internal buffer of the stream // If the size of the internal buffer was unspecified when the stream was constructed, its default size is 4 kilobytes (2048 UTF-16 characters) char[] buffer = new char[2048]; while (!reader.EndOfStream) { // Check the cancellation token with each iteration cancellationToken.ThrowIfCancellationRequested(); // Fill the buffer with the next set of characters, track how many characters were read int readCount = reader.Read(buffer, 0, buffer.Length); // Convert the character buffer to a string, only include as many characters as were just read string chunk = new(buffer, 0, readCount); yield return new StreamingChatMessageContent(AuthorRole.Assistant, chunk); } } }
Fügen Sie beim Erstellen von
Kernel
die neue Dienstklasse ein. Zum Beispiel:IKernelBuilder builder = Kernel.CreateBuilder(); // Add your chat completion service as a singleton instance builder.Services.AddKeyedSingleton<IChatCompletionService>( "myChatService1", new MyChatCompletionService { // Specify any properties specific to your service, such as the url or API key ModelUrl = "https://localhost:38748", ModelApiKey = "myApiKey" } ); // Alternatively, add your chat completion service as a factory method builder.Services.AddKeyedSingleton<IChatCompletionService>( "myChatService2", (_, _) => new MyChatCompletionService { // Specify any properties specific to your service, such as the url or API key ModelUrl = "https://localhost:38748", ModelApiKey = "myApiKey" } ); // Add any other Kernel services or configurations // ... Kernel kernel = builder.Build();
Senden Sie eine Chatvervollständigungsaufforderung direkt über
Kernel
oder die Dienstklasse an Ihr Modell. Zum Beispiel:var executionSettings = new PromptExecutionSettings { // Add execution settings, such as the ModelID and ExtensionData ModelId = "MyModelId", ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } } }; // Send a string representation of the chat history to your model directly through the Kernel // This uses a special syntax to denote the role for each message // For more information on this syntax see: // https://learn.microsoft.com/en-us/semantic-kernel/prompts/your-first-prompt?tabs=Csharp string prompt = """ <message role="system">the initial system message for your chat history</message> <message role="user">the user's initial message</message> """; string? response = await kernel.InvokePromptAsync<string>(prompt); Console.WriteLine($"Output: {response}"); // Alteratively, send a prompt to your model through the chat completion service // First, initialize a chat history with your initial system message string systemMessage = "<the initial system message for your chat history>"; Console.WriteLine($"System Prompt: {systemMessage}"); var chatHistory = new ChatHistory(systemMessage); // Add the user's input to your chat history string userRequest = "<the user's initial message>"; Console.WriteLine($"User: {userRequest}"); chatHistory.AddUserMessage(userRequest); // Get the models response and add it to the chat history IChatCompletionService service = kernel.GetRequiredService<IChatCompletionService>(); ChatMessageContent responseMessage = await service.GetChatMessageContentAsync( chatHistory, executionSettings ); Console.WriteLine($"Assistant: {responseMessage.Content}"); chatHistory.Add(responseMessage); // Continue sending and receiving messages between the user and model // ...