Use custom and local AI models with the Semantic Kernel SDK
This article shows how to integrate custom and local models into the Semantic Kernel SDK and use them for text generation and chat completion.
You can adapt the steps to any model you have access to, regardless of where it is hosted or how you reach it. For example, you can integrate the codellama model with the Semantic Kernel SDK to enable code generation and discussion.
Custom and local models often expose their functionality through REST APIs. For an example, see Ollama OpenAI compatibility. Before you can integrate your model, it must be hosted and accessible to your .NET application over HTTPS.
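As an illustration of what "accessible over HTTPS through a REST API" can look like, here is a minimal sketch that builds (but does not send) a request for an OpenAI-compatible chat completions endpoint. The host, port, path, API key, and the BuildRequest helper are assumptions made for this example; your model's API may expect a different URL, payload, or authentication scheme.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Net.Http.Json;

// Hypothetical values: replace with your model's real endpoint and key
var endpoint = new Uri("https://localhost:38748/v1/chat/completions");
string apiKey = "myApiKey";

using HttpRequestMessage request = BuildRequest(
    endpoint, apiKey, "codellama", "Write a C# method that reverses a string.");

// Inspect the request without contacting any server
Console.WriteLine($"{request.Method} {request.RequestUri}");
Console.WriteLine(await request.Content!.ReadAsStringAsync());

// To actually send it, you would use something like:
//   using var client = new HttpClient();
//   using HttpResponseMessage response = await client.SendAsync(request);

// Hypothetical helper: builds a POST with a JSON body and a Bearer token
static HttpRequestMessage BuildRequest(Uri endpoint, string apiKey, string model, string userMessage)
{
    // Reject plain HTTP, since the model is expected to be reachable over HTTPS
    if (endpoint.Scheme != Uri.UriSchemeHttps)
    {
        throw new ArgumentException("The model endpoint must use HTTPS.", nameof(endpoint));
    }

    var request = new HttpRequestMessage(HttpMethod.Post, endpoint)
    {
        // OpenAI-style chat payload: a model name and a list of role/content messages
        Content = JsonContent.Create(new
        {
            model,
            messages = new[] { new { role = "user", content = userMessage } }
        })
    };
    request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
    return request;
}
```

Building the request separately from sending it makes it easy to check the URL, headers, and body against your model's documentation before wiring it into a service class.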
Prerequisites
- An Azure account with an active subscription. Create an account for free.
- The .NET SDK
- The Microsoft.SemanticKernel NuGet package
- A custom or local model, deployed and accessible to your .NET application
Implement text generation using a local model
The following section shows how to integrate your model with the Semantic Kernel SDK and use it to generate text completions.
Create a service class that implements the ITextGenerationService interface. In this example, MyModelRequest and MyModelResponse stand for your own types that model the request and response payloads of your model's API. For example:

    using System.Collections.Immutable;
    using System.Net.Http.Json;
    using System.Runtime.CompilerServices;
    using Microsoft.SemanticKernel;
    using Microsoft.SemanticKernel.TextGeneration;

    class MyTextGenerationService : ITextGenerationService
    {
        private IReadOnlyDictionary<string, object?>? _attributes;
        public IReadOnlyDictionary<string, object?> Attributes =>
            _attributes ??= new Dictionary<string, object?>();

        public string ModelUrl { get; init; } = "<default url to your model's Chat API>";

        // Note: this sample does not attach ModelApiKey to the requests it sends;
        // add whatever authentication your model's API requires
        public required string ModelApiKey { get; init; }

        public async IAsyncEnumerable<StreamingTextContent> GetStreamingTextContentsAsync(
            string prompt,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            [EnumeratorCancellation] CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object, specify that streaming is requested
            MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);
            request.Stream = true;

            // Send the completion request via HTTP
            using var httpClient = new HttpClient();

            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );

            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();

            // Read your model's response as a stream
            using StreamReader reader =
                new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));

            // Iteratively read a chunk of the response until the end of the stream
            // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
            // If the size of the internal buffer was unspecified when the stream was constructed,
            // its default size is 4 kilobytes (2048 UTF-16 characters)
            char[] buffer = new char[2048];
            while (!reader.EndOfStream)
            {
                // Check the cancellation token with each iteration
                cancellationToken.ThrowIfCancellationRequested();

                // Fill the buffer with the next set of characters, track how many characters were read
                int readCount = reader.Read(buffer, 0, buffer.Length);

                // Convert the character buffer to a string, only include as many characters as were just read
                string chunk = new(buffer, 0, readCount);

                yield return new StreamingTextContent(chunk);
            }
        }

        public async Task<IReadOnlyList<TextContent>> GetTextContentsAsync(
            string prompt,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object
            MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);

            // Send the completion request via HTTP
            using var httpClient = new HttpClient();

            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );

            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();

            // Deserialize the response body to your model's response object
            // Handle when the deserialization fails and returns null
            MyModelResponse response =
                await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
                ?? throw new Exception("Failed to deserialize response from model");

            // Convert your model's response into a list of TextContent
            return response
                .Completions.Select<string, TextContent>(completion => new(completion))
                .ToImmutableList();
        }
    }
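The service class above depends on two types, MyModelRequest and MyModelResponse, whose shape is dictated by your model's API. The following is a minimal sketch of what they might look like. The JSON property names (prompt, stream, max_tokens, completions) are assumptions chosen for illustration, and to keep the sketch free of package dependencies, FromPrompt here takes the extension-data dictionary directly; in the service you would pass executionSettings?.ExtensionData or accept PromptExecutionSettings itself.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Build a request from a prompt and hypothetical settings, then show its JSON form
var request = MyModelRequest.FromPrompt(
    "Please list three services offered by Azure",
    new Dictionary<string, object> { ["MaxTokens"] = 500 });
string json = JsonSerializer.Serialize(request);
Console.WriteLine(json);

// Parse a hypothetical response body into the response type
MyModelResponse? response = JsonSerializer.Deserialize<MyModelResponse>(
    """{"completions":["Compute, storage, and networking."]}""");
Console.WriteLine(response!.Completions[0]);

// Hypothetical request payload; adjust the property names to your model's API
public sealed class MyModelRequest
{
    [JsonPropertyName("prompt")]
    public string Prompt { get; init; } = "";

    [JsonPropertyName("stream")]
    public bool Stream { get; set; }

    [JsonPropertyName("max_tokens")]
    public int? MaxTokens { get; init; }

    public static MyModelRequest FromPrompt(
        string prompt,
        IDictionary<string, object>? extensionData = null)
    {
        // Copy over the one setting this sketch understands.
        // Convert.ToInt32 assumes the value is a plain numeric type.
        int? maxTokens = null;
        if (extensionData is not null && extensionData.TryGetValue("MaxTokens", out object? value))
        {
            maxTokens = Convert.ToInt32(value);
        }

        return new MyModelRequest { Prompt = prompt, MaxTokens = maxTokens };
    }
}

// Hypothetical response payload: one string per generated completion
public sealed class MyModelResponse
{
    [JsonPropertyName("completions")]
    public List<string> Completions { get; init; } = new();
}
```

Keeping the mapping from execution settings to request fields inside a single factory method means the service class does not need to know which settings your model supports.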
Include the new service class when building the Kernel. For example:

    IKernelBuilder builder = Kernel.CreateBuilder();

    // Add your text generation service as a singleton instance
    builder.Services.AddKeyedSingleton<ITextGenerationService>(
        "myTextService1",
        new MyTextGenerationService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
    );

    // Alternatively, add your text generation service as a factory method
    builder.Services.AddKeyedSingleton<ITextGenerationService>(
        "myTextService2",
        (_, _) => new MyTextGenerationService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
    );

    // Add any other Kernel services or configurations
    // ...
    Kernel kernel = builder.Build();
Send a text generation prompt to your model, either directly through the Kernel or by using the service class. For example:

    var executionSettings = new PromptExecutionSettings
    {
        // Add execution settings, such as the ModelID and ExtensionData
        ModelId = "MyModelId",
        ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
    };

    // Send a prompt to your model directly through the Kernel
    // The Kernel response will be null if the model can't be reached
    string prompt = "Please list three services offered by Azure";
    string? response = await kernel.InvokePromptAsync<string>(prompt);
    Console.WriteLine($"Output: {response}");

    // Alternatively, send a prompt to your model through the text generation service
    ITextGenerationService textService = kernel.GetRequiredService<ITextGenerationService>();
    TextContent responseContents = await textService.GetTextContentAsync(
        prompt,
        executionSettings
    );
    Console.WriteLine($"Output: {responseContents.Text}");
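The streaming method of the service class returns the model's output piece by piece, and a caller consumes it with await foreach. The following sketch isolates that chunked-read pattern so it can run without a model: a MemoryStream stands in for the HTTP response body, and ReadChunksAsync is a hypothetical helper, not part of the Semantic Kernel SDK. It uses the asynchronous StreamReader.ReadAsync overload rather than the synchronous Read, which is one possible variation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the body of an HTTP response returned by a model
using var body = new MemoryStream(
    Encoding.UTF8.GetBytes("Azure offers compute, storage, and networking services."));

// Consume the chunks as they arrive, the same way a caller would consume
// an IAsyncEnumerable of streaming content
var result = new StringBuilder();
await foreach (string chunk in ReadChunksAsync(body, bufferSize: 16))
{
    Console.Write(chunk);
    result.Append(chunk);
}
Console.WriteLine();

// Hypothetical helper: yields the stream's text in chunks of at most bufferSize characters
static async IAsyncEnumerable<string> ReadChunksAsync(
    Stream stream,
    int bufferSize = 2048,
    CancellationToken cancellationToken = default)
{
    using var reader = new StreamReader(stream, Encoding.UTF8);
    char[] buffer = new char[bufferSize];

    // ReadAsync returns 0 once the end of the stream is reached
    int readCount;
    while ((readCount = await reader.ReadAsync(buffer.AsMemory(), cancellationToken)) > 0)
    {
        // Only include as many characters as were just read
        yield return new string(buffer, 0, readCount);
    }
}
```

With the real service, the loop body would typically write each chunk to the console or UI as soon as it arrives instead of waiting for the full completion.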
Implement chat completion using a local model
The following section shows how to integrate your model with the Semantic Kernel SDK and use it for chat completion.
Create a service class that implements the IChatCompletionService interface. In this example, MyModelRequest and MyModelResponse stand for your own types that model the request and response payloads of your model's API. For example:

    using System.Collections.Immutable;
    using System.Net.Http.Json;
    using System.Runtime.CompilerServices;
    using Microsoft.SemanticKernel;
    using Microsoft.SemanticKernel.ChatCompletion;

    class MyChatCompletionService : IChatCompletionService
    {
        private IReadOnlyDictionary<string, object?>? _attributes;
        public IReadOnlyDictionary<string, object?> Attributes =>
            _attributes ??= new Dictionary<string, object?>();

        public string ModelUrl { get; init; } = "<default url to your model's Chat API>";

        // Note: this sample does not attach ModelApiKey to the requests it sends;
        // add whatever authentication your model's API requires
        public required string ModelApiKey { get; init; }

        public async Task<IReadOnlyList<ChatMessageContent>> GetChatMessageContentsAsync(
            ChatHistory chatHistory,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object
            MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);

            // Send the completion request via HTTP
            using var httpClient = new HttpClient();

            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );

            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();

            // Deserialize the response body to your model's response object
            // Handle when the deserialization fails and returns null
            MyModelResponse response =
                await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
                ?? throw new Exception("Failed to deserialize response from model");

            // Convert your model's response into a list of ChatMessageContent
            return response
                .Completions.Select<string, ChatMessageContent>(completion =>
                    new(AuthorRole.Assistant, completion)
                )
                .ToImmutableList();
        }

        public async IAsyncEnumerable<StreamingChatMessageContent> GetStreamingChatMessageContentsAsync(
            ChatHistory chatHistory,
            PromptExecutionSettings? executionSettings = null,
            Kernel? kernel = null,
            [EnumeratorCancellation] CancellationToken cancellationToken = default
        )
        {
            // Build your model's request object, specify that streaming is requested
            MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);
            request.Stream = true;

            // Send the completion request via HTTP
            using var httpClient = new HttpClient();

            // Send a POST to your model with the serialized request in the body
            using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
                ModelUrl,
                request,
                cancellationToken
            );

            // Verify the request was completed successfully
            httpResponse.EnsureSuccessStatusCode();

            // Read your model's response as a stream
            using StreamReader reader =
                new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));

            // Iteratively read a chunk of the response until the end of the stream
            // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
            // If the size of the internal buffer was unspecified when the stream was constructed,
            // its default size is 4 kilobytes (2048 UTF-16 characters)
            char[] buffer = new char[2048];
            while (!reader.EndOfStream)
            {
                // Check the cancellation token with each iteration
                cancellationToken.ThrowIfCancellationRequested();

                // Fill the buffer with the next set of characters, track how many characters were read
                int readCount = reader.Read(buffer, 0, buffer.Length);

                // Convert the character buffer to a string, only include as many characters as were just read
                string chunk = new(buffer, 0, readCount);

                yield return new StreamingChatMessageContent(AuthorRole.Assistant, chunk);
            }
        }
    }
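The chat service above calls MyModelRequest.FromChatHistory, whose job is to turn the conversation into whatever payload your model expects. As a rough sketch of that conversion, the example below maps a list of role and content pairs to a JSON body with a messages array. The payload shape and the BuildChatRequestJson helper are assumptions for illustration; in the service, the pairs would come from the ChatHistory entries (for instance, each message's Role.Label and Content).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical conversation; in the service these pairs would come from the ChatHistory
var history = new List<(string Role, string Content)>
{
    ("system", "You are a helpful assistant."),
    ("user", "Please list three services offered by Azure")
};

string json = BuildChatRequestJson(history, stream: true);
Console.WriteLine(json);

// Hypothetical helper: serializes the conversation into a request body
// with one role/content object per message, in conversation order
static string BuildChatRequestJson(
    IEnumerable<(string Role, string Content)> history,
    bool stream = false)
{
    var payload = new
    {
        messages = history
            .Select(m => new { role = m.Role, content = m.Content })
            .ToArray(),
        stream
    };
    return JsonSerializer.Serialize(payload);
}
```

Preserving message order and roles matters here, because the model uses them to tell system instructions, user input, and its own earlier replies apart.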
Include the new service class when building the Kernel. For example:

    IKernelBuilder builder = Kernel.CreateBuilder();

    // Add your chat completion service as a singleton instance
    builder.Services.AddKeyedSingleton<IChatCompletionService>(
        "myChatService1",
        new MyChatCompletionService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
    );

    // Alternatively, add your chat completion service as a factory method
    builder.Services.AddKeyedSingleton<IChatCompletionService>(
        "myChatService2",
        (_, _) => new MyChatCompletionService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
    );

    // Add any other Kernel services or configurations
    // ...
    Kernel kernel = builder.Build();
Send a chat completion prompt to your model, either directly through the Kernel or by using the service class. For example:

    var executionSettings = new PromptExecutionSettings
    {
        // Add execution settings, such as the ModelID and ExtensionData
        ModelId = "MyModelId",
        ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
    };

    // Send a string representation of the chat history to your model directly through the Kernel
    // This uses a special syntax to denote the role for each message
    // For more information on this syntax see:
    // https://learn.microsoft.com/en-us/semantic-kernel/prompts/your-first-prompt?tabs=Csharp
    string prompt = """
        <message role="system">the initial system message for your chat history</message>
        <message role="user">the user's initial message</message>
        """;

    string? response = await kernel.InvokePromptAsync<string>(prompt);
    Console.WriteLine($"Output: {response}");

    // Alternatively, send a prompt to your model through the chat completion service
    // First, initialize a chat history with your initial system message
    string systemMessage = "<the initial system message for your chat history>";
    Console.WriteLine($"System Prompt: {systemMessage}");
    var chatHistory = new ChatHistory(systemMessage);

    // Add the user's input to your chat history
    string userRequest = "<the user's initial message>";
    Console.WriteLine($"User: {userRequest}");
    chatHistory.AddUserMessage(userRequest);

    // Get the model's response and add it to the chat history
    IChatCompletionService service = kernel.GetRequiredService<IChatCompletionService>();
    ChatMessageContent responseMessage = await service.GetChatMessageContentAsync(
        chatHistory,
        executionSettings
    );
    Console.WriteLine($"Assistant: {responseMessage.Content}");
    chatHistory.Add(responseMessage);

    // Continue sending and receiving messages between the user and model
    // ...