Use custom and local AI models with the Semantic Kernel SDK
This article demonstrates how to integrate custom and local models into the Semantic Kernel SDK and use them for text generation and chat completion.

You can adapt the steps to work with any model you can access, regardless of where or how it's hosted. For example, you can integrate the codellama model with the Semantic Kernel SDK to enable code generation and discussion.

Custom and local models often expose access through REST APIs; for example, see Ollama OpenAI compatibility. Before you integrate your model, it needs to be hosted and accessible to your .NET application over HTTPS.
Prerequisites

- An Azure account with an active subscription. Create an account for free.
- The .NET SDK
- The `Microsoft.SemanticKernel` NuGet package
- A custom or local model, deployed and accessible to your .NET application
Implement text generation using a local model

The following section shows how to integrate your model with the Semantic Kernel SDK and then use it to generate text completions.
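The examples that follow reference `MyModelRequest` and `MyModelResponse`. These are placeholder types that represent your model's request and response contracts, and their exact shape depends on the API your model exposes. The following is a minimal sketch assuming a simple JSON API that accepts a prompt and returns a list of completion strings; the property names and the `FromPrompt`/`FromChatHistory` helpers are illustrative assumptions, not part of the Semantic Kernel SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

// Hypothetical request contract for your model's API.
// Adjust the property names and serialization to match your model's actual schema.
class MyModelRequest
{
    public required string Prompt { get; set; }
    public bool Stream { get; set; }
    public int? MaxTokens { get; set; }

    public static MyModelRequest FromPrompt(string prompt, PromptExecutionSettings? settings)
    {
        var request = new MyModelRequest { Prompt = prompt };

        // Map any execution settings your model understands onto the request
        if (settings?.ExtensionData?.TryGetValue("MaxTokens", out var maxTokens) == true)
        {
            request.MaxTokens = Convert.ToInt32(maxTokens);
        }

        return request;
    }

    public static MyModelRequest FromChatHistory(ChatHistory history, PromptExecutionSettings? settings)
    {
        // Flatten the chat history into a single prompt; many local model APIs
        // accept a structured list of role/content messages instead
        string prompt = string.Join(
            Environment.NewLine,
            history.Select(message => $"{message.Role}: {message.Content}"));

        return FromPrompt(prompt, settings);
    }
}

// Hypothetical response contract: a list of completion strings
class MyModelResponse
{
    public IList<string> Completions { get; set; } = new List<string>();
}
```

If your model follows the OpenAI-compatible schema that Ollama exposes, you would shape these types to match that schema instead.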
Create a service class that implements the `ITextGenerationService` interface. For example:

```csharp
using System.Collections.Immutable;
using System.Net.Http.Json;
using System.Runtime.CompilerServices;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.TextGeneration;

class MyTextGenerationService : ITextGenerationService
{
    private IReadOnlyDictionary<string, object?>? _attributes;
    public IReadOnlyDictionary<string, object?> Attributes =>
        _attributes ??= new Dictionary<string, object?>();

    public string ModelUrl { get; init; } = "<default url to your model's Chat API>";
    public required string ModelApiKey { get; init; }

    public async IAsyncEnumerable<StreamingTextContent> GetStreamingTextContentsAsync(
        string prompt,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default
    )
    {
        // Build your model's request object, specify that streaming is requested
        MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);
        request.Stream = true;

        // Send the completion request via HTTP
        using var httpClient = new HttpClient();

        // Send a POST to your model with the serialized request in the body
        using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
            ModelUrl,
            request,
            cancellationToken
        );

        // Verify the request was completed successfully
        httpResponse.EnsureSuccessStatusCode();

        // Read your model's response as a stream
        using StreamReader reader =
            new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));

        // Iteratively read a chunk of the response until the end of the stream
        // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
        // If the size of the internal buffer was unspecified when the stream was constructed,
        // its default size is 4 kilobytes (2048 UTF-16 characters)
        char[] buffer = new char[2048];
        while (!reader.EndOfStream)
        {
            // Check the cancellation token with each iteration
            cancellationToken.ThrowIfCancellationRequested();

            // Fill the buffer with the next set of characters, track how many characters were read
            int readCount = reader.Read(buffer, 0, buffer.Length);

            // Convert the character buffer to a string, only include as many characters as were just read
            string chunk = new(buffer, 0, readCount);

            yield return new StreamingTextContent(chunk);
        }
    }

    public async Task<IReadOnlyList<TextContent>> GetTextContentsAsync(
        string prompt,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        CancellationToken cancellationToken = default
    )
    {
        // Build your model's request object
        MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);

        // Send the completion request via HTTP
        using var httpClient = new HttpClient();

        // Send a POST to your model with the serialized request in the body
        using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
            ModelUrl,
            request,
            cancellationToken
        );

        // Verify the request was completed successfully
        httpResponse.EnsureSuccessStatusCode();

        // Deserialize the response body to your model's response object
        // Handle when the deserialization fails and returns null
        MyModelResponse response =
            await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
            ?? throw new Exception("Failed to deserialize response from model");

        // Convert your model's response into a list of TextContent
        return response
            .Completions.Select<string, TextContent>(completion => new(completion))
            .ToImmutableList();
    }
}
```
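Note that `ModelApiKey` is declared but never attached to the outgoing request in this example. How credentials are passed depends entirely on your model's API; as a hedged sketch, assuming a bearer-token scheme, you could set the authorization header before posting:

```csharp
// Requires: using System.Net.Http.Headers;
// Assumption: the model's API expects a bearer token; adjust to match its actual auth scheme
httpClient.DefaultRequestHeaders.Authorization =
    new AuthenticationHeaderValue("Bearer", ModelApiKey);
```

Also, creating a new `HttpClient` per call keeps the sample self-contained; in a production service you would typically reuse a client or inject `IHttpClientFactory`.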
Include the new service class when building the `Kernel`. For example:

```csharp
IKernelBuilder builder = Kernel.CreateBuilder();

// Add your text generation service as a singleton instance
builder.Services.AddKeyedSingleton<ITextGenerationService>(
    "myTextService1",
    new MyTextGenerationService
    {
        // Specify any properties specific to your service, such as the url or API key
        ModelUrl = "https://localhost:38748",
        ModelApiKey = "myApiKey"
    }
);

// Alternatively, add your text generation service as a factory method
builder.Services.AddKeyedSingleton<ITextGenerationService>(
    "myTextService2",
    (_, _) => new MyTextGenerationService
    {
        // Specify any properties specific to your service, such as the url or API key
        ModelUrl = "https://localhost:38748",
        ModelApiKey = "myApiKey"
    }
);

// Add any other Kernel services or configurations
// ...

Kernel kernel = builder.Build();
```
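Because the registrations above are keyed, a specific instance can be resolved by its key later on. A brief example, using the key names from the snippet above:

```csharp
// Resolve the singleton registered under "myTextService1"
ITextGenerationService keyedService =
    kernel.GetRequiredService<ITextGenerationService>("myTextService1");
```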
Send a text generation prompt to your model directly through the `Kernel` or by using the service class. For example:

```csharp
var executionSettings = new PromptExecutionSettings
{
    // Add execution settings, such as the ModelID and ExtensionData
    ModelId = "MyModelId",
    ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
};

// Send a prompt to your model directly through the Kernel
// The Kernel response will be null if the model can't be reached
string prompt = "Please list three services offered by Azure";
string? response = await kernel.InvokePromptAsync<string>(prompt);
Console.WriteLine($"Output: {response}");

// Alternatively, send a prompt to your model through the text generation service
ITextGenerationService textService = kernel.GetRequiredService<ITextGenerationService>();
TextContent responseContents = await textService.GetTextContentAsync(
    prompt,
    executionSettings
);
Console.WriteLine($"Output: {responseContents.Text}");
```
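The service's streaming implementation can be consumed the same way. A minimal sketch, reusing `textService`, `prompt`, and `executionSettings` from the example above:

```csharp
// Print each chunk as the model streams it back
await foreach (StreamingTextContent chunk in
    textService.GetStreamingTextContentsAsync(prompt, executionSettings))
{
    Console.Write(chunk.Text);
}
```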
Implement chat completion using a local model

The following section shows how to integrate your model with the Semantic Kernel SDK and then use it for chat completion.
Create a service class that implements the `IChatCompletionService` interface. For example:

```csharp
using System.Collections.Immutable;
using System.Net.Http.Json;
using System.Runtime.CompilerServices;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

class MyChatCompletionService : IChatCompletionService
{
    private IReadOnlyDictionary<string, object?>? _attributes;
    public IReadOnlyDictionary<string, object?> Attributes =>
        _attributes ??= new Dictionary<string, object?>();

    public string ModelUrl { get; init; } = "<default url to your model's Chat API>";
    public required string ModelApiKey { get; init; }

    public async Task<IReadOnlyList<ChatMessageContent>> GetChatMessageContentsAsync(
        ChatHistory chatHistory,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        CancellationToken cancellationToken = default
    )
    {
        // Build your model's request object
        MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);

        // Send the completion request via HTTP
        using var httpClient = new HttpClient();

        // Send a POST to your model with the serialized request in the body
        using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
            ModelUrl,
            request,
            cancellationToken
        );

        // Verify the request was completed successfully
        httpResponse.EnsureSuccessStatusCode();

        // Deserialize the response body to your model's response object
        // Handle when the deserialization fails and returns null
        MyModelResponse response =
            await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
            ?? throw new Exception("Failed to deserialize response from model");

        // Convert your model's response into a list of ChatMessageContent
        return response
            .Completions.Select<string, ChatMessageContent>(completion =>
                new(AuthorRole.Assistant, completion)
            )
            .ToImmutableList();
    }

    public async IAsyncEnumerable<StreamingChatMessageContent> GetStreamingChatMessageContentsAsync(
        ChatHistory chatHistory,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default
    )
    {
        // Build your model's request object, specify that streaming is requested
        MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);
        request.Stream = true;

        // Send the completion request via HTTP
        using var httpClient = new HttpClient();

        // Send a POST to your model with the serialized request in the body
        using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
            ModelUrl,
            request,
            cancellationToken
        );

        // Verify the request was completed successfully
        httpResponse.EnsureSuccessStatusCode();

        // Read your model's response as a stream
        using StreamReader reader =
            new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));

        // Iteratively read a chunk of the response until the end of the stream
        // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
        // If the size of the internal buffer was unspecified when the stream was constructed,
        // its default size is 4 kilobytes (2048 UTF-16 characters)
        char[] buffer = new char[2048];
        while (!reader.EndOfStream)
        {
            // Check the cancellation token with each iteration
            cancellationToken.ThrowIfCancellationRequested();

            // Fill the buffer with the next set of characters, track how many characters were read
            int readCount = reader.Read(buffer, 0, buffer.Length);

            // Convert the character buffer to a string, only include as many characters as were just read
            string chunk = new(buffer, 0, readCount);

            yield return new StreamingChatMessageContent(AuthorRole.Assistant, chunk);
        }
    }
}
```
Include the new service class when building the `Kernel`. For example:

```csharp
IKernelBuilder builder = Kernel.CreateBuilder();

// Add your chat completion service as a singleton instance
builder.Services.AddKeyedSingleton<IChatCompletionService>(
    "myChatService1",
    new MyChatCompletionService
    {
        // Specify any properties specific to your service, such as the url or API key
        ModelUrl = "https://localhost:38748",
        ModelApiKey = "myApiKey"
    }
);

// Alternatively, add your chat completion service as a factory method
builder.Services.AddKeyedSingleton<IChatCompletionService>(
    "myChatService2",
    (_, _) => new MyChatCompletionService
    {
        // Specify any properties specific to your service, such as the url or API key
        ModelUrl = "https://localhost:38748",
        ModelApiKey = "myApiKey"
    }
);

// Add any other Kernel services or configurations
// ...

Kernel kernel = builder.Build();
```
Send a chat completion prompt to your model directly through the `Kernel` or by using the service class. For example:

```csharp
var executionSettings = new PromptExecutionSettings
{
    // Add execution settings, such as the ModelID and ExtensionData
    ModelId = "MyModelId",
    ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
};

// Send a string representation of the chat history to your model directly through the Kernel
// This uses a special syntax to denote the role for each message
// For more information on this syntax see:
// https://learn.microsoft.com/en-us/semantic-kernel/prompts/your-first-prompt?tabs=Csharp
string prompt = """
    <message role="system">the initial system message for your chat history</message>
    <message role="user">the user's initial message</message>
    """;

string? response = await kernel.InvokePromptAsync<string>(prompt);
Console.WriteLine($"Output: {response}");

// Alternatively, send a prompt to your model through the chat completion service
// First, initialize a chat history with your initial system message
string systemMessage = "<the initial system message for your chat history>";
Console.WriteLine($"System Prompt: {systemMessage}");
var chatHistory = new ChatHistory(systemMessage);

// Add the user's input to your chat history
string userRequest = "<the user's initial message>";
Console.WriteLine($"User: {userRequest}");
chatHistory.AddUserMessage(userRequest);

// Get the model's response and add it to the chat history
IChatCompletionService service = kernel.GetRequiredService<IChatCompletionService>();
ChatMessageContent responseMessage = await service.GetChatMessageContentAsync(
    chatHistory,
    executionSettings
);
Console.WriteLine($"Assistant: {responseMessage.Content}");
chatHistory.Add(responseMessage);

// Continue sending and receiving messages between the user and model
// ...
```
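As with text generation, the chat service also supports streaming. A minimal sketch continuing from the code above (add `using System.Text;` for `StringBuilder`):

```csharp
// Stream the assistant's reply, then record the full message in the chat history
var streamedReply = new StringBuilder();
await foreach (StreamingChatMessageContent chunk in
    service.GetStreamingChatMessageContentsAsync(chatHistory, executionSettings))
{
    Console.Write(chunk.Content);
    streamedReply.Append(chunk.Content);
}
chatHistory.AddAssistantMessage(streamedReply.ToString());
```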