Como fazer: Open AI Assistant Agent Code Interpreter (Experimental)

Artigo
11/03/2024

Aviso

O Semantic Kernel Agent Framework é experimental, ainda está em desenvolvimento e está sujeito a alterações.

Descrição geral

Neste exemplo, exploraremos como usar a ferramenta de interpretação de código de um Open AI Assistant Agent para concluir tarefas de análise de dados. A abordagem será dividida passo a passo para destacar as principais partes do processo de codificação. Como parte da tarefa, o agente gerará respostas de imagem e texto. Isso demonstrará a versatilidade desta ferramenta na realização de análises quantitativas.

O streaming será usado para entregar as respostas do agente. Isso fornecerá atualizações em tempo real à medida que a tarefa progride.

Introdução

Antes de prosseguir com a codificação de recursos, verifique se o ambiente de desenvolvimento está totalmente configurado e configurado.

Comece criando um projeto de console . Em seguida, inclua as seguintes referências de pacote para garantir que todas as dependências necessárias estejam disponíveis.

Para adicionar dependências de pacote a partir da linha de comando, use o dotnet comando:

dotnet add package Azure.Identity
dotnet add package Microsoft.Extensions.Configuration
dotnet add package Microsoft.Extensions.Configuration.Binder
dotnet add package Microsoft.Extensions.Configuration.UserSecrets
dotnet add package Microsoft.Extensions.Configuration.EnvironmentVariables
dotnet add package Microsoft.SemanticKernel
dotnet add package Microsoft.SemanticKernel.Agents.OpenAI --prerelease

O ficheiro de projeto (.csproj) deve conter as seguintes PackageReference definições:

  <ItemGroup>
    <PackageReference Include="Azure.Identity" Version="<stable>" />
    <PackageReference Include="Microsoft.Extensions.Configuration" Version="<stable>" />
    <PackageReference Include="Microsoft.Extensions.Configuration.Binder" Version="<stable>" />
    <PackageReference Include="Microsoft.Extensions.Configuration.UserSecrets" Version="<stable>" />
    <PackageReference Include="Microsoft.Extensions.Configuration.EnvironmentVariables" Version="<stable>" />
    <PackageReference Include="Microsoft.SemanticKernel" Version="<latest>" />
    <PackageReference Include="Microsoft.SemanticKernel.Agents.OpenAI" Version="<latest>" />
  </ItemGroup>

O Agent Framework é experimental e requer supressão de aviso. Isso pode ser abordado como uma propriedade no arquivo de projeto (.csproj):

  <PropertyGroup>
    <NoWarn>$(NoWarn);CA2007;IDE1006;SKEXP0001;SKEXP0110;OPENAI001</NoWarn>
  </PropertyGroup>

Além disso, copie os arquivos e dados do Semantic KernelLearnResources Project.PopulationByCountry.csv PopulationByAdmin1.csv Adicione esses arquivos na pasta do projeto e configure para copiá-los para o diretório de saída:

  <ItemGroup>
    <None Include="PopulationByAdmin1.csv">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
    <None Include="PopulationByCountry.csv">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
  </ItemGroup>

Comece criando uma pasta que armazenará seu script (.py arquivo) e os recursos de exemplo. Inclua as seguintes importações na parte superior do arquivo .py :

import asyncio
import os

from semantic_kernel.agents.open_ai.azure_assistant_agent import AzureAssistantAgent
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.contents.streaming_file_reference_content import StreamingFileReferenceContent
from semantic_kernel.contents.utils.author_role import AuthorRole
from semantic_kernel.kernel import Kernel

Além disso, copie os arquivos e dados do Semantic KernelLearnResources Project.PopulationByCountry.csv PopulationByAdmin1.csv Adicione esses arquivos na pasta do projeto.

Configuração

Este exemplo requer definição de configuração para se conectar a serviços remotos. Você precisará definir configurações para Open AI ou Azure Open AI.

# Open AI
dotnet user-secrets set "OpenAISettings:ApiKey" "<api-key>"
dotnet user-secrets set "OpenAISettings:ChatModel" "gpt-4o"

# Azure Open AI
dotnet user-secrets set "AzureOpenAISettings:ApiKey" "<api-key>" # Not required if using token-credential
dotnet user-secrets set "AzureOpenAISettings:Endpoint" "<model-endpoint>"
dotnet user-secrets set "AzureOpenAISettings:ChatModelDeployment" "gpt-4o"

A classe a seguir é usada em todos os exemplos de agente. Certifique-se de incluí-lo em seu projeto para garantir a funcionalidade adequada. Esta classe serve como um componente fundamental para os exemplos que se seguem.

using System.Reflection;
using Microsoft.Extensions.Configuration;

namespace AgentsSample;

public class Settings
{
    private readonly IConfigurationRoot configRoot;

    private AzureOpenAISettings azureOpenAI;
    private OpenAISettings openAI;

    public AzureOpenAISettings AzureOpenAI => this.azureOpenAI ??= this.GetSettings<Settings.AzureOpenAISettings>();
    public OpenAISettings OpenAI => this.openAI ??= this.GetSettings<Settings.OpenAISettings>();

    public class OpenAISettings
    {
        public string ChatModel { get; set; } = string.Empty;
        public string ApiKey { get; set; } = string.Empty;
    }

    public class AzureOpenAISettings
    {
        public string ChatModelDeployment { get; set; } = string.Empty;
        public string Endpoint { get; set; } = string.Empty;
        public string ApiKey { get; set; } = string.Empty;
    }

    public TSettings GetSettings<TSettings>() =>
        this.configRoot.GetRequiredSection(typeof(TSettings).Name).Get<TSettings>()!;

    public Settings()
    {
        this.configRoot =
            new ConfigurationBuilder()
                .AddEnvironmentVariables()
                .AddUserSecrets(Assembly.GetExecutingAssembly(), optional: true)
                .Build();
    }
}

A maneira mais rápida de começar com a configuração adequada para executar o código de exemplo é criar um .env arquivo na raiz do seu projeto (onde o script é executado).

Configure as seguintes configurações em seu .env arquivo para o Azure OpenAI ou OpenAI:

AZURE_OPENAI_API_KEY="..."
AZURE_OPENAI_ENDPOINT="https://..."
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME="..."
AZURE_OPENAI_API_VERSION="..."

OPENAI_API_KEY="sk-..."
OPENAI_ORG_ID=""
OPENAI_CHAT_MODEL_ID=""

Uma vez configuradas, as respetivas classes de serviço de IA pegarão as variáveis necessárias e as usarão durante a instanciação.

Codificação

O processo de codificação para este exemplo envolve:

Configuração - Inicializando as configurações e o plug-in.
Definição do agente - Crie o OpenAI_Assistant_Agent com instruções modeladas e plug-in.
The Chat Loop - Escreva o loop que impulsiona a interação usuário/agente.

O código de exemplo completo é fornecido na seção Final . Consulte essa seção para obter a implementação completa.

Configurar

Antes de criar um Open AI Assistant Agent, verifique se as definições de configuração estão disponíveis e prepare os recursos do arquivo.

Instancie a Settings classe referenciada na seção Configuração anterior. Use as configurações para também criar um OpenAIClientProvider que será usado para a Definição do Agente, bem como para o upload de arquivos.

Settings settings = new();

OpenAIClientProvider clientProvider =
    OpenAIClientProvider.ForAzureOpenAI(new AzureCliCredential(), new Uri(settings.AzureOpenAI.Endpoint));

Use o OpenAIClientProvider para acessar e OpenAIFileClient carregar os dois arquivos de dados descritos na seção Configuração anterior, preservando a Referência de arquivo para limpeza final.

Console.WriteLine("Uploading files...");
OpenAIFileClient fileClient = clientProvider.Client.GetOpenAIFileClient();
OpenAIFile fileDataCountryDetail = await fileClient.UploadFileAsync("PopulationByAdmin1.csv", FileUploadPurpose.Assistants);
OpenAIFile fileDataCountryList = await fileClient.UploadFileAsync("PopulationByCountry.csv", FileUploadPurpose.Assistants);

# Let's form the file paths that we will later pass to the assistant
csv_file_path_1 = os.path.join(
    os.path.dirname(os.path.dirname(os.path.realpath(__file__))),
    "PopulationByAdmin1.csv",
)

csv_file_path_2 = os.path.join(
    os.path.dirname(os.path.dirname(os.path.realpath(__file__))),
    "PopulationByCountry.csv",
)

Definição do agente

Agora estamos prontos para instanciar um OpenAI Assistant Agent. O agente é configurado com seu modelo de destino, Instruções e a ferramenta Interpretador de código habilitada. Além disso, associamos explicitamente os dois arquivos de dados à ferramenta Interpretador de código.

Console.WriteLine("Defining agent...");
OpenAIAssistantAgent agent =
    await OpenAIAssistantAgent.CreateAsync(
        clientProvider,
        new OpenAIAssistantDefinition(settings.AzureOpenAI.ChatModelDeployment)
        {
            Name = "SampleAssistantAgent",
            Instructions =
                """
                Analyze the available data to provide an answer to the user's question.
                Always format response using markdown.
                Always include a numerical index that starts at 1 for any lists or tables.
                Always sort lists in ascending order.
                """,
            EnableCodeInterpreter = true,
            CodeInterpreterFileIds = [fileDataCountryList.Id, fileDataCountryDetail.Id],
        },
        new Kernel());

agent = await AzureAssistantAgent.create(
        kernel=Kernel(),
        service_id="agent",
        name="SampleAssistantAgent",
        instructions="""
                Analyze the available data to provide an answer to the user's question.
                Always format response using markdown.
                Always include a numerical index that starts at 1 for any lists or tables.
                Always sort lists in ascending order.
                """,
        enable_code_interpreter=True,
        code_interpreter_filenames=[csv_file_path_1, csv_file_path_2],
    )

O Loop do Chat

Finalmente, somos capazes de coordenar a interação entre o usuário e o Agente. Comece criando um Thread Assistente para manter o estado da conversa e criando um loop vazio.

Vamos também garantir que os recursos sejam removidos no final da execução para minimizar cobranças desnecessárias.

Console.WriteLine("Creating thread...");
string threadId = await agent.CreateThreadAsync();

Console.WriteLine("Ready!");

try
{
    bool isComplete = false;
    List<string> fileIds = [];
    do
    {

    } while (!isComplete);
}
finally
{
    Console.WriteLine();
    Console.WriteLine("Cleaning-up...");
    await Task.WhenAll(
        [
            agent.DeleteThreadAsync(threadId),
            agent.DeleteAsync(),
            fileClient.DeleteFileAsync(fileDataCountryList.Id),
            fileClient.DeleteFileAsync(fileDataCountryDetail.Id),
        ]);
}

print("Creating thread...")
thread_id = await agent.create_thread()

try:
    is_complete: bool = False
    file_ids: list[str] = []
    while not is_complete:
        # agent interaction logic here
finally:
    print("Cleaning up resources...")
    if agent is not None:
        [await agent.delete_file(file_id) for file_id in agent.code_interpreter_file_ids]
        await agent.delete_thread(thread_id)
        await agent.delete()

Agora vamos capturar a entrada do usuário dentro do loop anterior. Neste caso, a entrada vazia será ignorada e o termo EXIT sinalizará que a conversa está concluída. A entrada válida será adicionada ao Thread Assistente como uma mensagem de usuário.

Console.WriteLine();
Console.Write("> ");
string input = Console.ReadLine();
if (string.IsNullOrWhiteSpace(input))
{
    continue;
}
if (input.Trim().Equals("EXIT", StringComparison.OrdinalIgnoreCase))
{
    isComplete = true;
    break;
}

await agent.AddChatMessageAsync(threadId, new ChatMessageContent(AuthorRole.User, input));

Console.WriteLine();

user_input = input("User:> ")
if not user_input:
    continue

if user_input.lower() == "exit":
    is_complete = True
    break

await agent.add_chat_message(thread_id=thread_id, message=ChatMessageContent(role=AuthorRole.USER, content=user_input))

Antes de invocar a resposta do Agente , vamos adicionar alguns métodos auxiliares para baixar quaisquer arquivos que possam ser produzidos pelo Agente.

Aqui estamos colocando o conteúdo do arquivo no diretório temporário definido pelo sistema e, em seguida, iniciando o aplicativo visualizador definido pelo sistema.

private static async Task DownloadResponseImageAsync(OpenAIFileClient client, ICollection<string> fileIds)
{
    if (fileIds.Count > 0)
    {
        Console.WriteLine();
        foreach (string fileId in fileIds)
        {
            await DownloadFileContentAsync(client, fileId, launchViewer: true);
        }
    }
}

private static async Task DownloadFileContentAsync(OpenAIFileClient client, string fileId, bool launchViewer = false)
{
    OpenAIFile fileInfo = client.GetFile(fileId);
    if (fileInfo.Purpose == FilePurpose.AssistantsOutput)
    {
        string filePath =
            Path.Combine(
                Path.GetTempPath(),
                Path.GetFileName(Path.ChangeExtension(fileInfo.Filename, ".png")));

        BinaryData content = await client.DownloadFileAsync(fileId);
        await using FileStream fileStream = new(filePath, FileMode.CreateNew);
        await content.ToStream().CopyToAsync(fileStream);
        Console.WriteLine($"File saved to: {filePath}.");

        if (launchViewer)
        {
            Process.Start(
                new ProcessStartInfo
                {
                    FileName = "cmd.exe",
                    Arguments = $"/C start {filePath}"
                });
        }
    }
}

import os

async def download_file_content(agent, file_id: str):
    try:
        # Fetch the content of the file using the provided method
        response_content = await agent.client.files.content(file_id)

        # Get the current working directory of the file
        current_directory = os.path.dirname(os.path.abspath(__file__))

        # Define the path to save the image in the current directory
        file_path = os.path.join(
            current_directory,  # Use the current directory of the file
            f"{file_id}.png"  # You can modify this to use the actual filename with proper extension
        )

        # Save content to a file asynchronously
        with open(file_path, "wb") as file:
            file.write(response_content.content)

        print(f"File saved to: {file_path}")
    except Exception as e:
        print(f"An error occurred while downloading file {file_id}: {str(e)}")

async def download_response_image(agent, file_ids: list[str]):
    if file_ids:
        # Iterate over file_ids and download each one
        for file_id in file_ids:
            await download_file_content(agent, file_id)

Para gerar uma resposta do Agente à entrada do usuário, invoque o agente especificando o Thread do Assistente. Neste exemplo, escolhemos uma resposta transmitida em fluxo e capturamos todas as Referências de Arquivo geradas para download e revisão no final do ciclo de resposta. É importante notar que o código gerado é identificado pela presença de uma chave de metadados na mensagem de resposta, distinguindo-a da resposta conversacional.

bool isCode = false;
await foreach (StreamingChatMessageContent response in agent.InvokeStreamingAsync(threadId))
{
    if (isCode != (response.Metadata?.ContainsKey(OpenAIAssistantAgent.CodeInterpreterMetadataKey) ?? false))
    {
        Console.WriteLine();
        isCode = !isCode;
    }

    // Display response.
    Console.Write($"{response.Content}");

    // Capture file IDs for downloading.
    fileIds.AddRange(response.Items.OfType<StreamingFileReferenceContent>().Select(item => item.FileId));
}
Console.WriteLine();

// Download any files referenced in the response.
await DownloadResponseImageAsync(fileClient, fileIds);
fileIds.Clear();

is_code: bool = False
async for response in agent.invoke(stream(thread_id=thread_id):
    if is_code != metadata.get("code"):
        print()
        is_code = not is_code

    print(f"{response.content})

    file_ids.extend(
        [item.file_id for item in response.items if isinstance(item, StreamingFileReferenceContent)]
    )

print()

await download_response_image(agent, file_ids)
file_ids.clear()

Final

Juntando todas as etapas, temos o código final para este exemplo. A implementação completa é fornecida abaixo.

Tente usar estas sugestões de entradas:

Compare os arquivos para determinar o número de países que não têm um estado ou província definido em comparação com a contagem total
Crie uma tabela para países com estado ou província definidos. Incluir a contagem de estados ou províncias e a população total
Forneça um gráfico de barras para países cujos nomes comecem com a mesma letra e classifique o eixo x da maior contagem para a mais baixa (inclua todos os países)

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Azure.Identity;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Agents.OpenAI;
using Microsoft.SemanticKernel.ChatCompletion;
using OpenAI.Files;

namespace AgentsSample;

public static class Program
{
    public static async Task Main()
    {
        // Load configuration from environment variables or user secrets.
        Settings settings = new();

        OpenAIClientProvider clientProvider =
            OpenAIClientProvider.ForAzureOpenAI(new AzureCliCredential(), new Uri(settings.AzureOpenAI.Endpoint));

        Console.WriteLine("Uploading files...");
        OpenAIFileClient fileClient = clientProvider.Client.GetOpenAIFileClient();
        OpenAIFile fileDataCountryDetail = await fileClient.UploadFileAsync("PopulationByAdmin1.csv", FileUploadPurpose.Assistants);
        OpenAIFile fileDataCountryList = await fileClient.UploadFileAsync("PopulationByCountry.csv", FileUploadPurpose.Assistants);

        Console.WriteLine("Defining agent...");
        OpenAIAssistantAgent agent =
            await OpenAIAssistantAgent.CreateAsync(
                clientProvider,
                new OpenAIAssistantDefinition(settings.AzureOpenAI.ChatModelDeployment)
                {
                    Name = "SampleAssistantAgent",
                    Instructions =
                        """
                        Analyze the available data to provide an answer to the user's question.
                        Always format response using markdown.
                        Always include a numerical index that starts at 1 for any lists or tables.
                        Always sort lists in ascending order.
                        """,
                    EnableCodeInterpreter = true,
                    CodeInterpreterFileIds = [fileDataCountryList.Id, fileDataCountryDetail.Id],
                },
                new Kernel());

        Console.WriteLine("Creating thread...");
        string threadId = await agent.CreateThreadAsync();

        Console.WriteLine("Ready!");

        try
        {
            bool isComplete = false;
            List<string> fileIds = [];
            do
            {
                Console.WriteLine();
                Console.Write("> ");
                string input = Console.ReadLine();
                if (string.IsNullOrWhiteSpace(input))
                {
                    continue;
                }
                if (input.Trim().Equals("EXIT", StringComparison.OrdinalIgnoreCase))
                {
                    isComplete = true;
                    break;
                }

                await agent.AddChatMessageAsync(threadId, new ChatMessageContent(AuthorRole.User, input));

                Console.WriteLine();

                bool isCode = false;
                await foreach (StreamingChatMessageContent response in agent.InvokeStreamingAsync(threadId))
                {
                    if (isCode != (response.Metadata?.ContainsKey(OpenAIAssistantAgent.CodeInterpreterMetadataKey) ?? false))
                    {
                        Console.WriteLine();
                        isCode = !isCode;
                    }

                    // Display response.
                    Console.Write($"{response.Content}");

                    // Capture file IDs for downloading.
                    fileIds.AddRange(response.Items.OfType<StreamingFileReferenceContent>().Select(item => item.FileId));
                }
                Console.WriteLine();

                // Download any files referenced in the response.
                await DownloadResponseImageAsync(fileClient, fileIds);
                fileIds.Clear();

            } while (!isComplete);
        }
        finally
        {
            Console.WriteLine();
            Console.WriteLine("Cleaning-up...");
            await Task.WhenAll(
                [
                    agent.DeleteThreadAsync(threadId),
                    agent.DeleteAsync(),
                    fileClient.DeleteFileAsync(fileDataCountryList.Id),
                    fileClient.DeleteFileAsync(fileDataCountryDetail.Id),
                ]);
        }
    }

    private static async Task DownloadResponseImageAsync(OpenAIFileClient client, ICollection<string> fileIds)
    {
        if (fileIds.Count > 0)
        {
            Console.WriteLine();
            foreach (string fileId in fileIds)
            {
                await DownloadFileContentAsync(client, fileId, launchViewer: true);
            }
        }
    }

    private static async Task DownloadFileContentAsync(OpenAIFileClient client, string fileId, bool launchViewer = false)
    {
        OpenAIFile fileInfo = client.GetFile(fileId);
        if (fileInfo.Purpose == FilePurpose.AssistantsOutput)
        {
            string filePath =
                Path.Combine(
                    Path.GetTempPath(),
                    Path.GetFileName(Path.ChangeExtension(fileInfo.Filename, ".png")));

            BinaryData content = await client.DownloadFileAsync(fileId);
            await using FileStream fileStream = new(filePath, FileMode.CreateNew);
            await content.ToStream().CopyToAsync(fileStream);
            Console.WriteLine($"File saved to: {filePath}.");

            if (launchViewer)
            {
                Process.Start(
                    new ProcessStartInfo
                    {
                        FileName = "cmd.exe",
                        Arguments = $"/C start {filePath}"
                    });
            }
        }
    }
}

import asyncio
import os

from semantic_kernel.agents.open_ai.azure_assistant_agent import AzureAssistantAgent
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.contents.streaming_file_reference_content import StreamingFileReferenceContent
from semantic_kernel.contents.utils.author_role import AuthorRole
from semantic_kernel.kernel import Kernel

# Let's form the file paths that we will later pass to the assistant
csv_file_path_1 = os.path.join(
    os.path.dirname(os.path.dirname(os.path.realpath(__file__))),
    "PopulationByAdmin1.csv",
)

csv_file_path_2 = os.path.join(
    os.path.dirname(os.path.dirname(os.path.realpath(__file__))),
    "PopulationByCountry.csv",
)


async def download_file_content(agent, file_id: str):
    try:
        # Fetch the content of the file using the provided method
        response_content = await agent.client.files.content(file_id)

        # Get the current working directory of the file
        current_directory = os.path.dirname(os.path.abspath(__file__))

        # Define the path to save the image in the current directory
        file_path = os.path.join(
            current_directory,  # Use the current directory of the file
            f"{file_id}.png",  # You can modify this to use the actual filename with proper extension
        )

        # Save content to a file asynchronously
        with open(file_path, "wb") as file:
            file.write(response_content.content)

        print(f"File saved to: {file_path}")
    except Exception as e:
        print(f"An error occurred while downloading file {file_id}: {str(e)}")


async def download_response_image(agent, file_ids: list[str]):
    if file_ids:
        # Iterate over file_ids and download each one
        for file_id in file_ids:
            await download_file_content(agent, file_id)


async def main():
    agent = await AzureAssistantAgent.create(
        kernel=Kernel(),
        service_id="agent",
        name="SampleAssistantAgent",
        instructions="""
                    Analyze the available data to provide an answer to the user's question.
                    Always format response using markdown.
                    Always include a numerical index that starts at 1 for any lists or tables.
                    Always sort lists in ascending order.
                    """,
        enable_code_interpreter=True,
        code_interpreter_filenames=[csv_file_path_1, csv_file_path_2],
    )

    print("Creating thread...")
    thread_id = await agent.create_thread()

    try:
        is_complete: bool = False
        file_ids: list[str] = []
        while not is_complete:
            user_input = input("User:> ")
            if not user_input:
                continue

            if user_input.lower() == "exit":
                is_complete = True
                break

            await agent.add_chat_message(
                thread_id=thread_id, message=ChatMessageContent(role=AuthorRole.USER, content=user_input)
            )
            is_code: bool = False
            async for response in agent.invoke_stream(thread_id=thread_id):
                if is_code != response.metadata.get("code"):
                    print()
                    is_code = not is_code

                print(f"{response.content}", end="", flush=True)

                file_ids.extend([
                    item.file_id for item in response.items if isinstance(item, StreamingFileReferenceContent)
                ])

            print()

            await download_response_image(agent, file_ids)
            file_ids.clear()

    finally:
        print("Cleaning up resources...")
        if agent is not None:
            [await agent.delete_file(file_id) for file_id in agent.code_interpreter_file_ids]
            await agent.delete_thread(thread_id)
            await agent.delete()


if __name__ == "__main__":
    asyncio.run(main())

Como fazer: Abrir a pesquisa de arquivo de código do agente do AI Assistant

Partilhar via