How to fix the "Unknown Model" error?

Aqib Riaz 0 Reputation points
2024-12-09T14:58:39.17+00:00

I am following Build a Custom Knowledge Retrieval (RAG) chatbot using Azure AI Foundry. I have set up the AI Search resource and deployed two models, gpt-4o-mini and text-embedding-ada-002, but I can't seem to access them from my machine using the Azure SDK. I copied the code from the tutorial and tried to run it, but it fails with the error

azure.core.exceptions.HttpResponseError: (unavailable_model) Unavailable model: gpt-4o-mini

Code: unavailable_model

Message: Unavailable model: gpt-4o-mini.
This occurs when running the code for intent_mapping_response. The same thing happens when I try to run the embeddings code.

I am trying to run following code:

import os
from pathlib import Path
from opentelemetry import trace
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import ConnectionType
from azure.identity import DefaultAzureCredential
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from config import ASSET_PATH, get_logger
from azure.ai.inference.prompts import PromptTemplate

from azure.search.documents.models import VectorizedQuery
from dotenv import load_dotenv

load_dotenv()

# initialize logging and tracing objects
logger = get_logger(__name__)
tracer = trace.get_tracer(__name__)

# create a project client using environment variables loaded from the .env file
project = AIProjectClient.from_connection_string(
    conn_str=os.environ["AIPROJECT_CONNECTION_STRING"], credential=DefaultAzureCredential()
)
print(dir(project.inference))

# create a chat completions client that will be used to generate responses
chat = project.inference.get_chat_completions_client()
print("chat completion client created :", chat)
model_info = chat.get_model_info('gpt-4o-mini')  # Replace with your model name
print("Model Info:", model_info)

embeddings = project.inference.get_embeddings_client()
print("embedding client created :", embeddings)

# use the project client to get the default search connection
search_connection = project.connections.get_default(
    connection_type=ConnectionType.AZURE_AI_SEARCH, include_credentials=True
)

# Create a search index client using the search connection
# This client will be used to create and delete search indexes
search_client = SearchClient(
    index_name=os.environ["AISEARCH_INDEX_NAME"],
    endpoint=search_connection.endpoint_url,
    credential=AzureKeyCredential(key=search_connection.key),
)

@tracer.start_as_current_span(name="get_relevant_documents")
def get_relevant_documents(messages: list, context: dict = None) -> dict:
    if context is None:
        context = {}

    overrides = context.get("overrides", {})
    top = overrides.get("top", 5)


    # # generate a search query from the chat messages
    intent_prompty = PromptTemplate.from_prompty("intent_mapping.prompty")

    intent_mapping_response = chat.complete(
        model=os.environ["INTENT_MAPPING_MODEL"],
        messages=intent_prompty.create_messages(conversation=messages),
        **intent_prompty.parameters,
    )

    search_query = intent_mapping_response.choices[0].message.content
    logger.debug(f"🧠 Intent mapping: {search_query}")

    # generate a vector representation of the search query
    embedding = embeddings.embed(model=os.environ["EMBEDDINGS_MODEL"], input=search_query)
    search_vector = embedding.data[0].embedding

    # search the index for products matching the search query
    vector_query = VectorizedQuery(vector=search_vector, k_nearest_neighbors=top, fields="contentVector")

    search_results = search_client.search(
        search_text=search_query, vector_queries=[vector_query], select=["id", "content", "filepath", "title", "url"]
    )

    documents = [
        {
            "id": result["id"],
            "content": result["content"],
            "filepath": result["filepath"],
            "title": result["title"],
            "url": result["url"],
        }
        for result in search_results
    ]

    # add results to the provided context
    if "thoughts" not in context:
        context["thoughts"] = []

    # add thoughts and documents to the context object so it can be returned to the caller
    context["thoughts"].append(
        {
            "title": "Generated search query",
            "description": search_query,
        }
    )

    if "grounding_data" not in context:
        context["grounding_data"] = []
    context["grounding_data"].append(documents)

    logger.debug(f"📄 {len(documents)} documents retrieved: {documents}")
    return documents

if __name__ == "__main__":
    import logging
    import argparse

    # set logging level to debug when running this module directly
    logger.setLevel(logging.DEBUG)

    # load command line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--query",
        type=str,
        help="Query to use to search product",
        default="I need a new tent for 4 people, what would you recommend?",
    )

    args = parser.parse_args()
    query = args.query

    result = get_relevant_documents(messages=[{"role": "user", "content": query}])

1 answer

Sina Salam 14,551 Reputation points
2024-12-11T18:52:52.3666667+00:00

    Hello Aqib Riaz,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    Regarding your experience with the Unknown Model error, I have put together the following steps to identify and resolve the root causes of the error "unavailable_model: gpt-4o-mini". Each step includes actionable insights, code examples where useful, and references to the Azure documentation.

    1. To ensure the models are correctly deployed, log into the Azure Portal and confirm that both gpt-4o-mini and text-embedding-ada-002 appear among your project's model deployments.
    2. Replace the model names in your code with the exact deployment names found in the Azure Portal; errors often arise from mismatched names. For example:
    import os
    from azure.ai.projects import AIProjectClient
    from azure.identity import DefaultAzureCredential

    project = AIProjectClient.from_connection_string(
        conn_str=os.environ["AIPROJECT_CONNECTION_STRING"], credential=DefaultAzureCredential()
    )
    chat = project.inference.get_chat_completions_client()
    embeddings = project.inference.get_embeddings_client()
    # Pass the exact deployment names (not base model names) shown in the Azure Portal
    response = chat.complete(model="DEPLOYED_MODEL_NAME", messages=[{"role": "user", "content": "ping"}])
    embedding = embeddings.embed(model="DEPLOYED_EMBEDDING_MODEL_NAME", input="hello")

    This ensures your application uses the correct deployment identifiers.
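
    Separately, note that in current releases of the azure-ai-inference package, ChatCompletionsClient.get_model_info() takes no arguments; the deployment is chosen per request through the model parameter of complete(). So the call chat.get_model_info('gpt-4o-mini') in your script is not how a model is selected. A minimal sketch, assuming the chat client created above:

    # get_model_info() takes no arguments in current azure-ai-inference releases;
    # it reports what the endpoint advertises, and not every endpoint type supports it.
    info = chat.get_model_info()
    print(info.model_name, info.model_type, info.model_provider_name)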

    3. To programmatically confirm the available models, use the Azure SDK to list deployed models; this is especially useful for cross-checking the portal information. For example:
    from azure.ai.projects import AIProjectClient
    from azure.identity import DefaultAzureCredential
    import os
    # Initialize client with connection string and credential
    project = AIProjectClient.from_connection_string(
        conn_str=os.environ["AIPROJECT_CONNECTION_STRING"], credential=DefaultAzureCredential()
    )
    # List and display available models
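    # NOTE: list_models may not be present in every azure-ai-projects release;
    # if your version lacks it, use the portal or the CLI check shown below.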
    available_models = project.inference.list_models()
    print("Available models:", available_models)
    

    If the desired model (gpt-4o-mini) is not listed, it has not been deployed or is unavailable in your subscription or region. For more on the SDK, see https://learn.microsoft.com/en-us/python/api/overview/azure/ai?view=azure-python
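
    If your SDK version does not expose a listing method, the Azure CLI can enumerate the deployments on the underlying resource. A quick check, with <resource-group> and <resource-name> as placeholders for your own values:

    az cognitiveservices account deployment list \
        --resource-group <resource-group> \
        --name <resource-name> \
        --output table

    Each entry's name column is the deployment identifier your code must pass as model.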

    4. Ensure the models are deployed in the correct region:
    • Check your connection string and confirm it matches the region of your resource.
    • Verify that the resource type (e.g., Azure OpenAI within Azure AI Foundry) supports the models you intend to use.

    Certain models may have regional restrictions or limited availability. For instance, some high-performance models are available only in select regions. Check Azure OpenAI Region Availability here - https://learn.microsoft.com/en-us/azure/ai-services/openai/overview#regional-availability
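
    To see which base models your resource's region actually offers, one option (same placeholder names as above) is:

    az cognitiveservices account list-models --resource-group <resource-group> --name <resource-name> --output table

    If gpt-4o-mini does not appear in that list, the model is not available to that resource and you will need a resource in a supported region.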

    5. Incorrect or missing environment variables can cause authentication or connection errors:

    Verify the .env file contains:

    AIPROJECT_CONNECTION_STRING: The correct connection string for your Azure AI resource.

    INTENT_MAPPING_MODEL and EMBEDDINGS_MODEL: The names of the deployed models as shown in the Azure Portal. For example:

    import os
    # Check environment variable setup
    print("Connection String:", os.getenv("AIPROJECT_CONNECTION_STRING"))
    print("Intent Model:", os.getenv("INTENT_MAPPING_MODEL"))
    print("Embeddings Model:", os.getenv("EMBEDDINGS_MODEL"))
    
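    For reference, a .env file for this tutorial typically looks like the sketch below. Every value is a placeholder; in particular, copy AIPROJECT_CONNECTION_STRING from your project's overview page in Azure AI Foundry rather than assembling it by hand:

    AIPROJECT_CONNECTION_STRING="<region>.api.azureml.ms;<subscription-id>;<resource-group>;<project-name>"
    AISEARCH_INDEX_NAME="<your-index-name>"
    INTENT_MAPPING_MODEL="gpt-4o-mini"
    EMBEDDINGS_MODEL="text-embedding-ada-002"
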
    6. If the problem persists:
    • Use the Azure Portal or the Azure CLI to redeploy the models. With the CLI, deployments are managed through the az cognitiveservices account deployment command group (rather than az openai, which is not a standard CLI group); see the full example after this list.
    • Start with widely available models such as gpt-35-turbo or text-embedding-ada-002 to verify your deployment configuration.
    • Read more about Azure CLI OpenAI deployment here - https://learn.microsoft.com/en-us/cli/azure/openai
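
    A concrete redeployment sketch, using hypothetical resource names (my-rg, my-openai-resource) and the gpt-4o-mini model version current at the time of writing:

    # Create (or recreate) a gpt-4o-mini deployment on an existing Azure OpenAI resource
    az cognitiveservices account deployment create \
        --resource-group my-rg \
        --name my-openai-resource \
        --deployment-name gpt-4o-mini \
        --model-name gpt-4o-mini \
        --model-version "2024-07-18" \
        --model-format OpenAI \
        --sku-name Standard \
        --sku-capacity 1

    The --deployment-name value is what your code references in the model parameter; it does not have to equal the base model name.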

    NOTE: If none of these steps resolve the issue, ensure your Azure subscription includes access to restricted models, and/or contact Azure support from the Azure Portal.

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.


    Please don't forget to close the thread by upvoting and accepting the answer if it is helpful.

