How to fix "Similarity index was not found for a vector similarity search query." error from cosmos mongodb

Priyanka Hurakadli 25 Reputation points Microsoft Employee
2025-02-05T20:25:01.5266667+00:00

I am trying to use Azure Cosmos DB for MongoDB vCore as the memory store for Semantic Kernel to implement RAG. My code inserts sample data into Cosmos DB for MongoDB vCore and then runs a vector search. Inserting the data with the code below works, but when I run the search query I get the error "Similarity index was not found for a vector similarity search query." More details on the error are in the attachment.

Error.txt

Code to query data:

query_term = "What do you know about the movie Breaking Fast?"
result = await memory.search(collection_name, query_term)

Code used to insert data:

import json


async def upsert_data_to_memory_store(memory: SemanticTextMemory, store: MemoryStoreBase, data_file_path: str) -> None:
    """
    Upsert (update or insert) the records from a JSON data file into the memory store.

    Relies on a module-level `collection_name` defined elsewhere in the script.

    Args:
        memory (SemanticTextMemory): The semantic kernel memory used to generate embeddings and save records.
        store (MemoryStoreBase): The memory store that is checked for existing records.
        data_file_path (str): The path to the JSON data file that contains the records to be upserted.

    Returns:
        None. The function modifies the memory store in place.
    """
    with open(file=data_file_path, encoding="utf-8") as f:
        data = json.load(f)
        n = 0
        for item in data:
            n += 1
            # check if the item already exists in the memory store
            # if the id doesn't exist, it throws an exception
            try:
                already_created = bool(await store.get(collection_name, item["id"], with_embedding=True))
            except Exception:
                already_created = False
            # if the record doesn't exist, we generate embeddings and save it to the database
            if not already_created:
                await memory.save_information(
                    collection=collection_name,
                    id=item["id"],
                    # the embedding is generated from the text field
                    text=item["content"],
                    description=item["title"],
                )
                print(
                    "Generating embeddings and saving new item:",
                    n,
                    "/",
                    len(data),
                    end="\r",
                )
            else:
                print("Skipping item that already exists:", n, "/", len(data), end="\r")
Azure Cosmos DB

1 answer

  1. Vijayalaxmi Kattimani 1,085 Reputation points Microsoft Vendor
    2025-02-06T03:02:27.1666667+00:00

    Hi @Priyanka Hurakadli,

    Welcome to the Microsoft Q&A Platform! Thank you for asking your question here.

    The error indicates that the service could not find a similarity index for your vector similarity search. This usually happens when the index required for vector search has not been created on the collection being queried. Please make sure you are using a supported version, as this feature is available only from MongoDB 6.0 and above. Also make sure that you are connecting to the correct database and collection.
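
If the index turns out to be missing, one way to create it is with the `createIndexes` command and `cosmosSearchOptions`. The following is only a minimal sketch under assumptions: the field name `embedding`, the index name `VectorSearchIndex`, the 1536 dimensions, and the helper names are illustrative and not taken from the question. The field and dimensions have to match what your memory store actually writes and what your embedding model produces. `db` is assumed to be a PyMongo `Database` object.

```python
def build_vector_index_command(
    collection_name: str,
    embedding_field: str = "embedding",      # assumption: field that holds the vectors
    dimensions: int = 1536,                  # assumption: must match the embedding model
    similarity: str = "COS",                 # COS (cosine), L2 (Euclidean) or IP (inner product)
    num_lists: int = 1,
    index_name: str = "VectorSearchIndex",   # hypothetical index name
) -> dict:
    """Build the createIndexes command document for an IVF vector index."""
    return {
        "createIndexes": collection_name,    # must be the first key of the command
        "indexes": [
            {
                "name": index_name,
                "key": {embedding_field: "cosmosSearch"},
                "cosmosSearchOptions": {
                    "kind": "vector-ivf",
                    "numLists": num_lists,
                    "similarity": similarity,
                    "dimensions": dimensions,
                },
            }
        ],
    }


def create_vector_index(db, collection_name: str, **options):
    """Send the command to the database; `db` is a pymongo Database object."""
    return db.command(build_vector_index_command(collection_name, **options))
```

For example, `create_vector_index(client["my_database"], collection_name)` would run the command once against the database that holds the collection. The database name here is a placeholder.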

    If you have already created the similarity index, verify that it exists using the code below.

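    # index_information() returns a dict that maps each index name to its details
    indexes = collection.index_information()
    for name, details in indexes.items():
        print(name, details)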
    

    Look for an index with 'similarity' in the output to ensure it has been created correctly.
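
The same check can be done programmatically rather than by reading the printed output. The helper below is a hypothetical sketch, not part of any library. It assumes that a vector index appears in the result of `index_information()` either with `cosmosSearch` as its key type or with a `cosmosSearchOptions` entry. If your server reports indexes differently, adjust the condition.

```python
def find_vector_indexes(index_info: dict) -> list:
    """Return the names of indexes that look like vector (cosmosSearch) indexes.

    `index_info` is the dict returned by collection.index_information(),
    where each value holds a "key" list of (field, type) pairs.
    """
    vector_indexes = []
    for name, details in index_info.items():
        key_types = [kind for _field, kind in details.get("key", [])]
        if "cosmosSearch" in key_types or "cosmosSearchOptions" in details:
            vector_indexes.append(name)
    return vector_indexes
```

Calling `find_vector_indexes(collection.index_information())` and getting an empty list back would suggest that the collection has no vector index, which is consistent with the error in the question.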

    I hope this response addresses your query and helps you resolve the issue.

    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful". If you have any further questions, do let us know.

