How to fix "Similarity index was not found for a vector similarity search query." error from cosmos mongodb

Priyanka Hurakadli 25 Reputation points Microsoft Employee

I am trying to use Azure Cosmos DB for MongoDB vCore as the memory store for Semantic Kernel to implement RAG. I have code that inserts sample data into Cosmos DB for MongoDB vCore and then runs a vector search. I am able to insert the data using the code below, but when I run the search query I get the "Similarity index was not found for a vector similarity search query." error. More details on the error are in the attachment.

Error.txt

Code to query data:

query_term = "What do you know about the movie Breaking Fast?"
result = await memory.search(collection_name, query_term)

Code used to Insert data:

async def upsert_data_to_memory_store(memory: SemanticTextMemory, store: MemoryStoreBase, data_file_path: str) -> None:
    """
    This asynchronous function takes two memory stores and a data file path as arguments.
    It is designed to upsert (update or insert) data into the memory stores from the data file.

    Args:
        memory (callable): A callable object that represents the semantic kernel memory.
        store (callable): A callable object that represents the memory store where data will be upserted.
        data_file_path (str): The path to the data file that contains the data to be upserted.

    Returns:
        None. The function performs an operation that modifies the memory stores in-place.
    """
    with open(file=data_file_path, encoding="utf-8") as f:
        data = json.load(f)
        n = 0
        for item in data:
            n += 1
            # check if the item already exists in the memory store
            # if the id doesn't exist, it throws an exception
            try:
                already_created = bool(await store.get(collection_name, item["id"], with_embedding=True))
            except Exception:
                already_created = False
            # if the record doesn't exist, we generate embeddings and save it to the database
            if not already_created:
                await memory.save_information(
                    collection=collection_name,
                    id=item["id"],
                    # the embedding is generated from the text field
                    text=item["content"],
                    description=item["title"],
                )
                print(
                    "Generating embeddings and saving new item:",
                    n,
                    "/",
                    len(data),
                    end="\r",
                )
            else:
                print("Skipping item, already exists:", n, "/", len(data), end="\r")

Priyanka Hurakadli 25 Reputation points Microsoft Employee

2025-02-06T03:39:10.7+00:00

@Vijayalaxmi Kattimani Can you please give me more details on how to create these indexes?
Vijayalaxmi Kattimani 1,085 Reputation points Microsoft Vendor

2025-02-06T08:04:08.2933333+00:00
Hi @Priyanka Hurakadli,

To create vector indexes in Azure Cosmos DB for MongoDB vCore, follow these steps:

First, use a MongoDB client to connect to your Azure Cosmos DB instance.

Use the createIndexes command to create a vector index. Here is an example of how to do this:

import pymongo

# Connect to MongoDB
client = pymongo.MongoClient("<your_connection_string>")
db = client["your_database"]
collection = db["your_collection"]

# Create the vector index
collection.create_index(
    [("Vector", "cosmosSearch")],
    cosmosSearchOptions={
        "kind": "vector-ivf",
        "numLists": 800,
        "similarity": "COS",
        "dimensions": 1536,
    },
)

In this example:

"Vector" is the field in your documents that contains the vector data.

"cosmosSearch" specifies that this is a vector index.

cosmosSearchOptions contains the options for the vector index:

"kind" specifies the type of vector index. Options include "vector-ivf" (Inverted File Index) and "vector-hnsw" (Hierarchical Navigable Small Worlds).

"numLists" specifies the number of clusters for the IVF index.

"similarity" specifies the similarity metric, such as "COS" for cosine similarity.

"dimensions" specifies the number of dimensions in the vector.
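
The same index definition can also be written as a raw createIndexes command document, which is the form Microsoft's vector search documentation uses. The following is only a sketch: the helper name, the generated index name, and the default of numLists=1 (close to brute-force search, which may be a reasonable starting point for a small sample dataset) are assumptions, not something taken from this thread.

```python
def build_vector_index_command(collection_name, path="Vector", dimensions=1536,
                               num_lists=1, similarity="COS"):
    """Return a createIndexes command document for an IVF vector index.

    Hypothetical helper: it only builds the dict and does not talk to a server.
    """
    return {
        "createIndexes": collection_name,
        "indexes": [
            {
                # Assumed naming scheme; any unique index name works.
                "name": f"{path}_cosmosSearch",
                # The key maps the vector field to the "cosmosSearch" index type.
                "key": {path: "cosmosSearch"},
                "cosmosSearchOptions": {
                    "kind": "vector-ivf",
                    "numLists": num_lists,
                    "similarity": similarity,
                    "dimensions": dimensions,
                },
            }
        ],
    }


command = build_vector_index_command("your_collection")
# With a live connection this would be sent as: db.command(command)
print(command["indexes"][0]["key"])
```

Whichever form is used, the field named in "key" has to be the field that actually stores the embeddings in your documents.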

After creating the index, you can verify that it exists in your collection using the code below:

indexes = collection.index_information()
for index in indexes:
    print(index)
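
Printing only the index names does not show which field each index covers. A small sketch that pulls the field path out of every cosmosSearch index may help when checking that the index was built on the same field the query later uses as "path". The function name and the sample dictionary are illustrative assumptions; the sample is only shaped like the output of index_information().

```python
def vector_index_paths(index_info):
    """Map index name -> field path for every cosmosSearch index.

    `index_info` is expected to look like the dict returned by pymongo's
    Collection.index_information(), where each "key" is a list of
    (field, index_type) pairs.
    """
    paths = {}
    for name, spec in index_info.items():
        for field, kind in spec.get("key", []):
            if kind == "cosmosSearch":
                paths[name] = field
    return paths


# Illustrative sample only, not real server output.
sample = {
    "_id_": {"v": 2, "key": [("_id", 1)]},
    "vector_cosmosSearch": {"v": 2, "key": [("vector", "cosmosSearch")]},
}
found = vector_index_paths(sample)
print(found)
```

With a live connection the call would be vector_index_paths(collection.index_information()). An empty result would suggest that no vector index exists on the collection.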

Insert your data into the collection. Ensure that the vector data is included in the documents you insert.

Use the $search functionality to perform vector similarity searches. Here is an example of how to perform a vector search:

query_vector = [0.1, 0.2, 0.3, ...]  # Your query vector
results = collection.aggregate([
    {
        "$search": {
            "index": "cosmosSearch",
            "knnBeta": {
                "vector": query_vector,
                "path": "Vector",
                "k": 10,
            },
        }
    }
])
for result in results:
    print(result)

In this example:

"index" is the name of the vector index.

"vector" is the query vector.

"path" is the field in your documents that contains the vector data.

"k" specifies the number of nearest neighbours to return.
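
For comparison, Microsoft's vector search documentation for Azure Cosmos DB for MongoDB vCore writes the $search stage with a "cosmosSearch" operator rather than "index" and "knnBeta". Below is a minimal sketch of that form, assuming the vectors are stored in a field called "Vector" as in the example above. The helper name is hypothetical, and it only builds the pipeline without running it.

```python
def build_cosmos_vector_search(query_vector, path="Vector", k=10):
    """Build an aggregation pipeline using the cosmosSearch operator."""
    return [
        {
            "$search": {
                "cosmosSearch": {
                    "vector": list(query_vector),
                    "path": path,  # must be the field the vector index covers
                    "k": k,        # number of nearest neighbours to return
                },
                "returnStoredSource": True,
            }
        },
        {
            # Expose the similarity score next to the matched document.
            "$project": {
                "similarityScore": {"$meta": "searchScore"},
                "document": "$$ROOT",
            }
        },
    ]


pipeline = build_cosmos_vector_search([0.1, 0.2, 0.3], path="Vector", k=10)
# With a live connection: results = collection.aggregate(pipeline)
print(pipeline[0]["$search"]["cosmosSearch"]["path"])
```

In a real query the vector must have the same number of dimensions as the index (1536 in the example above); the three-element vector here is only a placeholder.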

By following these steps, you should be able to create vector indexes and perform vector similarity searches in Azure Cosmos DB for MongoDB vCore.

Please refer to the below mentioned links for more information.

https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search?tabs=diskann

https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/indexing

I hope this information helps. Please do let us know if you have any further queries.
Priyanka Hurakadli 25 Reputation points Microsoft Employee

2025-02-06T20:48:34.0366667+00:00
Hi @Vijayalaxmi Kattimani

I tried what you suggested; please find the code attached. I am able to see the "vector_cosmosSearch" index on the data, but I am still facing the same issue. Please find the error details below.

colde.txt

UserWarning: You appear to be connected to a CosmosDB cluster. For more information regarding feature compatibility and support please visit https://www.mongodb.com/supportability/cosmosdb

mongo_client = pymongo.MongoClient(mongo_conn)

_id_

vector_cosmosSearch

Traceback (most recent call last):
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\InsertData.py", line 131, in <module>
    asyncio.run(main())
  File "C:\Users\prhurakadli\AppData\Local\Programs\Python\Python312\Lib\asyncio\runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\AppData\Local\Programs\Python\Python312\Lib\asyncio\runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\AppData\Local\Programs\Python\Python312\Lib\asyncio\base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\InsertData.py", line 112, in main
    results = collection.aggregate([
              ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\synchronous\collection.py", line 2978, in aggregate
    return self._aggregate(
           ^^^^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\_csot.py", line 119, in csot_wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\synchronous\collection.py", line 2886, in _aggregate
    return self._database.client._retryable_read(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\synchronous\mongo_client.py", line 1861, in _retryable_read
    return self._retry_internal(
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\_csot.py", line 119, in csot_wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\synchronous\mongo_client.py", line 1828, in _retry_internal
    ).run()
    ^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\synchronous\mongo_client.py", line 2565, in run
    return self._read() if self._is_read else self._write()
           ^^^^^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\synchronous\mongo_client.py", line 2708, in _read
    return self._func(self._session, self._server, conn, read_pref)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\synchronous\aggregation.py", line 164, in get_cursor
    result = conn.command(
             ^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\synchronous\helpers.py", line 47, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\synchronous\pool.py", line 536, in command
    return command(
           ^^^^^^^^
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\synchronous\network.py", line 213, in command
    helpers_shared._check_command_response(
  File "C:\Users\prhurakadli\OneDrive - Microsoft\Desktop\Multi Agent POC\env\Lib\site-packages\pymongo\helpers_shared.py", line 247, in _check_command_response
    raise OperationFailure(errmsg, code, response, max_wire_version)
pymongo.errors.OperationFailure: Similarity index was not found for a vector similarity search query., full error: {'ok': 0.0, 'errmsg': 'Similarity index was not found for a vector similarity search query.', 'code': 2, 'codeName': 'BadValue'}
Vijayalaxmi Kattimani 1,085 Reputation points Microsoft Vendor

2025-02-07T03:32:10.4766667+00:00
Hi @Priyanka Hurakadli,

Could you please try executing the following command: db.collection.getIndexes(). It returns an array containing a list of documents that identify and describe the existing indexes on the collection, including hidden indexes and those currently being built.

If you are able to see your index, then, based on the file you have shared, please try modifying the code as shown below.

query_vector = [0.1, 0.2, 0.3, ...]  # Your query vector
results = collection.aggregate([
    {
        "$search": {
            "index": "cosmosSearch",  # Kindly specify the index you have created here.
            "knnBeta": {
                "vector": query_vector,
                "path": "Vector",
                "k": 10,
            },
        }
    }
])
for result in results:
    print(result)

I hope this response addresses your query and helps you overcome the issue.

1 answer

Vijayalaxmi Kattimani 1,085 Reputation points Microsoft Vendor

2025-02-06T03:02:27.1666667+00:00
Hi @Priyanka Hurakadli,

Welcome to the Microsoft Q&A Platform! Thank you for asking your question here.

The error indicates that MongoDB could not locate a similarity index for your vector similarity search. This usually happens when the index required for vector search has not been created. Please make sure you are using a supported version, as this feature is available only from MongoDB 6.0 and above. Also make sure that you are connecting to the correct database and collection.

If you have already created the similarity index, verify that it exists using the code below.

indexes = collection.index_information()
for index in indexes:
    print(index)

Look for an index with 'similarity' in the output to ensure it has been created correctly.

I hope this response addresses your query and helps you overcome the issue.

If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.
Omkar Joshi 0 Reputation points

2025-02-06T06:45:29.0766667+00:00

Hi

Use this code before testing

from ...
db = client[...]
collection = db[...]
indexes = collection.index_information()
print(...)
print(...)

In my case the only index name returned was "_id_", indicating that no vector index was available.

collection.create_index(
    [(...)],
    cosmosSearchOptions={...},
)

Since I am using the free tier, vector-hnsw is not available. Hence I updated my store and created an index of kind vector-ivf. This resolved my issue.
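
The fallback described above could be sketched roughly as follows: try an HNSW index first and fall back to IVF if the server rejects it. This is only an illustration under assumptions. The function names, the index naming scheme, and the option values (m=16, efConstruction=64, numLists=1) are placeholders, and run_command stands in for something like db.command on a real pymongo database, where the exception to catch would typically be pymongo.errors.OperationFailure.

```python
def vector_index_spec(path, dimensions, kind="vector-hnsw", similarity="COS"):
    """Build one index definition for the given vector index kind."""
    options = {"kind": kind, "similarity": similarity, "dimensions": dimensions}
    if kind == "vector-hnsw":
        options.update({"m": 16, "efConstruction": 64})  # assumed values
    else:
        options["numLists"] = 1  # assumed value for a small dataset
    return {
        "name": f"{path}_{kind}",  # assumed naming scheme
        "key": {path: "cosmosSearch"},
        "cosmosSearchOptions": options,
    }


def create_with_fallback(run_command, collection_name, path, dimensions):
    """Try HNSW first, then IVF; return the kind that was accepted."""
    for kind in ("vector-hnsw", "vector-ivf"):
        command = {
            "createIndexes": collection_name,
            "indexes": [vector_index_spec(path, dimensions, kind)],
        }
        try:
            run_command(command)
            return kind
        except Exception as exc:  # narrow this to the driver's error in real code
            print(f"{kind} index not created: {exc}")
    raise RuntimeError("no vector index could be created")


def fake_run_command(command):
    """Stand-in for db.command on a tier that rejects HNSW (illustrative only)."""
    kind = command["indexes"][0]["cosmosSearchOptions"]["kind"]
    if kind == "vector-hnsw":
        raise RuntimeError("hnsw not supported on this tier")


chosen = create_with_fallback(fake_run_command, "your_collection", "embedding", 1536)
print(chosen)
```

If the store is created through Semantic Kernel instead, the equivalent change would be made in the index settings passed to the memory store, as described in the answer above.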