Azure AI Search Indexer: "JSON arrays with element type 'Float' map to Collection(Edm.Double)"

Ilg Alexander 0 Reputation points
2024-12-17T10:01:28.1466667+00:00

I have the following problem. I am trying to build an indexer in Azure AI Search. I have a skillset with a “Custom.WebApiSkill” skill. This provides me with the following response body:

{
  "values": [
    {
      "recordId": "1",
      "data": {
        "embedding": [
          -0.013657977,
          0.004854262,
          -0.015335504,
          -0.010732211,
          ...
        ]
      }
    }
  ]
}  

As part of the indexer, I am now trying to map the “embedding” value of the response body to a field in my index:

    "outputFieldMappings": [
      {
        "sourceFieldName": "/document/pages/*/embedding",
        "targetFieldName": "content_vector",
        "mappingFunction": null
      }
    ]

My index field "content_vector" looks like this:

   
    {
      "name": "content_vector",
      "type": "Collection(Edm.Single)",
      "key": false,
      "retrievable": true,
      "stored": true,
      "searchable": true,
      "filterable": false,
      "sortable": false,
      "facetable": false,
      "synonymMaps": [],
      "dimensions": 1536,
      "vectorSearchProfile": "myHnswProfile"
    }

However, I receive the following error when executing:


The data field 'content_vector/0' in the document with key 'aHR0cHM6Ly9zdHJhZ3Byb3RvdHlwZGV2My5ibG9iLmNvcmUud2luZG93cy5uZXQvdGVzdGRhdGEvS29tbXVuaWthdGlvbnN0ZWNobmlrLUZpYmVsLnBkZg2' has an invalid value of type 'Collection(Edm.Double)' ('JSON arrays with element type 'Float' map to Collection(Edm.Double)'). The expected type was 'Collection(Edm.Single)'.

How can I make sure that my custom Web API returns the embedding array with float32 values, or how can I make sure that my indexer interprets the values as float32 (Edm.Single) and not as float64 (Edm.Double)?

Thank you very much!

I tried to use numpy in my custom Web API (Python) to convert the values of "embedding" to float32, but that didn't work.

Something like this:

embedding_float32 = np.array(embedding, dtype=np.float32).tolist()

UPDATE:

I tried using “numpy” to convert the array to “float32”, just like you showed in your first code snippet. Nevertheless, the indexer interprets it as float64 (Edm.Double):

The data field 'content_vector/0' in the document with key 'xyz' has an invalid value of type 'Collection(Edm.Double)' ('JSON arrays with element type 'Float' map to Collection(Edm.Double)'). The expected type was 'Collection(Edm.Single)'.

Is there any way to make the indexer interpret the values as float32 (Edm.Single), or to force the data type in my custom Web API? The problem is that Python's built-in float is always a 64-bit double, so as soon as the float32 array is converted back to a list with .tolist(), the values are float64 again and are returned that way.
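To illustrate what I mean, here is a minimal sketch of the behaviour. It is not the code from my Web API: it uses the standard-library `array` module as a stand-in for numpy so that it runs without dependencies, and the four sample values are the ones from the response body above. As far as I understand, numpy's `.tolist()` behaves the same way and hands back built-in Python floats.

```python
import json
from array import array

# Sample values copied from the response body in the question.
embedding = [-0.013657977, 0.004854262, -0.015335504, -0.010732211]

# Store the values as 32-bit floats ('f' = C float), then convert back to a list.
as_float32 = array("f", embedding)
back_to_list = as_float32.tolist()

# Every element is a built-in Python float again, i.e. a 64-bit double.
element_types = {type(x).__name__ for x in back_to_list}

# JSON has only one number type, so the serialized payload carries
# no information about whether a value was float32 or float64.
payload = json.dumps(
    {"values": [{"recordId": "1", "data": {"embedding": back_to_list}}]}
)
decoded = json.loads(payload)["values"][0]["data"]["embedding"]

print(element_types)            # type names of the list elements
print(decoded == back_to_list)  # JSON round trip keeps the same doubles
```

So the float32 conversion only changes the precision of the numbers. The type information is lost before the response leaves the function, which is why I suspect the fix has to happen on the indexer or mapping side rather than in Python.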

Here is the link to my WebAPI in GitHub: https://github.com/Alexkanns/CustomWebAPI/blob/main/init.py

Azure Functions
Azure AI Search