@Ilg Alexander Here are a few steps you can take to ensure your embeddings are interpreted as float32 (Edm.Single) instead of float64 (Edm.Double):
- Ensure Proper Conversion in Python: Even though you've tried using numpy to convert the array to float32, it's possible that the conversion isn't being applied correctly. Make sure you're converting the array and then serializing it properly. Here's an example:
import numpy as np
import json

def convert_to_float32(embedding):
    # Round every value to float32 precision; tolist() then returns
    # plain Python floats, which the json module can serialize.
    embedding_float32 = np.array(embedding, dtype=np.float32).tolist()
    return embedding_float32

# Example usage
embedding = [-0.013657977, 0.004854262, -0.015335504, -0.010732211]
embedding_float32 = convert_to_float32(embedding)

response_body = {
    "values": [
        {
            "recordId": "1",
            "data": {
                "embedding": embedding_float32
            }
        }
    ]
}

# Convert to JSON
response_json = json.dumps(response_body)
print(response_json)
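- Check Your API Response: Ensure that your API response is correctly formatted and that the embedding values have been rounded to float32 precision. Because tolist() returns plain Python floats, the type of the list elements is always float; to confirm the float32 conversion, check the dtype of the NumPy array before it is turned into a list:
print(np.array(embedding, dtype=np.float32).dtype)  # Prints float32
print(type(embedding_float32[0]))  # Prints <class 'float'>, a Python float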
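Since the element type alone cannot tell float32 from float64, a round-trip test may be more informative. The following is only a sketch: fits_float32 is a hypothetical helper, not part of NumPy or Azure AI Search, and it assumes that "is float32" means each value is exactly representable in single precision.

```python
import json
import numpy as np

def fits_float32(values):
    """Return True if every value survives a float32 round trip unchanged."""
    return all(float(np.float32(v)) == v for v in values)

embedding = [-0.013657977, 0.004854262, -0.015335504, -0.010732211]
as_list = np.array(embedding, dtype=np.float32).tolist()

# Values that went through float32 are exactly representable in float32.
print(fits_float32(as_list))   # True
# 0.1 stored as a 64-bit float is not exactly representable in float32.
print(fits_float32([0.1]))     # False

# The same check works on the consumer side, after parsing the JSON payload.
parsed = json.loads(json.dumps({"embedding": as_list}))
print(fits_float32(parsed["embedding"]))   # True
```

If the check fails on the payload your function actually returns, the conversion is probably being skipped somewhere before serialization.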
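- Update Your Indexer Configuration: If the above steps don't resolve the issue, you might need to explicitly specify the data type in your configuration, for example by checking that the target field in the index is declared as Collection(Edm.Single). Unfortunately, Azure AI Search might still interpret the values as float64, because JSON numbers carry no type information and Python's json module writes every float with double precision.
- Custom Serialization: You can create a custom JSON encoder so that any NumPy float32 values left in the response are converted to plain numbers when it is serialized. Here's an example: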
import json
import numpy as np

class Float32Encoder(json.JSONEncoder):
    def default(self, obj):
        # default() is only called for objects that json cannot serialize
        # natively, such as NumPy float32 scalars.
        if isinstance(obj, np.float32):
            return float(obj)
        return super().default(obj)

def convert_to_float32(embedding):
    # tolist() already yields Python floats, so the encoder acts as a
    # safety net for any np.float32 scalar that ends up in the response.
    embedding_float32 = np.array(embedding, dtype=np.float32).tolist()
    return embedding_float32

embedding = [-0.013657977, 0.004854262, -0.015335504, -0.010732211]
embedding_float32 = convert_to_float32(embedding)

response_body = {
    "values": [
        {
            "recordId": "1",
            "data": {
                "embedding": embedding_float32
            }
        }
    ]
}

response_json = json.dumps(response_body, cls=Float32Encoder)
print(response_json)
- Azure Function Configuration: Ensure that your Azure Function is correctly configured to handle the data types. Sometimes, the issue might be with how the function processes the data.
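To make the last point more concrete, here is a minimal sketch of the data-handling part of such a function, kept separate from any Azure Functions SDK code so it can be tested on its own. The names build_skill_response and fake_embed are hypothetical, the input field name "text" is an assumption, and the request is assumed to mirror the values/recordId/data layout of the response shown above.

```python
import json
import numpy as np

def build_skill_response(request_body, embed):
    """Return a JSON string with one output record per input record."""
    results = []
    for record in request_body.get("values", []):
        # "text" is an assumed input name; use whatever your skill defines.
        vector = embed(record["data"]["text"])
        # Round to float32 precision in one place, right before serializing.
        vector32 = np.asarray(vector, dtype=np.float32)
        results.append({
            "recordId": record["recordId"],
            "data": {"embedding": vector32.tolist()},
        })
    return json.dumps({"values": results})

def fake_embed(text):
    # Stand-in for a real embedding model, for illustration only.
    return [0.1, 0.2, 0.3]

request_body = {"values": [{"recordId": "1", "data": {"text": "hello"}}]}
response_json = build_skill_response(request_body, fake_embed)
print(response_json)
```

Keeping the conversion in a single function like this makes it easier to see whether the values are changed again later, for instance by a second serialization step in the function's HTTP handler.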