series_cosine_similarity()

Applies to: ✅ Microsoft Fabric ✅ Azure Data Explorer ✅ Azure Monitor ✅ Microsoft Sentinel

Calculate the cosine similarity of two numerical vectors.

The function series_cosine_similarity() takes two numeric series as input, and calculates their cosine similarity.
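For reference, the standard definition of cosine similarity can be written as follows. The symbols x and y are notation chosen here for series1 and series2, and the magnitude is the square root of the dot product of a vector with itself, as described under Parameters. The worked line uses the vectors [0.1,0.2,0.1,0.2] and [1,2,3,4].

```latex
\[
\operatorname{cos\_sim}(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert},
\qquad \lVert x \rVert = \sqrt{x \cdot x}
\]

% Worked example: x = (0.1, 0.2, 0.1, 0.2), y = (1, 2, 3, 4)
\[
x \cdot y = 0.1 + 0.4 + 0.3 + 0.8 = 1.6, \quad
\lVert x \rVert = \sqrt{0.1}, \quad
\lVert y \rVert = \sqrt{30}, \quad
\frac{1.6}{\sqrt{0.1}\,\sqrt{30}} = \frac{1.6}{\sqrt{3}} \approx 0.92376
\]
```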

Syntax

series_cosine_similarity(series1, series2, [magnitude1, [magnitude2]])

Learn more about syntax conventions.

Parameters

| Name | Type | Required | Description |
|--|--|--|--|
| series1, series2 | dynamic | ✔️ | Input arrays with numeric data. |
| magnitude1, magnitude2 | real | | Optional magnitude of the first and the second vectors respectively. The magnitude is the square root of the dot product of the vector with itself. If the magnitude isn't provided, it will be calculated. |
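The optional magnitudes are mainly useful when they are already known, for example when they are stored next to the vectors. The following is a minimal sketch, not a recommended pattern: it computes the magnitudes with series_magnitude() and passes them in, and also computes the similarity by hand with series_dot_product() for comparison. The column names m1, m2, with_magnitudes, and manual are illustrative.

```kusto
datatable(s1:dynamic, s2:dynamic)
[
    dynamic([0.1,0.2,0.1,0.2]), dynamic([0.11,0.2,0.11,0.21]),
    dynamic([0.1,0.2,0.1,0.2]), dynamic([1,2,3,4]),
]
// Precompute the magnitude of each vector (square root of its dot product with itself).
| extend m1 = series_magnitude(s1), m2 = series_magnitude(s2)
// Pass the precomputed magnitudes so the function doesn't need to calculate them.
| extend with_magnitudes = series_cosine_similarity(s1, s2, m1, m2)
// Manual calculation from the definition, for comparison.
| extend manual = series_dot_product(s1, s2) / (m1 * m2)
```

Both columns should agree with the result of calling series_cosine_similarity(s1, s2) without magnitudes, up to floating-point rounding.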

Returns

Returns a value of type real that is the cosine similarity of series1 and series2. If the series have different lengths, the longer series is truncated to the length of the shorter one. Non-numeric elements of the input series are ignored.
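The truncation rule can be checked with a short query. This is a sketch based on the behavior described above: the second call adds an extra trailing element to one series, which should be dropped before the calculation. The column names are illustrative.

```kusto
print
    // Both series have three elements.
    same_length = series_cosine_similarity(dynamic([1,2,3]), dynamic([2,4,7])),
    // The second series has an extra element, which is expected to be truncated away.
    truncated = series_cosine_similarity(dynamic([1,2,3]), dynamic([2,4,7,100]))
```

If truncation works as described, the two columns should hold the same value.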

Note

If one or both input arrays are empty, the result will be null.
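Because an empty input yields null rather than an error, downstream logic might need to handle that case explicitly. The following sketch uses isnull() to detect it. The column names result and is_missing are illustrative.

```kusto
print result = series_cosine_similarity(dynamic([]), dynamic([1,2,3]))
// According to the note above, result is expected to be null here.
| extend is_missing = isnull(result)
```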

Optimizing performance

For better performance and lower storage requirements when using this function, consider the Vector16 encoding policy for storing floating-point vectors that don't require 64-bit precision, such as ML vector embeddings. The Vector16 profile uses the Bfloat16 floating-point representation, which can significantly speed up the operation and reduce storage size by a factor of 4. For more details on the Vector16 encoding policy, see Encoding Policy Types.
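As a rough sketch, the encoding policy is set per column with a management command along the following lines. The table name Embeddings and the column name Vector are hypothetical. Check the encoding policy documentation for the exact syntax and behavior in your environment before relying on it.

```kusto
// Hypothetical table and column: Embeddings.Vector holds dynamic arrays of floats.
.alter column Embeddings.Vector policy encoding type = 'Vector16'
```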

Example

datatable(s1:dynamic, s2:dynamic)
[
    dynamic([0.1,0.2,0.1,0.2]), dynamic([0.11,0.2,0.11,0.21]),
    dynamic([0.1,0.2,0.1,0.2]), dynamic([1,2,3,4]),
]
| extend cosine_similarity=series_cosine_similarity(s1, s2)

Output

| s1 | s2 | cosine_similarity |
|--|--|--|
| [0.1,0.2,0.1,0.2] | [0.11,0.2,0.11,0.21] | 0.99935343825504 |
| [0.1,0.2,0.1,0.2] | [1,2,3,4] | 0.923760430703401 |