Hi @Balaji Mogadali, since you're using SharePoint Online and REST APIs, you'll need to generate embeddings (vectors) for your SharePoint documents and then index them in Azure AI Search. Here's a rough guide to help you get started:
1. Generating Embeddings (Vectors):
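You'll need an embedding model to convert your document content into vectors. Azure OpenAI's embedding models (like text-embedding-ada-002) are a good choice. You can use the Azure OpenAI REST API or SDK to generate embeddings.
Here's a C# code snippet using the Azure.AI.OpenAI NuGet package. It is written against the early 1.0.0-beta releases; later versions changed the embeddings method signatures, so check the version you install. It also shows how to read a text file from SharePoint over REST: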
using Azure;
using Azure.AI.OpenAI;
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text.Json;
using System.Threading.Tasks;

public class EmbeddingGenerator
{
    private readonly OpenAIClient _openAIClient;

    public EmbeddingGenerator(string endpoint, string key)
    {
        _openAIClient = new OpenAIClient(new Uri(endpoint), new AzureKeyCredential(key));
    }

    // Returns the embedding vector for the given text, or null if the request fails.
    public async Task<List<float>> GetEmbeddingsAsync(string text)
    {
        try
        {
            EmbeddingsOptions embeddingsOptions = new EmbeddingsOptions(text);
            // The first argument is the name of your Azure OpenAI deployment,
            // which may differ from the model name.
            Response<Embeddings> embeddingsResponse = await _openAIClient.GetEmbeddingsAsync("text-embedding-ada-002", embeddingsOptions);
            // In the early beta packages Embedding is a read-only list, so copy it into a List<float>.
            // On versions where it is ReadOnlyMemory<float>, pass Embedding.ToArray() instead.
            return new List<float>(embeddingsResponse.Value.Data[0].Embedding);
        }
        catch (RequestFailedException ex)
        {
            Console.WriteLine($"Error generating embeddings: {ex.Message}");
            return null; // Or handle the error as needed
        }
    }

    // Downloads a file from SharePoint Online and returns its content as a string.
    // This only makes sense for plain-text files; binary formats such as .docx or .pdf
    // need a text-extraction step before you generate embeddings.
    public static async Task<string> GetSharePointFileContentAsync(string siteUrl, string relativeUrl, string accessToken)
    {
        using (HttpClient client = new HttpClient())
        {
            client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);
            string fileUrl = $"{siteUrl}/_api/web/GetFileByServerRelativeUrl('{relativeUrl}')/$value";
            try
            {
                HttpResponseMessage response = await client.GetAsync(fileUrl);
                response.EnsureSuccessStatusCode(); // Throw if not successful
                return await response.Content.ReadAsStringAsync();
            }
            catch (HttpRequestException ex)
            {
                Console.WriteLine($"Error getting SharePoint file: {ex.Message}");
                return null;
            }
        }
    }
}

// Example usage:
public class Program
{
    public static async Task Main(string[] args)
    {
        string openAiEndpoint = Environment.GetEnvironmentVariable("OPENAI_ENDPOINT");
        string openAiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY");
        string sharepointSiteUrl = Environment.GetEnvironmentVariable("SHAREPOINT_SITE_URL");
        string sharepointRelativeUrl = "/sites/yoursite/Shared Documents/yourdocument.txt";
        string accessToken = Environment.GetEnvironmentVariable("SHAREPOINT_ACCESS_TOKEN");

        var generator = new EmbeddingGenerator(openAiEndpoint, openAiKey);
        string fileContent = await EmbeddingGenerator.GetSharePointFileContentAsync(sharepointSiteUrl, sharepointRelativeUrl, accessToken);
        if (fileContent != null)
        {
            List<float> embeddings = await generator.GetEmbeddingsAsync(fileContent);
            if (embeddings != null)
            {
                Console.WriteLine("Embeddings generated successfully:");
                Console.WriteLine(JsonSerializer.Serialize(embeddings));
            }
        }
        Console.ReadKey();
    }
}
2. Indexing in Azure AI Search:
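Your index schema needs a field to store the vectors. A Collection(Edm.Single) field is suitable. It must be searchable to take part in vector search, its dimensions must match the embedding model's output size (1536 for text-embedding-ada-002), and it must link to your vector search configuration. Note that vectorSearchConfiguration is the property name in the 2023-07-01-Preview REST API; the stable 2023-11-01 version and later use vectorSearchProfile instead.
{
  "name": "contentVector",
  "type": "Collection(Edm.Single)",
  "searchable": true,
  "filterable": false,
  "retrievable": true,
  "dimensions": 1536,
  "vectorSearchConfiguration": "my-vector-config"
}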
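You'll also need a vector search configuration, defined in the vectorSearch section of the index (under algorithmConfigurations in the 2023-07-01-Preview API):
{
  "name": "my-vector-config",
  "kind": "hnsw",
  "hnswParameters": {
    "m": 4,
    "efConstruction": 400,
    "efSearch": 500,
    "metric": "cosine"
  }
}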
Use the Azure AI Search REST API or SDK to index your documents, including the generated vectors.
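As a rough sketch of what that indexing call can look like over REST: documents are sent with a POST to https://{service-name}.search.windows.net/indexes/{index-name}/docs/index?api-version={api-version}, with your admin key in the api-key header. The body below assumes a key field named id and a text field named content next to contentVector; those two names are placeholders for whatever your index actually defines, and the vector is cut down to three values for readability where a real one would carry all 1536.

```json
{
  "value": [
    {
      "@search.action": "mergeOrUpload",
      "id": "yourdocument-txt",
      "content": "Text extracted from the SharePoint file",
      "contentVector": [0.0123, -0.0456, 0.0789]
    }
  ]
}
```

The mergeOrUpload action updates the document if its key already exists and creates it otherwise, which suits re-indexing files that have changed.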
3. Re-syncing the Index with New/Updated Documents:
- Using Power Automate:
  - Create a flow in Power Automate that triggers when a new document is created or an existing document is updated in SharePoint.
  - Use the HTTP action to call an Azure Function that generates embeddings and updates the search index.
- Using Azure Functions:
  - Create an Azure Function that generates embeddings for new or updated documents and updates the search index.
  - Use the Azure AI Search REST API to update the search index with the new embeddings.
- Power Automate Flow Example:
  - Trigger: When a file is created or modified in SharePoint.
  - Action: HTTP request to Azure Function.
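For the HTTP action, the request body only needs to tell the function which file changed. Here's a minimal sketch, assuming your function accepts JSON with the hypothetical property names siteUrl and serverRelativeUrl (this is not a built-in contract; use whatever names your function parses):

```json
{
  "siteUrl": "https://yourtenant.sharepoint.com/sites/yoursite",
  "serverRelativeUrl": "/sites/yoursite/Shared Documents/yourdocument.txt"
}
```

The function can then pass these two values to GetSharePointFileContentAsync, generate the embedding, and push the result to the search index.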
Hope that helps
-Grace