Hi Gopi,
Thanks for the question. To convert speech to text while extracting content with Azure AI Search, you'll need to integrate Azure Cognitive Services Speech to Text with your skillset. Azure AI Search does not include a built-in Speech-to-Text skill, so you'll need to either run Azure Speech Services separately or create a Custom Skill in your Azure AI Search pipeline.
To put this in place: decide whether to use a Custom Skill in Azure AI Search or to transcribe the audio before indexing, set up Azure Speech Services and test transcription, then integrate with Azure AI Search using Custom Skills, Logic Apps, or Azure Functions. Here are the options that can help you:
- Use Azure Cognitive Search Custom Skill with Speech-to-Text API
Azure AI Search allows Custom Skills to process data before indexing. You can create an Azure Function that calls Azure Speech Services and returns transcribed text.
- Learn how to create a custom skill in Azure AI Search → Custom Skills in Azure AI Search
- Learn how to integrate Azure AI Search with Blob Storage → Azure AI Search Indexing
Steps:
- Store your audio files in Azure Blob Storage.
- Configure an Azure AI Search Indexer to read these files.
- Create an Azure Function to call the Speech-to-Text API and return the transcribed text.
- Use the function as a Custom Skill in your Azure AI Search pipeline.
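As a rough illustration of the Azure Function step above, here is a minimal sketch of the request and response handling a custom skill needs. Azure AI Search sends a JSON body with a `values` array of records (each with a `recordId` and `data`) and expects the same shape back. The input name `audioUrl`, the output name `transcript`, and the `transcribe` helper are assumptions made for this example; they must match whatever you define in your own skillset, and `transcribe` stands in for your call to Azure Speech Services.

```python
import json
from typing import Callable


def run_transcription_skill(request_body: str, transcribe: Callable[[str], str]) -> str:
    """Apply `transcribe` to every record of a custom-skill request body.

    `transcribe` is a hypothetical helper: it takes an audio URL and
    returns the transcribed text (for example by calling the Speech service).
    """
    payload = json.loads(request_body)
    results = []
    for record in payload.get("values", []):
        # Each response record must echo the recordId of the request record.
        output = {"recordId": record["recordId"], "data": {}, "errors": [], "warnings": []}
        try:
            audio_url = record["data"]["audioUrl"]  # assumed input name
            output["data"]["transcript"] = transcribe(audio_url)  # assumed output name
        except Exception as exc:
            # Report the failure for this record only; other records still succeed.
            output["errors"].append({"message": f"Transcription failed: {exc}"})
        results.append(output)
    return json.dumps({"values": results})
```

Inside the Azure Function you would pass the HTTP request body to this function and return its result as the HTTP response with a JSON content type. Handling errors per record keeps one bad audio file from failing the whole batch.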
- Use Azure Speech Services Directly Before Indexing
If your audio files are not indexed yet, you can:
- Use Azure Speech SDK or REST API to transcribe the audio files.
- Store the transcribed text in Azure Blob Storage or Cosmos DB.
- Use Azure AI Search to index the transcribed text.
- Learn how to use Azure Speech-to-Text API → Azure Speech Service
- How to transcribe speech-to-text using Azure SDK → Speech-to-Text with Python
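Sample Python Code Using Azure Speech SDK:
import azure.cognitiveservices.speech as speechsdk
# Replace the placeholders with your Speech resource key and region.
speech_key = "Your_Speech_API_Key"
service_region = "Your_Region"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
audio_config = speechsdk.audio.AudioConfig(filename="your_audio_file.wav")
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
# recognize_once() stops after the first utterance; for longer recordings
# use start_continuous_recognition() instead.
result = speech_recognizer.recognize_once()
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("Transcription:", result.text)
else:
    print("No transcription returned:", result.reason)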
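Once you have the transcribed text, one way to get it into Azure AI Search is to push it to the index as documents. The sketch below only builds the JSON batch that the documents indexing endpoint accepts (a `value` array where each entry carries an `@search.action`). The field names `id`, `source_file`, and `content` are assumptions for this example and must match your own index schema. Document keys may only contain letters, digits, underscores, dashes, and equals signs, so the file name is encoded with URL-safe Base64 here.

```python
import base64
from typing import Dict


def make_document_key(blob_name: str) -> str:
    """Encode a file name into a string that is valid as a search document key."""
    return base64.urlsafe_b64encode(blob_name.encode("utf-8")).decode("ascii")


def build_upload_batch(transcripts: Dict[str, str]) -> dict:
    """Build an indexing batch from a mapping of audio file name to transcript."""
    return {
        "value": [
            {
                "@search.action": "mergeOrUpload",  # insert, or update if the key exists
                "id": make_document_key(name),      # assumed key field name
                "source_file": name,                # assumed field name
                "content": text,                    # assumed field name
            }
            for name, text in transcripts.items()
        ]
    }
```

For example, `build_upload_batch({"calls/a.wav": "hello"})` returns a batch with one document that you can send to your index. If you store the transcripts in Blob Storage or Cosmos DB instead, an indexer can pick them up and you do not need to push them yourself.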
- Automate Speech-to-Text Conversion Using Logic Apps
You can automate the process using Azure Logic Apps:
- Trigger: When a new audio file is uploaded to Azure Blob Storage.
- Action: Use Azure Speech Services to transcribe the audio.
- Output: Store the transcribed text in Azure Storage, Table Storage, or Cosmos DB for Azure AI Search to index.
- Learn how to automate workflows with Logic Apps → Azure Logic Apps Overview
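If part of this workflow runs in code (for example an Azure Function called from the Logic App), two small decisions are worth making explicit: which uploaded files count as audio, and where the transcript is written. The sketch below is only an illustration; the extension list and the `transcripts/` prefix are assumptions for this example, not values required by any Azure service.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumed set of audio extensions; adjust to the formats you actually upload.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".ogg", ".flac"}


def transcript_blob_name(audio_blob_name: str) -> Optional[str]:
    """Return the output name for a transcript, or None if the blob is not audio."""
    path = PurePosixPath(audio_blob_name)
    if path.suffix.lower() not in AUDIO_EXTENSIONS:
        return None  # ignore non-audio uploads
    # Write transcripts under a separate prefix so they are not treated as new audio.
    return str(PurePosixPath("transcripts") / path.with_suffix(".txt"))
```

Keeping transcripts under their own prefix (or in a separate container) also makes it easy to point the Azure AI Search indexer at the text only.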
Please try these steps and let me know whether they resolve the issue. I hope this answer helps. Please comment below if you need any further assistance. Happy to help!
Regards,
Chakravarthi Rangarajan Bhargavi
- Please accept the answer and vote 'Yes' if you found it helpful, to support the community. Thanks a lot.