I want to setup and test a basic & simple "Chatgpt-4o Voice-To-Voice App".

I already have setup ChatGPT-4o-preview model in Azure AI Foundry, and I already have the API Key & the endpoint. I just want "THE CODE FOR THE APP" (Microsoft has not provided the code to use, for setting up an app for this)

Question

I want to setup and test a basic & simple "Chatgpt-4o Voice-To-Voice App".

I already have setup ChatGPT-4o-preview model in Azure AI Foundry, and I already have the API Key & the endpoint. I just want "THE CODE FOR THE APP" (Microsoft has not provided the code to use, for setting up an app for this)

Answer

Hi there Santosh Kumar

Thanks for using QandA platform

you will need to integrate both speech recognition (converting voice to text) and speech synthesis in addition to using the ChatGPT-4 model through the Azure API.

cofde snipp for voice to voice app

import os
import openai
import azure.cognitiveservices.speech as speechsdk
import pyaudio
import wave
# Setup Azure Cognitive Services Speech API
speech_key = "YOUR_AZURE_SPEECH_KEY"
region = "YOUR_AZURE_REGION"
# Setup OpenAI (ChatGPT-4) API
openai.api_key = 'YOUR_OPENAI_API_KEY'
chatgpt_endpoint = "YOUR_CHATGPT_ENDPOINT"
# Speech-to-Text: Initialize speech recognizer
def speech_to_text():
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=region)
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
    print("Say something...")
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        print("Recognized: {}".format(result.text))
        return result.text
    else:
        print("Speech could not be recognized.")
        return None
# ChatGPT-4: Send text to ChatGPT-4 and get response
def chatgpt_response(prompt):
    response = openai.Completion.create(
        model="gpt-4",
        prompt=prompt,
        max_tokens=150
    )
    return response.choices[0].text.strip()
# Text-to-Speech: Convert text response to speech
def text_to_speech(text):
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=region)
    audio_config = speechsdk.audio.AudioConfig(filename="output_audio.wav")
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
    synthesizer.speak_text_async(text)
    print("Speaking the response...")
# Main function to run Voice-to-Voice
def voice_to_voice():
    while True:
        # Step 1: Listen and convert speech to text
        user_input = speech_to_text()
        if user_input is None:
            break
        
        # Step 2: Send input to ChatGPT-4 for processing
        chatgpt_text = chatgpt_response(user_input)
        print(f"ChatGPT-4 Response: {chatgpt_text}")
        # Step 3: Convert ChatGPT-4 response back to speech
        text_to_speech(chatgpt_text)
# Run the app
if __name__ == "__main__":
    voice_to_voice()

Kindly accept the answer if this helps thanks.

Share via

I want to setup and test a basic & simple "Chatgpt-4o Voice-To-Voice App". I already have setup ChatGPT-4o-preview model in Azure AI Foundry, and I already have the API Key & the endpoint. I just want "THE CODE FOR THE APP"

1 answer

Your answer