Exercise: Create a speech to text application that uses continuous recognition

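In this exercise, you create an application that uses continuous recognition to transcribe the sample audio file that you downloaded in the previous exercise.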

Modify the code for your speech to text application

  1. In the Cloud Shell on the right, open the Program.cs file:

    code Program.cs
    
  2. Replace the try/catch block with the following code so that the application uses continuous recognition instead of single-shot recognition:

    try
    {
        FileInfo fileInfo = new FileInfo(waveFile);
        if (fileInfo.Exists)
        {
            var speechConfig = SpeechConfig.FromSubscription(azureKey, azureLocation);
            using var audioConfig = AudioConfig.FromWavFileInput(fileInfo.FullName);
            using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);
            var stopRecognition = new TaskCompletionSource<int>();
    
            FileStream fileStream = File.OpenWrite(textFile);
            StreamWriter streamWriter = new StreamWriter(fileStream, Encoding.UTF8);
    
            speechRecognizer.Recognized += (s, e) =>
            {
                switch(e.Result.Reason)
                {
                    case ResultReason.RecognizedSpeech:
                        streamWriter.WriteLine(e.Result.Text);
                        break;
                    case ResultReason.NoMatch:
                        Console.WriteLine("Speech could not be recognized.");
                        break;
                }
            };
    
            speechRecognizer.Canceled += (s, e) =>
            {
                if (e.Reason != CancellationReason.EndOfStream)
                {
                    Console.WriteLine("Speech recognition canceled.");
                }
                stopRecognition.TrySetResult(0);
                streamWriter.Close();
            };
    
            speechRecognizer.SessionStopped += (s, e) =>
            {
                Console.WriteLine("Speech recognition stopped.");
                stopRecognition.TrySetResult(0);
                streamWriter.Close();
            };
    
            Console.WriteLine("Speech recognition started.");
            await speechRecognizer.StartContinuousRecognitionAsync();
            Task.WaitAny(new[] { stopRecognition.Task });
            await speechRecognizer.StopContinuousRecognitionAsync();
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }
    
  3. When you finish modifying the code, your file should resemble the following example:

    using System.Text;
    using Microsoft.CognitiveServices.Speech;
    using Microsoft.CognitiveServices.Speech.Audio;
    
    string azureKey = "ENTER YOUR KEY FROM THE FIRST EXERCISE";
    string azureLocation = "ENTER YOUR LOCATION FROM THE FIRST EXERCISE";
    string textFile = "Shakespeare.txt";
    string waveFile = "Shakespeare.wav";
    
    try
    {
        FileInfo fileInfo = new FileInfo(waveFile);
        if (fileInfo.Exists)
        {
            var speechConfig = SpeechConfig.FromSubscription(azureKey, azureLocation);
            using var audioConfig = AudioConfig.FromWavFileInput(fileInfo.FullName);
            using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);
            var stopRecognition = new TaskCompletionSource<int>();
    
            FileStream fileStream = File.OpenWrite(textFile);
            StreamWriter streamWriter = new StreamWriter(fileStream, Encoding.UTF8);
    
            speechRecognizer.Recognized += (s, e) =>
            {
                switch(e.Result.Reason)
                {
                    case ResultReason.RecognizedSpeech:
                        streamWriter.WriteLine(e.Result.Text);
                        break;
                    case ResultReason.NoMatch:
                        Console.WriteLine("Speech could not be recognized.");
                        break;
                }
            };
    
            speechRecognizer.Canceled += (s, e) =>
            {
                if (e.Reason != CancellationReason.EndOfStream)
                {
                    Console.WriteLine("Speech recognition canceled.");
                }
                stopRecognition.TrySetResult(0);
                streamWriter.Close();
            };
    
            speechRecognizer.SessionStopped += (s, e) =>
            {
                Console.WriteLine("Speech recognition stopped.");
                stopRecognition.TrySetResult(0);
                streamWriter.Close();
            };
    
            Console.WriteLine("Speech recognition started.");
            await speechRecognizer.StartContinuousRecognitionAsync();
            Task.WaitAny(new[] { stopRecognition.Task });
            await speechRecognizer.StopContinuousRecognitionAsync();
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }
    

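    As in the previous exercise, make sure that you update the values of the azureKey and azureLocation variables with your key and location from the first exercise.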

  4. Press Ctrl+S to save the file, and then press Ctrl+Q to exit the editor.
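
The code in step 2 waits on a TaskCompletionSource that the Canceled and SessionStopped handlers complete. As a minimal sketch of that signaling pattern on its own, the following example replaces the Speech SDK with a hypothetical FakeRecognizer class (not part of any library) so that it runs without an Azure key. The phrases it raises are placeholders, not recognition results.

```csharp
using System;
using System.Threading.Tasks;

// Completed by an event handler to tell the main flow that the session is over.
var stopRecognition = new TaskCompletionSource<int>();

var recognizer = new FakeRecognizer();
recognizer.Recognized += (s, text) => Console.WriteLine($"Recognized: {text}");
recognizer.SessionStopped += (s, e) =>
{
    Console.WriteLine("Session stopped.");
    // TrySetResult is safe to call more than once; later calls return false.
    stopRecognition.TrySetResult(0);
};

recognizer.Start(new[] { "first placeholder phrase", "second placeholder phrase" });

// Wait until a handler signals completion, as the exercise does with Task.WaitAny.
await stopRecognition.Task;
Console.WriteLine("Done.");

// Hypothetical stand-in for a recognizer: raises events from a background task.
class FakeRecognizer
{
    public event EventHandler<string> Recognized = delegate { };
    public event EventHandler SessionStopped = delegate { };

    public void Start(string[] phrases)
    {
        _ = Task.Run(() =>
        {
            foreach (var phrase in phrases)
            {
                Recognized(this, phrase);
            }
            SessionStopped(this, EventArgs.Empty);
        });
    }
}
```

This sketch awaits the task directly instead of calling Task.WaitAny, which is an alternative way to wait on the same signal. Because both the Canceled and SessionStopped handlers in the exercise can fire, TrySetResult is the safer choice: a second completion attempt is ignored rather than throwing.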

Run your application

  1. To run your application, use the following command in the Cloud Shell on the right:

    dotnet run
    
  2. If you don't see any errors, your application ran successfully, and you should see the following response:

    Speech recognition started.
    Speech recognition stopped.
    
  3. Run the following command to list the files in the directory:

    ls -l
    

    You should see a response like the following example, with the Shakespeare.txt file in the list of files:

    drwxr-xr-x 3 user   user     4096 Oct  1 11:11 bin
    drwxr-xr-x 3 user   user     4096 Oct  1 11:11 obj
    -rw-r--r-- 1 user   user     2926 Oct  1 11:11 Program.cs
    -rw-r--r-- 1 user   user      412 Oct  1 11:11 Shakespeare.txt
    -rwxr-xr-x 1 user   user   978242 Oct  1 11:11 Shakespeare.wav
    -rw-r--r-- 1 user   user      348 Oct  1 11:11 speech to text.csproj
    

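    Notice that the text file is larger than the result from the previous exercise. The size differs because continuous recognition transcribed more of the audio file.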

  4. To view the contents of the Shakespeare.txt file, use the following command:

    cat Shakespeare.txt
    

    You should see a response like the following example:

    The following quotes are from Act 2, scene seven of William Shakespeare's play as you like it.
    Though CS we are not all alone unhappy.
    This wide and universal theater presents more woeful pageants than the scene wherein we play in.
    All the world's a stage and all the men and women merely players.
    They have their exits and their entrances, and one man in his time plays many parts, his act being seven ages.
    

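    If you listen to the sample WAV file, you'll notice that this text now covers the entire recording. Because the application calls the StartContinuousRecognitionAsync() method of the SpeechRecognizer, speech to text recognition continues even when the speaker pauses.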
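
For contrast, here is a minimal sketch of single-shot recognition against the same file, which is what the continuous version replaces. It assumes the same placeholder key, location, and file name as the exercise, and it isn't necessarily identical to the code from the previous exercise. RecognizeOnceAsync() returns a single result, so the transcription ends at the first pause instead of covering the whole recording.

```csharp
using System;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

// Placeholder values, as in the exercise.
string azureKey = "ENTER YOUR KEY FROM THE FIRST EXERCISE";
string azureLocation = "ENTER YOUR LOCATION FROM THE FIRST EXERCISE";
string waveFile = "Shakespeare.wav";

var speechConfig = SpeechConfig.FromSubscription(azureKey, azureLocation);
using var audioConfig = AudioConfig.FromWavFileInput(waveFile);
using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);

// Single-shot: the task completes with one result after the first utterance.
SpeechRecognitionResult result = await speechRecognizer.RecognizeOnceAsync();

if (result.Reason == ResultReason.RecognizedSpeech)
{
    Console.WriteLine(result.Text);
}
else
{
    Console.WriteLine($"Speech could not be recognized ({result.Reason}).");
}
```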

Improve your application's recognition results

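In the previous section, you might have noticed that the second line of the transcript isn't perfect: the recognizer produced "Though CS" where the speaker says "Thou seest". This recognition error is caused by the archaic English vocabulary in William Shakespeare's play. The example is analogous to the specialized vocabulary that your medical client uses in notes and dictation.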

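Azure AI Speech lets you improve recognition results by supplying a list of phrases that the speech recognition engine might not be familiar with.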

To see an example of this kind of improvement in action, use the following steps.

  1. In the Cloud Shell on the right, open the Program.cs file:

    code Program.cs
    
  2. Locate the following lines of code:

    FileStream fileStream = File.OpenWrite(textFile);
    StreamWriter streamWriter = new StreamWriter(fileStream, Encoding.UTF8);
    
  3. Add the following code directly after those two lines. Make sure that both new lines match the indentation of the preceding lines:

    var phraseList = PhraseListGrammar.FromRecognizer(speechRecognizer);
    phraseList.AddPhrase("thou seest");
    

    These lines help the speech recognition engine detect the archaic English phrase "thou seest" from Shakespeare's play.

  4. Press Ctrl+S to save the file, and then press Ctrl+Q to exit the editor.

  5. Rerun your application by using the following command:

    dotnet run
    
  6. When your application finishes, use the following command to view the contents of the Shakespeare.txt file:

    cat Shakespeare.txt
    

    You should see a response like the following example:

    The following quotes are from Act 2, scene seven of William Shakespeare's play as you like it.
    Thou seest, we are not all alone unhappy.
    This wide and universal theater presents more woeful pageants than the scene wherein we play in.
    All the world's a stage and all the men and women merely players.
    They have their exits and their entrances, and one man in his time plays many parts, his act being seven ages.
    

    Notice that the recognition error in the second line is now corrected.
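
A real vocabulary, such as a medical client's terminology, usually contains more than one phrase. As a sketch of how the two lines from step 3 might generalize, the following hypothetical helper (PhraseListHelper and AddPhrases are illustrative names, not part of the Speech SDK) adds every phrase in a collection to the recognizer's phrase list. It uses only the PhraseListGrammar.FromRecognizer and AddPhrase calls shown in the exercise.

```csharp
using System.Collections.Generic;
using Microsoft.CognitiveServices.Speech;

// Hypothetical helper: registers a collection of phrases with a recognizer.
static class PhraseListHelper
{
    public static void AddPhrases(SpeechRecognizer speechRecognizer, IEnumerable<string> phrases)
    {
        // Get the phrase list associated with this recognizer.
        var phraseList = PhraseListGrammar.FromRecognizer(speechRecognizer);

        foreach (var phrase in phrases)
        {
            // Skip blank entries so that only meaningful phrases are registered.
            if (!string.IsNullOrWhiteSpace(phrase))
            {
                phraseList.AddPhrase(phrase);
            }
        }
    }
}
```

If you place this class at the end of Program.cs, a call such as PhraseListHelper.AddPhrases(speechRecognizer, new[] { "thou seest" }); made before StartContinuousRecognitionAsync() should have the same effect as the two lines from step 3.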