How to train a custom model from a custom model in Azure Speech Studio?

Bruno Vaz 0 Reputation points
2025-02-20T09:24:21.05+00:00

Hello!

I have a custom model in Azure Speech Studio that I trained a few months back with audio + human-labeled transcripts. Now I need to train it further so that it properly filters profanity. I thought I could just select my custom model as the baseline model and train it on custom display text formatting, but that option is not shown (see image below, where the custom model is in the foreground but does not appear in the dropdown for selecting the baseline model).

Is this not possible at all? I don't know if the REST API provides more options.

If not possible, any ideas on how I can accomplish this task?

Thank you in advance!

All the best,

Bruno

[Image: speech_studio]


1 answer

  1. Pavankumar Purilla 3,410 Reputation points Microsoft Vendor
    2025-02-21T20:31:58.0266667+00:00

    Hi Bruno Vaz,
    Greetings & Welcome to the Microsoft Q&A forum! Thank you for sharing your query.

    Training a Custom Model from a Custom Model:

    Unfortunately, Azure Speech Studio does not currently support using an existing custom model as a baseline for further training directly through the UI. This means you can't select your previously trained custom model as the baseline for new training sessions.

    Using the REST API:

    The REST API for the Azure Speech service might offer more flexibility: you can upload datasets and specify, when you train a model or run a test, whether each one is used for training or testing. However, the API follows the same general principles as the UI, so you are likely to face the same limitation regarding baseline models.
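One way to check what the REST API accepts as a baseline is to list the base models for your locale. The following is a minimal sketch, not an official client: it assumes the v3.2 Speech to text REST API path `/speechtotext/v3.2/models/base`, key authentication through the `Ocp-Apim-Subscription-Key` header, and a response whose `values` array holds entries with `self` and `locale` fields. The helper names, region, and key are placeholders, and the sketch reads only the first page of results; verify the path and fields against the current API reference.

```python
import json
import urllib.request

API_VERSION = "v3.2"  # assumption: adjust to the version your resource supports


def base_models_url(region: str) -> str:
    """Build the URL that lists base models (path assumed from the v3.2 API)."""
    return (
        f"https://{region}.api.cognitive.microsoft.com"
        f"/speechtotext/{API_VERSION}/models/base"
    )


def filter_by_locale(response_body: dict, locale: str) -> list:
    """Return the 'self' URLs of the listed models that match the locale."""
    return [
        model["self"]
        for model in response_body.get("values", [])
        if model.get("locale") == locale
    ]


def list_base_models(region: str, key: str, locale: str) -> list:
    """Fetch the first page of base models for one locale (needs network access)."""
    request = urllib.request.Request(
        base_models_url(region),
        headers={"Ocp-Apim-Subscription-Key": key},
    )
    with urllib.request.urlopen(request) as response:
        return filter_by_locale(json.load(response), locale)
```

Calling `list_base_models("<your-region>", "<your-key>", "en-US")` would return the URLs you can pass as the base model when creating a new custom model.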

    Combining Datasets for Training:

    You can train a new custom model from a base model using multiple datasets. For your use case, you can combine the original audio + human-labeled transcripts dataset with a custom display text formatting dataset for profanity filtering. This approach should help improve the model's ability to filter out profanity.

    Steps to Train with Multiple Datasets:

    1. Upload Datasets: Ensure both datasets (original audio + transcripts and custom display text formatting) are uploaded to your Speech Studio project.

    2. Train a New Model: Select the most recent base model available as your starting point. On the "Choose data" page, select both datasets for training.

    3. Configure Profanity Filtering: Make sure your custom display text formatting dataset includes the necessary rules for profanity filtering.
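If you prefer the REST API over the Speech Studio UI, the same steps can be sketched as a single model-creation request that references a base model and both datasets. This is an illustration under assumptions, not a confirmed recipe: it assumes the v3.2 `POST /speechtotext/v3.2/models` endpoint with a JSON body containing `displayName`, `locale`, `baseModel`, and `datasets`, each referenced by its `self` URL. The function names and all URLs are hypothetical placeholders; check the body shape against the current API reference before relying on it.

```python
import json
import urllib.request


def build_model_request(display_name, locale, base_model_url, dataset_urls):
    """Assemble the JSON body for a new custom model trained on several datasets."""
    if not dataset_urls:
        raise ValueError("at least one dataset is required")
    return {
        "displayName": display_name,
        "locale": locale,
        # Must point at a base model; a custom model is not accepted in the UI.
        "baseModel": {"self": base_model_url},
        # One entry per dataset: audio + transcripts and display text formatting.
        "datasets": [{"self": url} for url in dataset_urls],
    }


def create_model(region, key, body):
    """POST the body to the models endpoint (assumed path; needs network access)."""
    url = f"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.2/models"
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building the body separately from sending it lets you inspect or log the request before any training job is started.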

    Additional Resources:

    You might find it helpful to refer to the Azure AI services documentation for detailed steps on training custom models and managing datasets.

    I hope this information helps. Thank you!
