Hello Fabio Puddu,
Welcome to the Microsoft Q&A and thank you for posting your questions here.
I understand that you would like to run the Azure AI services Speech to Text container on-premises.
Running Speech to Text containers on-premises is a great choice for maintaining control over your data. However, you need to take the following into consideration, as requested in your question:
- The cost of running Speech to Text containers on-premises depends on your usage and the resources you allocate. For example, each decoder in batch processing mode can handle 2-3x real-time with two CPU cores. For pricing details, see: https://azure.microsoft.com/en-us/pricing/details/cognitive-services/speech-services
- You can run multiple Speech to Text containers on the same host. This setup allows you to handle multiple requests simultaneously, which is useful for multi-agent services. See: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-stt
- The Speech to Text containers support various features, but the documentation does not explicitly describe any anonymization capabilities. You might need to implement additional layers of data processing to ensure anonymization. The link above covers the supported features.
- The containers can handle various audio formats, including MP3. You can use the docker run command to run the container and specify the audio input format. See: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-stt
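To make the sizing point more concrete, here is a rough sketch in Python of how you could estimate the CPU cores needed for a batch workload. It only uses the figure quoted above (2-3x real-time per decoder with two CPU cores). The function name cores_needed and its parameters are hypothetical, and it assumes throughput scales linearly with the number of decoders, which you should verify on your own hardware. It is a capacity estimate, not a pricing calculation.

```python
import math


def cores_needed(audio_hours: float, deadline_hours: float,
                 realtime_factor: float = 2.0,
                 cores_per_decoder: int = 2) -> int:
    """Estimate CPU cores needed to transcribe audio_hours within deadline_hours.

    Assumptions (not from the official documentation):
    - each decoder processes `realtime_factor` hours of audio per wall-clock hour
    - throughput scales linearly with the number of decoders
    """
    if audio_hours < 0 or deadline_hours <= 0 or realtime_factor <= 0:
        raise ValueError("audio_hours must be >= 0; the other values must be > 0")
    # Wall-clock hours a single decoder would need for the whole workload.
    single_decoder_hours = audio_hours / realtime_factor
    # Decoders required to finish before the deadline, rounded up.
    decoders = math.ceil(single_decoder_hours / deadline_hours)
    return decoders * cores_per_decoder


if __name__ == "__main__":
    # 100 hours of audio, 8-hour window, pessimistic (2x) and optimistic (3x) cases.
    print(cores_needed(100, 8, realtime_factor=2.0))
    print(cores_needed(100, 8, realtime_factor=3.0))
```

Running the estimate with both 2x and 3x gives you a range rather than a single number, which is usually more realistic when planning host resources.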
For further reading and more detailed guidance, see https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-faq and the links provided above.
I hope this is helpful! Do not hesitate to let me know if you have any other questions.
Please don't forget to close the thread by upvoting this reply and accepting it as the answer if it was helpful.