Hi swal,
Here is the update. I am able to get the expected result with the input and command below.
input - "<speak version=\"1.0\" xml:lang=\"en-US\"><voice name=\"en-US-JennyNeural\">The rainbow has<break strength=\"strong\"/>seven colors.<break strength=\"strong\"/>Each color has its own beauty.<break strength=\"strong\"/></voice></speak>"
curl -v -X PUT -H "Ocp-Apim-Subscription-Key: yoursubkey" -H "Content-Type: application/json" -d '{
"description": "my ssml test",
"inputKind": "SSML",
"inputs": [
{
"content": "<speak version=\"1.0\" xml:lang=\"en-US\"><voice name=\"en-US-JennyNeural\">The rainbow has<break strength=\"strong\"/>seven colors.<break strength=\"strong\"/>Each color has its own beauty.<break strength=\"strong\"/></voice></speak>"
}
],
"properties": {
"outputFormat": "riff-24khz-16bit-mono-pcm",
"wordBoundaryEnabled": true,
"sentenceBoundaryEnabled": false,
"concatenateResult": false,
"decompressOutputFiles": false
}
}' \
"https://northeurope.api.cognitive.microsoft.com/texttospeech/batchsyntheses/idm0756?api-version=2024-04-01"
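Escaping the quotes inside the SSML by hand is easy to get wrong, so here is a minimal sketch that builds the same PUT request in Python and lets `json.dumps` do the escaping. The function name `build_batch_synthesis_request` and its parameters are my own, not part of any SDK; the endpoint, headers, and body fields are copied from the curl command above. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_batch_synthesis_request(ssml, synthesis_id, subscription_key,
                                  region="northeurope"):
    """Build (but do not send) the batch synthesis PUT request.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    body = {
        "description": "my ssml test",
        "inputKind": "SSML",
        # json.dumps escapes the double quotes inside the SSML for us.
        "inputs": [{"content": ssml}],
        "properties": {
            "outputFormat": "riff-24khz-16bit-mono-pcm",
            "wordBoundaryEnabled": True,
            "sentenceBoundaryEnabled": False,
            "concatenateResult": False,
            "decompressOutputFiles": False,
        },
    }
    url = (
        f"https://{region}.api.cognitive.microsoft.com/texttospeech/"
        f"batchsyntheses/{synthesis_id}?api-version=2024-04-01"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/json",
        },
        method="PUT",
    )


if __name__ == "__main__":
    ssml = (
        '<speak version="1.0" xml:lang="en-US">'
        '<voice name="en-US-JennyNeural">The rainbow has'
        '<break strength="strong"/>seven colors.</voice></speak>'
    )
    request = build_batch_synthesis_request(ssml, "idm0756", "yoursubkey")
    print(request.get_method(), request.full_url)
    # To actually submit it: urllib.request.urlopen(request)
```

To submit the job, pass the returned object to `urllib.request.urlopen` with a real subscription key.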
output - [
{
"Text": "The",
"AudioOffset": 50,
"Duration": 137
},
{
"Text": "rainbow",
"AudioOffset": 200,
"Duration": 350
},
{
"Text": "has",
"AudioOffset": 562,
"Duration": 475
},
{
"Text": "seven",
"AudioOffset": 2050,
"Duration": 362
},
{
"Text": "colors",
"AudioOffset": 2425,
"Duration": 612
},
{
"Text": ".",
"AudioOffset": 3050,
"Duration": 100
},
{
"Text": "Each",
"AudioOffset": 4900,
"Duration": 287
},
{
"Text": "color",
"AudioOffset": 5200,
"Duration": 350
},
{
"Text": "has",
"AudioOffset": 5562,
"Duration": 175
},
{
"Text": "its",
"AudioOffset": 5750,
"Duration": 150
},
{
"Text": "own",
"AudioOffset": 5912,
"Duration": 162
},
{
"Text": "beauty",
"AudioOffset": 6087,
"Duration": 462
},
{
"Text": ".",
"AudioOffset": 6562,
"Duration": 100
}
]
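To check that the breaks landed where expected, you can compute the silence between consecutive words from the output above. This is a small sketch, assuming `AudioOffset` and `Duration` are in milliseconds (please verify against the service documentation); the function names and the 500 threshold are my own choices, and only the first seven entries are included to keep it short.

```python
import json

# First seven entries copied from the word boundary output above.
WORD_BOUNDARIES = json.loads("""
[
  {"Text": "The", "AudioOffset": 50, "Duration": 137},
  {"Text": "rainbow", "AudioOffset": 200, "Duration": 350},
  {"Text": "has", "AudioOffset": 562, "Duration": 475},
  {"Text": "seven", "AudioOffset": 2050, "Duration": 362},
  {"Text": "colors", "AudioOffset": 2425, "Duration": 612},
  {"Text": ".", "AudioOffset": 3050, "Duration": 100},
  {"Text": "Each", "AudioOffset": 4900, "Duration": 287}
]
""")


def gaps_between_words(boundaries):
    """Return (previous_text, next_text, gap) for each consecutive pair.

    The gap is the next word's AudioOffset minus the end of the previous
    word (its AudioOffset plus Duration), in the same units as the input.
    """
    gaps = []
    for prev, nxt in zip(boundaries, boundaries[1:]):
        prev_end = prev["AudioOffset"] + prev["Duration"]
        gaps.append((prev["Text"], nxt["Text"], nxt["AudioOffset"] - prev_end))
    return gaps


def long_gaps(boundaries, threshold=500):
    """Keep only the gaps at or above the threshold (an arbitrary cutoff)."""
    return [gap for gap in gaps_between_words(boundaries) if gap[2] >= threshold]


if __name__ == "__main__":
    for prev_text, next_text, gap in long_gaps(WORD_BOUNDARIES):
        print(f"{prev_text!r} -> {next_text!r}: {gap}")
```

For these entries the only long gaps are between "has" and "seven" (1013) and between "." and "Each" (1750), which matches the positions of the `<break strength="strong"/>` elements in the SSML; the other gaps are 12 or 13.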
Thank you.