Transcriptions - List

Gets a list of transcriptions for the authenticated subscription.

GET {endpoint}/speechtotext/transcriptions?api-version=2024-11-15
GET {endpoint}/speechtotext/transcriptions?skip={skip}&top={top}&filter={filter}&api-version=2024-11-15

URI Parameters

Name In Required Type Description
endpoint
path True

string

Supported Cognitive Services endpoints (protocol and hostname, for example: https://westus.api.cognitive.microsoft.com).

api-version
query True

string

The requested API version.

filter
query

string

A filtering expression for selecting a subset of the available transcriptions.

        - Supported properties: displayName, description, createdDateTime, lastActionDateTime, status, locale.

        - Operators:

          - eq, ne are supported for all properties.

          - gt, ge, lt, le are supported for createdDateTime and lastActionDateTime.

          - and, or, not are supported.

        - Example:

          filter=createdDateTime gt 2022-02-01T11:00:00Z
skip
query

integer

int32

Number of transcriptions to skip.

top
query

integer

int32

Number of transcriptions to include after skipping.
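
The query parameters above can be assembled into a request URL along these lines. This is a minimal sketch, not part of the service's tooling: the helper name build_list_url and its argument names are hypothetical, and only the path, parameter names, and API version come from this reference.

```python
from urllib.parse import quote, urlencode

API_VERSION = "2024-11-15"  # the version used throughout this reference


def build_list_url(endpoint, skip=None, top=None, filter_expr=None):
    """Return a Transcriptions - List URL (hypothetical helper)."""
    params = {}
    if skip is not None:
        params["skip"] = skip
    if top is not None:
        params["top"] = top
    if filter_expr:
        params["filter"] = filter_expr
    params["api-version"] = API_VERSION
    # quote_via=quote encodes spaces as %20; keeping ' and : unescaped
    # mirrors the sample requests in this reference.
    query = urlencode(params, safe="':", quote_via=quote)
    return f"{endpoint.rstrip('/')}/speechtotext/transcriptions?{query}"


url = build_list_url(
    "https://westus.api.cognitive.microsoft.com",
    skip=0,
    top=2,
    filter_expr="status eq 'Failed'",
)
print(url)
```

Omitting skip, top, and filter_expr yields the first request form, with api-version as the only query parameter.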

Request Header

Name Required Type Description
Ocp-Apim-Subscription-Key True

string

Provide your cognitive services account key here.

Responses

Name Type Description
200 OK

PaginatedTranscriptions

OK

Headers

Retry-After: integer

Other Status Codes

Error

An error occurred.
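
A client may want to read the Retry-After header before issuing its next request. The sketch below assumes the header carries a whole number of seconds, which is the usual meaning of an integer Retry-After value but is not spelled out in this reference. The function name retry_after_seconds is hypothetical.

```python
def retry_after_seconds(headers, default=0):
    """Return the Retry-After delay in seconds, or `default` if absent or invalid."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("retry-after")
    if raw is None:
        return default
    try:
        # Clamp negative values to zero; a negative delay has no meaning.
        return max(0, int(raw))
    except (TypeError, ValueError):
        return default


delay = retry_after_seconds({"Retry-After": "5"})
print(delay)
```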

Security

Ocp-Apim-Subscription-Key

Provide your cognitive services account key here.

Type: apiKey
In: header
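
As an illustration of the apiKey scheme above, the following sketch attaches the key as a request header using Python's standard library. The helper name build_list_request and the placeholder key are assumptions; the request is only constructed here, not sent.

```python
from urllib.request import Request


def build_list_request(url, subscription_key):
    """Build (but do not send) a GET request carrying the subscription key."""
    return Request(
        url,
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
        method="GET",
    )


request = build_list_request(
    "https://westus.api.cognitive.microsoft.com/speechtotext/transcriptions?api-version=2024-11-15",
    "<your-key>",  # placeholder, replace with a real account key
)
# urllib stores header names in capitalized form; HTTP treats header names
# as case-insensitive, so the service sees the same header.
print(request.get_method(), request.full_url)

# To send it, something like the following would be used:
# from urllib.request import urlopen
# with urlopen(request) as response:
#     body = response.read()
```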

Examples

Get all failed transcriptions

Sample request

GET {endpoint}/speechtotext/transcriptions?skip=0&top=2&filter=status%20eq%20'Failed'&api-version=2024-11-15

Sample response

{
  "values": [
    {
      "self": "https://westus.api.cognitive.microsoft.com/speechtotext/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683?api-version=2024-11-15",
      "displayName": "Transcription using adapted model en-US",
      "customProperties": {
        "key": "value"
      },
      "locale": "en-US",
      "createdDateTime": "2019-01-07T11:34:12Z",
      "lastActionDateTime": "2019-01-07T11:36:07Z",
      "contentUrls": [
        "https://contoso.com/",
        "https://contoso2.com/"
      ],
      "model": {
        "self": "https://westus.api.cognitive.microsoft.com/speechtotext/models/827712a5-f942-4997-91c3-7c6cde35600b?api-version=2024-11-15"
      },
      "links": {
        "files": "https://westus.api.cognitive.microsoft.com/speechtotext/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683/files?api-version=2024-11-15"
      },
      "properties": {
        "wordLevelTimestampsEnabled": false,
        "displayFormWordLevelTimestampsEnabled": true,
        "channels": [
          0,
          1
        ],
        "punctuationMode": "DictatedAndAutomatic",
        "profanityFilterMode": "Masked",
        "timeToLiveHours": 48,
        "durationMilliseconds": 42000
      },
      "status": "Failed"
    }
  ]
}

Get all transcriptions

Sample request

GET {endpoint}/speechtotext/transcriptions?skip=0&top=2&filter=createdDateTime%20ge%202018-01-24T09:54:39Z&api-version=2024-11-15

Sample response

{
  "values": [
    {
      "self": "https://westus.api.cognitive.microsoft.com/speechtotext/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683?api-version=2024-11-15",
      "displayName": "Transcription using adapted model en-US",
      "customProperties": {
        "key": "value"
      },
      "locale": "en-US",
      "createdDateTime": "2019-01-07T11:34:12Z",
      "lastActionDateTime": "2019-01-07T11:36:07Z",
      "model": {
        "self": "https://westus.api.cognitive.microsoft.com/speechtotext/models/827712a5-f942-4997-91c3-7c6cde35600b?api-version=2024-11-15"
      },
      "links": {
        "files": "https://westus.api.cognitive.microsoft.com/speechtotext/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683/files?api-version=2024-11-15"
      },
      "properties": {
        "wordLevelTimestampsEnabled": false,
        "displayFormWordLevelTimestampsEnabled": false,
        "channels": [
          0,
          1
        ],
        "punctuationMode": "DictatedAndAutomatic",
        "profanityFilterMode": "Masked",
        "timeToLiveHours": 48,
        "durationMilliseconds": 42000
      },
      "status": "Succeeded"
    },
    {
      "self": "https://westus.api.cognitive.microsoft.com/speechtotext/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683?api-version=2024-11-15",
      "displayName": "Transcription using adapted model en-US",
      "customProperties": {
        "key": "value"
      },
      "locale": "en-US",
      "createdDateTime": "2019-01-07T11:34:12Z",
      "lastActionDateTime": "2019-01-07T11:36:07Z",
      "contentUrls": [
        "https://contoso.com/",
        "https://contoso2.com/"
      ],
      "model": {
        "self": "https://westus.api.cognitive.microsoft.com/speechtotext/models/827712a5-f942-4997-91c3-7c6cde35600b?api-version=2024-11-15"
      },
      "links": {
        "files": "https://westus.api.cognitive.microsoft.com/speechtotext/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683/files?api-version=2024-11-15"
      },
      "properties": {
        "wordLevelTimestampsEnabled": false,
        "displayFormWordLevelTimestampsEnabled": true,
        "channels": [
          0,
          1
        ],
        "punctuationMode": "DictatedAndAutomatic",
        "profanityFilterMode": "Masked",
        "timeToLiveHours": 48,
        "durationMilliseconds": 42000
      },
      "status": "Failed"
    }
  ],
  "@nextLink": "https://westus.api.cognitive.microsoft.com/speechtotext/transcriptions?skip=2&top=2&filter=createdDateTime%20ge%202018-01-24T09:54:39Z&api-version=2024-11-15"
}
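
The "@nextLink" value in the response above can be followed until it is absent or null. The sketch below is one way to do that; fetch_page is a hypothetical callable that takes a URL and returns the decoded JSON body as a dict, so that the transport (and authentication) stays outside the example. The in-memory pages are made up for the demonstration.

```python
def iter_transcriptions(first_url, fetch_page):
    """Yield every transcription, following "@nextLink" across pages."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("values", [])
        # A missing or null "@nextLink" ends the loop.
        url = page.get("@nextLink")


# Made-up two-page listing standing in for real responses.
fake_pages = {
    "page-1": {"values": [{"displayName": "a"}, {"displayName": "b"}], "@nextLink": "page-2"},
    "page-2": {"values": [{"displayName": "c"}], "@nextLink": None},
}

names = [t["displayName"] for t in iter_transcriptions("page-1", fake_pages.__getitem__)]
print(names)
```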

Definitions

Name Description
DetailedErrorCode

DetailedErrorCode

DiarizationProperties

DiarizationProperties

EntityError

EntityError

EntityReference

EntityReference

Error

Error

ErrorCode

ErrorCode

InnerError

InnerError

LanguageIdentificationMode

LanguageIdentificationMode

LanguageIdentificationProperties

LanguageIdentificationProperties

PaginatedTranscriptions

PaginatedTranscriptions

ProfanityFilterMode

ProfanityFilterMode

PunctuationMode

PunctuationMode

Status

Status

Transcription

Transcription

TranscriptionLinks

TranscriptionLinks

TranscriptionProperties

TranscriptionProperties

DetailedErrorCode

DetailedErrorCode

Value Description
AudioLengthLimitExceeded

The audio file is longer than the maximum allowed duration.

BadChannelConfiguration

There is a mismatch between audio channels in the data, in the configuration, or the requirements of the application.

DataImportFailed

Data import failed.

DeleteNotAllowed

Delete not allowed.

DeployNotAllowed

Deploy not allowed.

DeployingFailedModel

Deploying failed model.

EmptyAudioFile

The audio file is empty.

EmptyRequest

Empty Request.

EndpointCannotBeDefault

Endpoint cannot be default.

EndpointLoggingNotSupported

Endpoint logging not supported.

EndpointNotUpdatable

Endpoint not updatable.

EndpointWithoutLogging

Endpoint without logging.

ExceededNumberOfRecordingsUris

Exceeded number of recordings uris.

FailedDataset

Failed dataset.

Forbidden

Forbidden.

InUseViolation

In use violation.

InaccessibleCustomerStorage

Inaccessible customer storage.

InvalidAdaptationMapping

Invalid adaptation mapping.

InvalidAudioFormat

The format of input audio is not supported.

InvalidBaseModel

Invalid base model.

InvalidCallbackUri

Invalid callback uri.

InvalidChannelSpecification

The selection of channels in the transcription request is not supported (for example, neither 0 nor 1 has been selected).

InvalidChannels

Invalid channels.

InvalidCollection

Invalid collection.

InvalidDataset

Invalid dataset.

InvalidDocument

Invalid Document.

InvalidDocumentBatch

Invalid Document Batch.

InvalidLocale

Invalid locale.

InvalidLogDate

Invalid log date.

InvalidLogEndTime

Invalid log end time.

InvalidLogId

Invalid log id.

InvalidLogStartTime

Invalid log start time.

InvalidModel

Invalid model.

InvalidModelUri

Invalid model uri.

InvalidParameter

Invalid parameter.

InvalidParameterValue

Invalid parameter value.

InvalidPayload

Invalid payload.

InvalidPermissions

Invalid permissions.

InvalidPrerequisite

Invalid prerequisite.

InvalidProductId

Invalid product id.

InvalidProject

Invalid project.

InvalidProjectKind

Invalid project kind.

InvalidRecordingsUri

Invalid recordings uri.

InvalidRequestBodyFormat

Invalid request body format.

InvalidSasValidityDuration

Invalid sas validity duration.

InvalidSkipTokenForLogs

Invalid skip token for logs.

InvalidSourceAzureResourceId

Invalid source Azure resource ID.

InvalidSubscription

Invalid subscription.

InvalidTest

Invalid test.

InvalidTimeToLive

Invalid time to live.

InvalidTopForLogs

Invalid top for logs.

InvalidTranscription

Invalid transcription.

InvalidWebHookEventKind

Invalid web hook event kind.

MissingInputRecords

Missing Input Records.

ModelCopyAuthorizationExpired

Expired ModelCopyAuthorization.

ModelDeploymentNotCompleteState

Model deployment not complete state.

ModelDeprecated

Model deprecated.

ModelExists

Model exists.

ModelMismatch

Model mismatch.

ModelNotDeployable

Model not deployable.

ModelVersionIncorrect

Model Version Incorrect.

MultipleLanguagesIdentified

Language Identification recognized multiple languages. No dominant language could be determined.

NoLanguageIdentified

Language Identification did not recognize any language.

NoUtf8WithBom

No utf8 with bom.

OnlyOneOfUrlsOrContainerOrDataset

Only one of urls or container or dataset.

ProjectGenderMismatch

Project gender mismatch.

QuotaViolation

Quota violation.

SingleDefaultEndpoint

Single default endpoint.

SkuLimitsExist

Sku limits exist.

SubscriptionNotFound

Subscription not found.

UnexpectedError

Unexpected error.

UnsupportedClassBasedAdaptation

Unsupported class based adaptation.

UnsupportedDelta

Unsupported delta.

UnsupportedDynamicConfiguration

Unsupported dynamic configuration.

UnsupportedFilter

Unsupported filter.

UnsupportedLanguageCode

Unsupported language code.

UnsupportedOrderBy

Unsupported order by.

UnsupportedPagination

Unsupported pagination.

UnsupportedTimeRange

Unsupported time range.

DiarizationProperties

DiarizationProperties

Name Type Description
enabled

boolean

A value indicating whether speaker diarization is enabled.

maxSpeakers

integer

A hint for the maximum number of speakers for diarization. Must be greater than 1 and less than 36.
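
The maxSpeakers bounds can be checked on the client before a request is built. This is a minimal sketch: the function name check_max_speakers is hypothetical, and treating a missing maxSpeakers as acceptable is an assumption, since this reference does not say whether the hint is required.

```python
def check_max_speakers(diarization):
    """Return True if maxSpeakers is absent or strictly between 1 and 36."""
    value = diarization.get("maxSpeakers")
    if value is None:
        return True  # assumption: the hint is optional
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 1 < value < 36


print(check_max_speakers({"enabled": True, "maxSpeakers": 4}))
```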

EntityError

EntityError

Name Type Description
code

string

The code of this error.

message

string

The message for this error.

EntityReference

EntityReference

Name Type Description
self

string

The location of the referenced entity.

Error

Error

Name Type Description
code

ErrorCode

ErrorCode
High level error codes.

details

Error[]

Additional supportive details regarding the error and/or expected policies.

innerError

InnerError

InnerError
New inner error format that conforms to the Cognitive Services API Guidelines, available at https://microsoft.sharepoint.com/%3Aw%3A/t/CognitiveServicesPMO/EUoytcrjuJdKpeOKIK_QRC8BPtUYQpKBi8JsWyeDMRsWlQ?e=CPq8ow. It contains the required properties ErrorCode and message, and the optional properties target, details (key-value pairs), and inner error (which can be nested).

message

string

High level error message.

target

string

The source of the error. For example, it would be "documents" or "document id" in the case of an invalid document.

ErrorCode

ErrorCode

Value Description
Conflict

Representing the conflict error code.

Forbidden

Representing the forbidden error code.

InternalCommunicationFailed

Representing the internal communication failed error code.

InternalServerError

Representing the internal server error error code.

InvalidArgument

Representing the invalid argument error code.

InvalidRequest

Representing the invalid request error code.

NotAllowed

Representing the not allowed error code.

NotFound

Representing the not found error code.

PipelineError

Representing the pipeline error error code.

ServiceUnavailable

Representing the service unavailable error code.

TooManyRequests

Representing the too many requests error code.

Unauthorized

Representing the unauthorized error code.

UnprocessableEntity

Representing the unprocessable entity error code.

UnsupportedMediaType

Representing the unsupported media type error code.

InnerError

InnerError

Name Type Description
code

DetailedErrorCode

DetailedErrorCode
Detailed error code enum.

details

object

Additional supportive details regarding the error and/or expected policies.

innerError

InnerError

InnerError
New inner error format that conforms to the Cognitive Services API Guidelines, available at https://microsoft.sharepoint.com/%3Aw%3A/t/CognitiveServicesPMO/EUoytcrjuJdKpeOKIK_QRC8BPtUYQpKBi8JsWyeDMRsWlQ?e=CPq8ow. It contains the required properties ErrorCode and message, and the optional properties target, details (key-value pairs), and inner error (which can be nested).

message

string

High level error message.

target

string

The source of the error. For example, it would be "documents" or "document id" in the case of an invalid document.
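
Because innerError can be nested, a client that wants the most specific code has to walk the chain. The sketch below collects the codes from the outermost Error down to the innermost InnerError. The function name error_code_chain and the sample payload (including its message texts) are made up for illustration; only the property names and code values come from this reference.

```python
def error_code_chain(error):
    """Return the error codes from the outermost error to the innermost one."""
    codes = []
    node = error
    while isinstance(node, dict):
        if "code" in node:
            codes.append(node["code"])
        node = node.get("innerError")
    return codes


# Illustrative payload only; real responses may differ.
sample_error = {
    "code": "InvalidArgument",
    "message": "hypothetical high level message",
    "innerError": {
        "code": "UnsupportedFilter",
        "message": "hypothetical detailed message",
    },
}

chain = error_code_chain(sample_error)
print(chain)
```

The last element of the returned list is the most detailed code available.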

LanguageIdentificationMode

LanguageIdentificationMode

Value Description
Continuous

Continuous language identification (Default).

Single

Single language identification. If no language can be identified, the error code NoLanguageIdentified is returned to the user. If there is ambiguity between multiple languages, the error code MultipleLanguagesIdentified is returned to the user.

LanguageIdentificationProperties

LanguageIdentificationProperties

Name Type Default value Description
candidateLocales

string[]

The candidate locales for language identification (example ["en-US", "de-DE", "es-ES"]). A minimum of 2 and a maximum of 10 candidate locales, including the main locale for the transcription, is supported for continuous mode. For single language identification, the maximum number of candidate locales is unbounded.

mode

LanguageIdentificationMode

Continuous

LanguageIdentificationMode
The mode used for language identification.

speechModelMapping

<string,  EntityReference>

An optional mapping of locales to speech model entities. If no model is given for a locale, the default base model is used. Keys must be locales contained in the candidate locales, values are entities for models of the respective locales.
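
The constraints described for candidateLocales and speechModelMapping can be checked before submitting a request. This is a sketch under stated assumptions: the function name is hypothetical, the locale-count check is applied only in Continuous mode because no limits are stated here for Single mode beyond the unbounded maximum, and the model URL in the usage example is a placeholder.

```python
def check_language_identification(props):
    """Return a list of problems found in a LanguageIdentificationProperties dict."""
    problems = []
    candidates = props.get("candidateLocales") or []
    mode = props.get("mode", "Continuous")  # Continuous is the documented default

    if mode not in ("Continuous", "Single"):
        problems.append(f"unknown mode: {mode}")
    if mode == "Continuous" and not 2 <= len(candidates) <= 10:
        problems.append("continuous mode supports 2 to 10 candidate locales")

    # Keys of speechModelMapping must be locales listed in candidateLocales.
    for locale in props.get("speechModelMapping") or {}:
        if locale not in candidates:
            problems.append(f"mapping key {locale} is not a candidate locale")
    return problems


result = check_language_identification(
    {
        "candidateLocales": ["en-US", "de-DE"],
        "speechModelMapping": {"fr-FR": {"self": "https://example.invalid/models/placeholder"}},
    }
)
print(result)
```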

PaginatedTranscriptions

PaginatedTranscriptions

Name Type Description
@nextLink

string

A link to the next set of paginated results if there are more entities available; otherwise null.

values

Transcription[]

A list of entities limited by either the passed query parameters 'skip' and 'top' or their default values.

When iterating through a list using pagination while deleting entities in parallel, some entities will be skipped in the results. It is recommended to fetch the complete list on the client first and to delete entities only afterward.
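
The recommendation above amounts to separating listing from deleting. A minimal sketch of that ordering follows; delete_entity is a hypothetical callable that deletes the entity at a given location (the delete operation itself is not described on this page), and the sample entries are made up.

```python
def delete_after_listing(transcriptions, delete_entity):
    """Collect every "self" link first, then delete, so no page is skipped."""
    # The list comprehension exhausts the whole listing before any delete runs.
    links = [t["self"] for t in transcriptions]
    for link in links:
        delete_entity(link)
    return links


deleted = []
listing = iter([{"self": "transcription-1"}, {"self": "transcription-2"}])
delete_after_listing(listing, deleted.append)
print(deleted)
```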

ProfanityFilterMode

ProfanityFilterMode

Value Description
Masked

Mask the profanity with * except for the first letter, e.g., f***

None

Disable profanity filtering.

Removed

Remove profanity.

Tags

Add "profanity" XML tags: <Profanity>...</Profanity>
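
To make the Masked output shape concrete, the following sketch reproduces the described pattern for a single word. It only illustrates the format (first letter kept, remaining characters replaced by *); which words the service considers profane, and how it handles them internally, is not covered here. The function name mask_word is hypothetical.

```python
def mask_word(word):
    """Keep the first character and replace the rest with asterisks."""
    return word[:1] + "*" * (len(word) - 1)


masked = mask_word("darn")
print(masked)
```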

PunctuationMode

PunctuationMode

Value Description
Automatic

Automatic punctuation.

Dictated

Dictated punctuation marks only, i.e., explicit punctuation.

DictatedAndAutomatic

Dictated punctuation marks or automatic punctuation.

None

No punctuation.

Status

Status

Value Description
Failed

The long running operation has failed.

NotStarted

The long running operation has not yet started.

Running

The long running operation is currently processing.

Succeeded

The long running operation has successfully completed.

Transcription

Transcription

Name Type Description
contentContainerUrl

string

A URL for an Azure blob container that contains the audio files. A container is allowed to have a maximum size of 5GB and a maximum number of 10000 blobs. The maximum size for a blob is 2.5GB. Container SAS should contain 'r' (read) and 'l' (list) permissions. This property will not be returned in a response.

contentUrls

string[]

A list of content urls to get audio files to transcribe. Up to 1000 urls are allowed. This property will not be returned in a response.

createdDateTime

string

The timestamp when the object was created. The timestamp is encoded in ISO 8601 date and time format ("YYYY-MM-DDThh:mm:ssZ", see https://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations).

customProperties

object

The custom properties of this entity. The maximum allowed key length is 64 characters, the maximum allowed value length is 256 characters and the count of allowed entries is 10.

dataset

EntityReference

EntityReference

description

string

The description of the object.

displayName

string

The display name of the object.

lastActionDateTime

string

The timestamp when the current status was entered. The timestamp is encoded in ISO 8601 date and time format ("YYYY-MM-DDThh:mm:ssZ", see https://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations).

links

TranscriptionLinks

TranscriptionLinks

locale

string

The locale of the contained data. If Language Identification is used, this locale is used to transcribe speech for which no language could be detected.

model

EntityReference

EntityReference

properties

TranscriptionProperties

TranscriptionProperties

self

string

The location of this entity.

status

Status

Status
Describes the current state of the API.
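
The customProperties limits listed in the table above (at most 10 entries, keys up to 64 characters, values up to 256 characters) lend themselves to a client-side check. This is a sketch only: the function name is hypothetical, and it assumes values are strings, as in the sample responses.

```python
MAX_ENTRIES = 10
MAX_KEY_LENGTH = 64
MAX_VALUE_LENGTH = 256


def check_custom_properties(custom):
    """Return a list of limit violations for a customProperties dict."""
    problems = []
    if len(custom) > MAX_ENTRIES:
        problems.append(f"too many entries: {len(custom)}")
    for key, value in custom.items():
        if len(key) > MAX_KEY_LENGTH:
            problems.append(f"key too long: {key[:16]}...")
        # Assumption: values are strings, as in the sample responses.
        if len(str(value)) > MAX_VALUE_LENGTH:
            problems.append(f"value too long for key: {key[:16]}")
    return problems


violations = check_custom_properties({"key": "value"})
print(violations)
```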

TranscriptionLinks

Name Type Description
files

string

The location to get all files of this entity. See operation "Transcriptions_ListFiles" for more details.

TranscriptionProperties

TranscriptionProperties

Name Type Default value Description
channels

integer[]

A collection of the requested channel numbers. In the default case, the channels 0 and 1 are considered.

destinationContainerUrl

string

The requested destination container.

Remarks

When a destination container is used in combination with a timeToLive, the metadata of a transcription will be deleted normally, but the data stored in the destination container, including transcription results, will remain untouched, because no delete permissions are required for this container.

To support automatic cleanup, either configure blob lifetimes on the container, or use "Bring your own Storage (BYOS)" instead of destinationContainerUrl, where blobs can be cleaned up.

diarization

DiarizationProperties

DiarizationProperties

displayFormWordLevelTimestampsEnabled

boolean

A value indicating whether word level timestamps for the display form are requested. The default value is false.

durationMilliseconds

integer

0

The duration in milliseconds of the transcription. Durations larger than 2^53-1 are not supported to ensure compatibility with JavaScript integers.

error

EntityError

EntityError

languageIdentification

LanguageIdentificationProperties

LanguageIdentificationProperties

profanityFilterMode

ProfanityFilterMode

ProfanityFilterMode
Mode of profanity filtering.

punctuationMode

PunctuationMode

PunctuationMode
The mode used for punctuation.

timeToLiveHours

integer

How long the transcription will be kept in the system after it has completed. Once the transcription reaches the time to live after completion (successful or failed), it will be automatically deleted.

Note: When using BYOS (bring your own storage), the result files on the customer-owned storage account will also be deleted. Use either destinationContainerUrl to specify a separate container for result files, which will not be deleted when the timeToLive expires, or retrieve the result files through the API and store them as needed.

The shortest supported duration is 6 hours, the longest supported duration is 31 days. 2 days (48 hours) is the recommended default value when data is consumed directly.

wordLevelTimestampsEnabled

boolean

A value indicating whether word level timestamps are requested. The default value is false.
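
The numeric limits in this table can be expressed as a small check. The sketch below converts the 31-day upper bound for timeToLiveHours into hours and applies the 2^53-1 ceiling to durationMilliseconds. The function and constant names are hypothetical, and treating a missing timeToLiveHours as acceptable is an assumption.

```python
MIN_TTL_HOURS = 6
MAX_TTL_HOURS = 31 * 24  # 31 days expressed in hours
MAX_SAFE_INTEGER = 2**53 - 1  # largest integer JavaScript represents exactly


def check_transcription_properties(props):
    """Return a list of problems with timeToLiveHours and durationMilliseconds."""
    problems = []
    ttl = props.get("timeToLiveHours")
    if ttl is not None and not MIN_TTL_HOURS <= ttl <= MAX_TTL_HOURS:
        problems.append(
            f"timeToLiveHours must be between {MIN_TTL_HOURS} and {MAX_TTL_HOURS}"
        )
    duration = props.get("durationMilliseconds", 0)  # 0 is the documented default
    if duration > MAX_SAFE_INTEGER:
        problems.append("durationMilliseconds exceeds 2^53-1")
    return problems


issues = check_transcription_properties(
    {"timeToLiveHours": 48, "durationMilliseconds": 42000}
)
print(issues)
```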