AssemblyAI (Preview)

Reference

Transcribe and extract data from audio using AssemblyAI's Speech AI.

This connector is available in the following products and regions:

Service	Class	Regions
Logic Apps	Standard	All Logic Apps regions except the following: - Azure Government regions - Azure China regions - US Department of Defense (DoD)
Power Automate	Premium	All Power Automate regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD)
Power Apps	Premium	All Power Apps regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD)

Contact
Name	Support
URL	https://www.assemblyai.com/docs/
Email	support@assemblyai.com

Connector Metadata
Publisher	AssemblyAI
Website	https://www.assemblyai.com
Privacy policy	https://www.assemblyai.com/legal/privacy-policy
Categories	AI

With the AssemblyAI Connector, you can use AssemblyAI's models to process audio data by transcribing it with speech recognition models, analyzing it with audio intelligence models, and building generative features on top of it with LLMs.

Speech-To-Text, including many configurable features such as speaker diarization, custom spelling, and custom vocabulary.
Audio Intelligence Models are additional AI models available and configured through the transcription configuration.
LeMUR lets you apply various LLM models to your transcripts without the need to build your own RAG infrastructure for very large transcripts.

Prerequisites

You will need the following to proceed:

An AssemblyAI API key (get one for free)

How to get credentials

You can get an AssemblyAI API key for free by signing up for an account and copying the API key from the dashboard.

Get started with your connector

Follow these steps to transcribe audio using the AssemblyAI connector.

Upload a File

To transcribe an audio file using AssemblyAI, the file needs to be accessible to AssemblyAI. If your audio file is already accessible via a URL, you can use your existing URL.

Otherwise, you can use the Upload a Media File action to upload a file to AssemblyAI. You will get back a URL for your file, which can only be used for transcription with your API key. Once you transcribe the file, the file will be removed from AssemblyAI's servers.

Transcribe Audio

To transcribe your audio, configure the Audio URL parameter using your audio file URL. Then, configure the additional parameters to enable more Speech Recognition features and Audio Intelligence models.

The result of the Transcribe Audio action is a queued transcript which will start being processed immediately. To get the completed transcript, you have two options:

Handle the Transcript Ready Webhook
Poll the Transcript Status

Handle the Transcript Ready Webhook

If you don't want to handle the webhook using Logic Apps or Power Automate, configure the Webhook URL parameter in your Transcribe Audio action, and implement your webhook following AssemblyAI's webhook documentation.

To handle the webhook using Logic Apps or Power Automate, follow these steps:

Create a separate Logic App or Power Automate Flow

Configure When an HTTP request is received as the trigger:

Set Who Can Trigger The Flow? to Anyone

Set Request Body JSON Schema to:

{
  "type": "object",
  "properties": {
    "transcript_id": {
      "type": "string"
    },
    "status": {
      "type": "string"
    }
  }
}

Set Method to POST

Add an AssemblyAI Get Transcript action, passing in the transcript_id from the trigger to the Transcript ID parameter.
Before doing anything else, you should check whether the Status is completed or error. Add a Condition action that checks if the Status from the Get Transcript output is error:
- In the True branch, add a Terminate action
  - Set the Status to Failed
  - Set the Code to Transcript Error
  - Pass the Error from the Get Transcript output to the Message parameter.
- You can leave the False branch empty.
Now you can add any action after the Condition knowing the transcript status is completed, and you can retrieve any of the output properties of the Get Transcript action.
Save your Logic App or Flow. The HTTP URL will be generated for the When an HTTP request is received trigger. Copy the HTTP URL and head back to your original Logic App or Flow.
In your original Logic App or Flow, update the Transcribe Audio action. Paste the HTTP URL you copied previously into the Webhook URL parameter, and save.

When the transcript status becomes completed or error, AssemblyAI will send an HTTP POST request to the webhook URL, which will be handled by your other Logic App or Flow.
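
For readers who implement the webhook outside Logic Apps or Power Automate, the same flow can be sketched in Python. This is a minimal sketch, not AssemblyAI's implementation: `get_transcript` is a hypothetical callable standing in for the Get Transcript action, and `TranscriptError` and `handle_transcript_webhook` are illustrative names.

```python
import json
from typing import Any, Callable, Dict


class TranscriptError(Exception):
    """Raised when the fetched transcript has status 'error' (illustrative name)."""


def handle_transcript_webhook(
    body: str,
    get_transcript: Callable[[str], Dict[str, Any]],
) -> Dict[str, Any]:
    """Parse the POST body, fetch the transcript, and stop on error."""
    payload = json.loads(body)
    # The request schema declares two string properties: transcript_id and status.
    transcript_id = payload["transcript_id"]
    if not isinstance(transcript_id, str):
        raise ValueError("transcript_id must be a string")

    # Stand-in for the Get Transcript action (hypothetical callable).
    transcript = get_transcript(transcript_id)

    # Stand-in for the Condition and Terminate actions.
    if transcript.get("status") == "error":
        raise TranscriptError(transcript.get("error") or "Transcript Error")
    return transcript
```

Passing the fetch function in as an argument keeps the handler independent of any particular HTTP client, so it can be exercised with a stub.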

As an alternative to using the webhook, you can poll the transcript status as explained in the next section.

Poll the Transcript Status

You can poll the transcript status using the following steps:

Add an Initialize variable action
- Set Name to transcript_status
- Set Type to String
- Store the Status from the Transcribe Audio output into the Value parameter
Add a Do until action
- Configure the Loop Until parameter with the following Fx code:
```
or(equals(variables('transcript_status'), 'completed'), equals(variables('transcript_status'), 'error'))
```
  This code checks whether the transcript_status variable is completed or error.
- Configure the Count parameter to 86400
- Configure the Timeout parameter to PT24H
Inside the Do until action, add the following actions:
- Add a Delay action that waits for one second
- Add a Get Transcript action and pass the ID from the Transcribe Audio output to the Transcript ID parameter.
- Add a Set variable action
  - Set Name to transcript_status
  - Pass the Status of the Get Transcript output to the Value parameter
The Do until loop will continue until the transcript is completed or an error occurs.
Add another Get Transcript action, like before, but add it after the Do until loop so its output becomes available outside the scope of the Do until action.

Before doing anything else, you should check whether the transcript Status is completed or error. Add a Condition action that checks if the transcript_status is error:

In the True branch, add a Terminate action
- Set Status to Failed
- Set Code to Transcript Error
- Pass the Error from the Get Transcript output to the Message parameter.
You can leave the False branch empty.

Now you can add any action after the Condition knowing the transcript status is completed, and you can retrieve any of the output properties of the Get Transcript action.
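
The polling steps above can be sketched in Python as follows. This is a rough equivalent under stated assumptions: `get_transcript` is a hypothetical callable standing in for the Get Transcript action, `poll_transcript` is an illustrative name, and only the Count limit is modeled (the PT24H timeout is not).

```python
import time
from typing import Any, Callable, Dict


def poll_transcript(
    transcript_id: str,
    get_transcript: Callable[[str], Dict[str, Any]],
    delay_seconds: float = 1.0,
    max_attempts: int = 86400,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the status is 'completed' or 'error', like the Do until loop."""
    for _ in range(max_attempts):
        sleep(delay_seconds)                        # Delay action
        transcript = get_transcript(transcript_id)  # Get Transcript action
        status = transcript.get("status")           # Set variable action
        if status == "completed":
            return transcript
        if status == "error":
            # Stand-in for the Terminate action in the True branch.
            raise RuntimeError(transcript.get("error") or "Transcript Error")
    raise TimeoutError(
        f"transcript {transcript_id} did not finish within {max_attempts} polls"
    )
```

The `sleep` parameter is injectable so the loop can be tested without real delays.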

Add more actions

Now that you have a completed transcript, you can pass its ID to many other actions, such as:

Get Sentences in Transcript
Get Paragraphs in Transcript
Get Subtitles for Transcript
Get Redacted Audio
Search Words in Transcript
Run a Task Using LeMUR

Known issues and limitations

There are no known issues currently. Streaming Speech-To-Text (real-time) is not supported because it is not possible using custom connectors.

Common errors and remedies

You can find more information about errors in the AssemblyAI documentation.

FAQ

You can find frequently asked questions in our documentation.

Creating a connection

The connector supports the following authentication types:


Default	Parameters for creating connection.	All regions	Not shareable

Default

Applicable: All regions

Parameters for creating connection.

This is not a shareable connection. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.

Name	Type	Description	Required
AssemblyAI API Key	securestring	The AssemblyAI API Key to authenticate the AssemblyAI API.	True

Throttling Limits

Name	Calls	Renewal Period
API calls per connection	100	60 seconds
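
A caller that wants to stay under this limit could pace its own requests. The sketch below is one possible client-side approach, a sliding-window limiter. How the platform itself enforces the limit is not stated here, and `SlidingWindowLimiter` is an illustrative name.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any window of `period` seconds."""

    def __init__(
        self,
        max_calls: int = 100,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Return True and record the call if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False
```

For example, a polling loop that waits one second between calls makes about 60 calls per minute, which fits under a limit of 100 calls per 60 seconds.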

Actions

Delete Transcript	Delete the transcript. Deleting does not delete the resource itself, but removes the data from the resource and marks it as deleted.
Get Paragraphs in Transcript	Get the transcript split by paragraphs. The API will attempt to semantically segment your transcript into paragraphs to create more reader-friendly transcripts.
Get Redacted Audio	Retrieve the redacted audio object containing the status and URL to the redacted audio.
Get Sentences in Transcript	Get the transcript split by sentences. The API will attempt to semantically segment the transcript into sentences to create more reader-friendly transcripts.
Get Subtitles for Transcript	Export your transcript in SRT or VTT format to use with a video player for subtitles and closed captions.
Get Transcript	Get the transcript resource. The transcript is ready when the "status" is "completed".
List Transcripts	Retrieve a list of transcripts you created. Transcripts are sorted from newest to oldest. The previous URL always points to a page with older transcripts.
Purge LeMUR Request Data	Delete the data for a previously submitted LeMUR request. The LLM response data, as well as any context provided in the original request will be removed.
Retrieve LeMUR Response	Retrieve a LeMUR response that was previously generated.
Run a Task Using LeMUR	Use the LeMUR task endpoint to input your own LLM prompt.
Search Words in Transcript	Search through the transcript for keywords. You can search for individual words, numbers, or phrases containing up to five words or numbers.
Transcribe Audio	Create a transcript from a media file that is accessible via a URL.
Upload a Media File	Upload a media file to AssemblyAI's servers.

Delete Transcript

Operation ID: DeleteTranscript

Delete the transcript. Deleting does not delete the resource itself, but removes the data from the resource and marks it as deleted.

Parameters

Name	Key	Required	Type	Description
Transcript ID	transcript_id	True	string	ID of the transcript

Returns

A transcript object

Body: Transcript

Get Paragraphs in Transcript

Operation ID: GetTranscriptParagraphs

Get the transcript split by paragraphs. The API will attempt to semantically segment your transcript into paragraphs to create more reader-friendly transcripts.

Parameters

Name	Key	Required	Type	Description
Transcript ID	transcript_id	True	string	ID of the transcript

Returns

Body: ParagraphsResponse

Get Redacted Audio

Operation ID: GetRedactedAudio

Retrieve the redacted audio object containing the status and URL to the redacted audio.

Parameters

Name	Key	Required	Type	Description
Transcript ID	transcript_id	True	string	ID of the transcript

Returns

Body: RedactedAudioResponse

Get Sentences in Transcript

Operation ID: GetTranscriptSentences

Get the transcript split by sentences. The API will attempt to semantically segment the transcript into sentences to create more reader-friendly transcripts.

Parameters

Name	Key	Required	Type	Description
Transcript ID	transcript_id	True	string	ID of the transcript

Returns

Body: SentencesResponse

Get Subtitles for Transcript

Operation ID: GetSubtitles

Export your transcript in SRT or VTT format to use with a video player for subtitles and closed captions.

Parameters

Name	Key	Required	Type	Description
Transcript ID	transcript_id	True	string	ID of the transcript
Subtitle Format	subtitle_format	True	string	Format of the subtitles
Number of Characters per Caption	chars_per_caption		integer	The maximum number of characters per caption

Returns

response: string

Get Transcript

Operation ID: GetTranscript

Get the transcript resource. The transcript is ready when the "status" is "completed".

Parameters

Name	Key	Required	Type	Description
Transcript ID	transcript_id	True	string	ID of the transcript

Returns

A transcript object

Body: Transcript

List Transcripts

Operation ID: ListTranscripts

Retrieve a list of transcripts you created. Transcripts are sorted from newest to oldest. The previous URL always points to a page with older transcripts.

Parameters

Name	Key	Type	Description
Limit	limit	integer	Maximum number of transcripts to retrieve
Status	status	string	The status of your transcript. Possible values are queued, processing, completed, or error.
Created On	created_on	date	Only get transcripts created on this date
Before ID	before_id	uuid	Get transcripts that were created before this transcript ID
After ID	after_id	uuid	Get transcripts that were created after this transcript ID
Throttled Only	throttled_only	boolean	Only get throttled transcripts, overrides the status filter

Returns

A list of transcripts. Transcripts are sorted from newest to oldest. The previous URL always points to a page with older transcripts.

Body: TranscriptList

Purge LeMUR Request Data

Operation ID: PurgeLemurRequestData

Delete the data for a previously submitted LeMUR request. The LLM response data, as well as any context provided in the original request will be removed.

Parameters

Name	Key	Required	Type	Description
LeMUR Request ID	request_id	True	string	The ID of the LeMUR request whose data you want to delete. This would be found in the response of the original request.

Returns

Body: PurgeLemurRequestDataResponse

Retrieve LeMUR Response

Operation ID: GetLemurResponse

Retrieve a LeMUR response that was previously generated.

Parameters

Name	Key	Required	Type	Description
LeMUR Request ID	request_id	True	string	The ID of the LeMUR request you previously made. This would be found in the response of the original request.

Returns

Body: LemurResponse

Run a Task Using LeMUR

Operation ID: LemurTask

Use the LeMUR task endpoint to input your own LLM prompt.

Parameters

Name	Key	Required	Type	Description
Prompt	prompt	True	string	Your text to prompt the model to produce a desired output, including any context you want to pass into the model.
Transcript IDs	transcript_ids		array of uuid	A list of completed transcripts with text. Up to a maximum of 100 files or 100 hours, whichever is lower. Use either transcript_ids or input_text as input into LeMUR.
Input Text	input_text		string	Custom formatted transcript data. Maximum size is the context limit of the selected model, which defaults to 100000. Use either transcript_ids or input_text as input into LeMUR.
Context	context		string	Context to provide the model. This can be a string or a free-form JSON value.
Final Model	final_model		string	The model that is used for the final prompt after compression is performed.
Maximum Output Size	max_output_size		integer	Max output size in tokens, up to 4000
Temperature	temperature		float	The temperature to use for the model. Higher values result in answers that are more creative, lower values are more conservative. Can be any value between 0.0 and 1.0 inclusive.
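
The constraints in this table can be checked before a request is made. The following is a minimal sketch: `build_lemur_task_body` is an illustrative helper, not part of the connector. Reading "use either transcript_ids or input_text" as "exactly one of the two" is an assumption, as is the lower bound of 1 on the output size.

```python
from typing import Any, Dict, List, Optional


def build_lemur_task_body(
    prompt: str,
    transcript_ids: Optional[List[str]] = None,
    input_text: Optional[str] = None,
    max_output_size: Optional[int] = None,
    temperature: Optional[float] = None,
) -> Dict[str, Any]:
    """Validate LeMUR task parameters and return them keyed by the Key column."""
    if not prompt:
        raise ValueError("prompt is required")
    # Assumption: exactly one of transcript_ids / input_text must be given.
    if (transcript_ids is None) == (input_text is None):
        raise ValueError("provide exactly one of transcript_ids or input_text")
    if transcript_ids is not None and not 1 <= len(transcript_ids) <= 100:
        raise ValueError("transcript_ids must contain between 1 and 100 IDs")
    if max_output_size is not None and not 1 <= max_output_size <= 4000:
        raise ValueError("max_output_size must be between 1 and 4000 tokens")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0 inclusive")

    body: Dict[str, Any] = {"prompt": prompt}
    optional = {
        "transcript_ids": transcript_ids,
        "input_text": input_text,
        "max_output_size": max_output_size,
        "temperature": temperature,
    }
    # Only include the parameters the caller actually set.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```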

Returns

Body: LemurTaskResponse

Search Words in Transcript

Operation ID: WordSearch

Search through the transcript for keywords. You can search for individual words, numbers, or phrases containing up to five words or numbers.

Parameters

Name	Key	Required	Type	Description
Transcript ID	transcript_id	True	string	ID of the transcript
Words	words	True	array	Keywords to search for

Returns

Body: WordSearchResponse

Transcribe Audio

Operation ID: CreateTranscript

Create a transcript from a media file that is accessible via a URL.

Parameters

Name	Key	Required	Type	Description
Audio URL	audio_url	True	string	The URL of the audio or video file to transcribe.
Language Code	language_code		string	The language of your audio file. Possible values are found in Supported Languages. The default value is 'en_us'.
Language Detection	language_detection		boolean	Enable Automatic language detection, either true or false.
Speech Model	speech_model		string	The speech model to use for the transcription.
Punctuate	punctuate		boolean	Enable Automatic Punctuation, can be true or false
Format Text	format_text		boolean	Enable Text Formatting, can be true or false
Disfluencies	disfluencies		boolean	Transcribe Filler Words, like "umm", in your media file; can be true or false
Dual Channel	dual_channel		boolean	Enable Dual Channel transcription, can be true or false.
Webhook URL	webhook_url		string	The URL to which we send webhook requests. We send two different types of webhook requests: one request when a transcript is completed or failed, and one request when the redacted audio is ready if redact_pii_audio is enabled.
Webhook Auth Header Name	webhook_auth_header_name		string	The header name to be sent with the transcript completed or failed webhook requests
Webhook Auth Header Value	webhook_auth_header_value		string	The header value to send back with the transcript completed or failed webhook requests for added security
Key Phrases	auto_highlights		boolean	Enable Key Phrases, either true or false
Audio Start From	audio_start_from		integer	The point in time, in milliseconds, to begin transcribing in your media file
Audio End At	audio_end_at		integer	The point in time, in milliseconds, to stop transcribing in your media file
Word Boost	word_boost		array of string	The list of custom vocabulary to boost transcription probability for
Word Boost Level	boost_param		string	How much to boost specified words
Filter Profanity	filter_profanity		boolean	Filter profanity from the transcribed text, can be true or false
Redact PII	redact_pii		boolean	Redact PII from the transcribed text using the Redact PII model, can be true or false
Redact PII Audio	redact_pii_audio		boolean	Generate a copy of the original media file with spoken PII "beeped" out, can be true or false. See PII redaction for more details.
Redact PII Audio Quality	redact_pii_audio_quality		string	Controls the filetype of the audio created by redact_pii_audio. Currently supports mp3 (default) and wav. See PII redaction for more details.
Redact PII Policies	redact_pii_policies		array of string	The list of PII Redaction policies to enable. See PII redaction for more details.
Redact PII Substitution	redact_pii_sub		string	The replacement logic for detected PII, can be "entity_name" or "hash". See PII redaction for more details.
Speaker Labels	speaker_labels		boolean	Enable Speaker diarization, can be true or false
Speakers Expected	speakers_expected		integer	Tells the speaker label model how many speakers it should attempt to identify, up to 10. See Speaker diarization for more details.
Content Moderation	content_safety		boolean	Enable Content Moderation, can be true or false
Content Moderation Confidence	content_safety_confidence		integer	The confidence threshold for the Content Moderation model. Values must be between 25 and 100.
Topic Detection	iab_categories		boolean	Enable Topic Detection, can be true or false
From	from	True	array of string	Words or phrases to replace
To	to	True	string	Word or phrase to replace with
Sentiment Analysis	sentiment_analysis		boolean	Enable Sentiment Analysis, can be true or false
Auto Chapters	auto_chapters		boolean	Enable Auto Chapters, can be true or false
Entity Detection	entity_detection		boolean	Enable Entity Detection, can be true or false
Speech Threshold	speech_threshold		float	Reject audio files that contain less than this fraction of speech. Valid values are in the range [0, 1] inclusive.
Enable Summarization	summarization		boolean	Enable Summarization, can be true or false
Summary Model	summary_model		string	The model to summarize the transcript
Summary Type	summary_type		string	The type of summary
Enable Custom Topics	custom_topics		boolean	Enable custom topics, either true or false
Custom Topics	topics		array of string	The list of custom topics
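
A few of the documented ranges can be validated up front. The sketch below covers only a small subset of the parameters, and `build_transcribe_body` is an illustrative helper rather than part of the connector. Grouping From and To under a `custom_spelling` list is inferred from the Transcript definition, and the lower bound of 1 for `speakers_expected` is an assumption.

```python
from typing import Any, Dict, List, Optional


def build_transcribe_body(
    audio_url: str,
    *,
    speaker_labels: Optional[bool] = None,
    speakers_expected: Optional[int] = None,
    content_safety_confidence: Optional[int] = None,
    speech_threshold: Optional[float] = None,
    custom_spelling: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, Any]:
    """Validate a subset of Transcribe Audio parameters, keyed by the Key column."""
    if not audio_url:
        raise ValueError("audio_url is required")
    if speakers_expected is not None and not 1 <= speakers_expected <= 10:
        raise ValueError("speakers_expected must be between 1 and 10")
    if content_safety_confidence is not None and not 25 <= content_safety_confidence <= 100:
        raise ValueError("content_safety_confidence must be between 25 and 100")
    if speech_threshold is not None and not 0.0 <= speech_threshold <= 1.0:
        raise ValueError("speech_threshold must be in the range [0, 1]")

    body: Dict[str, Any] = {"audio_url": audio_url}
    optional = {
        "speaker_labels": speaker_labels,
        "speakers_expected": speakers_expected,
        "content_safety_confidence": content_safety_confidence,
        "speech_threshold": speech_threshold,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    if custom_spelling:
        # Each entry maps a list of "from" phrases to a single "to" replacement.
        body["custom_spelling"] = [
            {"from": sources, "to": target}
            for target, sources in custom_spelling.items()
        ]
    return body
```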

Returns

A transcript object

Body: Transcript

Upload a Media File

Operation ID: UploadFile

Upload a media file to AssemblyAI's servers.

Parameters

Name	Key	Required	Type	Description
File Content	file	True	binary	The file to upload.

Returns

Body: UploadedFile

Definitions

RedactedAudioResponse

Name	Path	Type	Description
Status	status	string	The status of the redacted audio
Redacted Audio URL	redacted_audio_url	string	The URL of the redacted audio file

WordSearchResponse

Name	Path	Type	Description
Transcript ID	id	uuid	The ID of the transcript
Total Count of Matches	total_count	integer	The total count of all matched instances. For example, if word 1 matched 2 times and word 2 matched 3 times, total_count will equal 5.
Matches	matches	array of object	The matches of the search
Text	matches.text	string	The matched word
Count	matches.count	integer	The total number of times the word is in the transcript
Timestamps	matches.timestamps	array of array	An array of timestamps
Timestamp	matches.timestamps	array of integer	An array of timestamps structured as [start_time, end_time] in milliseconds
Indexes	matches.indexes	array of integer	An array of all index locations for that word within the words array of the completed transcript
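
To illustrate how these fields relate, the sketch below reads a response shaped like the table above. The sample values are invented for illustration, and `total_matches` and `match_spans` are hypothetical helper names.

```python
from typing import Any, Dict, List, Tuple


def total_matches(response: Dict[str, Any]) -> int:
    """Sum the per-word counts; this should equal total_count."""
    return sum(match["count"] for match in response["matches"])


def match_spans(response: Dict[str, Any]) -> Dict[str, List[Tuple[int, int]]]:
    """Map each matched word to its (start_ms, end_ms) spans."""
    return {
        match["text"]: [(start, end) for start, end in match["timestamps"]]
        for match in response["matches"]
    }


# Invented sample data following the documented field paths.
sample = {
    "id": "00000000-0000-0000-0000-000000000000",
    "total_count": 5,
    "matches": [
        {"text": "foo", "count": 2,
         "timestamps": [[100, 200], [900, 1000]], "indexes": [1, 9]},
        {"text": "bar", "count": 3,
         "timestamps": [[300, 400], [500, 600], [700, 800]], "indexes": [3, 5, 7]},
    ],
}
```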

Transcript

A transcript object

Name	Path	Type	Description
ID	id	uuid	The unique identifier of your transcript
Audio URL	audio_url	string	The URL of the media that was transcribed
Status	status	string	The status of your transcript. Possible values are queued, processing, completed, or error.
Language Code	language_code	string	The language of your audio file. Possible values are found in Supported Languages. The default value is 'en_us'.
Language Detection	language_detection	boolean	Whether Automatic language detection is enabled, either true or false
Speech Model	speech_model	string	The speech model to use for the transcription.
Text	text	string	The textual transcript of your media file
Words	words	array of object	An array of temporally-sequential word objects, one for each word in the transcript. See Speech recognition for more information.
Confidence	words.confidence	double
Start	words.start	integer
End	words.end	integer
Text	words.text	string
Speaker	words.speaker	string	The speaker of the sentence if Speaker Diarization is enabled, else null
Utterances	utterances	array of object	When dual_channel or speaker_labels is enabled, a list of turn-by-turn utterance objects. See Speaker diarization for more information.
Confidence	utterances.confidence	double	The confidence score for the transcript of this utterance
Start	utterances.start	integer	The starting time, in milliseconds, of the utterance in the audio file
End	utterances.end	integer	The ending time, in milliseconds, of the utterance in the audio file
Text	utterances.text	string	The text for this utterance
Words	utterances.words	array of object	The words in the utterance.
Confidence	utterances.words.confidence	double
Start	utterances.words.start	integer
End	utterances.words.end	integer
Text	utterances.words.text	string
Speaker	utterances.words.speaker	string	The speaker of the sentence if Speaker Diarization is enabled, else null
Speaker	utterances.speaker	string	The speaker of this utterance, where each speaker is assigned a sequential capital letter - e.g. "A" for Speaker A, "B" for Speaker B, etc.
Confidence	confidence	double	The confidence score for the transcript, between 0.0 (low confidence) and 1.0 (high confidence)
Audio Duration	audio_duration	integer	The duration of this transcript object's media file, in seconds
Punctuate	punctuate	boolean	Whether Automatic Punctuation is enabled, either true or false
Format Text	format_text	boolean	Whether Text Formatting is enabled, either true or false
Disfluencies	disfluencies	boolean	Transcribe Filler Words, like "umm", in your media file; can be true or false
Dual Channel	dual_channel	boolean	Whether Dual channel transcription was enabled in the transcription request, either true or false
Webhook URL	webhook_url	string	The URL to which we send webhook requests. We send two different types of webhook requests: one request when a transcript is completed or failed, and one request when the redacted audio is ready if redact_pii_audio is enabled.
Webhook HTTP Status Code	webhook_status_code	integer	The status code we received from your server when delivering the transcript completed or failed webhook request, if a webhook URL was provided
Webhook Auth Enabled	webhook_auth	boolean	Whether webhook authentication details were provided
Webhook Auth Header Name	webhook_auth_header_name	string	The header name to be sent with the transcript completed or failed webhook requests
Speed Boost	speed_boost	boolean	Whether speed boost is enabled
Key Phrases	auto_highlights	boolean	Whether Key Phrases is enabled, either true or false
Status	auto_highlights_result.status	string	Either success, or unavailable in the rare case that the model failed
Results	auto_highlights_result.results	array of object	A temporally-sequential array of Key Phrases
Count	auto_highlights_result.results.count	integer	The total number of times the key phrase appears in the audio file
Rank	auto_highlights_result.results.rank	float	The total relevancy to the overall audio file of this key phrase - a greater number means more relevant
Text	auto_highlights_result.results.text	string	The text itself of the key phrase
Timestamps	auto_highlights_result.results.timestamps	array of object	The timestamps of the key phrase
Start	auto_highlights_result.results.timestamps.start	integer	The start time in milliseconds
End	auto_highlights_result.results.timestamps.end	integer	The end time in milliseconds
Audio Start From	audio_start_from	integer	The point in time, in milliseconds, in the file at which the transcription was started
Audio End At	audio_end_at	integer	The point in time, in milliseconds, in the file at which the transcription was terminated
Word Boost	word_boost	array of string	The list of custom vocabulary to boost transcription probability for
Boost	boost_param	string	The word boost parameter value
Filter Profanity	filter_profanity	boolean	Whether Profanity Filtering is enabled, either true or false
Redact PII	redact_pii	boolean	Whether PII Redaction is enabled, either true or false
Redact PII Audio	redact_pii_audio	boolean	Whether a redacted version of the audio file was generated, either true or false. See PII redaction for more information.
Redact PII Audio Quality	redact_pii_audio_quality	string	Controls the filetype of the audio created by redact_pii_audio. Currently supports mp3 (default) and wav. See PII redaction for more details.
Redact PII Policies	redact_pii_policies	array of string	The list of PII Redaction policies that were enabled, if PII Redaction is enabled. See PII redaction for more information.
Redact PII Substitution	redact_pii_sub	string	The replacement logic for detected PII, can be "entity_name" or "hash". See PII redaction for more details.
Speaker Labels	speaker_labels	boolean	Whether Speaker diarization is enabled, can be true or false
Speakers Expected	speakers_expected	integer	Tell the speaker label model how many speakers it should attempt to identify, up to 10. See Speaker diarization for more details.
Content Moderation	content_safety	boolean	Whether Content Moderation is enabled, can be true or false
Status	content_safety_labels.status	string	Either success, or unavailable in the rare case that the model failed
Results	content_safety_labels.results	array of object
Text	content_safety_labels.results.text	string	The transcript of the section flagged by the Content Moderation model
Labels	content_safety_labels.results.labels	array of object	An array of safety labels, one per sensitive topic that was detected in the section
Label	content_safety_labels.results.labels.label	string	The label of the sensitive topic
Confidence	content_safety_labels.results.labels.confidence	double	The confidence score for the topic being discussed, from 0 to 1
Severity	content_safety_labels.results.labels.severity	double	How severely the topic is discussed in the section, from 0 to 1
Sentence Index Start	content_safety_labels.results.sentences_idx_start	integer	The sentence index at which the section begins
Sentence Index End	content_safety_labels.results.sentences_idx_end	integer	The sentence index at which the section ends
Start	content_safety_labels.results.timestamp.start	integer	The start time in milliseconds
End	content_safety_labels.results.timestamp.end	integer	The end time in milliseconds
Summary	content_safety_labels.summary	object	A summary of the Content Moderation confidence results for the entire audio file
Severity Score Summary	content_safety_labels.severity_score_summary	object	A summary of the Content Moderation severity results for the entire audio file
Topic Detection	iab_categories	boolean	Whether Topic Detection is enabled, can be true or false
Status	iab_categories_result.status	string	Either success, or unavailable in the rare case that the model failed
Results	iab_categories_result.results	array of object	An array of results for the Topic Detection model
Text	iab_categories_result.results.text	string	The text in the transcript in which a detected topic occurs
Labels	iab_categories_result.results.labels	array of object
Relevance	iab_categories_result.results.labels.relevance	double	How relevant the detected topic is
Label	iab_categories_result.results.labels.label	string	The IAB taxonomical label of the detected topic, where > denotes a supertopic/subtopic relationship
Start	iab_categories_result.results.timestamp.start	integer	The start time in milliseconds
End	iab_categories_result.results.timestamp.end	integer	The end time in milliseconds
Summary	iab_categories_result.summary	object	The overall relevance of topic to the entire audio file
Custom Spellings	custom_spelling	array of object	Customize how words are spelled and formatted using to and from values
From	custom_spelling.from	array of string	Words or phrases to replace
To	custom_spelling.to	string	Word or phrase to replace with
Auto Chapters Enabled	auto_chapters	boolean	Whether Auto Chapters is enabled, can be true or false
Chapters	chapters	array of object	An array of temporally sequential chapters for the audio file
Gist	chapters.gist	string	An ultra-short summary (just a few words) of the content spoken in the chapter
Headline	chapters.headline	string	A single sentence summary of the content spoken during the chapter
Summary	chapters.summary	string	A one paragraph summary of the content spoken during the chapter
Start	chapters.start	integer	The starting time, in milliseconds, for the chapter
End	chapters.end	integer	The ending time, in milliseconds, for the chapter
Summarization Enabled	summarization	boolean	Whether Summarization is enabled, either true or false
Summary Type	summary_type	string	The type of summary generated, if Summarization is enabled
Summary Model	summary_model	string	The Summarization model used to generate the summary, if Summarization is enabled
Summary	summary	string	The generated summary of the media file, if Summarization is enabled
Custom Topics Enabled	custom_topics	boolean	Whether custom topics is enabled, either true or false
Topics	topics	array of string	The list of custom topics provided if custom topics is enabled
Sentiment Analysis	sentiment_analysis	boolean	Whether Sentiment Analysis is enabled, can be true or false
Sentiment Analysis Results	sentiment_analysis_results	array of object	An array of results for the Sentiment Analysis model, if it is enabled. See Sentiment Analysis for more information.
Text	sentiment_analysis_results.text	string	The transcript of the sentence
Start	sentiment_analysis_results.start	integer	The starting time, in milliseconds, of the sentence
End	sentiment_analysis_results.end	integer	The ending time, in milliseconds, of the sentence
Sentiment	sentiment_analysis_results.sentiment		The detected sentiment for the sentence, one of POSITIVE, NEUTRAL, NEGATIVE
Confidence	sentiment_analysis_results.confidence	double	The confidence score for the detected sentiment of the sentence, from 0 to 1
Speaker	sentiment_analysis_results.speaker	string	The speaker of the sentence if Speaker Diarization is enabled, else null
Entity Detection	entity_detection	boolean	Whether Entity Detection is enabled, can be true or false
Entities	entities	array of object	An array of results for the Entity Detection model, if it is enabled. See Entity detection for more information.
Entity Type	entities.entity_type	string	The type of entity for the detected entity
Text	entities.text	string	The text for the detected entity
Start	entities.start	integer	The starting time, in milliseconds, at which the detected entity appears in the audio file
End	entities.end	integer	The ending time, in milliseconds, for the detected entity in the audio file
Speech Threshold	speech_threshold	float	Defaults to null. Reject audio files that contain less than this fraction of speech. Valid values are in the range [0, 1] inclusive.
Throttled	throttled	boolean	True while a request is throttled and false when a request is no longer throttled
Error	error	string	Error message explaining why the transcript failed
Language Model	language_model	string	The language model that was used for the transcript
Acoustic Model	acoustic_model	string	The acoustic model that was used for the transcript
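
To illustrate how the Sentiment Analysis fields above fit together, here is a minimal Python sketch that tallies sentence sentiments per speaker from a Transcript-shaped dictionary. The sample payload and the helper name `sentiment_by_speaker` are hypothetical. Only the field names come from the table, and the sketch assumes the response has already been parsed from JSON.

```python
from collections import Counter, defaultdict


def sentiment_by_speaker(transcript):
    """Count POSITIVE/NEUTRAL/NEGATIVE sentences for each speaker."""
    counts = defaultdict(Counter)
    # The results array may be missing or null when Sentiment Analysis is disabled.
    for result in transcript.get("sentiment_analysis_results") or []:
        # speaker is null unless Speaker Diarization is enabled.
        speaker = result.get("speaker") or "unknown"
        counts[speaker][result["sentiment"]] += 1
    return {speaker: dict(tally) for speaker, tally in counts.items()}


# Hypothetical sample data shaped like the Transcript definition.
sample_transcript = {
    "sentiment_analysis": True,
    "sentiment_analysis_results": [
        {"text": "I love it.", "start": 0, "end": 900,
         "sentiment": "POSITIVE", "confidence": 0.97, "speaker": "A"},
        {"text": "It is fine.", "start": 1000, "end": 1800,
         "sentiment": "NEUTRAL", "confidence": 0.71, "speaker": "B"},
        {"text": "Great work.", "start": 2000, "end": 2700,
         "sentiment": "POSITIVE", "confidence": 0.93, "speaker": "A"},
    ],
}

print(sentiment_by_speaker(sample_transcript))
```

Treating a null speaker as "unknown" keeps the tally usable for transcripts created without Speaker Diarization.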

SentencesResponse

Name	Path	Type	Description
Transcript ID	id	uuid
Confidence	confidence	double
Audio Duration	audio_duration	number
Sentences	sentences	array of object
Text	sentences.text	string
Start	sentences.start	integer
End	sentences.end	integer
Confidence	sentences.confidence	double
Words	sentences.words	array of object
Confidence	sentences.words.confidence	double
Start	sentences.words.start	integer
End	sentences.words.end	integer
Text	sentences.words.text	string
Speaker	sentences.words.speaker	string	The speaker of the word if Speaker Diarization is enabled, else null
Speaker	sentences.speaker	string	The speaker of the sentence if Speaker Diarization is enabled, else null
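
The start and end values in this response are integers. The sketch below assumes they are milliseconds, as in the Transcript definition, and shows one way to turn a SentencesResponse-shaped dictionary into readable lines. The helper names and the sample payload are hypothetical.

```python
def format_timestamp(ms):
    """Convert a millisecond offset to an HH:MM:SS.mmm string."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def sentence_lines(response):
    """Return one formatted line per sentence in the response."""
    lines = []
    for sentence in response.get("sentences", []):
        speaker = sentence.get("speaker")  # null without Speaker Diarization
        prefix = f"Speaker {speaker}: " if speaker else ""
        start = format_timestamp(sentence["start"])
        end = format_timestamp(sentence["end"])
        lines.append(f"[{start} - {end}] {prefix}{sentence['text']}")
    return lines


# Hypothetical sample data shaped like the SentencesResponse definition.
sample_response = {
    "sentences": [
        {"text": "Hello there.", "start": 250, "end": 1500, "speaker": "A"},
        {"text": "Hi.", "start": 1600, "end": 2000, "speaker": None},
    ],
}

for line in sentence_lines(sample_response):
    print(line)
```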

ParagraphsResponse

Name	Path	Type	Description
Transcript ID	id	uuid
Confidence	confidence	double
Audio Duration	audio_duration	number
Paragraphs	paragraphs	array of object
Text	paragraphs.text	string
Start	paragraphs.start	integer
End	paragraphs.end	integer
Confidence	paragraphs.confidence	double
Words	paragraphs.words	array of object
Confidence	paragraphs.words.confidence	double
Start	paragraphs.words.start	integer
End	paragraphs.words.end	integer
Text	paragraphs.words.text	string
Speaker	paragraphs.words.speaker	string	The speaker of the word if Speaker Diarization is enabled, else null
Speaker	paragraphs.speaker	string	The speaker of the paragraph if Speaker Diarization is enabled, else null

TranscriptList

A list of transcripts. Transcripts are sorted from newest to oldest. The previous URL always points to a page with older transcripts.

Name	Path	Type	Description
Limit	page_details.limit	integer	The number of results this page is limited to
Result Count	page_details.result_count	integer	The actual number of results in the page
Current URL	page_details.current_url	string	The URL used to retrieve the current page of transcripts
Previous URL	page_details.prev_url	string	The URL to the previous page of transcripts. The previous URL always points to a page with older transcripts.
Next URL	page_details.next_url	string	The URL to the next page of transcripts. The next URL always points to a page with newer transcripts.
Transcripts	transcripts	array of object
ID	transcripts.id	uuid
Resource URL	transcripts.resource_url	string
Status	transcripts.status	string	The status of your transcript. Possible values are queued, processing, completed, or error.
Created	transcripts.created	string
Completed	transcripts.completed	string
Audio URL	transcripts.audio_url	string
Error	transcripts.error	string	Error message explaining why the transcript failed
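
Because transcripts are sorted from newest to oldest, walking the whole list means following the previous URL from page to page. The following Python sketch shows that loop under two assumptions: that `prev_url` is null or empty on the oldest page, and that a caller-supplied `fetch_page` function (hypothetical) returns a parsed TranscriptList for a given URL. The in-memory pages and `example.com` URLs stand in for real requests.

```python
def iter_all_transcripts(first_page, fetch_page):
    """Yield transcripts from newest to oldest by following prev_url."""
    page = first_page
    while page is not None:
        yield from page.get("transcripts", [])
        prev_url = page["page_details"].get("prev_url")
        # Assumption: a missing or null prev_url means there are no older pages.
        page = fetch_page(prev_url) if prev_url else None


# Hypothetical pages keyed by URL, used here instead of real HTTP calls.
fake_pages = {
    "https://example.com/page-1": {
        "page_details": {"limit": 2, "result_count": 2,
                         "prev_url": "https://example.com/page-2"},
        "transcripts": [{"id": "t3", "status": "completed"},
                        {"id": "t2", "status": "error"}],
    },
    "https://example.com/page-2": {
        "page_details": {"limit": 2, "result_count": 1, "prev_url": None},
        "transcripts": [{"id": "t1", "status": "completed"}],
    },
}

first = fake_pages["https://example.com/page-1"]
ids = [t["id"] for t in iter_all_transcripts(first, fake_pages.get)]
print(ids)
```

Passing the fetch function in as an argument keeps the paging logic independent of whichever HTTP client or connector action actually retrieves each page.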

UploadedFile

Name	Path	Type	Description
Uploaded File URL	upload_url	string	A URL that points to your audio file, accessible only by AssemblyAI's servers

PurgeLemurRequestDataResponse

Name	Path	Type	Description
Purge Request ID	request_id	uuid	The ID of the deletion request of the LeMUR request
LeMUR Request ID to Purge	request_id_to_purge	uuid	The ID of the LeMUR request to purge the data for
Deleted	deleted	boolean	Whether the request data was deleted

LemurTaskResponse

Name	Path	Type	Description
Response	response	string	The response generated by LeMUR.
LeMUR Request ID	request_id	uuid	The ID of the LeMUR request
Input Tokens	usage.input_tokens	integer	The number of input tokens used by the model
Output Tokens	usage.output_tokens	integer	The number of output tokens generated by the model

LemurResponse

Name	Path	Type	Description
Response	response	string	The response generated by LeMUR.
LeMUR Request ID	request_id	uuid	The ID of the LeMUR request
Input Tokens	usage.input_tokens	integer	The number of input tokens used by the model
Output Tokens	usage.output_tokens	integer	The number of output tokens generated by the model
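
LemurTaskResponse and LemurResponse both report token usage under `usage`. As a small sketch, the helper below totals input and output tokens across several parsed responses. The function name and the sample values are hypothetical.

```python
def total_lemur_tokens(responses):
    """Sum input and output token counts across LeMUR responses."""
    return {
        "input_tokens": sum(r["usage"]["input_tokens"] for r in responses),
        "output_tokens": sum(r["usage"]["output_tokens"] for r in responses),
    }


# Hypothetical sample data shaped like the LemurResponse definition.
sample_responses = [
    {"response": "Summary A", "request_id": "req-1",
     "usage": {"input_tokens": 1200, "output_tokens": 150}},
    {"response": "Summary B", "request_id": "req-2",
     "usage": {"input_tokens": 800, "output_tokens": 90}},
]

print(total_lemur_tokens(sample_responses))
```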

string

This is the basic data type 'string'.
