Rev.ai Provider
The Rev.ai provider contains language model support for the Rev.ai transcription API.
Setup
The Rev.ai provider is available in the @ai-sdk/revai module. You can install it with
pnpm add @ai-sdk/revai
Provider Instance
You can import the default provider instance revai from @ai-sdk/revai:
import { revai } from '@ai-sdk/revai';If you need a customized setup, you can import createRevai from @ai-sdk/revai and create a provider instance with your settings:
import { createRevai } from '@ai-sdk/revai';
const revai = createRevai({ // custom settings, e.g. fetch: customFetch,});You can use the following optional settings to customize the Rev.ai provider instance:
-
apiKey string
API key that is being sent using the
Authorizationheader. It defaults to theREVAI_API_KEYenvironment variable. -
headers Record<string,string>
Custom headers to include in the requests.
-
fetch (input: RequestInfo, init?: RequestInit) => Promise<Response>
Custom fetch implementation. Defaults to the global
fetchfunction. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing.
Transcription Models
You can create models that call the Rev.ai transcription API
using the .transcription() factory method.
The first argument is the model id e.g. machine.
const model = revai.transcription('machine');You can also pass additional provider-specific options using the providerOptions argument. For example, supplying the input language in ISO-639-1 (e.g. en) format can sometimes improve transcription performance if known beforehand.
import { experimental_transcribe as transcribe } from 'ai';import { revai } from '@ai-sdk/revai';import { readFile } from 'fs/promises';
const result = await transcribe({ model: revai.transcription('machine'), audio: await readFile('audio.mp3'), providerOptions: { revai: { language: 'en' } },});The following provider options are available:
-
metadata string
Optional metadata string to associate with the transcription job.
-
notification_config object
Configuration for webhook notifications when job is complete.
- url string - URL to send the notification to.
- auth_headers object - Optional authorization headers for the notification request.
- Authorization string - Authorization header value.
-
delete_after_seconds integer
Number of seconds after which the job will be automatically deleted.
-
verbatim boolean
Whether to include filler words and false starts in the transcription.
-
rush boolean
[HIPAA Unsupported] Whether to prioritize the job for faster processing. Only available for human transcriber option.
-
test_mode boolean
Whether to run the job in test mode. Default is
false. -
segments_to_transcribe Array
Specific segments of the audio to transcribe.
- start number - Start time of the segment in seconds.
- end number - End time of the segment in seconds.
-
speaker_names Array
Names to assign to speakers in the transcription.
- display_name string - Display name for the speaker.
-
skip_diarization boolean
Whether to skip speaker diarization. Default is
false. -
skip_postprocessing boolean
Whether to skip post-processing steps. Only available for English and Spanish languages. Default is
false. -
skip_punctuation boolean
Whether to skip adding punctuation to the transcription. Default is
false. -
remove_disfluencies boolean
Whether to remove disfluencies (um, uh, etc.) from the transcription. Default is
false. -
remove_atmospherics boolean
Whether to remove atmospheric sounds (like
<laugh>,<affirmative>) from the transcription. Default isfalse. -
filter_profanity boolean
Whether to filter profanity from the transcription by replacing characters with asterisks except for the first and last. Default is
false. -
speaker_channels_count integer
Number of speaker channels in the audio. Only available for English, Spanish and French languages.
-
speakers_count integer
Expected number of speakers in the audio. Only available for English, Spanish and French languages.
-
diarization_type string
Type of diarization to use. Possible values: "standard" (default), "premium".
-
custom_vocabulary_id string
ID of a custom vocabulary to use for the transcription, submitted through the Custom Vocabularies API.
-
custom_vocabularies Array
Custom vocabularies to use for the transcription.
-
strict_custom_vocabulary boolean
Whether to strictly enforce custom vocabulary.
-
summarization_config object
Configuration for generating a summary of the transcription.
- model string - Model to use for summarization. Possible values: "standard" (default), "premium".
- type string - Format of the summary. Possible values: "paragraph" (default), "bullets".
- prompt string - Custom prompt for the summarization (mutually exclusive with type).
-
translation_config object
Configuration for translating the transcription.
- target_languages Array - Target languages for translation. Each item is an object with:
- language string - Language code. Possible values: "en", "en-us", "en-gb", "ar", "pt", "pt-br", "pt-pt", "fr", "fr-ca", "es", "es-es", "es-la", "it", "ja", "ko", "de", "ru".
- model string - Model to use for translation. Possible values: "standard" (default), "premium".
- target_languages Array - Target languages for translation. Each item is an object with:
-
language string
Language of the audio content, provided as an ISO 639-1 language code. Default is "en".
-
forced_alignment boolean
Whether to perform forced alignment, which provides improved accuracy for per-word timestamps. Default is
false.Currently supported languages:
- English (en, en-us, en-gb)
- French (fr)
- Italian (it)
- German (de)
- Spanish (es)
Note: This option is not available in low-cost environment.
Model Capabilities
| Model | Transcription | Duration | Segments | Language |
|---|---|---|---|---|
machine | ||||
low_cost | ||||
fusion |