Hume Provider

The Hume provider contains support for the Hume text-to-speech (TTS) API.

Setup

The Hume provider is available in the @ai-sdk/hume module. You can install it with:

pnpm add @ai-sdk/hume

Provider Instance

You can import the default provider instance hume from @ai-sdk/hume:

import { hume } from '@ai-sdk/hume';

If you need a customized setup, you can import createHume from @ai-sdk/hume and create a provider instance with your settings:

import { createHume } from '@ai-sdk/hume';

const hume = createHume({
  // custom settings, e.g.
  fetch: customFetch,
});

You can use the following optional settings to customize the Hume provider instance:

  • apiKey string

    API key sent in the X-Hume-Api-Key header. Defaults to the value of the HUME_API_KEY environment variable.

  • headers Record<string,string>

    Custom headers to include in the requests.

  • fetch (input: RequestInfo, init?: RequestInit) => Promise<Response>

    Custom fetch implementation. Defaults to the global fetch function. You can use it as middleware to intercept requests, or supply your own implementation, for example for testing.
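As an illustration of the fetch setting, the following is a minimal sketch of a wrapper that records each request before forwarding it. The names FetchLike and withRequestLog are hypothetical and not part of @ai-sdk/hume; the sketch only assumes that the setting accepts a function with the standard fetch signature, as described above.

```typescript
// Hypothetical helper (not part of @ai-sdk/hume): wraps a fetch-compatible
// function and records the method and URL of every request before forwarding it.
type FetchLike = (
  input: string | URL | Request,
  init?: RequestInit,
) => Promise<Response>;

function withRequestLog(baseFetch: FetchLike, log: string[]): FetchLike {
  return async (input, init) => {
    // Normalize the three possible input shapes to a URL string.
    const url =
      typeof input === 'string'
        ? input
        : input instanceof URL
          ? input.href
          : input.url;
    // Prefer an explicit init.method, then the Request's own method, then GET.
    const method =
      init?.method ?? (input instanceof Request ? input.method : 'GET');
    log.push(`${method} ${url}`);
    return baseFetch(input, init);
  };
}
```

Under these assumptions, the wrapper could be supplied as createHume({ fetch: withRequestLog(fetch, log) }), where log is an array you own. In tests, baseFetch can be a stub that returns a canned Response, so no network call is made.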

Speech Models

You can create models that call the Hume speech API using the .speech() factory method.

const model = hume.speech();

You can pass standard speech generation options like voice, speed, instructions, and outputFormat:

import { experimental_generateSpeech as generateSpeech } from 'ai';
import { hume } from '@ai-sdk/hume';

const result = await generateSpeech({
  model: hume.speech(),
  text: 'Hello, world!',
  voice: 'd8ab67c6-953d-4bd8-9370-8fa53a0f1453',
  speed: 1.0,
  instructions: 'Speak in a friendly, conversational tone.',
  outputFormat: 'mp3',
});

Supported Parameters

  • text string (required)

    The text to convert to speech.

  • voice string

    The voice ID to use for the generated audio. Defaults to 'd8ab67c6-953d-4bd8-9370-8fa53a0f1453'.

  • speed number

    Speech rate multiplier.

  • instructions string

    Description or instructions for how the text should be spoken.

  • outputFormat string

    The audio format to generate. Supported values: 'mp3', 'pcm', 'wav'. Defaults to 'mp3'.

The language parameter is not supported by Hume speech models and will be ignored with a warning.
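If the output format comes from user input or configuration, it may be worth checking it against the supported values before calling generateSpeech. The sketch below is one way to do that; the helper names isHumeOutputFormat and resolveOutputFormat are hypothetical, and the fallback to 'mp3' simply mirrors the documented default rather than any behavior of the provider itself.

```typescript
// The three output formats listed above for Hume speech models.
const HUME_OUTPUT_FORMATS = ['mp3', 'pcm', 'wav'] as const;
type HumeOutputFormat = (typeof HUME_OUTPUT_FORMATS)[number];

// Hypothetical helper: narrows an arbitrary string to a supported format.
function isHumeOutputFormat(value: string): value is HumeOutputFormat {
  return (HUME_OUTPUT_FORMATS as readonly string[]).includes(value);
}

// Hypothetical helper: returns the value if supported, otherwise the
// documented default 'mp3'.
function resolveOutputFormat(value?: string): HumeOutputFormat {
  return value !== undefined && isHumeOutputFormat(value) ? value : 'mp3';
}
```

The result of resolveOutputFormat can then be passed as the outputFormat option. Whether to fall back silently or throw on an unsupported value is a design choice; throwing is usually safer when the format affects downstream audio handling.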

Provider Options

You can pass additional provider-specific options using the providerOptions argument:

import { experimental_generateSpeech as generateSpeech } from 'ai';
import { hume } from '@ai-sdk/hume';

const result = await generateSpeech({
  model: hume.speech(),
  text: 'Hello, world!',
  providerOptions: {
    hume: {
      context: {
        generationId: 'previous-generation-id',
      },
    },
  },
});

The following provider options are available:

  • context object

    Context for the speech synthesis request. Can be either:

    • { generationId: string } - ID of a previously generated speech synthesis to use as context.
    • { utterances: Utterance[] } - An array of utterance objects for context, where each utterance has:
      • text string (required) - The text content.
      • description string - Instructions for how the text should be spoken.
      • speed number - Speech rate multiplier.
      • trailingSilence number - Duration of silence to add after the utterance in seconds.
      • voice object - Voice configuration, either { id: string, provider?: 'HUME_AI' | 'CUSTOM_VOICE' } or { name: string, provider?: 'HUME_AI' | 'CUSTOM_VOICE' }.
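The utterances form of context can be sketched as follows. The type names (VoiceProvider, Voice, Utterance, HumeContext) are local, illustrative mirrors of the shapes listed above and are not claimed to be exported by @ai-sdk/hume; the utterance texts and descriptions are made-up sample values.

```typescript
// Local mirror of the documented shapes; these type names are illustrative
// and are not claimed to be exported by @ai-sdk/hume.
type VoiceProvider = 'HUME_AI' | 'CUSTOM_VOICE';
type Voice =
  | { id: string; provider?: VoiceProvider }
  | { name: string; provider?: VoiceProvider };

interface Utterance {
  text: string; // required
  description?: string; // how the text should be spoken
  speed?: number; // speech rate multiplier
  trailingSilence?: number; // seconds of silence after the utterance
  voice?: Voice;
}

type HumeContext = { generationId: string } | { utterances: Utterance[] };

// Sample context built from prior utterances instead of a generation ID.
const context: HumeContext = {
  utterances: [
    { text: 'Welcome back.', description: 'Warm and calm.', trailingSilence: 0.5 },
    {
      text: 'Here is the news.',
      speed: 1.1,
      voice: { id: 'd8ab67c6-953d-4bd8-9370-8fa53a0f1453' },
    },
  ],
};

// Shape expected under the providerOptions argument of generateSpeech.
const providerOptions = { hume: { context } };
```

The providerOptions object would then be passed alongside model and text in the generateSpeech call. Modelling context as a union keeps the two forms mutually explicit: a value carries either a generationId or an utterances array.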

Model Capabilities

Model      Instructions   Speed       Output Formats
default    supported      supported   mp3, pcm, wav