TTS
- class pipecat.services.openai.tts.OpenAITTSService(*, api_key=None, base_url=None, voice='alloy', model='gpt-4o-mini-tts', sample_rate=None, instructions=None, **kwargs)
Bases: TTSService
OpenAI Text-to-Speech service that generates audio from text.
This service uses the OpenAI TTS API to generate PCM-encoded audio at 24 kHz.
- Parameters:
api_key (str | None) – OpenAI API key. Defaults to None.
voice (str) – Voice ID to use. Defaults to “alloy”.
model (str) – TTS model to use. Defaults to “gpt-4o-mini-tts”.
sample_rate (int | None) – Output audio sample rate in Hz. Defaults to None.
**kwargs – Additional keyword arguments passed to TTSService.
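base_url (str | None) – Custom base URL for the OpenAI API. Defaults to None.
instructions (str | None) – Optional instructions that guide how the speech is synthesized. Defaults to None.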
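The service returns PCM-encoded audio. Because OpenAI generates this audio at 24 kHz (see OPENAI_SAMPLE_RATE), a sample_rate other than 24000 may not match the audio actually returned.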
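As a rough illustration of the constructor parameters listed above, the sketch below assembles keyword arguments and passes them to OpenAITTSService. The helper names openai_tts_kwargs and build_tts, and the choice to read the key from an OPENAI_API_KEY environment variable, are assumptions made for this example and are not part of pipecat.

```python
import os
from typing import Any, Dict, Mapping, Optional


def openai_tts_kwargs(
    voice: str = "alloy",
    model: str = "gpt-4o-mini-tts",
    instructions: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for OpenAITTSService.

    Reading the key from OPENAI_API_KEY is an assumption of this sketch.
    """
    kwargs: Dict[str, Any] = {
        "api_key": env.get("OPENAI_API_KEY"),
        "voice": voice,
        "model": model,
    }
    # Only pass instructions when the caller actually supplied some.
    if instructions is not None:
        kwargs["instructions"] = instructions
    return kwargs


def build_tts(**overrides: Any):
    """Hypothetical helper: construct the service (requires pipecat installed)."""
    # Imported lazily so the kwargs helper works without pipecat.
    from pipecat.services.openai.tts import OpenAITTSService

    return OpenAITTSService(**openai_tts_kwargs(**overrides))
```

sample_rate is deliberately left out here, so the constructor default (None) applies.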
- OPENAI_SAMPLE_RATE = 24000
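To make the 24000 Hz figure concrete, the sketch below converts between byte counts and playback time for raw PCM. It assumes 16-bit mono samples, which this page does not state; the function names are hypothetical.

```python
OPENAI_SAMPLE_RATE = 24000  # Hz, mirrors the class constant above
BYTES_PER_SAMPLE = 2        # assumption: 16-bit signed PCM
NUM_CHANNELS = 1            # assumption: mono audio


def pcm_duration_seconds(num_bytes: int, sample_rate: int = OPENAI_SAMPLE_RATE) -> float:
    """Playback time, in seconds, of num_bytes of raw PCM audio."""
    return num_bytes / (sample_rate * BYTES_PER_SAMPLE * NUM_CHANNELS)


def pcm_bytes_for(seconds: float, sample_rate: int = OPENAI_SAMPLE_RATE) -> int:
    """Number of raw PCM bytes needed for the given playback time."""
    return int(seconds * sample_rate * BYTES_PER_SAMPLE * NUM_CHANNELS)
```

Under these assumptions one second of audio occupies 48000 bytes, so interpreting the same bytes at a different rate would play them too fast or too slow.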
- can_generate_metrics()
Check whether this service can generate processing metrics.
- Return type:
bool
- async set_model(model)
Set the TTS model to use.
- Parameters:
model (str) – The name of the TTS model.
- async start(frame)
Start the TTS service.
- Parameters:
frame (StartFrame) – The start frame containing initialization parameters.
- async run_tts(text)
Run text-to-speech synthesis on the provided text.
This service implements the method by requesting audio for the text from the OpenAI TTS API.
- Parameters:
text (str) – The text to synthesize into speech.
- Yields:
Frame – Audio frames containing the synthesized speech.
- Return type:
AsyncGenerator[Frame, None]
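Because run_tts is an async generator, its frames are consumed with async for. The sketch below shows that shape using stand-in classes and a stand-in generator, so it runs without pipecat or an API key. FakeAudioFrame, FakeControlFrame, fake_run_tts, and collect_audio are all hypothetical, and the assumption that audio frames expose their bytes on an audio attribute should be checked against the real frame classes. In a real application the frames are normally consumed by a pipeline rather than by hand.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, AsyncIterator


@dataclass
class FakeAudioFrame:
    """Hypothetical stand-in for an audio frame; only `audio` is modeled."""
    audio: bytes


@dataclass
class FakeControlFrame:
    """Hypothetical stand-in for a non-audio frame (no `audio` attribute)."""
    name: str


async def fake_run_tts(text: str) -> AsyncGenerator[object, None]:
    """Mimics the shape of run_tts: an async generator that yields frames."""
    yield FakeControlFrame("started")
    for word in text.split():
        yield FakeAudioFrame(audio=word.encode("utf-8"))
    yield FakeControlFrame("stopped")


async def collect_audio(frames: AsyncIterator[object]) -> bytes:
    """Concatenate the payload of every frame that carries audio bytes."""
    chunks = []
    async for frame in frames:
        audio = getattr(frame, "audio", None)
        if isinstance(audio, (bytes, bytearray)):
            chunks.append(bytes(audio))
    return b"".join(chunks)


if __name__ == "__main__":
    print(asyncio.run(collect_audio(fake_run_tts("hello world"))))
```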