TTS
- class pipecat.services.fish.tts.FishAudioTTSService(*, api_key, model, output_format='pcm', sample_rate=None, params=None, **kwargs)
Bases:
InterruptibleTTSService
- Parameters:
api_key (str)
model (str)
output_format (Literal['opus', 'mp3', 'pcm', 'wav'])
sample_rate (int | None)
params (InputParams | None)
- class InputParams(*, language=Language.EN, latency='normal', prosody_speed=1.0, prosody_volume=0)
Bases:
BaseModel
- Parameters:
language (Language | None)
latency (str | None)
prosody_speed (float | None)
prosody_volume (int | None)
- language: Language | None
- latency: str | None
- prosody_speed: float | None
- prosody_volume: int | None
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model. It should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict).
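The constructor and InputParams signatures above can be tied together with a small sketch. The helper names `fish_tts_kwargs` and `make_service`, and the placeholder values for `api_key` and `model`, are assumptions made for illustration and are not part of pipecat. Only the keyword names, defaults, and output formats come from the signatures listed above.

```python
from typing import Any, Optional

# Output formats listed in the constructor signature above.
OUTPUT_FORMATS = ("opus", "mp3", "pcm", "wav")


def fish_tts_kwargs(
    api_key: str,
    model: str,
    output_format: str = "pcm",
    sample_rate: Optional[int] = None,
    latency: str = "normal",
    prosody_speed: float = 1.0,
    prosody_volume: int = 0,
) -> dict[str, Any]:
    """Hypothetical helper: gather the documented constructor arguments."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {OUTPUT_FORMATS}")
    return {
        "api_key": api_key,
        "model": model,
        "output_format": output_format,
        "sample_rate": sample_rate,
        # Plain dict of InputParams fields; make_service() wraps it in InputParams.
        "params": {
            "latency": latency,
            "prosody_speed": prosody_speed,
            "prosody_volume": prosody_volume,
        },
    }


def make_service(kwargs: dict[str, Any]):
    """Hypothetical helper: build the real service (requires pipecat installed)."""
    # Imported lazily so fish_tts_kwargs() can be used without pipecat.
    from pipecat.services.fish.tts import FishAudioTTSService

    params = FishAudioTTSService.InputParams(**kwargs["params"])
    return FishAudioTTSService(**{**kwargs, "params": params})


# Placeholder credentials and model name; replace with real values.
kwargs = fish_tts_kwargs(
    api_key="YOUR_API_KEY", model="your-model-id", prosody_speed=1.2
)
```

Keeping the argument dictionary separate from the service construction makes it possible to check the configuration before pipecat is imported or any connection is opened.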
- can_generate_metrics()
Check whether this service can generate processing metrics.
- Return type:
bool
- async set_model(model)
Set the TTS model to use.
- Parameters:
model (str) – The name of the TTS model.
- async start(frame)
Start the TTS service.
- Parameters:
frame (StartFrame) – The start frame containing initialization parameters.
- async stop(frame)
Stop the TTS service.
- Parameters:
frame (EndFrame) – The end frame.
- async cancel(frame)
Cancel the TTS service.
- Parameters:
frame (CancelFrame) – The cancel frame.
- async flush_audio()
Flush any buffered audio by sending a flush event to Fish Audio.
- async run_tts(text)
Run text-to-speech synthesis on the provided text.
This method implements the run_tts hook that the base TTS service requires its subclasses to provide.
- Parameters:
text (str) – The text to synthesize into speech.
- Yields:
Frame – Audio frames containing the synthesized speech.
- Return type:
AsyncGenerator[Frame, None]
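Because run_tts is an async generator, its results are consumed with `async for`. The sketch below shows that pattern using stand-ins only: `FakeAudioFrame`, `fake_run_tts`, and `collect_audio` are hypothetical names that mimic the documented `AsyncGenerator[Frame, None]` shape. They are not the real service and do not contact Fish Audio.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator


@dataclass
class FakeAudioFrame:
    """Stand-in for an audio frame (hypothetical, for illustration only)."""

    audio: bytes


async def fake_run_tts(text: str) -> AsyncGenerator[FakeAudioFrame, None]:
    """Stand-in for run_tts: yields one fake frame per word of input."""
    for word in text.split():
        await asyncio.sleep(0)  # Hand control back to the event loop.
        yield FakeAudioFrame(audio=word.encode("utf-8"))


async def collect_audio(text: str) -> bytes:
    """Consume the generator with `async for` and join the audio payloads."""
    chunks = []
    async for frame in fake_run_tts(text):
        chunks.append(frame.audio)
    return b"".join(chunks)


audio = asyncio.run(collect_audio("hello fish audio"))
```

In a running pipeline the framework typically drives this method itself, so direct iteration like this is mainly useful for understanding the return type or for isolated tests.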