TTS
- pipecat.services.rime.tts.language_to_rime_language(language)[source]
Convert pipecat Language to Rime language code.
- Parameters:
language (Language) – The pipecat Language enum value.
- Returns:
Three-letter language code used by Rime (e.g., ‘eng’ for English).
- Return type:
str
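As a rough illustration of the kind of lookup this function performs, the sketch below maps a stand-in enum to three-letter codes. `DemoLanguage`, `DEMO_CODES`, and `demo_language_to_code` are hypothetical names, not part of pipecat. Only the 'eng' code is confirmed by the description above; the other codes and the regional fallback are assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class DemoLanguage(Enum):
    """Hypothetical stand-in for pipecat's Language enum."""

    EN = "en"
    EN_US = "en-US"
    ES = "es"
    FR = "fr"
    JA = "ja"


# Only "eng" is confirmed by the reference above; "spa" and "fra" are
# assumed here purely for illustration.
DEMO_CODES = {
    DemoLanguage.EN: "eng",
    DemoLanguage.ES: "spa",
    DemoLanguage.FR: "fra",
}


def demo_language_to_code(language: DemoLanguage) -> Optional[str]:
    """Return a three-letter code, or None if the language is unmapped."""
    if language in DEMO_CODES:
        return DEMO_CODES[language]
    # Assumed fallback: strip the region ("en-US" -> "en") and retry.
    base = language.value.split("-")[0]
    for candidate, code in DEMO_CODES.items():
        if candidate.value == base:
            return code
    return None
```

Returning `None` for an unmapped language lets the caller decide whether to fall back to a default or to raise an error.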
- class pipecat.services.rime.tts.RimeTTSService(*, api_key, voice_id, url='wss://users.rime.ai/ws2', model='mistv2', sample_rate=None, params=None, text_aggregator=None, **kwargs)[source]
Bases:
AudioContextWordTTSService
Text-to-Speech service using Rime’s websocket API.
Uses Rime’s websocket JSON API to convert text to speech with word-level timing information. Supports interruptions and maintains context across multiple messages within a turn.
- Parameters:
api_key (str)
voice_id (str)
url (str)
model (str)
sample_rate (int | None)
params (InputParams | None)
text_aggregator (BaseTextAggregator | None)
- class InputParams(*, language=Language.EN, speed_alpha=1.0, reduce_latency=False, pause_between_brackets=False, phonemize_between_brackets=False)[source]
Bases:
BaseModel
Configuration parameters for Rime TTS service.
- Parameters:
language (Language | None)
speed_alpha (float | None)
reduce_latency (bool | None)
pause_between_brackets (bool | None)
phonemize_between_brackets (bool | None)
- language: Language | None
- speed_alpha: float | None
- reduce_latency: bool | None
- pause_between_brackets: bool | None
- phonemize_between_brackets: bool | None
- model_config: ClassVar[ConfigDict] = {}
Pydantic model configuration; should be a dictionary conforming to pydantic.config.ConfigDict.
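A minimal construction sketch based on the RimeTTSService and InputParams signatures above. The import path for `Language`, the `RIME_API_KEY` environment variable name, and the `voice_id` value are assumptions and are not taken from this reference. The imports sit inside the function so the snippet can be loaded even where pipecat is not installed.

```python
import os


def build_rime_tts():
    """Build a RimeTTSService with explicit parameters (sketch)."""
    # Imported lazily; calling this requires pipecat with Rime support.
    from pipecat.services.rime.tts import RimeTTSService
    # Assumed import path for the Language enum.
    from pipecat.transcriptions.language import Language

    return RimeTTSService(
        api_key=os.environ["RIME_API_KEY"],  # assumed variable name
        voice_id="your-voice-id",  # placeholder, not a real voice
        model="mistv2",  # the documented default
        params=RimeTTSService.InputParams(
            language=Language.EN,
            speed_alpha=1.0,
            reduce_latency=False,
        ),
    )
```

Every argument is keyword-only, as indicated by the `*` in the signature, so positional calls such as `RimeTTSService(key, voice)` are rejected.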
- can_generate_metrics()[source]
Check whether this service can generate processing metrics.
- Return type:
bool
- language_to_service_language(language)[source]
Convert pipecat language to Rime language code.
- Parameters:
language (Language)
- Return type:
str | None
- async set_model(model)[source]
Update the TTS model.
- Parameters:
model (str)
- async start(frame)[source]
Start the service and establish websocket connection.
- Parameters:
frame (StartFrame)
- async stop(frame)[source]
Stop the service and close connection.
- Parameters:
frame (EndFrame)
- async cancel(frame)[source]
Cancel current operation and clean up.
- Parameters:
frame (CancelFrame)
- async flush_audio()[source]
Flush any buffered audio data.
- async push_frame(frame, direction=FrameDirection.DOWNSTREAM)[source]
Push frame and handle end-of-turn conditions.
- Parameters:
frame (Frame)
direction (FrameDirection)
- async run_tts(text)[source]
Generate speech from text.
- Parameters:
text (str) – The text to convert to speech.
- Yields:
Frames containing audio data and timing information.
- Return type:
AsyncGenerator[Frame, None]
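Because `run_tts` is an async generator, its frames are consumed with `async for`. The sketch below shows that consumption pattern with stand-ins only: `DemoAudioFrame`, `demo_run_tts`, and `collect_audio` are hypothetical names, and the fake generator simply yields one chunk per word rather than calling Rime.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List


@dataclass
class DemoAudioFrame:
    """Hypothetical stand-in for an audio-carrying frame."""

    audio: bytes


async def demo_run_tts(text: str) -> AsyncGenerator[DemoAudioFrame, None]:
    """Stand-in generator: yields one fake audio chunk per word."""
    for word in text.split():
        await asyncio.sleep(0)  # simulate waiting on the network
        yield DemoAudioFrame(audio=word.encode("utf-8"))


async def collect_audio(text: str) -> bytes:
    """Drain the generator and concatenate the audio payloads."""
    chunks: List[bytes] = []
    async for frame in demo_run_tts(text):
        chunks.append(frame.audio)
    return b"".join(chunks)
```

In a running pipeline the framework typically drives `run_tts` itself, so direct iteration like this is mainly useful for tests and experiments.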
- class pipecat.services.rime.tts.RimeHttpTTSService(*, api_key, voice_id, aiohttp_session, model='mistv2', sample_rate=None, params=None, **kwargs)[source]
Bases:
TTSService
Text-to-Speech service using Rime’s HTTP API.
- Parameters:
api_key (str)
voice_id (str)
aiohttp_session (ClientSession)
model (str)
sample_rate (int | None)
params (InputParams | None)
- class InputParams(*, language=Language.EN, pause_between_brackets=False, phonemize_between_brackets=False, inline_speed_alpha=None, speed_alpha=1.0, reduce_latency=False)[source]
Bases:
BaseModel
Configuration parameters for Rime HTTP TTS service.
- Parameters:
language (Language | None)
pause_between_brackets (bool | None)
phonemize_between_brackets (bool | None)
inline_speed_alpha (str | None)
speed_alpha (float | None)
reduce_latency (bool | None)
- language: Language | None
- pause_between_brackets: bool | None
- phonemize_between_brackets: bool | None
- inline_speed_alpha: str | None
- speed_alpha: float | None
- reduce_latency: bool | None
- model_config: ClassVar[ConfigDict] = {}
Pydantic model configuration; should be a dictionary conforming to pydantic.config.ConfigDict.
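A minimal construction sketch for RimeHttpTTSService, based on the signature above. The `RIME_API_KEY` environment variable name and the `voice_id` value are assumptions, and `build_rime_http_tts` is a hypothetical helper. The import sits inside the function so the snippet can be loaded even where pipecat is not installed.

```python
import os


def build_rime_http_tts(session):
    """Build a RimeHttpTTSService around a caller-owned aiohttp session."""
    # Imported lazily; calling this requires pipecat with Rime support.
    from pipecat.services.rime.tts import RimeHttpTTSService

    return RimeHttpTTSService(
        api_key=os.environ["RIME_API_KEY"],  # assumed variable name
        voice_id="your-voice-id",  # placeholder, not a real voice
        aiohttp_session=session,  # an aiohttp.ClientSession
        params=RimeHttpTTSService.InputParams(
            speed_alpha=1.0,
            reduce_latency=False,
        ),
    )
```

Since the service receives the session instead of creating one, the caller presumably owns it: create the `aiohttp.ClientSession` before building the service and keep it open for as long as the service is in use.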
- can_generate_metrics()[source]
Check whether this service can generate processing metrics.
- Return type:
bool
- language_to_service_language(language)[source]
Convert pipecat language to Rime language code.
- Parameters:
language (Language)
- Return type:
str | None
- async run_tts(text)[source]
Run text-to-speech synthesis on the provided text.
This overrides the abstract TTSService.run_tts, which every TTS subclass must implement to provide actual TTS functionality.
- Parameters:
text (str) – The text to synthesize into speech.
- Yields:
Frame – Audio frames containing the synthesized speech.
- Return type:
AsyncGenerator[Frame, None]