TTS
- pipecat.services.minimax.tts.language_to_minimax_language(language)[source]
- Parameters:
language (Language)
- Return type:
str | None
- class pipecat.services.minimax.tts.MiniMaxHttpTTSService(*, api_key, group_id, model='speech-02-turbo', voice_id='Calm_Woman', aiohttp_session, sample_rate=None, params=None, **kwargs)[source]
Bases:
TTSService
Text-to-speech service using MiniMax’s T2A (Text-to-Audio) API.
Platform documentation: https://www.minimax.io/platform/document/T2A%20V2?key=66719005a427f0c8a5701643
- Parameters:
api_key (str) – MiniMax API key for authentication.
group_id (str) – MiniMax Group ID that identifies the project.
model (str) – TTS model name (default: “speech-02-turbo”). Options include “speech-02-hd”, “speech-02-turbo”, “speech-01-hd”, “speech-01-turbo”.
voice_id (str) – Voice identifier (default: “Calm_Woman”).
aiohttp_session (ClientSession) – aiohttp.ClientSession for API communication.
sample_rate (int | None) – Output audio sample rate in Hz (default: None, in which case the rate is taken from the pipeline).
params (InputParams | None) – Additional configuration parameters.
- class InputParams(*, language=Language.EN, speed=1.0, volume=1.0, pitch=0, emotion=None, english_normalization=None)[source]
Bases:
BaseModel
Configuration parameters for MiniMax TTS.
- Parameters:
language (Language | None)
speed (float | None)
volume (float | None)
pitch (float | None)
emotion (str | None)
english_normalization (bool | None)
- language
Language for TTS generation.
- Type:
pipecat.transcriptions.language.Language | None
- speed
Speech speed (range: 0.5 to 2.0).
- Type:
float | None
- volume
Speech volume (range: 0 to 10).
- Type:
float | None
- pitch
Pitch adjustment (range: -12 to 12).
- Type:
float | None
- emotion
Emotional tone (options: “happy”, “sad”, “angry”, “fearful”, “disgusted”, “surprised”, “neutral”).
- Type:
str | None
- english_normalization
Whether to apply English text normalization.
- Type:
bool | None
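The ranges and options listed above can be checked before the values are passed to InputParams. The following is a minimal sketch, not part of pipecat: the helper name check_minimax_params is hypothetical, and treating the range endpoints as inclusive is an assumption. Whether the service itself validates these values is not stated here.

```python
from typing import Optional

# Ranges and options copied from the attribute descriptions above.
# Treating the endpoints as inclusive is an assumption.
SPEED_RANGE = (0.5, 2.0)
VOLUME_RANGE = (0.0, 10.0)
PITCH_RANGE = (-12.0, 12.0)
EMOTIONS = {"happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"}


def check_minimax_params(
    speed: Optional[float] = 1.0,
    volume: Optional[float] = 1.0,
    pitch: Optional[float] = 0,
    emotion: Optional[str] = None,
) -> list[str]:
    """Hypothetical helper: return a list of problems, empty if all values are in range."""
    problems = []
    for name, value, (low, high) in (
        ("speed", speed, SPEED_RANGE),
        ("volume", volume, VOLUME_RANGE),
        ("pitch", pitch, PITCH_RANGE),
    ):
        # None means "not set", which the type annotations allow.
        if value is not None and not (low <= value <= high):
            problems.append(f"{name}={value} is outside {low} to {high}")
    if emotion is not None and emotion not in EMOTIONS:
        problems.append(f"emotion={emotion!r} is not a documented option")
    return problems
```

An empty list means every value falls inside the documented range, so the same keyword arguments can then be handed to InputParams.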
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- can_generate_metrics()[source]
- Return type:
bool
- language_to_service_language(language)[source]
Convert a language to the service-specific language format.
- Parameters:
language (Language) – The language to convert.
- Returns:
The service-specific language identifier, or None if not supported.
- Return type:
str | None
- set_model_name(model)[source]
Set the TTS model to use.
- Parameters:
model (str)
- set_voice(voice)[source]
Set the voice to use.
- Parameters:
voice (str)
- async start(frame)[source]
Start the TTS service.
- Parameters:
frame (StartFrame) – The start frame containing initialization parameters.
- async run_tts(text)[source]
Run text-to-speech synthesis on the provided text.
TTSService requires subclasses to implement this method; this class provides the MiniMax implementation.
- Parameters:
text (str) – The text to synthesize into speech.
- Yields:
Frame – Audio frames containing the synthesized speech.
- Return type:
AsyncGenerator[Frame, None]
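As an illustration of how the constructor and InputParams fit together, here is a minimal sketch that builds the service from the keyword-only arguments documented above. The environment variable names MINIMAX_API_KEY and MINIMAX_GROUP_ID and the helper names minimax_tts_kwargs and build_service are assumptions made for this example. The import paths follow the module names shown in this reference.

```python
import os


def minimax_tts_kwargs(env=os.environ) -> dict:
    """Collect constructor keyword arguments from the environment.

    MINIMAX_API_KEY and MINIMAX_GROUP_ID are hypothetical variable names.
    """
    return {
        "api_key": env["MINIMAX_API_KEY"],
        "group_id": env["MINIMAX_GROUP_ID"],
        "model": "speech-02-turbo",  # documented default
        "voice_id": "Calm_Woman",    # documented default
    }


def build_service(session):
    """Create the service; `session` is assumed to be an open aiohttp.ClientSession."""
    # Imported lazily so this module loads even where pipecat is not installed.
    from pipecat.services.minimax.tts import MiniMaxHttpTTSService
    from pipecat.transcriptions.language import Language

    params = MiniMaxHttpTTSService.InputParams(
        language=Language.EN,
        speed=1.1,        # documented range: 0.5 to 2.0
        emotion="happy",  # one of the documented emotion options
    )
    return MiniMaxHttpTTSService(
        aiohttp_session=session,
        params=params,
        **minimax_tts_kwargs(),
    )
```

The returned service would normally be placed in a pipeline, which sends the StartFrame to start() and feeds text to run_tts(); sample_rate is left unset here so that it is taken from the pipeline.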