TTS
- pipecat.services.google.tts.language_to_google_tts_language(language)[source]
- Parameters:
language (Language)
- Return type:
str | None
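The signature above suggests a lookup from a `Language` value to a Google locale string, with `None` for unmapped languages. A minimal sketch of that pattern follows; the `Language` enum is a hypothetical stand-in, and the table entries and the name `to_google_tts_language` are illustrative assumptions, not the library's actual mapping.

```python
from enum import Enum
from typing import Optional


class Language(str, Enum):
    """Hypothetical stand-in for pipecat's Language enum (small subset)."""

    EN = "en"
    EN_US = "en-US"
    FR = "fr"
    XX = "xx"  # made-up member used to show the unmapped case


# Illustrative table only; the real function's coverage may differ.
_GOOGLE_TTS_LANGUAGES = {
    Language.EN: "en-US",
    Language.EN_US: "en-US",
    Language.FR: "fr-FR",
}


def to_google_tts_language(language: Language) -> Optional[str]:
    """Return a Google-style locale code, or None if the language is unmapped."""
    return _GOOGLE_TTS_LANGUAGES.get(language)
```

Because the return type is `str | None`, callers should be prepared to handle `None`, for example by falling back to a default locale.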
- class pipecat.services.google.tts.GoogleHttpTTSService(*, credentials=None, credentials_path=None, voice_id='en-US-Chirp3-HD-Charon', sample_rate=None, params=None, **kwargs)[source]
Bases:
TTSService
- Parameters:
credentials (str | None)
credentials_path (str | None)
voice_id (str)
sample_rate (int | None)
params (InputParams | None)
- class InputParams(*, pitch=None, rate=None, volume=None, emphasis=None, language=Language.EN, gender=None, google_style=None)[source]
Bases:
BaseModel
- Parameters:
pitch (str | None)
rate (str | None)
volume (str | None)
emphasis (Literal['strong', 'moderate', 'reduced', 'none'] | None)
language (Language | None)
gender (Literal['male', 'female', 'neutral'] | None)
google_style (Literal['apologetic', 'calm', 'empathetic', 'firm', 'lively'] | None)
- pitch: str | None
- rate: str | None
- volume: str | None
- emphasis: Literal['strong', 'moderate', 'reduced', 'none'] | None
- language: Language | None
- gender: Literal['male', 'female', 'neutral'] | None
- google_style: Literal['apologetic', 'calm', 'empathetic', 'firm', 'lively'] | None
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
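The fields of this `InputParams` class (pitch, rate, volume, emphasis, gender, google_style) resemble SSML attributes. The sketch below shows how such values could be wrapped around text as SSML, assuming the service works that way, which this page does not confirm. `build_ssml` is a hypothetical helper, not part of pipecat.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def build_ssml(
    text: str,
    voice_id: str,
    pitch: Optional[str] = None,
    rate: Optional[str] = None,
    volume: Optional[str] = None,
    emphasis: Optional[str] = None,
    gender: Optional[str] = None,
    google_style: Optional[str] = None,
) -> str:
    """Hypothetical helper: wrap text in SSML tags for the options that are set."""
    voice_attrs = f"name={quoteattr(voice_id)}"
    if gender:
        voice_attrs += f" gender={quoteattr(gender)}"

    body = escape(text)  # escape &, <, > so the text is valid XML content
    if emphasis:
        body = f"<emphasis level={quoteattr(emphasis)}>{body}</emphasis>"
    if google_style:
        body = f"<google:style name={quoteattr(google_style)}>{body}</google:style>"

    # Only emit <prosody> when at least one prosody option is set.
    prosody = [(k, v) for k, v in (("pitch", pitch), ("rate", rate), ("volume", volume)) if v]
    if prosody:
        attrs = " ".join(f"{k}={quoteattr(v)}" for k, v in prosody)
        body = f"<prosody {attrs}>{body}</prosody>"

    return f"<speak><voice {voice_attrs}>{body}</voice></speak>"


ssml = build_ssml("Hi & bye", "en-US-Chirp3-HD-Charon", rate="fast", emphasis="strong")
```

Options left as `None` produce no markup, which mirrors the all-optional defaults in the signature above.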
- can_generate_metrics()[source]
- Return type:
bool
- language_to_service_language(language)[source]
Convert a language to the service-specific language format.
- Parameters:
language (Language) – The language to convert.
- Returns:
The service-specific language identifier, or None if not supported.
- Return type:
str | None
- async run_tts(text)[source]
Run text-to-speech synthesis on the provided text.
Overrides TTSService.run_tts, which subclasses must implement to provide actual TTS functionality.
- Parameters:
text (str) – The text to synthesize into speech.
- Yields:
Frame – Audio frames containing the synthesized speech.
- Return type:
AsyncGenerator[Frame, None]
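Since `run_tts` is an async generator, its frames are consumed with `async for`. The sketch below illustrates only that consumption pattern. `AudioFrame` and `fake_run_tts` are hypothetical stand-ins that yield fake data, and in a real pipeline the frames would normally be handled by the framework rather than collected by hand.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator


@dataclass
class AudioFrame:
    """Hypothetical stand-in for a pipecat audio Frame."""

    audio: bytes
    sample_rate: int


async def fake_run_tts(text: str) -> AsyncGenerator[AudioFrame, None]:
    """Stand-in for run_tts: yields one fake audio chunk per word."""
    for word in text.split():
        await asyncio.sleep(0)  # yield control, as a network read would
        yield AudioFrame(audio=word.encode("utf-8"), sample_rate=24000)


async def collect_audio(text: str) -> bytes:
    """Consume the generator frame by frame and join the audio payloads."""
    chunks = []
    async for frame in fake_run_tts(text):
        chunks.append(frame.audio)
    return b"".join(chunks)


audio = asyncio.run(collect_audio("hello world"))
```

Yielding frames as they arrive lets downstream consumers start playback before synthesis of the full text has finished.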
- class pipecat.services.google.tts.GoogleTTSService(*, credentials=None, credentials_path=None, voice_id='en-US-Chirp3-HD-Charon', sample_rate=None, params=InputParams(language=<Language.EN: 'en'>), **kwargs)[source]
Bases:
TTSService
Text-to-Speech service using Google Cloud Text-to-Speech API.
Converts text to speech using Google’s TTS models with streaming synthesis for low latency. Supports multiple languages and voices.
- Parameters:
credentials (str | None) – JSON string containing Google Cloud service account credentials.
credentials_path (str | None) – Path to Google Cloud service account JSON file.
voice_id (str) – Google TTS voice identifier (e.g., “en-US-Chirp3-HD-Charon”).
sample_rate (int | None) – Audio sample rate in Hz.
params (InputParams) – Additional synthesis parameters. Only language is supported for this service.
Notes
Requires Google Cloud credentials via service account JSON, file path, or default application credentials (GOOGLE_APPLICATION_CREDENTIALS env var). Only Chirp 3 HD and Journey voices are supported. Use GoogleHttpTTSService for other voices.
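The note above implies two decisions: which service class suits a given voice, and which credential source applies. The sketch below works through both. The helper names, the substring matching on voice IDs, and the precedence order of credential sources are all assumptions made for illustration, not documented behavior.

```python
import json
from typing import Optional

# Assumption: streaming-capable voices can be recognized by these substrings.
STREAMING_VOICE_MARKERS = ("chirp3-hd", "journey")


def pick_service_name(voice_id: str) -> str:
    """Hypothetical helper: suggest a service class name for a voice ID."""
    lowered = voice_id.lower()
    if any(marker in lowered for marker in STREAMING_VOICE_MARKERS):
        return "GoogleTTSService"
    return "GoogleHttpTTSService"


def describe_credentials(
    credentials: Optional[str] = None,
    credentials_path: Optional[str] = None,
) -> str:
    """Hypothetical helper: report which credential source would be used.

    The precedence (inline JSON, then file path, then defaults) is assumed.
    """
    if credentials:
        json.loads(credentials)  # raises ValueError if the string is not valid JSON
        return "inline JSON"
    if credentials_path:
        return f"file: {credentials_path}"
    return "application default credentials"
```

For example, `pick_service_name("en-US-Neural2-A")` returns `"GoogleHttpTTSService"` under these assumptions, since that voice ID contains neither marker.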
Example
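```python
from pipecat.services.google.tts import GoogleTTSService
from pipecat.transcriptions.language import Language

tts = GoogleTTSService(
    credentials_path="/path/to/service-account.json",
    voice_id="en-US-Chirp3-HD-Charon",
    params=GoogleTTSService.InputParams(language=Language.EN_US),
)
```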
- class InputParams(*, language=Language.EN)[source]
Bases:
BaseModel
- Parameters:
language (Language | None)
- language: Language | None
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- can_generate_metrics()[source]
- Return type:
bool
- language_to_service_language(language)[source]
Convert a language to the service-specific language format.
- Parameters:
language (Language) – The language to convert.
- Returns:
The service-specific language identifier, or None if not supported.
- Return type:
str | None
- async run_tts(text)[source]
Run text-to-speech synthesis on the provided text.
Overrides TTSService.run_tts, which subclasses must implement to provide actual TTS functionality.
- Parameters:
text (str) – The text to synthesize into speech.
- Yields:
Frame – Audio frames containing the synthesized speech.
- Return type:
AsyncGenerator[Frame, None]