TTS
- pipecat.services.sarvam.tts.language_to_sarvam_language(language)[source]
Convert Pipecat Language enum to Sarvam AI language codes.
- Parameters:
language (Language)
- Return type:
str | None
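The `str | None` return type means a conversion like this can miss. The sketch below shows one way such a lookup could be structured. The `Language` enum here is a stand-in, and the `LANGUAGE_CODES` table, the helper name `to_sarvam_language`, and codes such as `hi-IN` are assumptions for illustration, not the contents of the Pipecat source.

```python
from enum import Enum
from typing import Optional


class Language(Enum):
    """Stand-in for Pipecat's Language enum (assumption, not the real class)."""
    EN = "en"
    HI = "hi"
    FR = "fr"


# Hypothetical table; the codes the real function returns may differ.
LANGUAGE_CODES = {
    Language.EN: "en-IN",
    Language.HI: "hi-IN",
}


def to_sarvam_language(language: Language) -> Optional[str]:
    """Return the mapped code, or None when the language is not in the table."""
    return LANGUAGE_CODES.get(language)
```

Callers should be prepared for `None` when a language has no entry, which matches the documented return type.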
- class pipecat.services.sarvam.tts.SarvamTTSService(*, api_key, voice_id='anushka', model='bulbul:v2', aiohttp_session, base_url='https://api.sarvam.ai', sample_rate=None, params=None, **kwargs)[source]
Bases:
TTSService
Text-to-Speech service using Sarvam AI’s API.
Converts text to speech using Sarvam AI’s TTS models with support for multiple Indian languages. Provides control over voice characteristics like pitch, pace, and loudness.
- Parameters:
api_key (str) – Sarvam AI API subscription key.
voice_id (str) – Speaker voice ID (e.g., “anushka”, “meera”).
model (str) – TTS model to use (“bulbul:v1” or “bulbul:v2”).
aiohttp_session (ClientSession) – Shared aiohttp session for making requests.
base_url (str) – Sarvam AI API base URL.
sample_rate (int | None) – Audio sample rate in Hz (8000, 16000, 22050, 24000).
params (InputParams | None) – Additional voice and preprocessing parameters.
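Example
```python
from pipecat.services.sarvam.tts import SarvamTTSService
from pipecat.transcriptions.language import Language

# `session` is an existing aiohttp.ClientSession.
tts = SarvamTTSService(
    api_key="your-api-key",
    voice_id="anushka",
    model="bulbul:v2",
    aiohttp_session=session,
    params=SarvamTTSService.InputParams(language=Language.HI, pitch=0.1, pace=1.2),
)
```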
- class InputParams(*, language=Language.EN, pitch=0.0, pace=1.0, loudness=1.0, enable_preprocessing=False)[source]
Bases:
BaseModel
- Parameters:
language (Language | None)
pitch (float | None)
pace (float | None)
loudness (float | None)
enable_preprocessing (bool | None)
- language: Language | None
- pitch: float | None
- pace: float | None
- loudness: float | None
- enable_preprocessing: bool | None
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
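To illustrate how the defaults in the `InputParams` signature behave when only some fields are overridden, here is a minimal sketch. It uses a plain dataclass named `InputParamsSketch` (hypothetical) rather than the real pydantic `BaseModel`, and represents the language as a string instead of the `Language` enum; only the field names and default values are taken from the signature above.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class InputParamsSketch:
    """Dataclass stand-in mirroring the documented fields and defaults."""
    language: Optional[str] = "en"  # the real field is a Language enum (Language.EN)
    pitch: Optional[float] = 0.0
    pace: Optional[float] = 1.0
    loudness: Optional[float] = 1.0
    enable_preprocessing: Optional[bool] = False


# Override two fields; the others keep the defaults from the signature.
params = InputParamsSketch(pitch=0.1, pace=1.2)
settings = asdict(params)
```

All fields are keyword-only in the real class, so the real equivalent would likewise pass `pitch` and `pace` by name.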
- can_generate_metrics()[source]
- Return type:
bool
- language_to_service_language(language)[source]
Convert a language to the service-specific language format.
- Parameters:
language (Language) – The language to convert.
- Returns:
The service-specific language identifier, or None if not supported.
- Return type:
str | None
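Because the method may return None for an unsupported language, calling code may want an explicit fallback. The following is a sketch of that pattern only; `resolve_language` and `fake_convert` are hypothetical helpers, with `fake_convert` standing in for `language_to_service_language`, and the codes shown are assumed values.

```python
from typing import Callable, Optional


def resolve_language(
    convert: Callable[[str], Optional[str]],
    requested: str,
    fallback: str,
) -> str:
    """Use the converted code when available, otherwise the caller's fallback."""
    code = convert(requested)
    return code if code is not None else fallback


def fake_convert(language: str) -> Optional[str]:
    """Hypothetical converter that only knows one language."""
    return {"hi": "hi-IN"}.get(language)
```

Whether a silent fallback is appropriate depends on the application; raising an error for unsupported languages is an equally reasonable choice.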
- async start(frame)[source]
Start the TTS service.
- Parameters:
frame (StartFrame) – The start frame containing initialization parameters.
- async run_tts(text)[source]
Run text-to-speech synthesis on the provided text.
This method must be implemented by subclasses to provide actual TTS functionality.
- Parameters:
text (str) – The text to synthesize into speech.
- Yields:
Frame – Audio frames containing the synthesized speech.
- Return type:
AsyncGenerator[Frame, None]
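Since `run_tts` is an async generator, its frames are consumed with `async for`. The sketch below demonstrates that consumption pattern with a stand-in generator, `fake_run_tts` (hypothetical), that yields placeholder byte chunks instead of real audio frames; it makes no network calls and does not reflect how Sarvam audio is actually chunked.

```python
import asyncio
from typing import AsyncGenerator, List


async def fake_run_tts(text: str) -> AsyncGenerator[bytes, None]:
    """Stand-in generator yielding one placeholder chunk per word."""
    for word in text.split():
        yield word.encode("utf-8")


async def collect(text: str) -> List[bytes]:
    """Drain the generator with `async for`, keeping chunks in arrival order."""
    return [frame async for frame in fake_run_tts(text)]


frames = asyncio.run(collect("namaste duniya"))
```

In a running pipeline the framework would normally drive this method itself, so direct iteration like this is mainly useful for testing.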