STT
- pipecat.services.fal.stt.language_to_fal_language(language)[source]
Convert a Language enum value to Fal’s Wizper language code.
- Parameters:
language (Language)
- Return type:
str | None
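For illustration, a minimal sketch of calling the helper; the exact code strings returned are assumptions, and unsupported languages map to None per the return type above:

```python
from pipecat.services.fal.stt import language_to_fal_language
from pipecat.transcriptions.language import Language

# Assumed: Wizper uses short ISO-style codes such as "en".
print(language_to_fal_language(Language.EN))  # e.g. "en"
print(language_to_fal_language(Language.FR))  # e.g. "fr"; unsupported languages return None
```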
- class pipecat.services.fal.stt.FalSTTService(*, api_key=None, sample_rate=None, params=None, **kwargs)[source]
Bases:
SegmentedSTTService
Speech-to-text service using Fal’s Wizper API.
This service uses Fal’s Wizper API to perform speech-to-text transcription on audio segments. It inherits from SegmentedSTTService to handle audio buffering and speech detection.
- Parameters:
api_key (str | None) – Fal API key. If not provided, will check FAL_KEY environment variable.
sample_rate (int | None) – Audio sample rate in Hz. If not provided, uses the pipeline’s rate.
params (InputParams | None) – Configuration parameters for the Wizper API.
**kwargs – Additional arguments passed to SegmentedSTTService.
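A hedged construction sketch based on the signature above, using the nested InputParams documented below; the pipeline wiring around the service is omitted:

```python
import os

from pipecat.services.fal.stt import FalSTTService
from pipecat.transcriptions.language import Language

# If api_key is omitted, the service checks the FAL_KEY environment variable.
stt = FalSTTService(
    api_key=os.getenv("FAL_KEY"),
    params=FalSTTService.InputParams(
        language=Language.EN,
        task="transcribe",
    ),
)
```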
- class InputParams(*, language=Language.EN, task='transcribe', chunk_level='segment', version='3')[source]
Bases:
BaseModel
Configuration parameters for Fal’s Wizper API.
- Parameters:
language (Language | None)
task (str)
chunk_level (str)
version (str)
- language
Language of the audio input. Defaults to English.
- Type:
pipecat.transcriptions.language.Language | None
- task
Task to perform (‘transcribe’ or ‘translate’). Defaults to ‘transcribe’.
- Type:
str
- chunk_level
Level of chunking (‘segment’). Defaults to ‘segment’.
- Type:
str
- version
Version of the Wizper model to use. Defaults to ‘3’.
- Type:
str
- language: Language | None
- task: str
- chunk_level: str
- version: str
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to pydantic’s ConfigDict.
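As a reference, a sketch of a params object configured for translation; field names and defaults follow the documentation above, and the choice of French input is illustrative:

```python
from pipecat.services.fal.stt import FalSTTService
from pipecat.transcriptions.language import Language

params = FalSTTService.InputParams(
    language=Language.FR,   # language of the incoming audio
    task="translate",       # ask Wizper to translate instead of transcribe
    chunk_level="segment",  # default chunking level
    version="3",            # default Wizper model version
)
```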
- can_generate_metrics()[source]
Check whether this service can generate processing metrics.
- Return type:
bool
- language_to_service_language(language)[source]
Convert a Language enum value to Fal’s service-specific language code.
- Parameters:
language (Language)
- Return type:
str | None
- async set_language(language)[source]
Set the language for speech recognition.
- Parameters:
language (Language) – The language to use for speech recognition.
- async set_model(model)[source]
Set the speech recognition model.
- Parameters:
model (str) – The name of the model to use for speech recognition.
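A brief sketch of reconfiguring the service at runtime with the two setters above; the model name passed to set_model is a hypothetical placeholder, not a documented value:

```python
from pipecat.transcriptions.language import Language

async def reconfigure(stt):
    # Switch the recognition language mid-session.
    await stt.set_language(Language.ES)
    # Select a model by name ("wizper" is a placeholder value).
    await stt.set_model("wizper")
```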
- async run_stt(audio)[source]
Transcribes an audio segment using Fal’s Wizper API.
- Parameters:
audio (bytes) – Raw audio bytes in WAV format (already converted by base class).
- Yields:
Frame – TranscriptionFrame containing the transcribed text.
- Return type:
AsyncGenerator[Frame, None]
Note
The audio is already in WAV format from the SegmentedSTTService. Only non-empty transcriptions are yielded.
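For illustration, consuming run_stt directly; in a real pipeline the SegmentedSTTService base class handles buffering and invokes this method itself, and the WAV bytes below are assumed to come from elsewhere:

```python
from pipecat.frames.frames import TranscriptionFrame

async def transcribe_segment(stt, wav_bytes: bytes):
    # Only non-empty transcriptions are yielded by the service.
    async for frame in stt.run_stt(wav_bytes):
        if isinstance(frame, TranscriptionFrame):
            print(frame.text)
```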