AudioBufferProcessor
- class pipecat.processors.audio.audio_buffer_processor.AudioBufferProcessor(*, sample_rate=None, num_channels=1, buffer_size=0, user_continuous_stream=None, enable_turn_audio=False, **kwargs)
Bases:
FrameProcessor
Processes and buffers audio frames from both input (user) and output (bot) sources.
This processor manages audio buffering and synchronization, providing both merged and track-specific audio access through event handlers. It supports various audio configurations including sample rate conversion and mono/stereo output.
- Events:
on_audio_data: Triggered when buffer_size is reached, providing merged audio
on_track_audio_data: Triggered when buffer_size is reached, providing separate tracks
on_user_turn_audio_data: Triggered when a user turn has ended, providing that user turn’s audio
on_bot_turn_audio_data: Triggered when a bot turn has ended, providing that bot turn’s audio
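A handler for the merged-audio event will often just persist what it receives. The sketch below shows one way to wrap raw audio bytes in a WAV container using only the standard library. It assumes the handler is given the audio bytes, the sample rate, and the channel count, and that samples are 16-bit PCM; neither assumption is stated in this reference. The helper name pcm_to_wav_bytes is hypothetical.

```python
import io
import wave


def pcm_to_wav_bytes(audio: bytes, sample_rate: int, num_channels: int) -> bytes:
    """Wrap raw PCM bytes in a WAV container (hypothetical helper)."""
    out = io.BytesIO()
    with wave.open(out, "wb") as wf:
        wf.setnchannels(num_channels)
        wf.setsampwidth(2)  # assumption: 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(audio)
    return out.getvalue()


# One second of mono silence at 16 kHz, standing in for what a handler receives.
wav_bytes = pcm_to_wav_bytes(b"\x00\x00" * 16000, sample_rate=16000, num_channels=1)
```

The same function could be called from inside an event handler and its result written to disk or uploaded.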
- Parameters:
sample_rate (Optional[int]) – Desired output sample rate. If None, uses source rate
num_channels (int) – Number of channels (1 for mono, 2 for stereo). Defaults to 1
buffer_size (int) – Size of buffer before triggering events. 0 for no buffering
enable_turn_audio (bool) – Whether turn audio event handlers should be triggered
user_continuous_stream (bool | None)
- Audio handling:
Mono output (num_channels=1): User and bot audio are mixed
Stereo output (num_channels=2): User audio on left, bot audio on right
Automatic resampling of incoming audio to match desired sample_rate
Silence insertion for non-continuous audio streams
Buffer synchronization between user and bot audio
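To illustrate the silence insertion mentioned above, here is a minimal sketch of how the padding for a gap in a non-continuous stream could be computed. It is not the processor's actual implementation: the function name silence_bytes is hypothetical, and 16-bit PCM samples are assumed.

```python
def silence_bytes(gap_seconds: float, sample_rate: int, num_channels: int = 1) -> bytes:
    """Return zero-valued PCM covering a gap of gap_seconds (hypothetical helper)."""
    if gap_seconds <= 0:
        return b""
    num_frames = int(gap_seconds * sample_rate)
    # Assumption: 2 bytes per sample (16-bit PCM).
    return b"\x00" * (num_frames * num_channels * 2)


# Half a second of missing mono audio at 16 kHz.
padding = silence_bytes(0.5, sample_rate=16000)
```

Appending such padding to the track that stalled keeps the user and bot buffers aligned in time.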
- property sample_rate: int
Current sample rate of the audio processor.
- Returns:
The sample rate in Hz
- Return type:
int
- property num_channels: int
Number of channels in the audio output.
- Returns:
Number of channels (1 for mono, 2 for stereo)
- Return type:
int
- has_audio()
Check if both user and bot audio buffers contain data.
- Returns:
True if both buffers contain audio data
- Return type:
bool
- merge_audio_buffers()
Merge user and bot audio buffers into a single audio stream.
For mono output, audio is mixed. For stereo output, user audio is placed on the left channel and bot audio on the right channel.
- Returns:
Merged audio data (mixed for mono output, one channel per source for stereo output)
- Return type:
bytes
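The difference between the two output layouts can be sketched in a few lines. This is an illustration only, not the library's code: the function merge_pcm16 is hypothetical, samples are assumed to be 16-bit PCM in native byte order, mono mixing is done by summing with clipping (one common choice among several), and the shorter track is padded with silence.

```python
from array import array


def merge_pcm16(user: bytes, bot: bytes, num_channels: int) -> bytes:
    """Merge two mono 16-bit PCM tracks (hypothetical helper)."""
    u = array("h")
    u.frombytes(user)
    b = array("h")
    b.frombytes(bot)

    # Pad the shorter track with silence so both cover the same duration.
    n = max(len(u), len(b))
    u.extend([0] * (n - len(u)))
    b.extend([0] * (n - len(b)))

    out = array("h")
    if num_channels == 1:
        # Mono: sum the tracks and clip to the 16-bit range.
        for x, y in zip(u, b):
            out.append(max(-32768, min(32767, x + y)))
    elif num_channels == 2:
        # Stereo: interleave, user sample on the left, bot sample on the right.
        for x, y in zip(u, b):
            out.append(x)
            out.append(y)
    else:
        raise ValueError("num_channels must be 1 or 2")
    return out.tobytes()
```

For equal-length inputs, the stereo result is twice as long as the mono result, because each frame carries one sample per channel.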
- async start_recording()
Start recording audio from both user and bot.
Initializes recording state and resets audio buffers.
- async stop_recording()
Stop recording and trigger final audio data handlers.
Calls audio handlers with any remaining buffered audio before stopping.
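The interaction between buffer_size, start_recording(), and stop_recording() can be pictured with a small stand-in class. ChunkedRecorder below is hypothetical and far simpler than the real processor. It assumes buffer_size is counted in bytes and that a value of 0 means audio is only delivered when recording stops; this reference confirms neither detail.

```python
import asyncio
from typing import Awaitable, Callable, List


class ChunkedRecorder:
    """Hypothetical stand-in that mimics threshold-based delivery."""

    def __init__(self, buffer_size: int, on_audio: Callable[[bytes], Awaitable[None]]):
        self._buffer_size = buffer_size
        self._on_audio = on_audio
        self._buffer = bytearray()
        self._recording = False

    async def start_recording(self) -> None:
        # Reset state so a new recording starts from an empty buffer.
        self._buffer.clear()
        self._recording = True

    async def add_audio(self, chunk: bytes) -> None:
        if not self._recording:
            return
        self._buffer.extend(chunk)
        # Deliver as soon as the threshold is reached (assumed semantics).
        if self._buffer_size > 0 and len(self._buffer) >= self._buffer_size:
            await self._emit()

    async def stop_recording(self) -> None:
        # Flush whatever is left before stopping.
        if self._buffer:
            await self._emit()
        self._recording = False

    async def _emit(self) -> None:
        data = bytes(self._buffer)
        self._buffer.clear()
        await self._on_audio(data)


async def demo() -> List[int]:
    sizes: List[int] = []

    async def handler(audio: bytes) -> None:
        sizes.append(len(audio))

    rec = ChunkedRecorder(buffer_size=4, on_audio=handler)
    await rec.start_recording()
    for chunk in (b"ab", b"cd", b"e"):
        await rec.add_audio(chunk)
    await rec.stop_recording()
    return sizes


delivered = asyncio.run(demo())
```

In this run the handler fires once when the 4-byte threshold is reached and once more on stop, for the single leftover byte.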
- async process_frame(frame, direction)
Process incoming audio frames and manage audio buffers.
- Parameters:
frame (Frame)
direction (FrameDirection)