AudioBufferProcessor

class pipecat.processors.audio.audio_buffer_processor.AudioBufferProcessor(*, sample_rate=None, num_channels=1, buffer_size=0, user_continuous_stream=None, enable_turn_audio=False, **kwargs)[source]

Bases: FrameProcessor

Processes and buffers audio frames from both input (user) and output (bot) sources.

This processor manages audio buffering and synchronization, providing both merged and track-specific audio access through event handlers. It supports various audio configurations including sample rate conversion and mono/stereo output.

Events:

on_audio_data – Triggered when buffer_size is reached, providing merged audio
on_track_audio_data – Triggered when buffer_size is reached, providing separate user and bot tracks
on_user_turn_audio_data – Triggered when a user turn has ended, providing that user turn’s audio
on_bot_turn_audio_data – Triggered when a bot turn has ended, providing that bot turn’s audio
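
A handler for the merged-audio event usually needs to turn the delivered audio into a playable file. The following is a minimal sketch, assuming the handler is given raw 16-bit signed PCM bytes together with the sample rate and channel count (the sample format and the handler arguments are assumptions, not stated above). The helper name `pcm_to_wav_bytes` is hypothetical and uses only the Python standard library.

```python
import io
import wave


def pcm_to_wav_bytes(audio: bytes, sample_rate: int, num_channels: int) -> bytes:
    """Wrap raw PCM audio in a WAV container (hypothetical helper)."""
    with io.BytesIO() as out:
        with wave.open(out, "wb") as wf:
            wf.setnchannels(num_channels)
            wf.setsampwidth(2)  # 2 bytes per sample, i.e. 16-bit PCM (assumption)
            wf.setframerate(sample_rate)
            wf.writeframes(audio)
        # The WAV header is finalized once the wave writer has been closed.
        return out.getvalue()
```

An event handler could call this helper on the audio it receives and write the result to disk or upload it.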

Parameters:

sample_rate (Optional[int]) – Desired output sample rate. If None, uses source rate
num_channels (int) – Number of channels (1 for mono, 2 for stereo). Defaults to 1
buffer_size (int) – Size of buffer before triggering events. 0 for no buffering
enable_turn_audio (bool) – Whether turn audio event handlers should be triggered
user_continuous_stream (bool | None)

Audio handling:

Mono output (num_channels=1): User and bot audio are mixed
Stereo output (num_channels=2): User audio on left, bot audio on right
Automatic resampling of incoming audio to match desired sample_rate
Silence insertion for non-continuous audio streams
Buffer synchronization between user and bot audio
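
Silence insertion and buffer synchronization can be pictured as padding whichever buffer is shorter so that both cover the same time span. The sketch below illustrates the idea only and is not pipecat's implementation: it assumes 16-bit PCM, where silence is a run of zero bytes, and the function name `pad_with_silence` is hypothetical.

```python
def pad_with_silence(user: bytes, bot: bytes) -> tuple[bytes, bytes]:
    """Zero-pad the shorter PCM buffer so both buffers have equal length."""
    target = max(len(user), len(bot))
    # In signed 16-bit PCM, zero bytes decode to silent samples.
    return user.ljust(target, b"\x00"), bot.ljust(target, b"\x00")
```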

property sample_rate: int

Current sample rate of the audio processor.

Returns: The sample rate in Hz
Return type: int

property num_channels: int

Number of channels in the audio output.

Returns: Number of channels (1 for mono, 2 for stereo)
Return type: int

has_audio()[source]

Check if both user and bot audio buffers contain data.

Returns: True if both buffers contain audio data
Return type: bool

merge_audio_buffers()[source]

Merge user and bot audio buffers into a single audio stream.

For mono output, audio is mixed. For stereo output, user audio is placed on the left channel and bot audio on the right channel.

Returns: Merged audio data (mixed for mono, two-channel for stereo)
Return type: bytes
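
The difference between the mono and stereo cases can be sketched as follows. This is an illustration under stated assumptions, not the library's code: it assumes two equal-length buffers of 16-bit mono PCM in native byte order, and it mixes by summing with clamping, which is one common choice (the library may mix differently). The name `merge_pcm16` is hypothetical.

```python
from array import array


def merge_pcm16(user: bytes, bot: bytes, num_channels: int) -> bytes:
    """Merge two equal-length 16-bit mono PCM buffers (hypothetical helper)."""
    u, b = array("h"), array("h")
    u.frombytes(user)
    b.frombytes(bot)
    if len(u) != len(b):
        raise ValueError("buffers must hold the same number of samples")
    if num_channels == 1:
        # Mono: sum each sample pair and clamp to the int16 range.
        merged = array("h", (max(-32768, min(32767, x + y)) for x, y in zip(u, b)))
    else:
        # Stereo: interleave as L, R, L, R with user on the left, bot on the right.
        merged = array("h")
        for x, y in zip(u, b):
            merged.append(x)
            merged.append(y)
    return merged.tobytes()
```

The stereo result is twice the size of either input, because every sample position now carries one value per channel.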

async start_recording()[source]

Start recording audio from both user and bot.

Initializes recording state and resets audio buffers.

async stop_recording()[source]

Stop recording and trigger final audio data handlers.

Calls audio handlers with any remaining buffered audio before stopping.

async process_frame(frame, direction)[source]

Process incoming audio frames and manage audio buffers.

Parameters:

frame (Frame)
direction (FrameDirection)