AudioBufferProcessor

class pipecat.processors.audio.audio_buffer_processor.AudioBufferProcessor(*, sample_rate=None, num_channels=1, buffer_size=0, user_continuous_stream=None, enable_turn_audio=False, **kwargs)

Bases: FrameProcessor

Processes and buffers audio frames from both input (user) and output (bot) sources.

This processor manages audio buffering and synchronization, providing both merged and track-specific audio access through event handlers. It supports various audio configurations including sample rate conversion and mono/stereo output.

Events:

  • on_audio_data: Triggered when buffer_size is reached, providing merged audio

  • on_track_audio_data: Triggered when buffer_size is reached, providing separate tracks

  • on_user_turn_audio_data: Triggered when user turn has ended, providing that user turn’s audio

  • on_bot_turn_audio_data: Triggered when bot turn has ended, providing that bot turn’s audio

Parameters:
  • sample_rate (Optional[int]) – Desired output sample rate. If None, uses source rate

  • num_channels (int) – Number of channels (1 for mono, 2 for stereo). Defaults to 1

  • buffer_size (int) – Size of buffer before triggering events. 0 for no buffering

  • enable_turn_audio (bool) – Whether turn audio event handlers should be triggered

  • user_continuous_stream (bool | None)
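
The interaction between buffer_size and the audio events can be pictured with a small stand-alone sketch. It does not use pipecat and is not the library's implementation: the class ThresholdBuffer and its on_flush callback are hypothetical names, the size is assumed to be counted in bytes, and buffer_size=0 is assumed to mean that nothing is flushed until recording stops.

```python
from typing import Callable


class ThresholdBuffer:
    """Hypothetical stand-in: collects bytes and flushes at a size threshold."""

    def __init__(self, buffer_size: int, on_flush: Callable[[bytes], None]):
        self._buffer_size = buffer_size
        self._on_flush = on_flush
        self._buffer = bytearray()

    def add(self, chunk: bytes) -> None:
        self._buffer.extend(chunk)
        # Assumption: with buffer_size == 0 no threshold flush happens;
        # the data waits until stop() is called.
        if self._buffer_size > 0 and len(self._buffer) >= self._buffer_size:
            self._flush()

    def stop(self) -> None:
        # Deliver whatever is still buffered before stopping.
        if self._buffer:
            self._flush()

    def _flush(self) -> None:
        data = bytes(self._buffer)
        self._buffer.clear()
        self._on_flush(data)


chunks = []
buf = ThresholdBuffer(buffer_size=4, on_flush=chunks.append)
buf.add(b"ab")   # below the threshold, nothing delivered yet
buf.add(b"cde")  # threshold reached, one flush
buf.stop()       # nothing left, so no extra flush
```

In this sketch a single callback stands in for the event handlers; the real processor dispatches to the events listed above.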

Audio handling:
  • Mono output (num_channels=1): User and bot audio are mixed

  • Stereo output (num_channels=2): User audio on left, bot audio on right

  • Automatic resampling of incoming audio to match desired sample_rate

  • Silence insertion for non-continuous audio streams

  • Buffer synchronization between user and bot audio
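
The mono and stereo layouts described above can be illustrated with a short sketch. It is only an approximation under stated assumptions: both tracks are taken to be mono 16-bit signed PCM in native byte order, the shorter track is padded with silence at the end, and mono mixing is done by summing and clamping. The function names merge_tracks and pad_with_silence are hypothetical, and the library's actual mixing strategy may differ.

```python
import array


def pad_with_silence(pcm: bytes, length: int) -> bytes:
    """Right-pad PCM data with zero bytes (silence) up to `length` bytes."""
    return pcm + b"\x00" * max(0, length - len(pcm))


def merge_tracks(user: bytes, bot: bytes, num_channels: int) -> bytes:
    """Merge two mono 16-bit PCM tracks into mono (mixed) or stereo (interleaved)."""
    size = max(len(user), len(bot))
    u = array.array("h", pad_with_silence(user, size))
    b = array.array("h", pad_with_silence(bot, size))
    if num_channels == 1:
        # Assumed mixing rule: add the samples and clamp to the int16 range.
        mixed = (max(-32768, min(32767, x + y)) for x, y in zip(u, b))
        out = array.array("h", mixed)
    else:
        out = array.array("h")
        for x, y in zip(u, b):
            out.append(x)  # left channel: user
            out.append(y)  # right channel: bot
    return out.tobytes()
```

Padding both tracks to the same length before merging keeps the user and bot samples aligned, which is the role the silence insertion and buffer synchronization items play in the list above.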

property sample_rate: int

Current sample rate of the audio processor.

Returns:

The sample rate in Hz

Return type:

int

property num_channels: int

Number of channels in the audio output.

Returns:

Number of channels (1 for mono, 2 for stereo)

Return type:

int

has_audio()

Check if both user and bot audio buffers contain data.

Returns:

True if both buffers contain audio data

Return type:

bool

merge_audio_buffers()

Merge user and bot audio buffers into a single audio stream.

For mono output, audio is mixed. For stereo output, user audio is placed on the left channel and bot audio on the right channel.

Returns:

Merged audio data (mixed for mono output; user and bot on separate channels for stereo output)

Return type:

bytes

async start_recording()

Start recording audio from both user and bot.

Initializes recording state and resets audio buffers.

async stop_recording()

Stop recording and trigger final audio data handlers.

Calls audio handlers with any remaining buffered audio before stopping.
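
Once a handler receives the final audio, a common next step is to save it. The sketch below assumes the handler is given raw 16-bit PCM bytes together with the sample rate and channel count; that payload shape is an assumption and is not stated in this reference. The helper name pcm_to_wav_bytes is hypothetical, and only the standard library is used.

```python
import io
import wave


def pcm_to_wav_bytes(audio: bytes, sample_rate: int, num_channels: int) -> bytes:
    """Wrap raw 16-bit PCM in a WAV container and return the file contents."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(num_channels)
        wf.setsampwidth(2)  # assumed sample width: 2 bytes (16-bit)
        wf.setframerate(sample_rate)
        wf.writeframes(audio)
    return buf.getvalue()


# Example: 8 silent mono samples at 16 kHz.
wav_data = pcm_to_wav_bytes(b"\x00\x00" * 8, sample_rate=16000, num_channels=1)
```

Returning bytes rather than writing to a path keeps the helper free of blocking file I/O decisions, so the caller can choose how and where to persist the result.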

async process_frame(frame, direction)

Process incoming audio frames and manage audio buffers.

Parameters:
  • frame (Frame)

  • direction (FrameDirection)