PatternPairAggregator
- class pipecat.utils.text.pattern_pair_aggregator.PatternMatch(pattern_id, full_match, content)
Bases: object
Represents a matched pattern pair with its content.
A PatternMatch object is created when a complete pattern pair is found in the text. It records which pattern pair matched, the full matched text (including the start and end patterns), and the content between the patterns.
- Parameters:
pattern_id (str)
full_match (str)
content (str)
- pattern_id
The identifier of the matched pattern pair.
- full_match
The complete text including start and end patterns.
- content
The text content between the start and end patterns.
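The relationship between full_match and content can be illustrated with a small sketch. PatternMatchSketch and make_match are hypothetical stand-ins written for this example, not part of pipecat; they only mirror the three attributes documented above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatternMatchSketch:
    """Hypothetical stand-in with the three documented attributes."""

    pattern_id: str
    full_match: str
    content: str


def make_match(pattern_id: str, start: str, end: str, text: str) -> Optional[PatternMatchSketch]:
    """Build a match for the first start...end pair in text, or return None."""
    begin = text.find(start)
    if begin == -1:
        return None
    stop = text.find(end, begin + len(start))
    if stop == -1:
        return None  # start pattern seen, but no end pattern yet
    return PatternMatchSketch(
        pattern_id=pattern_id,
        full_match=text[begin : stop + len(end)],  # delimiters included
        content=text[begin + len(start) : stop],  # delimiters excluded
    )


match = make_match("voice", "<voice>", "</voice>", "Say <voice>narrator</voice> now.")
```

Here match.full_match is "<voice>narrator</voice>" and match.content is "narrator".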
- class pipecat.utils.text.pattern_pair_aggregator.PatternPairAggregator
Bases: BaseTextAggregator
Aggregator that identifies and processes content between pattern pairs.
This aggregator buffers text until it can identify complete pattern pairs (defined by start and end patterns), processes the content between these patterns using registered handlers, and returns text at sentence boundaries. It’s particularly useful for processing structured content in streaming text, such as XML tags, markdown formatting, or custom delimiters.
The aggregator correctly identifies patterns that span multiple text chunks, and it keeps buffering rather than returning text at a sentence boundary that falls inside an incomplete pattern pair.
- property text: str
Get the currently buffered text.
- Returns:
The current text buffer content.
- add_pattern_pair(pattern_id, start_pattern, end_pattern, remove_match=True)
Add a pattern pair to detect in the text.
Registers a new pattern pair with a unique identifier. The aggregator will look for text that starts with the start pattern and ends with the end pattern, and treat the content between them as a match.
- Parameters:
pattern_id (str) – Unique identifier for this pattern pair.
start_pattern (str) – Pattern that marks the beginning of content.
end_pattern (str) – Pattern that marks the end of content.
remove_match (bool) – Whether to remove the full match (start pattern, content, and end pattern) from the text. Defaults to True.
- Returns:
Self for method chaining.
- Return type:
PatternPairAggregator
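Because the method returns the aggregator itself, registrations can be chained. The sketch below shows that pattern with PairRegistry, a hypothetical stand-in that copies only the documented signature; it is not the library's class and does no text processing.

```python
from typing import Dict, NamedTuple


class PatternPair(NamedTuple):
    start: str
    end: str
    remove_match: bool


class PairRegistry:
    """Hypothetical stand-in; mirrors only the documented signature."""

    def __init__(self) -> None:
        self.pairs: Dict[str, PatternPair] = {}

    def add_pattern_pair(
        self,
        pattern_id: str,
        start_pattern: str,
        end_pattern: str,
        remove_match: bool = True,
    ) -> "PairRegistry":
        self.pairs[pattern_id] = PatternPair(start_pattern, end_pattern, remove_match)
        return self  # returning self is what makes chaining possible


registry = (
    PairRegistry()
    .add_pattern_pair("voice", "<voice>", "</voice>")
    .add_pattern_pair("note", "[[", "]]", remove_match=False)
)
```

The pattern IDs and delimiters here are illustrative only.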
- on_pattern_match(pattern_id, handler)
Register a handler for when a pattern pair is matched.
The handler will be called whenever a complete match for the specified pattern ID is found in the text.
- Parameters:
pattern_id (str) – ID of the pattern pair to match.
handler (Callable[[PatternMatch], Awaitable[None]]) – Async function to call when the pattern is matched. It receives the PatternMatch object for that match.
- Returns:
Self for method chaining.
- Return type:
PatternPairAggregator
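The handler type above means the callback must be a coroutine function that takes one match object. The following is a minimal sketch of that shape, assuming a hypothetical MatchSketch class and a hypothetical dispatch helper in place of the library's own types and internals.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List


@dataclass
class MatchSketch:
    """Hypothetical stand-in for the documented PatternMatch attributes."""

    pattern_id: str
    full_match: str
    content: str


Handler = Callable[[MatchSketch], Awaitable[None]]
handlers: Dict[str, Handler] = {}
selected_voices: List[str] = []


async def on_voice(match: MatchSketch) -> None:
    # A handler is awaited, so it can do async work; here it just records.
    selected_voices.append(match.content.strip())


handlers["voice"] = on_voice


async def dispatch(match: MatchSketch) -> None:
    """Await the handler registered for this pattern ID, if there is one."""
    handler = handlers.get(match.pattern_id)
    if handler is not None:
        await handler(match)


asyncio.run(dispatch(MatchSketch("voice", "<voice> narrator </voice>", " narrator ")))
asyncio.run(dispatch(MatchSketch("unknown", "[x]", "x")))  # no handler registered
```

After both calls, selected_voices holds only "narrator", because no handler is registered for the second pattern ID.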
- async aggregate(text)
Aggregate text and process pattern pairs.
This method adds the new text to the buffer, processes any complete pattern pairs, and returns processed text up to sentence boundaries if possible. If there are incomplete patterns (start without matching end), it will continue buffering text.
- Parameters:
text (str) – New text to add to the buffer.
- Returns:
Processed text up to a sentence boundary, or None if more text is needed to form a complete sentence or pattern.
- Return type:
str | None
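The buffering behavior described above can be modeled with a toy aggregator. SketchAggregator is a hypothetical, simplified model and not pipecat's implementation: it supports one pattern pair, always removes matches, and treats the first '.', '!' or '?' as a sentence boundary.

```python
import asyncio
import re
from typing import List, Optional, Tuple


class SketchAggregator:
    """Toy model of the documented buffering; not pipecat's implementation."""

    def __init__(self, start: str, end: str) -> None:
        self._buffer = ""
        self._start = start
        self._pair = re.compile(re.escape(start) + "(.*?)" + re.escape(end), re.DOTALL)
        self.contents: List[str] = []  # content captured from complete pairs

    async def aggregate(self, text: str) -> Optional[str]:
        self._buffer += text
        # Record and strip every complete pair (as if remove_match=True).
        self.contents += self._pair.findall(self._buffer)
        self._buffer = self._pair.sub("", self._buffer)
        # A start pattern without its end is still pending: keep buffering.
        if self._start in self._buffer:
            return None
        # Naive sentence boundary: the first '.', '!' or '?' in the buffer.
        boundary = re.search(r"[.!?]", self._buffer)
        if boundary is None:
            return None
        sentence = self._buffer[: boundary.end()]
        self._buffer = self._buffer[boundary.end() :]
        return sentence


async def run(chunks: List[str]) -> Tuple[List[Optional[str]], List[str]]:
    aggregator = SketchAggregator("<voice>", "</voice>")
    results = []
    for chunk in chunks:
        results.append(await aggregator.aggregate(chunk))
    return results, aggregator.contents


results, contents = asyncio.run(
    run(["Use <voice>nar", "rator</voice>this", " tone. Next"])
)
```

The first call returns None because the pair is incomplete, the second returns None because no sentence boundary has arrived, and the third returns "Use this tone." while " Next" stays buffered. Callers should therefore treat None as "nothing to emit yet" and keep feeding chunks.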
- async handle_interruption()
Handle interruptions by clearing the buffer.
Called when an interruption occurs in the processing pipeline, to reset the state and discard any partially aggregated text.
- async reset()
Clear the internally aggregated text.
Resets the aggregator to its initial state, discarding any buffered text.