LLM

class pipecat.services.cerebras.llm.CerebrasLLMService(*, api_key, base_url='https://api.cerebras.ai/v1', model='llama-3.3-70b', **kwargs)[source]

Bases: OpenAILLMService

A service for interacting with Cerebras’s API using the OpenAI-compatible interface.

This service extends OpenAILLMService to connect to Cerebras’s API endpoint while maintaining full compatibility with OpenAI’s interface and functionality.

Parameters:
  • api_key (str) – The API key for accessing Cerebras’s API

  • base_url (str, optional) – The base URL for Cerebras API. Defaults to “https://api.cerebras.ai/v1”

  • model (str, optional) – The model identifier to use. Defaults to “llama-3.3-70b”

  • **kwargs – Additional keyword arguments passed to OpenAILLMService
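
A minimal instantiation sketch follows; the CEREBRAS_API_KEY environment variable is an assumption for illustration, and any extra keyword arguments would be forwarded to OpenAILLMService:

    import os

    from pipecat.services.cerebras.llm import CerebrasLLMService

    # Construct the service; api_key is required, while base_url and
    # model fall back to the defaults documented above.
    llm = CerebrasLLMService(
        api_key=os.environ["CEREBRAS_API_KEY"],  # assumed env var
        model="llama-3.3-70b",                   # documented default
    )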

create_client(api_key=None, base_url=None, **kwargs)[source]

Create OpenAI-compatible client for Cerebras API endpoint.
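
The override is small by design: it reuses the parent class’s client factory, aimed at the Cerebras endpoint. A hypothetical sketch of the shape such an override takes (not the library’s verbatim source):

    def create_client(self, api_key=None, base_url=None, **kwargs):
        # Delegate to OpenAILLMService, which constructs an
        # OpenAI-compatible async client against the given base_url.
        return super().create_client(api_key=api_key, base_url=base_url, **kwargs)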

async get_chat_completions(context, messages)[source]

Create a streaming chat completion using Cerebras’s API.

Parameters:
  • context (OpenAILLMContext) – The context object containing tools configuration and other settings for the chat completion.

  • messages (List[ChatCompletionDeveloperMessageParam | ChatCompletionSystemMessageParam | ChatCompletionUserMessageParam | ChatCompletionAssistantMessageParam | ChatCompletionToolMessageParam | ChatCompletionFunctionMessageParam]) – The list of messages comprising the conversation history and current request.

Returns:
  A streaming response of chat completion chunks that can be processed asynchronously.

Return type:

AsyncStream[ChatCompletionChunk]
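
Because the result is an AsyncStream of ChatCompletionChunk objects (the OpenAI streaming schema), callers typically consume it with async for. A minimal consumption sketch; the run_completion wrapper and the print-based handling are illustrative assumptions, not part of this API:

    async def run_completion(llm, context, messages):
        # Request a streaming completion, then print each text delta
        # as it arrives from Cerebras.
        stream = await llm.get_chat_completions(context, messages)
        async for chunk in stream:
            if chunk.choices and chunk.choices[0].delta.content:
                print(chunk.choices[0].delta.content, end="", flush=True)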