Skip to content

LLM Client

One abstract base, one concrete implementation, multiple modalities.

BaseModelClient defines the contract. OpenAIClient implements it โ€” handling text generation, vision, structured output, streaming, speech-to-text, text-to-speech, real-time S2S, and image generation from a single class.


Modality map

graph TB
    BC["BaseModelClient\n(core/llm/base_client.py)"] --> OC["OpenAIClient\n(integrations/llm/openai/)"]

    OC --> TXT["๐Ÿ“ Text\ngenerate()\ngenerate_stream()"]
    OC --> VIS["๐Ÿ–ผ Vision\ngenerate() with ImageContent"]
    OC --> STR["๐Ÿ” Structured output\nresponse_format=MyModel"]
    OC --> STT["๐ŸŽค STT\ntranscribe()"]
    OC --> TTS["๐Ÿ”Š TTS\nstream_tts()"]
    OC --> S2S["๐Ÿ“ก Realtime S2S\ncreate_s2s_session()\ns2s_ws_url()"]
    OC --> IMG["๐ŸŽจ Image gen\ngenerate_image()"]

    style BC  fill:#0d2b2b,stroke:#2dd4bf,color:#e2ecff
    style OC  fill:#1a1a2e,stroke:#818cf8,color:#e2ecff
    style TXT fill:#1a2b1a,stroke:#4ade80,color:#e2ecff

Import paths

# Abstract base (no external deps โ€” safe to import anywhere)
from ravi.core.llm.base_client import BaseModelClient

# Concrete implementation
from ravi.integrations.llm.openai.openai_client import OpenAIClient

Text generation

Blocking

from ravi.integrations.llm.openai.openai_client import OpenAIClient
from ravi.core.messages import SystemMessage, UserMessage

client = OpenAIClient(model="gpt-4o", temperature=0.7)

messages = [
    SystemMessage("You are a helpful assistant."),
    UserMessage(content=["What is the capital of France?"]),
]

response = await client.generate(messages)
print(response.content[0].text)   # "Paris"
print(response.usage)             # {prompt_tokens, completion_tokens, total_tokens}

Streaming

sequenceDiagram
    autonumber
    participant A as Agent
    participant C as OpenAIClient
    participant O as OpenAI API

    A->>C: generate_stream(messages, tools)
    C->>O: POST /chat/completions stream=true
    loop Token by token
        O-->>C: SSE chunk
        C-->>A: yield AssistantMessage (partial)
    end
    O-->>C: [DONE]
    C-->>A: yield AssistantMessage (final, finish_reason)
async for partial in client.generate_stream(messages):
    if partial.content:
        print(partial.content[0].text, end="", flush=True)

Capability checks

Audio and image methods are optional. Always check the property before calling.

# Audio
if client.supports_audio:
    transcript = await client.transcribe(
        audio_bytes=audio_data,
        filename="meeting.mp3",
        model="gpt-4o-transcribe",
        language="en",
    )

    async for chunk in client.stream_tts(
        text="Hello world",
        voice="coral",         # alloy echo fable nova onyx sage shimmer
        model="tts-1",
        response_format="mp3",
    ):
        audio_buffer.write(chunk)

# Real-time speech-to-speech
if client.supports_s2s:
    ws_url = client.s2s_ws_url(model="gpt-4o-realtime-preview")
    session = await client.create_s2s_session(
        model="gpt-4o-realtime-preview",
        voice="coral",
        instructions="You are a voice assistant.",
    )

# Image generation
if client.supports_image_generation:
    urls = await client.generate_image(
        "a cat in a spacesuit, realistic",
        model="gpt-image-1",
        size="1024x1024",
        quality="high",    # low / medium / high / auto
    )
    # Returns list[str] โ€” URLs or data:image/png;base64,... strings

Structured output

Pass a Pydantic model as response_format and the LLM is forced to return valid JSON matching your schema.

from pydantic import BaseModel

class Analysis(BaseModel):
    sentiment: str
    confidence: float
    summary: str

response = await client.generate(
    messages,
    response_format=Analysis,
)

result = response.parsed       # Analysis instance, or None if parse failed
if result:
    print(result.sentiment, result.confidence)

Vision input

Pass ImageContent objects in a UserMessage to feed images to the LLM.

from ravi.core.messages import UserMessage, ImageContent

msg = UserMessage(content=[
    "Describe what you see in this image.",
    ImageContent(url="https://example.com/photo.jpg", detail="high"),
])

# From bytes (no network fetch)
msg = UserMessage(content=[
    ImageContent(data=open("chart.png", "rb").read(), media_type="image/png"),
])

# From OpenAI Files API
msg = UserMessage(content=[
    ImageContent(file_id="file-abc123"),
])

detail values: "low" ยท "high" ยท "original" ยท "auto" (default)


Token counting

count = await client.count_tokens(messages)
# Used by TokenBudgetContext to fit messages in the context window

Source

File What it owns
core/llm/base_client.py BaseModelClient ABC โ€” text, audio, image generation signatures
integrations/llm/openai/openai_client.py OpenAIClient โ€” all modalities