This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API. In particular, this demonstrates:
- Sequential agent handoffs according to a defined agent graph (taking inspiration from OpenAI Swarm)
- Background escalation to more intelligent models like o4-mini for high-stakes decisions
- Prompting models to follow a state machine, for example to accurately collect information like names and phone numbers, confirming each entry character by character to authenticate a user.
Here's a quick demo video if you'd like a walkthrough. You should be able to use this repo to prototype your own multi-agent realtime voice app in less than 20 minutes!
- This is a Next.js TypeScript app
- Install dependencies with `npm i`
- Add your `OPENAI_API_KEY` to your env. Either add it to your `.bash_profile` or equivalent file, or copy `.env.sample` to `.env` and add it there.
- Start the server with `npm run dev`
- Open your browser to http://localhost:3000 to see the app. It should automatically connect to the `simpleExample` Agent Set.
Configuration in `src/app/agentConfigs/simpleExample.ts`:

```typescript
import { AgentConfig } from "@/app/types";
import { injectTransferTools } from "./utils";

// Define agents
const haikuWriter: AgentConfig = {
  name: "haikuWriter",
  publicDescription: "Agent that writes haikus.", // Context for the agent_transfer tool
  instructions:
    "Ask the user for a topic, then reply with a haiku about that topic.",
  tools: [],
};

const greeter: AgentConfig = {
  name: "greeter",
  publicDescription: "Agent that greets the user.",
  instructions:
    "Please greet the user and ask them if they'd like a haiku. If yes, transfer them to the 'haikuWriter' agent.",
  tools: [],
  downstreamAgents: [haikuWriter],
};

// Add the transfer tool to point to downstreamAgents
const agents = injectTransferTools([greeter, haikuWriter]);

export default agents;
```
This fully specifies the agent set that was used in the interaction shown in the screenshot above.
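To give a sense of what `injectTransferTools` does, here is a minimal, hypothetical sketch: the type and helper below are simplified stand-ins, not the repo's actual implementation in `src/app/agentConfigs/utils.ts`.

```typescript
// Hypothetical sketch of an injectTransferTools-style helper.
// The real AgentConfig type and helper in this repo may differ.
type Tool = {
  type: "function";
  name: string;
  description: string;
  parameters: object;
};

interface AgentConfig {
  name: string;
  publicDescription: string;
  instructions: string;
  tools: Tool[];
  downstreamAgents?: AgentConfig[];
}

function injectTransferTools(agents: AgentConfig[]): AgentConfig[] {
  for (const agent of agents) {
    const downstream = agent.downstreamAgents ?? [];
    if (downstream.length > 0) {
      // Build a transferAgents tool whose set of destinations comes from
      // each downstream agent's name and publicDescription.
      agent.tools.push({
        type: "function",
        name: "transferAgents",
        description:
          "Transfer the conversation to another agent. Available agents: " +
          downstream
            .map((a) => `${a.name} (${a.publicDescription})`)
            .join(", "),
        parameters: {
          type: "object",
          properties: {
            destination: {
              type: "string",
              enum: downstream.map((a) => a.name),
            },
          },
          required: ["destination"],
        },
      });
    }
  }
  return agents;
}
```

The key design point is that only agents with `downstreamAgents` receive a transfer tool, so leaf agents like `haikuWriter` cannot hand the conversation off further.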
This diagram illustrates the interaction flow defined in `src/app/agentConfigs/customerServiceRetail/`.
```mermaid
sequenceDiagram
    participant User
    participant WebClient as Next.js Client
    participant NextAPI as /api/session
    participant RealtimeAPI as OpenAI Realtime API
    participant AgentManager as Agents (authentication, returns, sales, simulatedHuman)
    participant o4mini as "o4-mini" (Escalation Model)

    Note over WebClient: User navigates to ?agentConfig=customerServiceRetail
    User->>WebClient: Open Page
    WebClient->>NextAPI: GET /api/session
    NextAPI->>RealtimeAPI: POST /v1/realtime/sessions
    RealtimeAPI->>NextAPI: Returns ephemeral session
    NextAPI->>WebClient: Returns ephemeral token (JSON)

    Note right of WebClient: Start RTC handshake
    WebClient->>RealtimeAPI: Offer SDP (WebRTC)
    RealtimeAPI->>WebClient: SDP answer
    WebClient->>WebClient: DataChannel "oai-events" established

    Note over AgentManager: Default agent is "authentication"
    User->>WebClient: "Hi, I'd like to return my snowboard."
    WebClient->>AgentManager: conversation.item.create (role=user)
    WebClient->>RealtimeAPI: {type: "conversation.item.create"}
    WebClient->>RealtimeAPI: {type: "response.create"}

    authentication->>AgentManager: Requests user info, calls authenticate_user_information()
    AgentManager-->>WebClient: function_call => name="authenticate_user_information"
    WebClient->>WebClient: handleFunctionCall => verifies details

    Note over AgentManager: After user is authenticated
    authentication->>AgentManager: transferAgents("returns")
    AgentManager-->>WebClient: function_call => name="transferAgents" args={ destination: "returns" }
    WebClient->>WebClient: setSelectedAgentName("returns")

    Note over returns: The user wants to process a return
    returns->>AgentManager: function_call => checkEligibilityAndPossiblyInitiateReturn
    AgentManager-->>WebClient: function_call => name="checkEligibilityAndPossiblyInitiateReturn"

    Note over WebClient: The WebClient calls /api/chat/completions with model="o4-mini"
    WebClient->>o4mini: "Is this item eligible for return?"
    o4mini->>WebClient: "Yes/No (plus notes)"

    Note right of returns: returns uses the result from "o4-mini"
    returns->>AgentManager: "Return is approved" or "Return is denied"
    AgentManager->>WebClient: conversation.item.create (assistant role)
    WebClient->>User: Displays final verdict
```
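The `function_call` hops in the diagram are resolved client-side. A simplified, hypothetical sketch of that dispatch is below; the event shape and function names are illustrative, not the app's exact API:

```typescript
// Hypothetical sketch of client-side function-call dispatch over the
// "oai-events" data channel. Names are illustrative.
type FunctionCallEvent = { name: string; call_id: string; arguments: string };

interface ActiveAgent {
  name: string;
  // Optional client-side implementations for the agent's tools.
  toolLogic?: Record<string, (args: any) => Promise<any> | any>;
}

async function handleFunctionCall(
  event: FunctionCallEvent,
  agent: ActiveAgent,
  setSelectedAgentName: (name: string) => void
): Promise<any> {
  const args = JSON.parse(event.arguments);

  if (event.name === "transferAgents") {
    // Agent handoff: switch the active agent, then report the outcome
    // back to the model as the tool result.
    setSelectedAgentName(args.destination);
    return { didTransfer: true, destination: args.destination };
  }

  const impl = agent.toolLogic?.[event.name];
  if (impl) {
    // Tool with client-side logic (which may itself call a background model).
    return await impl(args);
  }

  // No local implementation: acknowledge so the model can continue.
  return { result: true };
}
```

The result object would then be sent back as a `conversation.item.create` with the function output, followed by a `response.create` so the model continues speaking.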
- Check out the configs in `src/app/agentConfigs`. The example above is a minimal demo that illustrates the core concepts.
- `frontDeskAuthentication`: Guides the user through a step-by-step authentication flow, confirming each value character by character, authenticates the user with a tool call, and then transfers to another agent. Note that the second agent is intentionally "bored" to show how to prompt for personality and tone.
- `customerServiceRetail`: Also guides through an authentication flow, reads a long offer from a canned script verbatim, and then walks through a complex return flow that requires looking up orders and policies, gathering user context, and checking with `o4-mini` to ensure the return is eligible. To test this flow, say that you'd like to return your snowboard and go through the necessary prompts!
- You can copy these to make your own multi-agent voice app! Once you make a new agent set config, add it to `src/app/agentConfigs/index.ts` and you should be able to select it in the UI in the "Scenario" dropdown menu.
- To see how to define tools and toolLogic, including a background LLM call, see `src/app/agentConfigs/customerServiceRetail/returns.ts`.
- To see how to define a detailed personality and tone, and use a prompt state machine to collect user information step by step, see `src/app/agentConfigs/frontDeskAuthentication/authentication.ts`.
- To see how to wire up Agents into a single Agent Set, see `src/app/agentConfigs/frontDeskAuthentication/index.ts`.
- If you want help creating your own prompt using these conventions, we've included a metaprompt here, or you can use our Voice Agent Metaprompter GPT.
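The background escalation mentioned above can be sketched as a tool whose logic consults a smarter model before answering. This is a simplified illustration, not the actual code in `returns.ts`; the model call is injected as a parameter so the decision logic runs offline, whereas the app would POST to `/api/chat/completions` with `model: "o4-mini"`.

```typescript
// Simplified sketch of background escalation to a higher-reasoning model.
// callModel is injected so the logic is testable without network access;
// in the real app it would be a chat-completions request to "o4-mini".
type CallModel = (prompt: string) => Promise<string>;

async function checkEligibilityAndPossiblyInitiateReturn(
  args: { itemId: string; userContext: string },
  callModel: CallModel
): Promise<{ eligible: boolean; rationale: string }> {
  // Hand the high-stakes decision to the escalation model.
  const prompt =
    `Given the return policy and this context, is item ${args.itemId} ` +
    `eligible for return? Context: ${args.userContext}. ` +
    `Answer YES or NO, with a brief reason.`;
  const verdict = await callModel(prompt);
  return {
    eligible: verdict.trim().toUpperCase().startsWith("YES"),
    rationale: verdict,
  };
}
```

The realtime voice agent then relays the verdict ("Return is approved" or "Return is denied") to the user, keeping the expensive model out of the latency-sensitive audio path.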
Assistant messages are checked for safety and compliance using a guardrail function before being finalized in the transcript. This is implemented in `src/app/hooks/useHandleServerEvent.ts` as the `processGuardrail` function, which is invoked on each assistant message to run a moderation/classification check. You can review or customize this logic by editing the `processGuardrail` function definition and its invocation inside `useHandleServerEvent`.
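As a rough illustration of the shape of such a check: the sketch below is hypothetical and uses a stub classifier so it runs offline, whereas the actual `processGuardrail` performs a model-based moderation/classification pass.

```typescript
// Hypothetical shape of a guardrail check on assistant messages.
// classify() is a stub standing in for a model-based moderation call.
type GuardrailResult = { status: "PASS" | "FAIL"; category?: string };

function classify(text: string): GuardrailResult {
  // Stand-in for a moderation model: flag an obviously disallowed pattern.
  if (/\b(password|ssn)\b/i.test(text)) {
    return { status: "FAIL", category: "sensitive_data" };
  }
  return { status: "PASS" };
}

function processGuardrailSketch(assistantMessage: string): {
  finalText: string;
  guardrail: GuardrailResult;
} {
  const guardrail = classify(assistantMessage);
  return {
    // On failure, replace the message before it is finalized in the transcript.
    finalText:
      guardrail.status === "PASS"
        ? assistantMessage
        : "[message withheld by guardrail]",
    guardrail,
  };
}
```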
- You can select agent scenarios in the Scenario dropdown, and automatically switch to a specific agent with the Agent dropdown.
- The conversation transcript is on the left, including tool calls, tool call responses, and agent changes. Click to expand non-message elements.
- The event log is on the right, showing both client and server events. Click to see the full payload.
- On the bottom, you can disconnect, toggle between automated voice-activity detection and push-to-talk (PTT), turn off audio playback, and toggle logs.