AI Utils is a compact library for building edge-rendered AI-powered streaming text and chat UIs. It takes care of the boilerplate streaming code without adding any abstraction or indirection between you and your AI model provider's SDK, letting you focus on building your next big thing instead of wasting another day messing around with text encoders.
- Edge Runtime compatibility
- First-class support for native OpenAI, Anthropic, and HuggingFace Inference JavaScript SDKs
- SWR-powered React hooks for fetching and rendering streaming text responses
- Callbacks for saving completed streaming responses to a database (in the same request)
pnpm install @vercel/ai-utils
Table of Contents
- Features
- Installation
- Background
- Usage
- Tutorial
- API Reference
Creating UIs with contemporary AI providers is a daunting task. Ideally, language models and providers would be fast enough that developers could simply fetch complete responses as JSON in a few hundred milliseconds, but the reality is starkly different. It's quite common for these LLMs to take 5-40s to produce a response.
Instead of tormenting users with a seemingly endless loading spinner while these models conjure up responses or completions, the progressive approach involves streaming the text output to the frontend on the fly, a tactic championed by OpenAI's ChatGPT. However, implementing this technique is easier said than done. Each AI provider has its own unique SDK, its own envelope surrounding the tokens, and its own metadata (whose usefulness varies drastically).
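To make the "envelope" idea concrete, here is a minimal sketch that assumes the server-sent-event framing used by OpenAI's streaming endpoints: each event is a `data: <json>` line and the stream ends with `data: [DONE]`. The helper name `extractOpenAITokens` is hypothetical and not part of this library, and the sketch assumes it receives a complete payload rather than events split across network chunks.

```typescript
// Hypothetical helper (not part of this library): pull the text tokens out of
// a raw OpenAI-style SSE payload.
function extractOpenAITokens(ssePayload: string): string[] {
  const tokens: string[] = []
  for (const line of ssePayload.split('\n')) {
    const trimmed = line.trim()
    if (!trimmed.startsWith('data:')) continue // skip blank separator lines
    const data = trimmed.slice('data:'.length).trim()
    if (data === '[DONE]') break // end-of-stream sentinel
    const json = JSON.parse(data)
    // Chat models carry text in `delta.content`; completion models in `text`.
    const text: unknown =
      json.choices?.[0]?.delta?.content ?? json.choices?.[0]?.text
    if (typeof text === 'string' && text.length > 0) tokens.push(text)
  }
  return tokens
}

// A hand-written sample payload for illustration only.
const sample = [
  'data: {"choices":[{"delta":{"role":"assistant"}}]}',
  '',
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  '',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  '',
  'data: [DONE]',
  ''
].join('\n')

console.log(extractOpenAITokens(sample).join('')) // prints "Hello"
```

Other providers wrap their tokens differently, which is why each one needs its own small adapter.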
Many AI utility helpers so far in the JS ecosystem tend to overcomplicate things with unnecessary magic tricks, excess levels of indirection, and lossy abstractions. Here's where Vercel AI Utils comes to the rescue—a compact library designed to alleviate the headaches of constructing streaming text UIs by taking care of the most annoying parts and then getting out of your way:
- Diminish the boilerplate necessary for handling streaming text responses
- Guarantee the capability to run functions at the Edge
- Streamline fetching and rendering of streaming responses (in React)
The goal of this library lies in its commitment to work directly with each AI/Model Hosting Provider's SDK, an equivalent edge-compatible version, or a vanilla fetch function. Its job is simply to cut through the confusion and handle the intricacies of streaming text, leaving you to concentrate on building your next big thing instead of wasting another afternoon tweaking TextEncoder with trial and error.
// app/api/generate/route.ts
import { Configuration, OpenAIApi } from 'openai-edge'
import { OpenAIStream, StreamingTextResponse } from '@vercel/ai-utils'
const config = new Configuration({
apiKey: process.env.OPENAI_API_KEY
})
const openai = new OpenAIApi(config)
export const runtime = 'edge'
export async function POST() {
const response = await openai.createChatCompletion({
model: 'gpt-4',
stream: true,
messages: [{ role: 'user', content: 'What is love?' }]
})
const stream = OpenAIStream(response)
return new StreamingTextResponse(stream)
}
For this example, we'll stream a chat completion text from OpenAI's gpt-3.5-turbo and render it in Next.js. This tutorial assumes you have
Create a Next.js application and install @vercel/ai-utils and openai-edge. We currently prefer the openai-edge library over the official OpenAI SDK because the official SDK uses axios, which is not compatible with Vercel Edge Functions.
pnpx create-next-app my-ai-app
cd my-ai-app
pnpm install @vercel/ai-utils openai-edge
Create a .env file and add an OpenAI API key called OPENAI_API_KEY.
touch .env
OPENAI_API_KEY=xxxxxxxxx
Create a Next.js Route Handler that runs on the Edge Runtime. We'll use it to generate a chat completion via OpenAI and stream it back to our Next.js app.
// ./app/api/chat/route.ts
import { Configuration, OpenAIApi } from 'openai-edge'
import { OpenAIStream, StreamingTextResponse } from '@vercel/ai-utils'
// Create an OpenAI API client (that's edge friendly!)
const config = new Configuration({
apiKey: process.env.OPENAI_API_KEY
})
const openai = new OpenAIApi(config)
// IMPORTANT! Set the runtime to edge
export const runtime = 'edge'
export async function POST(req: Request) {
// Extract the `messages` from the body of the request
const { messages } = await req.json()
// Ask OpenAI for a streaming chat completion given the messages
const response = await openai.createChatCompletion({
model: 'gpt-3.5-turbo',
stream: true,
messages
})
// Convert the response into a friendly text-stream
const stream = OpenAIStream(response)
// Respond with the stream
return new StreamingTextResponse(stream)
}
Vercel AI Utils provides two utility helpers to make the above seamless. First, we pass the streaming response we receive from OpenAI to OpenAIStream. This method decodes/extracts the text tokens in the response and then re-encodes them properly for simple consumption. We can then pass that new stream directly to StreamingTextResponse. This is another utility class that extends the normal Node/Edge Runtime Response class with the default headers you probably want (hint: 'Content-Type': 'text/plain; charset=utf-8' is already set for you).
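As a rough picture of what "a Response with default headers" can look like, here is a minimal sketch. `TextStreamResponseSketch` is a hypothetical stand-in, not the library's source, and it assumes the second argument is a standard `ResponseInit`, which may differ from the actual `StreamingTextResponse` signature.

```typescript
// Hypothetical stand-in for a streaming text response class (assumption:
// the optional second argument is a standard ResponseInit).
class TextStreamResponseSketch extends Response {
  constructor(stream: ReadableStream<Uint8Array>, init?: ResponseInit) {
    // Start from the caller's headers and fill in the text default if absent.
    const headers = new Headers(init?.headers)
    if (!headers.has('Content-Type')) {
      headers.set('Content-Type', 'text/plain; charset=utf-8')
    }
    super(stream, { ...init, status: 200, headers })
  }
}

// Build a tiny one-chunk stream to exercise the class.
const demoEncoder = new TextEncoder()
const demoStream = new ReadableStream<Uint8Array>({
  start(controller) {
    controller.enqueue(demoEncoder.encode('Hello'))
    controller.close()
  }
})

const demoResponse = new TextStreamResponseSketch(demoStream, {
  headers: { 'X-Demo': '1' }
})
console.log(demoResponse.status, demoResponse.headers.get('Content-Type'))
```

Because the result is still a plain Response, it can be returned directly from a Route Handler.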
Create a Client Component with a form that gathers the prompt from the user and renders the completion as it streams back.
// ./app/form.tsx
'use client'
import { useChat } from '@vercel/ai-utils'
export default function Chat() {
const { messages, input, handleInputChange, handleSubmit } = useChat()
return (
<div className="mx-auto w-full max-w-md py-24 flex flex-col stretch">
{messages.length > 0
? messages.map(m => (
<div key={m.id}>
{m.role === 'user' ? 'User: ' : 'AI: '}
{m.content}
</div>
))
: null}
<form onSubmit={handleSubmit}>
<input
className="fixed w-full max-w-md bottom-0 border border-gray-300 rounded mb-8 shadow-xl p-2"
value={input}
placeholder="Say something..."
onChange={handleInputChange}
/>
</form>
</div>
)
}
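The hook hides the work of reading the streamed body. As an illustration only (this is not how useChat is implemented), the sketch below shows one way to consume a text stream by hand with a reader and a TextDecoder. `readTextStream` and `demoBody` are hypothetical names; in a browser you would pass `response.body` from a fetch call to your route instead.

```typescript
// Hypothetical helper: read a byte stream as text, reporting progress after
// each chunk so a UI could re-render with the text received so far.
async function readTextStream(
  body: ReadableStream<Uint8Array>,
  onChunk: (textSoFar: string) => void
): Promise<string> {
  const reader = body.getReader()
  const decoder = new TextDecoder()
  let text = ''
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    // `stream: true` holds back incomplete multi-byte characters until the
    // rest of their bytes arrive in a later chunk.
    text += decoder.decode(value, { stream: true })
    onChunk(text)
  }
  text += decoder.decode() // flush anything still buffered
  return text
}

// Demo body: "Hi \u00e9" with the two bytes of the accented letter split
// across chunks, which is exactly the case naive decoding gets wrong.
function demoBody(): ReadableStream<Uint8Array> {
  return new ReadableStream<Uint8Array>({
    start(controller) {
      controller.enqueue(new Uint8Array([0x48, 0x69, 0x20, 0xc3]))
      controller.enqueue(new Uint8Array([0xa9]))
      controller.close()
    }
  })
}

readTextStream(demoBody(), soFar => console.log('so far:', soFar))
```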
A transform that extracts the text from all OpenAI chat and completion models and returns it as a ReadableStream.
// app/api/generate/route.ts
import { Configuration, OpenAIApi } from 'openai-edge';
import { OpenAIStream, StreamingTextResponse } from '@vercel/ai-utils';
const config = new Configuration({
apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(config);
export const runtime = 'edge';
export async function POST() {
const response = await openai.createChatCompletion({
model: 'gpt-4',
stream: true,
messages: [{ role: 'user', content: 'What is love?' }],
});
const stream = OpenAIStream(response, {
async onStart() {
console.log('streamin yo')
},
async onToken(token) {
console.log('token: ' + token)
},
async onCompletion(content) {
console.log('full text: ' + content)
// await prisma.messages.create({ content }) or something
}
});
return new StreamingTextResponse(stream);
}
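To show when each callback fires relative to the stream, here is a minimal sketch built on a standard TransformStream. `withCallbacksSketch` and `runCallbackDemo` are hypothetical names, and this is an assumption about how such callbacks could be wired, not the library's implementation.

```typescript
type StreamCallbacksSketch = {
  onStart?: () => Promise<void> | void
  onToken?: (token: string) => Promise<void> | void
  onCompletion?: (content: string) => Promise<void> | void
}

// Hypothetical: pass string tokens through unchanged while firing callbacks.
function withCallbacksSketch(
  callbacks: StreamCallbacksSketch
): TransformStream<string, string> {
  let fullText = ''
  return new TransformStream<string, string>({
    async start() {
      await callbacks.onStart?.() // once, before any token is processed
    },
    async transform(token, controller) {
      fullText += token
      await callbacks.onToken?.(token) // once per token
      controller.enqueue(token)
    },
    async flush() {
      await callbacks.onCompletion?.(fullText) // once, after the last token
    }
  })
}

// Feed two tokens through the transform and record the order of callbacks.
async function runCallbackDemo(): Promise<string[]> {
  const events: string[] = []
  const source = new ReadableStream<string>({
    start(controller) {
      for (const token of ['Hel', 'lo']) controller.enqueue(token)
      controller.close()
    }
  })
  const reader = source
    .pipeThrough(
      withCallbacksSketch({
        onStart: () => { events.push('start') },
        onToken: token => { events.push('token:' + token) },
        onCompletion: content => { events.push('done:' + content) }
      })
    )
    .getReader()
  while (!(await reader.read()).done) {
    // drain the stream; tokens are ignored here
  }
  return events
}

runCallbackDemo().then(events => console.log(events))
```

Because the completion callback runs inside the stream's flush step, saving the full text to a database happens in the same request that served the stream.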
A transform that extracts the text from most chat and completion HuggingFace models and returns it as a ReadableStream.
// app/api/generate/route.ts
import { HfInference } from '@huggingface/inference'
import { HuggingFaceStream, StreamingTextResponse } from '@vercel/ai-utils'
export const runtime = 'edge'
const Hf = new HfInference(process.env.HUGGINGFACE_API_KEY)
export async function POST() {
const response = await Hf.textGenerationStream({
model: 'OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5',
inputs: `<|prompter|>What's the Earth total population?<|endoftext|><|assistant|>`,
parameters: {
max_new_tokens: 200,
// @ts-ignore
typical_p: 0.2, // you'll need this for OpenAssistant
repetition_penalty: 1,
truncate: 1000,
return_full_text: false
}
})
const stream = HuggingFaceStream(response)
return new StreamingTextResponse(stream)
}
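For intuition about what such a transform has to do, the sketch below collects text from an async iterable of chunks. The chunk shape (`token.text` and `token.special`) is an assumption based on the text-generation streaming output of the HuggingFace Inference SDK, reduced to two fields; `collectHfText` and `fakeHfStream` are hypothetical and only simulate a stream locally.

```typescript
// Assumed subset of one streamed chunk; the real SDK type has more fields.
type HfChunkSketch = { token: { text: string; special: boolean } }

// Hypothetical helper: concatenate token text, skipping special tokens such
// as end-of-text markers.
async function collectHfText(
  chunks: AsyncIterable<HfChunkSketch>
): Promise<string> {
  let text = ''
  for await (const chunk of chunks) {
    if (chunk.token.special) continue
    text += chunk.token.text
  }
  return text
}

// Local stand-in for a streamed response, with made-up token values.
async function* fakeHfStream(): AsyncGenerator<HfChunkSketch> {
  yield { token: { text: 'About', special: false } }
  yield { token: { text: ' 8 billion', special: false } }
  yield { token: { text: '<|endoftext|>', special: true } }
}

collectHfText(fakeHfStream()).then(text => console.log(text))
```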
This is a tiny wrapper around the Response class that makes returning a ReadableStream of text a one-liner. Status is automatically set to 200, with 'Content-Type': 'text/plain; charset=utf-8' set as headers.
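// app/api/chat/route.ts
import { Configuration, OpenAIApi } from 'openai-edge'
import { OpenAIStream, StreamingTextResponse } from '@vercel/ai-utils'
// Create the edge-friendly OpenAI client used below
const config = new Configuration({
apiKey: process.env.OPENAI_API_KEY
})
const openai = new OpenAIApi(config)
export const runtime = 'edge'
export async function POST() {
const response = await openai.createChatCompletion({
model: 'gpt-4',
stream: true,
// `messages` must be an array of chat messages
messages: [{ role: 'user', content: 'What is love?' }]
})
const stream = OpenAIStream(response)
return new StreamingTextResponse(stream, {
'X-RATE-LIMIT': 'lol'
}) // => new Response(stream, { status: 200, headers: { 'Content-Type': 'text/plain; charset=utf-8', 'X-RATE-LIMIT': 'lol' }})
}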
An SWR-powered React hook for streaming text completion or chat messages and handling chat and prompt input state.
The useChat hook is designed to provide an intuitive interface for building ChatGPT-like UIs in React with streaming text responses. It leverages the SWR library for efficient data fetching and state synchronization.
The Message type represents a chat message within your application.
type Message = {
id: string
createdAt?: Date
content: string
role: 'system' | 'user' | 'assistant'
}
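The `id` and `createdAt` fields are useful on the client (for example as React keys), but a chat completion API typically only needs `role` and `content`. As a minimal sketch under that assumption, the hypothetical helper `toProviderMessages` below strips a message list down before it is sent to a provider; it is not part of this library.

```typescript
type Message = {
  id: string
  createdAt?: Date
  content: string
  role: 'system' | 'user' | 'assistant'
}

// Hypothetical helper: keep only the fields a provider is assumed to expect.
function toProviderMessages(
  messages: Message[]
): Array<Pick<Message, 'role' | 'content'>> {
  return messages.map(({ role, content }) => ({ role, content }))
}

const history: Message[] = [
  { id: '1', role: 'system', content: 'You are a helpful assistant.' },
  { id: '2', role: 'user', content: 'What is love?', createdAt: new Date() }
]

console.log(JSON.stringify(toProviderMessages(history)))
```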
The UseChatOptions type defines the configuration options for the useChat hook.
type UseChatOptions = {
api?: string
id?: string
initialMessages?: Message[]
initialInput?: string
}
The UseChatHelpers type is the return type of the useChat hook. It provides various utilities to interact with and manipulate the chat.
type UseChatHelpers = {
messages: Message[]
error: any
append: (message: Message) => void
reload: () => void
stop: () => void
set: (messages: Message[]) => void
input: string
setInput: React.Dispatch<React.SetStateAction<string>>
handleInputChange: (e: any) => void
handleSubmit: (e: React.FormEvent<HTMLFormElement>) => void
isLoading: boolean
}
Below is a basic example of the useChat hook in a component:
// app/chat.tsx
'use client'
import { useChat } from '@vercel/ai-utils'
export default function Chat() {
const { messages, input, stop, isLoading, handleInputChange, handleSubmit } =
useChat({
api: '/api/some-custom-endpoint',
initialMessages: [
{
id: 'abc124',
content: 'You are an AI assistant ...',
role: 'system'
}
]
})
return (
<div className="mx-auto w-full max-w-md py-24 flex flex-col stretch">
{messages.length > 0
? messages.map(m => (
<div key={m.id}>
{m.role === 'user' ? 'User: ' : 'AI: '}
{m.content}
</div>
))
: null}
<form onSubmit={handleSubmit}>
<input
className="fixed w-full max-w-md bottom-0 border border-gray-300 rounded mb-8 shadow-xl p-2"
value={input}
placeholder="Say something..."
onChange={handleInputChange}
/>
<button type="button" onClick={stop}>
Stop
</button>
<button disabled={isLoading} type="submit">
Send
</button>
</form>
</div>
)
}
In this example, the object returned by useChat is of type UseChatHelpers, which contains various utilities to interact with and control the chat. You can use these utilities to render chat messages, handle input changes, submit messages, and manage the chat state in your UI.