Persistent Memory for Vercel AI SDK
The Vercel AI SDK makes it trivially easy to stream AI responses in Next.js — but every chat session starts with a blank slate. Users who come back tomorrow have to re-explain their context from scratch. This guide shows how to add Kronvex persistent memory alongside streamText and useChat so your users are always remembered.
- Vercel AI SDK and the memory gap
- Installing Kronvex Node.js SDK
- The Next.js API route pattern
- Recall memories before streamText
- Store memories after each response
- Full API route: chat with persistent memory
- Client-side: useChat with user identity
- Works with any model — GPT-4, Claude, Gemini
- Best practices
Vercel AI SDK and the memory gap
Vercel's AI SDK (package: ai) provides a unified interface for streaming completions from any LLM provider. The useChat React hook handles the frontend conversation state, and streamText handles the backend API call. Together they cover the entire chat loop beautifully.
What they don't cover is persistence. The useChat hook stores messages in React state — when the page reloads, all messages are gone. The backend streamText call receives whatever messages the client sends — it has no knowledge of previous sessions. Every conversation is ephemeral by design.
For a personal productivity chatbot, a customer support agent, or a coding assistant, this is a significant problem. Users invest time providing context — their tech stack, their preferences, their project constraints — and lose it all on every reload or next-day visit.
Kronvex solves this by acting as a semantic memory store. Before each streamText call, you recall the most relevant past memories and inject them into the system prompt. After the response streams, you store the new exchange as memories. The pattern is a clean before/after wrapper around your existing API route.
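The before/after wrapper described above can be sketched independently of any SDK. In this minimal sketch, `MemoryClient` is a hypothetical structural type that only mirrors the two Kronvex calls used later in this guide, and `generate` is a stand-in for the real `streamText` call; neither is part of a published API.

```typescript
// Hypothetical shape mirroring the two memory calls used in this guide.
interface MemoryClient {
  injectContext(args: { message: string; sessionId: string }): Promise<string>
  remember(args: { content: string; memoryType: string; sessionId: string }): Promise<unknown>
}

// Stand-in for the model call (streamText in the real route).
type Generate = (system: string, userMessage: string) => Promise<string>

async function replyWithMemory(
  memory: MemoryClient,
  generate: Generate,
  sessionId: string,
  userMessage: string,
  basePrompt = 'You are a helpful assistant.',
): Promise<string> {
  // Before: recall relevant memories and fold them into the system prompt.
  const context = await memory.injectContext({ message: userMessage, sessionId })
  const system = context ? `${basePrompt}\n\n${context}` : basePrompt

  // The model call itself is unchanged.
  const reply = await generate(system, userMessage)

  // After: persist both sides of the exchange.
  await memory.remember({ content: `User: ${userMessage}`, memoryType: 'episodic', sessionId })
  await memory.remember({ content: `Assistant: ${reply}`, memoryType: 'episodic', sessionId })
  return reply
}
```

Keeping the memory client behind a small interface like this also makes the route easy to test with an in-memory fake.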
Installing Kronvex Node.js SDK
npm install @kronvex/sdk ai @ai-sdk/openai
# or: yarn add @kronvex/sdk ai @ai-sdk/openai
# or: pnpm add @kronvex/sdk ai @ai-sdk/openai

# .env.local
OPENAI_API_KEY=sk-...
KV_API_KEY=kv-...
KV_AGENT_ID=your-agent-uuid

// lib/memory.ts
import Kronvex from '@kronvex/sdk'

export const kv = new Kronvex({
  apiKey: process.env.KV_API_KEY!,
  agentId: process.env.KV_AGENT_ID!,
})
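The non-null assertions (`!`) above hide a missing environment variable until the first request fails. A minimal sketch of validating the two Kronvex variables up front follows; `readMemoryConfig` is a hypothetical helper, not part of the Kronvex SDK, and it takes the environment as a parameter so it can be exercised without touching `process.env`.

```typescript
interface MemoryConfig {
  apiKey: string
  agentId: string
}

// Hypothetical helper: fails fast with a readable message when a variable is missing.
function readMemoryConfig(env: Record<string, string | undefined>): MemoryConfig {
  const apiKey = env.KV_API_KEY
  const agentId = env.KV_AGENT_ID

  const missing: string[] = []
  if (!apiKey) missing.push('KV_API_KEY')
  if (!agentId) missing.push('KV_AGENT_ID')

  if (!apiKey || !agentId) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  return { apiKey, agentId }
}
```

In lib/memory.ts this would be used as `new Kronvex(readMemoryConfig(process.env))`, assuming the constructor accepts the same two fields shown above.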
The Next.js API route pattern
The standard Vercel AI SDK API route uses streamText and returns a streaming response. The Kronvex integration wraps this with two additional steps: recall before, store after.
In Next.js App Router, your chat API route lives at app/api/chat/route.ts. The POST handler receives the messages array from the client and calls the LLM. Here is the base pattern before adding memory:
// app/api/chat/route.ts (base, no memory)
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'

export async function POST(req: Request) {
  const { messages } = await req.json()

  const result = await streamText({
    model: openai('gpt-4o'),
    messages,
  })

  return result.toDataStreamResponse()
}
Recall memories before streamText
Before the streamText call, use Kronvex's injectContext method. It takes the latest user message and returns a pre-formatted memory block ready to inject into the system prompt.
import { kv } from '@/lib/memory'

// Get the latest user message
const lastUserMsg = messages.filter(m => m.role === 'user').at(-1)?.content ?? ''

// Recall relevant past memories for this user
const memoryContext = await kv.injectContext({
  message: lastUserMsg,
  sessionId: userId, // from auth session
})

// Build enriched system prompt
const systemPrompt = memoryContext
  ? `You are a helpful assistant.\n\n${memoryContext}\n\nUse the above memories to personalise your response.`
  : 'You are a helpful assistant.'
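The snippet above treats `content` as a plain string. Message content can also arrive as an array of parts (for example text plus an image), so a slightly more defensive version of the two steps might look like the sketch below. `ChatMessage`, `lastUserText`, and `buildSystemPrompt` are illustrative names, not SDK exports.

```typescript
// Simplified message shape: content is either a string or a list of typed parts.
interface ChatMessage {
  role: string
  content: string | Array<{ type: string; text?: string }>
}

// Returns the text of the most recent user message, or '' if there is none.
function lastUserText(messages: ChatMessage[]): string {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i]
    if (m.role !== 'user') continue
    if (typeof m.content === 'string') return m.content
    // Keep only the text parts and join them.
    return m.content
      .filter(p => p.type === 'text' && typeof p.text === 'string')
      .map(p => p.text ?? '')
      .join('\n')
  }
  return ''
}

// Appends the recalled memory block only when there is something to append.
function buildSystemPrompt(base: string, memoryContext?: string | null): string {
  const ctx = memoryContext ? memoryContext.trim() : ''
  return ctx
    ? `${base}\n\n${ctx}\n\nUse the above memories to personalise your response.`
    : base
}
```

Passing the result of `lastUserText(messages)` to the recall call keeps the query a plain string even when the client sends multi-part messages.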
Store memories after each response
Vercel AI SDK's streamText accepts an onFinish callback that fires when the response stream completes. This is the ideal place to store the new exchange without blocking the stream.
const result = await streamText({
  model: openai('gpt-4o'),
  system: systemPrompt,
  messages,
  onFinish: async ({ text }) => {
    // Store the user message
    await kv.remember({
      content: `User: ${lastUserMsg}`,
      memoryType: 'episodic',
      sessionId: userId,
    })

    // Store the assistant response
    await kv.remember({
      content: `Assistant: ${text.slice(0, 500)}`,
      memoryType: 'episodic',
      sessionId: userId,
      ttlDays: 90,
    })
  },
})
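The hard `slice(0, 500)` above can cut the stored reply mid-word. One possible refinement, sketched here with hypothetical helper names (`truncate`, `exchangeToRecords`), builds the two memory records in a pure function and trims at the last space before the limit. The record fields mirror the `remember` calls shown above.

```typescript
interface MemoryRecord {
  content: string
  memoryType: 'episodic'
  sessionId: string
  ttlDays?: number
}

// Cuts at the last space at or before `max` so the stored text does not end mid-word.
function truncate(text: string, max: number): string {
  if (text.length <= max) return text
  const cut = text.lastIndexOf(' ', max)
  return (cut > 0 ? text.slice(0, cut) : text.slice(0, max)) + '…'
}

// Builds the pair of records for one user/assistant exchange.
function exchangeToRecords(
  sessionId: string,
  userMessage: string,
  assistantText: string,
  maxAssistantChars = 500,
): MemoryRecord[] {
  return [
    { content: `User: ${userMessage}`, memoryType: 'episodic', sessionId },
    {
      content: `Assistant: ${truncate(assistantText, maxAssistantChars)}`,
      memoryType: 'episodic',
      sessionId,
      ttlDays: 90,
    },
  ]
}
```

Inside `onFinish`, the records could then be stored with `await Promise.all(exchangeToRecords(userId, lastUserMsg, text).map(r => kv.remember(r)))`.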
Full API route: chat with persistent memory
Here is the complete, production-ready Next.js App Router API route with Kronvex persistent memory integrated. It uses NextAuth for user identity, OpenAI GPT-4o as the model, and Kronvex for memory recall and storage.
// app/api/chat/route.ts
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { getServerSession } from 'next-auth'
import { kv } from '@/lib/memory'
import { authOptions } from '@/lib/auth'

// Optional: `export const runtime = 'edge'` can lower latency, but only enable it
// if your auth library also supports the Edge runtime.
export const maxDuration = 60

export async function POST(req: Request) {
  // 1. Auth: get a stable user identity
  //    (assumes your session callback exposes user.id)
  const session = await getServerSession(authOptions)
  const userId = session?.user?.id
  if (!userId) {
    // A shared fallback ID would mix every anonymous visitor's memories together
    return new Response('Unauthorized', { status: 401 })
  }

  const { messages } = await req.json()
  const lastUserMsg = (messages.filter((m: any) => m.role === 'user').at(-1)?.content ?? '') as string

  // 2. Recall relevant memories
  let systemPrompt = 'You are a helpful assistant with memory of past conversations.'
  try {
    const memCtx = await kv.injectContext({
      message: lastUserMsg,
      sessionId: userId,
    })
    if (memCtx) {
      systemPrompt += `\n\n${memCtx}\n\nUse these memories to personalise your response.`
    }
  } catch {
    // Memory recall failure should never break the chat
  }

  // 3. Stream the response
  const result = await streamText({
    model: openai('gpt-4o'),
    system: systemPrompt,
    messages,
    onFinish: async ({ text }) => {
      try {
        // 4. Store the exchange as memories (runs after the stream completes)
        await Promise.all([
          kv.remember({
            content: `User: ${lastUserMsg}`,
            memoryType: 'episodic',
            sessionId: userId,
          }),
          kv.remember({
            content: `Assistant: ${text.slice(0, 600)}`,
            memoryType: 'episodic',
            sessionId: userId,
            ttlDays: 90,
          }),
        ])
      } catch {
        // Memory store failure is silent; the chat continues
      }
    },
  })

  return result.toDataStreamResponse()
}
Client-side: useChat with user identity
The frontend code using useChat requires no changes — the memory layer is entirely handled in the API route. However, you should ensure the user is authenticated before enabling the chat, so the API route has a stable userId to use as sessionId.
'use client'

import { useChat } from 'ai/react'

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    api: '/api/chat',
    // initialMessages: [] (start fresh each session)
    // Memory is injected server-side via system prompt
  })

  return (
    <div>
      {messages.map(m => (
        <div key={m.id} className={m.role}>
          {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button disabled={isLoading}>Send</button>
      </form>
    </div>
  )
}
Works with any model — GPT-4, Claude, Gemini
The Vercel AI SDK abstracts over multiple providers. You can swap the model without changing any Kronvex code — the memory layer is completely provider-agnostic.
// GPT-4o (OpenAI)
import { openai } from '@ai-sdk/openai'
model: openai('gpt-4o')

// Claude Sonnet (Anthropic)
import { anthropic } from '@ai-sdk/anthropic'
model: anthropic('claude-sonnet-4-6')

// Gemini 2.0 Flash (Google)
import { google } from '@ai-sdk/google'
model: google('gemini-2.0-flash')

// Kronvex code stays the same regardless of model
Best practices
Never block the stream for memory operations
Wrap all Kronvex calls in try/catch. Memory recall failure should degrade gracefully: the chat continues without memory context. Recall runs before streamText, so it is the one memory call that sits in front of the first token and should be kept fast. Memory storage happens in onFinish, which fires once the response has finished streaming, so it never delays what the user sees.
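Because recall is awaited before the stream starts, a slow memory lookup delays the first token. One way to bound that delay is to race the recall against a timer and fall back to an empty context. `withTimeout` below is a hypothetical helper written for this guide, not an SDK function.

```typescript
// Resolves with the work's value, or with `fallback` if the work
// rejects or takes longer than `ms` milliseconds. Never rejects.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>(resolve => {
    const timer = setTimeout(() => resolve(fallback), ms)
    work.then(
      value => {
        clearTimeout(timer)
        resolve(value)
      },
      () => {
        clearTimeout(timer)
        resolve(fallback)
      },
    )
  })
}
```

In the route this could wrap the recall call, for example `const memCtx = await withTimeout(kv.injectContext({ message: lastUserMsg, sessionId: userId }), 300, '')`. The 300 ms budget is an arbitrary illustration; pick a value that suits your latency target.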
Session ID from auth, not client
Never trust a userId sent from the client. Always derive it from the server-side auth session (NextAuth, Clerk, Auth.js). An attacker could otherwise read or write to another user's memories by sending a different user ID in the request body.
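As a minimal sketch of that rule, the session ID can be resolved by a small function that only ever looks at the server-side session object and returns `null` when there is no authenticated user. `AuthSession` and `resolveSessionId` are illustrative names; the session shape is an assumption and depends on your auth library's configuration.

```typescript
// Assumed minimal shape of a server-side auth session.
interface AuthSession {
  user?: { id?: string | null } | null
}

// Returns the authenticated user's ID, or null. The request body is
// deliberately not a parameter, so a client-supplied ID cannot leak in.
function resolveSessionId(session: AuthSession | null | undefined): string | null {
  const id = session && session.user ? session.user.id : null
  return typeof id === 'string' && id.length > 0 ? id : null
}
```

A route handler would call this once and respond with a 401 status when it returns `null`, rather than falling back to a shared placeholder ID.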
Limit message history in the client
Because Kronvex handles long-term memory, you don't need to send the full conversation history to the LLM. Keep the messages array to the last 6–10 turns maximum. This reduces token usage significantly while Kronvex handles the semantic retrieval of older context.
- Trim history before calling the model: keep only the most recent messages, in the client or, more reliably, in the API route (for example messages.slice(-10)), so older turns are never sent to the LLM
- Store semantic facts separately: when users mention something important ("I'm the CTO", "we use PostgreSQL"), store it as a semantic memory with no TTL so it persists indefinitely
- Edge runtime compatible: the Kronvex Node.js SDK works in both Node.js and Edge runtime environments on Vercel
- EU data residency: Kronvex is hosted in Frankfurt. If your Vercel deployment is also in EU regions, all data stays in Europe
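The history limit recommended above can be enforced on the server regardless of what the client sends. A minimal sketch, assuming the system prompt is passed separately (as in the routes above) and that `messages` contains only user and assistant turns; `trimHistory` is a hypothetical helper.

```typescript
// Keeps at most `maxMessages` of the most recent messages and makes sure
// the window starts on a user turn rather than a dangling assistant reply.
function trimHistory<T extends { role: string }>(messages: T[], maxMessages = 10): T[] {
  if (maxMessages <= 0) return []
  const recent = messages.slice(-maxMessages)

  // Skip any leading non-user messages left over from the cut.
  let start = 0
  while (start < recent.length && recent[start].role !== 'user') start++
  return recent.slice(start)
}
```

In the API route, `messages: trimHistory(messages)` would replace `messages` in the streamText call; older context is then expected to come back through memory recall instead of the raw transcript.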