Tutorial Vercel AI SDK Node.js May 4, 2026 · 8 min read

Persistent Memory for Vercel AI SDK

The Vercel AI SDK makes it trivially easy to stream AI responses in Next.js — but every chat session starts with a blank slate. Users who come back tomorrow have to re-explain their context from scratch. This guide shows how to add Kronvex persistent memory alongside streamText and useChat so your users are always remembered.

In this article
  1. Vercel AI SDK and the memory gap
  2. Installing Kronvex Node.js SDK
  3. The Next.js API route pattern
  4. Recall memories before streamText
  5. Store memories after each response
  6. Full API route: chat with persistent memory
  7. Client-side: useChat with user identity
  8. Works with any model — GPT-4, Claude, Gemini
  9. Best practices

Vercel AI SDK and the memory gap

Vercel's AI SDK (package: ai) provides a unified interface for streaming completions from any LLM provider. The useChat React hook handles the frontend conversation state, and streamText handles the backend API call. Together they cover the entire chat loop beautifully.

What they don't cover is persistence. The useChat hook stores messages in React state — when the page reloads, all messages are gone. The backend streamText call receives whatever messages the client sends — it has no knowledge of previous sessions. Every conversation is ephemeral by design.

For a personal productivity chatbot, a customer support agent, or a coding assistant, this is a significant problem. Users invest time providing context — their tech stack, their preferences, their project constraints — and lose it all on every reload or next-day visit.

Kronvex solves this by acting as a semantic memory store. Before each streamText call, you recall the most relevant past memories and inject them into the system prompt. After the response streams, you store the new exchange as memories. The pattern is a clean before/after wrapper around your existing API route.
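The before/after wrapper described above can be sketched without any SDK at all. The MemoryStore and Generate types below are hypothetical stand-ins, not the Kronvex or Vercel AI SDK APIs; they only make the recall, generate, store order explicit.

```typescript
// Hypothetical interfaces for illustration only. They are NOT the Kronvex
// or Vercel AI SDK APIs; they just name the three steps of the pattern.
interface MemoryStore {
  recall(userId: string, query: string): Promise<string>
  store(userId: string, content: string): Promise<void>
}

type Generate = (systemPrompt: string, userMessage: string) => Promise<string>

async function chatWithMemory(
  memory: MemoryStore,
  generate: Generate,
  userId: string,
  userMessage: string,
): Promise<string> {
  // Before: recall relevant memories and fold them into the system prompt.
  const context = await memory.recall(userId, userMessage)
  const systemPrompt = context
    ? `You are a helpful assistant.\n\n${context}`
    : 'You are a helpful assistant.'

  // The unchanged core: call the model.
  const reply = await generate(systemPrompt, userMessage)

  // After: store both sides of the exchange for future sessions.
  await memory.store(userId, `User: ${userMessage}`)
  await memory.store(userId, `Assistant: ${reply}`)
  return reply
}
```

In the real route, recall corresponds to kv.injectContext, generate to streamText, and store to kv.remember inside onFinish.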

Model-agnostic. Kronvex works with any model the Vercel AI SDK supports — OpenAI GPT-4o, Anthropic Claude, Google Gemini, Mistral, Cohere. The memory layer is independent of the LLM provider. Switch models without changing your memory code.

Installing Kronvex Node.js SDK

Install packages
npm install @kronvex/sdk ai @ai-sdk/openai
# or: yarn add @kronvex/sdk ai @ai-sdk/openai
# or: pnpm add @kronvex/sdk ai @ai-sdk/openai
Environment variables (.env.local)
OPENAI_API_KEY=sk-...
KV_API_KEY=kv-...
KV_AGENT_ID=your-agent-uuid
Initialize the Kronvex client (lib/memory.ts)
import Kronvex from '@kronvex/sdk'

export const kv = new Kronvex({
  apiKey: process.env.KV_API_KEY!,
  agentId: process.env.KV_AGENT_ID!,
})
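The non-null assertions (!) above silence the type checker but will not catch a missing variable at runtime. A minimal sketch of a fail-fast check follows; requireEnv is a hypothetical helper, not part of the Kronvex SDK, and it takes the environment as a parameter so it can be tested in isolation.

```typescript
// Hypothetical helper, not part of any SDK.
type Env = Record<string, string | undefined>

function requireEnv(env: Env, names: readonly string[]): Record<string, string> {
  // Collect every missing name so the error reports all of them at once.
  const missing = names.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  const values: Record<string, string> = {}
  for (const name of names) values[name] = env[name] as string
  return values
}
```

In lib/memory.ts you could call requireEnv(process.env, ['KV_API_KEY', 'KV_AGENT_ID']) once at module load and pass the returned values to the client constructor.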

The Next.js API route pattern

The standard Vercel AI SDK API route uses streamText and returns a streaming response. The Kronvex integration wraps this with two additional steps: recall before, store after.

In Next.js App Router, your chat API route lives at app/api/chat/route.ts. The POST handler receives the messages array from the client and calls the LLM. Here is the base pattern before adding memory:

Base pattern (no memory)
// app/api/chat/route.ts — base, no memory
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'

export async function POST(req: Request) {
  const { messages } = await req.json()
  const result = await streamText({
    model: openai('gpt-4o'),
    messages,
  })
  return result.toDataStreamResponse()
}

Recall memories before streamText

Before the streamText call, use Kronvex's injectContext method. It takes the latest user message and returns a pre-formatted memory block ready to inject into the system prompt.

Recall: inject context into system prompt
import { kv } from '@/lib/memory'

// Get the latest user message
const lastUserMsg = messages.filter(m => m.role === 'user').at(-1)?.content ?? ''

// Recall relevant past memories for this user
const memoryContext = await kv.injectContext({
  message: lastUserMsg,
  sessionId: userId,  // from auth session
})

// Build enriched system prompt
const systemPrompt = memoryContext
  ? `You are a helpful assistant.\n\n${memoryContext}\n\nUse the above memories to personalise your response.`
  : 'You are a helpful assistant.'
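The snippet above assumes message content is always a plain string. If your client sends content as an array of parts, a small helper keeps the recall query well-formed. This is a sketch under that assumption: the ChatMessage shape is defined locally for illustration rather than imported from the ai package, and both function names are hypothetical.

```typescript
// Local message shape, assumed for this sketch (not imported from `ai`).
interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string | Array<{ type: string; text?: string }>
}

// Return the text of the most recent user message, or '' if there is none.
function lastUserText(messages: ChatMessage[]): string {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i]
    if (m.role !== 'user') continue
    if (typeof m.content === 'string') return m.content
    // Keep only text parts; ignore images, files, and so on.
    return m.content
      .filter((p) => p.type === 'text' && typeof p.text === 'string')
      .map((p) => p.text as string)
      .join('\n')
  }
  return ''
}

// Append the memory block only when there is something to append.
function buildSystemPrompt(base: string, memoryContext?: string | null): string {
  const ctx = memoryContext ? memoryContext.trim() : ''
  return ctx
    ? `${base}\n\n${ctx}\n\nUse the above memories to personalise your response.`
    : base
}
```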

Store memories after each response

Vercel AI SDK's streamText accepts an onFinish callback that fires when the response stream completes. This is the ideal place to store the new exchange without blocking the stream.

Store memories in onFinish callback
const result = await streamText({
  model: openai('gpt-4o'),
  system: systemPrompt,
  messages,
  onFinish: async ({ text }) => {
    // Store the user message
    await kv.remember({
      content: `User: ${lastUserMsg}`,
      memoryType: 'episodic',
      sessionId: userId,
    })
    // Store the assistant response
    await kv.remember({
      content: `Assistant: ${text.slice(0, 500)}`,
      memoryType: 'episodic',
      sessionId: userId,
      ttlDays: 90,
    })
  },
})
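Building the payloads in a separate pure function makes the onFinish callback easier to test. The sketch below reuses the field names from the snippet above (content, memoryType, sessionId, ttlDays). The helper names are hypothetical, and unlike the snippet above it truncates at a word boundary, also truncates the user message, and skips empty messages.

```typescript
interface MemoryEntry {
  content: string
  memoryType: 'episodic'
  sessionId: string
  ttlDays?: number
}

// Cut at the last space before maxChars and append '...'.
// The result can be up to three characters longer than maxChars.
function truncate(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text
  const cut = text.slice(0, maxChars)
  const lastSpace = cut.lastIndexOf(' ')
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + '...'
}

// Turn one user/assistant exchange into the entries to store.
function exchangeToEntries(
  userId: string,
  userMsg: string,
  assistantText: string,
  maxChars = 500,
): MemoryEntry[] {
  const entries: MemoryEntry[] = []
  if (userMsg.trim() !== '') {
    entries.push({
      content: `User: ${truncate(userMsg, maxChars)}`,
      memoryType: 'episodic',
      sessionId: userId,
    })
  }
  if (assistantText.trim() !== '') {
    entries.push({
      content: `Assistant: ${truncate(assistantText, maxChars)}`,
      memoryType: 'episodic',
      sessionId: userId,
      ttlDays: 90,
    })
  }
  return entries
}
```

Inside onFinish you would then pass each entry to kv.remember.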

Full API route: chat with persistent memory

Here is the complete, production-ready Next.js App Router API route with Kronvex persistent memory integrated. It uses NextAuth for user identity, OpenAI GPT-4o as the model, and Kronvex for memory recall and storage.

app/api/chat/route.ts — full implementation
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { getServerSession } from 'next-auth'
import { kv } from '@/lib/memory'
import { authOptions } from '@/lib/auth'

// Optional: add `export const runtime = 'edge'` for lower latency, but only after confirming
// that your auth library and the Kronvex SDK both run on the Edge runtime.
export const maxDuration = 60

export async function POST(req: Request) {
  // 1. Auth — get stable user identity
  const session = await getServerSession(authOptions)
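  // Reject unauthenticated requests so anonymous users never share one memory namespace.
  // Assumes your NextAuth session callback exposes user.id.
  const userId = (session?.user as { id?: string } | undefined)?.id
  if (!userId) return new Response('Unauthorized', { status: 401 })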

  const { messages } = await req.json()
  const lastUserMsg = (messages.filter((m: any) => m.role === 'user').at(-1)?.content ?? '') as string

  // 2. Recall relevant memories
  let systemPrompt = 'You are a helpful assistant with memory of past conversations.'
  try {
    const memCtx = await kv.injectContext({
      message: lastUserMsg,
      sessionId: userId,
    })
    if (memCtx) {
      systemPrompt += `\n\n${memCtx}\n\nUse these memories to personalise your response.`
    }
  } catch {
    // Memory recall failure should never break the chat
  }

  // 3. Stream the response
  const result = await streamText({
    model: openai('gpt-4o'),
    system: systemPrompt,
    messages,
    onFinish: async ({ text }) => {
      try {
        // 4. Store the exchange as memories (non-blocking)
        await Promise.all([
          kv.remember({
            content: `User: ${lastUserMsg}`,
            memoryType: 'episodic',
            sessionId: userId,
          }),
          kv.remember({
            content: `Assistant: ${text.slice(0, 600)}`,
            memoryType: 'episodic',
            sessionId: userId,
            ttlDays: 90,
          }),
        ])
      } catch {
        // Memory store failure is silent — chat continues
      }
    },
  })

  return result.toDataStreamResponse()
}

Client-side: useChat with user identity

The frontend code using useChat requires no changes — the memory layer is entirely handled in the API route. However, you should ensure the user is authenticated before enabling the chat, so the API route has a stable userId to use as sessionId.

components/Chat.tsx
'use client'

import { useChat } from 'ai/react'

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    api: '/api/chat',
    // initialMessages: [] — start fresh each session
    // Memory is injected server-side via system prompt
  })

  return (
    <div>
      {messages.map(m => (
        <div key={m.id} className={m.role}>
          {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button disabled={isLoading}>Send</button>
      </form>
    </div>
  )
}

Works with any model — GPT-4, Claude, Gemini

The Vercel AI SDK abstracts over multiple providers. You can swap the model without changing any Kronvex code — the memory layer is completely provider-agnostic.

Switch provider without changing memory code
// GPT-4o (OpenAI)
import { openai } from '@ai-sdk/openai'
model: openai('gpt-4o')

// Claude Sonnet (Anthropic)
import { anthropic } from '@ai-sdk/anthropic'
model: anthropic('claude-sonnet-4-6')

// Gemini 2.0 Flash (Google)
import { google } from '@ai-sdk/google'
model: google('gemini-2.0-flash')

// Kronvex code stays the same regardless of model
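One way to make the provider switch a configuration change rather than a code change is to read a model spec from an environment variable. The sketch below is an assumption, not a Vercel AI SDK feature: the "provider:model" string format, the ModelSpec type, and parseModelSpec are all invented for illustration.

```typescript
// Hypothetical helper: the "provider:model" format is an assumption of this sketch.
type Provider = 'openai' | 'anthropic' | 'google'

interface ModelSpec {
  provider: Provider
  modelId: string
}

const PROVIDERS: readonly string[] = ['openai', 'anthropic', 'google']

// Parse e.g. "openai:gpt-4o"; return the fallback for anything malformed or unknown.
function parseModelSpec(spec: string, fallback: ModelSpec): ModelSpec {
  const sep = spec.indexOf(':')
  if (sep <= 0) return fallback
  const provider = spec.slice(0, sep)
  const modelId = spec.slice(sep + 1).trim()
  if (PROVIDERS.indexOf(provider) === -1 || modelId === '') return fallback
  return { provider: provider as Provider, modelId }
}
```

The route can then switch on spec.provider to call openai(spec.modelId), anthropic(spec.modelId), or google(spec.modelId), while the memory code stays untouched.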

Best practices

Never block the stream for memory operations

Wrap all Kronvex calls in try/catch. Memory recall failure should degrade gracefully: the chat continues without memory context. Recall is awaited before streamText, so it does add latency to the first token; keep it fast or give it a timeout. Memory storage happens in onFinish, which fires after the response has finished streaming, so it adds no latency for the user.
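Because recall runs before the first token is streamed, a slow memory lookup delays the whole response. A minimal sketch of a timeout wrapper that falls back to an empty context is shown here; withTimeout is a hypothetical helper, and the time budget you pass is something to tune for your own app.

```typescript
// Resolve with `fallback` if `work` rejects or takes longer than `ms`.
// Never rejects, so the caller does not need its own try/catch.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve) => {
    const timer = setTimeout(() => resolve(fallback), ms)
    work.then(
      (value) => {
        clearTimeout(timer)
        resolve(value)
      },
      () => {
        clearTimeout(timer)
        resolve(fallback)
      },
    )
  })
}
```

In the route this would wrap the recall step, for example withTimeout(kv.injectContext({ message: lastUserMsg, sessionId: userId }), 300, ''), where the 300 ms figure is only an illustration.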

Session ID from auth, not client

Never trust a userId sent from the client. Always derive it from the server-side auth session (NextAuth, Clerk, Auth.js). An attacker could otherwise read or write to another user's memories by sending a different user ID in the request body.
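As a sketch of that rule, the two hypothetical helpers below read identity only from the server-side session and read only the messages field from the request body, so a userId smuggled into the body is simply ignored. The SessionLike shape is an assumption; adjust it to whatever your auth library returns.

```typescript
// Assumed minimal session shape; your auth library's type may differ.
interface SessionLike {
  user?: { id?: string | null } | null
}

// Identity comes from the server-side session and nowhere else.
function userIdFromSession(session: SessionLike | null | undefined): string | null {
  const id = session?.user?.id
  return typeof id === 'string' && id.length > 0 ? id : null
}

// Take only `messages` from the body; every other field is dropped.
function pickMessages(body: unknown): unknown[] {
  if (typeof body !== 'object' || body === null) return []
  const messages = (body as { messages?: unknown }).messages
  return Array.isArray(messages) ? messages : []
}
```

A route using these would respond with 401 when userIdFromSession returns null.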

Limit message history in the client

Because Kronvex handles long-term memory, you don't need to send the full conversation history to the LLM. Keep the messages array to the last 6–10 turns maximum. This reduces token usage significantly while Kronvex handles the semantic retrieval of older context.
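Trimming can also be done in the API route just before the model call, which keeps the client untouched. The helper below is a sketch, not part of either SDK: trimHistory and its default window of 8 messages are assumptions, and it drops any leading non-user messages so the window starts on a user turn.

```typescript
// Keep only the most recent messages, starting on a user turn.
function trimHistory<T extends { role: string }>(messages: T[], maxMessages = 8): T[] {
  if (maxMessages <= 0) return []
  const recent = messages.slice(-maxMessages)
  // Find the first user message inside the window.
  let firstUser = -1
  for (let i = 0; i < recent.length; i++) {
    if (recent[i].role === 'user') {
      firstUser = i
      break
    }
  }
  // Drop leading assistant messages whose matching user turn was cut off.
  return firstUser > 0 ? recent.slice(firstUser) : recent
}
```

You would pass trimHistory(messages) to streamText in place of the full array, while the full last user message still drives the Kronvex recall.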

Add persistent memory to your Next.js AI app

Free plan — 1 agent, 100 memories. No credit card. Works with any Vercel AI SDK model or framework.
