Tutorial Python OpenAI March 22, 2026 · 10 min read

OpenAI Agents SDK Memory:
How to Add Persistent State

The OpenAI Agents SDK is designed to be stateless — clean, composable, and easy to reason about. But stateless means your agent forgets everything the moment a session ends. Here's how to wire in persistent memory without fighting the framework.

In this article

Why the OpenAI Agents SDK is stateless by design
What you lose without persistence
Integration architecture
Full code example
Before / after comparison
Production considerations
Quick start

Why the OpenAI Agents SDK is stateless by design

The OpenAI Agents SDK (released in early 2025) is built around a straightforward execution model: each Runner.run() call starts from scratch. You pass in the input (a string or a list of messages), the agent processes it, and you get a result back. By default, nothing carries over from one call to the next.

This is a deliberate design choice. Stateless agents are easier to test, easier to scale horizontally, and easier to reason about in a distributed environment. When OpenAI introduced the Agents SDK alongside the Responses API, they separated the concern of what the agent knows right now (the input messages) from what it has learned over time (which they left to the developer).

Note on Responses API vs Chat Completions: The Responses API can chain the turns of a single conversation through previous_response_id, but the chain only links one thread. You have to keep track of the ID yourself, and it carries nothing across separate conversations or users. It is threading, not memory.

What you lose without persistence

Without a memory layer, every session is a blank slate. The practical consequences compound quickly:

Context window bloat: to approximate memory, developers stuff previous conversations into the system prompt. Those tokens are billed again on every request, so the cost grows with each session you replay.
Poor user experience: the agent asks the same onboarding questions on every session, forgets user preferences, and cannot reference past decisions.
No learning curve: the agent cannot get better at serving a specific user over time, so it feels generic no matter how many interactions have occurred.
Broken workflows: multi-step processes that span days (e.g., a procurement agent managing an RFP) fall apart when state is lost between sessions.
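
To make the context-bloat cost concrete, here is a rough back-of-the-envelope calculation. The price, the tokens per session, and the tokens per snippet are illustrative assumptions, not figures from any price list; substitute your model's current input rate and your own transcript sizes.

```python
def prompt_cost_usd(prompt_tokens: int, usd_per_million_tokens: float) -> float:
    """Input cost of a single request, ignoring output tokens."""
    return prompt_tokens * usd_per_million_tokens / 1_000_000


# Assumed figures, for illustration only.
USD_PER_MILLION = 2.50      # hypothetical input price per million tokens
TOKENS_PER_SESSION = 2_000  # hypothetical average transcript size
TOKENS_PER_SNIPPET = 150    # hypothetical size of one recalled memory

# Replaying 100 past sessions in the prompt vs. injecting 6 recalled snippets.
replay_tokens = 100 * TOKENS_PER_SESSION
recall_tokens = 6 * TOKENS_PER_SNIPPET

replay_cost = prompt_cost_usd(replay_tokens, USD_PER_MILLION)
recall_cost = prompt_cost_usd(recall_tokens, USD_PER_MILLION)

print(f"replay: {replay_tokens} tokens, ${replay_cost:.4f} per request")
print(f"recall: {recall_tokens} tokens, ${recall_cost:.4f} per request")
```

Whatever the actual price, the ratio is what matters: the replay cost scales with the number of past sessions, while the recall cost stays flat.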

Integration architecture

The pattern is straightforward: read from a memory store before the agent runs and write to it afterwards. This article uses Kronvex, a hosted memory API backed by pgvector, as the store. The SDK's hook system keeps the wiring clean.

Architecture (text diagram)

# ┌─────────────────────────────────────────────────────────────────┐
# │                      User request                               │
# └────────────────────────────┬────────────────────────────────────┘
#                              │
#                    ┌─────────▼──────────┐
#                    │  recall() — fetch  │  ← Kronvex (pgvector)
#                    │  top-k memories    │
#                    └─────────┬──────────┘
#                              │ inject as system context
#                    ┌─────────▼──────────┐
#                    │  OpenAI Agents SDK │
#                    │  Runner.run()      │
#                    └─────────┬──────────┘
#                              │
#                    ┌─────────▼──────────┐
#                    │  remember() — save │  ← Kronvex (pgvector)
#                    │  exchange + facts  │
#                    └─────────┬──────────┘
#                              │
#                    ┌─────────▼──────────┐
#                    │  Response to user  │
#                    └────────────────────┘
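
The flow in the diagram can be sketched without any external service. The version below is a minimal illustration, not Kronvex's implementation: InMemoryStore is a hypothetical stand-in that ranks memories by shared words rather than vector similarity, and fake_agent stands in for Runner.run().

```python
import re
from dataclasses import dataclass, field


def _words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a memory service; ranks by shared words."""
    items: list[str] = field(default_factory=list)

    def remember(self, content: str) -> None:
        self.items.append(content)

    def recall(self, query: str, top_k: int = 6) -> list[str]:
        q = _words(query)
        scored = [(len(q & _words(m)), m) for m in self.items]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [m for _, m in ranked[:top_k]]


def fake_agent(instructions: str, user_message: str) -> str:
    """Stand-in for the model call."""
    return "Noted."


def handle_request(store: InMemoryStore, base_instructions: str, user_message: str):
    # 1. recall: fetch the most relevant memories for this message
    memories = store.recall(user_message)
    # 2. inject: prepend them to the system instructions
    instructions = base_instructions
    if memories:
        block = "\n".join(memories)
        instructions = f"[MEMORY]\n{block}\n[/MEMORY]\n\n{base_instructions}"
    # 3. run the agent
    reply = fake_agent(instructions, user_message)
    # 4. remember: persist both sides of the exchange
    store.remember(user_message)
    store.remember(reply)
    return instructions, reply


store = InMemoryStore()
handle_request(store, "You are a support agent.", "I'm on the Pro plan, invoice #4821.")
prompt, _ = handle_request(store, "You are a support agent.", "What plan am I on?")
print(prompt)
```

On the second request the first message is recalled because it shares words with the question, so it appears inside the [MEMORY] block ahead of the base instructions.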

Full code example

The OpenAI Agents SDK exposes lifecycle hooks via RunHooks. We implement on_agent_start to inject context and on_agent_end to persist the exchange.

Install

pip install openai-agents kronvex

memory_hooks.py

import os
from typing import Any, Optional

from agents import Agent, RunContextWrapper, RunHooks
from kronvex import Kronvex

kv = Kronvex(os.environ["KRONVEX_API_KEY"])

class PersistentMemoryHooks(RunHooks):
    def __init__(self, agent_id: str, session_id: Optional[str] = None):
        self.kv_agent = kv.agent(agent_id)
        self.session_id = session_id
        self._last_input = None
        self._base = {}  # original instructions, keyed by agent name

    async def on_agent_start(
        self,
        context: RunContextWrapper,
        agent: Agent,
    ) -> None:
        # Grab the last user message as the recall query.
        # The SDK does not populate this for you: the caller passes
        # context={"messages": [...]} to Runner.run() (see main.py).
        messages = (context.context or {}).get("messages", [])
        query = messages[-1]["content"] if messages else ""
        self._last_input = query

        # Fetch relevant memories
        ctx = self.kv_agent.inject_context(
            query=query,
            top_k=6,
            session_id=self.session_id,
        )

        # Prepend the memory block to the agent's original instructions.
        # Rebuilding from the saved base stops blocks piling up across runs.
        base = self._base.setdefault(agent.name, agent.instructions or "")
        if ctx.context:
            agent.instructions = f"[MEMORY]\n{ctx.context}\n[/MEMORY]\n\n" + base
        else:
            agent.instructions = base

    async def on_agent_end(
        self,
        context: RunContextWrapper,
        agent: Agent,
        output: Any,
    ) -> None:
        # The SDK passes the agent's final output here, not a RunResult
        kwargs = dict(session_id=self.session_id) if self.session_id else {}

        # Store user input
        if self._last_input:
            self.kv_agent.remember(
                content=self._last_input,
                memory_type="episodic",
                **kwargs,
            )

        # Store agent response
        if output:
            self.kv_agent.remember(
                content=str(output),
                memory_type="episodic",
                **kwargs,
            )
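
The hook above takes the last message as the recall query, which assumes the input is a non-empty list that ends with a user turn. A slightly more defensive version might look like the sketch below. last_user_text and with_memory are hypothetical helper names, not SDK or Kronvex functions; the message shape (dicts with role and content) follows the examples in this article.

```python
from typing import Union

Message = dict  # {"role": ..., "content": ...}, as used in this article


def last_user_text(run_input: Union[str, list[Message]]) -> str:
    """Return the most recent user message, or "" if there is none."""
    if isinstance(run_input, str):
        return run_input
    for message in reversed(run_input):
        if message.get("role") == "user" and isinstance(message.get("content"), str):
            return message["content"]
    return ""


def with_memory(base_instructions: str, memory: str) -> str:
    """Build instructions from the unmodified base, so repeated calls
    never stack one [MEMORY] block on top of another."""
    if not memory:
        return base_instructions
    return f"[MEMORY]\n{memory}\n[/MEMORY]\n\n{base_instructions}"


history = [
    {"role": "user", "content": "What plan am I on?"},
    {"role": "assistant", "content": "Let me check."},
]
print(last_user_text(history))
print(with_memory("You are a helpful customer support agent.", "User is on the Pro plan."))
```

Keeping both steps as pure functions also makes them easy to unit test without calling the model or the memory service.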

main.py

import asyncio

from agents import Agent, Runner
from memory_hooks import PersistentMemoryHooks

agent = Agent(
    name="SupportAgent",
    instructions="You are a helpful customer support agent.",
    model="gpt-4o",
)

hooks = PersistentMemoryHooks(
    agent_id="your-kronvex-agent-id",
    session_id="user-42",  # scope per user
)

async def ask(text: str) -> str:
    messages = [{"role": "user", "content": text}]
    result = await Runner.run(
        agent,
        input=messages,
        context={"messages": messages},  # read by the hooks
        hooks=hooks,
    )
    return result.final_output

async def main() -> None:
    # Session 1
    await ask("I'm on the Pro plan, invoice #4821.")

    # Session 2: a later run for the same user (in practice, a new process)
    print(await ask("What plan am I on?"))
    # → e.g. "You're on the Pro plan. I can also see your invoice #4821."

asyncio.run(main())

Before / after comparison

The difference is visible at the first return session:

Scenario	Without memory	With Kronvex
User returns after 3 days	Treated as new user	Context instantly restored
User preference ("no bullet points")	Lost after session	Recalled on every session
Multi-session workflow	Breaks without full history in prompt	Seamless continuation
100 sessions in context	~200k tokens re-sent on every request	Top-6 relevant snippets only

Production considerations

EU hosting and GDPR

Kronvex stores all memory vectors in an EU-hosted PostgreSQL instance (Supabase Frankfurt region). For B2B agents handling personal data, this matters: Article 46 transfer mechanisms, DPA availability, and data residency SLAs are all covered. No extra configuration needed.

Rate limits and quotas

The Kronvex API enforces per-plan limits on remember calls (writes) and recall calls (reads) separately. For high-throughput agents, batch your remember calls or use the async client to avoid blocking the response path.
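
One way to keep writes off the response path is to hand them to background tasks and reply immediately. The sketch below uses only asyncio; slow_remember is a hypothetical stand-in for a blocking memory write, not a Kronvex call.

```python
import asyncio
import time

saved: list[str] = []


def slow_remember(content: str) -> None:
    """Hypothetical blocking write (stands in for a network call)."""
    time.sleep(0.05)
    saved.append(content)


async def respond(user_message: str, background: set) -> str:
    reply = f"echo: {user_message}"
    for content in (user_message, reply):
        # Run the blocking write in a worker thread; do not await it here.
        task = asyncio.create_task(asyncio.to_thread(slow_remember, content))
        background.add(task)  # hold a reference so the task is not garbage-collected
        task.add_done_callback(background.discard)
    return reply


async def main() -> tuple[str, int]:
    background: set = set()
    reply = await respond("hello", background)
    pending_at_reply = len(background)  # writes still in flight when the reply is ready
    await asyncio.gather(*background)   # flush outstanding writes before shutdown
    return reply, pending_at_reply


reply, pending_at_reply = asyncio.run(main())
print(reply, pending_at_reply, sorted(saved))
```

The trade-off is that a crash between the reply and the flush loses those writes, so a long-running service should drain the background set on shutdown.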

Memory hygiene

Use memory_type="episodic" for conversation turns with ttl_days=30 to auto-expire stale exchanges
Use memory_type="semantic" for durable user facts (plan, preferences, company) with no TTL
Use memory_type="procedural" for agent-learned workflows and decision patterns
Set top_k=6 as a starting point — more memories means larger prompts and higher latency
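
The expiry rules above can be sketched as a small policy table. This illustrates the idea only and is not how Kronvex stores or expires memories: Memory, TTL_DAYS, and is_expired are hypothetical names, and only the 30-day episodic TTL comes from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Days to live per memory type; None means "never expires".
TTL_DAYS: dict[str, Optional[int]] = {
    "episodic": 30,
    "semantic": None,
    "procedural": None,  # assumption: the list above gives no TTL for this type
}


@dataclass
class Memory:
    content: str
    memory_type: str
    created_at: datetime


def is_expired(memory: Memory, now: datetime) -> bool:
    ttl = TTL_DAYS[memory.memory_type]
    return ttl is not None and now - memory.created_at > timedelta(days=ttl)


now = datetime(2026, 3, 22, tzinfo=timezone.utc)
memories = [
    Memory("asked about invoice #4821", "episodic", now - timedelta(days=45)),
    Memory("asked about plan limits", "episodic", now - timedelta(days=3)),
    Memory("is on the Pro plan", "semantic", now - timedelta(days=400)),
]
live = [m.content for m in memories if not is_expired(m, now)]
print(live)
```

The 45-day-old conversation turn drops out, while the durable fact about the user's plan survives regardless of age.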

Async agents: If you use Runner.run_streamed() or async tool calls, use the AsyncKronvex client to avoid blocking the event loop. Import it with from kronvex import AsyncKronvex.

Quick start

Three steps to add memory to any OpenAI agent:

1. Get your API key

# Sign up at https://kronvex.io — free plan, no credit card
# Create an agent in the dashboard → copy the agent ID

2. Install and configure

pip install openai-agents kronvex

export KRONVEX_API_KEY="kv-your-key"
export KRONVEX_AGENT_ID="your-agent-id"

3. Add the hooks (copy-paste ready)

import asyncio
import os

from agents import Agent, Runner, RunHooks
from kronvex import Kronvex

_kv = Kronvex(os.environ["KRONVEX_API_KEY"])
_kv_agent = _kv.agent(os.environ["KRONVEX_AGENT_ID"])

class MemoryHooks(RunHooks):
    def __init__(self, session_id: str):
        self.session_id = session_id
        self._q = None
        self._base = {}  # original instructions, keyed by agent name

    async def on_agent_start(self, ctx, agent):
        # Messages come from the context={"messages": ...} passed to Runner.run()
        msgs = (ctx.context or {}).get("messages", [])
        self._q = msgs[-1]["content"] if msgs else ""
        mem = _kv_agent.inject_context(self._q, top_k=6, session_id=self.session_id)
        # Rebuild from the original instructions so memory blocks do not pile up
        base = self._base.setdefault(agent.name, agent.instructions or "")
        agent.instructions = (f"[MEMORY]\n{mem.context}\n[/MEMORY]\n\n" + base) if mem.context else base

    async def on_agent_end(self, ctx, agent, output):
        # `output` is the agent's final output, not a RunResult
        kw = dict(session_id=self.session_id)
        if self._q: _kv_agent.remember(self._q, memory_type="episodic", **kw)
        if output: _kv_agent.remember(str(output), memory_type="episodic", **kw)

# Usage
agent = Agent(name="MyAgent", instructions="You are a helpful assistant.", model="gpt-4o")

async def main():
    messages = [...]  # your conversation turns
    result = await Runner.run(agent, input=messages, context={"messages": messages}, hooks=MemoryHooks("user-42"))
    print(result.final_output)

asyncio.run(main())
