ClawCon Ankara

The Other Half of the
Context Problem

B. Mert Köseoğlu · Creator of context-mode
Organized by BuilderMare — Veli Uysal & Furkan Demir · ClawCon & OpenClaw · Special thanks to Murat Aslan
Slide 1

Good evening everyone. Quick thank you to Veli and Furkan from BuilderMare, to Murat Aslan, and to ClawCon and the OpenClaw community for making this happen in Ankara. [pause] I'm Mert. I build open-source tools for AI coding agents. Tonight I want to show you a problem that every single person in this room is living with. You just might not have a name for it yet.

Your AI agent
has no memory.

Claude Code, Cursor, Copilot, Codex, Gemini CLI.
Under the hood, they all work the same way.

Slide 2

Let's start with a fact that might surprise you. [pause] Your AI coding agent has no memory. None. Claude Code, Cursor, Copilot, Codex, Gemini CLI. Under the hood, they all work exactly the same way. The model is stateless. It doesn't remember anything from one turn to the next. So how does it seem like it remembers? Let me show you.

Every turn, everything gets re-sent.

Turn 1 · your prompt + 1 tool result · ~60 KB
Turn 5 · everything from turns 1-4 + new output · ~300 KB
Turn 15 · all previous turns re-sent again · ~600 KB
Turn 30 · context window almost full · ~1.2 MB
Turn 31 · COMPACTION — agent summarizes & forgets everything · RESET
Slide 3

Here's what actually happens. [pause] Turn one, you ask something, a tool runs, sixty kilobytes come back. That's fine. Turn five, all the output from turns one through four gets re-sent along with the new stuff. Three hundred kilobytes now. Turn fifteen, everything ships again. Six hundred kilobytes. Turn thirty, you're at 1.2 megabytes and the context window is almost full. [pause] Turn thirty-one? The agent compacts. It summarizes the conversation. Throws away the originals. And just like that, it forgets everything it read, every decision it made, the task it was in the middle of. You start over.
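
A written aside for the notes: the same mechanism as a minimal sketch, assuming a generic stateless chat client. The callModel function and the message shape are illustrative, not any specific vendor's SDK.

// Illustrative only: a stateless agent loop. The model sees nothing but
// what is in `messages`, so every prior turn is re-sent on every call.
const messages = []

async function turn(userText, callModel /* hypothetical model client */) {
  messages.push({ role: 'user', content: userText })
  const reply = await callModel(messages)   // the full history ships again here
  messages.push({ role: 'assistant', content: reply })
  // Tool results get appended too, so each 60 KB output is paid for on every
  // subsequent turn until the window fills and compaction wipes it.
  return reply
}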

One command. 750,000 tokens.

gh issue list
59 KB of JSON comes back. That’s about 15,000 tokens.
Next turn, those 15,000 tokens get sent again.
Turn after that, again.
Fifty turns in, that one command has cost you 750,000 tokens.
Now do that 20 times in a session:
30 MB of input tokens on tool output alone.
Slide 4

Let me make this concrete. [pause] You run gh issue list. One command. Fifty-nine kilobytes of JSON come back. About fifteen thousand tokens. Seems reasonable, right? [pause] But next turn, those fifteen thousand tokens ship again. Turn after that, again. Fifty turns in, that single command has cost you seven hundred fifty thousand input tokens. [pause] And you're not running one command per session. You're running twenty. Average output: thirty kilobytes each. That's thirty megabytes of input tokens. On tool output alone. That's where your context goes.
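
A quick sanity check on those two numbers, for the notes. The only assumption is the slide's own figure of roughly 15,000 tokens per 59 KB of JSON.

// One command: ~15,000 tokens re-sent on each of the next 50 turns.
console.log(15000 * 50)            // 750000 input tokens from a single gh issue list
// Twenty commands averaging 30 KB each, re-sent across 50 turns:
console.log(20 * 30 * 50 / 1024)   // ≈ 29.3, call it 30 MB of tool output per session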

When the context fills up,
everything disappears.

The agent compacts. Summarizes the conversation. Throws away the originals. Gone:

× Files it read
× Decisions it made
× Errors it fixed
× Your corrections
× The task in progress
× Your instructions
50% accuracy drop at 32K tokens (NoLiMa, 2025)
15 min lost re-explaining your codebase
$60K/year for a 50-seat team on an agent that forgets
Slide 5

And when it fills up, everything disappears. [pause] Files it read. Gone. Decisions it made. Gone. Errors it encountered and fixed. Gone. Your instructions and corrections. Gone. The task it was in the middle of building. Gone. [pause] Fifty percent accuracy drop after thirty-two thousand tokens. Fifteen minutes lost every time it resets, re-explaining your codebase from scratch. Sixty thousand dollars a year for a fifty-seat team, burned on an agent that forgets everything every twenty minutes.

What if the data
never entered context?

Slide 6

[let it sit for a moment] What if the data never entered the context window in the first place?

Intercept.

Agent calls gh issue list → 59 KB raw JSON output
context-mode (5 lifecycle hooks) intercepts via PostToolUse hook
→ context window: 1.1 KB summary only
→ FTS5 database: 59 KB, searchable, indexed
Slide 7

context-mode intercepts. [pause] Your agent calls gh issue list. Fifty-nine kilobytes of JSON come back. Normally, all of that dumps straight into your context window. But context-mode is sitting between the agent and its tools. It uses five lifecycle hooks — PreToolUse, PostToolUse, SessionStart, PreCompact, UserPromptSubmit — to intercept tool output before it reaches the conversation. The full fifty-nine kilobytes goes to a local FTS5 database. Only a one-point-one kilobyte summary enters context. The agent still gets the answer. Your context window barely notices.
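
For readers of the notes, a minimal sketch of what that interception can look like. This is not context-mode's actual implementation: the hook payload fields, database path, and summary format are assumptions for illustration. The only real APIs used are Node's fs module and the better-sqlite3 package's SQLite/FTS5 bindings.

// post-tool-use.js: illustrative hook handler; payload field names are assumed.
const fs = require('fs')
const Database = require('better-sqlite3')

// The hook receives the tool call and its result as JSON on stdin.
const input = JSON.parse(fs.readFileSync(0, 'utf8'))
const tool = input.tool_name ?? 'unknown'
const full = JSON.stringify(input.tool_response ?? '')

// Store the full output in a local FTS5 table so it stays searchable.
const db = new Database('kb.db')
db.exec("CREATE VIRTUAL TABLE IF NOT EXISTS kb USING fts5(tool, body)")
db.prepare('INSERT INTO kb (tool, body) VALUES (?, ?)').run(tool, full)

// Only this short summary is meant to reach the conversation.
console.log(`[context-mode] stored ${full.length} bytes from ${tool}; search the knowledge base instead of re-reading it`)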

Sandbox.

Don’t pull data into context.
Send code to the data.

Before: Pull data into context
cat access.log   # 45 KB enters context
# Agent reads all 45 KB, writes summary
# 45 KB stays in context FOREVER
After: Think in Code™
ctx_execute("javascript", `
  const logs = fs.readFileSync('access.log')
  const errors = logs.filter(l => l.status >= 500)
  console.log(errors.length + ' server errors')
`)
45 KB enters context (before)
155 B enters context (after)
200× reduction
Slide 8

Sandbox. [pause] This is the paradigm shift. We call it Think in Code. Instead of pulling forty-five kilobytes of access logs into your context window for the agent to read, you send code to the data. The agent writes a small script, context-mode executes it in a sandbox, and only the stdout — one hundred fifty-five bytes — enters context. [pause] Two hundred times reduction. The agent still gets the answer. It wrote the analysis. But the raw data never touched the context window. This is mandatory across all fourteen platforms context-mode supports.

Index.

Everything the agent learns is searchable forever. Even across compactions.

Local FTS5/BM25 Knowledge Base
tool outputs · auto-indexed
web pages fetched · chunked & indexed
file analysis results · sandbox stdout
session events (26 categories) · survive compaction
Session Persistence

26 event categories carry over through compaction and --resume. The agent never has to re-learn your codebase.

Slide 9

Index. [pause] Everything the agent touches goes into a local FTS5 database with BM25 ranking. Tool outputs, web pages, sandbox results — all searchable. But here's what makes it special: session persistence. [pause] Twenty-six event categories — files modified, decisions made, errors encountered, user corrections, git state, environment details — all of it carries over through compactions and even through session resume. The agent compacts? It doesn't forget. It searches the knowledge base and picks up right where it left off. You never have to explain your codebase twice.
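
And a sketch of what retrieval after a compaction can look like, against the same hypothetical kb table as above. The MATCH, bm25(), and snippet() calls are standard SQLite FTS5; the table, query, and database path are illustrative.

// Illustrative: instead of re-reading files after a reset, search what was indexed.
const Database = require('better-sqlite3')
const db = new Database('kb.db')

const rows = db.prepare(`
  SELECT tool, snippet(kb, 1, '[', ']', '…', 12) AS hit
  FROM kb
  WHERE kb MATCH ?
  ORDER BY bm25(kb)   -- lower bm25() score = better match
  LIMIT 5
`).all('payment AND webhook')

for (const r of rows) console.log(r.tool, '→', r.hit)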

Before & After.

Without context-mode → with context-mode
Playwright snapshot · 56 KB → 299 B (99.5%)
GitHub Issues query · 59 KB → 1.1 KB (98%)
Access log analysis · 45 KB → 155 B (99.7%)
Per session (50 turns) · 30 MB re-sent → 1 MB re-sent
98–99.5% context reduction. Same work. Same answers.
Slide 10

Let's look at the numbers. [pause] Playwright snapshot: fifty-six kilobytes down to two hundred ninety-nine bytes. Ninety-nine point five percent reduction. GitHub issues: fifty-nine kilobytes down to one point one. Access logs: forty-five kilobytes down to one hundred fifty-five bytes. [pause] Per session, thirty megabytes re-sent becomes one megabyte. Ninety-eight to ninety-nine point five percent context reduction. Same work. Same answers. The agent doesn't lose any capability. It just stops wasting your context window on data it already processed.

context-mode

The other half of the context problem.

95K+ users · 79K npm downloads · 14 platforms · 10K GitHub stars
Claude Code · Codex · Cursor · Gemini · OpenClaw · Kiro · Cline · +7
context-mode.com

Thank you.

Slide 11

context-mode. Open source. Always has been. [pause] Ninety-five thousand users. Seventy-nine thousand npm downloads. Fourteen platform adapters, including a native OpenClaw adapter, which feels right to mention here tonight. Ten thousand GitHub stars. [pause] It's open source because I believe the answer to the problem I just showed you can only come from the community, not from vendors. [pause] context-mode.com. Thank you again to Veli, Furkan, Murat, and ClawCon.