ClawCon Ankara

The Other Half of the
Context Problem

B. Mert Köseoğlu · Creator of context-mode
Organized by BuilderMare — Veli Uysal & Furkan Demir · ClawCon & OpenClaw · Special thanks to Murat Aslan
Slide 1

Good evening everyone. Quick thank you to Veli and Furkan from BuilderMare, to Murat Aslan, and to ClawCon and the OpenClaw community for making this happen in Ankara. [pause] I'm Mert. I build open-source tools for AI coding agents. Tonight I want to show you a problem that every single person in this room is living with. You just might not have a name for it yet.

Your AI agent
has no memory.

Claude Code, Cursor, Copilot, Codex, Gemini CLI.
Under the hood, they all work the same way.

Slide 2

Let's start with a fact that might surprise you. [pause] Your AI coding agent has no memory. None. Claude Code, Cursor, Copilot, Codex, Gemini CLI. Under the hood, they all work exactly the same way. The model is stateless. It doesn't remember anything from one turn to the next. So how does it seem like it remembers? Let me show you.

Every turn, everything gets re-sent.

Turn 1 · your prompt + 1 tool result · ~60 KB
Turn 5 · everything from turns 1-4 + new output · ~300 KB
Turn 15 · all previous turns re-sent again · ~600 KB
Turn 30 · context window almost full · ~1.2 MB
Turn 31 · COMPACTION — agent summarizes & forgets everything · RESET
Slide 3

Here's what actually happens. [pause] Turn one, you ask something, a tool runs, sixty kilobytes come back. That's fine. Turn five, all the output from turns one through four gets re-sent along with the new stuff. Three hundred kilobytes now. Turn fifteen, everything ships again. Six hundred kilobytes. Turn thirty, you're at 1.2 megabytes and the context window is almost full. [pause] Turn thirty-one? The agent compacts. It summarizes the conversation. Throws away the originals. And just like that, it forgets everything it read, every decision it made, the task it was in the middle of. You start over.
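
A written aside for the notes: the same mechanism as a minimal sketch, assuming a generic stateless chat client. The callModel function and the message shape are illustrative, not any specific vendor's SDK.

// Illustrative only: a stateless agent loop. The model sees nothing but
// what is in `messages`, so every prior turn is re-sent on every call.
const messages = []

async function turn(userText, callModel /* hypothetical model client */) {
  messages.push({ role: 'user', content: userText })
  const reply = await callModel(messages)   // the full history ships again here
  messages.push({ role: 'assistant', content: reply })
  // Tool results get appended too, so each 60 KB output is paid for on every
  // subsequent turn until the window fills and compaction wipes it.
  return reply
}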

One command. 750,000 tokens.

gh issue list
59 KB of JSON comes back. That’s about 15,000 tokens.
Next turn, those 15,000 tokens get sent again.
Turn after that, again.
Fifty turns in, that one command has cost you 750,000 tokens.
Now do that 20 times in a session:
30 MB of input tokens on tool output alone.
Slide 4

Let me make this concrete. [pause] You run gh issue list. One command. Fifty-nine kilobytes of JSON come back. About fifteen thousand tokens. Seems reasonable, right? [pause] But next turn, those fifteen thousand tokens ship again. Turn after that, again. Fifty turns in, that single command has cost you seven hundred fifty thousand input tokens. [pause] And you're not running one command per session. You're running twenty. Average output: thirty kilobytes each. That's thirty megabytes of input tokens. On tool output alone. That's where your context goes.
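
A quick sanity check on those two numbers, for the notes. The only assumption is the slide's own figure of roughly 15,000 tokens per 59 KB of JSON.

// One command: ~15,000 tokens re-sent on each of the next 50 turns.
console.log(15000 * 50)            // 750000 input tokens from a single gh issue list
// Twenty commands averaging 30 KB each, re-sent across 50 turns:
console.log(20 * 30 * 50 / 1024)   // ≈ 29.3, call it 30 MB of tool output per session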

When the context fills up,
everything disappears.

The agent compacts. Summarizes the conversation. Throws away the originals. Gone:

× Files it read
× Decisions it made
× Errors it fixed
× Your corrections
× The task in progress
× Your instructions
50% accuracy drop at 32K tokens (NoLiMa, 2025)
15 min lost re-explaining your codebase
$60K/year for a 50-seat team on an agent that forgets
Slide 5

And when it fills up, everything disappears. [pause] Files it read. Gone. Decisions it made. Gone. Errors it encountered and fixed. Gone. Your instructions and corrections. Gone. The task it was in the middle of building. Gone. [pause] Fifty percent accuracy drop after thirty-two thousand tokens. Fifteen minutes lost every time it resets, re-explaining your codebase from scratch. Sixty thousand dollars a year for a fifty-seat team, burned on an agent that forgets everything every twenty minutes.

What if the data
never entered context?

Slide 6

[let it sit for a moment] What if the data never entered the context window in the first place?

Intercept.

Agent calls gh issue list → 59 KB raw JSON output
context-mode (5 lifecycle hooks) intercepts via PostToolUse hook
→ context window: 1.1 KB summary only
→ FTS5 database: 59 KB, searchable, indexed
Slide 7

context-mode intercepts. [pause] Your agent calls gh issue list. Fifty-nine kilobytes of JSON come back. Normally, all of that dumps straight into your context window. But context-mode is sitting between the agent and its tools. It uses five lifecycle hooks — PreToolUse, PostToolUse, SessionStart, PreCompact, UserPromptSubmit — to intercept tool output before it reaches the conversation. The full fifty-nine kilobytes goes to a local FTS5 database. Only a one-point-one kilobyte summary enters context. The agent still gets the answer. Your context window barely notices.
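
For readers of the notes, a minimal sketch of what that interception can look like. This is not context-mode's actual implementation: the hook payload fields, database path, and summary format are assumptions for illustration. The only real APIs used are Node's fs module and the better-sqlite3 package's SQLite/FTS5 bindings.

// post-tool-use.js: illustrative hook handler; payload field names are assumed.
const fs = require('fs')
const Database = require('better-sqlite3')

// The hook receives the tool call and its result as JSON on stdin.
const input = JSON.parse(fs.readFileSync(0, 'utf8'))
const tool = input.tool_name ?? 'unknown'
const full = JSON.stringify(input.tool_response ?? '')

// Store the full output in a local FTS5 table so it stays searchable.
const db = new Database('kb.db')
db.exec("CREATE VIRTUAL TABLE IF NOT EXISTS kb USING fts5(tool, body)")
db.prepare('INSERT INTO kb (tool, body) VALUES (?, ?)').run(tool, full)

// Only this short summary is meant to reach the conversation.
console.log(`[context-mode] stored ${full.length} bytes from ${tool}; search the knowledge base instead of re-reading it`)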

Sandbox.

Don’t pull data into context.
Send code to the data.

Before: Pull data into context
cat access.log   # 45 KB enters context
# Agent reads all 45 KB, writes summary
# 45 KB stays in context FOREVER
After: Think in Code™
ctx_execute("javascript", `
  const logs = fs.readFileSync('access.log')
  const errors = logs.filter(l => l.status >= 500)
  console.log(errors.length + ' server errors')
`)
45 KB enters context (before)
155 B enters context (after)
200× reduction
Slide 8

Sandbox. [pause] This is the paradigm shift. We call it Think in Code. Instead of pulling forty-five kilobytes of access logs into your context window for the agent to read, you send code to the data. The agent writes a small script, context-mode executes it in a sandbox, and only the stdout — one hundred fifty-five bytes — enters context. [pause] Two hundred times reduction. The agent still gets the answer. It wrote the analysis. But the raw data never touched the context window. This is mandatory across all fourteen platforms context-mode supports.

Index.

Everything the agent learns is searchable forever. Even across compactions.

Local FTS5/BM25 Knowledge Base
tool outputs · auto-indexed
web pages fetched · chunked & indexed
file analysis results · sandbox stdout
session events (26 categories) · survive compaction
Session Persistence

26 event categories carry over through compaction and --resume. The agent never has to re-learn your codebase.

Slide 9

Index. [pause] Everything the agent touches goes into a local FTS5 database with BM25 ranking. Tool outputs, web pages, sandbox results — all searchable. But here's what makes it special: session persistence. [pause] Twenty-six event categories — files modified, decisions made, errors encountered, user corrections, git state, environment details — all of it carries over through compactions and even through session resume. The agent compacts? It doesn't forget. It searches the knowledge base and picks up right where it left off. You never have to explain your codebase twice.
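
And a sketch of what retrieval after a compaction can look like, against the same hypothetical kb table as above. The MATCH, bm25(), and snippet() calls are standard SQLite FTS5; the table, query, and database path are illustrative.

// Illustrative: instead of re-reading files after a reset, search what was indexed.
const Database = require('better-sqlite3')
const db = new Database('kb.db')

const rows = db.prepare(`
  SELECT tool, snippet(kb, 1, '[', ']', '…', 12) AS hit
  FROM kb
  WHERE kb MATCH ?
  ORDER BY bm25(kb)   -- lower bm25() score = better match
  LIMIT 5
`).all('payment AND webhook')

for (const r of rows) console.log(r.tool, '→', r.hit)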

Before & After.

Without context-mode → with context-mode
Playwright snapshot · 56 KB → 299 B (99.5%)
GitHub Issues query · 59 KB → 1.1 KB (98%)
Access log analysis · 45 KB → 155 B (99.7%)
Per session (50 turns) · 30 MB re-sent → 1 MB re-sent
98–99.5% context reduction. Same work. Same answers.
Slide 10

Let's look at the numbers. [pause] Playwright snapshot: fifty-six kilobytes down to two hundred ninety-nine bytes. Ninety-nine point five percent reduction. GitHub issues: fifty-nine kilobytes down to one point one. Access logs: forty-five kilobytes down to one hundred fifty-five bytes. [pause] Per session, thirty megabytes re-sent becomes one megabyte. Ninety-eight to ninety-nine point five percent context reduction. Same work. Same answers. The agent doesn't lose any capability. It just stops wasting your context window on data it already processed.

context-mode

The other half of the context problem.

95K+ users · 79K npm downloads · 14 platforms · 10K GitHub stars
Claude Code · Codex · Cursor · Gemini · OpenClaw · Kiro · Cline · +7
context-mode.com

Thank you.

Slide 11

context-mode. Open source. Always has been. [pause] Ninety-five thousand users. Seventy-nine thousand npm downloads. Fourteen platform adapters, including a native OpenClaw adapter, which feels right to mention here tonight. Ten thousand GitHub stars. [pause] It's open source because I believe the answer to the problem I just showed you can only come from the community, not from vendors. [pause] context-mode.com. Thank you again to Veli, Furkan, Murat, and ClawCon.