OpenClaw · March 28, 2026 · 9 min read

The memory system that makes your AI actually remember things

AI forgets everything between sessions. Here's the file-based memory system I built to fix that, and why decisions.md turned out to be the most important file of all.


David Bakke

Founder, Bakke & Co


The forgetting problem

Two weeks into running OpenClaw, I had 12 agents humming along. They could write code, draft emails, research competitors. Impressive stuff. But every single session started the same way: the agent had no idea what happened yesterday.

I'd ask Forge to continue working on the PortLink auth system. It would ask me what PortLink was. I'd tell Nyx to follow up on that email from a potential partner. She'd ask what email. Every morning felt like onboarding a new employee who happened to have amnesia.

This is the core problem with AI agents. The model itself doesn't remember anything between sessions. Your conversation ends, the context window clears, and everything you discussed vanishes. It doesn't matter how smart the model is. Without memory, you're stuck re-explaining the same things over and over.

I needed a system. Something simple enough that I could build it in a day, but structured enough that it would actually work across 26 agents without turning into a mess.

The three-file system

After trying a few approaches (including a Supabase-backed memory store that was way overengineered), I settled on three plain markdown files per agent:

  1. MEMORY.md -- the long-term brain
  2. memory/YYYY-MM-DD.md -- daily notes
  3. decisions.md -- the decision log

That's it. No database. No vector store for memory (Hermes uses ChromaDB for email search, but that's a separate system). Just files that get loaded into the agent's context at session start.

Let me walk through each one.

MEMORY.md: the always-on context

Every agent has a MEMORY.md file in its workspace. This file gets loaded into the agent's context window at the start of every single session. It's the first thing the agent "reads" before it does anything else.
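The always-on load can be as simple as prepending the file to the agent's base prompt. Here's a minimal sketch of that idea; the build_system_prompt helper and prompt layout are my own illustration, not OpenClaw's actual API:

```python
from pathlib import Path

def build_system_prompt(workspace: str, base_prompt: str) -> str:
    """Prepend the agent's MEMORY.md, if present, to its base system prompt."""
    memory_file = Path(workspace) / "MEMORY.md"
    if not memory_file.exists():
        return base_prompt
    memory = memory_file.read_text(encoding="utf-8")
    # MEMORY.md goes first so the agent "reads" it before anything else.
    return f"{memory}\n\n---\n\n{base_prompt}"
```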

Think of it as the agent's long-term memory. It contains everything the agent needs to know to be useful right away:

  • Its role and purpose
  • Key configuration details (API endpoints, database refs, account IDs)
  • Standing orders that never change
  • Hard-won lessons from past mistakes
  • Current project states

Here's a trimmed example from Hermes, my email agent:

# MEMORY.md -- Hermes

## Role
Email & CRM Manager for David Bakke.

## Architecture
- SQLite: data/crm.db -- 84k+ emails, V6 taxonomy
- ChromaDB: data/chromadb/ -- 89k+ docs
- FastAPI: http://localhost:8765

## Label System V6 -- LOCKED
11 action labels, 8 context labels, 14 type labels.
Auto-archive: PROMOTION + SCAM_DETECTED only.

The critical constraint: MEMORY.md must stay under 10,000 characters. I learned this the hard way. OpenClaw silently truncates files that are too long when loading them into context. Your agent loses the bottom half of its memory and you don't even know it. I now have every agent run wc -c MEMORY.md after any write.
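If you'd rather fail loudly than truncate silently, a guard like this can run after every write. It counts bytes rather than characters so the number matches wc -c; the helper name and the exact 10,000 limit are from my setup, adjust for yours:

```python
from pathlib import Path

MEMORY_LIMIT = 10_000  # bytes -- matches what `wc -c` reports

def check_memory_size(path: str, limit: int = MEMORY_LIMIT) -> int:
    """Return a memory file's size in bytes, raising if it exceeds the limit."""
    size = len(Path(path).read_bytes())
    if size > limit:
        raise ValueError(f"{path}: {size} bytes exceeds the {limit}-byte limit")
    return size
```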

What goes in MEMORY.md:

  • Facts that are true every session (credentials, architecture, rules)
  • Standing orders from me that should never be forgotten
  • Current state of active projects (updated regularly)
  • Lessons learned from failures

What does NOT go in MEMORY.md:

  • Anything that happened today (that goes in daily notes)
  • One-time instructions
  • Detailed specs or documentation (link to separate files instead)
  • Full conversation histories

Keep it tight. Every character counts against that 10K limit. I've gotten ruthless about trimming. If something can live in a separate file that gets read on demand, it should.

Daily notes: what happened today

The memory/YYYY-MM-DD.md files are the agent's short-term memory. Each day gets its own file. The agent writes to it during the session, capturing what it did, what it learned, and what's pending.

memory/
  2026-02-15.md
  2026-02-16.md
  2026-02-17.md
  2026-02-18.md

These files are NOT loaded automatically at session start. That would blow out the context window. Instead, the agent can read them on demand when it needs to remember what happened on a specific day.

A typical daily note looks like this:

# 2026-02-18

## Completed
- Fixed CORS issue on Hermes FastAPI endpoint
- Migrated 3,200 emails from old label format to V6
- Sent morning briefing to David (4 action items, 12 FYI)

## Pending
- 847 emails still need V6 classification (batch job running)
- David asked about calendar integration -- research tomorrow

## Issues
- Gmail API rate limit hit at 14:22 -- backed off 60s, resumed
- ChromaDB indexing slower than expected for Norwegian text

The value here isn't just for the agent. I read these files too. When I come back to an agent after a few days, I can scan the daily notes to catch up on what happened while I wasn't paying attention.

One thing I got wrong initially: I let agents write essays in their daily notes. Long paragraphs about what they were "thinking." That's useless. Daily notes should be terse. Facts, not feelings. What happened, not why the agent found it interesting.

decisions.md: the most important file

This is the file I didn't plan to build. It emerged out of frustration.

Here's what kept happening: I'd tell an agent to use approach A for something. Next session, it would use approach B. I'd correct it. Session after that, back to approach A. The agent was flip-flopping because it had no record of why a decision was made.

The decisions.md file fixes this. Every significant decision gets logged with a date, what was decided, and why.

# decisions.md -- Hermes

## 2026-03-08: AI classification model
Decision: Use GPT-4o-mini via OpenAI Batch API for email classification
Reason: Cheaper than Claude for bulk classification, good enough accuracy
Decided by: David

## 2026-03-16: Label System V6 -- LOCKED
Decision: Freeze label taxonomy. No changes without David's approval.
Reason: Three iterations caused data inconsistencies. Stability > optimization.
Decided by: David

## 2026-02-26: Email sending
Decision: Hermes NEVER sends email. Read-only + drafts only.
Reason: Early bug accidentally sent emails to contacts. Trust violation.
Decided by: David

Why is this the most important file? Because it prevents the agent from revisiting settled questions. When Hermes starts a new session and considers whether to use Claude or GPT for classification, it checks decisions.md and sees that David already decided on GPT-4o-mini. Question answered. No flip-flopping.

It also prevents me from flip-flopping. I can see my own reasoning from two weeks ago. Sometimes I'm about to change a decision, I read the "Reason" line, and I remember why I made that call in the first place.

The format matters. Each entry needs:

  • Date: when was this decided
  • Decision: what was decided (clear, unambiguous)
  • Reason: why (this is the part that prevents re-litigation)
  • Decided by: who made the call (usually me, sometimes the agent recommends and I approve)
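Because the format is fixed, appending an entry can be mechanical. A sketch of a logging helper, assuming the four-field layout above (the function itself is hypothetical, not something OpenClaw provides):

```python
from datetime import date
from pathlib import Path

def log_decision(path: str, topic: str, decision: str,
                 reason: str, decided_by: str = "David") -> None:
    """Append one dated entry to decisions.md in the four-field format."""
    entry = (
        f"\n## {date.today().isoformat()}: {topic}\n"
        f"Decision: {decision}\n"
        f"Reason: {reason}\n"
        f"Decided by: {decided_by}\n"
    )
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(entry)
```

Appending rather than editing in place also means the file doubles as a chronological audit trail: nothing gets rewritten, so old reasoning survives.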

The moment it clicked

About 14 days into OpenClaw, I had a session with Nyx where I asked her to review the status of all active projects. She pulled up her MEMORY.md, cross-referenced the daily notes from the past week, checked decisions.md for any recent changes, and gave me a status report that was actually accurate.

No re-explaining. No "what's PortLink?" No wrong assumptions. She knew the current state of six projects, remembered that we'd decided to pause Wheelhouse in favor of EventRipple, and flagged that Forge hadn't committed code in two days.

That's when I realized this wasn't just a convenience feature. Memory is what turns an AI from a tool into a team member. Without memory, you have a very smart stranger. With memory, you have a colleague who was here yesterday.

The broadcast problem

Memory gets more complicated when you have 26 agents. How do you tell all of them about a new decision?

My first approach: update each agent's MEMORY.md individually. That took forever and I always missed a few.

The solution was the broadcast system. I write a broadcast file in shared/broadcasts/ and Nyx distributes it to every agent using sessions_send. Each agent reads the broadcast and writes the relevant parts to its own MEMORY.md.

For example, when I decided on the Anthropic rate limit policy (max 2 parallel Claude jobs), I wrote one broadcast and all 26 agents absorbed it. Each agent now has that rule in its MEMORY.md.

Standing orders get a special tag in broadcasts: "Write this to your MEMORY.md now." That ensures the rule persists beyond the current session.
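If you don't have a sessions_send equivalent, the same fan-out works with plain files. This is a sketch under my own assumptions, not OpenClaw's mechanism: each agent directory gets a copy in an inbox/ folder that it reads (and folds into its own MEMORY.md) at the start of its next session.

```python
from pathlib import Path

def broadcast(agents_root: str, broadcast_file: str) -> list[str]:
    """Fan a broadcast file out to every agent directory under agents_root."""
    payload = Path(broadcast_file).read_text(encoding="utf-8")
    delivered = []
    for agent_dir in sorted(Path(agents_root).iterdir()):
        if not agent_dir.is_dir():
            continue  # skip stray files at the root
        inbox = agent_dir / "inbox"
        inbox.mkdir(exist_ok=True)
        (inbox / Path(broadcast_file).name).write_text(payload, encoding="utf-8")
        delivered.append(agent_dir.name)
    return delivered
```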

What I'd do differently

If I were starting over, I'd establish the 10,000 character limit on MEMORY.md from day one. I spent a painful weekend trimming bloated memory files after discovering the silent truncation issue. Some agents had 30K+ character MEMORY.md files and were losing context every session without me knowing.

I'd also enforce a stricter daily note format from the start. Some agents developed their own styles, which made it hard to parse when I needed to find something specific across multiple agents.

And I'd build decisions.md before anything else. That file should exist before the agent does its first piece of real work. Every decision, no matter how small, should be logged. The ones that seem obvious today are the ones you'll forget the reasoning for next month.

How to build this yourself

You don't need OpenClaw to use this system. Any AI setup where you can inject files into context works.

  1. Create a MEMORY.md in your project. Write down everything the AI needs to know to be useful. Keep it under your context limit.

  2. Create a memory/ directory. After each session, save a note about what happened. Even a few bullet points helps.

  3. Create a decisions.md. Every time you decide something about how the AI should work, log it. Date, decision, reason.

  4. At the start of each session, load MEMORY.md into context (or paste it if you're using ChatGPT/Claude directly).

The files don't have to be markdown. They don't have to follow my exact format. The point is that the information exists in a structured, persistent place outside the AI's ephemeral context window.
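The four steps above collapse into a few lines of glue. A minimal sketch, assuming my three-file layout; the output is whatever you paste or inject as session-start context:

```python
from datetime import date
from pathlib import Path

def build_context(workspace: str) -> str:
    """Assemble session-start context: MEMORY.md, decisions.md, today's note."""
    parts = []
    for name in ("MEMORY.md", "decisions.md"):
        f = Path(workspace) / name
        if f.exists():
            parts.append(f.read_text(encoding="utf-8"))
    today = Path(workspace) / "memory" / f"{date.today().isoformat()}.md"
    if today.exists():
        parts.append(today.read_text(encoding="utf-8"))
    return "\n\n---\n\n".join(parts)
```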

The AI itself won't remember you. But the files will. And that turns out to be enough.

Tags: openclaw, memory, context, productivity