Engineering

Fixing What We Broke: Agent Memory Access Control

We fixed a critical bug in our agent memory access control system. Here's what broke, how we fixed it, and what we're doing to prevent similar issues.

Sometimes the hardest bugs to fix are the ones we create ourselves. This week, we shipped a critical fix to our agent memory system—specifically, the Row Level Security (RLS) policies that control which agents can read and write their own memory.

What Happened

Our agent system relies on a Postgres table called agent_memory to store context, preferences, and learnings across tasks. We use RLS policies to ensure agents can only access memory in their authorized scopes—global memory, task-specific memory, and so on.
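To make the setup concrete, here's a minimal sketch of what a table like this might look like. The column names and scope values are illustrative assumptions, not our exact schema:

```sql
-- Hypothetical sketch of an agent_memory table; names are illustrative.
CREATE TABLE agent_memory (
    id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    agent_role text        NOT NULL,  -- which agent role owns this entry
    scope      text        NOT NULL,  -- e.g. 'global' or 'task'
    task_id    uuid,                  -- set when scope = 'task'
    content    jsonb       NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- RLS must be enabled explicitly. Once it is, a role with no matching
-- policy sees no rows at all -- which is exactly the failure mode below.
ALTER TABLE agent_memory ENABLE ROW LEVEL SECURITY;
```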

The problem? Our RLS policies weren't comprehensive enough. Some agent roles—particularly newer ones we'd added—couldn't read from or write to their memory scopes. They were effectively running blind, unable to learn from previous interactions or recall important context.

Issue SCE-58 tracked this bug, and it was causing real friction. Agents would attempt to store learnings but fail silently. Memory reads would return empty results when they shouldn't have. It was the kind of subtle failure that degrades user experience without triggering obvious errors.

The Fix

We rebuilt the RLS policies from the ground up, ensuring every agent role—current and future—has the appropriate permissions. The new migration (030_fix_agent_memory_rls_all_roles.sql) drops the incomplete policies and replaces them with a comprehensive set that covers all roles.

The core principle: if an agent is authenticated with a specific role, it should be able to read and write memory entries within its authorized scopes. No exceptions, no edge cases.
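A sketch of what that principle looks like as policies. This isn't the actual migration; the policy names and the `app.agent_role` session setting are assumptions. The key idea is keying the check on the session's own role identity rather than enumerating roles, so newly added agent roles are covered without touching the policies again:

```sql
-- Drop the incomplete policies and replace them with ones that cover
-- every role by construction (illustrative names and settings).
DROP POLICY IF EXISTS agent_memory_read  ON agent_memory;
DROP POLICY IF EXISTS agent_memory_write ON agent_memory;

CREATE POLICY agent_memory_read ON agent_memory
    FOR SELECT
    USING (agent_role = current_setting('app.agent_role', true));

CREATE POLICY agent_memory_write ON agent_memory
    FOR INSERT
    WITH CHECK (agent_role = current_setting('app.agent_role', true));
```

The second argument to `current_setting` makes it return NULL instead of erroring when the setting is absent, so an unauthenticated session simply matches no rows rather than breaking queries.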

Why It Matters

Agent memory is foundational to everything we're building. Without it, our agents can't improve over time, can't remember user preferences, and can't maintain context across complex multi-step tasks. This fix doesn't add new features—it ensures the features we already built actually work the way they're supposed to.

It's unglamorous work. But it's the kind of work that makes the difference between a system that's frustrating and one that's reliable.

What's Next

Now that RLS is solid, we're focused on improving memory retrieval performance. As agents store more context, we need smarter indexing and better ranking algorithms to surface the most relevant memories quickly.
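At the storage layer, the first step is usually index support for the common lookup pattern. A possible shape, assuming the illustrative schema above:

```sql
-- Composite index matching the typical lookup: a role fetching its own
-- recent memories within a scope (illustrative, not our production DDL).
CREATE INDEX idx_agent_memory_lookup
    ON agent_memory (agent_role, scope, created_at DESC);

-- If memory content itself is searched, a GIN index on the jsonb
-- payload supports containment queries like content @> '{"topic": "x"}'.
CREATE INDEX idx_agent_memory_content
    ON agent_memory USING gin (content);
```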

We're also building better observability around memory operations—monitoring successful writes, tracking read latency, and alerting when memory operations fail. The goal is to catch issues like this one before they impact production.
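One readily available starting point for the latency side is Postgres's own statistics. A hedged example, assuming the `pg_stat_statements` extension is installed:

```sql
-- Surface the slowest statements touching agent_memory
-- (requires pg_stat_statements; the LIMIT is arbitrary).
SELECT query, calls, mean_exec_time
  FROM pg_stat_statements
 WHERE query ILIKE '%agent_memory%'
 ORDER BY mean_exec_time DESC
 LIMIT 10;
```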

Finally, we're working on memory expiration and archival policies. Not all memories need to live forever, and we need intelligent ways to prioritize what gets kept and what gets pruned.
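One simple mechanism in this direction (an assumption, not a committed design): stamp entries with an optional expiry and prune on a schedule, leaving unstamped memories permanent by default.

```sql
-- Illustrative expiration sketch: NULL expires_at means "keep forever".
ALTER TABLE agent_memory ADD COLUMN expires_at timestamptz;

-- Run periodically, e.g. from a scheduled job.
DELETE FROM agent_memory
 WHERE expires_at IS NOT NULL
   AND expires_at < now();
```

Archival rather than deletion would instead move expired rows to a cold-storage table before pruning.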

Fixing bugs isn't glamorous, but it's how we build trust—with our users and with ourselves. Onward.