The Feature That Was There But Wasn't: Fixing Memory Lab in Sabine

Sometimes the hardest bugs to find aren't the ones that throw errors. They're the ones that pass every test and still don't work in production. We shipped Memory Lab integration to Sabine two weeks ago. The code was clean, the tests passed, and the feature was technically live. Except in practice it wasn't: no conversation ever reached it.

What Went Wrong

We added Memory Lab retrieval to retrieve_context at line 790 in lib/agent/retrieval.py. The function existed, and the integration worked when called directly. But Sabine's agents never call retrieve_context. They call retrieve_context_tiered instead, a function that orchestrates multi-source retrieval with priority fallbacks. The code review passed because the integration looked correct in isolation. The tests passed because they tested the function we modified, not the execution path users actually hit.
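To make the failure mode concrete, here is a minimal sketch of it. Only retrieve_context and retrieve_context_tiered are names from the real codebase; memory_lab_search, agent_turn, and the function bodies are hypothetical stand-ins, not the actual code in lib/agent/retrieval.py.

```python
# Hypothetical stand-ins that reproduce the shape of the bug.
calls = []  # records every query that reaches Memory Lab


def memory_lab_search(query):
    """Hypothetical Memory Lab lookup; logs that it was invoked."""
    calls.append(query)
    return [f"memory about {query}"]


def retrieve_context(query):
    """The function we modified: it includes Memory Lab."""
    return memory_lab_search(query)


def retrieve_context_tiered(query):
    """The function agents actually call: it never touched Memory Lab."""
    return [f"recent message about {query}"]


def agent_turn(query):
    """Hypothetical agent entry point, standing in for the production path."""
    return retrieve_context_tiered(query)


# The unit test on the modified function passes...
assert retrieve_context("coffee") == ["memory about coffee"]

# ...but the production path never reaches Memory Lab.
calls.clear()
agent_turn("coffee")
assert calls == []
```

Both assertions hold at once, which is exactly the problem: a green test on retrieve_context says nothing about what agent_turn does.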

This is the kind of mistake that should have been caught in code review. It should have been caught in testing. It should have been obvious. And yet it wasn't, because we were looking at the feature, not the system. We verified that Memory Lab retrieval worked; we didn't verify that Sabine actually used it.

The Fix

Commit 31d6d5d moves the Memory Lab integration from retrieve_context to retrieve_context_tiered. That's it. One function to another. Now when Sabine agents retrieve context during conversations, Memory Lab is part of the retrieval strategy. The feature that was always there is finally visible.
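As a rough illustration of what it means for Memory Lab to be part of the tiered strategy, the sketch below treats it as one more source in a priority-ordered list. This is an assumption-heavy simplification: the sources parameter, the limit parameter, the ordering, and the recent_messages and memory_lab helpers are all hypothetical, and the real retrieve_context_tiered may implement its fallbacks quite differently.

```python
from typing import Callable, List

# A source takes a query and returns matching context snippets.
Source = Callable[[str], List[str]]


def retrieve_context_tiered(query: str, sources: List[Source], limit: int = 5) -> List[str]:
    """Query sources in priority order until `limit` results are collected."""
    results: List[str] = []
    for source in sources:
        if len(results) >= limit:
            break  # higher-priority sources already filled the budget
        results.extend(source(query))
    return results[:limit]


def recent_messages(query: str) -> List[str]:
    """Hypothetical short-term source."""
    return [f"recent: {query}"]


def memory_lab(query: str) -> List[str]:
    """Hypothetical Memory Lab source, now registered in the tiered path."""
    return [f"memory: {query}"]


context = retrieve_context_tiered("coffee order", [recent_messages, memory_lab])
print(context)
```

In this sketch the fix amounts to adding memory_lab to the list that the tiered function iterates over, rather than calling it from a function nothing else uses.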

Why It Matters

Memory Lab stores long-term context about user preferences, conversation history, and learned patterns. Without it in the retrieval path, Sabine couldn't remember things between sessions. Every conversation started cold. Users would say "remember when we talked about X?" and Sabine would have no context to pull from—not because the memory didn't exist, but because the retrieval path never checked for it.

Now it does. Sabine can surface relevant memories during conversations, maintain context across sessions, and actually use the intelligence it's been accumulating. This is the kind of fix that doesn't add new features—it makes existing features actually work the way they were supposed to from day one.

What's Next

We're adding integration tests that trace the full execution path from user input to retrieval to response generation. Unit tests verify that functions work; integration tests verify that systems work. This bug slipped through because we had the former but not enough of the latter. We're also improving our code review process to include execution path analysis—not just "does this code work?" but "does this code get called?"
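A path-level test of this kind might look roughly like the following sketch. The Retriever and Agent classes and the respond and memory_lab_search methods are hypothetical, invented here for illustration. The part that carries over is the technique: drive the test from the same entry point users hit, and assert that the Memory Lab call was actually made along the way.

```python
from unittest import mock


class Retriever:
    """Hypothetical retrieval layer."""

    def memory_lab_search(self, query):
        return [f"memory: {query}"]

    def retrieve_context_tiered(self, query):
        # The path agents use; after the fix it consults Memory Lab.
        return self.memory_lab_search(query)


class Agent:
    """Hypothetical agent that turns user input into a response."""

    def __init__(self, retriever):
        self.retriever = retriever

    def respond(self, user_input):
        context = self.retriever.retrieve_context_tiered(user_input)
        return f"{user_input} | context: {context}"


def test_agent_path_reaches_memory_lab():
    retriever = Retriever()
    question = "remember when we talked about X?"
    # Spy on Memory Lab while still running the real lookup.
    with mock.patch.object(
        retriever, "memory_lab_search", wraps=retriever.memory_lab_search
    ) as spy:
        Agent(retriever).respond(question)
    # Fails if the production path stops calling Memory Lab.
    spy.assert_called_once_with(question)


test_agent_path_reaches_memory_lab()
```

Had a test like this existed two weeks ago, it would have failed on the original change, because the spy would have recorded zero calls.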

The best bugs are the ones that teach you something. This one taught us to test the path, not just the feature. And now Memory Lab is exactly where it needs to be.