
Teaching Sabine to Remember Stories, Not Just Facts

How we fixed Sabine's biographical memory by grouping atomic facts into narrative chunks — and why semantic similarity thresholds matter more than you think.

Sabine forgot where I was born. Not immediately — the ingestion worked fine. But when I asked her weeks later, she couldn't retrieve it. The memory was there. The embedding just didn't make the cut.

The Problem: Atomic Facts Don't Embed Well

Our biographical ingest was doing exactly what I told it to: breaking down personal history into atomic, verifiable assertions. "Subject was born in Peoria." "Subject attended University of Minnesota." Clean. Structured. Useless.

The issue: a five-word sentence gives an embedding model very little to work with, so the resulting vector carries a weak, generic signal. When Sabine later searched her memory for "Where was Ryan born?", the cosine similarity between the query embedding and "Subject was born in Peoria" scored below our 0.6 retrieval threshold. The memory existed but was effectively invisible.
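To make the failure mode concrete, here's a minimal sketch of threshold-gated retrieval. The vectors are toy stand-ins for real embeddings (any names here are illustrative, not Sabine's actual code): the atomic fact's vector lands just under the cutoff and silently drops out of the results.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, memories, threshold=0.6):
    """Return (score, text) pairs whose similarity clears the threshold."""
    hits = []
    for text, vec in memories:
        score = cosine_similarity(query_vec, vec)
        if score >= threshold:
            hits.append((score, text))
    return sorted(hits, reverse=True)

# Toy vectors standing in for real embeddings.
query = [0.9, 0.4, 0.1]  # "Where was Ryan born?"
memories = [
    ("Subject was born in Peoria.", [0.2, 0.9, 0.4]),      # short atomic fact: ~0.58
    ("Ryan's biography paragraph.", [0.8, 0.5, 0.2]),      # richer chunk: ~0.98
]
results = retrieve(query, memories)
```

The atomic fact isn't wrong and isn't missing; it just never crosses the gate, which is exactly why the bug looked like forgetting.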

The Fix: Group Facts Into Narrative Chunks

We shifted from atomic assertions to grouped narrative memories. Instead of storing "Subject was born in Peoria" as a standalone fact, we now cluster related biographical details into coherent paragraphs: "Ryan was born in Peoria, Illinois in 1985. He grew up in Minnesota and attended the University of Minnesota, where he studied computer science."

Longer text produces richer embeddings. The vector representation of a narrative chunk captures semantic relationships between facts, giving retrieval queries more surface area to match against. The same question now consistently scores above 0.7 similarity.
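The grouping step itself can be sketched simply. This is an assumption about the shape of the pipeline, not Sabine's real schema: facts carry a topic key, and everything sharing a key is merged into one paragraph before embedding.

```python
from collections import defaultdict

def group_into_chunks(facts):
    """Merge facts that share a topic into one narrative paragraph,
    giving the embedding more semantic surface area to match against."""
    grouped = defaultdict(list)
    for fact in facts:
        grouped[fact["topic"]].append(fact["text"])
    return {topic: " ".join(texts) for topic, texts in grouped.items()}

# Hypothetical atomic facts; the "topic" field is illustrative.
facts = [
    {"topic": "origins", "text": "Ryan was born in Peoria, Illinois in 1985."},
    {"topic": "origins", "text": "He grew up in Minnesota."},
    {"topic": "education", "text": "He attended the University of Minnesota."},
    {"topic": "education", "text": "He studied computer science."},
]

chunks = group_into_chunks(facts)
# Each chunk, not each fact, is what gets embedded and stored.
```

The individual assertions still exist upstream; only the unit of embedding changes.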

The Trade-Off

We lost some precision. If you ask Sabine to verify a specific fact, she now returns a paragraph instead of a single assertion. That's fine for a personal assistant. It would be a problem for a fact-checking system. Engineering is choosing which problems to have.

What's Next

This fix ships today in Sabine. The broader lesson applies across Strug Works: retrieval thresholds matter more than embedding model selection, and context usually beats precision for human-facing systems. We're applying this same grouping strategy to project memory and conversation history ingestion.

Sabine now remembers where I was born. Small win. Learned the hard way.