I built Sabine to be my Chief of Staff—to handle logistics, briefings, and reminders so I could focus on building. But when your AI assistant forgets to remind you, or tells you it succeeded when it didn't, you're worse off than doing it yourself. You've outsourced trust to a system that can't hold it.
Over the past week, I tracked down four interrelated bugs in Sabine's reminder system. Reminders would disappear from the UI after being set. The scheduling logic would enter an infinite loop, creating duplicate background tasks. Success messages would display even when the reminder failed to save. And the UI state wouldn't reflect what was actually in the database.
These weren't edge cases. They were core reliability failures. The feature existed—you could click the button, see the modal, set a time—but the contract was broken. A reminder system that doesn't reliably remind is worse than no reminder system, because you've transferred cognitive load without transferring the actual work.
What Changed
- Fixed the UI state sync so reminders persist after creation.
- Fixed the infinite scheduling loop by consolidating the reminder creation flow.
- Fixed the false success status by validating database writes before showing confirmation.
- Fixed state inconsistencies by ensuring the UI reads from a single source of truth.
The root causes spanned frontend state management and backend validation. The UI optimistically updated before confirming the write. The scheduler didn't debounce properly. Error handling was silent. Each bug compounded the others—a reminder that didn't save would still trigger a schedule, which would loop, which would show success.
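The loop fix amounts to making scheduling idempotent. Here's a minimal sketch, assuming a scheduler that tracks live task ids; the `Scheduler` class and its method names are illustrative, not Sabine's actual code:

```python
# Hypothetical sketch: an idempotent scheduler that tracks pending task ids,
# so scheduling the same reminder twice cannot spawn duplicate background tasks.

class Scheduler:
    def __init__(self):
        self._pending: set[str] = set()  # reminder ids with a live task

    def schedule(self, reminder_id: str) -> bool:
        """Register a background task once; refuse duplicates."""
        if reminder_id in self._pending:
            return False                 # already scheduled: no-op, no loop
        self._pending.add(reminder_id)
        return True

    def complete(self, reminder_id: str) -> None:
        """Clear the id once the task fires, so it can be rescheduled."""
        self._pending.discard(reminder_id)

s = Scheduler()
assert s.schedule("r1") is True
assert s.schedule("r1") is False  # the duplicate call is absorbed
s.complete("r1")
assert s.schedule("r1") is True   # reschedulable after firing
```

The point of the dedupe set is that a retriggered creation flow becomes a harmless no-op instead of another background task.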
I learned that autonomous systems require pessimistic validation and explicit contracts. Optimistic UI is fine for social apps where a missed like doesn't matter. But for an AI Chief of Staff, every action is a promise. If Sabine says it set a reminder, that reminder must exist, must fire at the right time, and must be recoverable if something goes wrong.
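Pessimistic validation in its simplest form looks like this: write, read back, and only then confirm. A minimal sketch, with an in-memory dict standing in for the real database and `create_reminder` as an illustrative name:

```python
# Hypothetical sketch of pessimistic validation: success is reported only
# after the reminder is read back from storage, not when the write returns.

def create_reminder(db: dict, reminder_id: str, when: str) -> dict:
    db[reminder_id] = {"when": when}        # attempt the write
    saved = db.get(reminder_id)             # read back: trust the store, not the call
    if saved is None or saved["when"] != when:
        return {"ok": False, "error": "write not confirmed"}
    return {"ok": True, "reminder": saved}  # only now may the UI show success

db = {}
result = create_reminder(db, "standup", "2025-06-02T09:00")
assert result["ok"] is True
assert db["standup"]["when"] == "2025-06-02T09:00"
```

The read-back costs one extra query per write, which is a fair trade when every confirmation is a promise.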
What's Next
Now that the reminder system is stable, I'm adding retry logic and failure notifications. If Sabine can't set a reminder, it should tell me immediately—not silently fail and let me discover it when I miss the meeting. I'm also building an audit log so I can trace every reminder from creation through execution.
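The retry-plus-audit idea above can be sketched as a bounded loop that records every attempt and surfaces failure loudly instead of silently. All names here (`save_fn`, `set_reminder_with_retry`) are hypothetical stand-ins, not Sabine's actual interfaces:

```python
# Hypothetical sketch: bounded retries with an audit trail. On exhaustion the
# caller gets an explicit failure instead of a silent one.

def set_reminder_with_retry(save_fn, reminder_id: str, attempts: int = 3):
    audit = []                                   # trace of every attempt
    for attempt in range(1, attempts + 1):
        try:
            save_fn(reminder_id)                 # stand-in for the real DB write
            audit.append((attempt, "saved"))
            return True, audit
        except IOError as exc:
            audit.append((attempt, f"failed: {exc}"))
    audit.append((attempts, "gave up: notify user immediately"))
    return False, audit

# A flaky save that fails twice, then succeeds.
calls = {"n": 0}
def flaky_save(_id):
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("db unavailable")

ok, audit = set_reminder_with_retry(flaky_save, "standup")
assert ok is True
assert len(audit) == 3  # two failures, then a success
```

The audit list doubles as the creation-to-execution trace: persist it and every reminder's history becomes inspectable after the fact.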
Longer term, this work feeds directly into Strug Works. If one human can't trust an AI assistant to set a reminder, how do we trust an AI engineering team to ship production code? The answer is the same: pessimistic validation, explicit contracts, observable state, and immediate failure feedback. Reliability scales, but only if you design for it from the start.
Sabine is back to being a Chief of Staff I can trust. And the lessons from fixing these bugs are now embedded in how Strug Works validates and ships its own code. Everything we ship is proven in production on ourselves first.