Hardening Email Reliability in Sabine

Email is deceptively complex. When you're building an AI partnership platform like Sabine that needs to read, understand, and act on email conversations in real time, reliability becomes non-negotiable.

We recently shipped a set of hardening improvements to Sabine's Gmail integration that address three areas where we were seeing intermittent failures: status tracking, connection management, and deployment visibility.

Status Tokens: Making State Observable

The first change introduces status tracking tokens throughout the email pipeline. Previously, when something failed in the Gmail connection layer, we had limited visibility into where and why. Now every significant state transition (authentication, message fetch, parsing, handoff to the agent layer) generates a trackable status token.

This isn't just logging. Status tokens are structured state that we can query, aggregate, and use to drive automated recovery. If a user reports email lag, we can trace the exact path their message took through the system and identify the bottleneck.
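To make the idea of queryable state concrete, here is a minimal sketch of what a status token and a trace lookup could look like. It is illustrative only and not Sabine's actual schema: the names StatusToken, StatusLog, Stage, and every field are assumptions, and the in-memory dictionary stands in for whatever store the real pipeline uses.

```python
from __future__ import annotations

import uuid
from collections import defaultdict
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """Pipeline stages named in the post; the enum itself is hypothetical."""
    AUTHENTICATION = "authentication"
    MESSAGE_FETCH = "message_fetch"
    PARSING = "parsing"
    AGENT_HANDOFF = "agent_handoff"


@dataclass(frozen=True)
class StatusToken:
    """One structured record per state transition (assumed fields)."""
    message_id: str
    stage: Stage
    ok: bool
    started_at: float   # seconds, any consistent clock
    finished_at: float
    token_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    @property
    def duration(self) -> float:
        return self.finished_at - self.started_at


class StatusLog:
    """In-memory stand-in for a queryable token store."""

    def __init__(self) -> None:
        self._by_message: dict[str, list[StatusToken]] = defaultdict(list)

    def record(self, token: StatusToken) -> None:
        self._by_message[token.message_id].append(token)

    def trace(self, message_id: str) -> list[StatusToken]:
        """Return the path one message took, ordered by start time."""
        tokens = self._by_message.get(message_id, [])
        return sorted(tokens, key=lambda t: t.started_at)

    def bottleneck(self, message_id: str) -> StatusToken | None:
        """Return the slowest stage for a message, or None if unknown."""
        return max(self.trace(message_id), key=lambda t: t.duration, default=None)
```

With something like this in place, answering an email-lag report becomes a call to trace() for the affected message followed by bottleneck() to see which stage took longest.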

MCP Singleton: One Connection, Many Clients

The second improvement moves our Model Context Protocol (MCP) server to a singleton pattern. Previously, we were spinning up a new MCP connection for each email operation. When several messages arrived at once, those concurrent connections raced each other and produced credential refresh conflicts.

Now a single MCP server instance manages all Gmail API calls for a given user session. Connection pooling, token refresh, and rate limiting all happen in one place. The result is fewer errors and better performance under load.
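The pattern described above can be sketched roughly as follows: one shared instance per user session, with token refresh serialized behind a lock so concurrent callers cannot trigger competing refreshes. This is a sketch under stated assumptions, not Sabine's implementation. GmailSessionClient, for_session, access_token, and the fetch_token callback are all hypothetical names, and no real MCP or Gmail API calls are made.

```python
from __future__ import annotations

import threading
import time
from typing import Callable


class GmailSessionClient:
    """Hypothetical per-session singleton; names are illustrative only."""

    _instances: dict[str, GmailSessionClient] = {}
    _instances_lock = threading.Lock()

    def __init__(self, session_id: str,
                 fetch_token: Callable[[], tuple[str, float]]) -> None:
        self.session_id = session_id
        # Assumed callback returning (access_token, lifetime_in_seconds).
        self._fetch_token = fetch_token
        self._token: str | None = None
        self._expires_at = 0.0
        self._refresh_lock = threading.Lock()

    @classmethod
    def for_session(cls, session_id: str,
                    fetch_token: Callable[[], tuple[str, float]]) -> GmailSessionClient:
        """Return the one shared instance for this session, creating it once."""
        with cls._instances_lock:
            instance = cls._instances.get(session_id)
            if instance is None:
                instance = cls(session_id, fetch_token)
                cls._instances[session_id] = instance
            return instance

    def access_token(self) -> str:
        """Refresh at most once per expiry, even under concurrent callers."""
        with self._refresh_lock:
            if self._token is None or time.monotonic() >= self._expires_at:
                self._token, lifetime = self._fetch_token()
                self._expires_at = time.monotonic() + lifetime
            return self._token
```

Because every caller in a session goes through the same instance, the refresh check and the refresh itself happen under one lock. The same choke point is a natural place to hang rate limiting.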

Deploy Runbook: Codifying Tribal Knowledge

The third piece is a deployment runbook. As the Gmail integration grew more critical, we realized our deployment process lived mostly in Slack threads and engineer memory. The runbook documents pre-deploy checks, rollout steps, and rollback procedures in a single canonical place.

This matters because email integrations are stateful. A bad deploy can orphan in-flight messages or corrupt watch subscriptions. The runbook ensures we verify connection health before and after every change.
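As a rough illustration of verifying health before and after a change, the sketch below runs the same set of checks on both sides of a deploy and rolls back if any of them fail afterwards. Everything here is an assumption made for illustration: Check, failed_checks, and guarded_deploy are hypothetical names, and the real runbook's checks and steps are not shown in this post.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Check:
    """A named health probe; run() returns True when healthy."""
    name: str
    run: Callable[[], bool]


def failed_checks(checks: list[Check]) -> list[str]:
    """Run every check and return the names of those that failed."""
    failures = []
    for check in checks:
        try:
            healthy = check.run()
        except Exception:
            # A probe that raises is treated as unhealthy.
            healthy = False
        if not healthy:
            failures.append(check.name)
    return failures


def guarded_deploy(checks: list[Check],
                   deploy: Callable[[], None],
                   rollback: Callable[[], None]) -> str:
    """Deploy only from a healthy state; roll back if health regresses."""
    before = failed_checks(checks)
    if before:
        return "aborted: " + ", ".join(before)
    deploy()
    after = failed_checks(checks)
    if after:
        rollback()
        return "rolled back: " + ", ".join(after)
    return "deployed"
```

In practice the checks might probe things like connection health or the state of watch subscriptions. Running the identical list before and after the deploy is what makes a regression attributable to the change itself.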

What's Next

These changes are live in production and we're already seeing fewer transient errors. Next up: extending status tokens into the agent decision layer so we can trace a message all the way from inbox to completed action. We're also working on automated health checks that use the runbook logic to verify system state on every deploy.

Email is infrastructure. It needs to be boring, reliable, and fast. This update gets us closer to that bar.