5-Phase Healing Pipeline and State Machine Migration
The biggest architectural change since launch. All status transitions now go through a centralized state machine, and the healing pipeline graduates mailboxes through 5 controlled phases instead of binary pause/resume.
What's in this release
- 5-phase healing pipeline (paused → quarantine → restricted → warm → healthy)
- State machine migration: a single authority for all status changes
- Mailbox rotation with standby mailboxes
- Correlation engine for cross-entity failure detection
5-Phase Healing Pipeline
When a mailbox gets paused, it no longer sits in limbo. It enters a graduated recovery with explicit criteria at each phase.
Phase 0: Paused
Cooldown timer with exponential backoff (24h first offense, 72h second, 7 days third+). Mailbox removed from all campaigns on the sending platform.
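The cooldown schedule above can be sketched as a small lookup. This is an illustration only: the function names cooldownMs and cooldownExpiresAt are hypothetical, and the only values taken from this release are the three durations (24h, 72h, 7 days).

```typescript
// Hypothetical helpers; only the three durations come from the release notes.
const HOUR_MS = 60 * 60 * 1000;

// Maps the number of times a mailbox has been paused to its cooldown length.
function cooldownMs(offenseCount: number): number {
  if (Math.floor(offenseCount) !== offenseCount || offenseCount < 1) {
    throw new RangeError("offenseCount must be a positive integer");
  }
  if (offenseCount === 1) return 24 * HOUR_MS; // first offense
  if (offenseCount === 2) return 72 * HOUR_MS; // second offense
  return 7 * 24 * HOUR_MS; // third offense and beyond
}

// Returns the moment the mailbox becomes eligible to leave Phase 0.
function cooldownExpiresAt(pausedAt: Date, offenseCount: number): Date {
  return new Date(pausedAt.getTime() + cooldownMs(offenseCount));
}
```

For example, a mailbox paused for the second time at midnight UTC on January 1 would become eligible for quarantine checks at midnight UTC on January 4.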
Phase 1: Quarantine
Cooldown expired. System checks domain DNS health — SPF, DKIM, blacklists. If the domain is broken, the mailbox stays here. No point warming up on a poisoned domain.
Phase 2: Restricted Send
DNS passed. Warmup re-enabled at 10 emails/day. Must complete 15 clean sends with zero bounces. Repeat offenders need 25.
Phase 3: Warm Recovery
Volume increases to 50/day with +5/day ramp. Must sustain 3+ days with bounce rate under 2%.
Phase 4: Healthy
Full recovery. Re-added to all campaigns. Maintenance warmup continues. Resilience score gets +10 bonus.
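The graduation criteria for the five phases can be summarized as one decision function. The sketch below is not the shipped implementation: the HealingPhase names other than warm_recovery and healthy, the RecoveryStats shape, and nextPhase itself are assumptions made for illustration. It covers only forward graduation and does not model what happens when a send bounces mid-phase.

```typescript
// Phase identifiers are assumed, apart from "warm_recovery" and "healthy".
type HealingPhase =
  | "paused"
  | "quarantine"
  | "restricted_send"
  | "warm_recovery"
  | "healthy";

// Hypothetical snapshot of the signals each phase looks at.
interface RecoveryStats {
  cooldownExpired: boolean;
  dnsHealthy: boolean; // SPF, DKIM and blacklist checks all passed
  cleanSends: number; // sends with zero bounces during restricted send
  repeatOffender: boolean;
  daysInWarmRecovery: number;
  bounceRate: number; // fraction, so 0.02 means 2%
}

// Returns the phase the mailbox should be in after evaluating the criteria.
function nextPhase(phase: HealingPhase, s: RecoveryStats): HealingPhase {
  switch (phase) {
    case "paused":
      return s.cooldownExpired ? "quarantine" : "paused";
    case "quarantine":
      // A broken domain keeps the mailbox in quarantine.
      return s.dnsHealthy ? "restricted_send" : "quarantine";
    case "restricted_send": {
      const required = s.repeatOffender ? 25 : 15;
      return s.cleanSends >= required ? "warm_recovery" : "restricted_send";
    }
    case "warm_recovery":
      return s.daysInWarmRecovery >= 3 && s.bounceRate < 0.02
        ? "healthy"
        : "warm_recovery";
    case "healthy":
      return "healthy";
  }
}
```

Keeping the criteria in one pure function makes each threshold easy to unit test without touching the sending platform.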
State Machine Migration
All 24+ direct status writes across the codebase were migrated to entityStateService — the single authority for status changes.
Centralized state transitions
entityStateService.ts validates every transition before execution. Invalid transitions (e.g., healthy → warm_recovery) are rejected. Full audit trail for every state change.
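A minimal sketch of transition validation with an audit trail follows. It does not reproduce entityStateService.ts: the ALLOWED_TRANSITIONS table, the AuditEntry shape, and applyTransition are hypothetical. The only rule taken from this release is that healthy → warm_recovery is rejected; the edges back to paused are an assumption about how relapses might be modeled.

```typescript
type MailboxStatus =
  | "paused"
  | "quarantine"
  | "restricted_send"
  | "warm_recovery"
  | "healthy";

// Assumed transition table. Edges back to "paused" are illustrative only.
const ALLOWED_TRANSITIONS: Record<MailboxStatus, readonly MailboxStatus[]> = {
  paused: ["quarantine"],
  quarantine: ["restricted_send", "paused"],
  restricted_send: ["warm_recovery", "paused"],
  warm_recovery: ["healthy", "paused"],
  healthy: ["paused"], // healthy → warm_recovery is deliberately absent
};

// Hypothetical audit record written for every accepted transition.
interface AuditEntry {
  entityId: string;
  from: MailboxStatus;
  to: MailboxStatus;
  reason: string;
  at: Date;
}

// Validates first, then records. Rejected transitions leave no audit entry.
function applyTransition(
  entityId: string,
  from: MailboxStatus,
  to: MailboxStatus,
  reason: string,
  auditLog: AuditEntry[],
): MailboxStatus {
  if (ALLOWED_TRANSITIONS[from].indexOf(to) === -1) {
    throw new Error(`Invalid transition for ${entityId}: ${from} -> ${to}`);
  }
  auditLog.push({ entityId, from, to, reason, at: new Date() });
  return to;
}
```

Routing every status write through one function like this is what makes it possible to reject an invalid move before anything is persisted.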
Cooldown and locking
Cooldown timers with exponential backoff. Optimistic locking on phase transitions prevents race conditions between workers.
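Optimistic locking is commonly implemented with a version column: a write succeeds only if the row still has the version the worker originally read. The sketch below uses an in-memory store to show the idea. The PhaseRow shape and the InMemoryPhaseStore class are hypothetical stand-ins, not the actual persistence layer.

```typescript
// Hypothetical row shape; "version" is the optimistic lock token.
interface PhaseRow {
  id: string;
  phase: string;
  version: number;
}

class InMemoryPhaseStore {
  private rows = new Map<string, PhaseRow>();

  insert(row: PhaseRow): void {
    this.rows.set(row.id, { ...row });
  }

  // Returns a copy so callers cannot mutate stored state directly.
  read(id: string): PhaseRow | undefined {
    const row = this.rows.get(id);
    return row ? { ...row } : undefined;
  }

  // Mirrors an SQL statement of the form:
  //   UPDATE ... SET phase = ?, version = version + 1
  //   WHERE id = ? AND version = ?
  // Returns false when another worker has already advanced the row.
  updateIfVersion(id: string, expectedVersion: number, newPhase: string): boolean {
    const row = this.rows.get(id);
    if (!row || row.version !== expectedVersion) return false;
    row.phase = newPhase;
    row.version += 1;
    return true;
  }
}
```

If two workers read the same version and both try to advance the phase, only the first write succeeds. The second sees a false return and can re-read the row before deciding what to do.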
Key rule enforced
Campaigns NEVER pause on bounce rate alone — only when ALL mailboxes are paused or removed.
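The rule can be expressed as a predicate that does not take bounce rate as an input at all. This is a sketch under assumptions: the CampaignMailboxState values and the shouldPauseCampaign name are illustrative, not taken from the codebase.

```typescript
// Assumed per-campaign mailbox states.
type CampaignMailboxState = "active" | "paused" | "removed";

// Bounce rate is intentionally not a parameter: the decision depends only on
// whether any mailbox is still able to send for the campaign.
function shouldPauseCampaign(states: readonly CampaignMailboxState[]): boolean {
  return states.every((s) => s === "paused" || s === "removed");
}
```

Note that an empty list also returns true, which matches the case where every mailbox has been removed from the campaign.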
Mailbox Rotation and Correlation
When a mailbox fails, the system now does more than pause it: it rotates in a standby and checks whether the mailbox is really the root cause.
Standby rotation
When a mailbox is paused, the system checks for standby mailboxes on the same domain and swaps them into affected campaigns automatically.
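A minimal sketch of the swap, assuming a simple in-memory model: the RotationMailbox shape and the rotateInStandby function are hypothetical, and picking the first matching standby is an arbitrary choice made for the example.

```typescript
// Hypothetical mailbox model used only for this illustration.
interface RotationMailbox {
  id: string;
  domain: string;
  role: "active" | "standby";
  campaignIds: string[];
}

// Moves the paused mailbox's campaigns onto a standby on the same domain.
// Returns the promoted standby, or null when no standby is available.
function rotateInStandby(
  paused: RotationMailbox,
  pool: RotationMailbox[],
): RotationMailbox | null {
  const standby = pool.find(
    (m) => m.role === "standby" && m.domain === paused.domain && m.id !== paused.id,
  );
  if (!standby) return null;

  standby.campaignIds = [...paused.campaignIds];
  standby.role = "active";
  paused.campaignIds = []; // the paused mailbox leaves all campaigns
  return standby;
}
```

When no standby exists on the domain the function returns null, and the affected campaigns simply continue with their remaining mailboxes.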
Correlation engine
Before pausing a mailbox, the system checks whether the root cause is actually at the domain level (multiple mailboxes failing) or the campaign level (bad lead list), then pauses the entity that is actually at fault.
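One way to sketch this decision is to count how many distinct mailboxes are failing on the same domain or in the same campaign. Everything below is an assumption made for illustration: the FailureSignal shape, the choosePauseTarget name, the default threshold of 3 mailboxes, and the choice to check the domain before the campaign. The actual engine's heuristics are not described in this release.

```typescript
// Hypothetical record of a recent failure (for example, a bounce spike).
interface FailureSignal {
  mailboxId: string;
  domain: string;
  campaignId: string;
}

interface PauseTarget {
  kind: "mailbox" | "domain" | "campaign";
  id: string;
}

function distinctMailboxCount(signals: FailureSignal[]): number {
  return new Set(signals.map((s) => s.mailboxId)).size;
}

// Picks the entity to pause. The threshold is an assumed tuning value.
function choosePauseTarget(
  trigger: FailureSignal,
  recent: FailureSignal[],
  threshold = 3,
): PauseTarget {
  const all = [...recent, trigger];

  // Several mailboxes failing on one domain points at the domain.
  const sameDomain = all.filter((s) => s.domain === trigger.domain);
  if (distinctMailboxCount(sameDomain) >= threshold) {
    return { kind: "domain", id: trigger.domain };
  }

  // Several mailboxes failing in one campaign points at the lead list.
  const sameCampaign = all.filter((s) => s.campaignId === trigger.campaignId);
  if (distinctMailboxCount(sameCampaign) >= threshold) {
    return { kind: "campaign", id: trigger.campaignId };
  }

  // No correlation found: treat it as an isolated mailbox problem.
  return { kind: "mailbox", id: trigger.mailboxId };
}
```

With this ordering, failures that cluster on both a domain and a campaign are attributed to the domain first, which is a design choice of the sketch rather than a documented behavior.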