Designing the Amy job scheduler

Most schedulers are deceptively simple — until you ask what happens when a worker dies between dequeueing a job and starting it. Or when a daylight-saving boundary creates two 1:30 AMs. Or when a user pauses an agent mid-run.

We split the problem into two halves: a stateless trigger layer that emits intent ("this schedule wants to fire at T") and a stateful dispatcher that owns idempotency and lifecycle. Each half is replaceable, and neither owns the other's bugs.
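To make the split concrete, here is a minimal sketch of how the two halves might meet. It is an illustration under assumptions, not Amy's actual code: the names `FireIntent`, `Dispatcher`, `submit`, and the key format are all hypothetical, and the in-memory dict stands in for whatever durable store a real dispatcher would use.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FireIntent:
    """Hypothetical message from the trigger layer: 'this schedule wants to fire at T'."""
    schedule_id: str
    fire_at: datetime  # must be timezone-aware

    @property
    def idempotency_key(self) -> str:
        # Normalize to UTC so the same instant always yields the same key,
        # whatever local wall-clock time it was expressed in.
        return f"{self.schedule_id}:{self.fire_at.astimezone(timezone.utc).isoformat()}"


class Dispatcher:
    """Hypothetical stateful half: owns deduplication and run state."""

    def __init__(self) -> None:
        # Assumption: a dict stands in for a durable store with a unique-key constraint.
        self._runs: dict[str, str] = {}

    def submit(self, intent: FireIntent) -> bool:
        """Return True if a new run was created, False if the intent was a duplicate."""
        key = intent.idempotency_key
        if key in self._runs:
            return False
        self._runs[key] = "pending"
        return True
```

Because the trigger layer holds no state in this sketch, it can emit the same intent twice after a crash or restart; the dispatcher absorbs the duplicate instead of starting a second run.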

The lesson, which we learned twice: the schedule is not the run. Conflate them and every retry becomes a duplicate-billing incident.
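One way to picture the distinction is to give the run its own record, separate from the schedule that produced it, and hang retries and billing off that record. The sketch below assumes a charge-once-per-run policy; `Run`, `Ledger`, `charge_once`, and `attempt` are hypothetical names for illustration, and the source does not describe how billing is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One execution produced by a schedule; retries belong to the run, not the schedule."""
    run_id: str
    schedule_id: str
    attempts: int = 0
    billed: bool = False


@dataclass
class Ledger:
    """Hypothetical billing ledger keyed by run, so retries cannot add charges."""
    charges: list[str] = field(default_factory=list)

    def charge_once(self, run: Run) -> None:
        # Assumed policy: a run is billed at most once, however many attempts it takes.
        if not run.billed:
            self.charges.append(run.run_id)
            run.billed = True


def attempt(run: Run, ledger: Ledger, succeed: bool) -> bool:
    """Record one attempt of the run; `succeed` simulates the outcome of the work."""
    run.attempts += 1
    ledger.charge_once(run)
    return succeed
```

A failed attempt followed by a retry leaves `run.attempts` at 2 but the ledger with a single charge. If retries instead re-fired the schedule, each one would create a fresh run and a fresh charge.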
