Notes on prompt caching at agent scale
What we measured running Amy with aggressive prompt caching: where it pays off, where it backfires, and the rule of thumb we use now.
Mira Patel2 min read
Notes from our work on agents, models, and orchestration.
What we measured running Amy with aggressive prompt caching: where it pays off, where it backfires, and the rule of thumb we use now.
You don't need a thousand-row eval set to catch the regression that matters. You need the right twenty rows.
Most teams reach for fine-tuning too early and prompt engineering too late. Here's the heuristic we use.