What I'm using under the hood

People keep asking what's running inside Toasted. Here's the honest answer: it changes. I rewrote the model layer three times in four months.

Right now it's a mix: one provider for the first draft, a smaller model for tone rewrites, and a tiny on-device model for the practice-mode pacing cues. The split keeps latency low and cost predictable.
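To make the split concrete, here is a rough sketch of how a per-task routing table could look. This is not Toasted's actual code: the task names, the tier labels, and the `model_for` helper are all placeholders made up for illustration.

```python
from enum import Enum


class Task(Enum):
    # Hypothetical task names mirroring the three jobs described above.
    FIRST_DRAFT = "first_draft"
    TONE_REWRITE = "tone_rewrite"
    PACING_CUES = "pacing_cues"


# Placeholder tier labels, not real provider or model names.
ROUTES = {
    Task.FIRST_DRAFT: "draft-provider",
    Task.TONE_REWRITE: "small-rewrite-model",
    Task.PACING_CUES: "on-device-tiny",
}


def model_for(task: Task) -> str:
    """Return the model tier assigned to a task."""
    return ROUTES[task]
```

With the routing in one table, swapping a model is a one-line change, which matters when the model layer gets rewritten as often as mine does.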

I won't name providers because the moment I do, this post becomes wrong. The principle is what matters: pick the cheapest model that passes your evaluator and let the work tell you when to upgrade.
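If you want that principle in code, a minimal sketch might look like the following. Everything named here is an assumption for illustration: `Candidate`, `cheapest_passing`, the `threshold` parameter, and the cost numbers are invented, and the evaluator is whatever pass/fail check you already trust.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str                       # hypothetical label, not a real provider
    cost_per_call: float            # made-up relative cost
    generate: Callable[[str], str]  # prompt in, model output out


def cheapest_passing(
    candidates: Sequence[Candidate],
    prompts: Sequence[str],
    evaluator: Callable[[str, str], bool],
    threshold: float = 1.0,
) -> Optional[Candidate]:
    """Return the cheapest candidate whose pass rate meets the threshold."""
    if not prompts:
        raise ValueError("need at least one evaluation prompt")
    # Try candidates from cheapest to most expensive and stop at the first pass.
    for candidate in sorted(candidates, key=lambda c: c.cost_per_call):
        passed = sum(1 for p in prompts if evaluator(p, candidate.generate(p)))
        if passed / len(prompts) >= threshold:
            return candidate
    return None  # nothing passes: fix the prompts or add a bigger model


# Stub models standing in for real ones.
tiny = Candidate("tiny", 1.0, lambda prompt: "ok")
mid = Candidate("mid", 5.0, lambda prompt: prompt + " (drafted)")
big = Candidate("big", 20.0, lambda prompt: prompt + " (drafted, polished)")

# Toy evaluator: the output must be at least as long as the prompt.
def long_enough(prompt: str, output: str) -> bool:
    return len(output) >= len(prompt)

choice = cheapest_passing([big, tiny, mid], ["wedding toast", "retirement toast"], long_enough)
```

Here `choice` ends up as the mid-tier stub: the tiny one fails the evaluator and the big one is never tried. Rerun the same check whenever the evaluator or the prompt set changes, and the work tells you when it is time to upgrade.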