Engineering · 5 min read

Reasoning models change which step needs which model

A new class of models trades latency and cost for deeper reasoning. That makes the question 'which model runs this step?' more important than ever.

A new class of models has arrived that spends more time, and more tokens, thinking before it answers, and is markedly better at hard reasoning as a result. These models are also slower and pricier per call. That trade-off is exactly the kind of thing an agent runtime should be able to exploit.

Not every step deserves deep reasoning

A crew's steps are not equally hard. Classifying an input or extracting a field is shallow work that a fast, cheap model does well. Decomposing an ambiguous goal or reconciling conflicting sources is deep work where a reasoning model earns its cost. Spending a reasoning model's budget on the shallow steps is pure waste; using a cheap model on the deep ones is false economy.
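To make the waste concrete, here is a minimal back-of-the-envelope sketch. Every number in it is an assumption made up for illustration: the two price tiers, the token counts, and the step names are hypothetical and do not describe any real model or any LoopLlama pricing. It also assumes, for simplicity, that a step consumes the same number of tokens on either tier, which understates the gap since reasoning models tend to spend more tokens.

```python
from dataclasses import dataclass

# Hypothetical prices in cents per 1,000 tokens. Illustration only.
PRICE_PER_1K_TOKENS = {"fast": 0.5, "reasoning": 10.0}


@dataclass(frozen=True)
class Step:
    name: str
    depth: str   # "shallow" or "deep"
    tokens: int  # assumed tokens consumed by the step


# Hypothetical crew: two shallow steps and two deep ones.
STEPS = [
    Step("classify_input", "shallow", 1000),
    Step("extract_field", "shallow", 1000),
    Step("decompose_goal", "deep", 4000),
    Step("reconcile_sources", "deep", 4000),
]


def run_cost(steps, pick_tier):
    """Total cost in cents when pick_tier(step) chooses the model tier."""
    return sum(PRICE_PER_1K_TOKENS[pick_tier(s)] * s.tokens / 1000 for s in steps)


# Strategy 1: send every step to the reasoning tier.
all_reasoning = run_cost(STEPS, lambda s: "reasoning")

# Strategy 2: reasoning tier for deep steps only, fast tier for the rest.
matched = run_cost(STEPS, lambda s: "reasoning" if s.depth == "deep" else "fast")

print(f"all reasoning: {all_reasoning} cents, matched: {matched} cents")
```

Under these invented numbers, the shallow steps account for the entire difference between the two totals. The sketch only prices the first failure mode; the false economy of running deep steps on a cheap model shows up as worse output, which a cost table alone cannot capture.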

The runtime should choose per step

This is why we treat the model as a per-step decision rather than a property of the whole run. The reasoning step gets the reasoning model; the formatting step gets something fast. As this class of models matures, the workflows that win won't be the ones that route everything to the smartest model. They'll be the ones that match each step to the model it actually needs.
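One way to picture a per-step decision is a step definition that carries its own model tier, resolved by the runner at call time. The sketch below is an assumption-laden illustration, not the LoopLlama API: `Step`, `run_crew`, `MODEL_FOR_TIER`, the model identifiers, and `fake_call_model` are all hypothetical names, and the model call is a stand-in that only records which model was selected.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical model identifiers; substitute whatever your provider offers.
MODEL_FOR_TIER: Dict[str, str] = {
    "fast": "fast-model",
    "reasoning": "reasoning-model",
}


@dataclass(frozen=True)
class Step:
    name: str
    prompt: str
    tier: str = "fast"  # cheap tier by default; a step opts in to reasoning


def run_crew(steps: List[Step], call_model: Callable[[str, str], str]) -> List[str]:
    """Run steps in order, resolving the model for each step individually."""
    outputs = []
    for step in steps:
        model = MODEL_FOR_TIER[step.tier]  # the per-step decision
        outputs.append(call_model(model, step.prompt))
    return outputs


def fake_call_model(model: str, prompt: str) -> str:
    """Stand-in for a real client call; it only echoes the chosen model."""
    return f"[{model}] {prompt}"


crew = [
    Step("plan", "Decompose the goal into tasks", tier="reasoning"),
    Step("format", "Render the plan as JSON"),  # falls back to the fast tier
]

results = run_crew(crew, fake_call_model)
for line in results:
    print(line)
```

Defaulting to the fast tier is a design choice in this sketch: a step has to ask for the reasoning model explicitly, so the expensive path is never taken by accident.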

Written by The LoopLlama team.
