A single chatbot interaction might consume a few thousand tokens. A useful agentic workflow can consume hundreds of thousands or even millions of tokens per day because it does more than answer a question. It decomposes the problem, retrieves context, reasons through options, invokes APIs, checks the output, and often runs multiple passes before reaching a result. The economics therefore have to be understood at the level of “agent instances,” not just model calls.
For the estimates below, I’m using a blended token cost of $3 per million tokens. This isn’t intended to reflect a single vendor’s list price. It’s a blended planning figure that assumes a mix of input and output tokens, reasoning steps, retrieval-augmented generation, summarization, tool calls, memory updates, and occasional use of larger context windows. Some enterprises will pay less through volume discounts or by routing work to smaller models. Others will pay more by using premium models, long-context prompts, web browsing, large document ingestion, and repeated reasoning loops.
The basic formula is simple. If an agent consumes 2 million tokens per day, it consumes 730 million tokens per year. At $3 per million tokens, that single agent costs about $2,190 per year in token burn. That number sounds surprisingly low until you multiply it by the number of agents, workflows, and users, plus the surrounding infrastructure required to run these systems safely.
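The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the planning formula: the function name, the 365-day year, and the fleet size of 500 agents are assumptions made for the example, not figures from the estimates.

```python
DAYS_PER_YEAR = 365  # assumption: agents run every day of the year


def annual_token_cost(tokens_per_day: float,
                      usd_per_million_tokens: float = 3.0) -> float:
    """Annual token-only cost in USD for one agent at a blended rate."""
    tokens_per_year = tokens_per_day * DAYS_PER_YEAR
    return tokens_per_year / 1_000_000 * usd_per_million_tokens


# The single-agent example from the text: 2 million tokens per day at $3.
single_agent = annual_token_cost(2_000_000)
print(f"One agent: ${single_agent:,.0f} per year")  # One agent: $2,190 per year

# Hypothetical fleet size, chosen only to show how the figure scales.
hypothetical_agents = 500
fleet = single_agent * hypothetical_agents
print(f"{hypothetical_agents} agents: ${fleet:,.0f} per year")
```

This covers token burn only. The surrounding infrastructure is a separate line item that this sketch does not attempt to model.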
What an agent really costs
In the model used here, the annual token-only cost per agent ranges from about $1,095 to $3,833, depending on the use case.
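Working backward from the two endpoints gives a sense of the daily volumes behind that range. The following is a rough sketch, assuming the same $3 blended rate and a 365-day year. The function name is illustrative, and the implied volumes are inferred from the stated range rather than taken from the model itself.

```python
def implied_tokens_per_day(annual_cost_usd: float,
                           usd_per_million_tokens: float = 3.0,
                           days_per_year: int = 365) -> float:
    """Invert the cost formula: daily tokens implied by an annual cost."""
    tokens_per_year = annual_cost_usd / usd_per_million_tokens * 1_000_000
    return tokens_per_year / days_per_year


# Endpoints of the annual token-only cost range quoted in the text.
for annual_cost in (1_095, 3_833):
    daily = implied_tokens_per_day(annual_cost)
    print(f"${annual_cost:,} per year is about {daily / 1e6:.2f}M tokens per day")
# $1,095 per year is about 1.00M tokens per day
# $3,833 per year is about 3.50M tokens per day
```

Under these assumptions, the range corresponds to agents consuming roughly 1 million to 3.5 million tokens per day.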
