Question 1

What is token-based pricing in AI?

Accepted Answer

Vendors bill by tokens: chunks of text that the model reads (input) and generates (output). Every API call consumes tokens, and list prices are quoted per million tokens, often with different rates for each model tier.
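The per-million arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's billing logic: the `Rate` class, the `call_cost` helper, and the prices of $3.00 and $15.00 per million tokens are hypothetical placeholders, so substitute your provider's published rates.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Rate:
    """List prices in USD per one million tokens (hypothetical values)."""
    input_per_million: float
    output_per_million: float


def call_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one API call at the given list prices."""
    total = (input_tokens * rate.input_per_million
             + output_tokens * rate.output_per_million)
    return total / TOKENS_PER_MILLION


# Hypothetical rates, chosen only to make the arithmetic easy to follow.
example_rate = Rate(input_per_million=3.00, output_per_million=15.00)

# One call that reads 1,200 tokens and generates 300 tokens.
cost = call_cost(example_rate, input_tokens=1_200, output_tokens=300)
print(f"${cost:.4f}")  # prints $0.0081
```

A single call costs a fraction of a cent at these rates, which is why the bill is usually reasoned about in aggregate rather than per call.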

Question 2

Why can AI token costs scale unexpectedly?

Accepted Answer

Production traffic, longer prompts, retrieval-augmented context, agent loops, and retries all multiply the tokens consumed per user action. A pilot that looked cheap at thousands of calls can exceed its budget at millions of calls unless rate limits and model routing are in place.
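A rough projection shows how these multipliers compound. The sketch below assumes a simple model in which every user action triggers a fixed number of model calls and a fixed fraction of calls is retried. The `monthly_cost` function, the traffic figures, and the default prices are all hypothetical and exist only to illustrate the scaling.

```python
def monthly_cost(
    actions: int,
    input_tokens: int,
    output_tokens: int,
    calls_per_action: int = 1,   # e.g. steps in an agent loop
    retry_rate: float = 0.0,     # fraction of calls that are re-sent
    input_price: float = 3.00,   # USD per 1M input tokens (hypothetical)
    output_price: float = 15.00, # USD per 1M output tokens (hypothetical)
) -> float:
    """Estimate monthly spend in USD under a simple multiplicative model."""
    calls = actions * calls_per_action * (1 + retry_rate)
    cost_per_call = (input_tokens * input_price
                     + output_tokens * output_price) / 1_000_000
    return calls * cost_per_call


# Pilot: few users, short prompts, one call per action, no retries.
pilot = monthly_cost(actions=5_000, input_tokens=500, output_tokens=200)

# Production: more traffic, retrieved context inflating the prompt,
# a four-step agent loop, and 10% of calls retried.
production = monthly_cost(
    actions=2_000_000,
    input_tokens=4_000,
    output_tokens=300,
    calls_per_action=4,
    retry_rate=0.10,
)

print(f"pilot:      ${pilot:,.2f}")       # pilot:      $22.50
print(f"production: ${production:,.2f}")  # production: $145,200.00
```

In this made-up scenario traffic grows 400 times, but spend grows by a factor of several thousand, because prompt length, loop depth, and retries each multiply the cost of every action.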

Question 3

Why do output tokens cost more than input tokens?

Accepted Answer

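Generating new tokens is more compute-intensive than encoding input: a prompt is processed in a single parallel pass, while output is produced one token at a time. Providers therefore price output at a premium, often several times the input rate, reflecting the GPU time that autoregressive decoding consumes at scale.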
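One consequence of the premium is that output can dominate the bill even when it is the smaller share of tokens. The following sketch assumes a hypothetical 5x output premium; the `output_cost_share` helper is illustrative only, and the actual ratio depends on the provider and model.

```python
def output_cost_share(input_tokens: int, output_tokens: int,
                      premium: float) -> float:
    """Fraction of a call's cost due to output tokens.

    `premium` is the output price divided by the input price.
    """
    input_units = input_tokens            # cost in units of the input price
    output_units = output_tokens * premium
    return output_units / (input_units + output_units)


# 400 output tokens are under 30% of the tokens in this call,
# but at an assumed 5x premium they account for most of its cost.
share = output_cost_share(input_tokens=1_000, output_tokens=400, premium=5.0)
print(f"{share:.0%}")  # prints 67%
```

Capping response length therefore tends to matter more for cost than the raw token counts alone would suggest.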

