What Is a Token? (AI Pricing Explained)

Updated April 28, 2026 · Data as of 2026-04-26

A token is a chunk of text — roughly 3 to 4 characters, or about 0.75 words. AI providers measure usage in tokens to calculate how much processing a request requires.

Tokenization splits text into the subword units a model actually processes. Common short words are typically one token. Longer or unusual words split into two or three. The word 'ChatGPT' is approximately two tokens; 'hello' is one. Your prompt, the code you share, and the model's response are all measured in tokens.
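As a rough rule of thumb, you can estimate token counts from character length using the ~4-characters-per-token figure above. Exact counts depend on the provider's tokenizer (e.g. OpenAI's tiktoken library gives real counts); this is just an approximation:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English prose.
    Real tokenizers vary, so treat this as a ballpark, not a bill."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("hello"))  # a short word: about 1 token
print(estimate_tokens("Refactor this function to use async/await."))
```

Useful for sanity-checking how much of your budget a pasted file will consume before you send it.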

Providers charge separately for input tokens (what you send to the model) and output tokens (what the model generates back). Output tokens typically cost more because generating text requires more computation than processing it. A request with a 1,000-token prompt that asks the model to write a 2,000-token function is billed as 1,000 input tokens plus 2,000 output tokens.
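The arithmetic is straightforward: multiply each side by its per-million-token rate and add. The rates below are placeholders for illustration, not any provider's actual pricing:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_rate_per_m: float, out_rate_per_m: float) -> float:
    """Cost in dollars for one request, given per-million-token rates."""
    return (input_tokens / 1_000_000 * in_rate_per_m
            + output_tokens / 1_000_000 * out_rate_per_m)

# Hypothetical rates: $3 per million input tokens, $15 per million output.
cost = request_cost(1_000, 2_000, 3.00, 15.00)
print(f"${cost:.4f}")  # → $0.0330
```

Note how the 2,000 output tokens dominate the bill even though the prompt and response differ only 2x in size; the asymmetric rates do the rest.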

To put the scale in context: a typical back-and-forth coding session might use 20,000 to 100,000 tokens depending on how much code you are passing in and how long the responses are. A large-file refactor can easily consume 10,000 to 30,000 tokens in a single request. At typical API rates for frontier models, an active coding session costs roughly $0.10 to $1.00.

Why it matters for AI tool costs

Tokens matter because they are the unit of cost for every AI request you make. If you are on a subscription plan like Cursor Pro, the subscription includes a token allowance — heavy agentic use depletes it before the month ends, leaving you on a reduced capability tier. If you are on a bring-your-own-key plan like Cline, you pay per token directly at API rates. Understanding tokens helps you choose the right plan, predict your monthly cost, and avoid surprises mid-cycle.

Frequently asked questions

How many tokens is a typical coding session?

It varies widely. A focused 30-minute session with lots of agentic back-and-forth might use 50,000 to 200,000 tokens. A quick question-and-answer exchange might use 2,000 to 5,000. The biggest variable is how much code context you are passing in — sharing a large file with every message multiplies token usage fast. Tools with codebase indexing are more efficient because they pull only the relevant context for each request.

Do all AI tools charge per token?

Not directly. Subscription tools like Cursor and Windsurf bundle token allowances into a monthly fee — you pay a flat rate and the platform absorbs the per-token cost up to your limit. Bring-your-own-key tools like Cline pass the API costs through directly, so you see exactly what each session costs. The subscription model offers predictability; the BYO-key model offers transparency and potentially lower cost for lighter users.
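One way to compare the two models is a simple break-even check: at what monthly token volume does a flat subscription beat paying API rates directly? The subscription price and blended rate below are hypothetical examples, not real plan figures:

```python
def breakeven_tokens(subscription_per_month: float,
                     blended_rate_per_m: float) -> float:
    """Monthly token volume at which a flat subscription and
    direct per-token API billing cost the same."""
    return subscription_per_month / blended_rate_per_m * 1_000_000

# Hypothetical: $20/month subscription vs. $5 per million tokens via API.
print(f"{breakeven_tokens(20.0, 5.0):,.0f} tokens/month")  # → 4,000,000
```

Below the break-even volume, paying per token is cheaper; above it, the subscription wins (up to its allowance).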

What is a context window?

A context window is the maximum number of tokens a model can consider in a single request — your prompt, the code you share, and the conversation history all count toward it. Models with larger context windows can handle bigger files and longer conversations before losing track of earlier content. Context window size matters particularly for large codebase work, where you need the model to understand your full project structure.
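Because conversation history counts against the window, tools typically drop the oldest turns once the budget is exceeded. A minimal sketch of that trimming, assuming a fixed token budget and messages with pre-counted token sizes:

```python
def fit_to_window(messages: list[tuple[str, int]],
                  max_tokens: int) -> list[tuple[str, int]]:
    """Keep the most recent messages whose combined token counts fit
    within the context window. Each message is (text, token_count)."""
    kept, total = [], 0
    for text, tokens in reversed(messages):  # walk newest to oldest
        if total + tokens > max_tokens:
            break  # this and anything older gets dropped
        kept.append((text, tokens))
        total += tokens
    return list(reversed(kept))  # restore chronological order

history = [("old question", 500), ("old answer", 1500), ("new question", 300)]
print(fit_to_window(history, 2000))  # oldest turn is dropped
```

Real tools use subtler strategies (summarizing old turns, pulling only relevant code via indexing), but the budget constraint they are working around is exactly this one.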
