What Is a Context Window?

The maximum amount of text, measured in tokens, that an AI model can process at once.

A context window is the total amount of text a model can "see" at once when generating a response. It includes everything: the system prompt, the full conversation history, any documents you've pasted in, and the response the model is currently writing. Once a conversation outgrows the window, something has to give. Typically the oldest parts are dropped, and from the model's point of view they were never said.

Tokens vs Words

Models don't count characters or words. They count tokens, which are chunks of text produced by a tokenizer. In English, a token is roughly three-quarters of a word on average: a common word like "running" is often a single token, while a rare one like "uncharacteristically" might be split into 4. Numbers, punctuation, code, and non-English text often tokenize less efficiently, and exact counts vary between models because each uses its own tokenizer. A useful rule of thumb: 1,000 tokens ≈ 750 English words. Context windows are stated in tokens, so a 128,000-token window holds roughly 96,000 words.
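The rule of thumb above can be turned into a quick estimator. This is a minimal sketch that assumes the 0.75 words-per-token ratio holds, which is only a rough average for English prose; the function names are illustrative, and a real application would count tokens with the tokenizer of the model it actually uses.

```python
# Rule of thumb from the text: 1,000 tokens ≈ 750 English words.
# This ratio is an approximation, not a property of any specific tokenizer.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Estimate the token count of English text from its whitespace word count."""
    word_count = len(text.split())
    return round(word_count / WORDS_PER_TOKEN)


def words_that_fit(context_tokens: int) -> int:
    """Estimate how many English words fit in a window of the given size."""
    return int(context_tokens * WORDS_PER_TOKEN)


if __name__ == "__main__":
    print(words_that_fit(128_000))           # 96000
    print(estimate_tokens("word " * 750))    # 1000
```

Treat the result as a planning figure only. Code, tables, and non-English text can use noticeably more tokens per word than this estimate suggests.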

What Happens When You Exceed It

If a conversation exceeds the context window, one of three things happens depending on the implementation: the oldest messages are silently dropped (a common approach in chat apps), the API returns an error, or the application applies a compression or summarisation strategy to preserve key information. Unless the request is simply rejected, the model loses access to whatever was dropped or condensed away, which can cause it to repeat itself, forget instructions, or lose track of the thread.
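The first strategy, dropping the oldest messages, can be sketched as follows. This is an illustration under stated assumptions rather than how any particular chat app works: messages are plain strings, token counts are estimated with the 0.75 words-per-token rule of thumb, and the names `trim_history` and `reserve_for_reply` are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 0.75 English words per token (an assumption)."""
    return round(len(text.split()) / 0.75)


def trim_history(system_prompt: str, messages: list[str],
                 max_tokens: int, reserve_for_reply: int) -> list[str]:
    """Keep the most recent messages that fit in the window, dropping the oldest.

    The system prompt and the space reserved for the model's reply are
    subtracted first, since both count against the same window.
    """
    budget = max_tokens - reserve_for_reply - estimate_tokens(system_prompt)
    kept: list[str] = []
    # Walk from newest to oldest and stop at the first message that no longer fits.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    history = ["one two three", "four five six", "seven eight nine"]
    print(trim_history("be brief", history, max_tokens=20, reserve_for_reply=8))
```

With these toy numbers the oldest message is dropped and the two most recent survive. Note that the system prompt is always kept here, which is one way to avoid the "forgotten instructions" problem described above.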

Why Longer Context Costs More

Transformer-based models have computational costs that scale with context length: in standard self-attention every token is compared with every other token, so the work grows faster than linearly as the context gets longer. Processing 200,000 tokens costs significantly more than processing 2,000. This is why API pricing is typically charged per input token and per output token, and why some providers charge a premium for very long contexts. For cost-sensitive applications, keeping context lean (using summarisation, retrieval, or structured memory) is a genuine engineering concern, not just a nice-to-have.
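To make the cost difference concrete, here is a small sketch of per-token billing. The prices are hypothetical placeholders (3.00 per million input tokens, 15.00 per million output tokens) chosen only for the arithmetic; they are not any provider's actual rates.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_million: float,
                 output_price_per_million: float) -> float:
    """Cost of one request when input and output tokens are billed separately."""
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    INPUT_PRICE = 3.00
    OUTPUT_PRICE = 15.00

    lean = request_cost(2_000, 500, INPUT_PRICE, OUTPUT_PRICE)
    full = request_cost(200_000, 500, INPUT_PRICE, OUTPUT_PRICE)
    print(f"lean context: {lean:.4f}")  # 0.0135
    print(f"full context: {full:.4f}")  # 0.6075
```

Because the whole context is usually resent on every turn of a conversation, the larger figure is paid per message, not once, which is why trimming or summarising history can matter so much.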

Example

GPT-4o: 128k tokens ≈ 96,000 words ≈ a 300-page novel
Claude Opus 4: 200k tokens ≈ 150,000 words ≈ a long PhD thesis
