Glossary
Context window
Also known as: Context length
Definition
The context window is the maximum number of tokens a language model can read and generate in a single request. Depending on the architecture, modern models range from a few thousand to several million tokens. When the window is exceeded, earlier context must be truncated, summarized, or re-introduced through retrieval. Longer windows simplify RAG and agent workflows, but they also increase latency and cost.
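The truncation strategy mentioned above can be sketched as follows. This is a minimal illustration, not a specific library's API: token counts are approximated by whitespace splitting, whereas real systems use the model's own tokenizer.

```python
# Illustrative sketch: keep conversation history within a token budget.
# Whitespace splitting stands in for a real tokenizer here.

def truncate_to_window(messages, max_tokens):
    """Drop the oldest messages until the total token count fits the window."""
    def count(message):
        return len(message.split())

    kept = list(messages)
    while kept and sum(count(m) for m in kept) > max_tokens:
        kept.pop(0)  # discard the earliest context first
    return kept

history = ["old note " * 50, "recent question about context windows"]
print(truncate_to_window(history, max_tokens=20))
# The 100-token old note is dropped; only the recent question fits.
```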
How Swiss Knowledge Hub uses this term
In Swiss Knowledge Hub, each request forwards only the chunks that retrieval actually deemed relevant. Answers therefore stay fast and cost-efficient, regardless of how large the chosen model's context window is.
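The idea of forwarding only relevant chunks under a fixed budget can be sketched like this. The function name, scoring scheme, and whitespace-based token estimate are illustrative assumptions, not Swiss Knowledge Hub's actual implementation.

```python
# Hypothetical sketch: forward only the top-scoring retrieved chunks
# that fit a fixed token budget, independent of the model's full window.

def select_chunks(scored_chunks, token_budget):
    """scored_chunks: list of (score, text) pairs.
    Returns the highest-scoring texts whose combined size fits the budget."""
    selected, used = [], 0
    for score, text in sorted(scored_chunks, key=lambda c: -c[0]):
        cost = len(text.split())  # crude token estimate for illustration
        if used + cost <= token_budget:
            selected.append(text)
            used += cost
    return selected

chunks = [
    (0.9, "pricing details"),
    (0.2, "unrelated blog post"),
    (0.8, "feature overview"),
]
print(select_chunks(chunks, token_budget=4))
# The two relevant chunks fill the budget; the low-scoring one is skipped.
```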
Sources
- Wikipedia: Large language model — Context window — https://en.wikipedia.org/wiki/Large_language_model#Context_window
Last updated: April 22, 2026