

What is a Context Window?

The maximum number of tokens an LLM can process in one interaction, including your prompt, the conversation history, and the model's response.

The context window is the LLM's working memory. Once a conversation or document exceeds the window, older content has to be dropped or summarized, and the model effectively forgets it. Window sizes in 2026: GPT-4 Turbo (128K tokens), Claude (200K), Gemini Pro (2M). 200K tokens is roughly a 500-page book; 2M is enough to load a small codebase in full.

Larger windows enable new use cases (analyzing a whole codebase, conversation memory spanning months) but come with trade-offs: cost (longer prompts cost more), latency (longer prompts take longer to process), and an effect called "lost in the middle", where models pay less attention to information buried mid-document. For most apps, retrieval-augmented generation (RAG) still beats a large context window: retrieve the relevant 5% rather than dump in 100%.
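To make the "older content gets dropped" behavior concrete, here is a minimal Python sketch of how an application might trim conversation history to fit a window. The function names, the reply reserve, and the 4-characters-per-token estimate are all assumptions for illustration; a real application would count tokens with the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about 4 characters per token (assumption)."""
    return (len(text) + 3) // 4  # ceiling division


def fit_to_window(messages: list[str], window_tokens: int,
                  reserve_for_reply: int) -> list[str]:
    """Keep the most recent messages that fit in the window.

    Part of the window is reserved for the model's response, because the
    prompt, the history, and the reply all share the same token budget.
    """
    budget = window_tokens - reserve_for_reply
    kept: list[str] = []
    used = 0
    # Walk backwards from the newest message; stop at the first one that
    # no longer fits, so everything older is dropped.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Calling `fit_to_window(history, window_tokens=128_000, reserve_for_reply=4_000)` would return the newest messages that fit in the remaining budget; anything older is simply no longer visible to the model. Dropping from the oldest end is only one possible policy, and summarizing the dropped messages is a common alternative.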

