Tokens are the units that LLMs break text into. The word 'understanding' might be one token; 'antidisestablishmentarianism' might be five. A rule of thumb for English: 1 token ≈ 0.75 words, or about 4 characters. Different models use different tokenizers, so the token count for the same text varies from model to model.

Why tokens matter: (1) cost: API pricing is usually quoted per million tokens; (2) context windows: models have a maximum number of tokens they can handle at once; (3) speed: generation time grows with the number of output tokens.

When estimating costs, count both input AND output tokens. A 1000-word document going through a model that produces a 500-word summary uses about 2000 tokens total (roughly 1,333 in and 667 out). At GPT-4.1 pricing (~$2.50/M input, $10/M output), that costs about one cent per document, but at scale (millions of documents) it adds up fast: a million such documents would cost roughly $10,000.
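The arithmetic above can be sketched in a few lines of Python. This is only a rough estimator built on the 0.75 words-per-token rule of thumb, not a real tokenizer; the function names are hypothetical, and the prices are the approximate figures quoted in the text, passed in as parameters so they can be swapped for current rates.

```python
# Rule of thumb from the text: 1 token is roughly 0.75 English words.
# Real counts depend on the model's tokenizer, so treat this as an estimate.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(word_count: int) -> int:
    """Estimate the number of tokens for a given English word count."""
    return round(word_count / WORDS_PER_TOKEN)


def estimate_cost_usd(
    input_words: int,
    output_words: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Estimate the cost of one request, pricing input and output separately."""
    input_tokens = estimate_tokens(input_words)
    output_tokens = estimate_tokens(output_words)
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000  # prices are quoted per million tokens


# The example from the text: 1000 words in, 500-word summary out.
# Prices are the approximate figures quoted above (assumed, may be outdated).
per_document = estimate_cost_usd(1000, 500, 2.50, 10.00)
total_tokens = estimate_tokens(1000) + estimate_tokens(500)

print(f"estimated tokens per document: {total_tokens}")
print(f"estimated cost per document:   ${per_document:.4f}")
print(f"estimated cost per 1M documents: ${per_document * 1_000_000:,.2f}")
```

Output tokens are priced several times higher than input tokens in this example, so the 500-word summary accounts for about two thirds of the cost even though it is half the length of the input.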
What is a Token?
The basic unit that LLMs read and produce. Roughly 0.75 words in English. APIs charge per token consumed and produced.
Related terms
Context Window
The maximum number of tokens an LLM can process in one interaction — including your prompt, conversation history, and the model's response.
LLM (Large Language Model)
An AI system trained on massive text datasets to predict and generate human-like text — the technology behind ChatGPT, Claude, Gemini, and most modern AI chatbots.
Embedding
A vector representation of text, image, or audio — a list of numbers that captures the semantic meaning, enabling 'find similar' searches.
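To make the Context Window entry above concrete, here is a minimal sketch of a budget check: the prompt, the conversation history, and the space reserved for the response all have to fit inside the same limit. The function names and the 128,000-token window are assumptions chosen for illustration; actual limits differ per model.

```python
def fits_context_window(
    prompt_tokens: int,
    history_tokens: int,
    max_response_tokens: int,
    context_window: int,
) -> bool:
    """Return True if prompt, history, and reserved response space all fit."""
    used = prompt_tokens + history_tokens + max_response_tokens
    return used <= context_window


def tokens_left_for_response(
    prompt_tokens: int,
    history_tokens: int,
    context_window: int,
) -> int:
    """Tokens still available for the model's response (never negative)."""
    return max(0, context_window - prompt_tokens - history_tokens)


# Hypothetical window size, for illustration only.
CONTEXT_WINDOW = 128_000

# A short prompt with a moderate history leaves plenty of room.
print(fits_context_window(2_000, 10_000, 1_000, CONTEXT_WINDOW))

# A very long history pushes the total over the limit.
print(fits_context_window(2_000, 126_000, 1_000, CONTEXT_WINDOW))

# How much room is left for the response in the first case.
print(tokens_left_for_response(2_000, 10_000, CONTEXT_WINDOW))
```

When the check fails, the usual options are to trim or summarize older history, shorten the prompt, or lower the space reserved for the response.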