What is a Token? - AI Glossary

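Tokens are the units a large language model (LLM) splits text into before processing it. A common word such as 'understanding' might be a single token, while a rare one such as 'antidisestablishmentarianism' might be split into five. A rule of thumb for English: 1 token ≈ 0.75 words, or about 4 characters. Different models use different tokenizers, so the token count for the same text varies from model to model.

Why tokens matter: (1) cost: API pricing is usually quoted per million tokens; (2) context windows: each model has a maximum number of tokens it can handle at once; (3) speed: inference time grows with the number of output tokens, because they are generated one at a time.

When estimating costs, count both input AND output tokens, since they are usually priced separately. A 1000-word document that a model condenses into a 500-word summary uses about 2000 tokens in total: roughly 1333 input tokens plus 667 output tokens at 0.75 words per token. At GPT-4o list pricing (~$2.50/M input, $10/M output), that costs about one cent (roughly $0.0033 for input plus $0.0067 for output). At scale it adds up fast: a million such documents would cost about $10,000.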
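The cost estimate above can be sketched in a few lines of Python. This is a minimal sketch, not a billing tool: the function names `estimate_tokens` and `estimate_cost` are hypothetical, the 0.75 words-per-token ratio is only the rule of thumb for English, and the per-million rates are the illustrative figures quoted above. For real numbers, count tokens with the model's own tokenizer and check the provider's current price list.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb for English; real tokenizers vary


def estimate_tokens(word_count: int) -> int:
    """Rough token estimate from an English word count (heuristic only)."""
    return round(word_count / WORDS_PER_TOKEN)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in dollars, given separate per-million-token rates."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Worked example: 1000-word document in, 500-word summary out.
input_tokens = estimate_tokens(1000)   # 1333
output_tokens = estimate_tokens(500)   # 667

# Illustrative rates (assumed, taken from the text): $2.50/M input, $10/M output.
per_document = estimate_cost(input_tokens, output_tokens, 2.50, 10.00)

print(f"tokens: {input_tokens} in + {output_tokens} out")
print(f"cost per document: ${per_document:.4f}")
print(f"cost per million documents: ${per_document * 1_000_000:,.2f}")
```

Keeping the input and output rates as separate parameters mirrors how providers usually price them, and makes it easy to swap in another model's rates.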
