Tokens
Why It Matters
Token count directly impacts cost and performance. Efficient prompt engineering means getting the same quality output with fewer input tokens. At enterprise scale, reducing prompt length by 20% can save thousands of dollars per month.
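To make the savings concrete, here is a back-of-envelope sketch. The price, call volume, and prompt length below are made-up placeholders, not any provider's actual pricing:

```python
# Back-of-envelope savings from trimming prompts. All numbers are
# illustrative assumptions, not real provider pricing.
INPUT_PRICE_PER_1K = 0.01      # assumed $ per 1,000 input tokens
CALLS_PER_MONTH = 10_000_000   # assumed monthly call volume
PROMPT_TOKENS = 1_500          # assumed average prompt length

def monthly_input_cost(prompt_tokens: int) -> float:
    """Monthly spend on input tokens at the assumed rate and volume."""
    return prompt_tokens / 1_000 * INPUT_PRICE_PER_1K * CALLS_PER_MONTH

baseline = monthly_input_cost(PROMPT_TOKENS)
trimmed = monthly_input_cost(int(PROMPT_TOKENS * 0.8))  # 20% shorter prompt

print(f"baseline ${baseline:,.0f}/mo, trimmed ${trimmed:,.0f}/mo, "
      f"saved ${baseline - trimmed:,.0f}/mo")
```

At these assumed numbers, a 20% trim saves tens of thousands of dollars per month; scale the constants to your own traffic.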
How It Works
Tokens are the fundamental units that language models process. A token might be a whole word ('hello'), a word fragment ('un' + 'believ' + 'able'), a punctuation mark, or a special character. Different models use different tokenizers, so the same text produces different token counts across models.
Tokenization affects both cost and behavior. API pricing is per-token, so understanding token counts is essential for budget management. Tokenization quirks also cause model behavior oddities: models struggle with character-counting tasks because they don't see individual characters, only tokens.
Common ratios: English text averages about 1.3 tokens per word (roughly 0.75 words per token, or about 4 characters per token). Code tends to use more tokens per line than prose. Non-English languages, especially those with non-Latin scripts, typically require more tokens per word, making API calls more expensive.
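The rules of thumb above can be turned into quick estimators. These are rough approximations for English prose only; exact counts require the model's actual tokenizer:

```python
# Rough token estimates from common rules of thumb (illustrative only;
# real counts come from the model provider's tokenizer).
def estimate_tokens_from_chars(text: str) -> int:
    """~4 characters per token for typical English prose."""
    return max(1, round(len(text) / 4))

def estimate_tokens_from_words(text: str) -> int:
    """~1.3 tokens per word for typical English prose."""
    return max(1, round(len(text.split()) * 1.3))

sample = "Token count directly impacts cost and performance."
print(estimate_tokens_from_chars(sample), estimate_tokens_from_words(sample))
```

Expect the two estimates to disagree slightly; both are only for budgeting ballparks, and both undercount for code and non-English text.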
Common Mistakes
Common mistake: Estimating costs based on word count instead of actual token count
Use the model provider's tokenizer tool to get exact counts. Libraries like tiktoken (OpenAI) give precise token counts for budgeting.
Common mistake: Ignoring that both input AND output tokens are billed
Output tokens are typically 3-4x more expensive than input tokens. Limiting output length (e.g., 'respond in under 100 words') can significantly reduce costs.
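Per-call cost combines both directions. The rates below are hypothetical placeholders (with output priced at 4x input, in line with the typical ratio above), not any provider's actual pricing:

```python
# A call is billed on input AND output tokens. Rates are assumptions.
INPUT_PER_1K = 0.003    # assumed $ per 1K input tokens
OUTPUT_PER_1K = 0.012   # assumed $ per 1K output tokens (4x input here)

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Total cost of one API call at the assumed rates."""
    return (input_tokens / 1_000 * INPUT_PER_1K
            + output_tokens / 1_000 * OUTPUT_PER_1K)

# Capping output near 100 words (~130 tokens) vs an uncapped 500-token answer:
print(call_cost(800, 500), call_cost(800, 130))
```

Because output is the pricier side, trimming the response length often saves more than trimming the prompt by the same number of tokens.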
Career Relevance
Token economics directly affect AI product viability. Prompt engineers and AI product managers need to understand token costs to build sustainable products. A prompt that uses 2,000 tokens instead of 500 for the same task means 4x the input-token cost at scale.