Tokens & Tokenization
Fundamentals · Beginner · 10 min
Sources verified Dec 22
LLMs process text as tokens — chunks of characters that form the atomic units of input and output, directly affecting pricing and context limits.
LLMs don't process text character by character. Instead, they break text into tokens — chunks that might be words, parts of words, or punctuation.
For example:
- "Hello" → 1 token
- "indistinguishable" → 4 tokens
- " " (spaces) → 1 token
- Code often tokenizes less efficiently than prose
How Tokenization Works
Modern LLMs use byte-pair encoding (BPE) or similar algorithms:
- Start with individual bytes/characters
- Iteratively merge the most frequent pairs
- Build a vocabulary of common subwords
- Result: frequent words become single tokens, rare words split into pieces
This is why "ChatGPT" might be 3 tokens while "cat" is 1.
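To make the merge loop concrete, here is a minimal toy sketch of BPE training, not any production tokenizer: the three-word corpus, the four-merge budget, and the file name are made up for illustration. Real tokenizers such as tiktoken and SentencePiece operate on bytes and learn tens of thousands of merges.
bpe_toy.ts

// Toy BPE: repeatedly find the most frequent adjacent symbol pair
// across the corpus and merge it into a single new symbol.
type Pair = string; // "a|b" encoding of an adjacent symbol pair

function mostFrequentPair(words: string[][]): [string, string] | null {
  const counts = new Map<Pair, number>();
  for (const word of words) {
    for (let i = 0; i < word.length - 1; i++) {
      const key = `${word[i]}|${word[i + 1]}`;
      counts.set(key, (counts.get(key) ?? 0) + 1);
    }
  }
  let best: Pair | null = null;
  let bestCount = 0;
  for (const [key, count] of counts) {
    if (count > bestCount) { best = key; bestCount = count; }
  }
  return best ? (best.split('|') as [string, string]) : null;
}

function mergePair(words: string[][], a: string, b: string): string[][] {
  return words.map((word) => {
    const merged: string[] = [];
    for (let i = 0; i < word.length; i++) {
      if (i < word.length - 1 && word[i] === a && word[i + 1] === b) {
        merged.push(a + b); // the pair becomes one new symbol
        i++;                // skip the second half of the merged pair
      } else {
        merged.push(word[i]);
      }
    }
    return merged;
  });
}

// Start from individual characters; run a few merges.
let corpus = ['low', 'lower', 'lowest'].map((w) => w.split(''));
for (let step = 0; step < 4; step++) {
  const pair = mostFrequentPair(corpus);
  if (!pair) break;
  corpus = mergePair(corpus, pair[0], pair[1]);
  console.log(`merge ${pair[0]}+${pair[1]} ->`, corpus.map((w) => w.join(' ')));
}

After two merges, "low" is already a single symbol in every word, which is exactly why frequent words end up as one token while rare words stay split into pieces.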
In practice you count tokens with a library rather than running merges yourself:
token_counting.ts

// For exact counts in production, use tiktoken or the model's own tokenizer.
import { encoding_for_model } from 'tiktoken';

// Approximate token counting.
// Rule of thumb: ~4 characters per token for English prose.
// Rough estimate only - use tiktoken for accuracy.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// tiktoken gives exact counts for OpenAI models.
const enc = encoding_for_model('gpt-4o');
const tokens = enc.encode('Hello, world!');
console.log(tokens.length); // 4
enc.free(); // release the WASM-backed encoder when done
Why Tokens Matter
- Pricing: You pay for input and output tokens separately (e.g., $3/million input tokens for Claude Sonnet; see the cost sketch after this list)
- Context limits: Models have maximum context windows (e.g., 128K tokens for GPT-4o)
- Response quality: Content that overflows the window is truncated, and models attend less reliably to material buried deep in a long context
- Speed: More tokens = longer generation time
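Because pricing is linear in token counts, estimating request cost is one multiplication per direction. A minimal sketch: the per-million-token rates below are illustrative placeholders (the input rate echoes the Claude Sonnet example above), and the file name is invented; check your provider's current pricing page.
cost_estimate.ts

// Converting token counts to dollars. Rates are illustrative placeholders.
const INPUT_USD_PER_MTOK = 3;   // e.g., $3 per 1M input tokens
const OUTPUT_USD_PER_MTOK = 15; // output tokens typically cost more

function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  return (inputTokens * INPUT_USD_PER_MTOK + outputTokens * OUTPUT_USD_PER_MTOK) / 1_000_000;
}

// A 2,000-token prompt with a 500-token reply:
console.log(estimateCostUSD(2_000, 500)); // 0.0135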
Key Takeaways
- Tokens are chunks of text, not characters or words
- 1 token ≈ 4 English characters or ~0.75 words
- Pricing and limits are based on token counts
- Use tiktoken or model APIs for accurate counting
- Code and non-English text often tokenize less efficiently
In This Platform
Token awareness matters when designing prompts. Our system prompts in prompts/*.json are written to be concise, avoiding unnecessary verbosity that would waste tokens.
Relevant Files:
- prompts/analysis.json
- prompts/recommendations.json
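One way to keep those prompts honest about their budgets is to count their tokens directly. A sketch, not part of the platform itself: it reads each file as raw JSON text (which slightly overstates the delivered prompt, since keys and braces are counted too), using the file paths listed above.
prompt_token_audit.ts

// Audit the token budget of the platform's prompt files.
import { readFileSync } from 'node:fs';
import { encoding_for_model } from 'tiktoken';

const enc = encoding_for_model('gpt-4o');
for (const path of ['prompts/analysis.json', 'prompts/recommendations.json']) {
  const text = readFileSync(path, 'utf8');
  console.log(`${path}: ${enc.encode(text).length} tokens`);
}
enc.free();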