Tokens & Tokenization
Fundamentals · Beginner · 10 min
Sources verified Dec 22
LLMs process text as tokens — chunks of characters that form the atomic units of input and output, directly affecting pricing and context limits.
LLMs don't process text character by character. Instead, they break text into tokens — chunks that might be words, parts of words, or punctuation.
For example:
- "Hello" → 1 token
- "indistinguishable" → 4 tokens
- " " (spaces) → 1 token
- Code often tokenizes less efficiently than prose
How Tokenization Works
Modern LLMs use byte-pair encoding (BPE) or similar algorithms:
- Start with individual bytes/characters
- Iteratively merge the most frequent pairs
- Build a vocabulary of common subwords
- Result: frequent words become single tokens, rare words split into pieces
This is why "ChatGPT" might be 3 tokens while "cat" is 1.
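To make the merge loop concrete, here is a minimal toy sketch of BPE training, not any production tokenizer: the three-word corpus, the four-merge budget, and the file name are made up for illustration. Real tokenizers such as tiktoken and SentencePiece operate on bytes and learn tens of thousands of merges.
bpe_toy.ts

// Toy BPE: repeatedly find the most frequent adjacent symbol pair
// across the corpus and merge it into a single new symbol.
type Pair = string; // "a|b" encoding of an adjacent symbol pair

function mostFrequentPair(words: string[][]): [string, string] | null {
  const counts = new Map<Pair, number>();
  for (const word of words) {
    for (let i = 0; i < word.length - 1; i++) {
      const key = `${word[i]}|${word[i + 1]}`;
      counts.set(key, (counts.get(key) ?? 0) + 1);
    }
  }
  let best: Pair | null = null;
  let bestCount = 0;
  for (const [key, count] of counts) {
    if (count > bestCount) { best = key; bestCount = count; }
  }
  return best ? (best.split('|') as [string, string]) : null;
}

function mergePair(words: string[][], a: string, b: string): string[][] {
  return words.map((word) => {
    const merged: string[] = [];
    for (let i = 0; i < word.length; i++) {
      if (i < word.length - 1 && word[i] === a && word[i + 1] === b) {
        merged.push(a + b); // the pair becomes one new symbol
        i++;                // skip the second half of the merged pair
      } else {
        merged.push(word[i]);
      }
    }
    return merged;
  });
}

// Start from individual characters; run a few merges.
let corpus = ['low', 'lower', 'lowest'].map((w) => w.split(''));
for (let step = 0; step < 4; step++) {
  const pair = mostFrequentPair(corpus);
  if (!pair) break;
  corpus = mergePair(corpus, pair[0], pair[1]);
  console.log(`merge ${pair[0]}+${pair[1]} ->`, corpus.map((w) => w.join(' ')));
}

After two merges, "low" is already a single symbol in every word, which is exactly why frequent words end up as one token while rare words stay split into pieces.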
In practice you count tokens with a library rather than running merges yourself:
token_counting.ts

// For exact counts in production, use tiktoken or the model's own tokenizer.
import { encoding_for_model } from 'tiktoken';

// Approximate token counting.
// Rule of thumb: ~4 characters per token for English prose.
// Rough estimate only - use tiktoken for accuracy.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// tiktoken gives exact counts for OpenAI models.
const enc = encoding_for_model('gpt-4o');
const tokens = enc.encode('Hello, world!');
console.log(tokens.length); // 4
enc.free(); // release the WASM-backed encoder when done
Why Tokens Matter
- Pricing: You pay for input and output tokens separately (e.g., $3/million input tokens for Claude Sonnet; see the cost sketch after this list)
- Context limits: Models have maximum context windows (e.g., 128K tokens for GPT-4o)
- Response quality: Content that overflows the window is truncated, and models attend less reliably to material buried deep in a long context
- Speed: More tokens = longer generation time
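Because pricing is linear in token counts, estimating request cost is one multiplication per direction. A minimal sketch: the per-million-token rates below are illustrative placeholders (the input rate echoes the Claude Sonnet example above), and the file name is invented; check your provider's current pricing page.
cost_estimate.ts

// Converting token counts to dollars. Rates are illustrative placeholders.
const INPUT_USD_PER_MTOK = 3;   // e.g., $3 per 1M input tokens
const OUTPUT_USD_PER_MTOK = 15; // output tokens typically cost more

function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  return (inputTokens * INPUT_USD_PER_MTOK + outputTokens * OUTPUT_USD_PER_MTOK) / 1_000_000;
}

// A 2,000-token prompt with a 500-token reply:
console.log(estimateCostUSD(2_000, 500)); // 0.0135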
Key Takeaways
- Tokens are chunks of text, not characters or words
- 1 token ≈ 4 English characters or ~0.75 words
- Pricing and limits are based on token counts
- Use tiktoken or model APIs for accurate counting
- Code and non-English text often tokenize less efficiently
In This Platform
Token awareness matters when designing prompts. Our system prompts in prompts/*.json are written to be concise, avoiding unnecessary verbosity that would waste tokens.
Relevant Files:
- prompts/analysis.json
- prompts/recommendations.json
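One way to keep those prompts honest about their budgets is to count their tokens directly. A sketch, not part of the platform itself: it reads each file as raw JSON text (which slightly overstates the delivered prompt, since keys and braces are counted too), using the file paths listed above.
prompt_token_audit.ts

// Audit the token budget of the platform's prompt files.
import { readFileSync } from 'node:fs';
import { encoding_for_model } from 'tiktoken';

const enc = encoding_for_model('gpt-4o');
for (const path of ['prompts/analysis.json', 'prompts/recommendations.json']) {
  const text = readFileSync(path, 'utf8');
  console.log(`${path}: ${enc.encode(text).length} tokens`);
}
enc.free();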