Chain of Thought (CoT) Prompting
Example
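A minimal sketch of the difference between a direct prompt and a zero-shot CoT prompt. These are plain prompt strings (the question and wording are illustrative); either could be sent to any chat model:

```python
# Illustrative question; any multi-step arithmetic word problem works.
question = (
    "A cafeteria had 23 apples. They used 20 to make lunch and "
    "bought 6 more. How many apples do they have?"
)

# Direct prompt: the model must jump straight to the answer.
direct_prompt = question

# Zero-shot CoT: the trailing instruction triggers step-by-step reasoning.
cot_prompt = f"{question}\nLet's think step by step."

print(cot_prompt)
```

With the CoT version, the model typically writes out the intermediate arithmetic (23 - 20 = 3, then 3 + 6 = 9) before stating the final answer.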
Why It Matters
Chain of thought prompting can improve accuracy by 20-40% on reasoning tasks, according to Google's research (Wei et al., 2022). It is one of the most impactful prompt engineering techniques and underpins modern reasoning models like o1 and o3. Knowing when and how to use CoT is a core prompt engineering skill.
How It Works
Chain of thought works because language models generate text autoregressively: each generated token becomes context for the next. When the model writes out intermediate steps, each step supplies context that improves the accuracy of the steps that follow.
There are several CoT variants with different applications:
Zero-shot CoT: Simply add 'Let's think step by step' to your prompt. This triggers step-by-step reasoning without examples. It is the simplest approach and works surprisingly well on most reasoning tasks.
Few-shot CoT: Provide 2-3 example problems with worked-out solutions before your actual question. The examples teach the model the expected reasoning format. This produces more reliable output than zero-shot CoT.
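A minimal sketch of a few-shot CoT prompt builder. The worked examples and the helper name are illustrative, not from any particular library:

```python
# Worked examples whose reasoning the model is expected to imitate.
EXAMPLES = [
    {
        "question": ("Roger has 5 tennis balls. He buys 2 cans of "
                     "3 balls each. How many balls does he have now?"),
        "reasoning": "Roger started with 5 balls. 2 cans of 3 is 6. 5 + 6 = 11.",
        "answer": "11",
    },
    {
        "question": ("A store had 10 loaves. It sold 6 and baked 4 more. "
                     "How many loaves does it have now?"),
        "reasoning": "10 - 6 = 4 loaves left. Baking 4 more gives 4 + 4 = 8.",
        "answer": "8",
    },
]

def build_few_shot_cot_prompt(question: str) -> str:
    """Assemble worked examples followed by the real question."""
    parts = [
        f"Q: {ex['question']}\nA: {ex['reasoning']} The answer is {ex['answer']}."
        for ex in EXAMPLES
    ]
    parts.append(f"Q: {question}\nA:")  # model continues from here
    return "\n\n".join(parts)

prompt = build_few_shot_cot_prompt(
    "If a train travels 60 miles in 1.5 hours, what is its average speed?"
)
```

Ending the prompt with "A:" invites the model to continue in the same show-your-work format as the examples.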
Self-consistency: Generate multiple CoT reasoning paths for the same question, then take a majority vote on the final answer. If 4 out of 5 reasoning paths arrive at the same answer, that answer is likely correct.
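The voting step can be sketched as a majority vote over sampled completions. The completions below are hard-coded stand-ins for real sampled model outputs, and `extract_answer` assumes each completion ends with "The answer is X.":

```python
from collections import Counter

def extract_answer(completion: str) -> str:
    # Assumes each completion ends with "The answer is X."
    return completion.rsplit("The answer is", 1)[-1].strip(" .")

# Stand-ins for five sampled CoT completions of the same question.
sampled_completions = [
    "23 - 20 = 3, then 3 + 6 = 9. The answer is 9.",
    "They used 20 of 23, leaving 3; buying 6 gives 9. The answer is 9.",
    "20 + 6 = 26, 26 - 23 = 3. The answer is 3.",  # a faulty reasoning path
    "3 apples remain, plus 6 bought = 9. The answer is 9.",
    "23 - 20 + 6 = 9. The answer is 9.",
]

votes = Counter(extract_answer(c) for c in sampled_completions)
final_answer, count = votes.most_common(1)[0]
print(final_answer)  # "9" wins, 4 votes to 1
```

The single faulty path is outvoted; this is exactly the 4-out-of-5 case described above.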
Tree of Thoughts (ToT): Explore multiple reasoning branches at each step, evaluate which branches look most promising, and follow the best path. This is slower and more expensive, but it handles complex problems where the first approach may be a dead end.
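A toy beam-search sketch of the idea. Here `propose` and `score` are hypothetical stand-ins; with a real model, both would be LLM calls (generate candidate next thoughts, rate each partial branch). The toy task is reconstructing a target word, just to make branch scoring concrete:

```python
TARGET = "cat"  # toy "problem": build this word one step at a time

def propose(path: list[str]) -> list[str]:
    # Stand-in: candidate next steps for this branch.
    return list("abct")

def score(path: list[str]) -> int:
    # Stand-in evaluator: how much of the target this branch matches.
    attempt = "".join(path)
    return sum(1 for a, b in zip(attempt, TARGET) if a == b)

def tree_of_thought(depth: int = 3, beam: int = 2) -> str:
    paths = [[]]
    for _ in range(depth):
        # Expand every surviving branch with every candidate step...
        expanded = [p + [c] for p in paths for c in propose(p)]
        # ...then keep only the most promising branches.
        expanded.sort(key=score, reverse=True)
        paths = expanded[:beam]
    return "".join(paths[0])

print(tree_of_thought())  # finds "cat"
```

The key difference from linear CoT is the prune step: weak branches are abandoned early instead of being followed to a dead end.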
Common Mistakes
Common mistake: Using chain of thought for simple factual lookups where it adds unnecessary cost and latency
Reserve CoT for multi-step reasoning: math, logic, code debugging, multi-part analysis. For simple questions like 'What is the capital of France?' direct prompting is faster and cheaper.
Common mistake: Providing chain of thought examples with flawed reasoning steps
Double-check the logic in your few-shot examples. Models learn from your examples, including your mistakes. One error in a CoT example can systematically bias all subsequent answers.
Common mistake: Assuming CoT always helps with smaller models
Research shows CoT provides the biggest gains with larger models (70B+ parameters). Smaller models sometimes perform worse with CoT because they struggle to maintain coherent multi-step reasoning. Test before committing.
Career Relevance
Chain of thought is the most widely applicable prompt engineering technique. It appears in nearly every prompt engineering interview and is a requirement for building reliable AI applications that handle reasoning tasks. Mastering CoT variants is a baseline expectation for prompt engineer roles.
Frequently Asked Questions
Does chain of thought cost more?
Yes. CoT prompts generate more output tokens (the reasoning steps), which increases cost. However, the accuracy improvement often reduces the need for retries, which can offset the higher per-call cost. For critical tasks, the accuracy gain is usually worth the extra tokens.
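As a back-of-the-envelope illustration of that trade-off (the per-token price and retry rates below are made-up placeholders, not any provider's real numbers):

```python
# Assumed placeholder rate, USD per 1K output tokens (not a real price).
PRICE_PER_1K_OUTPUT_TOKENS = 0.01

def call_cost(output_tokens: int) -> float:
    return output_tokens / 1000 * PRICE_PER_1K_OUTPUT_TOKENS

# Direct answer: ~30 output tokens, but suppose 40% of calls need a retry.
direct_expected = call_cost(30) * 1.4

# CoT answer: ~300 output tokens of reasoning, but only a 10% retry rate.
cot_expected = call_cost(300) * 1.1

print(f"direct ≈ ${direct_expected:.5f}, CoT ≈ ${cot_expected:.5f}")
```

Under these toy numbers CoT is still several times more expensive per question, so the decision hinges on how costly a wrong answer is downstream, not on token price alone.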
How is CoT different from reasoning models like o1?
Traditional CoT is a prompting technique that you apply manually. Reasoning models like o1 and o3 perform chain of thought internally as part of their architecture. You do not need to ask o1 to 'think step by step' because it already does. Traditional CoT is for standard models; reasoning models have it built in.