Prompt Engineering

Chain of Thought

Chain of thought is a prompting technique that instructs language models to produce intermediate reasoning steps before arriving at a final answer, improving accuracy on complex tasks.

What is Chain of Thought?

Chain of thought (CoT) prompting asks a language model to write out intermediate reasoning steps before committing to a final answer. By generating explicit step-by-step reasoning, models achieve significantly better performance on math, logic, and multi-hop reasoning problems.

The technique works because transformer models compute their output token-by-token, and each generated token can attend to all previous tokens. When a model writes out intermediate steps, those steps become part of the context for subsequent tokens, effectively giving the model a scratchpad for computation. Without chain of thought, the model must compute the final answer in a single forward pass — compressing all reasoning into the hidden states.
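The scratchpad effect can be illustrated with a toy multiplication: each intermediate line is appended to the context, so the next step only has to do one small computation rather than the whole problem at once. This is a minimal sketch of the idea, not real model code.

```python
def solve_with_scratchpad(a: int, b: int) -> tuple[str, int]:
    """Multiply a * b the way a CoT trace might: via partial products.
    Each step is appended to the context, so later steps can 'attend'
    to earlier ones instead of computing everything in one shot."""
    context = f"Q: What is {a} * {b}?\n"
    tens = (b // 10) * 10
    ones = b % 10
    p1 = a * tens
    context += f"Step 1: {a} * {tens} = {p1}\n"   # step enters the context
    p2 = a * ones
    context += f"Step 2: {a} * {ones} = {p2}\n"   # builds on step 1
    answer = p1 + p2
    context += f"Answer: {p1} + {p2} = {answer}"
    return context, answer
```

Each line in the returned trace is trivially derivable from the lines above it, which is exactly what makes the decomposition easier than producing the answer directly.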

Chain of thought can be elicited through few-shot examples that demonstrate step-by-step reasoning, or through zero-shot prompts like "Let's think step by step." Extended thinking features in modern models, such as Claude's thinking blocks, formalize this pattern by providing a dedicated reasoning space that is processed before the final response. The technique scales with model size: larger models benefit more from CoT prompting than smaller ones.
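The two elicitation styles can be sketched as simple prompt builders. The worked example and the trigger phrase below follow the common CoT literature; the exact wording is a prompting choice, not a fixed API.

```python
# One worked example whose answer demonstrates its reasoning.
FEW_SHOT_EXAMPLE = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)

def few_shot_cot(question: str) -> str:
    """Few-shot CoT: prepend a demonstration that reasons step by step."""
    return FEW_SHOT_EXAMPLE + f"Q: {question}\nA:"

def zero_shot_cot(question: str) -> str:
    """Zero-shot CoT: append a reasoning trigger instead of examples."""
    return f"Q: {question}\nA: Let's think step by step."
```

Either string would then be sent to the model as-is; the few-shot variant trades prompt length for more control over the reasoning format.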

Why does Chain of Thought matter?

Chain of thought unlocks capabilities that appear absent without it. Tasks where models score 20-30% accuracy with direct prompting can jump to 80-90% with chain-of-thought reasoning. For AI agents, CoT enables multi-step planning and self-correction that would be infeasible in a single generation pass.

How is Chain of Thought used in practice?

Production systems use chain of thought for tool selection in agent loops, where the model reasons about which tool to call and why before generating the function call. A code generation agent might reason through the problem requirements, identify edge cases, plan the solution structure, and then write the implementation — producing better code than direct generation.
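A reason-then-act loop can be sketched as follows. The reply format, tool names, and parser here are hypothetical simplifications; production agents typically use a structured function-calling API rather than parsing free text.

```python
import json

# Hypothetical tool registry; real agents would wire in actual functions.
TOOLS = {
    "search_docs": lambda q: f"docs about {q}",
    "run_tests": lambda q: "all tests passed",
}

def agent_step(model_reply: str) -> str:
    """Parse a reply containing free-text reasoning followed by a
    'TOOL:' JSON call, execute the tool, and return its observation."""
    reasoning, _, call = model_reply.partition("TOOL:")
    payload = json.loads(call)          # e.g. {"name": ..., "arg": ...}
    tool = TOOLS[payload["name"]]
    return tool(payload["arg"])

reply = (
    "The user reported flaky tests, so I should run the suite first. "
    'TOOL: {"name": "run_tests", "arg": "unit"}'
)
observation = agent_step(reply)
```

The reasoning prefix costs tokens but lets the model weigh alternatives before committing to a call; the observation would be fed back into the context for the next step of the loop.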

About the Author

Aaron is an engineering leader, software architect, and founder with 18 years of experience building distributed systems and cloud infrastructure. He now focuses on LLM-powered platforms, agent orchestration, and production AI, and shares hands-on technical guides and framework comparisons at fp8.co.