Chain of Thought
The Chain of Thought (CoT) prompting technique, introduced by Wei et al. (2022), encourages an LLM to articulate its intermediate reasoning steps before arriving at a final answer to a given task.
Before its introduction, scaling up LLMs had been shown to improve their ability to generate coherent text and solve a variety of tasks. However, these LLMs still underperformed on complex reasoning tasks such as arithmetic and symbolic reasoning. While some prompting techniques and in-context learning had already been explored, none had reliably enabled LLMs to handle complex reasoning tasks.
Original Implementation Details
CoT was originally introduced as a few-shot prompting technique in which each included exemplar is augmented with a chain of thought that explains how the final answer is reached. An example of such an exemplar, taken from the original paper, is shown below:
# An exemplar
exemplar:
  question: >
    Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each
    can has 3 tennis balls. How many tennis balls does he have now?
  chain of thought: >
    Roger started with 5 balls. 2 cans of 3 tennis balls each
    is 6 tennis balls. 5 + 6 = 11.
  answer: The answer is 11.
The authors used the same set of 8 exemplars across all tested benchmarks, with the exception of AQuA, for which 4 exemplars derived from its training set were used instead.
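To make the mechanics concrete, the sketch below shows one way such exemplars can be assembled into a few-shot CoT prompt. The Q:/A: layout, the helper name build_few_shot_cot_prompt, and the final target question are illustrative assumptions, not the paper's exact formatting.

# A minimal sketch of building a few-shot CoT prompt from exemplars.
# The first exemplar mirrors the one above; the rest are elided.
EXEMPLARS = [
    {
        "question": (
            "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
            "Each can has 3 tennis balls. How many tennis balls does he have now?"
        ),
        "chain_of_thought": (
            "Roger started with 5 balls. 2 cans of 3 tennis balls each "
            "is 6 tennis balls. 5 + 6 = 11."
        ),
        "answer": "The answer is 11.",
    },
    # ... remaining exemplars would be listed here ...
]


def build_few_shot_cot_prompt(exemplars, question):
    """Concatenate (question, chain of thought, answer) exemplars, then append
    the target question so the model continues with its own reasoning."""
    parts = []
    for ex in exemplars:
        parts.append(
            f"Q: {ex['question']}\n"
            f"A: {ex['chain_of_thought']} {ex['answer']}"
        )
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


# Hypothetical target question for illustration.
print(build_few_shot_cot_prompt(
    EXEMPLARS,
    "If there are 3 cars and each car has 4 wheels, how many wheels are there?",
))

The resulting string is sent to the model as a single prompt; because every exemplar's answer is preceded by its reasoning, the model tends to continue the pattern and produce a chain of thought before its own final answer.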
Performance
With larger models, CoT outperformed standard prompting across all tested reasoning benchmarks (arithmetic, commonsense, and symbolic reasoning). On some of these, it even achieved state-of-the-art results, beating previous methods that relied on fine-tuning. However, CoT added little benefit for smaller models, leading the authors to describe it as an emergent ability of model scale.
Limitations
One noted limitation of CoT is the lack of any guarantee that the reasoning paths the LLM takes are correct. In other words, the reasoning steps the LLM generates can be flawed, leading to wasted token generation and potentially amplifying the issue of LLM hallucinations.
Modern Implementations
Since its introduction, the CoT prompting technique has become more flexible. It is now broadly understood as any prompting approach that elicits a chain-of-thought trace in the model's generation. To do so, many implementations include general instructions in the prompt, specifying the desired output format and other requirements. With such system instructions and output formats, CoT can also be implemented in a zero-shot fashion.
# Example CoT prompt instructions
prompt:
  system: >
    You are a helpful assistant that is able to handle complex reasoning
    tasks. To arrive at the final answer, perform chain of thought steps
    and include these in your output.
    Structure your output using the following format
    <thought>
    chain of thought here
    </thought>
    <answer>
    answer here
    </answer>
  question: >
    Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each
    can has 3 tennis balls. How many tennis balls does he have now?
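The sketch below shows how such a zero-shot CoT prompt might be sent to a model and how the tagged output could be parsed. It assumes the OpenAI Python SDK (v1+) purely for illustration; the model name is a placeholder, and any chat API that accepts a system/user message split would work similarly.

# Zero-shot CoT sketch: send the instructions above, then split the
# response into its <thought> and <answer> parts.
import re
from openai import OpenAI

SYSTEM_PROMPT = (
    "You are a helpful assistant that is able to handle complex reasoning "
    "tasks. To arrive at the final answer, perform chain of thought steps "
    "and include these in your output.\n"
    "Structure your output using the following format\n"
    "<thought>\nchain of thought here\n</thought>\n"
    "<answer>\nanswer here\n</answer>"
)

QUESTION = (
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each "
    "can has 3 tennis balls. How many tennis balls does he have now?"
)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": QUESTION},
    ],
)
text = response.choices[0].message.content

# Separate the reasoning trace from the final answer using the tags
# requested in the system prompt.
thought = re.search(r"<thought>(.*?)</thought>", text, re.DOTALL)
answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
print("Thought:", thought.group(1).strip() if thought else "(not found)")
print("Answer:", answer.group(1).strip() if answer else text.strip())

Keeping the reasoning and the answer in separate tags makes it straightforward to log or discard the chain of thought while extracting only the final answer for downstream use.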