Zero-shot-CoT: Large Language Models are Zero-Shot Reasoners

NeurIPS 2022 · The University of Tokyo & Google Research, Brain Team · arXiv:2205.11916

TL;DR

Zero-shot-CoT, a zero-shot, task-agnostic prompt ("Let’s think step by step.") that uses no step-by-step few-shot examples, elicits multi-step reasoning from large language models.

Motivations & Innovations

  • The success of large language models is often attributed to (in-context) few-shot learning, i.e., “prompting”.
  • Few-shot-CoT, however, requires hand-crafted, task-specific step-by-step exemplars, which motivates asking whether a single task-agnostic prompt can elicit the same reasoning behavior.
  • With CoT prompting, reasoning performance scales far more favorably, jumping up sharply with the size of the language model.

./images/index-20260120160643.webp

Approach: Two-stage Prompting

./images/index-20260120164753.webp

1st Prompt: Reasoning Extraction

The question is inserted into a simple template, “Q: [question]. A: Let’s think step by step.”, and the model’s completion is taken as the reasoning path.

2nd Prompt: Answer Extraction

The first prompt, the generated reasoning, and an answer trigger matched to the expected answer format (e.g., “Therefore, the answer (arabic numerals) is”) are concatenated, and the final answer is parsed from the new completion.
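A minimal sketch of the two-stage pipeline, assuming a hypothetical complete(prompt) -> str helper that wraps whatever LLM client you use; the helper name and the numeric-answer regex are illustrative, not from the paper:

```python
import re


def complete(prompt: str) -> str:
    """Hypothetical text-completion call: send `prompt` to an LLM, return its completion."""
    raise NotImplementedError("plug in your LLM client here")


def zero_shot_cot(question: str) -> str:
    # 1st prompt (reasoning extraction): append the task-agnostic
    # trigger "Let's think step by step." after the question.
    first_prompt = f"Q: {question}\nA: Let's think step by step."
    reasoning = complete(first_prompt)

    # 2nd prompt (answer extraction): concatenate the first prompt,
    # the generated reasoning, and an answer trigger matched to the
    # expected answer format (numeric, in this sketch).
    second_prompt = (
        f"{first_prompt} {reasoning}\n"
        "Therefore, the answer (arabic numerals) is"
    )
    answer_text = complete(second_prompt)

    # Answer cleansing: keep the first number in the completion
    # (illustrative rule; the paper applies per-format cleansing).
    match = re.search(r"-?\d+(?:\.\d+)?", answer_text)
    return match.group(0) if match else answer_text.strip()
```

For other answer formats only the second trigger and the cleansing rule change, e.g. “Therefore, among A through E, the answer is” for multiple-choice tasks; the first prompt stays identical across tasks.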

Experiments

Zero-shot-CoT vs Zero-shot:

./images/index-20260128174335.webp

Comparison with other baselines:

./images/index-20260128174431.webp

Does model size matter for zero-shot reasoning?: Yes

./images/index-20260128174518.webp

How does prompt selection affect Few-shot-CoT?:

./images/index-20260120160829.webp