LIMA: Less Is More for Alignment

arXiv 2305.11206 · Hugging Face

TL;DR

Superficial Alignment Hypothesis: a model's knowledge and capabilities are learned almost entirely during pretraining, while alignment merely teaches it which style and format to use when interacting with users. → If alignment is mostly about surface form, a rather small set of examples should be sufficient to achieve it.

Motivations & Innovations

Existing alignment methods require significant amounts of instruction data and/or reinforcement learning from human feedback. → LIMA shows that simply fine-tuning on 1,000 carefully curated prompt–response examples can be enough.

Approach

Data Source

  • Community questions & answers (Stack Exchange, wikiHow, Reddit)
  • Manually authored examples written by the authors

./images/index-20260128165919.webp

Experiments

Setup: starting from LLaMA 65B, run SFT on the 1,000-example alignment training set.

  • In human preference evaluations, LIMA outperforms OpenAI's RLHF-trained DaVinci003, as well as a 65B-parameter reproduction of Alpaca trained on 52 times more data (52,000 examples).

./images/index-20260128170711.png
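The SFT recipe above can be sketched in terms of data preparation: concatenate prompt and response into one sequence, and mask the prompt tokens out of the loss so the model is trained only to produce the response. The sketch below is a minimal, hypothetical illustration with toy token ids; the `EOT_ID` value and the `-100` ignore-index convention are assumptions (the latter follows common cross-entropy implementations), not LIMA's exact tokenizer details.

```python
IGNORE_INDEX = -100  # label value conventionally skipped by cross-entropy loss
EOT_ID = 32000       # hypothetical end-of-turn token separating speakers

def build_sft_example(prompt_ids, response_ids):
    """Return (input_ids, labels) for one prompt/response training pair.

    Prompt tokens (and the EOT after them) get IGNORE_INDEX labels,
    so gradient only flows through the response tokens.
    """
    input_ids = prompt_ids + [EOT_ID] + response_ids + [EOT_ID]
    labels = (
        [IGNORE_INDEX] * (len(prompt_ids) + 1)  # no loss on the prompt
        + response_ids + [EOT_ID]               # loss on the response only
    )
    return input_ids, labels

inp, lab = build_sft_example([5, 6, 7], [8, 9])
# inp: [5, 6, 7, 32000, 8, 9, 32000]
# lab: [-100, -100, -100, -100, 8, 9, 32000]
```

The LIMA paper does introduce a special end-of-turn token to separate speakers; the masking scheme here is the standard way such pairs are fed to a causal-LM trainer.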