
Just 817 samples can unlock complex reasoning in a large model: a look at LIMO from Shanghai Jiao Tong University

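Today let's chat about LIMO from Shanghai Jiao Tong University. By the end you may well be slapping your thigh: "So AI can be trained this cheaply?"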

1. What is LIMO?

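LIMO stands for Less Is More for Reasoning. Put simply, 817 math problems are enough to teach an AI to solve olympiad-level problems. It's like showing a top student a handful of classic worked examples, after which they can generalize to harder problems on their own.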

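The core idea: rather than training a model on huge amounts of data, a small set of high-quality training samples is enough to activate a large model's complex reasoning ability.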

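Figure: With far fewer samples, LIMO achieves a significant improvement over NuminaMath and performs well across multiple math and multi-disciplinary benchmarks.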

2. Why is it impressive?

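Data-efficient: training an AI to solve math problems used to take 100,000 problems or more. Now 817 are enough, which is like scoring higher after working through about 1% of the practice set.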
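Strong results: on AIME, the American math competition benchmark, accuracy jumps from 6.5% with the conventional approach to 57.1%, nearly 9 times higher.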
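Generalizes well: across 10 different benchmarks, LIMO's average score is about 40 percentage points higher, and it beats models trained on 100 times more data.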
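The ratios in the list above are easy to check by hand. Here is a quick sketch that redoes the arithmetic, using only the figures quoted in this article (the 100,000 baseline is the article's round number, not an exact dataset size):

```python
# Figures quoted in the article; the baseline size is a round number.
limo_samples = 817
baseline_samples = 100_000
aime_before = 6.5    # percent, conventional fine-tuning
aime_after = 57.1    # percent, LIMO

# Share of the baseline data that LIMO uses.
data_fraction = limo_samples / baseline_samples
# How many times higher the AIME accuracy is.
aime_ratio = aime_after / aime_before

print(f"Share of the data used: {data_fraction:.2%}")   # 0.82%
print(f"AIME accuracy ratio: {aime_ratio:.1f}x")        # 8.8x
```

So "1% of the data" is a slight round-up of 0.82%, and "nearly 9 times" is a round-up of about 8.8.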

3. What's going on under the hood

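The Shanghai Jiao Tong University team found something counterintuitive: today's large models have already absorbed the math knowledge during pre-training, like a student who has memorized the textbook but still can't solve problems. LIMO acts like a teacher who teaches method, using a small set of representative worked examples to show the AI how to put that knowledge to use.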

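For example: suppose the AI has already memorized the exam-prep classic 《五年高考三年模拟》 (Five Years of Gaokao, Three Years of Mock Exams) but still freezes on a new problem. LIMO teaches it specifically: "for a geometry problem, start by drawing auxiliary lines; for an algebra problem, start by finding the equal quantities." What it teaches is how to approach a problem, not rote memorization.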
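To make "teaching the approach" a bit more concrete, here is a minimal sketch of what one worked-example training record could look like: a question, the reasoning written out step by step, and a final answer. The function name `build_sft_record`, the step layout, and the sample problem are all hypothetical and made up for illustration; this is not the LIMO team's actual data format. Only the system prompt is taken from the code examples later in this article.

```python
# System prompt reused from the inference examples in this article.
SYSTEM_PROMPT = "Please reason step by step, and put your final answer within \\boxed{}."


def build_sft_record(question, reasoning_steps, final_answer):
    """Build one chat-style training record (hypothetical layout, for illustration only)."""
    # Number each reasoning step so the worked solution reads like a textbook example.
    solution = "\n".join(
        f"Step {i}: {step}" for i, step in enumerate(reasoning_steps, start=1)
    )
    # End with the final answer wrapped in \boxed{}, as the system prompt requests.
    solution += f"\nFinal answer: \\boxed{{{final_answer}}}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
        {"role": "assistant", "content": solution},
    ]


# A made-up sample problem, not one of the 817 LIMO samples.
record = build_sft_record(
    "Solve x + 3 = 5 for x.",
    ["Subtract 3 from both sides: x = 5 - 3.", "Compute 5 - 3 = 2."],
    2,
)
print(record[-1]["content"])
```

The point of the sketch is that the assistant turn carries the whole chain of reasoning, not just the answer, so the model sees how the knowledge gets applied.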

4. What can it be used for?

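Education: helping students pick up problem-solving approaches quickly
Research: speeding up the verification of complex mathematical models
Industry: optimizing the math problems that come up on production lines
Healthcare: assisting with the analysis of complex medical data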

5. Code examples

Loading with Hugging Face Transformers:

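from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub.
# device_map="auto" places the weights on whatever devices are available.
model = AutoModelForCausalLM.from_pretrained(
    "GAIR/LIMO",
    torch_dtype="auto",
    trust_remote_code=True,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("GAIR/LIMO", trust_remote_code=True)

# The system prompt asks for step-by-step reasoning and a \boxed{} final answer.
messages = [
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": "What is the result of 1+1?"}
]

# Render the conversation with the model's chat template and
# append the marker that starts the assistant's turn.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sample a response; the large token budget leaves room for long reasoning chains.
outputs = model.generate(
    **inputs,
    max_new_tokens=32768,
    temperature=0.7,
    top_p=0.95,
    do_sample=True
)

# Skip the prompt tokens and decode only the newly generated part.
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)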

Loading with vLLM:

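from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

# Adjust these settings to your hardware.
llm = LLM(
    model="GAIR/LIMO",
    tensor_parallel_size=4,        # shard the model across 4 GPUs
    trust_remote_code=True,
    swap_space=60,                 # CPU swap space per GPU, in GiB
    gpu_memory_utilization=0.96,   # fraction of each GPU's memory vLLM may use
)

# The system prompt asks for step-by-step reasoning and a \boxed{} final answer.
messages = [
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": "What is the result of 1+1?"}
]

# Render the conversation with the model's chat template and
# append the marker that starts the assistant's turn.
tokenizer = AutoTokenizer.from_pretrained("GAIR/LIMO", trust_remote_code=True)
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Same sampling settings as the Transformers example.
sampling_params = SamplingParams(
    temperature=0.7,
    max_tokens=32768,
    top_p=0.95,
)

# generate() returns one result per prompt; print the first completion's text.
output = llm.generate(text, sampling_params)
print(output[0].outputs[0].text)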
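
Both snippets ask the model to put its final answer inside \boxed{}, so once you have the generated text you will usually want to pull that answer out. Here is a small sketch of one way to do it in plain Python. The helper name `extract_boxed` is made up for this article and is not part of Transformers or vLLM. It matches braces by counting, so nested LaTeX such as \frac{1}{2} survives intact.

```python
def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    # Use the last occurrence, since the final answer normally comes at the end.
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1          # we are already inside the opening brace of \boxed{
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Found the brace that closes \boxed{...}.
                return "".join(chars)
        chars.append(ch)
        i += 1
    return None  # braces never balanced, e.g. the output was cut off


# Example strings written by hand, not real model output.
print(extract_boxed("So 1+1 = \\boxed{2}."))
print(extract_boxed("The answer is \\boxed{\\frac{1}{2}}"))
```

If the generation hits the token limit before the closing brace, the helper returns None, which is a handy signal that the response was truncated.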