🏆 Benchmark Leaderboards

Compare training datasets by model performance on standard benchmarks. Higher score = better training data.

6 benchmarks and 7 training datasets compared in total
🏆
CALVIN
University of Freiburg
**CALVIN Benchmark** is a language-conditioned, long-horizon robot manipulation benchmark proposed by the University of Freiburg. Defin...
🏆
LIBERO
UT Austin / Sony AI / Tsinghua University
**LIBERO** is a lifelong robot learning benchmark proposed by UT Austin, Sony AI, and Tsinghua University. Defines **130 language-conditioned manipu...
🏆
RLBench
Imperial College London
**RLBench** is a robot manipulation benchmark framework proposed by Imperial College London. Defines **100 language-de...
🏆
SimplerEnv
Stanford / Google DeepMind
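SimplerEnv is a simulation evaluation suite for assessing robot manipulation policies. It includes high-fidelity simulated replicas of several real-world environments, such as Fractal and Bridge. Models such as GR00T N1 and pi0 use SimplerEnv for post-training evaluation.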
🏆
FurnitureBench
UT Austin / NVIDIA Research
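FurnitureBench is a real-robot benchmark for long-horizon furniture assembly, comprising assembly tasks for 9 IKEA-style 3D-printed furniture models. It evaluates fine manipulation skills such as grasping, insertion, and screwing over long task sequences.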
🏆
EmbodiedBench
UIUC / Northwestern / Purdue
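EmbodiedBench is a comprehensive embodied evaluation benchmark for multimodal large language models (MLLMs). It combines 4 simulation environments (EB-ALFRED/EB-Habitat/EB-Navigation/EB-Manipulation) and evaluates 6 core capabilities. Note: evaluation...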