GR-1 Humanoid Data Ecosystem Deep Review: From Real Teleoperation to Million-Scale Synthetic Trajectories

2026-06-08 By Superdata RobotAI

GR-1, Humanoid, Dataset Review, VLA, GR00T, ActionNet, Sim-to-Real

The GR-1 humanoid robot from Fourier Intelligence has become one of the most important platforms for embodied AI research. Three major datasets now exist around it, forming a data ecosystem spanning real-world teleoperation to million-scale synthetic generation.

This article provides a data engineer's perspective on all three GR-1 datasets, with side-by-side comparisons of scale, modality, licensing, and training results.

The Three Datasets

Fourier ActionNet: 30K+ real teleoperation trajectories, CC BY-NC-SA 4.0
NVIDIA GR-1 Simulation: Arena (50), Teleop-Sim (1,000), X-Embodiment (TB-scale)
GR00T N1 Training Set: 780K synthetic + real + internet video, Apache 2.0

Key Findings

GR00T N1 achieves a 42.6% success rate on the real GR-1 with only 10% of the data, and 76.8% with the full data
Synthetic data alone reaches 46.4% — real-world data remains essential for Sim-to-Real
Full-body locomotion data is still not publicly available for GR-1
License fragmentation across the three datasets requires careful commercial review
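
The license point above can be made concrete with a small check. The following is a minimal sketch, not a legal tool: the license strings are the ones listed in this article, the NVIDIA GR-1 Simulation license is left as `None` because it is not stated here, and the names `Dataset`, `commercial_status`, `needs_review`, and the `PERMISSIVE` allowlist are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dataset:
    name: str
    license: Optional[str]  # None means "not stated in this article"


# License strings as listed in the article; None is an explicit unknown.
DATASETS = [
    Dataset("Fourier ActionNet", "CC BY-NC-SA 4.0"),
    Dataset("NVIDIA GR-1 Simulation", None),
    Dataset("GR00T N1 Training Set", "Apache 2.0"),
]

# Assumption: a hand-maintained allowlist of licenses that permit commercial use.
PERMISSIVE = {"Apache 2.0"}


def commercial_status(ds: Dataset) -> str:
    """Classify a dataset as 'permitted', 'restricted', or 'unknown'."""
    if ds.license is None:
        return "unknown"
    if ds.license in PERMISSIVE:
        return "permitted"
    # Creative Commons licenses with the NC element bar commercial use.
    tokens = ds.license.upper().replace("-", " ").split()
    if "NC" in tokens:
        return "restricted"
    # Anything unrecognized is treated as unknown rather than guessed.
    return "unknown"


def needs_review(datasets: List[Dataset]) -> List[str]:
    """Return names of datasets that are not clearly cleared for commercial use."""
    return [d.name for d in datasets if commercial_status(d) != "permitted"]
```

Under these assumptions, `needs_review(DATASETS)` flags Fourier ActionNet (non-commercial clause) and NVIDIA GR-1 Simulation (license not stated here), which is why a mixed training pipeline would need review before commercial use.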

Selection Guide

Goal	Recommended Dataset
Quick start (academic)	Fourier ActionNet
Train VLA foundation model	GR00T N1 Training Set
Sim-to-Real research	NVIDIA GR-1 Sim + ActionNet combo
Dexterous hand / bimanual	Fourier ActionNet