Open X-Embodiment

Dataset · active

Open X-Embodiment is a large-scale, collaborative, open-source dataset of robot manipulation demonstrations. It pools data from 22 robot embodiments, contributed by 34 research labs across 21 institutions, and contains over 1 million episodes covering more than 500 skills across diverse tasks and scenes.

The dataset was created to address data fragmentation in robotics: each lab collects data on its own robots for its own tasks, which makes it difficult to train generalist robot policies. Open X-Embodiment converts these separate datasets into a standardized format, RLDS (Reinforcement Learning Datasets), and makes them publicly available. The data can be loaded with TensorFlow Datasets (TFDS).

Open X-Embodiment was used to train RT-1-X and RT-2-X, which showed that models trained on diverse multi-robot data can transfer knowledge across robot embodiments. The accompanying paper, "Open X-Embodiment: Robotic Learning Datasets and RT-X Models" (arXiv: 2310.08864), reported that co-training on data from many robots improves performance on individual robots compared with training on a single robot's data alone. The dataset serves as a foundational resource for training vision-language-action (VLA) models.
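To make the idea of a standardized episode format concrete, here is a minimal pure-Python sketch of an RLDS-style episode: a dictionary holding a sequence of steps, each with the step-level fields RLDS defines (`observation`, `action`, `reward`, `discount`, `is_first`, `is_last`, `is_terminal`). The helper `to_rlds_episode`, the `robot` metadata key, the toy robot names, and the sparse-reward and termination conventions are all assumptions made for illustration; they are not the dataset's actual conversion code, and individual datasets may use different conventions.

```python
from typing import Any, Dict, List

# Step-level fields defined by the RLDS episode format.
RLDS_STEP_KEYS = ("observation", "action", "reward", "discount",
                  "is_first", "is_last", "is_terminal")


def to_rlds_episode(observations: List[Dict[str, Any]],
                    actions: List[List[float]],
                    success: bool,
                    robot: str) -> Dict[str, Any]:
    """Pack one lab-specific trajectory into an RLDS-style episode dict.

    Hypothetical helper. Assumed conventions: a sparse reward of 1.0 on the
    final step of a successful episode, and is_terminal set only on success.
    """
    if not observations or len(observations) != len(actions):
        raise ValueError("need at least one step and matching lengths")
    last = len(observations) - 1
    steps = []
    for t, (obs, act) in enumerate(zip(observations, actions)):
        steps.append({
            "observation": obs,
            "action": act,
            "reward": 1.0 if (success and t == last) else 0.0,
            "discount": 1.0,
            "is_first": t == 0,
            "is_last": t == last,
            "is_terminal": success and t == last,
        })
    # "robot" is an illustrative metadata key, not a guaranteed dataset field.
    return {"episode_metadata": {"robot": robot}, "steps": steps}


def episode_return(episode: Dict[str, Any]) -> float:
    """Sum of rewards; works for any embodiment because the layout is shared."""
    return sum(step["reward"] for step in episode["steps"])


# Two toy trajectories from made-up robots with different action sizes.
arm = to_rlds_episode(
    observations=[{"image": "frame0"}, {"image": "frame1"}, {"image": "frame2"}],
    actions=[[0.1] * 7, [0.0] * 7, [0.0] * 7],
    success=True,
    robot="toy_7dof_arm",
)
gripper = to_rlds_episode(
    observations=[{"image": "frame0"}, {"image": "frame1"}],
    actions=[[0.5, -0.5], [0.0, 0.0]],
    success=False,
    robot="toy_2dof_gripper",
)

for ep in (arm, gripper):
    print(ep["episode_metadata"]["robot"], len(ep["steps"]), episode_return(ep))
```

The point of the sketch is that `episode_return` never needs to know which robot produced the data: once every trajectory shares the same episode and step layout, downstream code can iterate over mixed-embodiment data uniformly, even though action dimensions still differ per robot.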