BridgeData
BridgeData is a large-scale robot manipulation dataset collected in diverse real-world settings with a WidowX 6-DoF robot arm. Developed by the RAIL Lab at UC Berkeley, it is designed to support training generalist robot policies that perform a wide variety of manipulation tasks across different scenes, objects, and environmental conditions.

BridgeData V2, the second version of the dataset, contains over 60,000 demonstration trajectories collected across hundreds of different tasks and settings. Each trajectory includes multi-view RGB images, joint positions, end-effector poses, actions, and language task descriptions. The dataset was designed specifically to study generalization in robot learning: how well policies trained on diverse data adapt to new objects, backgrounds, and task configurations.

BridgeData has been used extensively for training vision-language-action (VLA) models, including OpenVLA, and serves as a standard benchmark for evaluating generalist robot policies. It is hosted on TensorFlow Datasets and is widely used in the open-source robotics research community for training and evaluating imitation learning policies.
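The per-trajectory contents listed above (multi-view RGB images, joint positions, end-effector poses, actions, and a language task description) can be pictured as a simple record. The following is a minimal sketch in plain Python, not the dataset's official schema: the class and field names, the camera names, the 7-dimensional action, and the example instruction are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Step:
    """One timestep of a demonstration (hypothetical layout, not the official schema)."""
    images: Dict[str, bytes]             # camera name -> raw RGB bytes, one entry per view
    joint_positions: Tuple[float, ...]   # one value per arm joint
    ee_pose: Tuple[float, ...]           # end-effector pose, e.g. position plus orientation
    action: Tuple[float, ...]            # command recorded at this step


@dataclass
class Trajectory:
    """A demonstration: a language task description plus a sequence of steps."""
    language_instruction: str
    steps: List[Step] = field(default_factory=list)

    def camera_names(self) -> List[str]:
        """Return the sorted camera names that are present in every step."""
        if not self.steps:
            return []
        common = set(self.steps[0].images)
        for step in self.steps[1:]:
            common &= set(step.images)
        return sorted(common)

    def validate(self) -> None:
        """Raise ValueError if the record is empty or its vectors change length."""
        if not self.language_instruction.strip():
            raise ValueError("missing language task description")
        if not self.steps:
            raise ValueError("trajectory has no steps")
        first = self.steps[0]
        for i, step in enumerate(self.steps):
            for name in ("joint_positions", "ee_pose", "action"):
                if len(getattr(step, name)) != len(getattr(first, name)):
                    raise ValueError(f"step {i}: {name} length differs from step 0")


def make_dummy_trajectory(num_steps: int = 3) -> Trajectory:
    """Build a placeholder trajectory; every value here is made up for illustration."""
    blank = bytes(4 * 4 * 3)  # a 4x4 RGB image filled with zeros
    steps = [
        Step(
            images={"over_shoulder": blank, "wrist": blank},  # assumed camera names
            joint_positions=(0.0,) * 6,  # six joints, matching a 6-DoF arm
            ee_pose=(0.0,) * 6,          # assumed (x, y, z, roll, pitch, yaw)
            action=(0.0,) * 7,           # assumed 6-D motion plus gripper command
        )
        for _ in range(num_steps)
    ]
    return Trajectory("put the spoon in the pot", steps)


traj = make_dummy_trajectory()
traj.validate()
print(traj.camera_names(), len(traj.steps))
```

Keeping the language description at the trajectory level and the sensor readings at the step level mirrors how the entry describes the data: one task description per demonstration, with observations and actions recorded along it. The released files may organize these fields differently.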