Mobile ALOHA

Stanford IRIS Lab's low-cost whole-body teleoperation mobile bimanual manipulation platform



Key Significance

  • Democratization of Mobile Bimanual Manipulation: The platform costs about $32,000, versus $200,000 or more for existing commercial mobile manipulators, which makes this line of research far more accessible
  • Whole-Body Teleoperation: Enables complex household tasks that require moving while using both arms, beyond tabletop manipulation
  • Co-Training Paradigm: Reaches 80-95% success on six of the seven evaluated tasks with 50 demonstrations per task by training jointly with existing static ALOHA datasets
  • Open-Source Ecosystem: Hardware design, software, 3D printing files, and assembly tutorials are all publicly released
  • Practical Household Robot Research: Demonstrates the potential of general-purpose household robots through real-world tasks such as cooking, cleaning, and calling an elevator

Overview

Mobile ALOHA is a low-cost whole-body teleoperation system developed by Stanford IRIS Lab. By mounting the previously tabletop-only ALOHA system on a mobile base, it enables learning complex tasks that require locomotion and bimanual manipulation at the same time.

| Item | Spec |
| --- | --- |
| Development | Stanford IRIS Lab |
| Authors | Zipeng Fu*, Tony Z. Zhao*, Chelsea Finn |
| Publication | arXiv: January 2024 / CoRL 2024 presentation |
| Base | AgileX Tracer AGV |
| Arms | ViperX 300 6-DoF x 2 (follower) + WidowX 250 x 2 (leader) |
| Gripper | Custom parallel gripper (3D printed) |
| Cameras | Wrist x 2 + front 1 (Logitech C922x) |
| Compute | Laptop (RTX 3070 Ti, i7-12800H) |
| Total Cost | ~$32,000 |
| Paper | arXiv:2401.02117 |
| Project | mobile-aloha.github.io |

Hardware Configuration

Mobile Base: AgileX Tracer AGV

Source: AgileX TRACER Documentation

| Item | Spec |
| --- | --- |
| Drive | 2-wheel differential drive + 4 freewheel casters |
| Motors | 150 W brushless servo x 2 |
| Max Speed | 1.6 m/s (human walking speed) |
| Payload | 100 kg |
| Size (L x W x H) | 702 x 610 x 169 mm |
| Ground Clearance | 30 mm |
| Obstacle Crossing | 10 mm height, 8 degree slope |
| Operating Time | Up to 4 hours (100 kg load) |
| Price | ~$7,000 |
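
To make the differential-drive row concrete, the sketch below converts between the two wheel speeds and the base's linear and angular velocity using the standard differential-drive relations. The `DiffDrive` class and the 0.5 m track width are hypothetical values chosen for illustration and are not taken from the AgileX documentation; only the 1.6 m/s speed limit comes from the table above.

```python
from dataclasses import dataclass
from typing import Tuple

MAX_SPEED_M_S = 1.6  # maximum base speed from the spec table


@dataclass(frozen=True)
class DiffDrive:
    """Kinematics of a two-wheel differential-drive base (illustrative only)."""

    track_width_m: float  # distance between the drive wheels; hypothetical value

    def body_velocity(self, v_left: float, v_right: float) -> Tuple[float, float]:
        """Wheel speeds (m/s) -> (linear m/s, angular rad/s) of the base."""
        linear = (v_left + v_right) / 2.0
        angular = (v_right - v_left) / self.track_width_m
        return linear, angular

    def wheel_speeds(self, linear: float, angular: float) -> Tuple[float, float]:
        """(linear, angular) command -> (left, right) wheel speeds in m/s."""
        # Clamp the forward speed to the rated maximum before splitting it.
        linear = max(-MAX_SPEED_M_S, min(MAX_SPEED_M_S, linear))
        half_diff = angular * self.track_width_m / 2.0
        return linear - half_diff, linear + half_diff


base = DiffDrive(track_width_m=0.5)  # assumed track width
print(base.body_velocity(1.0, 1.0))  # equal wheel speeds: straight-line motion
print(base.wheel_speeds(0.0, 1.0))   # zero linear speed: turn in place
```

The linear and angular velocity pair is the same quantity that later appears as the two base dimensions of the 16-DoF action vector.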

Robot Arms: ViperX 300 6-DoF

Source: Trossen Robotics ViperX 300

| Item | Spec |
| --- | --- |
| Configuration | 2 followers (for autonomous execution) |
| Arm DoF | 6-DoF (arm body) |
| Gripper | 1-DoF (open/close) |
| Arm+Gripper Total DoF | 7-DoF per arm |
| Horizontal Reach | 75 cm (base center to gripper) |
| Total Span | 150 cm |
| Working Payload | 750 g |
| Servos | DYNAMIXEL XM540-W270-R, XM430-W350-R |
| Resolution | 4096 positions |
| Material | 20 mm x 40 mm extruded aluminum |
| Price | ~$6,130 x 2 |
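
As a rough illustration of what the 4096-position resolution means, the sketch below converts raw servo position ticks to joint angles. It assumes 4096 ticks span one full revolution and that tick 2048 is the joint's zero position; the zero offset in particular is an assumption for illustration, and the function names are hypothetical rather than part of any servo SDK.

```python
import math

TICKS_PER_REV = 4096   # servo resolution from the spec table
CENTER_TICK = 2048     # assumed zero position (illustrative)


def ticks_to_radians(ticks: int, center: int = CENTER_TICK) -> float:
    """Convert a raw position reading to a joint angle in radians."""
    return (ticks - center) * 2.0 * math.pi / TICKS_PER_REV


def radians_to_ticks(angle: float, center: int = CENTER_TICK) -> int:
    """Convert a joint angle in radians to the nearest raw position value."""
    return center + round(angle * TICKS_PER_REV / (2.0 * math.pi))


resolution_deg = 360.0 / TICKS_PER_REV
print(f"angular resolution: {resolution_deg:.4f} degrees per tick")
print(f"tick 3072 -> {math.degrees(ticks_to_radians(3072)):.1f} degrees")
```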

Teleoperation Leader Arms: WidowX 250 6-DoF

| Item | Spec |
| --- | --- |
| Configuration | 2 leaders (for data collection) |
| Features | 3D printed ergonomic handles |
| Use | Used only for demonstration data collection |
| Price | ~$3,550 x 2 |

Sensors and Compute

| Item | Spec |
| --- | --- |
| Cameras | Logitech C922x x 3 |
| Resolution | 640 x 480 |
| Control Frequency | 50 Hz (camera streaming and policy execution) |
| Placement | 2 wrist + 1 front |
| Compute | Consumer-grade laptop |
| GPU | NVIDIA RTX 3070 Ti (8 GB VRAM) |
| CPU | Intel i7-12800H |
| Communication | USB serial (arms) + CAN bus (base) |
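
A 50 Hz control frequency leaves a 20 ms budget per step for reading cameras, running the policy, and sending commands. The following is a minimal sketch of a fixed-rate loop that keeps to that schedule; `run_fixed_rate` and the `step` callback are hypothetical names, and the actual Mobile ALOHA software may structure its loop differently.

```python
import time
from typing import Callable


def run_fixed_rate(step: Callable[[int], None], hz: float = 50.0, n_steps: int = 100) -> float:
    """Call step(i) n_steps times at roughly `hz` and return the elapsed seconds."""
    period = 1.0 / hz
    start = time.perf_counter()
    next_tick = start
    for i in range(n_steps):
        step(i)  # e.g. read cameras, query the policy, send commands
        # Schedule against absolute deadlines so timing errors do not accumulate.
        next_tick += period
        delay = next_tick - time.perf_counter()
        if delay > 0:
            time.sleep(delay)
    return time.perf_counter() - start


calls = []
elapsed = run_fixed_rate(calls.append, hz=50.0, n_steps=5)
print(f"{len(calls)} steps in {elapsed:.3f} s")
```

If a step overruns its 20 ms budget, this loop simply skips the sleep and starts the next step immediately; a real controller would also need to decide how to handle repeated overruns.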

Power System

| Item | Spec |
| --- | --- |
| Battery | 1.26 kWh |
| Weight | 14 kg |
| Position | Bottom of base (doubles as counterweight) |
| Features | Untethered wireless operation |

Cost Breakdown (~$32,000)

Source: Mobile ALOHA Project Page, Trossen Robotics

| Component | Price (USD) | Notes |
| --- | --- | --- |
| AgileX Tracer AGV | ~$7,000 | Mobile base |
| ViperX 300 6-DoF x 2 | ~$12,260 | Follower arms |
| WidowX 250 6-DoF x 2 | ~$7,100 | Leader arms (for teleop) |
| Battery (1.26 kWh) | ~$2,000 | Estimate |
| Cameras (C922x x 3) | ~$300 | RGB webcams |
| 3D Printed Parts | ~$500 | Grippers, mounts, etc. |
| Other Hardware | ~$2,840 | Brackets, cables, etc. |
| Total | ~$32,000 | Officially stated on project page |
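
The line items above can be checked against the stated total with a few lines of arithmetic. The dictionary below copies the approximate prices from the table; the variable names are illustrative only.

```python
# Approximate component prices in USD, copied from the cost breakdown table.
costs_usd = {
    "AgileX Tracer AGV": 7_000,
    "ViperX 300 6-DoF x 2": 2 * 6_130,
    "WidowX 250 6-DoF x 2": 2 * 3_550,
    "Battery (1.26 kWh)": 2_000,   # listed as an estimate
    "Cameras (C922x x 3)": 300,
    "3D Printed Parts": 500,
    "Other Hardware": 2_840,
}

total_usd = sum(costs_usd.values())
arms_usd = costs_usd["ViperX 300 6-DoF x 2"] + costs_usd["WidowX 250 6-DoF x 2"]
arms_share = arms_usd / total_usd

print(f"total: ${total_usd:,}")
print(f"share spent on the four arms: {arms_share:.1%}")
```

By this breakdown, the four arms account for roughly 60% of the hardware cost, with the mobile base the next largest item.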

Comparison: Existing commercial mobile manipulators (e.g., Clearpath + dual arms) cost $200,000 or more.


Physical Specifications

Source: Mobile ALOHA Project Page

| Item | Spec |
| --- | --- |
| Footprint | 90 cm x 135 cm |
| Arm Reach Height | 65 cm to 200 cm |
| Arm Forward Extension | 100 cm (from base) |
| Total Weight | 75 kg |
| Pull Force | 100 N at 1.5 m height |
| Movement Speed | Up to 1.6 m/s |

Differences from Static ALOHA

| Item | ALOHA (Static) | Mobile ALOHA |
| --- | --- | --- |
| Base | Fixed table | AgileX Tracer (mobile) |
| Action Dimensions | 14-DoF (arms + grippers) | 16-DoF (arms + grippers + base velocity) |
| Task Range | Tabletop manipulation | Full indoor environment |
| Teleop Method | Hand-operate leader arms | Whole-body teleop (walk while manipulating) |
| Cost | ~$20,000 | ~$32,000 |
| Stability | Fixed mounting | Counterbalanced by the battery weight at the bottom of the base |

Action Space Expansion

DoF Explanation: Each ViperX 300 arm is 6-DoF (arm) + 1-DoF (gripper) = 7-DoF

ALOHA: 14-DoF joint positions
       [arm1(6) + gripper1(1) + arm2(6) + gripper2(1)]

Mobile ALOHA: 16-DoF
       [arm1(6) + gripper1(1) + arm2(6) + gripper2(1) + base_linear_vel(1) + base_angular_vel(1)]

This design allows existing imitation learning algorithms to be applied with minimal modification.
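
The layout above can be sketched as a small data structure that flattens into a 16-element vector. The class and field names are hypothetical and chosen to mirror the arm1/arm2 notation used here; the released code may order or name the fields differently.

```python
from dataclasses import dataclass
from typing import List

ARM_DOF = 6  # joints per ViperX 300 arm, excluding the gripper


@dataclass
class MobileAlohaAction:
    """One action: two 7-DoF arms plus base linear and angular velocity."""

    arm1: List[float]
    gripper1: float
    arm2: List[float]
    gripper2: float
    base_linear_vel: float = 0.0
    base_angular_vel: float = 0.0

    def to_vector(self) -> List[float]:
        """Flatten to [arm1(6), gripper1, arm2(6), gripper2, lin_vel, ang_vel]."""
        if len(self.arm1) != ARM_DOF or len(self.arm2) != ARM_DOF:
            raise ValueError(f"each arm needs exactly {ARM_DOF} joint values")
        return [
            *self.arm1, self.gripper1,
            *self.arm2, self.gripper2,
            self.base_linear_vel, self.base_angular_vel,
        ]


action = MobileAlohaAction(
    arm1=[0.0] * 6, gripper1=1.0,
    arm2=[0.1] * 6, gripper2=0.0,
    base_linear_vel=0.3, base_angular_vel=-0.2,
)
vector = action.to_vector()
print(len(vector), vector[-2:])
```

Dropping the last two elements recovers the 14-DoF static ALOHA action, which is why a policy head only needs a wider output layer to support the mobile platform.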


Co-Training: Core Technique

Motivation

Mobile bimanual manipulation datasets are scarce, while static bimanual manipulation data is comparatively abundant. Co-training improves performance by training a single policy on both types of data together.

Method

Training Data = Mobile ALOHA demos (50) + Static ALOHA datasets (existing)

Mobile data: Full 16-DoF actions
Static data: 14-DoF actions (base velocity padded with 0)
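
A minimal sketch of this data mixing follows: static actions are zero-padded to 16 dimensions, and each batch element is drawn from either dataset. The function names are hypothetical, and the equal sampling probability (`p_mobile=0.5`) is an assumption made for illustration rather than a value stated in this section. Observations are omitted to keep the example short.

```python
import random
from typing import List, Sequence

STATIC_DIM = 14   # static ALOHA: arms + grippers
MOBILE_DIM = 16   # Mobile ALOHA: arms + grippers + base velocity


def pad_static_action(action: Sequence[float]) -> List[float]:
    """Append zero base velocities so a static action matches the mobile layout."""
    if len(action) != STATIC_DIM:
        raise ValueError(f"expected {STATIC_DIM} values, got {len(action)}")
    return list(action) + [0.0] * (MOBILE_DIM - STATIC_DIM)


def sample_cotraining_batch(
    mobile: Sequence[Sequence[float]],
    static: Sequence[Sequence[float]],
    batch_size: int,
    rng: random.Random,
    p_mobile: float = 0.5,  # assumed mixing ratio
) -> List[List[float]]:
    """Draw a batch of 16-dim actions mixed from the two datasets."""
    batch = []
    for _ in range(batch_size):
        if rng.random() < p_mobile:
            batch.append(list(rng.choice(mobile)))
        else:
            batch.append(pad_static_action(rng.choice(static)))
    return batch


rng = random.Random(0)
mobile_actions = [[0.5] * MOBILE_DIM for _ in range(50)]    # placeholder data
static_actions = [[0.2] * STATIC_DIM for _ in range(500)]   # placeholder data
batch = sample_cotraining_batch(mobile_actions, static_actions, 8, rng)
print([len(a) for a in batch])
```

Sampling per element rather than concatenating the datasets keeps the small mobile dataset from being swamped by the much larger static one.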

Effect

| Condition | Average Success Rate |
| --- | --- |
| Mobile data only | ~50% |
| With co-training | ~84% |
| Improvement | +34 percentage points |

Demonstrated Tasks

Source: arXiv:2401.02117 Table 1

Success Rates (50 demos per task unless noted, with co-training)

| Task | Success Rate | Description |
| --- | --- | --- |
| Wipe Wine | 95% | Wipe wine spill |
| Call Elevator | 95% | Call elevator and board |
| Use Cabinet | 85% | Open wall cabinet and store pot |
| High Five | 85% | High five |
| Rinse Pan | 80% | Rinse pan at kitchen sink |
| Push Chairs | 80% | Organize chairs |
| Cook Shrimp | 40% | Stir-fry shrimp (75 sec, only 20 demos used) |

Task Categories

Cooking

  • Stir-fry and serve shrimp
  • Handle pots/pans
  • Rinse at sink

Cleaning/Organizing

  • Wipe wine spill
  • Push chairs to organize
  • Store items in cabinet
  • Use vacuum cleaner

Navigation + Manipulation

  • Press elevator button and board
  • Transport items between rooms

Interaction

  • High five
  • Hand items to people

Technical Details

Supported Algorithms

| Algorithm | Description |
| --- | --- |
| ACT | Action Chunking Transformer |
| Diffusion Policy | Diffusion-based action generation |
| VINN | Visual Imitation through Nearest Neighbors |

Simulation Environments

  • Transfer Cube
  • Bimanual Insertion

Training Settings

| Item | Value |
| --- | --- |
| Number of Demos | 50/task |
| Control Frequency | 50 Hz |
| Image Resolution | 640 x 480 |
| Number of Cameras | 3 (2 wrist + 1 front) |
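
These settings determine how much data a single demonstration contains. The sketch below estimates timesteps and raw image volume per episode, assuming uncompressed 8-bit RGB frames; that storage format is an assumption for illustration, and the 75-second duration is taken from the Cook Shrimp task. The function names are hypothetical.

```python
CONTROL_HZ = 50
NUM_CAMERAS = 3
IMAGE_WIDTH, IMAGE_HEIGHT = 640, 480
CHANNELS = 3  # assumed 8-bit RGB, one byte per channel


def steps_per_episode(duration_s: float, hz: int = CONTROL_HZ) -> int:
    """Number of control steps recorded in an episode of the given length."""
    return round(duration_s * hz)


def raw_image_bytes_per_step() -> int:
    """Uncompressed bytes for one synchronized frame from every camera."""
    return NUM_CAMERAS * IMAGE_HEIGHT * IMAGE_WIDTH * CHANNELS


steps = steps_per_episode(75)
episode_bytes = steps * raw_image_bytes_per_step()
print(f"{steps} steps, about {episode_bytes / 1e9:.1f} GB of raw images per episode")
```

Under these assumptions a 75-second episode holds 3,750 steps and roughly 10 GB of raw pixels, which suggests why image compression matters when storing demonstrations.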

Open-Source Resources

Public Materials

| Resource | Link |
| --- | --- |
| Paper | arXiv:2401.02117 |
| Project Page | mobile-aloha.github.io |
| GitHub (Hardware) | mobile-aloha |
| GitHub (Algorithms) | act-plus-plus |
| Assembly Tutorial | Included in project page |
| 3D Printing Files | Included in GitHub |

Tutorial Contents

  • 3D printing guide
  • Assembly sequence
  • Software installation
  • Calibration methods
  • Teleoperation usage

Research Team and Support

Authors

| Name | Role |
| --- | --- |
| Zipeng Fu | Co-first author |
| Tony Z. Zhao | Co-first author |
| Chelsea Finn | Advisor |

Support

  • Stanford Robotics Center
  • Steve Cousins
  • Stanford IRIS Lab members

Subsequent Developments

ALOHA 2 (Google DeepMind, 2024)

Google DeepMind announced an improved hardware version:

  • Improved rigidity and precision
  • Improved gripper design
  • Better cable management

Commercialization

Trossen Robotics sells ALOHA kits:

  • ALOHA Solo
  • ALOHA Bimanual Kit
  • Mobile ALOHA compatible parts

Significance and Impact

Academic Impact

  • Co-training Effect Demonstrated: Data from related tasks can improve performance on new mobile tasks
  • Low-cost Research Platform: High-quality research is possible with a ~$32,000 platform
  • Reproducibility: The fully open-source release enables replication in labs worldwide

Industrial Implications

“Mobile ALOHA has demonstrated something unique: relatively cheap robot hardware can solve really complex problems.” - Lerrel Pinto, NYU

  • Demonstrated feasibility of household robots
  • Complex tasks possible even with low-cost hardware
  • Dramatically reduced data collection costs

References

Paper

@article{fu2024mobile,
  author    = {Fu, Zipeng and Zhao, Tony Z. and Finn, Chelsea},
  title     = {Mobile ALOHA: Learning Bimanual Mobile Manipulation
               with Low-Cost Whole-Body Teleoperation},
  journal   = {arXiv preprint arXiv:2401.02117},
  year      = {2024},
  note      = {Presented at CoRL 2024}
}
