Galbot

Overview

Galbot (Galaxy General Robotics) is a China-based full-stack robotics unicorn that achieved the first large-scale commercial deployment in the VLA (vision-language-action) field, guided by its “Synthetic First, Real Data as a Complement” philosophy. Its production systems are trained on roughly 99% synthetic data and less than 1% real data.

Item	Details
Founded	May 2023
Headquarters	Beijing, China (Haidian District)
R&D Centers	Beijing, Shenzhen, Suzhou, Hong Kong
Co-founders	He Wang (CTO), TengZhou Yao
Affiliations	Beijing Academy of AI (BAAI), PKU EPIC Lab
Total Funding	$800M+ (as of December 2025)
Valuation	$3B (as of December 2025)
Mission	“Make robots for every industry and every home”

He Wang: Born in 1992, graduated from Tsinghua University, and earned a PhD from Stanford University in 2021 (advisor: Leonidas J. Guibas). He is currently an Assistant Professor at Peking University CFCS and the founder of PKU EPIC Lab.

Key Products

Galbot G1

Semi-humanoid mobile manipulator:

Specification	Value
Height	173cm
Weight	85kg
Arm Span	190cm
Max Reach Height	240cm
Payload	5kg (single arm)
Battery Life	10 hours continuous
Item Handling	5,000+ SKU types

Galbot S1

Industrial heavy-duty robot (launched 2025):

Specification	Value
Payload	50kg (continuous dual-arm)
Application	Manufacturing, heavy industry

Key Achievements

Commercial Deployment

Metric	Value
Galbot Store	30+ Chinese cities
Smart Pharmacy/Warehouses	30+ fully unmanned
Workers per warehouse	0
MTBF (mean time between failures)	1 month+
Continuous operation	10 hours/charge

Key Partnerships

Manufacturing: CATL, Bosch, Toyota, Hyundai
Healthcare: Xuanwu Hospital (patient rooms, pharmacies, guidance systems)

GraspVLA Performance (LIBERO Zero-shot)

Model	LIBERO-Long	LIBERO-Goal	LIBERO-Object	Condition
OpenVLA	33.7%	56.6%	65.4%	fine-tuned
π0	62.7%	79.4%	93.8%	fine-tuned
GraspVLA	82.0%	91.2%	94.1%	zero-shot

GraspVLA, evaluated zero-shot, outperforms both fine-tuned baselines on all three suites.
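
To make the size of the lead concrete, the short script below recomputes unweighted means and per-suite margins from the table. The numbers are copied from the table; the helper names (`mean_success`, `margin_over`) and the ASCII key `pi0` for π0 are illustrative choices, not part of any benchmark tooling.

```python
from typing import Dict

# Success rates (%) copied from the LIBERO table; "pi0" stands for π0.
RESULTS: Dict[str, Dict[str, float]] = {
    "OpenVLA": {"Long": 33.7, "Goal": 56.6, "Object": 65.4},   # fine-tuned
    "pi0": {"Long": 62.7, "Goal": 79.4, "Object": 93.8},       # fine-tuned
    "GraspVLA": {"Long": 82.0, "Goal": 91.2, "Object": 94.1},  # zero-shot
}


def mean_success(scores: Dict[str, float]) -> float:
    """Unweighted mean over the three reported suites."""
    return sum(scores.values()) / len(scores)


def margin_over(model: str, baseline: str) -> Dict[str, float]:
    """Per-suite lead of `model` over `baseline`, in percentage points."""
    return {
        suite: round(RESULTS[model][suite] - RESULTS[baseline][suite], 1)
        for suite in RESULTS[model]
    }


if __name__ == "__main__":
    for name, scores in RESULTS.items():
        print(f"{name}: mean {mean_success(scores):.1f}%")
    print("GraspVLA vs pi0:", margin_over("GraspVLA", "pi0"))
```

The margin is largest on the long-horizon suite and nearly vanishes on the object suite, so the headline claim rests mostly on the Long and Goal columns.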

Technical Architecture

Cerebrum-Cerebellum Structure

Dual system inspired by human brain architecture:

Component	Role	Implementation
Cerebrum	High-level policy: what to do	VLA (Imitation + Web Grounding)
Cerebellum	Low-level motor control: how to do it	RL-based 100Hz control
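
The split between a slow planner and a fast controller can be sketched as a dual-rate loop. This is a toy illustration only: the 100 Hz control rate comes from the table, while the 5 Hz planner rate, the proportional tracker standing in for the RL controller, and all function names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple

CONTROL_HZ = 100  # low-level rate, taken from the table
PLANNER_HZ = 5    # ASSUMED high-level rate; the source does not state one


@dataclass
class State:
    position: float = 0.0


def cerebrum_plan(step: int) -> float:
    """Hypothetical stand-in for the VLA: returns a target position.

    The target flips once per simulated second to mimic a new sub-goal.
    """
    return 1.0 if (step // CONTROL_HZ) % 2 == 0 else -1.0


def cerebellum_step(state: State, target: float, gain: float = 0.1) -> None:
    """Hypothetical stand-in for the RL controller: a proportional tracker."""
    state.position += gain * (target - state.position)


def run(seconds: float) -> Tuple[State, int]:
    """Run the dual-rate loop; returns the final state and the number of plans."""
    state, target, plans = State(), 0.0, 0
    ticks_per_plan = CONTROL_HZ // PLANNER_HZ
    for step in range(int(seconds * CONTROL_HZ)):
        if step % ticks_per_plan == 0:   # slow loop: refresh the goal
            target = cerebrum_plan(step)
            plans += 1
        cerebellum_step(state, target)   # fast loop: runs on every tick
    return state, plans


if __name__ == "__main__":
    final_state, num_plans = run(1.0)
    print(f"plans issued: {num_plans}, final position: {final_state.position:.4f}")
```

The point of the structure is that the expensive model runs rarely while the cheap controller keeps tracking the most recent goal between planner updates.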

GraspVLA

Component	Specification
Vision Encoder	DINO-v2 + SigLIP
LLM Backbone	InternLM2 1.8B
Action Expert	Flow Matching
Training Data	1B synthetic frames + 100M+ web grounding bounding boxes
Training Cost	~$5,000 (160×RTX 4090, 10 days)
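
The action expert is listed as Flow Matching. As a minimal sketch of how that family of methods turns noise into an action (not GraspVLA's actual implementation), the code below shows the straight-line training target and Euler integration of a velocity field at inference. The velocity field here is analytic for a single known target and stands in for a trained network; the 3-dimensional action and all names are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Velocity = Callable[[List[float], float], List[float]]


def flow_matching_pair(
    x0: Sequence[float], x1: Sequence[float], t: float
) -> Tuple[List[float], List[float]]:
    """Training pair for the straight-line path from noise x0 to action x1.

    Returns the interpolated point x_t and the regression target x1 - x0.
    """
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - a for a, b in zip(x0, x1)]
    return x_t, u_t


def sample_action(velocity: Velocity, noise: Sequence[float], steps: int = 10) -> List[float]:
    """Euler-integrate dx/dt = velocity(x, t) from t=0 (noise) to t=1 (action)."""
    x = list(noise)
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target: Sequence[float]) -> Velocity:
    """Analytic field for a single-target distribution (stand-in for a network)."""
    def velocity(x: List[float], t: float) -> List[float]:
        # t never reaches 1 inside sample_action, so the division is safe.
        return [(g - xi) / (1.0 - t) for g, xi in zip(target, x)]
    return velocity


if __name__ == "__main__":
    rng = random.Random(0)
    target = [0.3, -0.2, 0.5]  # hypothetical 3-DoF end-effector delta
    noise = [rng.gauss(0.0, 1.0) for _ in target]
    print(sample_action(make_toy_velocity(target), noise))
```

In a real system the velocity function would be a network conditioned on vision and language features and trained to regress the `u_t` targets; only the integration loop carries over unchanged.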

Data Strategy

Synthetic Data Pipeline

Scene Synthesis
    ↓
Trajectory Generation
  ├─ Physics-based Energy Optimization (DexGraspNet)
  ├─ Human Videos → Synthetic (GenHOI)
  └─ Large-Scale RL (UniDexGrasp++)
    ↓
Validation & Rendering
  ├─ MuJoCo physics validation
  └─ Isaac Sim ray-tracing
    ↓
Sim2Real Transfer (1B frames convergence)
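
The stages in the diagram can be read as a generate-then-filter pipeline. The sketch below mirrors only that shape: each function is a hypothetical stub (a random object width for scene synthesis, a random gripper opening for trajectory generation, a geometric check in place of MuJoCo validation, a flag in place of Isaac Sim rendering). None of these names or thresholds come from Galbot's tooling.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    scene_id: int
    object_width: float   # metres, hypothetical scene parameter
    grasp_width: float    # metres, hypothetical trajectory parameter
    rendered: bool = False


def synthesize_scene(rng: random.Random) -> float:
    """Stage 1 stub: a 'scene' is just a random object width."""
    return rng.uniform(0.02, 0.10)


def generate_trajectory(object_width: float, rng: random.Random) -> float:
    """Stage 2 stub: propose a gripper opening near the object width."""
    return object_width + rng.uniform(-0.01, 0.03)


def physics_valid(sample: Sample, max_opening: float = 0.08) -> bool:
    """Stage 3a stub: the opening must clear the object and fit the gripper."""
    return sample.object_width < sample.grasp_width <= max_opening


def render(sample: Sample) -> Sample:
    """Stage 3b stub: mark the sample as rendered."""
    sample.rendered = True
    return sample


def build_dataset(num_scenes: int, seed: int = 0) -> List[Sample]:
    """Run all stages and keep only samples that pass validation."""
    rng = random.Random(seed)
    dataset: List[Sample] = []
    for scene_id in range(num_scenes):
        width = synthesize_scene(rng)
        candidate = Sample(scene_id, width, generate_trajectory(width, rng))
        if physics_valid(candidate):
            dataset.append(render(candidate))
    return dataset


if __name__ == "__main__":
    data = build_dataset(1000)
    print(f"kept {len(data)} of 1000 candidates")
```

Validating before rendering keeps the expensive step off candidates that would be discarded anyway, which is one plausible reason for the ordering shown in the diagram.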

Scaling Law Discovery

At about 1B frames, the sim and real performance curves converge.
The sim2real gap shrinks as data scale grows.
This scale is impractical to reach via teleoperation.
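
A rough back-of-envelope calculation gives a sense of what 1B frames would mean for teleoperation. The frame rates and the 8-hour, 250-day operator year below are assumptions chosen for illustration; the estimate counts pure recording time only and ignores resets, failed episodes, and hardware downtime.

```python
def teleop_hours(frames: int, fps: float) -> float:
    """Hours of continuous recording needed to collect `frames` at `fps`."""
    return frames / fps / 3600.0


def operator_years(hours: float, hours_per_day: float = 8.0, days_per_year: int = 250) -> float:
    """Convert recording hours to single-operator working years (assumed schedule)."""
    return hours / (hours_per_day * days_per_year)


if __name__ == "__main__":
    frames = 1_000_000_000  # the 1B-frame scale cited in this section
    for fps in (10, 30):    # ASSUMED camera rates; the source gives none
        hours = teleop_hours(frames, fps)
        print(f"{fps} fps: {hours:,.0f} h, about {operator_years(hours):.1f} operator-years")
```

Under these assumptions the total lands in the range of several to more than ten operator-years of uninterrupted recording, before accounting for any of the excluded overheads.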

Data Scale

Data Type	Scale
Synthetic trajectories	Billion-scale
DexGraspNet 2.0 grasps	426M
Web grounding (GRIT)	100M+ bboxes
Real data ratio	<1%

Research Portfolio

Grasping

Research	Venue	Key Contribution
DexGraspNet	ICRA 2023 Finalist	Million-scale dexterous grasp dataset
UniDexGrasp++	ICCV 2023 Finalist	Large-scale RL, policy distillation
DexGraspNet 2.0	CoRL 2024	7 embodiments, 426M grasps
Dexonomy	RSS 2025	100+ grasp taxonomy
GraspVLA	2025	Billion-scale synthetic VLA

Sim2Real Solution

Research	Key Contribution
DexNDM	World model for sim2real gap correction

DexNDM Approach:

Train a generalist policy in the simulator
Train a neural dynamics model on a small amount of real data
Correct the sim2real gap by back-propagating through the differentiable dynamics model
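
The three steps can be illustrated with a one-dimensional toy problem. This is not the DexNDM algorithm; it only shares the same shape: a policy designed in simulation, a small differentiable dynamics model fitted to a handful of "real" transitions, and an action corrected by gradient descent through that model. The dynamics, gains, and function names are all invented for the example.

```python
import random
from typing import List, Tuple

Transition = Tuple[float, float, float]  # (state, action, next_state)


def real_step(x: float, a: float) -> float:
    """Hypothetical 'real' dynamics: weaker actuator gain plus a bias."""
    return x + 0.7 * a - 0.05


def sim_policy(x: float, target: float) -> float:
    """Step 1: optimal under the simulator's dynamics x' = x + a."""
    return target - x


def collect_real_data(n: int, seed: int = 0) -> List[Transition]:
    """Gather a small batch of real transitions with random actions."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, a = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        data.append((x, a, real_step(x, a)))
    return data


def fit_dynamics(data: List[Transition], epochs: int = 2000, lr: float = 0.1) -> Tuple[float, float]:
    """Step 2: fit x' - x = w * a + b by gradient descent on squared error."""
    w, b = 1.0, 0.0  # initialise at the simulator's dynamics
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, a, x_next in data:
            err = (w * a + b) - (x_next - x)
            grad_w += 2.0 * err * a / n
            grad_b += 2.0 * err / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def correct_action(x: float, target: float, a_init: float, w: float, b: float,
                   steps: int = 200, lr: float = 0.5) -> float:
    """Step 3: refine the action by differentiating the loss through the model."""
    a = a_init
    for _ in range(steps):
        pred = x + w * a + b
        a -= lr * 2.0 * (pred - target) * w  # gradient of (pred - target)^2 w.r.t. a
    return a


if __name__ == "__main__":
    w, b = fit_dynamics(collect_real_data(20))
    a_sim = sim_policy(0.0, 1.0)
    a_fixed = correct_action(0.0, 1.0, a_sim, w, b)
    print("sim action lands at  ", real_step(0.0, a_sim))
    print("fixed action lands at", real_step(0.0, a_fixed))
```

The simulator-designed action undershoots the target on the "real" system, while the corrected action reaches it, using only twenty real transitions to fit the model.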

Navigation

Research	Key Contribution
NavFoM	Cross-embodiment navigation foundation model
TrackVLA	30+ minute human tracking

Capability Assessment (6 Axes)

Axis	Rating	Evidence
Long-horizon	✓	Sim2real cloth folding (deformable-object manipulation)
Precision	✓	Driver, hammer manipulation via DexNDM
Deployment Robustness	✓	MTBF 1 month+, 30+ unmanned warehouses
Multi-task	✓	GraspVLA zero-shot 82.0% on LIBERO-Long
Cross-embodiment	✓	7 hand embodiments supported
Zero-shot	✓	Outperforms fine-tuned π0 on LIBERO without fine-tuning

Limitations

Acknowledged Limitations (from presentation)

Limitation	Description
Task Scope	Grasping-specialized, not yet generalist
Dexterous Sim2Real	Initially failed → solved with DexNDM
Specialist vs Generalist	Motion control still specialist policy

Analytical Limitations

Limitation	Description
Not Pure Sim2Real	Includes web grounding (100M+ real images)
Lack of Theoretical Explanation	No theory on why 1B frames converge
Embodiment Generalization	Franka Panda-centric, limited cross-robot transfer

Comparison with Competitors

Item	Galbot	Physical Intelligence	Figure
Deployment Scale	30+ cities + 30+ warehouses	Research demo	Research demo
Data Strategy	99% synthetic	Cross-embodiment real	VLA (Helix)
Strength	Only one of the three commercially deployed	Pursuing generality	Industrial optimization
Weakness	Grasping-specialized	Weak zero-shot	No consumer support