Gemini Robotics
Home > Models > Gemini Robotics
Key Significance
- Pinnacle of Large LLM-Based VLAs: Representative case of applying internet-scale knowledge to robotics based on Gemini 2.0
- Cross-Embodiment Support: Single model supports various robot forms from bimanual stationary (ALOHA, Franka) to humanoids (Apptronik Apollo)
- System 1/2 Architecture: In Gemini Robotics 1.5, “Think before action” - high-level reasoning/judgment in cloud, real-time action generation on-device
- On-Device Version: Network-independent low-latency inference considering actual deployment environments
- CoRL 2025 Live Demo: Operated booths at the conference where general attendees could directly experience actual operation
- Expanded Data vs RT Series: Training data scale significantly expanded compared to RT-1/2, though specific numbers undisclosed
- Industry Partnerships: Major humanoid companies like Boston Dynamics, Agility Robotics participating as trusted testers

Gemini Robotics: Gemini 2.0-based VLA Model Family
Overview
Gemini Robotics is a Gemini 2.0-based robotics model series announced by Google DeepMind in March 2025. It adds physical actions as a new output modality to directly control robots.
| Item | Details |
|---|---|
| Announced | March 12, 2025 |
| Base | Gemini 2.0 |
| Paper | arXiv:2503.20020 |
| Official | deepmind.google/models/gemini-robotics |
Model Family
Gemini Robotics (Base)
VLA model that adds physical action output to Gemini 2.0.
Gemini Robotics-ER
Version with Embodied Reasoning capabilities.
- Advanced spatial understanding
- Robotics program execution support
Gemini Robotics On-Device (2025.06)
Most powerful VLA model that runs locally on robotic devices.
| Feature | Description |
|---|---|
| Local Execution | Network independent |
| Low Latency | Suitable for latency-sensitive applications |
| Robustness | Intermittent/zero connectivity environments |
Gemini Robotics 1.5
Most powerful version.
- Visual information → Motor command conversion
- “Think before action”
- Transparent display of work process
Core Capabilities
Three essential characteristics for robotics AI:
| Characteristic | Description |
|---|---|
| General | Adapts to various situations |
| Interactive | Quickly responds to instructions/environmental changes |
| Dexterous | Performs delicate tasks with fingers |
Performance
2x+ performance over SOTA VLA on generalization benchmarks
Performable Tasks
- Origami
- Packing lunch box
- Preparing salad
- Other precision manipulation tasks
Robot Compatibility
Single model supports various robot forms:
| Robot Type | Examples |
|---|---|
| Bimanual Stationary | ALOHA, Bi-arm Franka |
| Humanoid | Apptronik Apollo |
Industry Partners
Trusted Testers (Gemini Robotics-ER)
- Agile Robots
- Agility Robots
- Boston Dynamics
- Enchanted Tools
Strategic Partner
- Apptronik: Developing next-generation humanoid with Gemini 2.0
Relationship with RT Series
| Model | Period | Features |
|---|---|---|
| RT-1 | 2022 | First Robotics Transformer |
| RT-2 | 2023 | First VLA (PaLM-E based) |
| RT-X | 2023 | Multi-robot |
| Gemini Robotics | 2025 | Gemini 2.0 based, most powerful |
References
- Google DeepMind - Gemini Robotics
- Blog - Gemini Robotics
- Blog - On-Device
- arXiv Paper
- How Google Built Gemini Robotics