Figure Helix

Figure AI's VLA Model for Humanoid Robots

Author’s Note

  • Helix 02 is the first humanoid VLA to control the full body with a single neural network.
  • Performing 61 sequential actions over 4 minutes without resets is strong evidence of real-world capability rather than a cherry-picked demo.
  • It is impressive that the three-layer System 0/1/2 architecture pushes the control frequency to 1 kHz.

Key Significance

  • First Full-Body Autonomous Humanoid: Single neural system that controls the full body directly from pixels
  • Longest Autonomous Task: 4-minute continuous operation with 61 sequential loco-manipulation actions, no resets or human intervention
  • Three-Layer Architecture: System 0 (1 kHz) + System 1 (200 Hz) + System 2 (semantic reasoning)
  • New Sensors: Palm cameras and tactile sensors detecting forces as small as 3 g
  • Replaces 100K Lines: A 10-million-parameter neural network replaces 109,504 lines of hand-engineered C++ code

Figure Helix 02 Official Demo Video (2026.01)


Overview

Item | Details
Latest Version | Helix 02 (January 2026)
Company | Figure AI
Blog | figure.ai/news/helix-02
Robot | Figure 02

Helix 02 (2026.01)

Key Advancements

Figure describes Helix 02 as “a single neural system that controls the full body directly from pixels,” the first such system for a humanoid robot.

Feature | Helix (2025.02) | Helix 02 (2026.01)
Control Range | Upper body focused | Full body (locomotion + manipulation integrated)
Architecture | System 1/2 | System 0/1/2
Max Frequency | 200 Hz | 1 kHz
Tactile Sensors | None | Detects 3 g forces
Palm Cameras | None | Yes

4-Minute Continuous Autonomous Task

Dishwasher loading/unloading across an entire kitchen:

Item | Value
Continuous Operation Time | 4 minutes
Sequential Loco-manipulation Actions | 61
Resets/Human Intervention | None
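
As a rough sanity check on the pace these figures imply, the arithmetic can be sketched as below. It assumes the run lasted exactly 4 minutes and that actions were evenly spaced, neither of which the source states; the variable names are illustrative only.

```python
# Figures reported for the dishwasher task.
duration_s = 4 * 60   # ASSUMPTION: exactly 4 minutes of continuous operation
num_actions = 61      # sequential loco-manipulation actions

# Average pace if the actions were evenly spaced (an assumption).
avg_seconds_per_action = duration_s / num_actions
actions_per_minute = num_actions / (duration_s / 60)

print(f"{avg_seconds_per_action:.2f} s per action")
print(f"{actions_per_minute:.2f} actions per minute")
```

Under those assumptions the robot completes one action roughly every 4 seconds, or about 15 actions per minute.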

Dexterity Demonstrations

  • Unscrewing bottle caps with torque control
  • Extracting individual pills from organizers despite occlusion
  • Dispensing precisely 5ml from syringes
  • Singulating small metal pieces from cluttered bins

Architecture: System 0/1/2

Helix 02 uses a three-layer hierarchical architecture.

Helix 02 System 0/1/2 Architecture Explanation

System 0 (S0) - Physical Execution Layer

Item | Details
Frequency | 1 kHz
Parameters | 10 million
Training Data | 1,000+ hours of human motion data
Simulation | 200,000+ parallel environments with domain randomization
Replaced Code | 109,504 lines of hand-engineered C++

Handles real-time balance and contact management.
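
To give a feel for what a 1 kHz loop with a 10-million-parameter network implies, here is a back-of-the-envelope sketch. The float32 weight format and the "each parameter used once per forward pass" cost model are assumptions made for illustration; the source does not describe the network's precision or structure.

```python
S0_RATE_HZ = 1_000        # System 0 control frequency, from the table
S0_PARAMS = 10_000_000    # System 0 parameter count, from the table
BYTES_PER_PARAM = 4       # ASSUMPTION: float32 weights

# Time available for one control step, including inference.
step_budget_ms = 1_000 / S0_RATE_HZ

# Size of the weights under the float32 assumption.
weights_mb = S0_PARAMS * BYTES_PER_PARAM / 1e6

# ASSUMPTION: a dense network where each parameter contributes one
# multiply-accumulate per forward pass.
macs_per_second = S0_PARAMS * S0_RATE_HZ

print(f"step budget: {step_budget_ms} ms")
print(f"weights: {weights_mb} MB")
print(f"compute: {macs_per_second / 1e9} GMAC/s")
```

Under these assumptions each step has a 1 ms budget, which helps explain why the fastest layer is also a small one.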

System 1 (S1) - Visuomotor Control

Item | Details
Frequency | 200 Hz
Architecture | Transformer conditioned on System 2 latents
Structure | “Pixels-to-whole-body” (all sensors → all actuators)

Sensor Inputs:

  • Head cameras
  • Palm cameras
  • Fingertip tactile sensors
  • Proprioception

Actuator Outputs:

  • Complete joint control: legs, torso, head, arms, wrists, individual fingers
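
The "all sensors → all actuators" interface can be pictured as a single function from one observation record to one action record. The sketch below is hypothetical: the class names, field names, and the single zero target per joint group are placeholders chosen for illustration, not Figure's data format.

```python
from dataclasses import dataclass
from typing import Dict, List

# Joint groups named in the text above. Real joint counts are not given.
JOINT_GROUPS = ["legs", "torso", "head", "arms", "wrists", "fingers"]


@dataclass
class Observation:
    """Hypothetical container for the four sensor inputs listed above."""
    head_images: List[bytes]
    palm_images: List[bytes]
    fingertip_forces: List[float]
    proprioception: List[float]


@dataclass
class Action:
    """Hypothetical container: one list of joint targets per joint group."""
    joint_targets: Dict[str, List[float]]


def hold_pose_policy(obs: Observation) -> Action:
    """Trivial stand-in for System 1: one zero target for every group."""
    return Action({group: [0.0] for group in JOINT_GROUPS})


empty_obs = Observation([], [], [], [])
action = hold_pose_policy(empty_obs)
print(sorted(action.joint_targets))
```

The point of the sketch is only the shape of the interface: one policy call consumes every sensor stream and emits targets for every joint group, with no separate locomotion and manipulation controllers.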

System 2 (S2) - Semantic Reasoning

  • Processes visual scenes and language commands
  • Sequences high-level behaviors (“Walk to dishwasher and open it”)
  • Generates latent goals without specifying low-level coordination
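
One way to picture how the three layers interleave is a shared 1 kHz tick clock on which the slower layers fire every N ticks. This is a simplified, synchronous sketch rather than Figure's implementation: the System 2 rate of 8 Hz is a placeholder (the source gives no rate for System 2 in Helix 02), and the functions `s2_plan`, `s1_policy`, and `s0_track` are hypothetical stand-ins.

```python
S0_HZ = 1_000  # System 0 rate, from the text
S1_HZ = 200    # System 1 rate, from the text
S2_HZ = 8      # ASSUMPTION: placeholder; no System 2 rate is given for Helix 02


def s2_plan(command: str) -> str:
    """Hypothetical stand-in for System 2: returns a fake latent goal."""
    return f"latent({command})"


def s1_policy(latent: str) -> str:
    """Hypothetical stand-in for System 1: returns a fake whole-body target."""
    return f"target<{latent}>"


def s0_track(target: str) -> None:
    """Hypothetical stand-in for System 0: a real layer would track the target."""


def run_schedule(duration_s: float, command: str) -> dict:
    """Count how often each layer fires on a shared 1 kHz tick clock."""
    s1_every = S0_HZ // S1_HZ  # S1 fires every 5 base ticks
    s2_every = S0_HZ // S2_HZ  # S2 fires every 125 base ticks
    counts = {"S0": 0, "S1": 0, "S2": 0}
    latent, target = "", ""
    for tick in range(round(duration_s * S0_HZ)):
        if tick % s2_every == 0:   # slowest layer refreshes the latent goal
            latent = s2_plan(command)
            counts["S2"] += 1
        if tick % s1_every == 0:   # middle layer refreshes the joint target
            target = s1_policy(latent)
            counts["S1"] += 1
        s0_track(target)           # fastest layer runs on every tick
        counts["S0"] += 1
    return counts


print(run_schedule(1.0, "Walk to dishwasher and open it"))
# {'S0': 1000, 'S1': 200, 'S2': 8}
```

Each faster layer simply reuses the most recent output of the layer above it, which is why System 2 can specify goals without dictating low-level coordination. A real system would more likely run the layers asynchronously.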

New Hardware Capabilities

Palm Cameras

  • Cameras mounted on palms
  • Enable in-hand visual feedback when objects are occluded from head perspective

Tactile Sensors

  • Sensitivity: Detects forces as small as 3g (“sensitive enough to feel a paperclip”)
  • Force-modulated grasping across five-fingered hands
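
To put the 3 g figure in force units and show what force-modulated grasping could look like in its simplest form, here is a minimal sketch. It reads "3 g" as the weight of a 3-gram mass under standard gravity, and the proportional `grip_step` update, its gain, and the normalized grip command are illustrative assumptions, not Figure's controller.

```python
G = 9.80665  # standard gravity in m/s^2


def grams_to_newtons(grams: float) -> float:
    """Weight in newtons of a mass given in grams."""
    return grams / 1000.0 * G


# Smallest detectable force, reading "3 g" as the weight of a 3-gram mass.
MIN_DETECTABLE_N = grams_to_newtons(3.0)


def grip_step(command: float, measured_n: float, target_n: float,
              gain: float = 0.5) -> float:
    """One proportional update of a normalized grip command in [0, 1].

    ASSUMPTION: the gain (command units per newton) is an arbitrary
    illustrative value.
    """
    error = target_n - measured_n
    if abs(error) < MIN_DETECTABLE_N:
        return command  # error is below sensor resolution: hold the grip
    return min(1.0, max(0.0, command + gain * error))


print(f"{MIN_DETECTABLE_N:.4f} N")
print(grip_step(0.2, measured_n=0.0, target_n=1.0))
```

The 3 g threshold works out to roughly 0.03 N. In this sketch it doubles as a deadband: the grip is left alone once the force error is smaller than what the sensor can resolve.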

Helix (2025.02) - Initial Version

Helix Architecture

Helix Architecture: System 1 (200Hz low-level control) + System 2 (7-9Hz high-level planning)
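
The gap between the two rates determines how many System 1 steps reuse the same System 2 output. The short calculation below uses only the rates quoted above; the function name is illustrative.

```python
S1_HZ = 200             # System 1 rate, from the text
S2_HZ_RANGE = (7, 9)    # System 2 rate range, from the text


def s1_steps_per_s2_update(s1_hz: float, s2_hz: float) -> float:
    """Average number of System 1 steps between two System 2 updates."""
    return s1_hz / s2_hz


for s2_hz in S2_HZ_RANGE:
    steps = s1_steps_per_s2_update(S1_HZ, s2_hz)
    period_ms = 1000 / s2_hz  # time between System 2 updates
    print(f"S2 at {s2_hz} Hz: ~{steps:.1f} S1 steps per update, "
          f"one update every {period_ms:.0f} ms")
```

System 1 therefore acts on the same high-level plan for roughly 22 to 29 consecutive steps, over about 111 to 143 ms, before System 2 refreshes it.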

Key Features

  • First VLA to provide high-rate control of the full humanoid upper body
  • Controls upper body (wrists, torso, head, individual fingers) at 200 Hz
  • Simultaneous control of two robots

Table-to-Dishwasher Task

Item | Value
Distance Traveled | 130+ feet
Unique Interactions | 33
Number of Objects | 21 (including delicate dishes)

Hardware: Figure 02

Item | Spec
DoF | 35
Hands | Human-like wrists, hands, fingers
Computing | NVIDIA RTX GPU (3x previous generation)
Cameras | Head 6x RGB + Palm cameras (Helix 02)
Tactile Sensors | Fingertips (Helix 02)
Hand Payload | Up to 25 kg

Hardware: Figure 03 (2025.10)

Figure 03 is the third-generation humanoid robot designed for home environments.

Figure 02 vs Figure 03

Feature | Figure 02 | Figure 03
Target Environment | Industrial (BMW factory) | Home
Weight | Baseline | 9% lighter
Exterior | Hard machined parts | Soft textiles + multi-density foam
Battery | - | 5-hour runtime, 2 kW wireless charging
Data Transfer | - | 10 Gbps mmWave

Vision System

Item | Spec
Total Cameras | 8 (Head 6 + Palm 2)
Frame Rate | 2x improved
Latency | Reduced to 1/4
Field of View | 60% wider

Tactile Sensors

  • Fingertip sensors: Detect forces as small as 3g (paperclip weight)
  • Custom-developed first-generation tactile sensors

Battery & Charging

Item | Spec
Model | F.03 Battery
Runtime | ~5 hours
Charging | 2 kW wireless inductive charging (foot coils)
Certification | UN38.3, targeting UL standard
Data Offload | 10 Gbps mmWave (for fleet learning)
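
To get a sense of what a 10 Gbps offload link means in practice, the transfer time for a given amount of logged data can be estimated as below. The 100 GB dataset size is a hypothetical example, and the calculation assumes the full line rate is achieved with no protocol overhead, which real links will not reach.

```python
def offload_seconds(dataset_gigabytes: float, link_gbps: float = 10.0) -> float:
    """Ideal transfer time in seconds, ignoring overhead (an assumption)."""
    bits = dataset_gigabytes * 8e9     # decimal gigabytes to bits
    return bits / (link_gbps * 1e9)    # bits divided by bits per second


# HYPOTHETICAL dataset size; the source does not say how much data a robot logs.
print(offload_seconds(100.0))  # 80.0
```

At the quoted rate, every 100 GB of logged data takes at least 80 seconds to upload.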

Safety Design

  • Covered in soft textiles (washable, tool-free removal)
  • Multi-density foam eliminates pinch points
  • Optional: Cut-resistant garments
  • Customizable side screens (fleet branding)

Manufacturing: BotQ

Item | Details
Current Capacity | 12,000 units/year
4-Year Goal | 100,000 total units
Manufacturing | Shift from CNC to die-casting, injection molding, stamping
Expected Price | ~$20,000
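
The capacity and goal figures can be compared with a quick calculation. It assumes the 12,000 units/year figure would hold constant for all four years, which is a simplification made only to show the size of the gap.

```python
current_capacity_per_year = 12_000   # from the table
goal_total_units = 100_000           # from the table
goal_years = 4                       # from the table

# ASSUMPTION: capacity stays flat at today's level for the whole period.
units_at_current_rate = current_capacity_per_year * goal_years
required_average_per_year = goal_total_units / goal_years
shortfall = goal_total_units - units_at_current_rate

print(units_at_current_rate)      # 48000
print(required_average_per_year)  # 25000.0
print(shortfall)                  # 52000
```

At today's capacity the line would produce 48,000 units in four years, so reaching the goal implies roughly doubling average output to 25,000 units per year.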

Figure AI History

Date | Event
2022 | Figure 01 - First prototype, bipedal, targeting logistics/warehousing
2024.08 | Figure 02 - Industrial deployment, BMW factory testing
2025.02 | Helix - First humanoid VLA with full upper-body control
2025.10 | Figure 03 - Home humanoid, soft design
2026.01 | Helix 02 - Full-body autonomy, System 0/1/2 architecture

Funding & Partnerships

Timing | Details
2024.01 | BMW partnership (automotive manufacturing deployment)
2024.02 | $675M funding (valuation $2.6B)
Investors | Jeff Bezos, Microsoft, NVIDIA, Intel, Amazon, OpenAI
