Karol Hausman
Profile
| Field | Details |
|---|---|
| Current Position | Co-founder & CEO, Physical Intelligence |
| Previous | Staff Research Scientist, Google DeepMind |
| PhD | USC (University of Southern California) |
| Nationality | Polish |
Key Contributions
- RT series: led development of RT-1, RT-2, and RT-X
- SayCan: early research connecting LLM planning to real robot execution
- Physical Intelligence: co-founded the company; drove pi0 development
- Google Robotics: key figure in scaling VLA research from lab prototypes to large real-robot systems
Research Timeline
PhD & Early Career (2012-2017)
USC - Advised by Stefan Schaal
| Year | Work | Impact |
|---|---|---|
| 2015 | Skill Learning | Early work on autonomous robot skill acquisition |
| 2017 | Multi-Task Learning | Sharing knowledge across multiple robot tasks |
Google Brain / DeepMind (2017-2024)
Core Google Robotics Research
| Year | Work | Impact |
|---|---|---|
| 2018 | Joined | Google Brain Robotics |
| 2020 | Multi-Task RL | Multi-task reinforcement learning on real robots |
| 2022 | SayCan | LLM + robot grounding |
| 2022 | RT-1 | Robotics Transformer |
| 2023 | RT-2 | Introduced the vision-language-action (VLA) model |
| 2023 | RT-X | Open X-Embodiment cross-robot dataset and models |
| 2023 | PaLM-E | Embodied multimodal language model |
Physical Intelligence (2024-present)
Co-founding & pi0 Development
| Year | Work | Impact |
|---|---|---|
| 2024 | Physical Intelligence Founded | General-purpose robot AI |
| 2024 | pi0 | Flow matching VLA |
| 2025 | pi0.5 | Open-world generalization |
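Unlike the RT series, pi0 does not emit discrete action tokens: it generates continuous action chunks with flow matching. Below is a minimal sketch of one conditional flow-matching training step under that framing; the tiny linear regressor, `ACTION_DIM`, and all constants are illustrative assumptions standing in for pi0's VLM backbone.

```python
import numpy as np

rng = np.random.default_rng(0)

ACTION_DIM = 7  # assumed: e.g. end-effector deltas + gripper
# Toy linear regressor standing in for the VLA backbone; pi0 instead
# conditions a large VLM on images and language.
W = rng.normal(size=(ACTION_DIM + 1, ACTION_DIM)) * 0.1

def predict_velocity(x_t: np.ndarray, t: float) -> np.ndarray:
    """Stand-in for the velocity network v_theta(x_t, t)."""
    return np.concatenate([x_t, [t]]) @ W

def flow_matching_loss(action: np.ndarray) -> float:
    """One sample of the conditional flow-matching objective."""
    t = rng.uniform()                    # random time in [0, 1]
    noise = rng.normal(size=ACTION_DIM)  # x_0 ~ N(0, I)
    x_t = (1 - t) * noise + t * action   # straight-line path from noise to data
    target_v = action - noise            # the path's velocity (constant in t)
    pred_v = predict_velocity(x_t, t)
    return float(np.mean((pred_v - target_v) ** 2))

print(flow_matching_loss(rng.normal(size=ACTION_DIM)))
```

At inference time, an action chunk is sampled by integrating the learned velocity field from Gaussian noise to t = 1 over a few steps, which lets the policy output smooth continuous actions rather than token-by-token predictions.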
Major Publications
VLA & Foundation Models
- RT-1 (2022) - Robotics Transformer
- RT-2 (2023) - Vision-Language-Action
- RT-X (2023) - Open X-Embodiment
- PaLM-E (2023) - Embodied multimodal model
- pi0 (2024) - Flow matching VLA
LLM + Robotics
- SayCan (2022) - Grounding LLM plans in robot affordances
- Inner Monologue (2022) - Closed-loop language feedback for robot planning
Multi-Task Learning
- Multi-Task RL (2018)
- Skill Composition (2019)
Key Ideas
SayCan (2022)
Core idea: combine the LLM's semantic understanding of a task with the robot's actual execution capabilities.
score(skill) = P(skill is useful | instruction, LLM) x P(skill succeeds | state, value function)
LLM: "what should be done" (semantics)
Robot: "what can be done" (affordances)
Impact:
- Core early research in LLM + robot integration
- Foundation for many subsequent LLM-robot studies
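The scoring rule above is easy to see in miniature. In the sketch below, `llm_usefulness` and `affordance` are hypothetical stand-ins: in the real system the first comes from a large language model scoring candidate skill descriptions, and the second from value functions trained on robot experience.

```python
# Toy skill library; in SayCan these are natural-language descriptions
# of pretrained low-level skills.
SKILLS = ["pick up the sponge", "go to the counter", "open the drawer"]

def llm_usefulness(instruction: str, skill: str) -> float:
    """Stand-in for P(useful | LLM): the LLM's probability that `skill`
    is a good next step for `instruction`. Here: crude word overlap,
    purely illustrative."""
    inst, sk = set(instruction.lower().split()), set(skill.lower().split())
    return (len(inst & sk) + 1) / (len(sk) + 1)

def affordance(skill: str) -> float:
    """Stand-in for P(success | robot): a learned value function that
    estimates whether the skill can succeed from the current state.
    Hard-coded here for illustration."""
    return {"pick up the sponge": 0.9,
            "go to the counter": 0.8,
            "open the drawer": 0.3}[skill]

def select_skill(instruction: str) -> str:
    # SayCan's combined score: what is useful (LLM) x what is possible (robot).
    scores = {s: llm_usefulness(instruction, s) * affordance(s) for s in SKILLS}
    return max(scores, key=scores.get)

print(select_skill("clean the spill with the sponge"))  # -> pick up the sponge
```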
RT-2 & VLA (2023)
Core idea: use a vision-language model (VLM) directly as the robot policy.
Previous pipelines: separate perception, planning, and control modules.
RT-2: a single VLM maps camera images and instructions directly to actions, emitted as discretized tokens.
Impact:
- Established VLA paradigm
- Foundation model application to robotics
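A concrete way to see "directly outputs actions" is RT-2's action representation: each continuous action dimension is discretized into 256 bins, so an action becomes a short string of ordinary tokens the VLM can emit. The round-trip below is a minimal sketch; the normalized action range and the 7-DoF layout are illustrative assumptions, not RT-2's exact spec.

```python
import numpy as np

NUM_BINS = 256
ACTION_LOW, ACTION_HIGH = -1.0, 1.0  # assumed normalized action range

def action_to_tokens(action: np.ndarray) -> list[int]:
    """Map a continuous action vector to per-dimension bin indices."""
    clipped = np.clip(action, ACTION_LOW, ACTION_HIGH)
    scaled = (clipped - ACTION_LOW) / (ACTION_HIGH - ACTION_LOW)
    return list((scaled * (NUM_BINS - 1)).round().astype(int))

def tokens_to_action(tokens: list[int]) -> np.ndarray:
    """Decode bin indices back to (quantized) continuous actions."""
    scaled = np.array(tokens, dtype=float) / (NUM_BINS - 1)
    return scaled * (ACTION_HIGH - ACTION_LOW) + ACTION_LOW

# 7-DoF example: 3 position deltas, 3 rotation deltas, 1 gripper.
a = np.array([0.1, -0.25, 0.0, 0.5, 0.0, 0.0, 1.0])
print(action_to_tokens(a))                    # [140, 96, 128, 191, 128, 128, 255]
print(tokens_to_action(action_to_tokens(a)))  # approximately recovers `a`
```

Because actions are just tokens, the same pretraining, architecture, and sampling machinery used for web-scale vision-language data carries over to control unchanged.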
Philosophy & Direction
Research Philosophy
“The key to robot AI is generalization. The goal is general capability, not specific tasks.”
Research Direction Evolution
- 2012-2017: Skill learning, multi-task RL
- 2017-2022: Large-scale robot learning at Google
- 2022-2023: LLM + robotics, VLA models
- 2024-present: Foundation models, Physical Intelligence
From Google to Physical Intelligence
Achievements at Google
- Established VLA paradigm with RT series
- Initiated LLM-robot integration with SayCan
- Key figure in Google Robotics
Motivation for Founding
- Move beyond academic research to shipping real products
- Commercialize general-purpose robot AI
- Operate with a startup's speed and focus