Sergey Levine

UC Berkeley Professor, Physical Intelligence Co-founder, Robot RL Expert

Profile

| Field | Details |
| --- | --- |
| Current Position | Associate Professor, UC Berkeley |
| Lab | RAIL (Robotic AI & Learning Lab) |
| Company | Physical Intelligence (Co-founder) |
| Previous | Google Research (2016-2024, concurrent with Berkeley) |
| PhD | Stanford University |

Key Contributions

  • RT Series Key Figure: Contributed to RT-1 and RT-2, which established the VLA (vision-language-action) paradigm
  • Robot Reinforcement Learning Pioneer: Developed RL methods that work on real robots
  • Open X-Embodiment: Led collaboration across 33 research labs
  • OpenVLA, Octo: Developed open-source VLA models
  • Physical Intelligence Founding: Co-founded general-purpose robot AI company

Research Timeline

PhD & Postdoc (2009-2016)

Stanford PhD -> UC Berkeley postdoc

| Year | Work | Impact |
| --- | --- | --- |
| 2013 | Guided Policy Search | Foundation for model-based RL |
| 2015 | End-to-End Visuomotor Policies | Direct image-to-action learning |
| 2016 | Deep RL for Robotic Manipulation | Practical robot RL |

UC Berkeley + Google (2016-2024)

UC Berkeley Faculty & Google Research Collaboration

| Year | Work | Impact |
| --- | --- | --- |
| 2018 | QT-Opt | Large-scale robot grasping |
| 2018 | Soft Actor-Critic (SAC) | One of the most widely used off-policy RL algorithms |
| 2020 | Offline RL Survey | Helped establish offline RL as a field |
| 2021 | Trajectory Transformer | RL as sequence modeling |

RT Series & VLA (2022-2024)

RT Series, Open-source VLA

| Year | Work | Impact |
| --- | --- | --- |
| 2022 | RT-1 | First large-scale Robotics Transformer |
| 2023 | RT-2 | First VLA; actions represented as language tokens |
| 2023 | RT-X | Open X-Embodiment, 33-lab collaboration |
| 2024 | Octo | 93M-parameter open-source generalist policy |
| 2024 | OpenVLA | 7B-parameter open-source VLA |

Physical Intelligence (2024-present)

Co-founded Physical Intelligence; developing pi0, a generalist robot policy


Major Publications

Reinforcement Learning

  • SAC (Soft Actor-Critic) - One of the most widely used off-policy RL algorithms
  • CQL (Conservative Q-Learning) - Core offline RL method
  • Trajectory Transformer - RL as sequence modeling (concurrent with Decision Transformer)
  • Offline RL Tutorial/Survey (2020) - Helped establish the field

Robot Learning

  • RT-1 (2022) - Robotics Transformer
  • RT-2 (2023) - Vision-Language-Action
  • RT-X (2023) - Open X-Embodiment
  • Octo (2024) - Open-source generalist policy
  • OpenVLA (2024) - 7B open-source VLA

End-to-End Learning

  • End-to-End Training of Deep Visuomotor Policies (2016)
  • QT-Opt (2018) - Large-scale robot grasping

Key Ideas

SAC (Soft Actor-Critic, 2018)

Core: Maximum entropy RL - maximize reward + maximize policy entropy

J(pi) = sum_t E_{(s_t, a_t) ~ rho_pi} [ r(s_t, a_t) + alpha * H(pi(. | s_t)) ]

Impact:

  • One of the most widely used continuous-control RL algorithms
  • Standard across robotics, games, and simulation
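The entropy-regularized objective can be illustrated with a small numeric sketch. This is not SAC itself (no critics, no replay buffer), just a Monte-Carlo-style view of the objective above for a diagonal Gaussian policy; all function names here are illustrative.

```python
import numpy as np

def gaussian_entropy(sigma):
    """Differential entropy of a diagonal Gaussian policy:
    0.5 * log(2*pi*e*sigma^2) summed over action dimensions."""
    return 0.5 * np.sum(np.log(2 * np.pi * np.e * sigma ** 2))

def soft_objective(rewards, sigmas, alpha=0.2):
    """Estimate of J(pi) = sum_t E[ r(s_t, a_t) + alpha * H(pi(.|s_t)) ]:
    each per-step reward is augmented with an entropy bonus."""
    return sum(r + alpha * gaussian_entropy(s) for r, s in zip(rewards, sigmas))
```

For the same reward, a wider (more exploratory) policy scores higher under the soft objective than a nearly deterministic one, which is exactly the trade-off the temperature alpha controls.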

RT-2 & Action as Language (2023)

Core: Represent robot actions as text tokens to integrate with VLM

[Image + Language instruction] -> VLM -> [Action tokens] -> Robot control

Impact:

  • Beginning of VLA paradigm
  • Widely adopted by subsequent VLA models
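The "action as language" idea amounts to discretizing each continuous action dimension into a fixed number of bins so actions can be emitted as tokens by a VLM. The sketch below shows that quantization round trip; the bin count, action bounds, and function names are illustrative assumptions, not RT-2's exact scheme.

```python
import numpy as np

# Illustrative bin count; RT-2-style tokenizers use a fixed small vocabulary
# of bins per action dimension.
N_BINS = 256

def action_to_tokens(action, low=-1.0, high=1.0):
    """Discretize each continuous action dimension into one of N_BINS bins."""
    clipped = np.clip(action, low, high)
    bins = ((clipped - low) / (high - low) * (N_BINS - 1)).round().astype(int)
    return bins.tolist()

def tokens_to_action(tokens, low=-1.0, high=1.0):
    """Invert the discretization (exact up to quantization error)."""
    return low + np.array(tokens) / (N_BINS - 1) * (high - low)
```

Decoding recovers the original action to within one bin width, which is why a language-model head over these tokens suffices for robot control.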

Open X-Embodiment (2023)

  • Collaboration across 33 research labs
  • 22 robot embodiments, 1M+ episodes
  • Democratized research with open-source dataset

Philosophy & Direction

Research Philosophy

“The key to robot learning is data. More data, more diverse data is the key to generalization.”

Research Direction Evolution

  1. 2009-2015: Model-based RL, trajectory optimization
  2. 2015-2018: End-to-end deep RL
  3. 2018-2021: Off-policy RL, offline RL
  4. 2021-2023: Large-scale robot learning, foundation models
  5. 2023-present: VLA, general-purpose robot AI

Students & Mentees

Sergey Levine lab alumni:

  • Aviral Kumar (CMU Professor, Google DeepMind Researcher)
  • Justin Fu (Google DeepMind)
  • Michael Janner (UC Berkeley)
  • Numerous robotics/RL researchers

Key collaborators:

  • Chelsea Finn (Stanford Professor, Physical Intelligence Co-founder)
  • Pieter Abbeel (UC Berkeley Professor)

Awards & Recognition

  • Presidential Early Career Award for Scientists and Engineers (PECASE)
  • Sloan Research Fellowship (2019)
  • CoRL Best Systems Paper (QT-Opt, 2018)
  • Multiple conference Best Paper Awards

