DeepMind Foundation Models Talk

SPEAKER: Ted Xiao, Google DeepMind

TITLE: What’s Missing for Robotics-first Foundation Models?

ABSTRACT:

Intelligent robotics have seen tremendous progress in recent years. In this talk, I propose that trends in robot learning have historically almost exactly followed key trends in broader foundation modeling. After covering robotics projects which showcase the power of following such foundation modeling paradigms, I will focus on a few future-looking research directions which may suggest a unique future for how robot learning systems may develop differently from LLMs and VLMs.

Notes:

Current approaches are "pineapple pen" techniques, i.e. separately built modules stuck together:

VLM (frozen) -> LLM (frozen) -> Control Policy

Problems

not optimized for robotics
narrow communication bandwidth between modules
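
The bandwidth problem can be illustrated with a toy sketch of the modular pipeline. This is not code from the talk: `frozen_vlm`, `frozen_llm`, and `control_policy` are hypothetical stand-ins, and the assumption that the modules talk to each other only through short text strings is mine. Under that assumption, two different camera images that collapse to the same caption produce exactly the same action, because the policy never sees the pixels.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    pixels: List[float]   # toy stand-in for a camera image
    instruction: str      # natural-language command


def frozen_vlm(pixels: List[float]) -> str:
    """Hypothetical frozen VLM: compresses the image into a short caption."""
    mean = sum(pixels) / len(pixels)
    return "bright scene" if mean > 0.5 else "dark scene"


def frozen_llm(caption: str, instruction: str) -> str:
    """Hypothetical frozen LLM: turns caption + instruction into a text plan."""
    return f"plan: {instruction} ({caption})"


def control_policy(plan: str) -> List[float]:
    """Toy policy: it only ever sees the plan text, never the pixels."""
    speed = 1.0 if "bright" in plan else 0.2
    return [speed, 0.0]


def run_pipeline(obs: Observation) -> List[float]:
    # Each arrow in "VLM -> LLM -> Control Policy" is a plain string here.
    caption = frozen_vlm(obs.pixels)
    plan = frozen_llm(caption, obs.instruction)
    return control_policy(plan)


# Two visibly different images that map to the same caption.
a = Observation(pixels=[0.9, 0.8, 0.7], instruction="pick up the pen")
b = Observation(pixels=[0.6, 0.6, 0.6], instruction="pick up the pen")
action_a = run_pipeline(a)
action_b = run_pipeline(b)
```

Here `action_a` and `action_b` are identical even though the inputs differ: whatever the caption drops is lost to the controller, and since the upstream modules are frozen, nothing can be tuned to preserve it.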

Foundation models in robotics optimize a single model for all subtasks at once: vision, language, and control

Missing pieces

#1 positive transfer in scale

#2 steerability and promptability

#3 scalable evaluation