Everyday Robots

Canvas Category Machinery : Industrial Robot : Autonomous Mobile Robot

Primary Location Mountain View, California, United States

Born from X, the moonshot factory, and working alongside teams at Google, we’re building a new type of robot. One that can learn by itself, to help anyone with (almost) anything.

Assembly Line

Towards Helpful Robots: Grounding Language in Robotic Affordances

📅 Date: August 16, 2022

✍️ Authors: Brian Ichter, Karol Hausman

🔖 Topics: Industrial Robot, Natural Language Processing

🏢 Organizations: Google, Everyday Robots

In “Do As I Can, Not As I Say: Grounding Language in Robotic Affordances”, we present a novel approach, developed in partnership with Everyday Robots, that leverages advanced language model knowledge to enable a physical agent, such as a robot, to follow high-level textual instructions for physically grounded tasks, while grounding the language model in tasks that are feasible within a specific real-world context. We evaluate our method, which we call PaLM-SayCan, by placing robots in a real kitchen setting and giving them tasks expressed in natural language. We observe highly interpretable results for temporally extended, complex, and abstract tasks, like “I just worked out, please bring me a snack and a drink to recover.” Specifically, we demonstrate that grounding the language model in the real world nearly halves errors over non-grounded baselines. We are also excited to release a robot simulation setup where the research community can test this approach.
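The grounding idea described above can be illustrated with a minimal sketch. It assumes that each candidate skill carries two scores: how relevant the language model considers it to the instruction, and how likely it is to succeed from the robot's current state. The sketch picks the skill with the highest product of the two. The class name, function name, skill strings, and all numbers are hypothetical and chosen only for illustration; this is not the released PaLM-SayCan code.

```python
from dataclasses import dataclass


@dataclass
class SkillScore:
    skill: str
    llm_prob: float    # hypothetical: language model's estimate that the skill helps with the instruction
    affordance: float  # hypothetical: estimated chance the skill succeeds from the current state

    @property
    def combined(self) -> float:
        # A skill must be both useful (llm_prob) and feasible (affordance) to score well.
        return self.llm_prob * self.affordance


def select_skill(candidates: list[SkillScore]) -> SkillScore:
    """Return the candidate with the highest combined score."""
    if not candidates:
        raise ValueError("at least one candidate skill is required")
    return max(candidates, key=lambda c: c.combined)


# Illustrative scores for an instruction such as "bring me an apple"
# when no apple is currently in view.
candidates = [
    SkillScore("find an apple", llm_prob=0.30, affordance=0.90),
    SkillScore("pick up the apple", llm_prob=0.50, affordance=0.10),
    SkillScore("go to the table", llm_prob=0.20, affordance=0.80),
]

language_only = max(candidates, key=lambda c: c.llm_prob)
grounded = select_skill(candidates)

print("language model alone:", language_only.skill)
print("grounded choice:", grounded.skill)
```

With these made-up numbers, the language model alone prefers picking up the apple, which is not feasible yet, while the grounded score prefers finding it first. Repeating the selection after each executed skill would yield a step-by-step plan.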

Can Robots Follow Instructions for New Tasks?

📅 Date: February 2, 2022

✍️ Authors: Chelsea Finn, Eric Jang

🔖 Topics: Robotics, Natural Language Processing, Imitation Learning

🏢 Organizations: Google, Everyday Robots

The results of this research show that simple imitation learning approaches can be scaled in a way that enables zero-shot generalization to new tasks. That is, they provide one of the first indications of robots being able to successfully carry out behaviors that were not in the training data. Interestingly, language embeddings pre-trained on ungrounded language corpora make for excellent task conditioners. We demonstrated that natural language models not only provide a flexible input interface to robots, but that pretrained language representations also confer new generalization capabilities on the downstream policy, such as composing unseen object pairs together.

In the course of building this system, we confirmed that periodic human interventions are a simple but important technique for achieving good performance. While there is a substantial amount of work to be done in the future, we believe that the zero-shot generalization capabilities of BC-Z, the imitation learning system presented in this work, are an important advancement towards increasing the generality of robotic learning systems and allowing people to command robots. We have released the teleoperated demonstrations used to train the policy in this paper, which we hope will provide researchers with a valuable resource for future multi-task robotic learning research.