Researchers Use Large Language Models to Help Robots Navigate

Words instead of costly visual data direct multistep navigation tasks

A new navigation method uses language-based inputs to direct a robot through a multistep navigation task—such as doing laundry. Credit: iStock

Someday, you may want your home robot to carry a load of dirty clothes downstairs and deposit them in the washing machine in the far-left corner of the basement. The robot will need to combine your instructions with its visual observations to determine what it should do to complete this task.

For an AI agent, this is easier said than done. Current approaches often use multiple hand-crafted machine-learning models to tackle different parts of the task, which takes a great deal of human effort and expertise to build. These methods, which use visual representations to directly make navigation decisions, also demand massive amounts of visual training data, which are often hard to come by.

To overcome these challenges, researchers from MIT and the MIT-IBM Watson AI Lab devised a navigation method that converts visual representations into pieces of language, which are then fed into a single large language model that handles all parts of the multistep navigation task.
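To make the idea concrete, here is a minimal Python sketch of what turning a visual observation into text and handing it to a language model might look like. It is an illustration only, not the researchers' implementation: the `Detection` fields, the caption wording, the action list, and the `llm` callable are all hypothetical, and any text-in, text-out model could stand in for `llm`.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Detection:
    """One object reported by a (hypothetical) vision module."""
    label: str         # e.g. "staircase"
    bearing: str       # coarse direction: "left", "ahead", or "right"
    distance_m: float  # rough distance in meters


def describe(detections: List[Detection]) -> str:
    """Convert one visual observation into a plain-text caption."""
    if not detections:
        return "You see nothing notable."
    parts = []
    for d in detections:
        where = "ahead" if d.bearing == "ahead" else f"to your {d.bearing}"
        parts.append(f"a {d.label} {d.distance_m:.0f} m {where}")
    return "You see " + ", ".join(parts) + "."


def build_prompt(instruction: str, history: List[str],
                 caption: str, actions: List[str]) -> str:
    """Combine the user's instruction, past steps, and the current caption."""
    past = "\n".join(history) if history else "(none yet)"
    return (
        f"Instruction: {instruction}\n"
        f"Steps so far:\n{past}\n"
        f"Current view: {caption}\n"
        f"Choose one action from: {', '.join(actions)}"
    )


def choose_action(prompt: str, actions: List[str],
                  llm: Callable[[str], str]) -> str:
    """Ask the model for the next step; fall back to 'stop' if the reply is unusable."""
    reply = llm(prompt).strip().lower()
    for action in actions:
        if action in reply:
            return action
    return "stop"


if __name__ == "__main__":
    actions = ["move forward", "turn left", "turn right", "stop"]
    view = [Detection("staircase", "left", 3.0), Detection("sofa", "ahead", 2.0)]
    prompt = build_prompt("Carry the laundry to the basement.", [],
                          describe(view), actions)
    # A canned reply stands in for a real language model here.
    print(choose_action(prompt, actions, lambda p: "Turn left"))
```

In this sketch, the same model call is reused at every step, with the growing list of past steps supplying context, which mirrors the article's point that one language model handles the whole task.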

…
