Research makes robots better at following spoken instructions

A new software system helps robots act more effectively on instructions from people, whose commands naturally range from the simple and direct to complex requests that imply a myriad of subtasks. The system, based on research by Brown University computer scientists, makes robots better at following spoken instructions, no matter how abstract or specific those instructions may be. The development, which was presented this week at the Robotics: Science and Systems 2017 conference in Boston, is a step toward robots that can communicate more seamlessly with human collaborators.

The research was led by Dilip Arumugam and Siddharth Karamcheti, both undergraduates at Brown when the work was performed (Arumugam is now a Brown graduate student). They worked with graduate student Nakul Gopalan and postdoctoral researcher Lawson L.S. Wong in the lab of Stefanie Tellex, a professor of computer science at Brown.

“The issue we’re addressing is language grounding, which means having a robot take natural language commands and generate behaviors that successfully complete a task,” Arumugam said. “The problem is that commands can have different levels of abstraction, and that can cause a robot to plan its actions inefficiently or fail to complete the task at all.”

For example, imagine someone in a warehouse working side by side with a robotic forklift. The person might say to the robotic partner, “Grab that pallet.” That is a highly abstract command implying a number of smaller sub-steps: lining up the lift, sliding the forks underneath and hoisting the pallet up. Other common commands, however, are more fine-grained and involve only a single action: “Tilt the forks back a little,” for example.

Those different levels of abstraction can cause problems for current robot language models, the researchers say. Most models try to identify cues from the words in a command, as well as its sentence structure, and then infer a desired action from that language. The inference results then trigger a planning algorithm that attempts to solve the task. But without taking the specificity of the instructions into account, the robot might overplan for simple instructions, or underplan for more abstract ones that involve more sub-steps. The result can be incorrect actions or an overly long planning lag before the robot takes action.

The new system adds a level of sophistication to existing models. In addition to inferring a desired task from language, it also analyzes the language to infer a distinct level of abstraction.
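To make that two-stage idea concrete, the short Python sketch below shows how a command might first be classified by abstraction level and only then routed to a matching planner. This is a minimal illustration under assumed names: AbstractionLevel, infer_level and plan are hypothetical, and the keyword heuristic stands in for the trained language models the researchers actually describe; none of this is the authors’ code.

    # Hypothetical sketch of the two-stage pipeline described above.
    # The real system infers abstraction level from learned language
    # cues; the keyword heuristic below is only a toy stand-in.

    from enum import Enum

    class AbstractionLevel(Enum):
        HIGH = "high"  # "Grab that pallet": implies many sub-steps
        LOW = "low"    # "Tilt the forks back a little": one action

    def infer_level(command: str) -> AbstractionLevel:
        """Stand-in classifier: treats a few fine-grained action verbs
        as evidence of a low-level (single-action) command."""
        fine_grained_verbs = {"tilt", "raise", "lower", "turn", "slide"}
        first_word = command.lower().split()[0]
        if first_word in fine_grained_verbs:
            return AbstractionLevel.LOW
        return AbstractionLevel.HIGH

    def plan(command: str) -> list[str]:
        """Route the command to a planner matching its specificity, so
        the robot neither overplans a single action nor underplans a
        multi-step task."""
        if infer_level(command) is AbstractionLevel.LOW:
            return [command]  # execute directly; no hierarchical search
        # Abstract command: in this toy, return the article's pallet
        # example expanded into its sub-steps.
        return ["line up the lift", "slide forks under", "hoist pallet"]

    for cmd in ("Grab that pallet", "Tilt the forks back a little"):
        print(f"{cmd!r} -> {infer_level(cmd).value}: {plan(cmd)}")

The split mirrors the failure modes the researchers describe: by choosing the planning granularity before planning begins, the robot avoids both overplanning a single action and underplanning a multi-step task.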