New research allows humans to adjust the actions of robots in real time, similar to how they would give feedback to another person.

Imagine a robot helping you wash dishes. You ask it to grab a soapy bowl from the sink, but its gripper doesn't grasp the bowl exactly where you need it to.
With a new framework developed by researchers at MIT and NVIDIA, you can correct the robot's behavior with simple interactions: point at the bowl, trace a trajectory on a screen, or gently nudge the robot's arm in the right direction.
Unlike other methods for correcting robot behavior, this technique does not require users to collect new data and retrain the machine-learning model that controls the robot. Instead, it lets the robot use real-time, intuitive human feedback to choose the sequence of actions that best matches the user's intent.
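The article names three kinds of feedback: pointing at an object, drawing a path on a screen, and physically nudging the arm. A minimal sketch of how such inputs might all reduce to one common "intent" signal is shown below; the function names and the 3-D-point representation are assumptions for illustration, not the paper's actual interface.

```python
import numpy as np

# Hypothetical sketch: three kinds of human feedback, each reduced to a
# single 3-D "intent point" toward which candidate robot actions can be
# scored. The trained policy itself is never modified by this feedback.

def intent_from_point(pointed_xyz):
    """Pointing at an object: the intent is simply that location."""
    return np.asarray(pointed_xyz, dtype=float)

def intent_from_drawn_path(path_xyz):
    """A path drawn on screen: take its endpoint as the goal."""
    return np.asarray(path_xyz, dtype=float)[-1]

def intent_from_nudge(arm_xyz, push_direction, scale=0.05):
    """A physical nudge: shift the current arm position a small,
    fixed distance along the direction of the push."""
    d = np.asarray(push_direction, dtype=float)
    return np.asarray(arm_xyz, dtype=float) + scale * d / np.linalg.norm(d)
```

However the feedback arrives, downstream code only ever sees a goal point, which is what lets the three interaction styles be interchangeable.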
When the researchers tested the framework, its success rate was 21 percent higher than that of an alternative method that did not use human intervention.
In the future, the framework could make it easier for users to guide a factory-trained robot through a variety of household tasks, even if the robot has never seen their home or the objects in it before.
“We can’t expect the average user to manually collect data and fine-tune a neural network model. They’ll expect the robot to work right out of the box, and if an error occurs, they need an intuitive mechanism to adjust it. This is the challenge we addressed in this research,” said Felix Yanwei Wang, a graduate student in Electrical Engineering and Computer Science (EECS) at MIT and lead author of the study.
Minimize deviations
Recently, researchers have used pre-trained generative AI models to learn a "policy"—a mapping from what the robot observes to the actions it should take to complete a task. Such policies can solve many complex tasks.
During training, the model is only exposed to valid robot movements, so it learns to create appropriate trajectories.
However, this does not mean that every action of the robot will align with the user's wishes in reality. For example, a robot might be trained to retrieve boxes from a shelf without knocking them over, but might fail to reach a box on someone's bookshelf if the bookshelf layout is different from what it saw during training.
To overcome such errors, engineers typically gather more data on the new task and retrain the model, a costly and time-consuming process that requires expertise in machine learning.
Instead, the MIT researchers wanted to let users correct the robot's behavior the moment it makes a mistake.
However, if humans interfere with the robot's decision-making process, it could inadvertently cause the generative model to choose an invalid action. The robot might retrieve the box the user wants, but could knock over books on the shelf in the process.
"We want users to interact with the robot without introducing such mistakes, steering it toward behavior that better matches their intent while keeping its actions valid and feasible," said Wang.
Enhance decision-making capabilities
To ensure these interactions don't cause the robot to perform an invalid action, the team uses a special sampling procedure. The technique helps the model choose, from its set of valid options, the action that comes closest to the user's goal.
"Rather than forcing the robot to follow the user's input exactly, we help it understand the user's intent and let the sampling process fluctuate around the behaviors it has learned," said Wang.
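The core idea described above can be sketched as follows. This is a toy illustration, not the paper's actual algorithm: `sample_trajectories` stands in for the frozen generative policy, and "steering" simply re-ranks the policy's own proposals by distance to the user's goal. Because every candidate comes from the trained model, the selected behavior stays within the distribution of movements the robot learned as valid.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_trajectories(n, horizon=10):
    """Stand-in for the frozen policy: return n candidate end-effector
    trajectories (random walks here; a real policy would propose only
    movements it learned to be valid)."""
    return rng.normal(size=(n, horizon, 3)).cumsum(axis=1)

def steer(intent_xyz, n_candidates=256):
    """Among the trajectories the policy itself proposes, pick the one
    whose endpoint lands closest to the user's intent point."""
    candidates = sample_trajectories(n_candidates)
    endpoints = candidates[:, -1, :]
    dists = np.linalg.norm(endpoints - np.asarray(intent_xyz), axis=1)
    return candidates[np.argmin(dists)]

best = steer([2.0, 0.0, 1.0])
```

The key design point is that the human never writes an action directly; the feedback only changes which of the model's own samples gets executed, so an invalid motion is never injected.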
Thanks to this method, the framework outperformed alternative approaches in simulation experiments as well as in tests with a real robotic arm in a toy kitchen.
Although the method doesn't always complete the task on the first try, it offers users a significant advantage: they can correct the robot the moment they spot an error, instead of waiting for it to finish the task before giving new instructions.
Furthermore, after a user nudges the robot a few times until it picks up the correct bowl, the robot can record that corrective action and fold it into its future learning. The next day, it could pick up the correct bowl without needing to be corrected again.
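A minimal sketch of that memory mechanism, under the assumption (not stated in the article) that a correction can be stored as a goal point keyed by task name:

```python
# Hypothetical sketch: remember a correction so the same task
# needs no nudging next time.
corrections = {}  # task name -> goal point learned from past nudges

def record_correction(task, corrected_xyz):
    """Store where the user steered the robot for this task."""
    corrections[task] = corrected_xyz

def intent_for(task, default_xyz):
    """Reuse a remembered correction if one exists, else the default."""
    return corrections.get(task, default_xyz)

record_correction("pick_bowl", (0.4, 0.1, 0.2))
```

On the next run, `intent_for("pick_bowl", ...)` returns the remembered goal, so the sampling step is biased toward the corrected behavior without any retraining.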
"But the key to this continuous improvement is giving users a way to interact with the robot, and that's exactly what we demonstrated in this research," said Wang.
In the future, the research team aims to increase the speed of the sampling process while maintaining or improving efficiency. They also want to test this method in new environments to assess the robot's adaptability.
(Source: MIT News)
Source: https://vietnamnet.vn/ung-dung-ai-tao-sinh-giup-robot-tuong-tac-thong-minh-hon-2381531.html