Northwestern University engineers have developed a new artificial intelligence (AI) algorithm, called Maximum Diffusion Reinforcement Learning (MaxDiff RL), designed specifically for smart robotics to help robots learn complex skills rapidly and reliably. The algorithm encourages robots to explore their environments as randomly as possible so that they gain a wide range of experiences. Using the high-quality data gathered through simulated exploration, robots demonstrated faster, more efficient learning, which improved their reliability and performance. Robots using MaxDiff RL consistently outperformed other state-of-the-art models. The research was published in the journal Nature Machine Intelligence.
The new algorithm works so well that, in some cases, robots were able to perform a task successfully in a single attempt. “…Other AI frameworks can be somewhat unreliable, and sometimes robots will totally nail a task, but, other times, they will fail completely. With this new framework, as long as the robot is capable of solving the task at all, robots do exactly what they’ve been asked to do, making it easier to interpret robot successes and failures, which is crucial.”
Training a machine-learning algorithm typically requires huge quantities of filtered and curated data, which the AI uses to train until it reaches optimal results. This approach does not work well for robots, because robots typically need to collect data by themselves. Traditional algorithms are also a poor fit: they were built for disembodied systems, which can take advantage of a world where physical laws do not apply and where failures have no consequences. In robotics, a single failure could be catastrophic. With MaxDiff RL, robots learn through self-curated random experiences and acquire the skills they need to accomplish useful tasks. Most impressively, robots using MaxDiff RL often succeeded at correctly performing a task in a single attempt, even when they started with no knowledge.
Because MaxDiff RL is a general algorithm, it can be used for a variety of applications, paving the way for reliable decision-making in smart robotics, not only for robotic vehicles that move around but also for stationary robots learning to do complex local tasks.

