The robot arm hovers over a pile of products before it makes its move, snagging a toothbrush with its suction cup. It holds the product up, waits for the red flash of a barcode scanner, then turns and drops the toothbrush in a cubby hole. Next the arm suction-cups a box of Goldfish crackers, turns, and files it, too.
At a startup called Kindred in San Francisco, technicians are teaching robots how to precisely manipulate objects like these. Why? Because somebody’s got one hell of an online shopping habit. The idea is to get robots so good at picking and placing products that they make human workers look like sloths on sedatives, thus supercharging order fulfillment centers. And how these researchers are trying to do it has big implications for robots beyond the warehouse.
If you want to teach a robot to pick up an object, you could do it the classical way and program it with line after line of code. Or, the way Kindred says its system works, you can use more modern approaches from artificial intelligence: reinforcement learning and imitation learning.
According to Kindred, its robots start with the former. With reinforcement learning, the robots practice manipulating products on their own with trial and error. When they do something right, they “score,” hence the reinforcement. “The goal is to maximize the score over time,” says George Babu, cofounder of Kindred. “When you do something correctly, then you explore actions similar to the one that gave you a correct response.”
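The loop Babu describes, score a successful action, then explore actions similar to the ones that scored, can be sketched in miniature. This is a toy illustration with made-up grip poses and success rates, not Kindred's actual code: an epsilon-greedy learner that mostly repeats whichever grip has scored best so far, but occasionally tries something new.

```python
import random

# Hypothetical grip poses and their (hidden) odds of a successful pick.
# The learner doesn't see these; it only sees its own trial-and-error scores.
GRIP_SUCCESS_RATES = {"pose_a": 0.2, "pose_b": 0.8, "pose_c": 0.5}

def attempt_pick(pose, rng):
    """Simulate one pick attempt: score 1.0 on success, 0.0 on failure."""
    return 1.0 if rng.random() < GRIP_SUCCESS_RATES[pose] else 0.0

def train(episodes=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy: usually exploit the best-scoring grip, sometimes explore."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in GRIP_SUCCESS_RATES}  # summed scores per pose
    counts = {p: 0 for p in GRIP_SUCCESS_RATES}    # attempts per pose
    for _ in range(episodes):
        if rng.random() < epsilon:
            pose = rng.choice(list(GRIP_SUCCESS_RATES))  # explore a random grip
        else:
            # exploit the grip with the best average score so far
            pose = max(totals,
                       key=lambda p: totals[p] / counts[p] if counts[p] else 0.0)
        totals[pose] += attempt_pick(pose, rng)
        counts[pose] += 1
    return {p: totals[p] / counts[p] if counts[p] else 0.0 for p in totals}

scores = train()
best = max(scores, key=scores.get)  # the learner converges on the reliable grip
```

The key property is the one Babu names: once an action scores, the learner concentrates its attempts near that action rather than searching blindly.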
Reinforcement learning has its limitations, though. For one, it’s slow. In a purely digital environment, a simulator could rapidly try and fail, over and over and over—but with a robot in the real world, that iteration is constrained by the laws of the physical universe.
And two, Kindred’s robots can only teach themselves so much; there are simply too many scenarios that play out in the real world. So a human operator steps in to initiate the second of Kindred’s approaches, so-called imitation learning, in which the operator looks through the robot’s eyes and guides its arms. “Some of our algorithms are imitating where the human picked the object,” says Babu, “some of our algorithms are imitating how the human is moving through space to get the objects.”
This builds on what the robot learned through reinforcement, showing it what constitutes a good or bad grip. Essentially, it fills in knowledge gaps by creating lessons that the robot couldn’t practice on its own. Thus a robot learns to more precisely manipulate products like boxes of drugs and toothbrushes.
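In its simplest form, this kind of imitation, often called behavioral cloning, turns human demonstrations into a policy: the robot does whatever the human did for the most similar object it has seen. The sketch below is illustrative only, with invented object features and action names, not Kindred's pipeline.

```python
# Hypothetical demonstrations: (object features -> the grip the human chose).
# Features here are (width in cm, rigidity from 0 soft to 1 hard).
DEMONSTRATIONS = [
    ((3.0, 0.9), "suction_top"),    # small rigid item, e.g. a toothbrush
    ((12.0, 0.8), "suction_side"),  # wide rigid box
    ((8.0, 0.1), "pinch_gentle"),   # floppy item, e.g. a bagged garment
]

def distance(a, b):
    """Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def imitate(observation):
    """Copy the action the human demonstrated on the most similar object."""
    _, action = min(DEMONSTRATIONS, key=lambda d: distance(d[0], observation))
    return action
```

A new, never-practiced object still gets a sensible grip as long as a human has demonstrated something like it, which is exactly the knowledge-gap-filling role the paragraph above describes.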
Which will be essential in an ecommerce environment (Gap is currently testing Kindred’s system), where a robot may encounter objects that are hard or soft or floppy or fragile. And with a human in the loop, the robot will have a tutor to guide it remotely if it comes across something novel. “If something changes, our algorithms say, Wait, I don’t recognize this object. I don’t feel confident doing this,” says Babu. “We quickly kick in the human to help the robot do the task and then we can learn from that and we can improve our algorithms.”
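The escalation Babu describes, act autonomously when confident, hand off to a remote human otherwise, amounts to a confidence gate. A minimal sketch, with an assumed threshold and invented names:

```python
CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff, not a Kindred parameter

def pick(observation, policy_action, policy_confidence, ask_human):
    """Return (action, source): the robot's own grip when it's confident,
    otherwise a remote human operator's choice."""
    if policy_confidence >= CONFIDENCE_THRESHOLD:
        return (policy_action, "autonomous")
    # "Wait, I don't recognize this object" -- escalate to teleoperation.
    action = ask_human(observation)
    # In a full system, (observation, action) would be logged as a new
    # demonstration so the policy improves from the human's correction.
    return (action, "human")
```

The side effect is the one Babu highlights: every human intervention doubles as fresh training data.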
The power to easily teach robots will make for highly adaptable machines far beyond an order fulfillment center. “Long term, it’ll likely mean you don’t necessarily think of robots just doing one specific thing, like buying a robot for X or Y or Z,” says UC Berkeley roboticist Pieter Abbeel, whose own startup Embodied Intelligence is using VR controls to teach robots skills. “But you buy a robot that can help you with anything, assuming you can give a few demonstrations.”
Sure, the education of the robots has just begun—even boxes of allergy medicine still give them pause. But soon enough they’ll be running laps around us, all thanks to the good old human touch.