Dan Szafir imagines a day when robots work alongside humans on factory floors, in hospitals and in homes, following requests from human supervisors and even providing companionship to those in nursing homes. But for that day to arrive, he says, robots must become better communicators, able to interpret facial expressions, gestures and tone of voice, while mimicking social behaviors like respecting personal space.
Szafir, an assistant professor with the ATLAS Institute and a Department of Computer Science faculty member, is helping make this day a reality by designing robots that interpret subtle nods, vocal inflections and gestures. In recognition of this and related work, Forbes magazine recently named Szafir to its 2017 “30 Under 30: Science” list, placing him among an elite group of up-and-coming young innovators.
“Up to this point, a great deal of work has focused on making robots appear more human, rather than improving their abilities to understand us,” says Szafir, whose robotics research is currently funded by a $174,000 National Science Foundation grant and a $360,000 NASA Early Career Faculty Award. “The goal of my research is to improve a robot’s ability to work successfully with people.”
To program robots to interpret nonverbal cues, Szafir starts by watching people. By observing teams completing lists of standardized tasks in front of high-speed cameras, microphones and motion and depth sensors, Szafir is compiling a library of what nonverbal human communication looks like to a computer.
“That’s the first step,” he says. “To see what people do and why they are doing it; to match human actions to the human intention behind those behaviors.”
One of the challenges of this project is that the same nonverbal cue can mean different things depending on context, says Szafir.
Imagine, for example, enlisting a robot’s help in building a bookshelf. If you say to the robot, ‘Go grab that screwdriver,’ while pointing in a general direction, the robot should use the physical gesture to limit its search area. If you say, ‘Find another one of these,’ while pointing to a specific component, the meaning of the physical gesture changes.
To minimize mistakes, Szafir’s research will include studying nonverbal communication when human research subjects complete particular tasks. Based on a given circumstance, he will determine the likelihood of what a given gesture means. Most of the time these likelihoods will resolve the ambiguous meaning of the gesture, but not always.
“It might be that the probability is 49 percent one way and 51 percent the other,” says Szafir. “You have to choose, and you might choose wrong.”
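For readers who think in code, the idea of picking the most likely meaning of a gesture can be sketched roughly as follows. This is only an illustration, not Szafir’s model: the context labels, meaning labels, probability values and the `interpret` function are all hypothetical, invented here to mirror the bookshelf example and the 49/51 split he describes.

```python
from typing import Dict, Tuple

# Hypothetical table of P(meaning | spoken context, gesture).
# All labels and numbers are invented for illustration; none come from
# Szafir's studies.
LIKELIHOODS: Dict[Tuple[str, str], Dict[str, float]] = {
    ("fetch_request", "point"): {"search_area": 0.85, "specific_object": 0.15},
    ("match_request", "point"): {"search_area": 0.10, "specific_object": 0.90},
    ("unclear_request", "point"): {"search_area": 0.49, "specific_object": 0.51},
}


def interpret(context: str, gesture: str) -> Tuple[str, float]:
    """Return the most likely meaning of a gesture and its probability."""
    meanings = LIKELIHOODS[(context, gesture)]
    # Pick the meaning with the highest probability, even when the margin
    # is thin and the choice may turn out to be wrong.
    best = max(meanings, key=lambda name: meanings[name])
    return best, meanings[best]


if __name__ == "__main__":
    for context in ("fetch_request", "match_request", "unclear_request"):
        meaning, probability = interpret(context, "point")
        print(f"{context}: {meaning} ({probability:.0%})")
```

In the last case the two readings are nearly tied, so the robot still has to commit to one, which is exactly the situation where it might choose wrong.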
To refine accuracy, Szafir plans to crowdsource interpretations of large volumes of images of robots interacting with humans. By compiling these interpretations, Szafir and his team will build a playbook that computational models can incorporate to tip the probability toward a robot responding correctly to ambiguous physical gestures.
Equipped with infrared and optical cameras, microphones, face detection and motion sensors, robots have a lot of data to reference, Szafir points out. To make correct decisions, they need better models to synthesize the data.
It is no small task, and Szafir has a team of students working on the problem in his ATLAS-based Laboratory for Interactive Robotics and Novel Technology (IRON), which opened in January 2016. But it’s not their only project. Over the last year, Szafir and his team completed a dynamic big data fusion and modeling project sponsored by Intel Labs, as well as several studies investigating the trust human subjects place in robots, and how that trust changes under stress and when robots make mistakes.
“It’s fascinating work with big implications,” says Mark Gross, director of the ATLAS Institute. “Our sweet spot is complex, design- and technology-related problems with world-changing potential. Stay tuned for where this is going.”