On the Dual-Use Dilemma in Physical Reasoning and Force
Humans learn how and when to apply forces in the world through a complex physiological and psychological learning process. Attempting to replicate this in vision-language models (VLMs) presents two competing challenges: VLMs can produce harmful behavior, which is especially dangerous for VLM-controlled robots that physically interact with the world, yet imposing behavioral safeguards can limit the range of functional and ethical behaviors they can perform. We conduct two case studies on safeguarding VLMs that generate forceful robotic motion and find that safeguards reduce both harmful and helpful behavior involving contact-rich manipulation of human body parts. We then discuss the key implication of this result for model evaluation and robot learning: value alignment may impede desirable robot capabilities.
Varying the contextual semantics of the same scene can yield either harm or help, often with a thin line separating the two. We evaluate how VLMs, prompted with different schemes that elicit physical reasoning for robot control, navigate this line between harm and help in forceful, contact-rich tasks with potential for bodily danger.