The Pentagon Wants to Stop Enemies From 'Poisoning' AI
The Defense Advanced Research Projects Agency is looking to defend machine learning tools against outside manipulation.
The Pentagon’s research arm is working to stop digital adversaries from tricking federal artificial intelligence systems into doing their bidding.
Broadly speaking, machine learning tools learn by finding relationships within training data and applying that knowledge to real-life situations. In a perfect world, this approach lets the system accurately interpret new information, but bad actors can manipulate the process to sway its decisions in their favor.
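To make that idea concrete, here is a minimal, illustrative sketch (not taken from the DARPA program) of a model learning a decision rule from labeled training data and then applying it to inputs it has never seen. The scikit-learn library, the synthetic dataset, and the logistic-regression model are all assumptions chosen for demonstration only.

```python
# Minimal sketch: a classifier "learns" a decision boundary from labeled
# training data, then applies it to new inputs it has never seen.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for real-world training data.
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)            # learn from training data
print("accuracy on unseen data:", model.score(X_test, y_test))  # apply to new inputs
```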
Through the Guaranteeing AI Robustness against Deception program, the Defense Advanced Research Projects Agency aims to bolster the tech’s defenses against outside attacks.
“The growing sophistication and ubiquity of ML components in advanced systems dramatically ... increases opportunities for new, potentially unidentified vulnerabilities,” DARPA officials wrote in a solicitation published Thursday. “The field now appears increasingly pessimistic, sensing that developing effective ML defenses may prove significantly more difficult than designing new attacks, leaving advanced systems vulnerable and exposed.”
Among the most common techniques for corrupting AI are so-called poisoning attacks, in which adversaries feed the tool rigged training data to alter its decisions. Bad actors can also use inference attacks to figure out what data was used to train existing tools—and thus how they can be manipulated.
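The sketch below illustrates the poisoning idea in its simplest form, a label-flipping attack in which an adversary corrupts a slice of the training labels and the resulting model degrades. It is a toy demonstration under assumed choices (scikit-learn, a synthetic dataset, a 30 percent flip rate), not a description of any attack DARPA cites.

```python
# Illustrative sketch of a label-flipping "poisoning" attack: an adversary
# corrupts part of the training labels, and the learned model gets worse.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Model trained on clean data.
clean_model = LogisticRegression().fit(X_train, y_train)

# Adversary flips the labels of 30% of the training examples.
rng = np.random.default_rng(1)
poisoned = y_train.copy()
idx = rng.choice(len(poisoned), size=int(0.3 * len(poisoned)), replace=False)
poisoned[idx] = 1 - poisoned[idx]

# Model trained on the poisoned data.
poisoned_model = LogisticRegression().fit(X_train, poisoned)

print("clean model accuracy:   ", clean_model.score(X_test, y_test))
print("poisoned model accuracy:", poisoned_model.score(X_test, y_test))
```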
Groups selected for the program would create a way to assess how vulnerable machine learning tools are to these and other types of attacks, and determine how to shore up their defenses. They would also develop algorithms that predict how susceptible a tool is to manipulation, and detect when and how it could be compromised.
Officials suggested teams could “gain insight and inspiration” from animal immune systems and how they respond to bacteria and viruses. The solutions they propose, however, must be broad enough to combat many different types of threats rather than focusing on a specific attack, they said.
DARPA will host a proposers day for the program on Feb. 6, and participants must register by Feb. 1.
The program comes as the defense and intelligence communities grow increasingly wary of unseen threats to AI systems. Because it is often impossible to know how AI tools reach their conclusions, it is difficult to determine whether a system is behaving as intended.
The Intelligence Advanced Research Projects Activity is currently running two initiatives to protect AI against outside influence—the TrojAI program aims to uncover when training data has been corrupted, and the SAILS program centers on keeping enemies from figuring out how tools were trained.
IARPA is hosting a proposers day for both programs on Feb. 26.