DHS tests show facial recognition tech has varied results but is gaining ground
Although some systems were able to meet performance thresholds for all demographic groups, others did show differentials because of issues with obtaining quality photos of people with varying skin tones.
The research and development arm of the Department of Homeland Security on Thursday published results of its 2022 biometric tech rally, meant to test identity systems in use cases involving small groups of people in populated places like airports.
Overall, the tests show that group processing can work, DHS officials told media during a press briefing on Thursday, although some systems showed performance differences based on demographics, also known as demographic differentials.
The DHS Science and Technology Directorate has been holding tech rallies since 2018, meant to engage industry with challenge problems and test products against set metrics. Future tests will look at the performance of remote identity validation tech like document validation and liveness tech, which is meant to ensure that a live person is presenting themselves to be identified, as opposed to a spoof.
For the 2022 rally, testers evaluated 40 combinations of image-capturing acquisition systems and matching algorithms with 575 volunteers over 11 days at the agency’s test facility in Maryland. Specifically, DHS officials were looking at effectiveness, efficiency, privacy, user satisfaction and equitability.
The facility was set up in a mock situation akin to airport security, where groups of two and four people walked through a lane to be identified and others walked through nearby bypass lanes. Systems were meant to produce one good photo of each person and identify them, while not accidentally capturing bystanders or nearby individuals who’d opted out of the process.
All system combinations met a three-second timing mark and, in terms of privacy, many of the systems were able to refrain from getting any images of bystanders or individuals who opted out, said Yevgeniy Sirotin, the principal investigator and technical director of the identity and data sciences lab at the Maryland test facility.
Of the 40 combinations, 37 captured less than 1% of individuals not in the opt-in space, showing that “if you specify it, industry can come up with solutions to help address” privacy and not capture faces not meant to be photographed, said Arun Vemury, director of the Biometric and Identity Technology Center within the DHS Science and Technology Directorate.
The errors that did come up generally came from the cameras being tested, not the facial recognition algorithms, which have “improved drastically” in recent years, said Sirotin.
In some systems, there were differentials in terms of race, gender and skin tone, of which DHS took calibrated readings. Some systems struggle to get the right level of exposure in photos of people with darker skin and, to a lesser extent, those with extremely light skin tones.
Nine system combinations met the 95% identification mark for all skin tones, compared to around 26 that hit the mark for medium skin tones, said Sirotin.
“There are technologies that will work well and work equitably, that hit that 95% performance threshold. However, as you saw, we also see some technologies that fall below that,” which “highlights the importance of testing and evaluating for specific use cases with diverse volunteers to make sure that we are actually picking the right technologies for specific applications,” said Vemury.
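The "all skin tones" criterion Sirotin describes can be sketched in a few lines of Python. This is a minimal illustration, not DHS's evaluation code: the system names, group labels and rates below are invented placeholders, and only the 95% mark comes from the rally results.

```python
# Illustrative sketch only: system names, group labels and rates are
# hypothetical, not DHS data. The 95% mark is the one cited in the rally.
THRESHOLD = 0.95


def meets_threshold_for_all_groups(rates_by_group, threshold=THRESHOLD):
    """Return True only if every group's identification rate hits the mark."""
    return all(rate >= threshold for rate in rates_by_group.values())


# Hypothetical per-group identification rates for two aliased systems.
systems = {
    "system_a": {"darker": 0.97, "medium": 0.98, "lighter": 0.96},
    "system_b": {"darker": 0.91, "medium": 0.97, "lighter": 0.95},
}

# Systems that clear the mark for every group, not just the best-served one.
equitable = [
    name
    for name, rates in systems.items()
    if meets_threshold_for_all_groups(rates)
]
print(equitable)
```

Under this check, a system that clears 95% for medium skin tones but falls short for darker ones, like the made-up `system_b`, does not count as meeting the mark.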
Although DHS does not release the names of industry participants publicly, the aliased results are public and are meant to help agencies make decisions about their technology with evidence about where tech does and doesn’t work from an “honest broker,” said Vemury. Industry participants also get insight to improve their products.
Agencies that want more specific information about vendors can contact DHS, which will forward requests on to vendors, he said, adding that participating companies that do well often self-announce their results.
Vemury and Sirotin said that the tests show the potential of group biometric processing in terms of speed and effectiveness as compared to traditional, manual checks by humans. The best system was able to identify 97% of people in groups of two and four in less than two seconds per person.
The tech has potential use cases in sea and land ports, and there’s an interest at DHS in its potential use for identifying groups of people in cars, a DHS official told FCW.
Another use case could be airports, where facial recognition tech is already used in some places to match face images to identity documents in place of a human check and, in one-to-many systems, to compare photos of individuals against a list of pre-collected information about people expected on a certain flight, for example, said Vemury.