State Dept. looks to AI for declassification, FOIA processing
The agency’s enterprise AI strategy is expected next month.
The State Department is testing out the use of artificial intelligence for declassifying decades-old cables and helping members of the public pinpoint what they want in public records or through the Freedom of Information Act process.
Next month, the State Department is launching its “first-ever enterprise AI strategy” that will be “laying out the framework so that the department can responsibly, safely and securely harness the capabilities of AI to advance our work,” said Giorleny Altamirano Rayo, chief data scientist at the department’s Center of Analytics and State’s responsible AI official, during a Sept. 7 meeting of the FOIA Advisory Committee.
The department already released an AI policy in the Foreign Affairs Manual in April 2023, and officials are working on two AI pilots started earlier this summer.
The department is testing the use of AI in Freedom of Information Act searches to combine similar requests, cut out duplicative work or find potentially responsive records, said Eric Stein, deputy assistant secretary for the department’s Office of Global Information Services.
The department’s presentation also describes using AI to identify classified, sensitive or otherwise FOIA-exempt materials in search results, as the agency finds responsive records through the pilot, although Stein said that the department isn’t yet at the point of using the technology for redactions.
“I think those are the fears that people have in the public: that we’re just going to apply a model, just [exempt] everything, and nothing will ever come out again,” he said. “That’s not what we want either.”
Another ongoing pilot is meant to make it easier for the people looking for information to find what they want in public records on the State Department’s website, said Stein. The department will “find and direct customer to existing released documents” and “automate customer engagement early in the request process,” the presentation states.
“So you come to State’s site hypothetically… looking for a specific country, topic and so forth, and as you type it in, it pops up, ‘Here are some records that have been released,’ which may either satisfy your request or help you narrow or identify what you’re looking for,” said Stein.
The department has already operationalized the use of machine learning for declassifying cables between the State Department and overseas posts, Stein said.
The project started as a pilot last fall meant to augment the manual declassification process for information once it reaches a certain age, usually 25 years. Until recently, that review was done by people sitting at computers and going through records one by one, the vast majority of which are usually declassified, said Stein.
The department used cables from 1997 for the pilot after training a model on human decisions from 2020 and 2021 that concerned cables marked as confidential and secret in 1995 and 1996, said Rayo.
The model sorts cables into three categories: those it’s confident should be declassified, those it’s confident shouldn’t be declassified and those that need manual review.
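The three-way sort described above can be sketched as a simple thresholding step. This is an illustration only, not the department's implementation: it assumes the model emits a probability that a cable should be declassified, and the function name `triage` and the cutoffs `high` and `low` are hypothetical.

```python
from enum import Enum


class Triage(Enum):
    DECLASSIFY = "declassify"
    KEEP_CLASSIFIED = "keep_classified"
    MANUAL_REVIEW = "manual_review"


def triage(p_declassify: float, high: float = 0.95, low: float = 0.05) -> Triage:
    """Route a cable by the model's (assumed) probability that it should be declassified.

    `high` and `low` are hypothetical confidence cutoffs, not values from the article.
    """
    if not 0.0 <= p_declassify <= 1.0:
        raise ValueError("p_declassify must be between 0 and 1")
    if low >= high:
        raise ValueError("low must be strictly less than high")
    if p_declassify >= high:
        return Triage.DECLASSIFY        # model is confident the cable can be released
    if p_declassify <= low:
        return Triage.KEEP_CLASSIFIED   # model is confident the cable should stay classified
    return Triage.MANUAL_REVIEW         # everything in between goes to a human reviewer
```

Widening the gap between the two cutoffs sends more cables to manual review; narrowing it relies more heavily on the model.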
For the 1997 pilot group of over 78,000 cables, there was “around 96% agreement with human reviewers,” said Rayo. The pilot “[reduced] burden up to 63%” by culling the number of cables that need manual review, she said.
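The article does not say how the agreement and burden figures were calculated. One plausible reading, sketched here under that assumption, is that agreement is the share of cables where the model's decision matches a human reviewer's, and burden reduction is the share of cables that no longer need manual review. The function names and the toy inputs are hypothetical.

```python
from typing import Sequence


def agreement_rate(model_decisions: Sequence[str], human_decisions: Sequence[str]) -> float:
    """Fraction of cables where the model and the human reviewer made the same call."""
    if len(model_decisions) != len(human_decisions):
        raise ValueError("decision lists must be the same length")
    if not model_decisions:
        raise ValueError("decision lists must not be empty")
    matches = sum(m == h for m, h in zip(model_decisions, human_decisions))
    return matches / len(model_decisions)


def burden_reduction(total_cables: int, manual_review_cables: int) -> float:
    """Fraction of cables that no longer require a human to read them (assumed definition)."""
    if total_cables <= 0:
        raise ValueError("total_cables must be positive")
    if not 0 <= manual_review_cables <= total_cables:
        raise ValueError("manual_review_cables must be between 0 and total_cables")
    return 1 - manual_review_cables / total_cables
```

Under this definition, a batch in which 37 of every 100 cables still go to manual review would correspond to a 63% reduction.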
The model was fully operationalized for the 1998 batch, said Rayo, who emphasized that people remain part of the process: they review both the cables tagged for manual review and a random subset from the other two categories for quality control.
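The quality-control step could look something like the following sketch, which draws a random subset of the cables the model decided on its own and hands them to human reviewers. The sampling fraction, the function name `qc_sample` and the use of a fixed seed are assumptions for illustration; the article does not describe how the subset is chosen.

```python
import math
import random
from typing import Optional, Sequence


def qc_sample(cable_ids: Sequence[str], fraction: float, seed: Optional[int] = None) -> list:
    """Pick a random subset of auto-decided cables for human spot checks.

    `fraction` is a hypothetical audit rate; `seed` makes the draw reproducible.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    k = math.ceil(len(cable_ids) * fraction)  # round up so a nonzero rate samples at least one cable
    return rng.sample(list(cable_ids), k)     # sampling without replacement
```

Sampling separately from the "declassify" and "keep classified" groups would let reviewers check each kind of automated decision on its own.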
“This AI machine learning technology does not replace human reviewers, but we can augment our work,” she said. “We can leverage this AI technology to do the tedious parts, but leaving the critical decision making to our staff.”
Freeing up resources is particularly important, said Stein, as the department stares down a tidal wave of records coming up for declassification.
“We have a big growth in cable traffic that’s going to occur in the next several years, and the question is, ‘How will we ever address this declassification review demand, which is also directly related to growing FOIA requests and the volume of electronic records that we have as well?’” he explained. “Without a change… we were going to have big problems meeting demands, which we’re already struggling to do as it is.”
“Machine learning, a subfield of AI, [will] be required if we’re going to automate FOIA searches for the billions and billions of records that government agencies hold,” said Debra Steidl Wall, deputy archivist. But “concerns exist about standards of use and thoughtful legal analysis.”
Another coming change the department will need to reckon with is the type of records up for declassification, which might not work well with the recently operationalized cable model. The number of classified emails up for that review is going to skyrocket as the reviews reach records from the 2000s, hitting 12 million for the year 2018, said Rayo.
One meeting attendee asked about the potential for errors in the algorithm to lead to the release of records that shouldn’t have been declassified. Stein’s response: “Should we just not release anything then and keep the status quo?”
Risk appetite can be a “real problem,” he said. “We never want to release anything we shouldn’t, but we also know there’s an obligation to be transparent with the public of these records.”