Intelligence Community Wants Multilingual Text Search


IARPA is looking for a system that can search databases in other languages.

Spy agencies want investigators to be able to search large databases of foreign-language documents by typing a query in English.

Next month, the Intelligence Advanced Research Projects Activity is gathering companies whose technology can process English phrases, churn through text stored in other languages, and return the results in English. In other words, "English in, English out."

The MATERIAL project — "Machine Translation for English Retrieval of Information in Any Language" — might let users search vast quantities of text and speech for relevant information on "Zika virus," provided they specify whether they're interested in results related to "government" or "health," according to a FedBizOpps posting.


An ideal solution would communicate how relevant each result is to the initial query, the posting said. It would also involve elements of advanced natural language processing, speech recognition, information retrieval and machine-learning techniques.

Currently, this kind of technology requires "a substantial investment" and "many months or years of development," the FBO posting said. Eventually, the technology would need to run queries faster and across more languages. It would also need to process formal and informal text and speech.

The MATERIAL program could change the way government analysts "identify foreign language speech and text data," Office of the Director of National Intelligence spokesman Charles Carithers said in a statement to Nextgov.

The Proposers' Day is Sept. 27.