Big ideas about big data could win prizes
NASA and other agencies are hosting a series of competitions to gather ideas for tackling some of the challenges of big data.
Government agencies have been using competitions with some success to harness the collective brainpower of the public in solving thorny technology problems. Now three agencies -- NASA, the Energy Department and the National Science Foundation -- are using the model to approach big-data conundrums. Separately, the NSF and the National Institutes of Health are funding additional research into big data through a more conventional grants process.
The NASA-NSF-DOE project is only the first of a planned series of contests, run through the crowdsourcing site TopCoder.com.
“Big Data is characterized not only by the enormous volume or the velocity of its generation but also by the heterogeneity, diversity and complexity of the data,” said Suzi Iacono, co-chair of the interagency Big Data Senior Steering Group, in a statement posted to TopCoder announcing the challenge. “There are enormous opportunities to extract knowledge from these large-scale diverse data sets, and to provide powerful new approaches to drive discovery and decision-making, and to make increasingly accurate predictions. We’re excited to see what this competition will yield and how it will guide us in funding the next round of big data science and engineering.”
Before the data can be used, however, it has to be homogenized. As the information flows in from a variety of data sources, it does not easily blend. So the question that entrants in the first contest must answer, as reported on GigaOm, is: “How can we make heterogeneous (dissimilar and incompatible) data sets homogeneous (uniformly accessible, compatible, able to be grouped and/or matched) so usable information can be extracted? How can information then be converted into real knowledge that can inform critical decisions and solve societal challenges?”
Those interested in tackling that question must register by Oct. 13 and submit their entry by Oct. 19. Three $500 prizes will be awarded. According to GigaOm’s report, future contests will concern more specific applications such as energy, health care and earth science.
Steve Kelman, Weatherhead Professor of Public Management at Harvard University and an FCW columnist, has long advocated the use of competitions to find innovative solutions.
"Contests now seem to be spreading beyond developing solutions for very specific, isolated problems to the kind of challenge NASA has created here, which is for working on a major conceptual challenge faced by many organizations, in both government and industry," he said. "It’s a real experiment to see whether the challenge format can successfully work in such less-concrete areas, but I am very pleased that NASA is giving this a try."
However, he suggested, the prize amount might be too low for an effort that could require a lot of work.
Jason Crusan, director of the NASA Tournament Lab, said contests are an effective way to get innovative ideas, even for relatively low prize amounts, because winning competitors gain prestige and an accomplishment for their resumes, which can lead to career advancement. "We get unsolicited ideas all the time," he said, as quoted in FCW’s sister publication GCN. "It doesn’t require a high incentive to get high-quality ideas."
Meanwhile, the NSF announced it has awarded grants totaling nearly $15 million for new fundamental research projects in big data. The awards are intended to fund efforts to develop better tools and methods for turning massive amounts of data into useful information.
NSF made the awards to eight research institutions at universities, on behalf of NIH. "To get the most value from the massive biological data sets we are now able to collect, we need better ways of managing and analyzing the information they contain," said NIH Director Francis Collins in a written statement. "The new awards that NIH is funding will help address these technological challenges--and ultimately help accelerate research to improve health--by developing methods for extracting important, biomedically relevant information from large amounts of complex data."