When search knows what you need
In his new book, Microsoft’s Stefan Weitz previews what predictive analytics could mean in practice.
Stefan Weitz is director of search at Microsoft’s Bing, so it’s no surprise that he has lots to say about the technology. His insights are not limited to the underlying math and mechanics (though he has plenty to offer on those topics) because he believes search has the potential to evolve far beyond an information-retrieval tool and truly make our lives better.
Search is a “hinge that can join together the best parts of machines and the best parts of humans,” Weitz declares.
If that sounds a bit utopian for a tool that’s often used to settle bar bets and find obscure tax forms, well, Weitz agrees.
That elusive hinge, he writes, is “not the search that you know today, and likely not even the search that the big technology companies are currently building — but it’s the search that comes into view when we think about it less as a tool for finding pages and more as a group of functions that can be deployed to make us smarter, happier, and better connected in our real-world lives.”
In other words, Weitz’s focus is not search in the sense most of us think about it. The full title of his book, “Search: How the Data Explosion Makes Us Smarter,” tips his hand. The real emphasis is on big data and what it increasingly makes possible.
At a recent presentation in Falls Church, Va., Weitz sketched his definition of what “near-term search” could entail:
- Search queries will not be words. They will be any change in state.
- Search won’t need to listen to what you say to know what you mean.
- Search will understand and take action in the real world.
- Search will appear when and where you need it, even if you don’t know you need it.
- Search will contribute to human knowledge, not just index it.
- Search will simplify our lives.
Again, that’s optimistic stuff. But in his book, Weitz offers example after example of how the future is often already here.
Take, for instance, the research that computer scientist Eric Horvitz, a former president of the Association for the Advancement of Artificial Intelligence, has done on searches for first-aid information.
According to Weitz, Horvitz realized that traditional search was not very good for time-sensitive medical queries and that individuals doing the searching often muddled through several unhelpful results before getting to the information they needed.
So Horvitz “analyzed query logs and identified chains of queries (basically a succession of queries within a period of time) in which the final query was something like a hospital address,” Weitz writes. “He also looked at mobile query chains and isolated sets where the GPS stopped at a hospital or the user dialed 911. By tying together disparate graphs (location, phone, queries), he was able to train the system to better understand situations where an immediate response was necessary.... For example, if search detected a query that was likely resolved by CPR, the system would not show a YouTube video on CPR that had a two-minute introduction.”
Similarly, when the system detected a pattern of queries made via mobile phone, “it could automatically begin to build a route to the nearest hospital or dial 911 in the background.”
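For readers who want a concrete picture of a "query chain," here is a minimal Python sketch of the idea as the book describes it: group each user's queries into chains when they fall close together in time, then keep the chains whose final query looks like a hospital lookup. It is not Horvitz's actual method. The `Query` record, the five-minute gap, and the `URGENT_ENDINGS` markers are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    user: str    # hypothetical anonymized user id
    time: float  # seconds since an arbitrary starting point
    text: str


# Assumed markers for a chain that ends in a hospital lookup.
URGENT_ENDINGS = ("hospital", "emergency room")


def build_chains(queries, max_gap=300.0):
    """Split each user's queries into chains of closely spaced queries.

    A new chain starts whenever the gap to the previous query from the
    same user exceeds max_gap seconds (an assumed threshold).
    """
    by_user = defaultdict(list)
    for q in queries:
        by_user[q.user].append(q)

    chains = []
    for user_queries in by_user.values():
        user_queries.sort(key=lambda q: q.time)
        current = [user_queries[0]]
        for q in user_queries[1:]:
            if q.time - current[-1].time <= max_gap:
                current.append(q)
            else:
                chains.append(current)
                current = [q]
        chains.append(current)
    return chains


def ends_urgently(chain):
    """True if the chain's final query contains an assumed urgent marker."""
    last = chain[-1].text.lower()
    return any(marker in last for marker in URGENT_ENDINGS)


def urgent_chains(queries, max_gap=300.0):
    """Return only the chains whose last query looks like a hospital lookup."""
    return [c for c in build_chains(queries, max_gap) if ends_urgently(c)]
```

In a real system the interesting part is what comes next: the earlier queries in those chains become training examples of what an urgent search looks like before the user ever types "hospital."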
That’s not just data retrieval. It’s what Weitz terms “the capable web,” and it’s where he argues we should be heading as quickly as the technology will allow.
Reality check
It could take a while, however. Weitz reports that “major search systems still see 25 percent or more of queries failing for users, as measured by how quickly users click back to the search results page after they have clicked on a link.”
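The metric Weitz cites can be illustrated with a few lines of Python. This is a rough sketch rather than how any search engine actually computes it: it assumes one clicked result per query, a hypothetical `Click` record that stores how long the user stayed before returning to the results page, and an arbitrary 10-second cutoff for a "quick" return.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Click:
    query: str
    # Seconds spent on the clicked page before returning to the results
    # page; None means the user never came back (hypothetical field).
    seconds_before_return: Optional[float]


def failure_rate(clicks, quick_return=10.0):
    """Fraction of clicks followed by a quick return to the results page.

    quick_return is an assumed cutoff in seconds, not a published figure.
    """
    if not clicks:
        return 0.0
    failed = sum(
        1
        for c in clicks
        if c.seconds_before_return is not None
        and c.seconds_before_return < quick_return
    )
    return failed / len(clicks)
```

A user who clicks a link and bounces back within a few seconds probably did not find what they needed, which is why a quick return serves as a proxy for a failed query.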
And his optimism comes with reality checks. “Search” has whole chapters devoted to the technology, business, legal and cultural hurdles that could slow or stall this dramatic evolution of our ability to access and use knowledge. (The brief examination of data ownership alone is worth the book’s cover price.) What’s holding back search? Weitz asks. The answer is “a lot.”
Yet for those tasked with making government more citizen-centric or mining data in support of other critical missions, those hurdles are valuable food for thought. How does one cross-correlate the discrete “islands of data” to provide better service while still protecting and respecting privacy? If search can understand what someone has already read elsewhere and can apply probabilistic models, what does that mean for delivering content? As sensors drive the amount of searchable data into new stratospheres, how does one decide what information to ignore? What is the best way to help a user find that obscure tax form?
The techno-optimism is informative as well. Weitz is a breezy writer, and there are plenty of fun nuggets scattered through the book. For example, Microsoft has some 33 billion objects modeled in Bing’s “knowledge repository.” Silicon Valley venture capitalist Vinod Khosla believes machine learning will prove to have a bigger impact than mobile technology. Siri worked better five years ago, before Apple modified it to work at scale.
Far more useful, however, are Weitz’s mini-seminars on what analytics and search technologies are making possible and why we need to think bigger than keywords and Web pages.
Ultimately, Weitz argues, “more equal access to information for all people will radically change the world. The end result is better decision-making.”