Search, Information Retrieval, and Data Mining

By Evgeniy Gabrilovich, Ravi Kumar and Belle Tseng


Search Technologies is an international team of experts in search, algorithms, data processing, data mining, information retrieval, and natural language processing. Together, we build systems and algorithms that analyze user needs, then synthesize and deliver the right responses from data sources around the globe.

Challenges

Task Completion

How can we understand and model user information needs, including long-running proclivities and preferences, longitudinal tasks, and session-level models? How can we learn what a user is actually searching for from query and click logs? How do we learn the components of a mixture of intents, given aggregated information about many users and their patterns of interaction with search results?
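As a rough illustration of the "mixture of intents" idea, the sketch below estimates a per-query intent distribution from aggregated click counts. It assumes that each clicked URL has already been labeled with an intent, which sidesteps the harder problem of learning the components themselves. The function name `intent_mixture`, the log format, and the `url_intent` mapping are all hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def intent_mixture(
    click_log: Iterable[Tuple[str, str]],
    url_intent: Dict[str, str],
) -> Dict[str, Dict[str, float]]:
    """Estimate P(intent | query) from aggregated (query, clicked_url) pairs.

    Assumption: url_intent is a hand-built, hypothetical mapping from URL
    to intent label. Clicks on unlabeled URLs are ignored.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for query, url in click_log:
        intent = url_intent.get(url)
        if intent is not None:
            counts[query][intent] += 1

    mixture: Dict[str, Dict[str, float]] = {}
    for query, intent_counts in counts.items():
        total = sum(intent_counts.values())
        # Normalize click counts into a probability distribution.
        mixture[query] = {i: n / total for i, n in intent_counts.items()}
    return mixture


# Toy click log aggregated over many users (made-up data).
log = [
    ("jaguar", "cars.example/jaguar"),
    ("jaguar", "cars.example/jaguar"),
    ("jaguar", "cars.example/dealers"),
    ("jaguar", "zoo.example/jaguar"),
]
labels = {
    "cars.example/jaguar": "car",
    "cars.example/dealers": "car",
    "zoo.example/jaguar": "animal",
}
result = intent_mixture(log, labels)
```

On this toy log, three of the four clicks for "jaguar" land on car pages, so the estimated mixture puts 0.75 on the car intent and 0.25 on the animal intent. A real system would have to infer the intent labels rather than assume them.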

Metrics

How do we gauge intent satisfaction and design a general framework for measuring relevance? As the presentation aspects of search become more intricate, how do we treat search results presentation as a global optimization problem? How does diversity play a role? And what types of result set re-ranking can be beneficial? Answering these questions requires a more holistic notion of user satisfaction.
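One well-known way to trade relevance against diversity when re-ranking a result set is maximal marginal relevance (MMR). The sketch below is a minimal illustration and is not taken from this page: the relevance scores, the term sets, the Jaccard similarity, and the weight `lam` are all assumptions made for the example.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two term sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def mmr_rerank(
    relevance: Dict[str, float],
    terms: Dict[str, Set[str]],
    k: int,
    lam: float = 0.7,
) -> List[str]:
    """Greedy MMR: repeatedly pick the document that maximizes
    lam * relevance - (1 - lam) * (max similarity to already selected docs).
    """
    selected: List[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:

        def score(doc: str) -> float:
            redundancy = max(
                (jaccard(terms[doc], terms[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[doc] - (1 - lam) * redundancy

        # Iterate over sorted candidates so that ties break deterministically.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: "a" and "b" are near-duplicates, "c" covers another sense.
rel = {"a": 0.9, "b": 0.85, "c": 0.5}
doc_terms = {
    "a": {"jaguar", "car"},
    "b": {"jaguar", "car"},
    "c": {"jaguar", "animal"},
}
diverse = mmr_rerank(rel, doc_terms, k=3, lam=0.5)
by_relevance = mmr_rerank(rel, doc_terms, k=3, lam=1.0)
```

With `lam=0.5`, the less relevant but novel document "c" is promoted above the near-duplicate "b". With `lam=1.0`, the ordering reduces to plain relevance ranking.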

Web Mining

The focus is Web information mining, including analysis of click and query logs for tasks such as classification and translation. Large-scale distributed computing infrastructure has helped us analyze data at a scale unimaginable a few years ago. Mining this vast amount of data is important for identifying latent patterns, tracking trend changes, and analyzing data seamlessly at various scales. How can we do this efficiently?

Multilingual IR

With an increasing number of non-English web users, information retrieval in languages other than English will become increasingly important in the near future. This involves the effective and efficient adaptation of existing techniques developed for English, such as statistical machine translation, to other languages. How can this adaptation be done? Can we use common-sense and background knowledge from across the world to go beyond information retrieval at the level of mere words?

Nextgen Search

Can we redefine the IR outcome, moving from providing a list of links to potentially relevant documents to synthesizing pages that comprehensively answer the user's information need? How can we provide a personalized search experience by leveraging each individual's search history, while taking advantage of many users' past patterns of interaction? What kinds of personal information can be stored and aggregated without violating user privacy?

World Knowledge

Can we use common-sense and background knowledge to go beyond information retrieval at the level of mere words? Imagine using all of the knowledge available in Freebase to sift through information more effectively.