Full proposal

Here is a slightly more in-depth perspective on what we're hoping to achieve, provided as informal background material. Consider the CFP and homepage authoritative regarding specific topics for paper submissions. 


This full-day Workshop focuses on new approaches to using structured data to improve Web search. Most Web documents and queries are about entities and the relationships between them, i.e., structured data with documented semantics. However, popular search engines (such as Google, Bing, Yahoo and Yandex) have historically ignored structured data, relying instead on techniques that model documents and queries as bags of words.

Academic research has in the past approached concept-based search both to improve recall (e.g., latent semantic indexing) and for disambiguation (e.g., use of word senses for disambiguating query intent). Specific verticals, such as medicine, have made some progress in improving document search by using knowledge bases about the topic of interest. Unfortunately, because of both the scale of the Web and the range of topics it covers, little of this work has made its way into the major Web search engines.

Recent developments, most notably the dramatic increase in the use of structured data markup on web pages (through efforts such as Schema.org and the Open Graph Protocol) and the easy availability of structured data from sources such as DBpedia and Freebase, have led to substantial interest from mainstream search engines. According to various estimates, up to 20% of web pages now contain some form of structured data markup, with most of the new markup created in the last two years.

However, we are still in the very early stages of the evolution of how search engines use this structured data. Most of the current work is focussed either on searching databases of facts about entities and presenting them alongside the search results, or on annotating search results with additional data. The core problems of using knowledge about entities to improve the ranking of documents, to help set the user context, and so on remain largely unexplored.

While the use of structured data is still limited in Web search engines, active research in this direction can be observed in many communities. Most notably, IR, database, and Semantic Web researchers have proposed a broad range of solutions for exploiting structured data in various search tasks, such as annotated document retrieval, vertical-specific search (e.g., curriculum search, job search), relational search, and complex question answering. The goal of this Workshop is to bring these communities together to focus on the central question of how to make these solutions applicable to Web search engines. The central theme of the workshop is to explore novel ways of exploiting explicit representations of entities and the relationships between them to improve Web search. The focus will be on presentations and discussion of new ideas on how structured data can be used in Web search, as opposed to presentations on improvements in the performance of known applications of structured data in search.