In recent years, the amount of RDF data on the Web has drastically increased. For an effective search over such a large Web of data, ranking of results is crucial. To allow efficient query processing of ranked queries , top-k join processing has been proposed, which aims at computing k top-ranked results , without complete result materialization. However, Web search poses novel challenges. Most notably, users are frequently willing to sacrifice result accuracy in favor of result computation time. Thus, there is a strong need to approximate the top-k results. Previous work on approximate top-k processing , however, is not applicable for the join top-k |+|
|Abstract=the Web of data, ranking of results is crucial. allow efficient processing of ranked queries-k at computing k top-ranked results without complete result materialization. However, result computation time . Thus, there is a strong
|−|setting – it solely targets the selection top-k problem. Further, existing approaches require offline computed score statistics to be available. Unfortunately, many important kinds of Web search queries, e.g., keyword or spatial queries, commonly query-/user-dependent |+|
need results. work on approximate top-k processing is not for the of . In this paper, we propose approximate top-k join framework
|−|ranking is employed. Thus, one only has score information at runtime. In this paper, we address these shortcomings and propose a novel approximate top-k join framework , well-suited for Web search queries and ranking. For this, all necessary score statistics are learned via a pay-as-you-go training at runtime. We study our approach in a theoretical manner, and formally show its efficiency as well as effectiveness. Further, we conducted experiments on benchmarks comprising synthetic as well as real- world Web data/- |+|
for Web queries. necessary statistics are learned a pay-as-you-go . We conducted experiments on --. Our
|−|queries. Our results are very promising , as we could achieve up to 65% time savings, while maintaining a high precision/recall. |+|
results are very promisingwe could achieve up to 65% time savings, while maintaining a high precision/recall.
Version vom 15. Januar 2014, 14:49 Uhr
Pay-as-you-go Approximate Join Top-k Processing for the Web of Data
Institution: Institut AIFB, KIT
Erscheinungsort / Ort: Karlsruhe
For effectively searching the Web of data, ranking of results is a crucial. Top-k processing strategies have been proposed to allow an efficient processing of such ranked queries. Top-k strategies aim at computing k top-ranked results without complete result materialization. However, for many applications result computation time is much more important than result accuracy and completeness. Thus, there is a strong
need for approximated ranked results. Unfortunately, previous work on approximate top-k processing is not well-suited for the Web of data. In this paper, we propose the first approximate top-k join framework
for Web data and queries. Our approach is very lightweight – necessary statistics are learned at runtime in a pay-as-you-go manner. We conducted extensive experiments on state-of-art SPARQL benchmarks. Our
results are very promising: we could achieve up to 65% time savings, while maintaining a high precision/recall.