Stage-oe-small.jpg

Techreport3040: Unterschied zwischen den Versionen

Aus Aifbportal
Wechseln zu:Navigation, Suche
K
 
(4 dazwischenliegende Versionen desselben Benutzers werden nicht angezeigt)
Zeile 20: Zeile 20:
 
}}
 
}}
 
{{Publikation Details
 
{{Publikation Details
|Abstract=In recent years, the amount of RDF data on the Web has drastically increased. For an effective search over such a large Web of data, ranking of results is crucial. To allow efficient query processing of ranked queries, top-k join processing has been proposed, which aims at computing k top-ranked results, without complete result materialization. However, Web search poses novel challenges. Most notably, users are frequently willing to sacrifice result accuracy in favor of result computation time. Thus, there is a strong need to approximate the top-k results. Previous work on approximate top-k processing, however, is not applicable for the join top-k
+
|Abstract=For effectively searching the Web of data, ranking of results is a crucial. Top-k processing strategies have been proposed to allow an efficient processing of such ranked queries. Top-k strategies aim at computing k top-ranked results without complete result materialization. However, for many applications result computation time is much more important than result accuracy and completeness. Thus, there is a strong
setting – it solely targets the selection top-k problem. Further, existing approaches require offline computed score statistics to be available. Unfortunately, many important kinds of Web search queries, e.g., keyword or spatial queries, commonly query-/user-dependent
+
need for approximated ranked results. Unfortunately, previous work on approximate top-k processing is not well-suited for the Web of data. In this paper, we propose the first approximate top-k join framework
ranking is employed. Thus, one only has score information at runtime. In this paper, we address these shortcomings and propose a novel approximate top-k join framework, well-suited for Web search queries and ranking. For this, all necessary score statistics are learned via a pay-as-you-go training at runtime. We study our approach in a theoretical manner, and formally show its efficiency as well as effectiveness. Further, we conducted experiments on benchmarks comprising synthetic as well as real-world Web data/-
+
for Web data and queries. Our approach is very lightweight – necessary statistics are learned at runtime in a pay-as-you-go manner. We conducted extensive experiments on state-of-art SPARQL benchmarks. Our
queries. Our results are very promising, as we could achieve up to 65% time savings, while maintaining a high precision/recall.
+
results are very promising: we could achieve up to 65% time savings, while maintaining a high precision/recall.
|Download=awa-approx-topk-tr.pdf
+
|Download=approxtopk-eswc-awa-tr.pdf
 
|Projekt=IZEUS
 
|Projekt=IZEUS
 
|Forschungsgruppe=Wissensmanagement
 
|Forschungsgruppe=Wissensmanagement

Aktuelle Version vom 14. April 2014, 06:50 Uhr


Pay-as-you-go Approximate Join Top-k Processing for the Web of Data




Published: 2013 Oktober
Institution: Institut AIFB, KIT
Erscheinungsort / Ort: Karlsruhe
Archivierungsnummer:3040

BibTeX



Kurzfassung
For effectively searching the Web of data, ranking of results is a crucial. Top-k processing strategies have been proposed to allow an efficient processing of such ranked queries. Top-k strategies aim at computing k top-ranked results without complete result materialization. However, for many applications result computation time is much more important than result accuracy and completeness. Thus, there is a strong need for approximated ranked results. Unfortunately, previous work on approximate top-k processing is not well-suited for the Web of data. In this paper, we propose the first approximate top-k join framework for Web data and queries. Our approach is very lightweight – necessary statistics are learned at runtime in a pay-as-you-go manner. We conducted extensive experiments on state-of-art SPARQL benchmarks. Our results are very promising: we could achieve up to 65% time savings, while maintaining a high precision/recall.

Download: Media:approxtopk-eswc-awa-tr.pdf

Projekt

IZEUS



Forschungsgruppe

Wissensmanagement


Forschungsgebiet

Semantische Suche