Techreport3040: Unterschied zwischen den Versionen
Awa (Diskussion | Beiträge) |
Awa (Diskussion | Beiträge) |
||
Zeile 24: | Zeile 24: | ||
ranking is employed. Thus, one only has score information at runtime. In this paper, we address these shortcomings and propose a novel approximate top-k join framework, well-suited for Web search queries and ranking. For this, all necessary score statistics are learned via a pay-as-you-go training at runtime. We study our approach in a theoretical manner, and formally show its efficiency as well as effectiveness. Further, we conducted experiments on benchmarks comprising synthetic as well as real-world Web data/- | ranking is employed. Thus, one only has score information at runtime. In this paper, we address these shortcomings and propose a novel approximate top-k join framework, well-suited for Web search queries and ranking. For this, all necessary score statistics are learned via a pay-as-you-go training at runtime. We study our approach in a theoretical manner, and formally show its efficiency as well as effectiveness. Further, we conducted experiments on benchmarks comprising synthetic as well as real-world Web data/- | ||
queries. Our results are very promising, as we could achieve up to 65% time savings, while maintaining a high precision/recall. | queries. Our results are very promising, as we could achieve up to 65% time savings, while maintaining a high precision/recall. | ||
− | |Download= | + | |Download=awa-approx-topk-tr.pdf |
|Projekt=IZEUS | |Projekt=IZEUS | ||
|Forschungsgruppe=Wissensmanagement | |Forschungsgruppe=Wissensmanagement |
Version vom 27. Oktober 2013, 19:24 Uhr
Published: 2013
Oktober
Institution: Institut AIFB, KIT
Erscheinungsort / Ort: Karlsruhe
Archivierungsnummer:3040
Kurzfassung
In recent years, the amount of RDF data on the Web has drastically increased. For an effective search over such a large Web of data, ranking of results is crucial. To allow efficient query processing of ranked queries, top-k join processing has been proposed, which aims at computing k top-ranked results, without complete result materialization. However, Web search poses novel challenges. Most notably, users are frequently willing to sacrifice result accuracy in favor of result computation time. Thus, there is a strong need to approximate the top-k results. Previous work on approximate top-k processing, however, is not applicable for the join top-k
setting – it solely targets the selection top-k problem. Further, existing approaches require offline computed score statistics to be available. Unfortunately, many important kinds of Web search queries, e.g., keyword or spatial queries, commonly query-/user-dependent
ranking is employed. Thus, one only has score information at runtime. In this paper, we address these shortcomings and propose a novel approximate top-k join framework, well-suited for Web search queries and ranking. For this, all necessary score statistics are learned via a pay-as-you-go training at runtime. We study our approach in a theoretical manner, and formally show its efficiency as well as effectiveness. Further, we conducted experiments on benchmarks comprising synthetic as well as real-world Web data/-
queries. Our results are very promising, as we could achieve up to 65% time savings, while maintaining a high precision/recall.
Download: Media:awa-approx-topk-tr.pdf