
Estimating Characteristic Sets for RDF Dataset Profiles Based on Sampling

Authors: Lars Heling, Maribel Acosta





Published: May 2020

Book title: The Semantic Web - 17th International Conference, ESWC 2020
Volume: 12123
Series: Lecture Notes in Computer Science
Publisher: Springer

Non-refereed publication


Abstract
RDF dataset profiles provide a formal representation of a dataset’s characteristics (features). These profiles may cover various aspects of the data represented in the dataset as well as statistical descriptors of the data distribution. In this work, we focus on the characteristic sets profile feature, which summarizes the characteristic sets contained in an RDF graph. Because this feature provides detailed information on both the structure and semantics of RDF graphs, it can be very beneficial in query optimization. However, in decentralized query processing, computing it is challenging because accessing and processing all datasets is difficult and/or costly. To overcome this shortcoming, we propose the concept of profile feature estimation. We present sampling methods and projection functions to generate estimations that aim to be as similar as possible to the original characteristic sets profile feature. In our evaluation, we investigate the feasibility of the proposed methods on four RDF graphs. Our results show that samples containing 0.5% of the entities in the graph allow for good estimations and may be used by downstream tasks such as query plan optimization in decentralized querying.
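
To make the idea concrete, the following is a minimal sketch and not the sampling methods or projection functions of the paper. It assumes the common definition of an entity's characteristic set as the set of predicates that occur with that entity as subject, draws a uniform sample of entities, and projects the sampled counts by simple linear scaling. The function names, the uniform sampling, the linear projection, and the assumption that the total number of entities is known are all illustrative choices.

```python
import random
from collections import Counter, defaultdict


def characteristic_sets(triples):
    """Count how many entities share each characteristic set.

    A characteristic set is taken here to be the set of predicates
    that appear with a given subject (assumed definition).
    """
    predicates_by_subject = defaultdict(set)
    for subject, predicate, _obj in triples:
        predicates_by_subject[subject].add(predicate)
    return Counter(frozenset(p) for p in predicates_by_subject.values())


def estimate_characteristic_sets(triples, sample_fraction, seed=0):
    """Estimate entity counts per characteristic set from an entity sample.

    Hypothetical projection: sampled counts are scaled linearly by
    (number of entities) / (sample size). For simplicity the sketch
    filters a full triple list; a real decentralized setting would
    only retrieve the triples of the sampled entities.
    """
    subjects = sorted({s for s, _p, _o in triples})
    sample_size = max(1, round(len(subjects) * sample_fraction))
    sampled = set(random.Random(seed).sample(subjects, sample_size))
    sample_triples = [t for t in triples if t[0] in sampled]
    scale = len(subjects) / sample_size
    sample_counts = characteristic_sets(sample_triples)
    return {cs: count * scale for cs, count in sample_counts.items()}
```

Calling `estimate_characteristic_sets(triples, 0.005)` would correspond to the 0.5% entity sample mentioned in the abstract; how close such a naive estimate comes to the true profile feature depends on the graph and is not something this sketch evaluates.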

DOI: 10.1007/978-3-030-49461-2_10



Research group

Web Science


Research area