
Inproceedings3917: Difference between revisions

From Aifbportal
(Page created: „{{Publikation Erster Autor |ErsterAutorNachname=Färber |ErsterAutorVorname=Michael }} {{Publikation Author |Rank=2 |Author=Ann-Kathrin Leisinger }} {{Inprocee…“)
 
 
Line 12: Line 12:
 
|Year=2021
 
|Year=2021
 
|Booktitle=Proceedings of the 15th ACM Recommender Systems Conference (RecSys'21)
 
|Booktitle=Proceedings of the 15th ACM Recommender Systems Conference (RecSys'21)
- |Pages=1-5
+ |Pages=749--752
 
|Publisher=ACM
 
|Publisher=ACM
 
}}
 
}}
Line 20: Line 20:
 
{{Publikation Details
 
{{Publikation Details
 
|Abstract=The number of datasets is steadily rising, making it increasingly difficult for researchers and practitioners in the various scientific disciplines to be aware of all datasets, particularly of the most relevant datasets for a given research problem. To this end, dataset search engines have been proposed. However, they are based on the users' keywords and thus have difficulties in determining precisely fitting datasets for complex research problems. In this paper, we propose the system at http://data-hunter.io that recommends suitable datasets to users based on given research problem descriptions. It is based on fastText for the text representation and text classification, the Data Set Knowledge Graph (DSKG) with metadata about almost 1,700 unique datasets, as well as 88,000 paper abstracts as research problem descriptions for training the model. Overall, our system demonstrates that recommending datasets facilitates data provisioning and reuse according to the FAIR principles and that dataset recommendation is a promising future research direction.
 
|Abstract=The number of datasets is steadily rising, making it increasingly difficult for researchers and practitioners in the various scientific disciplines to be aware of all datasets, particularly of the most relevant datasets for a given research problem. To this end, dataset search engines have been proposed. However, they are based on the users' keywords and thus have difficulties in determining precisely fitting datasets for complex research problems. In this paper, we propose the system at http://data-hunter.io that recommends suitable datasets to users based on given research problem descriptions. It is based on fastText for the text representation and text classification, the Data Set Knowledge Graph (DSKG) with metadata about almost 1,700 unique datasets, as well as 88,000 paper abstracts as research problem descriptions for training the model. Overall, our system demonstrates that recommending datasets facilitates data provisioning and reuse according to the FAIR principles and that dataset recommendation is a promising future research direction.
- |Download=DataHunter_RecSys2021.pdf
+ |Download=DataHunter_RecSys2021_v3.pdf
+ |Link=https://doi.org/10.1145/3460231.3478882
 
|DOI Name=10.1145/3460231.3478882
 
|DOI Name=10.1145/3460231.3478882
 
|Forschungsgruppe=Web Science
 
|Forschungsgruppe=Web Science

Latest revision as of 27 September 2021, 09:36


DataHunter: A System for Finding Datasets Based on Scientific Problem Descriptions





Published: 2021

Book title: Proceedings of the 15th ACM Recommender Systems Conference (RecSys'21)
Pages: 749--752
Publisher: ACM

Refereed publication

BibTeX


Abstract
The number of datasets is steadily rising, making it increasingly difficult for researchers and practitioners in the various scientific disciplines to be aware of all datasets, particularly of the most relevant datasets for a given research problem. To this end, dataset search engines have been proposed. However, they are based on the users' keywords and thus have difficulties in determining precisely fitting datasets for complex research problems. In this paper, we propose the system at http://data-hunter.io that recommends suitable datasets to users based on given research problem descriptions. It is based on fastText for the text representation and text classification, the Data Set Knowledge Graph (DSKG) with metadata about almost 1,700 unique datasets, as well as 88,000 paper abstracts as research problem descriptions for training the model. Overall, our system demonstrates that recommending datasets facilitates data provisioning and reuse according to the FAIR principles and that dataset recommendation is a promising future research direction.
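
The core idea in the abstract, mapping a free-text research problem description to a suitable dataset via text classification, can be illustrated with a minimal sketch. It is not the DataHunter implementation: instead of fastText it uses a plain bag-of-words nearest-centroid classifier from the Python standard library, and the training pairs, dataset labels, and function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy training data: (problem description, dataset label).
# These pairs are invented for illustration and do not come from the DSKG.
TRAINING = [
    ("we classify handwritten digits with convolutional networks", "digits-images"),
    ("image recognition of handwritten characters", "digits-images"),
    ("sentiment analysis of movie reviews using text classification", "movie-reviews"),
    ("predicting review polarity from text", "movie-reviews"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def build_centroids(pairs):
    """Sum the word counts of all descriptions that share a dataset label."""
    centroids = {}
    for text, label in pairs:
        centroids.setdefault(label, Counter()).update(tokenize(text))
    return centroids


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(description, centroids, k=1):
    """Return the k dataset labels most similar to the problem description."""
    query = Counter(tokenize(description))
    ranked = sorted(
        centroids,
        key=lambda label: cosine(query, centroids[label]),
        reverse=True,
    )
    return ranked[:k]


centroids = build_centroids(TRAINING)
print(recommend("handwritten digits recognition", centroids))
```

A real system along the lines described in the abstract would replace the word-count vectors with learned text representations and train on a much larger set of abstracts; the ranking step stays conceptually the same.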

Download: Media:DataHunter_RecSys2021_v3.pdf
Further information: https://doi.org/10.1145/3460231.3478882
DOI Link: 10.1145/3460231.3478882

Related tools

DataHunter


Research group

Web Science


Research area

Information Retrieval, Natural Language Processing, Digital Libraries, Knowledge Discovery, Artificial Intelligence