Current version as of 18 March 2020, 22:23

KORE 50^DYWC: An Evaluation Data Set for Entity Linking Based on DBpedia, YAGO, Wikidata and Crunchbase

Kristian Noullet, Rico Mix, Michael Färber

Published: 2020

Book title: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC'20)
Publisher: European Language Resources Association (ELRA)

Refereed publication

BibTeX

Abstract
A major domain of research in natural language processing is named entity recognition and disambiguation (NERD). One of the main ways of approaching this task is through the use of Semantic Web technologies and their structured data formats. Because structured data makes information easier to extract, it facilitates the creation of knowledge graphs. In order to properly evaluate a NERD system, gold standard data sets are required. A plethora of different evaluation data sets exists, mostly relying on either Wikipedia or DBpedia. Therefore, we have extended a widely used gold standard data set, KORE 50, to accommodate NERD tasks not only for DBpedia, but also for YAGO, Wikidata and Crunchbase. As such, our data set, KORE 50^DYWC, allows for a broader spectrum of evaluation. Among other things, the knowledge graph agnosticity of NERD systems may be evaluated, which, to the best of our knowledge, was not possible until now for this number of knowledge graphs.

Download: Media:KORE50-DYWC_LREC2020.pdf

Linked datasets

KORE 50^DYWC

Research group

Web Science

Research area

Information Retrieval, Information Extraction, Natural Language Processing

@@ Line 24: / Line 24: @@
 |Abstract=A major domain of research in natural language processing is named entity recognition and disambiguation (NERD). One of the main ways of attempting to achieve this goal is through use of Semantic Web technologies and its structured data formats. Due to the nature of structured data, information can be extracted more easily, therewith allowing for the creation of knowledge graphs. In order to properly evaluate a NERD system, gold standard data sets are required. A plethora of different evaluation data sets exists, mostly relying on either Wikipedia or DBpedia. Therefore, we have extended a widely-used gold standard data set, KORE 50, to not only accommodate NERD tasks for DBpedia, but also for YAGO, Wikidata and Crunchbase. As such, our data set, KORE 50 DYWC , allows for a broader spectrum of evaluation. Among others, the knowledge graph agnosticity of NERD systems may be evaluated which, to the best of our knowledge, was not possible until now for this number of knowledge graphs.
 |Download=KORE50-DYWC_LREC2020.pdf
-|Forschungsgruppe=Security • Usability • Society
+|Forschungsgruppe=Web Science
 }}
 {{Forschungsgebiet Auswahl

Inproceedings3787: Difference between revisions
