Cyrillic Script Publication Metadata Extraction
Data for training and evaluating metadata extraction models, based on 15 thousand Cyrillic-script publications. Publication date: 2021/04/22
Description
Data for training and evaluating sequence labeling models for metadata extraction, based on 15,553 papers in Cyrillic-script languages, spanning 27 years and three languages. For each paper, the ground-truth sequence labeling output is provided in TEI format and as annotated plain text.
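As a rough illustration of how TEI ground truth of this kind could be turned into token-level labels for a sequence labeling model, here is a minimal Python sketch. The sample fragment, the choice of the `title` and `persName` elements, the BIO label names, and the helper `tei_to_bio` are all assumptions made for this example; the actual files in the dataset may use different elements and a different labeling scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI fragment; the real dataset files may be structured differently.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Пример заголовка статьи</title></titleStmt>
      <sourceDesc><biblStruct><analytic>
        <author><persName>Иван Петров</persName></author>
      </analytic></biblStruct></sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>"""

# Assumed mapping from TEI element names to label names (illustrative only).
FIELD_TAGS = {"title": "TITLE", "persName": "AUTHOR"}


def tei_to_bio(xml_text):
    """Return (token, label) pairs in BIO style for the mapped TEI elements."""
    root = ET.fromstring(xml_text)
    pairs = []
    for elem in root.iter():
        # Strip the namespace part, e.g. "{http://...}title" -> "title".
        tag = elem.tag.split("}")[-1]
        if tag not in FIELD_TAGS:
            continue
        # Whitespace tokenization is a simplification for this sketch.
        tokens = "".join(elem.itertext()).split()
        for i, token in enumerate(tokens):
            prefix = "B" if i == 0 else "I"
            pairs.append((token, f"{prefix}-{FIELD_TAGS[tag]}"))
    return pairs


if __name__ == "__main__":
    for token, label in tei_to_bio(SAMPLE):
        print(f"{token}\t{label}")
```

The tab-separated token and label lines printed by the sketch are one common shape for annotated plain text; whether the dataset's plain-text files follow it is not stated here.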
People involved
Publications
inproceedings
Igor Shapiro, Tarek Saier, Michael Färber
Sequence Labeling for Citation Field Extraction from Cyrillic Script References
Proceedings of the AAAI Workshop on Scientific Document Understanding (SDU@AAAI'22), ACM
Johan Krause, Igor Shapiro, Tarek Saier, Michael Färber
Bootstrapping Multilingual Metadata Extraction: A Showcase in Cyrillic
Proceedings of the Second Workshop on Scholarly Document Processing, CEUR-WS
Projects