A Comparative Evaluation of Cross-lingual Text Annotation Techniques
Book title: Conference and Labs of the Evaluation Forum (CLEF 2013)
Series: Lecture Notes in Computer Science
In this paper, we study the problem of extracting knowledge from textual documents written in different languages by annotating the text on the basis of a cross-lingual knowledge base, namely Wikipedia. Our contribution is twofold. First, we propose a novel framework for evaluating cross-lingual text annotation techniques, based on annotating a parallel corpus with respect to a hub language of the cross-lingual knowledge base. Second, we use this framework to compare the performance of three cross-lingual text annotation approaches empirically: (i) Cross-lingual Named Entity Annotation (CL-NEA), (ii) Cross-lingual Wikifier Annotation (CL-WIFI), and (iii) Cross-lingual Explicit Semantic Analysis (CL-ESA). Besides establishing an evaluation framework, our results show how the three investigated approaches differ and demonstrate their respective advantages and disadvantages.
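The evaluation idea in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the hub language is English, the cross-language link table `LANGLINKS` is a hypothetical toy mapping, and Jaccard overlap stands in for whatever agreement measure the authors actually use. The intuition is that two parallel documents, annotated independently in their own languages, should map to similar sets of hub-language concepts.

```python
from typing import Dict, Set, Tuple

# Hypothetical cross-language link table (an assumption for illustration):
# (language, article title) -> hub-language (English) article title.
LANGLINKS: Dict[Tuple[str, str], str] = {
    ("de", "Berlin"): "Berlin",
    ("de", "Deutschland"): "Germany",
    ("es", "Berlín"): "Berlin",
    ("es", "Alemania"): "Germany",
    ("es", "Europa"): "Europe",
}


def to_hub(lang: str, annotations: Set[str]) -> Set[str]:
    """Map language-specific annotations to hub-language concepts.

    Annotations without a cross-language link are dropped.
    """
    return {LANGLINKS[(lang, a)] for a in annotations if (lang, a) in LANGLINKS}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two concept sets (assumed agreement measure)."""
    if not a and not b:
        return 1.0  # two empty annotation sets agree trivially
    return len(a & b) / len(a | b)


def parallel_agreement(lang_a: str, ann_a: Set[str],
                       lang_b: str, ann_b: Set[str]) -> float:
    """Agreement between the annotations of two parallel documents."""
    return jaccard(to_hub(lang_a, ann_a), to_hub(lang_b, ann_b))


# Toy annotations of one parallel document pair (German and Spanish).
score = parallel_agreement(
    "de", {"Berlin", "Deutschland"},
    "es", {"Berlín", "Alemania", "Europa"},
)
```

Under these assumptions, a technique whose annotations of parallel documents yield higher agreement in the hub language would be considered more consistent across languages; averaging `score` over all document pairs in a parallel corpus would give one corpus-level figure per technique.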