
Inproceedings2038: Difference between revisions

Authors: Philipp Sorg, Philipp Cimiano

Current version as of 23 June 2010, 11:55


An Experimental Comparison of Explicit Semantic Analysis Implementations for Cross-Language Retrieval





Published: June 2009
Editors: Helmut Horacek, Elisabeth Métais, Rafael Muñoz, Magdalena Wolska
Book title: Proceedings of the International Conference on Applications of Natural Language to Information Systems (NLDB)
Pages: 36-48
Publisher: Springer

Refereed publication

Abstract
Explicit Semantic Analysis (ESA) has been recently proposed as an approach to computing semantic relatedness between words (and indirectly also between texts) and has thus a natural application in information retrieval, showing the potential to alleviate the vocabulary mismatch problem inherent in standard Bag-of-Word models. The ESA model has been also recently extended to cross-lingual retrieval settings, which can be considered as an extreme case of the vocabulary mismatch problem. The ESA approach actually represents a class of approaches and allows for various instantiations. As our first contribution, we generalize ESA in order to clearly show the degrees of freedom it provides. Second, we propose some variants of ESA along different dimensions, testing their impact on performance on a cross-lingual mate retrieval task on two datasets (JRC-ACQUIS and Multext). Our results are interesting as a systematic investigation has been missing so far and the variations between different basic design choices are significant. We also show that the settings adopted in the original ESA implementation are reasonably good, which to our knowledge has not been demonstrated so far, but can still be significantly improved by tuning the right parameters (yielding a relative improvement on a cross-lingual mate retrieval task of between 62% (Multext) and 237% (JRC-ACQUIS) with respect to the original ESA model).
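
The mate retrieval setup mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual description of ESA, in which a text is mapped to a vector of association strengths with a fixed set of concepts (typically Wikipedia articles), and in which cross-lingual ESA uses one concept article per language, aligned by index. The toy concept articles, the TF-IDF weighting, and the names `CONCEPTS`, `esa_vector`, `cosine`, and `mate_rank` are illustrative assumptions, not the implementations or parameter settings compared in the paper.

```python
import math
from collections import Counter

# Hypothetical toy concept space: concept i has one "article" per language,
# aligned by index (a stand-in for cross-language article links).
CONCEPTS = {
    "en": ["dog cat animal pet", "car road engine drive", "bank money loan credit"],
    "de": ["hund katze tier haustier", "auto strasse motor fahren", "bank geld kredit darlehen"],
}

def esa_vector(text, articles):
    """Map a text to a concept vector with one weight per concept article."""
    tokens = text.lower().split()
    n = len(articles)
    article_counts = [Counter(a.split()) for a in articles]
    # Document frequency of each term across the concept articles.
    df = Counter(t for counts in article_counts for t in counts)
    vec = []
    for counts in article_counts:
        # Sum a simple TF-IDF association of every known token with this concept.
        vec.append(sum(counts[t] * math.log(1 + n / df[t]) for t in tokens if t in df))
    return vec

def cosine(u, v):
    """Cosine similarity of two concept vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mate_rank(query, candidates, mate_index, src="en", tgt="de"):
    """Rank (1 = best) of the query's translation among the target-language candidates."""
    q = esa_vector(query, CONCEPTS[src])
    sims = [cosine(q, esa_vector(c, CONCEPTS[tgt])) for c in candidates]
    order = sorted(range(len(candidates)), key=lambda i: -sims[i])
    return order.index(mate_index) + 1

rank = mate_rank(
    "the dog is a pet",
    ["das auto hat einen motor", "der hund ist ein haustier"],
    mate_index=1,
)
print(rank)
```

Because both texts are compared in the shared concept space rather than by their words, the English query and its German translation can match without sharing any vocabulary. The weighting function, the choice of concepts, and the similarity measure are examples of the kinds of design choices such a model leaves open.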

Download: Media:2009_2038_Sorg_An_Experimental_1.pdf

Project

Multipla



Research group

Knowledge Management (Wissensmanagement)


Research area

Information Retrieval, Text Mining, Information Extraction, Natural Language Processing, Data Mining