Inproceedings3517: Difference between revisions
Nk6388 (Talk | contribs) (Created page with "{{Publikation Erster Autor |ErsterAutorNachname=Mogadala |ErsterAutorVorname=Aditya }} {{Inproceedings |Referiert=True |Title=Bilingual Word Embeddings from Paral…") |
Nk6388 (Talk | contribs) |
||
Line 2: | Line 2: | ||
|ErsterAutorNachname=Mogadala | |ErsterAutorNachname=Mogadala | ||
|ErsterAutorVorname=Aditya | |ErsterAutorVorname=Aditya | ||
+ | }} | ||
+ | {{Publikation Author | ||
+ | |Rank=2 | ||
+ | |Author=Achim Rettinger | ||
}} | }} | ||
{{Inproceedings | {{Inproceedings | ||
Line 15: | Line 19: | ||
|Abstract=In many languages, sparse availability of resources causes numerous challenges for textual analysis tasks. Text classification is one of such standard tasks that is hindered due to limited availability of label information in low-resource languages. Transferring knowledge (i.e. label information) from high-resource to low-resource languages might improve text | |Abstract=In many languages, sparse availability of resources causes numerous challenges for textual analysis tasks. Text classification is one of such standard tasks that is hindered due to limited availability of label information in low-resource languages. Transferring knowledge (i.e. label information) from high-resource to low-resource languages might improve text | ||
classification as compared to the other approaches like machine translation. We introduce BRAVE (Bilingual paRAgraph VEctors), a model to learn bilingual distributed representations (i.e. embeddings) of words without word alignments either from sentence-aligned parallel or label-aligned non-parallel document corpora to support cross-language text classification. Empirical analysis shows that classification models trained with our bilingual embeddings outperforms other state-of-the-art systems on three different cross-language text classification tasks. | classification as compared to the other approaches like machine translation. We introduce BRAVE (Bilingual paRAgraph VEctors), a model to learn bilingual distributed representations (i.e. embeddings) of words without word alignments either from sentence-aligned parallel or label-aligned non-parallel document corpora to support cross-language text classification. Empirical analysis shows that classification models trained with our bilingual embeddings outperforms other state-of-the-art systems on three different cross-language text classification tasks. | ||
− | |Download=NAACL-HLT-2016-Camera-Ready.pdf, | + | |Download=NAACL-HLT-2016-Camera-Ready.pdf, |
|Projekt=XLiMe | |Projekt=XLiMe | ||
|Forschungsgruppe=Web Science und Wissensmanagement | |Forschungsgruppe=Web Science und Wissensmanagement | ||
}} | }} |
Latest revision as of 3 May 2016, 17:24
Bilingual Word Embeddings from Parallel and Non-parallel Corpora for Cross-Language Text Classification.
Published: 2016
June
Book title: Human Language Technologies: The 2016 Annual Conference of the North American Chapter of the ACL.
Publisher: Association for Computational Linguistics
Organization: NAACL HLT
Peer-reviewed publication
Abstract
In many languages, the sparse availability of resources causes numerous challenges for textual analysis tasks. Text classification is one such standard task, hindered by the limited availability of label information in low-resource languages. Transferring knowledge (i.e. label information) from high-resource to low-resource languages might improve text classification compared to other approaches such as machine translation. We introduce BRAVE (Bilingual paRAgraph VEctors), a model that learns bilingual distributed representations (i.e. embeddings) of words without word alignments, either from sentence-aligned parallel or from label-aligned non-parallel document corpora, to support cross-language text classification. Empirical analysis shows that classification models trained with our bilingual embeddings outperform other state-of-the-art systems on three different cross-language text classification tasks.
Download: Media:NAACL-HLT-2016-Camera-Ready.pdf
Web Science und Wissensmanagement