Current revision as of 27 November 2015, 21:42

Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis

Philipp Cimiano, Andreas Hotho, Steffen Staab

Published: August 2005

Journal: Journal of Artificial Intelligence Research (JAIR)

Pages: 305-339

Volume: 24

Refereed publication

BibTeX

Abstract
We present a novel approach to the automatic acquisition of taxonomies or concept hierarchies from a text corpus. The approach is based on Formal Concept Analysis (FCA), a method mainly used for the analysis of data, i.e. for investigating and processing explicitly given information. We follow Harris' distributional hypothesis and model the context of a certain term as a vector representing syntactic dependencies which are automatically acquired from the text corpus with a linguistic parser. On the basis of this context information, FCA produces a lattice that we convert into a special kind of partial order constituting a concept hierarchy. The approach is evaluated by comparing the resulting concept hierarchies with hand-crafted taxonomies for two domains: tourism and finance. We also directly compare our approach with hierarchical agglomerative clustering as well as with Bi-Section-KMeans as an instance of a divisive clustering algorithm. Furthermore, we investigate the impact of using different measures weighting the contribution of each attribute as well as of applying a particular smoothing technique to cope with data sparseness.
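
To illustrate the FCA step named in the abstract, here is a minimal Python sketch that derives the formal concepts of a tiny term/attribute context. The toy terms, the attribute names, and the helper functions (`common_attributes`, `terms_having`, `formal_concepts`) are assumptions made for illustration. The brute-force enumeration is a simple stand-in and is not claimed to be the procedure used by the authors.

```python
from itertools import combinations

# Hypothetical toy context: each term maps to the attributes it was
# observed with (standing in for parser-derived syntactic dependencies).
context = {
    "hotel": {"bookable"},
    "apartment": {"bookable", "rentable"},
    "car": {"bookable", "rentable", "driveable"},
    "excursion": {"bookable", "joinable"},
}


def common_attributes(terms, context, all_attrs):
    """Attributes shared by every term in `terms` (all attributes if empty)."""
    shared = set(all_attrs)
    for term in terms:
        shared &= context[term]
    return frozenset(shared)


def terms_having(attrs, context):
    """Terms that carry every attribute in `attrs`."""
    return frozenset(t for t, a in context.items() if attrs <= a)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every term subset.

    Exponential in the number of terms, so only suitable for toy inputs.
    """
    all_attrs = set().union(*context.values())
    terms = sorted(context)
    concepts = set()
    for size in range(len(terms) + 1):
        for subset in combinations(terms, size):
            intent = common_attributes(subset, context, all_attrs)
            extent = terms_having(intent, context)
            concepts.add((extent, intent))
    return concepts


concepts = formal_concepts(context)
# List concepts from the most general (largest extent) to the most specific.
for extent, intent in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
```

Ordering the concepts by inclusion of their extents gives the lattice; a hierarchy in the sense of the abstract would then be obtained by a further conversion step that this sketch does not attempt.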

Download: Media:2005_977_Cimiano_Learning_Concep_1.pdf,Media:2005_977_Cimiano_Learning_Concep_1.ps
Further information: http://www.jair.org/contents/v24.html

Project

SmartWeb, Dot.Kom

Research group

Web Science und Wissensmanagement

Research area

Ontology Learning

@@ Line 1: / Line 1: @@
-{{Publikation Author
+{{Publikation Erster Autor
-|Rank=2
+|ErsterAutorNachname=Cimiano
-|Author=Andreas Hotho
+|ErsterAutorVorname=Philipp
 }}
 {{Publikation Author
@@ Line 8: / Line 8: @@
 }}
 {{Publikation Author
-|Rank=1
+|Rank=2
-|Author=Philipp Cimiano
+|Author=Andreas Hotho
 }}
 {{Article
@@ Line 24: / Line 24: @@
 |Abstract=bstract: We present a novel approach to the automatic acquisition of taxonomies or concept hierarchies from a text corpus. The approach is based on Formal Concept Analysis (FCA), a method mainly used for the analysis of data, i.e. for investigating and processing explicitly given information. We follow Harris' distributional hypothesis and model the context of a certain term as a vector representing syntactic dependencies which are automatically acquired from the text corpus with a linguistic parser. On the basis of this context information, FCA produces a lattice that we convert into a special kind of partial order constituting a concept hierarchy. The approach is evaluated by comparing the resulting concept hierarchies with hand-crafted taxonomies for two domains: tourism and finance. We also directly compare our approach with hierarchical agglomerative clustering as well as with Bi-Section-KMeans as an instance of a divisive clustering algorithm. Furthermore, we investigate the impact of using different measures weighting the contribution of each attribute as well as of applying a particular smoothing technique to cope with data sparseness.
 |VG Wort-Seiten=
-|Download=2005_977_Cimiano_Learning_Concep_1.pdf, 2005_977_Cimiano_Learning_Concep_2.ps
+|Download=2005_977_Cimiano_Learning_Concep_1.pdf, 2005_977_Cimiano_Learning_Concep_1.ps
 |Link=http://www.jair.org/contents/v24.html
-|Projekt=Dot.Kom, SmartWeb,
+|Projekt=SmartWeb, Dot.Kom,
-|Forschungsgruppe=
+|Forschungsgruppe=Web Science und Wissensmanagement
 }}
 {{Forschungsgebiet Auswahl
 |Forschungsgebiet=Ontology Learning
 }}

Article977: Difference between revisions
