
Ontology-based Text Clustering


Published: 2001

Book title: Proc. of IJCAI 2001

Refereed publication
Note: Workshop "Text Learning: Beyond Supervision"


Text clustering typically involves clustering in a high-dimensional space, which is difficult in virtually all practical settings. In addition, given a particular clustering result, it is typically very hard to come up with a good explanation of why the text clusters have been constructed the way they are. In this paper, we propose a new approach for applying background knowledge during preprocessing in order to improve clustering results and allow for selection between results. We build various views, basing our selection of text features on a heterarchy of concepts. Based on these aggregations, we compute multiple clustering results using K-Means. The results may be distinguished and explained by the corresponding selection of concepts in the ontology. Our results compare favourably with a sophisticated baseline preprocessing strategy.
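The preprocessing idea in the abstract (map terms to concepts, aggregate them along a concept heterarchy to obtain a view, then run K-Means on each view) can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the toy heterarchy `PARENTS`, the sample documents `DOCS`, the function names, and the plain K-Means routine are all assumptions made for this example.

```python
import random
from collections import Counter

# Hypothetical toy heterarchy: concept -> list of parent concepts.
# A concept may have several parents, hence "heterarchy".
PARENTS = {
    "hotel": ["accommodation"],
    "hostel": ["accommodation"],
    "beach": ["area"],
    "accommodation": ["root"],
    "area": ["root"],
}

# Hypothetical tokenized documents.
DOCS = [
    ["hotel", "hostel"],
    ["hostel", "hotel", "hotel"],
    ["beach"],
    ["beach", "beach"],
]


def generalize(concept, levels):
    """Walk `levels` steps up the heterarchy; top concepts stay in place."""
    frontier = {concept}
    for _ in range(levels):
        frontier = {p for c in frontier for p in PARENTS.get(c, [c])}
    return frontier


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain K-Means; returns one cluster label per input vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_view(docs, levels, k):
    """Build one concept view at the given aggregation level and cluster it."""
    # Terms without a concept in the heterarchy are dropped (feature selection).
    mapped = [
        Counter(c for t in doc if t in PARENTS for c in generalize(t, levels))
        for doc in docs
    ]
    vocabulary = sorted(set().union(*mapped))
    vectors = [[float(m[c]) for c in vocabulary] for m in mapped]
    return vocabulary, kmeans(vectors, k)


# One view per aggregation level; each view yields its own clustering.
vocabulary, labels = cluster_view(DOCS, levels=1, k=2)
```

In this sketch, the `vocabulary` returned for a view names the concepts that span its feature space, which is what makes a clustering result explainable in terms of the ontology. Calling `cluster_view` with different `levels` values gives the alternative results to choose between.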

Download: Media:2001_486_Hotho_Ontology-based__1.pdf