Article19: Difference between revisions
m (Added from ontology) |
m (Added from ontology) |
Line 25: | Line 25:
|Download=2002_19_Hotho_Text_Clustering_1.pdf | |Download=2002_19_Hotho_Text_Clustering_1.pdf
|Projekt= | |Projekt=
− |Forschungsgruppe=Effiziente Algorithmen | + |Forschungsgruppe=Komplexitätsmanagement, Effiziente Algorithmen, Betriebliche Informations- und Kommunikationssysteme,
}} | }}
{{Forschungsgebiet Auswahl | {{Forschungsgebiet Auswahl
Revision as of 18:21, 10 September 2009
Text Clustering Based on Good Aggregations
Published: 2002
Journal: Künstliche Intelligenz (KI)
Number: 4
Pages: 48-54
Volume: 16
Refereed publication
Abstract
Text clustering typically involves clustering in a high-dimensional space, which appears difficult in virtually all practical settings. In addition, given a particular clustering result, it is typically very hard to come up with a good explanation of why the text clusters have been constructed the way they are. In this paper, we propose a new approach for applying background knowledge during preprocessing in order to improve clustering results and allow for selection between results. We preprocess our input data by applying ontology-based heuristics for feature selection and feature aggregation. Thus, we construct a number of alternative text representations. Based on these representations, we compute multiple clustering results using K-Means. The results may be distinguished and explained by the corresponding selection of concepts in the ontology. Our results compare favourably with a sophisticated baseline preprocessing strategy.
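The pipeline outlined in the abstract (aggregate term features into ontology concepts, then cluster with K-Means) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy ontology `TERM_TO_CONCEPT`, the four sample documents, and the helper names are hypothetical, and the paper's actual selection and aggregation heuristics are not reproduced.

```python
from collections import Counter

# Hypothetical toy ontology mapping terms to a parent concept (not from the paper).
TERM_TO_CONCEPT = {
    "beef": "meat", "pork": "meat",
    "apple": "fruit", "pear": "fruit",
}

def aggregate(tokens, term_to_concept):
    """Count tokens after replacing each term by its ontology concept, if it has one."""
    return Counter(term_to_concept.get(t, t) for t in tokens)

def to_vector(counts, vocabulary):
    """Turn a count mapping into a dense vector ordered by the vocabulary."""
    return [float(counts.get(v, 0)) for v in vocabulary]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=10):
    """Plain Lloyd-style K-Means; the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical sample documents, already tokenized.
docs = [
    "beef beef pork".split(),
    "apple pear pear".split(),
    "pork pork beef".split(),
    "pear apple apple".split(),
]
counts = [aggregate(d, TERM_TO_CONCEPT) for d in docs]
vocabulary = sorted(set().union(*counts))
vectors = [to_vector(c, vocabulary) for c in counts]
labels = kmeans(vectors, k=2)
```

Swapping in a different term-to-concept mapping yields a different representation of the same documents, which is one way to read the abstract's "alternative text representations"; the concepts chosen for the mapping then serve as the explanation for the resulting clusters.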
Download: Media:2002_19_Hotho_Text_Clustering_1.pdf
Komplexitätsmanagement, Effiziente Algorithmen, Betriebliche Informations- und Kommunikationssysteme
"Betriebliche Informations- und Kommunikationssysteme" is not in the list (Effiziente Algorithmen, Komplexitätsmanagement, Betriebliche Informationssysteme, Wissensmanagement, Angewandte Technisch-Kognitive Systeme, Information Service Engineering, Critical Information Infrastructures, Web Science und Wissensmanagement, Web Science, Ökonomie und Technologie der eOrganisation, ...) of permitted values for the "Forschungsgruppe" attribute.