Clustering of Polysemic Words
Editors: Reinhold Decker and Hans-Joachim Lenz
Book title: Advances in Data Analysis: Proceedings of the 30th Annual Conference of the German Classification Society (GfKl), Berlin, Germany, March 8-10, 2006
Series: Studies in Classification, Data Analysis, and Knowledge Organization
In this paper, we outline an approach for constructing clusters of related terms that can be used to derive formal conceptual structures such as taxonomies. In contrast to previous approaches in this direction, we take into account that words can have several meanings, and we consider two alternative soft clustering techniques for this purpose: Overlapping Pole-Based Clustering (PoBOC) and Clustering by Committees (CBC). These soft clustering algorithms try to detect the different contexts in which the clustered words occur, so that a word may belong to more than one cluster. We report on initial experiments conducted on textual data from the tourism domain.
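To illustrate what "more than one cluster membership" means in practice, the following is a minimal sketch of soft assignment. It is neither PoBOC nor CBC: it uses a simple cosine-similarity threshold as a stand-in, and the context vectors, cluster names, function names, and threshold value are all hypothetical, chosen only for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(feature, 0.0) for feature, weight in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def soft_assign(word_vectors, centroids, threshold=0.5):
    """Assign each word to every cluster whose centroid is similar enough.

    Unlike hard clustering, a word is not forced into a single cluster,
    so a polysemous word can appear under several clusters.
    """
    memberships = {}
    for word, vector in word_vectors.items():
        memberships[word] = [
            name
            for name, centroid in centroids.items()
            if cosine(vector, centroid) >= threshold
        ]
    return memberships


# Hypothetical co-occurrence counts (toy data, not from the paper).
word_vectors = {
    "spring": {"thermal": 1, "water": 1, "season": 1, "summer": 1},
    "sauna": {"thermal": 1, "water": 1},
    "winter": {"season": 2, "summer": 1},
}

# Hypothetical cluster centroids in the same feature space.
centroids = {
    "spa": {"thermal": 1, "water": 1},
    "season": {"season": 1, "summer": 1},
}

memberships = soft_assign(word_vectors, centroids)
for word, clusters in memberships.items():
    print(word, clusters)
```

In this toy setup the ambiguous word "spring" ends up in both the "spa" and the "season" cluster, while "sauna" and "winter" each receive a single membership. The threshold controls how readily a word is shared between clusters; the algorithms named in the abstract determine overlap by their own, more elaborate criteria.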