Enhancing Preprocessing in Data-Intensive Domains using Online-Analytical Processing
Published: September 2000
Book title: DaWaK'2000 - 2nd International Conference on Data Warehousing and Knowledge Discovery, LNCS, September 4-6, 2000, Greenwich, UK
Applying any data mining algorithm requires goal-oriented preprocessing of the data. In practical applications, the preprocessing task is very time-consuming and has an important influence on the quality of the generated models. In this paper we describe a new approach to data preprocessing. By merging database technology with classical data mining systems, using an OLAP engine as the interface, we outline an architecture for OLAP-based preprocessing that enables interactive and iterative processing of data. This high level of interaction between human and database system enables efficient understanding and preparation of data for building scalable data mining applications. Our case study, taken from the data-intensive telecommunications domain, uses the proposed methodology to generate user communication profiles. The user profiles are then investigated with data mining algorithms to cluster customers with similar communication behavior and to describe the resulting clusters using intensional rules.
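The pipeline summarized in the abstract (OLAP-style aggregation into user profiles, followed by clustering) can be illustrated with a minimal sketch. It is not the paper's implementation: the call records, the `daypart` roll-up, the two-feature profile, and the plain k-means step are all assumptions chosen for illustration, and the abstract names neither the profile features nor the clustering algorithm.

```python
from collections import defaultdict

# Hypothetical call detail records: (customer, hour_of_day, duration_seconds).
CALLS = [
    ("A", 9, 120), ("A", 10, 300), ("A", 14, 180),
    ("B", 20, 600), ("B", 22, 400),
    ("C", 11, 240), ("C", 15, 60),
    ("D", 21, 500), ("D", 23, 500),
]

# Assumed coarser level of the time dimension.
DAYPARTS = ("day", "evening")


def daypart(hour):
    """Roll the hour level of the time dimension up to a coarser level."""
    return "day" if 8 <= hour < 18 else "evening"


def build_profiles(calls):
    """Aggregate durations per (customer, daypart) and normalise to shares."""
    totals = defaultdict(lambda: defaultdict(float))
    for customer, hour, duration in calls:
        totals[customer][daypart(hour)] += duration
    profiles = {}
    for customer, cells in totals.items():
        total = sum(cells.values())
        profiles[customer] = tuple(cells[p] / total for p in DAYPARTS)
    return profiles


def kmeans(points, centers, iterations=10):
    """Plain k-means with fixed initial centers, so the run is deterministic."""
    for _ in range(iterations):
        # Assign each profile to its nearest center (squared Euclidean distance).
        groups = [[] for _ in centers]
        for name, vec in points.items():
            dists = [sum((a - b) ** 2 for a, b in zip(vec, c)) for c in centers]
            groups[dists.index(min(dists))].append(name)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(points[n][i] for n in g) / len(g) for i in range(len(c)))
            if g else c
            for g, c in zip(groups, centers)
        ]
    return [sorted(g) for g in groups]


profiles = build_profiles(CALLS)
clusters = kmeans(profiles, centers=[(1.0, 0.0), (0.0, 1.0)])
print(profiles)
print(clusters)
```

In this toy data the roll-up separates daytime callers from evening callers, and the clustering step groups customers accordingly. In an interactive setting, the analyst would change the aggregation level or the chosen dimensions and rebuild the profiles before rerunning the mining step.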