Mining frequent itemsets using advanced partition approach

Date

2004-12

Publisher

Texas Tech University

Abstract

Data mining is the process of extracting interesting and previously unknown patterns and correlations from data stored in database management systems (DBMSs). Association rule mining, a descriptive data mining technique, is the process of discovering items that tend to occur together in transactions. Because the data to be mined is large, the time taken to access it is considerable.

In this thesis, a new association rule mining algorithm that generates the frequent itemsets in a single pass over the database is implemented. The algorithm combines two approaches for association rule mining over data stored in multiple relations in one or more databases: the Partition approach, in which the data is mined in partitions and the results are merged, and the Apriori approach, which finds the frequent itemsets within each partition. To evaluate its performance, the new algorithm is compared with existing algorithms that require multiple database passes to generate the frequent itemsets. Extensive experiments are performed, and results are presented for both approaches. The experiments show that, when the database is large, the database scan takes more time than candidate generation, which provides evidence that reducing database access time is a viable approach to association rule mining.
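The general idea of combining the two approaches can be illustrated with a minimal Python sketch. It is not the thesis's algorithm, whose single-pass merging step is not described in this abstract. It assumes the classical partition idea: any itemset that is frequent in the whole database must be frequent in at least one partition, so the union of the locally frequent itemsets is a complete candidate set. The function names `apriori` and `partition_mine`, the in-memory list of transactions, and the final recount over all transactions are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations
from math import ceil


def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset seen in at least min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    counts = defaultdict(int)
    for t in transactions:
        for item in t:
            counts[frozenset([item])] += 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        counts = defaultdict(int)
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result


def partition_mine(transactions, min_support, n_partitions):
    """Mine each partition with Apriori, then merge the local results.

    Assumption: the merge recounts the candidates over all transactions held
    in memory; the thesis's own merging strategy may differ.
    """
    size = ceil(len(transactions) / n_partitions)
    parts = [transactions[i:i + size] for i in range(0, len(transactions), size)]

    candidates = set()
    for part in parts:
        # Scale the support threshold to the size of this partition.
        local_min = max(1, ceil(min_support * len(part)))
        candidates |= set(apriori(part, local_min))

    # Merge: keep only candidates that are frequent over the whole data set.
    min_count = max(1, ceil(min_support * len(transactions)))
    all_sets = [frozenset(t) for t in transactions]
    counts = {c: sum(1 for t in all_sets if c <= t) for c in candidates}
    return {c: n for c, n in counts.items() if n >= min_count}
```

Calling `partition_mine(transactions, 0.5, 2)` on a list of item lists returns the same itemsets as running `apriori` on the whole list with the matching absolute count. In a disk-based setting, each partition would be sized to fit in memory so that the Apriori passes inside a partition cause no additional database reads.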

Keywords

Data mining, Computer algorithms