Icneac Poster Presentation1-Sai



Performance Enhancement through Association Rule Mining With Hashing & Pipelining

K.S. Sai Krishna*, D.K. Beberta#

* Department of CSE, GMR Institute of Technology, Rajam, (A.P.), India.

# Assoc. Professor, Department of CSE, GMR Institute of Technology, Rajam, (A.P.), India.

PIPELINE DESIGN & TRANSACTION TRIMMING

SYSTEM ARCHITECTURE

PRELIMINARIES

HAPPI ARCHITECTURE

CONCLUSION

The HAPPI architecture enhances the performance of association rule mining. The bottleneck of Apriori-based hardware schemes is related to the number of candidate itemsets and the size of the database. To solve this problem, we apply the pipeline methodology in the HAPPI architecture to compare itemsets with the database and collect useful information to reduce the number of candidate itemsets and items in the database simultaneously. HAPPI increases computing power and saves the costs of data mining in hardware design as compared to the previous approach. Furthermore, HAPPI possesses good scalability.

EXPERIMENTAL RESULTS

We conducted several experiments on a number of synthetic data sets to evaluate the performance of the HAPPI architecture. For comparison purposes, we also implemented a baseline approach, abbreviated as the Direct Comparison (DC) method. The hardware modules are coded in Verilog. We use ModelSim to simulate the Verilog code and verify the functions of our design.

REFERENCES

[1] R. Agarwal, C. Aggarwal, and V. Prasad, A Tree Projection Algorithm for Generation of Frequent Itemsets, J. Parallel and Distributed Computing, 2000.

[2] R. Agrawal and R. Srikant, Fast Algorithms for Mining Association Rules, Proc. 20th Int'l Conf. Very Large Databases (VLDB), 1994.

[3] Z.K. Baker and V.K. Prasanna, Efficient Hardware Data Mining with the Apriori Algorithm on FPGAs, Proc. 13th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM), 2005.

[4] Z.K. Baker and V.K. Prasanna, An Architecture for Efficient Hardware Data Mining Using Reconfigurable Computing Systems, Proc. 14th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM '06), pp. 67-75, Apr. 2006.

[5] J.S. Park, M.-S. Chen, and P.S. Yu, An Effective Hash Based Algorithm for Mining Association Rules, Proc. ACM SIGMOD '95, pp. 175-186, May 1995.

ICNEAC-2011

The procedure flow of one round.

The HAPPI architecture: (a) systolic array, (b) trimming filter, and (c) hash table filter.

HASH TABLE FILTERING

INTRODUCTION

Data mining technology is now used in a wide variety of fields. One of the most important data mining applications is association rule mining. Generally speaking, Apriori-based hardware schemes require loading the candidate itemsets and the database into the hardware. Since the capacity of the hardware is fixed, if the number of items in the database is larger than the hardware capacity, the data items must be loaded separately. Therefore, the process of comparing candidate itemsets with the database needs to be executed several times. Similarly, if the number of candidate itemsets is larger than the capacity of the hardware, the pattern matching procedure has to be separated into many rounds. Clearly, it is infeasible for any hardware design to load the candidate itemsets and the database into hardware multiple times. Numerous candidate itemsets and a huge database may cause a bottleneck in the system. To avoid this bottleneck, we introduce a hash-based pipeline. With the help of this architecture, infrequent itemsets are pruned so as to increase the performance of the system.

System architecture.

The process of building H2 and using H2 to filter out C2.

An example of transaction trimming.

An example of streaming a transaction and the corresponding trimming information into the cell: (a) stream a transaction into the cell; (b) stream trimming information into the cell.

The trimming filter.

PERFORMANCE ANALYSIS

The hash value generator.

While the support counting procedure is being executed, the whole database is streamed into the systolic array. However, not all the transactions are useful for generating frequent itemsets. Therefore, we filter out items in the transactions so that the database is reduced. In the HAPPI architecture, the trimming information records the frequency of each item in a transaction that appears in the candidate itemsets.
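The trimming step can be illustrated in software. The following is a minimal Python sketch, not the hardware trimming filter itself. It assumes the DHP-style pruning rule, in which an item is kept only if it appears in at least k of the candidate k-itemsets contained in the transaction; the poster text states only that the per-item frequency is recorded. The function name trim_transaction and the sample data are hypothetical.

```python
from itertools import combinations


def trim_transaction(transaction, candidates, k):
    """Return (trimmed_items, counts) for one transaction.

    counts[item] is the trimming information: how many candidate
    k-itemsets contained in the transaction include that item.
    Assumption (DHP-style rule): keep an item only if counts[item] >= k.
    """
    items = sorted(set(transaction))
    candidate_set = {tuple(sorted(c)) for c in candidates}
    counts = {item: 0 for item in items}
    # Enumerate the k-subsets of the transaction and count the candidate hits.
    for itemset in combinations(items, k):
        if itemset in candidate_set:
            for item in itemset:
                counts[item] += 1
    trimmed = [item for item in items if counts[item] >= k]
    return trimmed, counts


# Hypothetical candidate 2-itemsets and transactions.
c2 = [(1, 2), (1, 3), (2, 3)]
print(trim_transaction([1, 2, 3, 4], c2, 2))  # item 4 is trimmed
print(trim_transaction([1, 2, 4], c2, 2))     # every item is trimmed
```

In the second call the transaction contains only one candidate 2-itemset, so no item reaches the threshold and the whole transaction drops out of later passes, which is how trimming shrinks the database.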

In this section, properties are derived from the system with and without the pipeline scheme to investigate the total execution time. Suppose the number of candidate k-itemsets is Ncand-k and the number of frequent k-itemsets is Nfreq-k. There are Ncell hardware cells in the systolic array. |T| represents the average number of items in a transaction, and |D| is the total number of items in the database.
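To make the notation concrete, here is a rough Python sketch of how these quantities interact. It is not the poster's derivation, which does not survive in this text. It assumes that each hardware cell holds one candidate itemset, that the candidates are processed in rounds of at most Ncell, and that every round streams all |D| items of the database through the array. The function names and the sample numbers are hypothetical.

```python
def rounds_needed(n_cand, n_cell):
    """Rounds required when at most n_cell candidates fit in the array."""
    return (n_cand + n_cell - 1) // n_cell  # integer ceiling division


def items_streamed(n_cand, n_cell, db_items):
    """Total items streamed, assuming the whole database is replayed each round."""
    return rounds_needed(n_cand, n_cell) * db_items


# Hypothetical sizes: 1000 candidates, 300 cells, 50000 items in the database.
print(rounds_needed(1000, 300))          # 4
print(items_streamed(1000, 300, 50000))  # 200000
# If filtering cuts the candidates to 250, a single round suffices.
print(items_streamed(250, 300, 50000))   # 50000
```

Under these assumptions, reducing either the number of candidates or the size of the database lowers the streamed volume, which is the effect the hash table filter and the trimming filter aim for.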

The hash table scheme proposed in DHP improves the performance of Apriori-based algorithms by filtering out infrequent candidate itemsets. In addition, DHP employs an effective pruning scheme to eliminate infrequent items in transactions. We summarize these two schemes below. In the hash table scheme, a hash function is applied to all of the candidate k-itemsets generated by frequent subitemsets. Each candidate k-itemset is mapped to a hash value, and itemsets with the same hash value are put into the same bucket of the hash table. If the number of candidate itemsets in a bucket is less than the minimum support threshold, then each of those candidate itemsets occurs in the database fewer times than the minimum support threshold, so none of them can be frequent and all of them can be filtered out.
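The hash table scheme can be sketched in Python for k = 2, matching the figure on building H2 and using it to filter C2. This is an illustration under stated assumptions rather than the hardware hash table filter: the hash function (sum of item ids modulo the table size), the table size, and the sample transactions are all hypothetical, and the bucket counts are accumulated from the 2-itemsets occurring in each transaction, as in DHP.

```python
from itertools import combinations


def bucket_of(itemset, n_buckets):
    """Hypothetical hash function: sum of item ids modulo the table size."""
    return sum(itemset) % n_buckets


def build_hash_table(transactions, k, n_buckets):
    """Count, per bucket, the k-itemsets occurring in the transactions."""
    table = [0] * n_buckets
    for t in transactions:
        for itemset in combinations(sorted(set(t)), k):
            table[bucket_of(itemset, n_buckets)] += 1
    return table


def filter_candidates(candidates, table, min_support):
    """Keep only candidates whose bucket count reaches min_support."""
    n_buckets = len(table)
    return [c for c in candidates
            if table[bucket_of(c, n_buckets)] >= min_support]


transactions = [[1, 2, 3], [1, 2], [2, 3], [1, 2, 4]]
h2 = build_hash_table(transactions, k=2, n_buckets=7)
c2 = [(1, 2), (1, 3), (2, 3)]
print(h2)                                          # [0, 0, 0, 3, 1, 3, 1]
print(filter_candidates(c2, h2, min_support=2))    # [(1, 2), (2, 3)]
```

The filter is conservative: a bucket count is only an upper bound on the support of each itemset hashed to it. For example, (1, 4) occurs once here but shares bucket 5 with (2, 3), so it would pass the filter and still need exact support counting.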
