Icneac Poster Presentation1-Sai
Performance Enhancement through Association Rule Mining With Hashing & Pipelining
K.S. Sai Krishna*, D.K. Beberta#
* Department of CSE, GMR Institute of Technology, Rajam, (A.P.), India.
# Assoc. Professor, Department of CSE, GMR Institute of Technology, Rajam, (A.P.), India.
PIPELINE DESIGN & TRANSACTION TRIMMING
CONCLUSION
SYSTEM ARCHITECTURE
EXPERIMENTAL RESULTS
REFERENCES
The HAPPI architecture enhances the performance of association rule mining. The bottleneck of Apriori-based hardware schemes is related to the number of candidate itemsets and the size of the database. To solve this problem, we apply the pipeline methodology in the HAPPI architecture to compare itemsets with the database and to collect useful information that reduces the number of candidate itemsets and the number of items in the database simultaneously. HAPPI increases computing power and lowers the cost of data mining in hardware design compared to the previous approach. Furthermore, HAPPI possesses good scalability.
PRELIMINARIES
HAPPI ARCHITECTURE
We conducted several experiments on a number of synthetic data sets to evaluate the performance of the HAPPI architecture. For comparison purposes, we also implemented a second approach, abbreviated as the Direct Comparison (DC) method. The hardware modules are coded in Verilog. We use ModelSim to simulate the Verilog code and verify the functions of our design.
[1] R. Agarwal, C. Aggarwal, and V. Prasad, A Tree Projection Algorithm for Generation of Frequent Itemsets, J. Parallel and Distributed Computing, 2000.
[2] R. Agrawal and R. Srikant, Fast Algorithms for Mining Association Rules, Proc. 20th Int'l Conf. Very Large Databases (VLDB), 1994.
[3] Z.K. Baker and V.K. Prasanna, Efficient Hardware Data Mining with the Apriori Algorithm on FPGAs, Proc. 13th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM), 2005.
[4] Z.K. Baker and V.K. Prasanna, An Architecture for Efficient Hardware Data Mining Using Reconfigurable Computing Systems, Proc. 14th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM 06), pp. 67-75, Apr. 2006.
[5] J.S. Park, M.-S. Chen, and P.S. Yu, An Effective Hash Based Algorithm for Mining Association Rules, Proc. ACM SIGMOD 95, pp. 175-186, May 1995.
ICNEAC-2011
INTRODUCTION
The procedure flow of one round.
The HAPPI architecture: (a) systolic array, (b) trimming filter, and (c) hash table filter.
HASH TABLE FILTERING
Data mining technology is now used in a wide variety of fields. One of the most important data mining applications is association rule mining. Generally speaking, Apriori-based hardware schemes require loading the candidate itemsets and the database into the hardware. Since the capacity of the hardware is fixed, if the number of items in the database is larger than the hardware capacity, the data items must be loaded separately. Therefore, the process of comparing candidate itemsets with the database needs to be executed several times. Similarly, if the number of candidate itemsets is larger than the capacity of the hardware, the pattern matching procedure has to be separated into many rounds. Clearly, it is infeasible for any hardware design to load the candidate itemsets and the database into hardware multiple times. Numerous candidate itemsets and a huge database may cause a bottleneck in the system. To avoid this bottleneck, we introduce the hash-based pipeline. With the help of this architecture, infrequent itemsets are pruned, which increases the performance of the system.
System architecture.
The process of building H2 and using H2 to filter out C2.
An example of transaction trimming
An example of streaming a transaction and the corresponding trimming information into the cell. (a) Stream a transaction into the cell. (b) Stream trimming information into the cell.
The trimming filter.
PERFORMANCE ANALYSIS
The hash value generator.
While the support counting procedure is being executed, the whole database is streamed into the systolic array. However, not all the transactions are useful for generating frequent itemsets. Therefore, we filter out items in the transactions so that the database is reduced. In the HAPPI architecture, the trimming information records, for each item in a transaction, how often that item appears in the candidate itemsets.
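The trimming step can be illustrated with a minimal software sketch in Python. It is not the hardware design. The keep rule used here (an item survives only if it occurs in at least k of the candidate k-itemsets contained in the transaction) is borrowed from the DHP pruning scheme and is an assumption, since the poster does not state the threshold. The function name trim_transaction and the sample data are hypothetical.

```python
def trim_transaction(transaction, candidates, k):
    """Return (kept_items, trimming_info) for one transaction.

    trimming_info maps each item to the number of candidate k-itemsets
    that are fully contained in the transaction and include that item.
    Assumption (DHP rule): keep an item only if that count is >= k.
    """
    items = set(transaction)
    trimming_info = {item: 0 for item in items}
    for cand in candidates:
        # A candidate is counted only if the transaction contains all of it.
        if set(cand) <= items:
            for item in cand:
                trimming_info[item] += 1
    kept = sorted(item for item, n in trimming_info.items() if n >= k)
    return kept, trimming_info


# Hypothetical candidate 2-itemsets over items encoded as integers.
c2 = [(1, 3), (2, 3), (2, 5), (3, 5)]
kept, info = trim_transaction([1, 2, 3, 5], c2, k=2)
print(kept, info)
```

In this example item 1 occurs in only one contained candidate, so it is dropped and the transaction shrinks to items 2, 3 and 5. A transaction whose items all fall below the threshold becomes empty and no longer needs to be streamed in later rounds.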
In this section, properties are derived for the system with and without the pipeline scheme to investigate the total execution time. Suppose the number of candidate k-itemsets is Ncand-k and the number of frequent k-itemsets is Nfreq-k. There are Ncell hardware cells in the systolic array. |T| represents the average number of items in a transaction, and |D| is the total number of items in the database.
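As a rough illustration of why these quantities matter, the following Python sketch estimates how many rounds are needed when the candidate itemsets do not fit into the systolic array at once. It is a simplified cost model under stated assumptions, not the analysis derived on the poster: each hardware cell is assumed to hold one candidate itemset, every round is assumed to stream the whole database once, and pipeline latency is ignored. The function names are hypothetical.

```python
def rounds_needed(n_cand_k, n_cell):
    """Rounds required to process n_cand_k candidates with n_cell cells.

    Assumption: one candidate itemset per cell, so each round handles
    at most n_cell candidates.
    """
    if n_cell <= 0:
        raise ValueError("n_cell must be positive")
    return (n_cand_k + n_cell - 1) // n_cell  # integer ceiling division


def items_streamed(n_cand_k, n_cell, d_items):
    """Total items streamed, assuming the full database (d_items items)
    passes through the array once per round."""
    return rounds_needed(n_cand_k, n_cell) * d_items


# Hypothetical sizes: 2500 candidates, 1000 cells, 100000 database items.
print(rounds_needed(2500, 1000))
print(items_streamed(2500, 1000, 100000))
```

Under this model the cost grows with both the number of candidates and the database size, which is why reducing the two at the same time pays off.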
The hash table scheme proposed in DHP improves the performance of Apriori-based algorithms by filtering out infrequent candidate itemsets. In addition, DHP employs an effective pruning scheme to eliminate infrequent items in transactions. We summarize these two schemes below. In the hash table scheme, a hash function is applied to all candidate k-itemsets generated from frequent subitemsets. Each candidate k-itemset is mapped to a hash value, and itemsets with the same hash value are put into the same bucket of the hash table. If the count of a bucket is less than the minimum support threshold, then the number of occurrences in the database of every candidate itemset in that bucket is also less than the minimum support threshold, so these candidates cannot be frequent.
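The hash table scheme can be sketched in Python for k = 2 (building H2 and using it to filter C2). This is a software illustration only. The hash function, the number of buckets, the integer encoding of items and the sample transactions are all assumptions chosen for the example and are not taken from the poster.

```python
from collections import Counter
from itertools import combinations


def bucket_of(pair, num_buckets):
    # Hypothetical hash function for a 2-itemset of small integer items.
    x, y = pair
    return (x * 10 + y) % num_buckets


def build_h2(transactions, num_buckets):
    """Count, per bucket, how many 2-itemset occurrences hash into it."""
    h2 = Counter()
    for t in transactions:
        for pair in combinations(sorted(set(t)), 2):
            h2[bucket_of(pair, num_buckets)] += 1
    return h2


def filter_c2(candidates, h2, num_buckets, min_support):
    """Keep only candidates whose bucket count reaches min_support."""
    return [c for c in candidates
            if h2[bucket_of(c, num_buckets)] >= min_support]


# Hypothetical database of four transactions.
db = [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]
h2 = build_h2(db, num_buckets=7)
c2 = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 5), (3, 5)]
print(filter_c2(c2, h2, num_buckets=7, min_support=2))
```

A bucket count is an upper bound on the support of every itemset that hashes into it, so the filter never removes a frequent itemset. It can let some infrequent ones through when several itemsets share a bucket, and those are eliminated later by exact support counting.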