Icneac Poster Presentation1-Sai


  • 8/6/2019 Icneac Poster Presentation1-Sai


    Performance Enhancement through Association Rule Mining With Hashing & Pipelining

    K.S. Sai Krishna *, D.K. Beberta #

    * Department of CSE, GMR Institute of Technology, Rajam, (A.P.), India.

    # Assoc. Professor, Department of CSE, GMR Institute of Technology, Rajam, (A.P.), India.

    PIPELINE DESIGN & TRANSACTION TRIMMING

    SYSTEM ARCHITECTURE

    EXPERIMENTAL RESULTS

    REFERENCES

    CONCLUSION

    The HAPPI architecture enhances the performance of association rule mining. The bottleneck of Apriori-based hardware schemes is related to the number of candidate itemsets and the size of the database. To solve this problem, we apply the pipeline methodology in the HAPPI architecture to compare itemsets with the database and collect useful information that reduces the number of candidate itemsets and the number of items in the database simultaneously. Compared with the previous approach, HAPPI increases computing power and lowers the cost of data mining in hardware design. Furthermore, HAPPI possesses good scalability.

    PRELIMINARIES

    HAPPI ARCHITECTURE

    We conduct several experiments on a number of synthetic data sets to evaluate the performance of the HAPPI architecture. We also implement a baseline approach, abbreviated as the Direct Comparison (DC) method, for comparison purposes. The hardware modules are coded in Verilog. We use ModelSim to simulate the Verilog code and verify the functions of our design.

    [1] R. Agarwal, C. Aggarwal, and V. Prasad, A Tree Projection Algorithm for Generation of Frequent Itemsets, J. Parallel and Distributed Computing, 2000.

    [2] R. Agrawal and R. Srikant, Fast Algorithms for Mining Association Rules, Proc. 20th Intl Conf. Very Large Databases (VLDB), 1994.

    [3] Z.K. Baker and V.K. Prasanna, Efficient Hardware Data Mining with the Apriori Algorithm on FPGAs, Proc. 13th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM), 2005.

    [4] Z.K. Baker and V.K. Prasanna, An Architecture for Efficient Hardware Data Mining Using Reconfigurable Computing Systems, Proc. 14th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM '06), pp. 67-75, Apr. 2006.

    [5] J.S. Park, M.-S. Chen, and P.S. Yu, An Effective Hash Based Algorithm for Mining Association Rules, Proc. ACM SIGMOD '95, pp. 175-186, May 1995.

    ICNEAC-2011

    INTRODUCTION

    The procedure flow of one round.

    The HAPPI architecture: (a) systolic array, (b) trimming filter, and (c) hash table filter.

    HASH TABLE FILTERING

    Data mining technology is now used in a wide variety of fields. One of the most important data mining applications is association rule mining. Generally speaking, Apriori-based hardware schemes require loading the candidate itemsets and the database into the hardware. Since the capacity of the hardware is fixed, if the number of items in the database is larger than the hardware capacity, the data items must be loaded separately. Therefore, the process of comparing candidate itemsets with the database needs to be executed several times. Similarly, if the number of candidate itemsets is larger than the capacity of the hardware, the pattern matching procedure has to be separated into many rounds. Clearly, it is infeasible for any hardware design to load the candidate itemsets and the database into hardware multiple times. Numerous candidate itemsets and a huge database may cause a bottleneck in the system. To avoid this bottleneck, we introduce a hash-based pipeline. With this architecture, infrequent itemsets are pruned early, which improves the performance of the system.

    System architecture.

    The process of building H2 and using H2 to filter out C2.

    An example of transaction trimming.

    An example of streaming a transaction and the corresponding trimming information into the cell. (a) Stream a transaction into the cell. (b) Stream trimming information into the cell.

    The trimming filter.

    PERFORMANCE ANALYSIS

    The hash value generator.

    While the support counting procedure is being executed, the whole database is streamed into the systolic array. However, not all the transactions are useful for generating frequent itemsets. Therefore, we filter out items in the transactions so that the database is reduced. In the HAPPI architecture, the trimming information records, for each item in a transaction, how often that item appears in the candidate itemsets.
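    The trimming step can be sketched in software as follows. This is a minimal illustration, not the hardware design: the function name trim_transaction is hypothetical, and the retention rule (keep an item only if it occurs in at least k of the candidate k-itemsets contained in the transaction) is assumed from the DHP pruning scheme rather than stated on the poster.

```python
from collections import Counter


def trim_transaction(transaction, candidates, k):
    """Return the items of `transaction` that survive trimming.

    Assumption (DHP-style rule): an item can belong to a frequent
    (k+1)-itemset only if it occurs in at least k candidate k-itemsets
    that are fully contained in the transaction.
    """
    items = set(transaction)
    occurrences = Counter()  # the "trimming information" per item
    for candidate in candidates:
        if items.issuperset(candidate):
            occurrences.update(candidate)
    # Counter returns 0 for items that never matched a candidate.
    return [item for item in transaction if occurrences[item] >= k]


# Hypothetical candidate 2-itemsets and one transaction.
candidates_2 = [(1, 2), (1, 3), (2, 3), (2, 4)]
trimmed = trim_transaction([1, 2, 3, 4, 5], candidates_2, k=2)
print(trimmed)
```

    In this example items 4 and 5 are dropped because they occur in fewer than two matching candidates, so the transaction streamed in the next round is shorter.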

    In this section, properties are derived for the system with and without the pipeline scheme to investigate the total execution time. Suppose the number of candidate k-itemsets is N_cand-k and the number of frequent k-itemsets is N_freq-k. There are N_cell hardware cells in the systolic array. |T| represents the average number of items in a transaction, and |D| is the total number of items in the database.
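    To make the role of these symbols concrete, the sketch below estimates how many rounds of support counting are needed and how many items are streamed in total. It is only a rough illustration under two assumptions not spelled out on the poster: each hardware cell holds one candidate itemset, and the whole database (|D| items) is streamed once per round. The function names and the sample numbers are hypothetical.

```python
import math


def rounds_needed(n_cand_k, n_cell):
    """Rounds required when each cell holds one candidate (assumption)."""
    return math.ceil(n_cand_k / n_cell)


def items_streamed(n_cand_k, n_cell, db_items):
    """Total items streamed if the full database is replayed every round."""
    return rounds_needed(n_cand_k, n_cell) * db_items


# Hypothetical figures: 2500 candidates, 1000 cells, 1,000,000 items in D.
rounds = rounds_needed(2500, 1000)
total = items_streamed(2500, 1000, 1_000_000)
print(rounds, total)
```

    Under these assumptions, reducing either the number of candidates (hash table filtering) or the number of items in the database (transaction trimming) directly lowers the streamed volume.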

    The hash table scheme proposed in DHP improves the performance of Apriori-based algorithms by filtering out infrequent candidate itemsets. In addition, DHP employs an effective pruning scheme to eliminate infrequent items in transactions. We summarize these two schemes below. In the hash table scheme, a hash function is applied to all of the candidate k-itemsets generated by frequent sub-itemsets. Each candidate k-itemset is mapped to a hash value, and itemsets with the same hash value are put into the same bucket of the hash table. If the count of a bucket is less than the minimum support threshold, the number of occurrences in the database of each candidate itemset in that bucket is also less than the minimum support threshold, so those candidates can be discarded.
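    The hash table scheme summarized above can be sketched as follows. This is an illustrative software version only: it assumes, as in DHP, that bucket counts are accumulated from the k-item subsets of every transaction, and the hash function bucket_of, the bucket count, and the sample data are hypothetical.

```python
from itertools import combinations


def bucket_of(itemset, num_buckets):
    # Hypothetical hash function; any deterministic mapping would do.
    return sum(itemset) % num_buckets


def build_hash_table(transactions, k, num_buckets):
    """Count, per bucket, the k-item subsets of every transaction."""
    table = [0] * num_buckets
    for transaction in transactions:
        for subset in combinations(sorted(set(transaction)), k):
            table[bucket_of(subset, num_buckets)] += 1
    return table


def filter_candidates(candidates, table, min_support):
    """Keep only candidates whose bucket count reaches min_support."""
    return [c for c in candidates
            if table[bucket_of(c, len(table))] >= min_support]


transactions = [[1, 2, 3], [1, 2], [2, 3, 4], [1, 2, 4]]
candidates_2 = [(1, 2), (1, 3), (2, 3), (3, 4)]
h2 = build_hash_table(transactions, k=2, num_buckets=7)
kept = filter_candidates(candidates_2, h2, min_support=2)
print(h2, kept)
```

    Because several itemsets may share a bucket, a bucket count is an upper bound on the support of each itemset in it. The filter can therefore let some infrequent candidates through, but it never discards a frequent one.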