Digital Library Service – An overview

- Introduction
- System Architecture
- Components and their functionalities
- Experimental Results

Introduction

Peer-to-Peer (P2P) Information Retrieval framework
- Peers that share information
- Cumulative bandwidth
- High processing power and storage
- Absence of high-cost hardware

Three generations of P2P networks
- 1st generation: centralized database for coordinated lookup (Napster)
- 2nd generation: flooding to search every node on the network (Gnutella)
- 3rd generation: Distributed Hash Tables (Tapestry, Chord, Pastry, CAN, Kademlia); each node uses a routing table to maintain the addresses of its neighbours

In third-generation (3G) P2P networks, between log N and N nodes have to be contacted to reach the destination.

In the proposed method, the target peer can be contacted directly from the source peer. The search then occurs within the target peer, which retrieves the file reference using keyword indices held in a B+ tree.

System Architecture

P2P cluster and Hadoop cluster
- Hadoop cluster
  - Extracts keywords for efficient searching
  - MapReduce programming paradigm
- P2P cluster
  - Uploads files
  - Services search requests

[Figure: System architecture. Hadoop cluster: a MapReduce master (JobTracker) with a DFS master (NameNode), and two slaves, each with a MapReduce slave (TaskTracker) and a DFS slave (DataNode). P2P cluster. Diagram label: Keyword extraction.]

Hadoop
- Software platform to handle vast amounts of data
- Moves computation to the place of data rather than moving large data blocks to the place of computation
- HDFS and MapReduce framework
  - HDFS: NameNode and DataNode
  - MapReduce computation
    - Map: the input data set is split into fragments and each fragment is assigned to a map task, which produces (K, V) pairs
    - Reduce: merges all intermediate values associated with a key
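The map and reduce roles described above can be illustrated with a small single-process sketch in plain Python. It is not Hadoop code and not the presenters' implementation: the stop-word list, the tokenizer, and the function names (map_phase, shuffle, reduce_phase, extract_keywords) are assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; the slides do not say how keywords are chosen.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "is"}


def map_phase(doc_name, text):
    """Emit one (keyword, doc_name) pair per non-stop-word in a fragment."""
    for word in re.findall(r"[a-z]+", text.lower()):
        if word not in STOPWORDS:
            yield word, doc_name


def shuffle(pairs):
    """Group intermediate values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(keyword, doc_names):
    """Merge all values for one key into a sorted list without duplicates."""
    return keyword, sorted(set(doc_names))


def extract_keywords(docs):
    """Run map, shuffle and reduce over a dict of {doc_name: text}."""
    pairs = [p for name, text in docs.items() for p in map_phase(name, text)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(k, v) for k, v in sorted(grouped.items()))
```

Calling extract_keywords on a dictionary of document names and texts returns a keyword-to-documents index. On a real cluster the map calls would run in parallel on the nodes that hold the data blocks.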

[Figure: MapReduce keyword extraction. Input blocks (D1,B1; D2,B1; D1,B2; D1,B3; D3,B1; D2,B2; D3,B2) are processed by mappers (M) in Map Tasks 1 to 3. After "Sort and Group" per document (D1, D2), each key K1 to K6 carries a value list, for example K2,[C1,C4] for D1 and K3,[C2,C6] for D2. Reducers (R) in Reduce Tasks 1 and 2 output one (K, I) pair per key.]

B+ Tree – IP and its hash
- Represents sorted data indexed by a key for efficient insertion, retrieval and removal of records
- Inserting or searching a record requires O(log_B N) operations in the worst case (B: order of the tree, N: number of nodes)
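To make the "sorted data indexed by a key" idea concrete, here is a minimal sketch that uses a sorted Python list with binary search as a flat stand-in for the leaf level of a B+ tree. It is not a B+ tree: insertion into a list is linear in the worst case, whereas a real B+ tree keeps both search and insertion logarithmic. The class name SortedIndex and its methods are hypothetical.

```python
import bisect


class SortedIndex:
    """Sorted key -> value map; a flat stand-in for B+ tree leaves."""

    def __init__(self):
        self._keys = []
        self._values = []

    def _find(self, key):
        """Return (position, found) for key using binary search."""
        i = bisect.bisect_left(self._keys, key)
        return i, i < len(self._keys) and self._keys[i] == key

    def insert(self, key, value):
        i, found = self._find(key)
        if found:
            self._values[i] = value      # overwrite an existing key
        else:
            self._keys.insert(i, key)    # keep keys in sorted order
            self._values.insert(i, value)

    def get(self, key):
        i, found = self._find(key)
        return self._values[i] if found else None

    def remove(self, key):
        i, found = self._find(key)
        if found:
            del self._keys[i]
            del self._values[i]
        return found

    def keys(self):
        return list(self._keys)
```

With hash values as keys and peer addresses as values, get performs the same kind of ordered lookup that the slides attribute to the B+ tree.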

[Figure: Example B+ tree over IP hash values with keys 153, 270, 305 and 450; one entry pairs IP14 with 450.]

DLS Components

Start-up component
- Starting up the Hadoop cluster
- Identifying nodes to participate in the P2P cluster
- Determining the IP hash values for the peers using SHA-1 (160-bit reduced to 40-bit)
- Forming the B+ tree
- Uploading the B+ trees to the other peers
- Starting the web server
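The hashing step can be sketched with Python's standard hashlib module. The slides do not say how the 160-bit SHA-1 digest is reduced to 40 bits, so keeping the first five bytes is an assumption, as are the function name hash40 and the example address.

```python
import hashlib


def hash40(value: str) -> int:
    """Return a 40-bit integer derived from the SHA-1 digest of value.

    Assumption: the reduction keeps the first 5 bytes (40 bits) of the
    20-byte (160-bit) digest; the original scheme may differ.
    """
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:5], "big")


# Hypothetical peer address, used only to show the call.
peer_hash = hash40("192.168.1.10")
```

The same function can be applied to keywords, which places peers and keywords in one 40-bit key space.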

DB Distribution Component
- Keyword extraction using the Hadoop cluster
- Hashing keywords with SHA-1 (160-bit reduced to 40-bit)
- Find the peer with the relatively closest match
- Upload to the target peer
- Update the B+ tree (keyword – file reference) in the target
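The distribution steps above might look as follows in a single-process sketch. Several points are assumptions: "closest match" is taken to be the smallest absolute difference between the keyword hash and the peer hash, each peer's B+ tree is replaced by a plain dictionary, and the names closest_peer and distribute are hypothetical.

```python
import hashlib


def hash40(value):
    """40-bit hash: first 5 bytes of the SHA-1 digest (assumed reduction)."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:5], "big")


def closest_peer(key_hash, peer_hashes):
    """Return the IP whose hash is numerically closest to key_hash.

    peer_hashes maps ip -> hash; ties are broken by IP string order.
    """
    return min(peer_hashes, key=lambda ip: (abs(peer_hashes[ip] - key_hash), ip))


def distribute(file_ref, keywords, peer_hashes, peer_indices):
    """Record keyword -> file_ref in the index of each keyword's target peer.

    peer_indices maps ip -> {keyword: [file_ref, ...]} and stands in for the
    per-peer B+ tree. Returns {keyword: target_ip}.
    """
    placed = {}
    for keyword in keywords:
        target = closest_peer(hash40(keyword), peer_hashes)
        refs = peer_indices.setdefault(target, {}).setdefault(keyword, [])
        refs.append(file_ref)
        placed[keyword] = target
    return placed
```

In this sketch only the index entry is written; transferring the document itself to the target peer is left out.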

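[Figure: DB distribution flow. Documents (Doc 1, Doc 2, ..., Doc n) enter the Hadoop cluster, which outputs the file name and list of keywords. The search keys are hashed, the target is identified, and the document is uploaded to the target among the peers in the P2P network.]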

Search Component
- Process keywords
- Find the 40-bit hash value
- Search the B+ tree in the peer to identify the target node
- Search the B+ tree in the target node to retrieve the file reference
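The two-stage lookup above can be sketched as follows. A sorted list of (hash, ip) pairs searched with bisect stands in for the peer-level B+ tree, and a dictionary stands in for the keyword B+ tree inside the target peer; both substitutions, the closeness rule, and the names nearest and search are assumptions.

```python
import bisect
import hashlib


def hash40(value):
    """40-bit hash: first 5 bytes of the SHA-1 digest (assumed reduction)."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:5], "big")


def nearest(sorted_hashes, key_hash):
    """Return the element of a non-empty sorted list closest to key_hash."""
    i = bisect.bisect_left(sorted_hashes, key_hash)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_hashes[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda h: abs(h - key_hash))


def search(keyword, peer_table, peer_indices):
    """Stage 1: pick the target peer. Stage 2: look up the keyword there.

    peer_table is a sorted list of (hash, ip) pairs; peer_indices maps
    ip -> {keyword: [file_ref, ...]}. Returns (target_ip, file_refs).
    """
    hashes = [h for h, _ in peer_table]
    target_hash = nearest(hashes, hash40(keyword))
    target_ip = dict(peer_table)[target_hash]
    return target_ip, peer_indices.get(target_ip, {}).get(keyword, [])
```

The target is found from the local table in one step, so a query involves the source peer and one target peer only.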

[Figure: Search flow between PEER1 and PEER2 in the P2P network. A search request carries the list of keywords; the search keys are hashed; the search node is identified using the relative difference between the hash values of the keywords and the IP addresses in the B+ tree; the document is then searched for in the target peer.]

Add/Delete Peer
- Update the IP address table
- Compute the IP hash of the newly added peer
- Reconstruct the B+ tree and update it in the peers
- Relocate the appropriate files to the new peer
- Modify metadata in the peers
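The relocation step for a newly added peer can be sketched as below. It assumes the closest-hash placement rule and uses dictionaries in place of the B+ trees; the function names and the small integer hashes are hypothetical and chosen only to keep the example easy to follow.

```python
def closest_peer(key_hash, peer_hashes):
    """Return the IP whose hash is numerically closest to key_hash."""
    return min(peer_hashes, key=lambda ip: (abs(peer_hashes[ip] - key_hash), ip))


def add_peer(new_ip, new_hash, peer_hashes, peer_indices, key_hashes):
    """Register a peer and move the index entries that now belong to it.

    peer_hashes:  ip -> hash (the IP address table, updated in place)
    peer_indices: ip -> {keyword: [file_ref, ...]} (updated in place)
    key_hashes:   keyword -> hash
    Returns the sorted list of relocated keywords.
    """
    peer_hashes[new_ip] = new_hash
    peer_indices.setdefault(new_ip, {})
    moved = []
    for ip in list(peer_indices):
        if ip == new_ip:
            continue
        for keyword in list(peer_indices[ip]):
            if closest_peer(key_hashes[keyword], peer_hashes) == new_ip:
                # The new peer is now the closest match: relocate the entry.
                peer_indices[new_ip][keyword] = peer_indices[ip].pop(keyword)
                moved.append(keyword)
    return sorted(moved)
```

Only entries whose closest peer changes are moved; entries that still map to their current peer stay in place.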

Experimental Results – Keyword extraction from multiple files (1 MB each)

[Figure: Keyword extraction measured for 1 to 7 files. X-axis: No. of files. Y-axis: logarithmic scale from 1.0E+0 to 1.0E+12.]

Observation: depends on the number of keywords

Cluster set-up time

[Figure: Cluster set-up time. X-axis: No. of nodes (2, 4, 6, 10). Y-axis: 0.00E+00 to 9.00E+09.]

It is a factor of the number of nodes

Add a new peer

[Figure: Time to add a new peer. X-axis: No. of nodes (2–3, 3–4, 4–5, 5–6, 6–7, 7–8, 8–9, 9–10). Series: 5, 10 and 20 keywords. Y-axis: 0.00E+00 to 2.50E+10.]

It is a factor of the number of keywords (for 1 peer)

Performance of the data distribution component

[Figure: Load time. X-axis: No. of keywords (5, 10, 20). Series: 2, 4, 6 and 10 nodes. Y-axis: 2E+10 to 1.2E+11.]

Load time is a factor of the number of keywords

Performance of the search component

[Figure: Search time. X-axis: No. of nodes (tick labels 1 to 5 and 2, 4, 6, 8, 10). Y-axis: logarithmic scale from 1.00E+06 to 1.00E+08.]

Search time remains constant (9 msec)
- B+ tree and search distribution

Conclusion

The P2P Information Retrieval framework
- uses the 3G P2P DHT approach
- maintains B+ trees in the peers
- uses Hadoop for keyword extraction from multiple files in parallel
- provides efficient search on the peers

THANK YOU
