1 RESOURCE DISCOVERY Presenter: Cù Nguyễn Phương Hà.
1
RESOURCE DISCOVERY
Presenter: Cù Nguyễn Phương Hà
2
Overview
- Introduction of problems
- Several approaches
- Solution model
4
Introduction
The goal of the VN-Grid project is to connect the computational resources available on the network, so that resources from many sites can be used together to solve large scientific problems.
It is therefore necessary to know which resources are available at all Grid sites and to find the Grid sites that have them. This is the job of the Resource Discovery services.
5
Functions of Resource Discovery
The prospective Resource Discovery service at each Grid site must be able to know, find, and provide resource information from the other sites.
Its main function: on receiving a specific resource request from a client, the Resource Discovery service must find reliable information about the Grid sites in the network that have available resources satisfying the query.
6
Resources in VN-Grid
There are three kinds of resources:
- Resources for executing jobs (computing resources): information about the resources used to execute a submitted job, for example computational power, data storage, and network bandwidth.
- Information about services: information about the services a user wants to learn about, for example Information Services and Resource Discovery Services.
- Information about applications: information about special applications deployed on the Grid, such as MPI and POP-C++.
7
Resources in VN-Grid
Characteristics of resources in the VN-Grid environment:
- The resources are heterogeneous, not only across the network but also within each Grid site.
- The resources have a variety of properties with different data types.
- The existing resources vary continuously, especially the computing resources, for example CPUs, memory, disk, and network bandwidth.
- New resources are continually being published.
8
Forwarding in VN-Grid
The proposed VN-Grid infrastructure simulates a Peer-to-Peer model, in which clients rather than servers control the networking. This means that peers can exchange information directly.
- Interaction is limited to known peers.
- All peers are treated equally.
- The number of peers participating in the Grid can grow enormously.
9
Summary
A good resource discovery service must:
- Provide accurate, up-to-date, and sufficient information in a timely manner.
- Be flexible with respect to resource features such as variety, heterogeneity, and newly added resources.
- Be scalable, adapting as the number of peers in the Grid environment rises.
- Reduce the cost of transmitting information in the P2P environment.
11
Several Approaches
Resource description
- Global Grid Forum: JSDL
- EDAGrid: RDSResult
Matching method
- EDAGrid: Ranking, Set-matching
Forwarding method
- Napster: Centralized Indexing
- Gnutella: Flooding Query
- Chord: Indexing Using Distributed Hash Tables
- HyperCuP system: Interests-based
- Ant Colony Optimizing
13
JSDL
JSDL (Job Submission Description Language) is used to describe the requirements of computational jobs for submission to resources, particularly in Grid environments.
14
JSDL
<JobDefinition>
  <JobDescription>
    <JobIdentification ... />
    <Application ... />
    <Resources ... />
    <DataStaging ... />
  </JobDescription>
  <xsd:any##other/>
</JobDefinition>
Resources
Name of attribute | Type | Description
CandidateHosts | complex type | Specifies the set of named hosts which may be selected for running the job.
FileSystem | complex type | Describes a filesystem that is required by the job.
ExclusiveExecution | xsd:boolean | Designates whether the job must have exclusive access to the resources allocated to it by the consuming system.
OperatingSystem | complex type | Defines the operating system required by the job.
Resources
Name of attribute | Type | Description
IndividualCPUSpeed | jsdl:RangeValue_Type | The speed of each CPU required by the job in the execution environment.
IndividualCPUTime | jsdl:RangeValue_Type | The total number of CPU seconds required on each resource to execute the job.
IndividualCPUCount | jsdl:RangeValue_Type | The number of CPUs for each of the resources to be allocated to the job submission.
IndividualNetworkBandwidth | jsdl:RangeValue_Type | The bandwidth requirements of each individual resource.
IndividualPhysicalMemory | jsdl:RangeValue_Type | The amount of physical memory required on each individual resource.
IndividualVirtualMemory | jsdl:RangeValue_Type | The required amount of virtual memory for each of the resources to be allocated for this job submission.
IndividualDiskSpace | jsdl:RangeValue_Type | The required amount of disk space for each resource allocated to the job.
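A range-valued requirement such as IndividualCPUCount can be checked against a resource's actual value with a simple bounds test. The sketch below is a simplification under stated assumptions: `RangeValue` and `satisfies` are hypothetical names, and the class models only one inclusive lower and upper bound rather than the full jsdl:RangeValue_Type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RangeValue:
    """Simplified stand-in for a JSDL range value: optional inclusive bounds."""
    lower: Optional[float] = None
    upper: Optional[float] = None

    def contains(self, value: float) -> bool:
        if self.lower is not None and value < self.lower:
            return False
        if self.upper is not None and value > self.upper:
            return False
        return True


def satisfies(resource: dict, requirements: dict) -> bool:
    """True if every required attribute is present and inside its range."""
    return all(
        name in resource and rng.contains(resource[name])
        for name, rng in requirements.items()
    )


# Hypothetical job requirements and two candidate nodes.
requirements = {
    "IndividualCPUCount": RangeValue(lower=2),
    "IndividualPhysicalMemory": RangeValue(lower=4e9),
}
node_a = {"IndividualCPUCount": 4, "IndividualPhysicalMemory": 8e9}
node_b = {"IndividualCPUCount": 1, "IndividualPhysicalMemory": 8e9}
print(satisfies(node_a, requirements))
print(satisfies(node_b, requirements))
```

Here node_a meets both lower bounds, while node_b fails on CPU count.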
Resources
Name of attribute | Type | Description
TotalCPUTime | jsdl:RangeValue_Type | The total number of CPU seconds required across all CPUs used to execute the job.
TotalCPUCount | jsdl:RangeValue_Type | The total number of CPUs required for this job submission.
TotalPhysicalMemory | jsdl:RangeValue_Type | The required amount of physical memory for the entire job across all resources.
TotalVirtualMemory | jsdl:RangeValue_Type | The required total amount of virtual memory for the job submission.
TotalDiskSpace | jsdl:RangeValue_Type | The required total amount of disk space that should be allocated to the job.
TotalResourceCount | jsdl:RangeValue_Type | The total number of resources required by the job.
Resources
CandidateHosts
Name of attribute | Type | Description
HostName | xsd:string | A simple type containing a single name of a host.
Resources
FileSystem
Name of attribute | Type | Description
Description | xsd:string |
MountPoint | xsd:string | A local location that MUST be made available in the allocated resources for the job.
MountSource | xsd:string | A remote location that MUST be made available locally for the job.
DiskSpace | jsdl:RangeValue_Type | The required amount of disk space on the containing FileSystem element for the job.
FileSystemType | jsdl:FileSystemTypeEnumeration | A token that describes the type of filesystem of the containing FileSystem element.
Resources
OperatingSystem
Name of attribute | Type | Description
OperatingSystemType | complex type | Contains the name of the operating system.
OperatingSystemVersion | xsd:string | Defines the version of the operating system required by the job.
Description | xsd:string |
Resources
CPUArchitecture
Name of attribute | Type | Description
CPUArchitectureName | jsdl:ProcessorArchitectureEnumeration | A token specifying the CPU architecture required by the job in the execution environment.
OperatingSystemType
Name of attribute | Type | Description
OperatingSystemName | jsdl:OperatingSystemTypeEnumeration | A token type that contains the name of the operating system.
22
RDSResults EDAGrid
<RDSResultSet count = <num> >
  <Discovery rank = "" id = "">
    <Resource rank = "" id = "" nodeCount = "" >
      <WSGRAM/> <ReservationAddress/>
      <OSName/> <OSVersion/> <Platform/>
    </Resource> +
  </Discovery>
</RDSResultSet>
23
RDSResults
<RDSResultSet count = <num> >
  <Discovery rank = "" id = "">
    <Individual rank = "" id = "" nodeCount = "" >
    </Individual>
    <Total/>
    <InteractBandwidth/>
  </Discovery>
</RDSResultSet>
25
Ranking EDAGrid
FUNCTION Ranking_Algorithm
  FOR each (Ri satisfying the individual condition)
    Rank(Ri) = 0
    FOR each (Aj, an attribute of the job)
      Rank(Ri) = Rank(Ri) + (w[j] * R[i,j] / A[j])
    Next A
  Next R
  Sort the Resource Set
  Return the list of resources in order
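The ranking loop can be sketched in Python as follows. This is an illustration, not the EDAGrid implementation: the operators in the rank term were lost in the transcript, so the form w[j] * R[i][j] / A[j] (a weighted ratio of what the resource offers to what the job requires) is an assumption, as are the function name `rank_resources` and the dictionary layout.

```python
def rank_resources(resources, job, weights):
    """Return (rank, resource_id) pairs sorted from highest to lowest rank.

    resources: {resource_id: {attribute: value}}, already filtered to those
               satisfying the individual condition.
    job:       {attribute: required value}, assumed non-zero.
    weights:   {attribute: weight w[j]}.
    """
    ranked = []
    for rid, attrs in resources.items():
        rank = 0.0
        for name, required in job.items():
            # Assumed form of the garbled term: w[j] * R[i][j] / A[j]
            rank += weights[name] * attrs[name] / required
        ranked.append((rank, rid))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


# Hypothetical sites, job requirements, and weights.
resources = {
    "site1": {"cpu": 4, "mem": 8},
    "site2": {"cpu": 8, "mem": 4},
}
job = {"cpu": 2, "mem": 4}
weights = {"cpu": 0.5, "mem": 0.5}
print(rank_resources(resources, job, weights))
```

With these inputs site2 scores 2.5 and site1 scores 2.0, so site2 is listed first.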
26
Set-matching EDAGrid
The set-matching algorithm:
- Create an empty set.
- Add resources to the set one at a time, from the highest rank to the lowest.
- Each time, check whether the Total condition is met and whether the InteractBandwidth condition is violated.
- Terminate the loop when these conditions are satisfied or the number of grid nodes reaches the number the user required.
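A minimal sketch of this greedy loop, under assumptions the slide does not state: the Total condition is modeled as a required summed capacity, InteractBandwidth as a minimum pairwise bandwidth between chosen members, and a candidate that violates the bandwidth limit is simply skipped. The names `set_match` and `bandwidth_between` are hypothetical.

```python
def set_match(ranked, total_required, min_bandwidth, bandwidth, max_nodes):
    """Greedy set-matching over resources ordered from highest rank down.

    ranked:         list of (resource_id, capacity), best first.
    total_required: capacity the Total condition asks for.
    min_bandwidth:  lowest acceptable bandwidth between any two members.
    bandwidth:      function (id_a, id_b) -> bandwidth between them.
    max_nodes:      number of grid nodes the user asked for.
    Returns (chosen ids, whether the Total condition was met).
    """
    chosen, total = [], 0
    for rid, capacity in ranked:
        # Assumption: a candidate that violates InteractBandwidth is skipped.
        if any(bandwidth(rid, other) < min_bandwidth for other in chosen):
            continue
        chosen.append(rid)
        total += capacity
        if total >= total_required or len(chosen) >= max_nodes:
            break
    return chosen, total >= total_required


# Hypothetical link bandwidths between three sites.
links = {
    frozenset(("a", "b")): 10,
    frozenset(("a", "c")): 100,
    frozenset(("b", "c")): 100,
}


def bandwidth_between(x, y):
    return links[frozenset((x, y))]


print(set_match([("a", 4), ("b", 4), ("c", 4)], 8, 50, bandwidth_between, 3))
```

In this run site b is skipped because its link to a is too slow, and the set is completed with c.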
28
Centralized Indexing
This approach proposes centralized management of all the resources of all the grid sites.
A server machine holds the entire index of available resources on the network.
Users start the query process by sending a request to the index server.
The server sends answers to the users based on the stored information.
Advantage: queries are answered quickly.
Disadvantages:
- The server machine is a bottleneck
- The information must be updated continuously
- Not suitable for P2P
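The centralized scheme can be sketched as a single in-memory index. This is an illustration only: the class `IndexServer`, its methods, and the attribute names are hypothetical, and a real deployment would sit behind a network service.

```python
class IndexServer:
    """Single server holding the index of resources for every grid site."""

    def __init__(self):
        self._index = {}  # site name -> {attribute: value}

    def publish(self, site, attributes):
        # Sites must republish whenever their resources change.
        self._index[site] = dict(attributes)

    def query(self, **minimums):
        """Return the sites whose stored attributes meet every minimum."""
        return sorted(
            site for site, attrs in self._index.items()
            if all(attrs.get(k, 0) >= v for k, v in minimums.items())
        )


server = IndexServer()
server.publish("siteA", {"cpus": 16, "memory_gb": 64})
server.publish("siteB", {"cpus": 4, "memory_gb": 8})
print(server.query(cpus=8))
```

Every query and every update goes through the one `server` object, which is exactly where the bottleneck arises.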
29
Flooding
- The query is routed from one peer to all of its neighbors. In this way the query is sent throughout the network.
- If a peer finds the resources in its local storage, it sends the answer to the original peer that made the request.
- A Time-To-Live (TTL) limits the number of hops a request can travel, so that after a certain number of forwards the request automatically disappears from the network.
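TTL-limited flooding can be sketched as a breadth-first traversal of the peer graph. The function name `flood_query` and the data layout are hypothetical, and the `seen` set that stops a peer from processing the same query twice is an added assumption not stated on the slide.

```python
from collections import deque


def flood_query(neighbors, holdings, origin, wanted, ttl):
    """Flood a query from origin; return the peers that hold `wanted`.

    neighbors: {peer: [neighbor peers]}
    holdings:  {peer: set of resource names stored locally}
    ttl:       maximum number of hops the query may travel.
    """
    seen = {origin}
    hits = []
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if wanted in holdings.get(peer, set()):
            hits.append(peer)  # this peer would answer the origin directly
        if remaining == 0:
            continue  # TTL exhausted: the query is dropped here
        for nxt in neighbors.get(peer, []):
            if nxt not in seen:  # do not process the same query twice
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits


# Hypothetical chain of four peers: p1 - p2 - p3 - p4.
neighbors = {"p1": ["p2"], "p2": ["p1", "p3"], "p3": ["p2", "p4"], "p4": ["p3"]}
holdings = {"p3": {"cluster"}, "p4": {"cluster"}}
print(flood_query(neighbors, holdings, "p1", "cluster", ttl=2))
print(flood_query(neighbors, holdings, "p1", "cluster", ttl=3))
```

With a TTL of 2 the query reaches p3 but not p4; raising the TTL to 3 finds both holders at the cost of more messages.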
30
Indexing Using Distributed Hash Tables
In this method, each peer in the network holds a partition of the hash table.
Each entry in the hash table is a key in the key space that points to the peer where the requested file can be found.
When a file is requested, the file name is hashed by a uniform hash function.
Based on the hash value and the hash table, the lookup value is found and returned to the requester.
The cost of this method consists of the cost of building and updating the hash table and of routing the query to the location of the requested file.
Disadvantage: it does not apply to complex queries.
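The idea can be illustrated with a simplified hash ring. Note that this is not Chord's finger-table routing: the sketch resolves the responsible peer with a global sorted list, and the names `key_of` and `HashRing` are hypothetical.

```python
import hashlib
from bisect import bisect_left


def key_of(name, bits=16):
    """Hash a name uniformly onto a ring of 2**bits positions."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class HashRing:
    def __init__(self, peers):
        # Each peer owns the keys up to and including its own ring position.
        self._ring = sorted((key_of(p), p) for p in peers)
        self._tables = {p: {} for p in peers}  # each peer's partition

    def responsible_peer(self, name):
        positions = [pos for pos, _ in self._ring]
        # Wrap around to the first peer when the key is past the last one.
        i = bisect_left(positions, key_of(name)) % len(self._ring)
        return self._ring[i][1]

    def publish(self, file_name, holder):
        self._tables[self.responsible_peer(file_name)][file_name] = holder

    def lookup(self, file_name):
        return self._tables[self.responsible_peer(file_name)].get(file_name)


ring = HashRing(["peer1", "peer2", "peer3"])
ring.publish("dataset.h5", holder="peer2")
print(ring.lookup("dataset.h5"))
print(ring.lookup("missing.h5"))
```

A lookup works only for the exact name that was hashed, which is why this method does not support complex queries such as range conditions on resource attributes.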
31
Interests-based Methods
- This method is based on the interests of users.
- The idea is to search the peers that seem likely to contain what users have requested.
- To achieve this, peers are organized into groups of similar interest.
- Search queries are therefore forwarded to the relevant interest group, to get a high hit rate and reduce the time wasted searching other peers.
Disadvantages:
- A peer's interests may change over time
- Peers may have more than one interest
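Interest-based forwarding amounts to keeping a mapping from interests to peer groups and sending each query only to the matching group. The sketch below is hypothetical (class and method names are not from HyperCuP) and shows how both disadvantages surface as bookkeeping: peers join several groups and must be moved when their interests change.

```python
from collections import defaultdict


class InterestGroups:
    """Groups peers by declared interest and routes queries to one group."""

    def __init__(self):
        self._groups = defaultdict(set)  # interest -> peers

    def join(self, peer, interests):
        # A peer with several interests belongs to several groups.
        for interest in interests:
            self._groups[interest].add(peer)

    def leave(self, peer, interest):
        # Interests change over time, so membership must be updatable.
        self._groups[interest].discard(peer)

    def targets(self, interest):
        """Peers that a query about `interest` should be forwarded to."""
        return sorted(self._groups.get(interest, set()))


groups = InterestGroups()
groups.join("p1", ["physics", "chemistry"])
groups.join("p2", ["physics"])
groups.join("p3", ["biology"])
print(groups.targets("physics"))
groups.leave("p1", "physics")
print(groups.targets("physics"))
```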
32
Ant Colony Optimizing
- Ants start from their nests and wander randomly.
- Ants that find food return to their nests from memory and drop pheromone on their trails.
- Other ants that come across such a trail follow it to check for food instead of wandering randomly. If they find food, they return home and reinforce the pheromone on the trail.
- A key point is that the pheromone evaporates over time. The more time it takes an ant to travel back to its nest, the more pheromone evaporates.
- When an ant reaches an intersection, it has to decide which branch to take. Ants on a short branch complete the trip faster than those on a long branch, so the pheromone density on the short branch remains higher.
- Other ants are more likely to choose a branch according to its pheromone density. Eventually all the ants that go to get the food take the shortest branch.
33
Ant Colony Optimizing
Initially the ants may take the paths A→B→C→E, A→B→E, or A→B→D→E.
After the initial stage, most of the ants will take the shortest path, A→B→E.
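The convergence toward A→B→E can be illustrated with a small deterministic model. This is a simplified, expected-value sketch, not the algorithm from the cited papers: the function name, the evaporation rate, and the rule that a path gains pheromone in proportion to its share of ants divided by its length are all assumptions.

```python
def simulate_pheromone(path_lengths, rounds=50, evaporation=0.1):
    """Deterministic (expected-value) pheromone update over fixed paths.

    path_lengths: {path name: length in hops}
    Each round, ants split across paths in proportion to pheromone, and
    each path gains pheromone proportional to its share of ants divided
    by its length (shorter trips are reinforced more often).
    Returns the final share of ants on each path.
    """
    pheromone = {path: 1.0 for path in path_lengths}
    for _ in range(rounds):
        total = sum(pheromone.values())
        shares = {p: pheromone[p] / total for p in pheromone}
        for p, length in path_lengths.items():
            pheromone[p] = (1 - evaporation) * pheromone[p] + shares[p] / length
    total = sum(pheromone.values())
    return {p: pheromone[p] / total for p in pheromone}


# The three paths from the slide, with length counted in hops.
paths = {"A-B-C-E": 3, "A-B-E": 2, "A-B-D-E": 3}
shares = simulate_pheromone(paths)
print(max(shares, key=shares.get))
```

All paths start with equal pheromone, but the two-hop path is reinforced faster than it evaporates relative to the others, so its share of ants keeps growing.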
37
Thank You
3
Overview
Introduction of problems Several approaches Solution model
4
Introduction
The goal of VN-Grid project is connecting the available computational resources on the network to utilize available resources from those sites to resolve big scientific problems
Therefore knowing resources available from all Grid sites and finding which Grid sites having available resources are necessary =gt Resource Discovery services
5
Functions of Resource Discovery
The prospective Resource Discovery services in each Gridsite must be able to know find and provide the resource information from others
The main function is that when receiving a specific request about resources form client Resource Discovery services must find out reliable information about Gridsites in the network that possess available resources satisfying the query
6
Resources in VN-Grid
There are three kinds of resources Resources for executing job or computing
resources It is information about the resources used to execute submitted job for example the computational power data storage network bandwidth
Information about services these are information about the services which user wants to learn about for example Information Services Resource Discovery Services
Information about applications these are information about special applications deployed on Grid such as MPI POP C
7
Resources in VN-Grid
Characteristics of Resources in VN-Grid environment The resources are heterogeneous not only in the
network but also in each The resources have variety of properties with
different data types The existing resources continuously vary especially
the computing resources for example CPUs memory disk network bandwidth
New resources are continually being published
8
Forwarding in VN-Grid
The proposed VN-Grid infrastructure simulates a Peer-to-Peer model in which clients control the networking instead of servers that means those peers could exchange information directly
Interacting is limit to known peers The peers are equally considered The number of peers participating in Grid can be
raised enormously
9
Summary
Good resource discovery services must Provide the most exact update and sufficient
information with timely solution Be flexible with features of resources such as
variety heterogeneity and newly added resources Be scalable to adapt with the number of peers in
Grid environment rising Reduce the expense of transmitting information in
P2P environment
10
Overview
Introduction of problems Several approaches Solution model
11
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
12
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
13
JSDL
JSDL is used to describe the requirements of computational jobs for submission to resources particularly in Grid environments
14
JSDL
ltJobDefinitiongtltJobDescriptiongt
ltJobIdentification gtltApplication gtltResources gtltDataStaging gt
ltJobDescriptiongtltxsdanyothergt
ltJobDefinitiongt
Resources
This is a complex type that defines the operating system required by the jobcomplex typeOperatingSystem
This is a boolean that designates whether the job must have exclusive access to the resources allocated to it by the consuming systemxsdbooleanExclusiveExecution
This element describes a filesystem that is required by the jobcomplex typeFileSystem
This element is a complex type specifying the set of named hosts which may be selected for running the jobcomplex typeCandidateHosts
DescriptionTypeName of attribute
Resources
Resources
Name of attribute | Type | Description
IndividualCPUSpeed | jsdl:RangeValue_Type | This element is a range value specifying the speed of each CPU required by the job in the execution environment.
IndividualCPUTime | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPU seconds required on each resource to execute the job.
IndividualCPUCount | jsdl:RangeValue_Type | This element is a range value specifying the number of CPUs for each of the resources to be allocated to the job submission.
IndividualNetworkBandwidth | jsdl:RangeValue_Type | This element is a range value specifying the bandwidth requirements of each individual resource.
IndividualPhysicalMemory | jsdl:RangeValue_Type | This element is a range value specifying the amount of physical memory required on each individual resource.
IndividualVirtualMemory | jsdl:RangeValue_Type | This element is a range value specifying the required amount of virtual memory for each of the resources to be allocated for this job submission.
IndividualDiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required amount of disk space for each resource allocated to the job.
Resources
Name of attribute | Type | Description
TotalCPUTime | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPU seconds required across all CPUs used to execute the job.
TotalCPUCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPUs required for this job submission.
TotalPhysicalMemory | jsdl:RangeValue_Type | This element is a range value specifying the required amount of physical memory for the entire job across all resources.
TotalVirtualMemory | jsdl:RangeValue_Type | This element is a range value specifying the required total amount of virtual memory for the job submission.
TotalDiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required total amount of disk space that should be allocated to the job.
TotalResourceCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of resources required by the job.
Resources
CandidateHosts
Name of attribute | Type | Description
HostName | xsd:string | This element is a simple type containing a single name of a host.
Resources
FileSystem
Name of attribute | Type | Description
Description | xsd:string |
MountPoint | xsd:string | This is a string that describes a local location that MUST be made available in the allocated resources for the job.
MountSource | xsd:string | This is a string that describes a remote location that MUST be made available locally for the job.
DiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required amount of disk space on the containing FileSystem element for the job.
FileSystemType | jsdl:FileSystemTypeEnumeration | This is a token that describes the type of filesystem of the containing FileSystem element.
Resources
OperatingSystem
Name of attribute | Type | Description
OperatingSystemType | complex type | This is a complex type that contains the name of the operating system.
OperatingSystemVersion | xsd:string | This element is a string that defines the version of the operating system required by the job.
Description | xsd:string |
Resources
CPUArchitecture
Name of attribute | Type | Description
CPUArchitectureName | jsdl:ProcessorArchitectureEnumeration | This element is a token specifying the CPU architecture required by the job in the execution environment.
OperatingSystemType
Name of attribute | Type | Description
OperatingSystemName | jsdl:OperatingSystemTypeEnumeration | This is a token type that contains the name of the operating system.
22
RDSResults EDAGrid
<RDSResultSet count = <num> >
  <Discovery rank = "" id = "">
    <Resource rank = "" id = "" nodeCount = "" >
      <WSGRAM> <ReservationAddress>
      <OSName> <OSVersion> <Platform>
    </Resource> +
  </Discovery>
</RDSResultSet>
23
RDSResults
<RDSResultSet count = <num> >
  <Discovery rank = "" id = "">
    <Individual rank = "" id = "" nodeCount = "" >
    </Individual>
    <Total>
    <InteractBandwidth>
  </Discovery>
</RDSResultSet>
24
Several Approaches
Resources description
– Global Grid Forum: JSDL
– EDAGrid: RDSResult
Matching method
– EDAGrid: Ranking, Set-matching
Forwarding method
– Napster: Centralized Indexing
– Gnutella: Flooding Query
– Chord: Indexing Using Distributed Hash Tables
– HyperCuP system: Interests-based
– Ant Colony Optimizing
25
Ranking EDAGrid
FUNCTION Ranking_Algorithm
  FOR each (Ri satisfying the individual condition)
    Rank(Ri) = 0
    FOR each (Aj, an attribute of the job)
      Rank(Ri) = Rank(Ri) + (w[j] * R[i,j] / A[j])
    Next A
  Next R
  Sort the Resource Set
  Return the list of resources in rank order
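The ranking loop can be sketched in Python as follows. This is only an illustration under stated assumptions, not the EDAGrid implementation: it reads the rank term as w[j] * R[i,j] / A[j] (weight times offered value over requested value), and it takes the "individual condition" to mean that a resource offers at least the requested amount of every attribute. The function name, the dictionary layout, and the sample sites are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_resources(
    resources: Dict[str, Dict[str, float]],
    job: Dict[str, float],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (resource id, rank) pairs sorted from highest to lowest rank.

    Requested amounts in `job` are assumed to be non-zero.
    """
    ranked = []
    for resource_id, offered in resources.items():
        # Assumed individual condition: every requested attribute is met.
        if any(offered.get(name, 0) < needed for name, needed in job.items()):
            continue
        # Weighted sum of offered/requested ratios over the job's attributes.
        rank = sum(
            weights[name] * offered[name] / needed for name, needed in job.items()
        )
        ranked.append((resource_id, rank))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


sites = {
    "siteA": {"cpu": 8, "mem": 16},
    "siteB": {"cpu": 4, "mem": 32},
    "siteC": {"cpu": 2, "mem": 64},  # fails the CPU requirement below
}
ranking = rank_resources(sites, {"cpu": 4, "mem": 8}, {"cpu": 0.5, "mem": 0.5})
```

With equal weights, siteB ranks above siteA because its memory surplus outweighs siteA's CPU surplus; changing the weights changes the order.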
26
Set-matching EDAGrid
The set-matching algorithm:
Create an empty set.
Add the resources to the set one at a time, from the highest rank to the lowest.
Each time, check whether the Total condition is met and whether the InteractBandwidth condition is violated.
Terminate the loop when these conditions are satisfied or the number of grid nodes reaches the number the user required.
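A minimal sketch of the set-matching loop, under several assumptions the slide does not spell out: the Total condition is modeled as a single summed capacity reaching a required amount, the InteractBandwidth check is a caller-supplied predicate over the candidate set, and a resource that would violate it is skipped rather than aborting the search. All names here are hypothetical.

```python
from typing import Callable, List, Tuple


def set_match(
    ranked: List[Tuple[str, float]],
    total_required: float,
    max_nodes: int,
    bandwidth_ok: Callable[[List[str]], bool],
) -> Tuple[List[str], bool]:
    """Greedily build a resource set from a list sorted by descending rank.

    `ranked` holds (resource id, capacity) pairs. Returns the chosen ids and
    whether the assumed Total condition was met.
    """
    selected: List[str] = []
    total = 0.0
    for resource_id, capacity in ranked:
        candidate = selected + [resource_id]
        # Skip a resource whose addition would violate InteractBandwidth.
        if not bandwidth_ok(candidate):
            continue
        selected = candidate
        total += capacity
        # Stop once the Total condition holds or the node limit is reached.
        if total >= total_required or len(selected) >= max_nodes:
            break
    return selected, total >= total_required


ranked_sites = [("siteB", 32), ("siteA", 16), ("siteC", 8)]
chosen, satisfied = set_match(ranked_sites, 40, 3, lambda ids: True)
```

Because the list is consumed in rank order, the result is the highest-ranked set that meets the conditions, not necessarily the smallest one.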
27
Several Approaches
Resources description
– Global Grid Forum: JSDL
– EDAGrid: RDSResult
Matching method
– EDAGrid: Ranking, Set-matching
Forwarding method
– Napster: Centralized Indexing
– Gnutella: Flooding Query
– Chord: Indexing Using Distributed Hash Tables
– HyperCuP system: Interests-based
– Ant Colony Optimizing
28
Centralized Indexing
Proposes centralized management of all the resources of all the grid sites.
A server machine holds the entire index of available resources on the network.
Users start the query process by sending a request to the index server.
The server sends answers to the users based on the information it has stored.
Advantage: fast lookup.
Disadvantages:
– The server machine is a bottleneck
– The information must be updated continuously
– Not suitable for P2P
29
Flooding
The query is routed from one peer to all of its neighbors. In this way the query is sent throughout the network.
If a peer finds the resources in its local storage, it sends the answer to the original peer that made the request.
A Time-To-Live (TTL) limits the number of hops a request can travel, so that after a certain number of forwards the request automatically disappears from the network.
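The effect of the TTL can be illustrated with a small breadth-first sketch. It assumes an in-memory neighbor map instead of real network messages, and it suppresses duplicate deliveries with a visited set, which the slide does not mention. The peer names, the `holdings` map, and the function name are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Set


def flood_query(
    neighbors: Dict[str, Iterable[str]],
    holdings: Dict[str, Set[str]],
    origin: str,
    resource: str,
    ttl: int,
) -> Set[str]:
    """Return the peers that would answer a query flooded from `origin`."""
    hits: Set[str] = set()
    seen = {origin}
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        # A peer answers if the resource is in its local storage.
        if resource in holdings.get(peer, set()):
            hits.add(peer)
        # A query whose TTL has run out is not forwarded any further.
        if remaining == 0:
            continue
        for neighbor in neighbors.get(peer, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits


overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B", "E"], "E": ["D"]}
stored = {"D": {"cpu"}, "E": {"cpu"}}
answers = flood_query(overlay, stored, "A", "cpu", ttl=2)
```

With a TTL of 2 the query reaches D but not E, which shows the trade-off: a small TTL saves traffic but can miss resources that exist farther away.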
30
Indexing Using Distributed Hash Tables
In this method, each peer in the network holds a partition of the hash table.
Each entry in the hash table maps a key in the key space to the peer where the requested file can be found.
When a file is requested, the file name is hashed by a uniform hash function.
Based on the hash value and the hash table, the lookup value is found and returned to the requester.
The cost of this method consists of the cost of building and updating the hash table and of routing the query to the location of the requested file.
Disadvantage: it does not apply to complex queries.
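A simplified sketch of the idea follows. It is not Chord itself: real Chord routes a lookup across peers through finger tables, whereas this example keeps a global view of the ring and only shows how a hashed file name selects the peer that stores the entry. The class name, peer names, and the choice of SHA-1 are assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left
from typing import Dict, List, Optional


def ring_hash(name: str) -> int:
    """Map a name onto the identifier ring with a uniform hash (SHA-1 here)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class SimpleDHT:
    def __init__(self, peers: List[str]) -> None:
        # Peers sorted by ring position; each peer owns one table partition.
        self.ring = sorted((ring_hash(peer), peer) for peer in peers)
        self.tables: Dict[str, Dict[str, str]] = {peer: {} for peer in peers}

    def responsible_peer(self, file_name: str) -> str:
        """Return the first peer at or after the key's position, wrapping around."""
        positions = [position for position, _ in self.ring]
        index = bisect_left(positions, ring_hash(file_name)) % len(self.ring)
        return self.ring[index][1]

    def publish(self, file_name: str, location: str) -> None:
        self.tables[self.responsible_peer(file_name)][file_name] = location

    def lookup(self, file_name: str) -> Optional[str]:
        return self.tables[self.responsible_peer(file_name)].get(file_name)


dht = SimpleDHT(["peer1", "peer2", "peer3"])
dht.publish("data.csv", "siteA")
found = dht.lookup("data.csv")
```

The lookup only succeeds for the exact name that was published, which is the limitation noted above: a range or multi-attribute query has no single hash value to route by.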
31
Interests-based Methods
This method is based on the interests of users.
The idea is to search on the peers that seem likely to contain what users have requested.
To achieve this, peers are organized into groups of similar interest.
Search queries are therefore forwarded to the relevant interest group, to achieve a high hit rate and avoid wasting time searching other peers.
Disadvantages:
– A peer's interests may change over time
– Peers may have more than one interest
32
Ant Colony Optimizing
Ants start from their nests and wander randomly.
The ants that find food return to their nests from memory and drop pheromone on their trails.
Other ants that come across such a trail follow it to check for food instead of wandering randomly.
If they find the food, they return home and reinforce the pheromone on the trail.
A key point is that the pheromone evaporates over time.
The more time it takes for an ant to travel back to its nest, the more pheromone evaporates.
When an ant reaches an intersection, it has to decide which branch to take.
The ants that take a short branch complete the trip faster than those that take a long branch.
Therefore the pheromone density on the short branch remains higher.
Other ants are more likely to choose a branch according to its pheromone density.
Eventually all the ants that go to get the food take the shortest branch.
33
Ant Colony Optimizing
Initially the ants may take the paths A→B→C→E, A→B→E, or A→B→D→E.
After the initial stage, most of the ants will take the shortest path, A→B→E.
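The convergence onto the shortest path can be reproduced with a small deterministic sketch. It is a simplification, not the algorithm from the cited work: instead of simulating individual random ants, it spreads an assumed number of ants over the paths in proportion to pheromone, deposits pheromone inversely proportional to path length, and applies a fixed evaporation rate. Path lengths are taken as hop counts, and the parameter values are arbitrary.

```python
from typing import Dict


def simulate_pheromone(
    path_lengths: Dict[str, int],
    rounds: int = 50,
    evaporation: float = 0.5,
    ants: float = 100.0,
) -> Dict[str, float]:
    """Return each path's final share of the total pheromone."""
    pheromone = {path: 1.0 for path in path_lengths}  # all paths start equal
    for _ in range(rounds):
        total = sum(pheromone.values())
        # Ants choose a path with probability proportional to its pheromone.
        shares = {path: level / total for path, level in pheromone.items()}
        for path, length in path_lengths.items():
            # Shorter paths are completed faster, so they receive more
            # pheromone per ant; part of the old pheromone evaporates.
            deposit = ants * shares[path] / length
            pheromone[path] = (1.0 - evaporation) * pheromone[path] + deposit
    total = sum(pheromone.values())
    return {path: level / total for path, level in pheromone.items()}


final_shares = simulate_pheromone({"A-B-C-E": 3, "A-B-E": 2, "A-B-D-E": 3})
best_path = max(final_shares, key=final_shares.get)
```

Every path starts with the same pheromone level, yet the two-hop path ends up holding almost all of it, matching the behavior described on the slide.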
34
Overview
Introduction of problems
Several approaches
Solution model
35
37
Thank You
5
Functions of Resource Discovery
The prospective Resource Discovery services in each Gridsite must be able to know find and provide the resource information from others
The main function is that when receiving a specific request about resources form client Resource Discovery services must find out reliable information about Gridsites in the network that possess available resources satisfying the query
6
Resources in VN-Grid
There are three kinds of resources Resources for executing job or computing
resources It is information about the resources used to execute submitted job for example the computational power data storage network bandwidth
Information about services these are information about the services which user wants to learn about for example Information Services Resource Discovery Services
Information about applications these are information about special applications deployed on Grid such as MPI POP C
7
Resources in VN-Grid
Characteristics of Resources in VN-Grid environment The resources are heterogeneous not only in the
network but also in each The resources have variety of properties with
different data types The existing resources continuously vary especially
the computing resources for example CPUs memory disk network bandwidth
New resources are continually being published
8
Forwarding in VN-Grid
The proposed VN-Grid infrastructure simulates a Peer-to-Peer model in which clients control the networking instead of servers that means those peers could exchange information directly
Interacting is limit to known peers The peers are equally considered The number of peers participating in Grid can be
raised enormously
9
Summary
Good resource discovery services must Provide the most exact update and sufficient
information with timely solution Be flexible with features of resources such as
variety heterogeneity and newly added resources Be scalable to adapt with the number of peers in
Grid environment rising Reduce the expense of transmitting information in
P2P environment
10
Overview
Introduction of problems Several approaches Solution model
11
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
12
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
13
JSDL
JSDL is used to describe the requirements of computational jobs for submission to resources particularly in Grid environments
14
JSDL
ltJobDefinitiongtltJobDescriptiongt
ltJobIdentification gtltApplication gtltResources gtltDataStaging gt
ltJobDescriptiongtltxsdanyothergt
ltJobDefinitiongt
Resources
This is a complex type that defines the operating system required by the jobcomplex typeOperatingSystem
This is a boolean that designates whether the job must have exclusive access to the resources allocated to it by the consuming systemxsdbooleanExclusiveExecution
This element describes a filesystem that is required by the jobcomplex typeFileSystem
This element is a complex type specifying the set of named hosts which may be selected for running the jobcomplex typeCandidateHosts
DescriptionTypeName of attribute
Resources
Resources
This is a range value that describes the required amount of disk space for each resource allocated to the job
jsdlRangeValue_TypeIndividualDiskSpace
This element is a range value specifying the required amount of virtual memory for each of theresources to be allocated for this job submission
jsdlRangeValue_Type
IndividualVirtualMemory
This element is a range value specifying the amount of physical memory required on each indi-vidual resource
jsdlRangeValue_Type
IndividualPhysicalMemory
This element is a range value specifying the bandwidth requirements of each individual resource
jsdlRangeValue_Type
IndividualNetworkBandwidth
This element is a range value specifying the number of CPUs for each of the resources to be allocated to the job submission
jsdlRangeValue_TypeIndividualCPUCount
This element is a range value specifying the total number of CPU seconds required on each resource to execute the job
jsdlRangeValue_TypeIndividualCPUTime
This element is a range value specifying the speed of each CPU required by the job in the execution environment
jsdlRangeValue_TypeIndividualCPUSpeed
DescriptionTypeName of attribute
Resources
Resources
This element is a range value specifying the total number of resources required by the job jsdlRangeValue_TypeTotalResourceCount
This is a range value that describes the required total amount of disk space that should be allocated to the job jsdlRangeValue_TypeTotalDiskSpace
This element is a range value specifying the required total amount of virtual memory for the jobsubmission jsdlRangeValue_TypeTotalVirtualMemory
This element is a range value specifying the required amount of physical memory for the entirejob across all resources jsdlRangeValue_TypeTotalPhysicalMemory
This element is a range value specifying the total number of CPUs required for this job submission jsdlRangeValue_TypeTotalCPUCount
This element is a range value specifying total number of CPU seconds required across all CPUs used to execute the job jsdlRangeValue_TypeTotalCPUTime
DescriptionTypeName of attribute
Resources
Resources
This element is a simple type containing a single name of a hostxsdstringHostName
CandidateHosts
Resources
This is a token that describes the type of filesystem of the containing FileSystem element
jsdlFileSystemTypeEnumerationFileSystemType
This is a range value that describes the required amount of disk space on the containing FileSystem element for the job
jsdlRangeValue_TypeDiskSpace
This is a string that describes a remote location that MUST be made available locally for the jobxsdstringMountSource
This is a string that describes a local location that MUST be made available in the allocated resources for the jobxsdstringMountPoint
xsdstringDescription
FileSystem
Resources
xsdstringDescription
This element is a string that defines the version of the operating system required by the jobxsdstringOperatingSystemVersion
This is a complex type that contains the name of the operating systemcomplex typeOperatingSystemType
OperatingSystem
Resources
This element is a token specifying the CPU architecture required by the job in the execution environment
jsdlProcessorArchitectureEnumerationCPUArchitectureName
CPUArchitecture
This is a token type that contains the name of the operating system
jsdlOperatingSystemTypeEnumerationOperatingSystemName
OperatingSystemType
22
RDSResults EDAGrid
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltResource rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltWSGRAMgtltReservationAddressgt
ltOSNamegt ltOSVersiongt ltPlatformgt
ltResourcegt +ltDiscoverygt
ltRDSResultSetgt
23
RDSResults
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltIndividual rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltIndividualgtltTotalgtltInteractBandwidthgt
ltDiscoverygt ltRDSResultSetgt
24
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
25
Ranking EDAGrid
FUN CTION Ranking_AlgorithmFOR each (Ri satisfy the individual condition)Rank(Ri) = 0FOR each (Aj is the attribute of job)Rank(Ri) = Rank(Ri) + (w[j] R[ij]A[j])Next ANext RSort the Resource SetReturn list of resource with order
26
Set-matching EDAGrid
The set-matching algorithm Create an empty set Add the resources into the set with the higher
rank one after one the lower Each time check if the Total condition is met
and if the InteractBandwidth is violated Terminate the loop if these conditions are
satisfied or the number of gridnodes reaches the number user required
27
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
28
Centralized Indexing
Proposes centralized management of all the resources of all the grid sites.
A server machine holds the full index of available resources on the network.
Users start the query process by sending a request to the index server.
The server answers users based on the information it has stored.
Advantage: queries are answered quickly.
Disadvantages:
– The server machine is a bottleneck.
– The information must be updated continuously.
– Not suitable for a P2P environment.
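As a rough illustration of the centralized approach, the sketch below keeps one in-memory index that sites publish to and users query. The class IndexServer, the site names, and the attribute names are hypothetical; this is not Napster's protocol.

```python
class IndexServer:
    """Single server that holds the index for every grid site."""

    def __init__(self):
        self.index = {}  # site name -> {attribute: value}

    def publish(self, site, attributes):
        # Sites must re-publish whenever their resources change.
        self.index[site] = dict(attributes)

    def query(self, **minimums):
        # Answer from stored information only; no other site is contacted.
        return sorted(
            site
            for site, attrs in self.index.items()
            if all(attrs.get(k, 0) >= v for k, v in minimums.items())
        )


server = IndexServer()
server.publish("site1", {"cpus": 16, "memory_gb": 64})
server.publish("site2", {"cpus": 4, "memory_gb": 8})
matches = server.query(cpus=8)
```

Every publish and every query goes through the same object, which is where the bottleneck and the continuous-update cost come from.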
29
Flooding
The query is routed from one peer to all of its neighbors. In this way, the query is sent throughout the network.
If a peer finds the resources in its local storage, it sends the answer to the original peer that made the request.
A Time-To-Live (TTL) limits the number of hops a request can travel, so that after a certain number of forwards the request automatically disappears from the network.
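The effect of the TTL can be seen in a small simulation. This is a sketch under stated assumptions: the network is an in-memory adjacency map, and peers drop a query they have already handled (the slide does not mention duplicate suppression). The names flood_query, neighbors, and holdings are hypothetical.

```python
from collections import deque


def flood_query(neighbors, holdings, origin, wanted, ttl):
    """Return (peers that answered, number of messages sent).

    neighbors: {peer: [adjacent peers]}
    holdings:  {peer: set of resources stored locally}
    """
    hits, messages = [], 0
    seen = {origin}                 # peers that already handled the query
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if wanted in holdings.get(peer, set()):
            hits.append(peer)       # the answer goes back to the origin
        if remaining == 0:
            continue                # TTL exhausted: the query dies here
        for nxt in neighbors.get(peer, []):
            messages += 1
            if nxt not in seen:     # duplicates are dropped on arrival
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits, messages


# A chain of four peers; only D holds the wanted resource.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
holdings = {"D": {"cpu"}}
near_hits, _ = flood_query(neighbors, holdings, "A", "cpu", ttl=2)
far_hits, _ = flood_query(neighbors, holdings, "A", "cpu", ttl=3)
```

With a TTL of 2 the query expires at C and finds nothing; with a TTL of 3 it reaches D, at the cost of more messages.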
30
Indexing Using Distributed Hash Tables
In this method, each peer in the network holds a partition of the hash table.
Each entry in the hash table covers part of the key space and points to the peer where the searched file can be found.
When a file is requested, the file name is hashed by a uniform hash function.
Based on the hash value and the hash table, the lookup value is found and returned to the requester.
The cost of this method consists of the cost of building and updating the hash table and of routing the query to the location of the searched file.
Disadvantage: it does not apply to complex queries.
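The partitioning idea can be sketched with a small hash ring. This is a simplified illustration, not Chord itself: it assumes SHA-1 as the uniform hash, assigns each key to the first peer at or after its position on the ring, and omits the multi-hop routing a real DHT performs. Ring, key_of, and the peer names are hypothetical.

```python
import hashlib
from bisect import bisect_left


def key_of(name, bits=16):
    """Hash a name uniformly onto a ring of 2**bits positions."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    def __init__(self, peers):
        # Each peer owns the keys between its predecessor and itself.
        self.points = sorted((key_of(p), p) for p in peers)
        self.tables = {p: {} for p in peers}  # one hash-table partition per peer

    def owner(self, name):
        keys = [k for k, _ in self.points]
        # First peer at or after the key; wrap around past the last peer.
        i = bisect_left(keys, key_of(name)) % len(self.points)
        return self.points[i][1]

    def publish(self, name, location):
        self.tables[self.owner(name)][name] = location

    def lookup(self, name):
        return self.tables[self.owner(name)].get(name)


ring = Ring(["p1", "p2", "p3"])
ring.publish("data.bin", "p2")
found = ring.lookup("data.bin")
```

Only an exact name hashes to the right partition, so a query such as "any site with at least 8 CPUs" cannot be answered this way, which is the disadvantage noted above.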
31
Interests-based Methods
This method is based on the interests of users.
The idea is to search on the peers that seem to contain what users have required.
To achieve this, peers are organized into groups of similar interest.
Search queries are therefore forwarded to the matching interest group, to get a high hit rate and avoid time wasted searching on other peers.
Disadvantages:
– Peers' interests may change over time.
– Peers may have more than one interest.
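A minimal sketch of interest-based forwarding follows, assuming the groups are already known and stored as a plain mapping; how groups are formed and maintained is outside this example. The function route_by_interest and all peer, group, and resource names are hypothetical.

```python
def route_by_interest(groups, holdings, interest, wanted):
    """Search only the peers grouped under the query's interest.

    groups:   {interest: [peers]}
    holdings: {peer: set of resources}
    Returns (peers that hold the resource, number of peers contacted).
    """
    candidates = groups.get(interest, [])
    hits = [p for p in candidates if wanted in holdings.get(p, set())]
    return hits, len(candidates)


groups = {"bio": ["p1", "p2"], "physics": ["p3", "p4", "p5"]}
holdings = {"p2": {"dataset-x"}, "p4": {"dataset-x"}}
hits, contacted = route_by_interest(groups, holdings, "bio", "dataset-x")
```

Only two of the five peers are contacted, but p4 is never asked even though it holds the resource, which illustrates the cost of a peer having more than one interest.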
32
Ant Colony Optimizing
Ants start from their nests and wander randomly.
The ants that find food return to their nests from memory and drop pheromone on their trails.
Other ants that come across such a trail follow it to check for food instead of wandering randomly.
If they find the food, they return home and reinforce the pheromone on the trail.
A key point is that the pheromone evaporates over time. The more time it takes for an ant to travel back to its nest, the more pheromone will have evaporated.
When an ant reaches an intersection, it has to decide which branch to take.
The ants that take a short branch complete the trip sooner than those that take a long branch. Therefore the pheromone density on the short branch remains higher.
Other ants are more likely to choose a branch with a higher pheromone density.
Eventually, all the ants that go to get the food will take the shortest branch.
33
Ant Colony Optimizing
Initially, the ants may take the paths A→B→C→E, A→B→E, or A→B→D→E.
After the initial stage, most of the ants will take the shortest path, A→B→E.
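The convergence onto the shortest path can be reproduced with a small deterministic model. This is a sketch under simplifying assumptions, not the algorithm from any cited paper: instead of simulating individual random ants, each round splits the colony across paths in proportion to pheromone, deposits pheromone inversely proportional to path length, and evaporates a fixed fraction. The function simulate_trails, the hop counts, and the evaporation rate are illustrative choices.

```python
def simulate_trails(lengths, rounds=50, evaporation=0.5):
    """Return the final share of ants on each path.

    lengths: {path: number of hops}
    """
    pheromone = {path: 1.0 for path in lengths}  # no preference at first
    for _ in range(rounds):
        total = sum(pheromone.values())
        for path, hops in lengths.items():
            share = pheromone[path] / total      # ants choose by density
            # Old pheromone evaporates; ants on short paths finish more
            # trips per unit time, so they deposit more.
            pheromone[path] = (1 - evaporation) * pheromone[path] + share / hops
    total = sum(pheromone.values())
    return {path: level / total for path, level in pheromone.items()}


shares = simulate_trails({"A-B-C-E": 3, "A-B-E": 2, "A-B-D-E": 3})
best = max(shares, key=shares.get)
```

All three paths start with equal pheromone, yet the two-hop path gains a larger share every round until nearly all ants use it.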
37
Thank You
6
Resources in VN-Grid
There are three kinds of resources:
– Resources for executing jobs, or computing resources: information about the resources used to execute a submitted job, for example computational power, data storage, and network bandwidth.
– Information about services: information about the services that a user wants to learn about, for example Information Services and Resource Discovery Services.
– Information about applications: information about special applications deployed on the Grid, such as MPI and POP-C++.
7
Resources in VN-Grid
Characteristics of resources in the VN-Grid environment:
– The resources are heterogeneous, not only across the network but also within each Grid site.
– The resources have a variety of properties with different data types.
– The existing resources vary continuously, especially the computing resources, for example CPUs, memory, disk, and network bandwidth.
– New resources are continually being published.
8
Forwarding in VN-Grid
The proposed VN-Grid infrastructure simulates a Peer-to-Peer model in which clients, rather than servers, control the networking; that is, peers can exchange information directly.
– Interaction is limited to known peers.
– The peers are treated as equals.
– The number of peers participating in the Grid can grow enormously.
9
Summary
Good resource discovery services must:
– Provide the most exact, up-to-date, and sufficient information in a timely manner.
– Be flexible with respect to features of resources such as variety, heterogeneity, and newly added resources.
– Be scalable, adapting as the number of peers in the Grid environment rises.
– Reduce the cost of transmitting information in the P2P environment.
13
JSDL
JSDL (Job Submission Description Language) is used to describe the requirements of computational jobs for submission to resources, particularly in Grid environments.
14
JSDL
<JobDefinition>
  <JobDescription>
    <JobIdentification ... />
    <Application ... />
    <Resources ... />
    <DataStaging ... />
  </JobDescription>
  <xsd:any##other/>
</JobDefinition>
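To make the skeleton concrete, the sketch below builds a very small job description with Python's standard xml.etree.ElementTree module. It is illustrative only: the namespace URI is given as an assumption to be checked against the JSDL specification, only a few elements are filled in (JobName and an exact TotalCPUCount), and build_job is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed JSDL 1.0 namespace; verify against the specification before use.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
ET.register_namespace("jsdl", JSDL_NS)


def q(tag):
    """Qualify a tag name with the JSDL namespace."""
    return f"{{{JSDL_NS}}}{tag}"


def build_job(job_name, total_cpus):
    root = ET.Element(q("JobDefinition"))
    desc = ET.SubElement(root, q("JobDescription"))
    ident = ET.SubElement(desc, q("JobIdentification"))
    ET.SubElement(ident, q("JobName")).text = job_name
    resources = ET.SubElement(desc, q("Resources"))
    # Range-valued element expressed as a single exact value.
    cpus = ET.SubElement(resources, q("TotalCPUCount"))
    ET.SubElement(cpus, q("Exact")).text = str(float(total_cpus))
    return root


job = build_job("demo-job", 4)
xml_text = ET.tostring(job, encoding="unicode")
```

The resulting xml_text holds a JobDefinition whose Resources element asks for exactly four CPUs in total; the Application and DataStaging parts are left out for brevity.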
Resources
Name of attribute | Type | Description
CandidateHosts | complex type | This element is a complex type specifying the set of named hosts which may be selected for running the job.
FileSystem | complex type | This element describes a filesystem that is required by the job.
ExclusiveExecution | xsd:boolean | This is a boolean that designates whether the job must have exclusive access to the resources allocated to it by the consuming system.
OperatingSystem | complex type | This is a complex type that defines the operating system required by the job.
Resources
Name of attribute | Type | Description
IndividualCPUSpeed | jsdl:RangeValue_Type | This element is a range value specifying the speed of each CPU required by the job in the execution environment.
IndividualCPUTime | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPU seconds required on each resource to execute the job.
IndividualCPUCount | jsdl:RangeValue_Type | This element is a range value specifying the number of CPUs for each of the resources to be allocated to the job submission.
IndividualNetworkBandwidth | jsdl:RangeValue_Type | This element is a range value specifying the bandwidth requirements of each individual resource.
IndividualPhysicalMemory | jsdl:RangeValue_Type | This element is a range value specifying the amount of physical memory required on each individual resource.
IndividualVirtualMemory | jsdl:RangeValue_Type | This element is a range value specifying the required amount of virtual memory for each of the resources to be allocated for this job submission.
IndividualDiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required amount of disk space for each resource allocated to the job.
Resources
Name of attribute | Type | Description
TotalCPUTime | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPU seconds required across all CPUs used to execute the job.
TotalCPUCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPUs required for this job submission.
TotalPhysicalMemory | jsdl:RangeValue_Type | This element is a range value specifying the required amount of physical memory for the entire job across all resources.
TotalVirtualMemory | jsdl:RangeValue_Type | This element is a range value specifying the required total amount of virtual memory for the job submission.
TotalDiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required total amount of disk space that should be allocated to the job.
TotalResourceCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of resources required by the job.
Resources
CandidateHosts
Name of attribute | Type | Description
HostName | xsd:string | This element is a simple type containing a single name of a host.
Resources
FileSystem
Name of attribute | Type | Description
Description | xsd:string |
MountPoint | xsd:string | This is a string that describes a local location that MUST be made available in the allocated resources for the job.
MountSource | xsd:string | This is a string that describes a remote location that MUST be made available locally for the job.
DiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required amount of disk space on the containing FileSystem element for the job.
FileSystemType | jsdl:FileSystemTypeEnumeration | This is a token that describes the type of filesystem of the containing FileSystem element.
Resources
OperatingSystem
Name of attribute | Type | Description
OperatingSystemType | complex type | This is a complex type that contains the name of the operating system.
OperatingSystemVersion | xsd:string | This element is a string that defines the version of the operating system required by the job.
Description | xsd:string |
Resources
CPUArchitecture
Name of attribute | Type | Description
CPUArchitectureName | jsdl:ProcessorArchitectureEnumeration | This element is a token specifying the CPU architecture required by the job in the execution environment.
OperatingSystemType
Name of attribute | Type | Description
OperatingSystemName | jsdl:OperatingSystemTypeEnumeration | This is a token type that contains the name of the operating system.
8
Forwarding in VN-Grid
The proposed VN-Grid infrastructure simulates a Peer-to-Peer model in which clients control the networking instead of servers that means those peers could exchange information directly
Interacting is limit to known peers The peers are equally considered The number of peers participating in Grid can be
raised enormously
9
Summary
Good resource discovery services must Provide the most exact update and sufficient
information with timely solution Be flexible with features of resources such as
variety heterogeneity and newly added resources Be scalable to adapt with the number of peers in
Grid environment rising Reduce the expense of transmitting information in
P2P environment
10
Overview
Introduction of problems Several approaches Solution model
11
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
12
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
13
JSDL
JSDL is used to describe the requirements of computational jobs for submission to resources particularly in Grid environments
14
JSDL
ltJobDefinitiongtltJobDescriptiongt
ltJobIdentification gtltApplication gtltResources gtltDataStaging gt
ltJobDescriptiongtltxsdanyothergt
ltJobDefinitiongt
Resources
This is a complex type that defines the operating system required by the jobcomplex typeOperatingSystem
This is a boolean that designates whether the job must have exclusive access to the resources allocated to it by the consuming systemxsdbooleanExclusiveExecution
This element describes a filesystem that is required by the jobcomplex typeFileSystem
This element is a complex type specifying the set of named hosts which may be selected for running the jobcomplex typeCandidateHosts
DescriptionTypeName of attribute
Resources
Resources
Resources
Name of attribute | Type | Description
IndividualCPUSpeed | jsdl:RangeValue_Type | This element is a range value specifying the speed of each CPU required by the job in the execution environment
IndividualCPUTime | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPU seconds required on each resource to execute the job
IndividualCPUCount | jsdl:RangeValue_Type | This element is a range value specifying the number of CPUs for each of the resources to be allocated to the job submission
IndividualNetworkBandwidth | jsdl:RangeValue_Type | This element is a range value specifying the bandwidth requirements of each individual resource
IndividualPhysicalMemory | jsdl:RangeValue_Type | This element is a range value specifying the amount of physical memory required on each individual resource
IndividualVirtualMemory | jsdl:RangeValue_Type | This element is a range value specifying the required amount of virtual memory for each of the resources to be allocated for this job submission
IndividualDiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required amount of disk space for each resource allocated to the job
Resources
Resources
Name of attribute | Type | Description
TotalCPUTime | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPU seconds required across all CPUs used to execute the job
TotalCPUCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPUs required for this job submission
TotalPhysicalMemory | jsdl:RangeValue_Type | This element is a range value specifying the required amount of physical memory for the entire job across all resources
TotalVirtualMemory | jsdl:RangeValue_Type | This element is a range value specifying the required total amount of virtual memory for the job submission
TotalDiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required total amount of disk space that should be allocated to the job
TotalResourceCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of resources required by the job
Resources
CandidateHosts
Name of attribute | Type | Description
HostName | xsd:string | This element is a simple type containing a single name of a host
Resources
FileSystem
Name of attribute | Type | Description
Description | xsd:string |
MountPoint | xsd:string | This is a string that describes a local location that MUST be made available in the allocated resources for the job
MountSource | xsd:string | This is a string that describes a remote location that MUST be made available locally for the job
DiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required amount of disk space on the containing FileSystem element for the job
FileSystemType | jsdl:FileSystemTypeEnumeration | This is a token that describes the type of filesystem of the containing FileSystem element
Resources
OperatingSystem
Name of attribute | Type | Description
OperatingSystemType | complex type | This is a complex type that contains the name of the operating system
OperatingSystemVersion | xsd:string | This element is a string that defines the version of the operating system required by the job
Description | xsd:string |
Resources
CPUArchitecture
Name of attribute | Type | Description
CPUArchitectureName | jsdl:ProcessorArchitectureEnumeration | This element is a token specifying the CPU architecture required by the job in the execution environment
OperatingSystemType
Name of attribute | Type | Description
OperatingSystemName | jsdl:OperatingSystemTypeEnumeration | This is a token type that contains the name of the operating system
22
RDSResults EDAGrid
<RDSResultSet count = <num> >
  <Discovery rank = "" id = "">
    <Resource rank = "" id = "" nodeCount = "" >
      <WSGRAM> <ReservationAddress>
      <OSName> <OSVersion> <Platform>
    </Resource> +
  </Discovery>
</RDSResultSet>
23
RDSResults
<RDSResultSet count = <num> >
  <Discovery rank = "" id = "">
    <Individual rank = "" id = "" nodeCount = "" >
    </Individual>
    <Total>
    <InteractBandwidth>
  </Discovery>
</RDSResultSet>
25
Ranking EDAGrid
FUNCTION Ranking_Algorithm
  FOR each (Ri satisfying the individual condition)
    Rank(Ri) = 0
    FOR each (Aj, an attribute of the job)
      Rank(Ri) = Rank(Ri) + (w[j] * R[i,j] / A[j])
    Next A
  Next R
  Sort the Resource Set
  Return the list of resources in rank order
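The ranking step can be sketched in Python as follows. This is a minimal sketch, not the EDAGrid implementation: it assumes the score adds up weight times offered amount divided by requested amount for each job attribute, and that the "individual condition" means a resource offers at least the requested amount of every attribute. The name `rank_resources` and the sample sites are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_resources(
    resources: Dict[str, Dict[str, float]],
    job: Dict[str, float],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (resource, rank) pairs sorted from highest to lowest rank."""
    ranked = []
    for name, offered in resources.items():
        # Assumed individual condition: every requested amount must be met.
        if any(offered.get(attr, 0.0) < need for attr, need in job.items()):
            continue
        # Assumed score: sum over attributes of w[j] * R[i,j] / A[j].
        score = sum(weights[attr] * offered[attr] / need for attr, need in job.items())
        ranked.append((name, score))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


resources = {
    "siteA": {"cpu": 8, "mem": 16},
    "siteB": {"cpu": 4, "mem": 24},
    "siteC": {"cpu": 2, "mem": 64},  # too few CPUs, filtered out
}
job = {"cpu": 4, "mem": 16}
weights = {"cpu": 0.5, "mem": 0.5}
print(rank_resources(resources, job, weights))
```

With these sample values siteC is dropped by the individual condition, and siteA ranks above siteB.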
26
Set-matching EDAGrid
The set-matching algorithm:
- Create an empty set
- Add the resources into the set one after another, from the higher rank to the lower
- Each time, check whether the Total condition is met and whether the InteractBandwidth is violated
- Terminate the loop if these conditions are satisfied or the number of grid nodes reaches the number the user required
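The set-matching loop can be sketched as below. The slides do not say what happens when the InteractBandwidth is violated, so this sketch assumes that such a candidate is skipped; it also assumes a single "Total" quantity (for example, a CPU count). The function and parameter names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def set_match(
    ranked: List[str],
    capacity: Dict[str, float],
    bandwidth: Dict[FrozenSet[str], float],
    total_needed: float,
    min_bandwidth: float,
    max_nodes: int,
) -> Tuple[List[str], bool]:
    """Pick resources in rank order until the Total condition is met."""
    chosen: List[str] = []
    total = 0.0
    for name in ranked:  # ranked is ordered from higher rank to lower
        if len(chosen) >= max_nodes:
            break
        # Assumption: skip a candidate whose link to any chosen member is too slow.
        if any(bandwidth[frozenset((name, other))] < min_bandwidth for other in chosen):
            continue
        chosen.append(name)
        total += capacity[name]
        if total >= total_needed:
            break
    return chosen, total >= total_needed


ranked = ["siteA", "siteB", "siteC"]
capacity = {"siteA": 8, "siteB": 4, "siteC": 6}
bandwidth = {
    frozenset(("siteA", "siteB")): 10.0,
    frozenset(("siteA", "siteC")): 100.0,
    frozenset(("siteB", "siteC")): 100.0,
}
print(set_match(ranked, capacity, bandwidth, total_needed=12, min_bandwidth=50.0, max_nodes=3))
```

The boolean in the result tells the caller whether the Total condition was actually reached or the loop ran out of candidates first.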
28
Centralized Indexing
Proposes centralized management of all the resources of all the grid sites
A server machine holds the whole index of available resources on the network
Users start the query process by sending the request to the index server
The server sends the answers to the users based on the stored information
Advantage: fast
Disadvantages:
- The bottleneck at the server machine
- The information must be updated continuously
- Not suitable for P2P
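A minimal sketch of the centralized approach, with a hypothetical `IndexServer` class standing in for the index machine. It assumes each site reports numeric attributes and that a query lists minimum amounts.

```python
from typing import Dict, List


class IndexServer:
    """Single server holding the index of every site's resources."""

    def __init__(self) -> None:
        self._index: Dict[str, Dict[str, float]] = {}

    def register(self, site: str, attrs: Dict[str, float]) -> None:
        # Sites must call this again whenever their resources change,
        # which is the continuous-update cost noted in the slide.
        self._index[site] = dict(attrs)

    def unregister(self, site: str) -> None:
        self._index.pop(site, None)

    def query(self, requirements: Dict[str, float]) -> List[str]:
        # Every query from every user lands here: the bottleneck.
        return sorted(
            site
            for site, attrs in self._index.items()
            if all(attrs.get(key, 0) >= need for key, need in requirements.items())
        )


server = IndexServer()
server.register("siteA", {"cpu": 8})
server.register("siteB", {"cpu": 2})
print(server.query({"cpu": 4}))
```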
29
Flooding
The query is routed from one peer to all of its neighbors
In this way the query is sent throughout the network
If a peer finds the resources in its local storage, it sends the answer to the original peer that made the request
A Time-To-Live (TTL) limits the number of hops a request can travel, so that after a certain number of hops the request automatically disappears from the network
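Flooding with a TTL can be sketched as a breadth-first walk over the peer graph. This is an illustration under simplifying assumptions: the whole topology is held in one dictionary, and a `seen` set stands in for the duplicate-message suppression a real protocol would need. All names are hypothetical.

```python
from collections import deque
from typing import Callable, Dict, List


def flood_query(
    neighbors: Dict[str, List[str]],
    has_resource: Callable[[str], bool],
    origin: str,
    ttl: int,
) -> List[str]:
    """Return the peers that answer a query flooded from origin."""
    hits: List[str] = []
    seen = {origin}
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if has_resource(peer):
            hits.append(peer)  # the answer goes back to the origin
        if remaining == 0:
            continue  # TTL exhausted: the request is not forwarded further
        for nxt in neighbors[peer]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits


graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
print(flood_query(graph, lambda peer: peer == "E", "A", ttl=2))
print(flood_query(graph, lambda peer: peer == "E", "A", ttl=3))
```

Peer E is three hops from A, so the query only reaches it when the TTL is at least 3.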
30
Indexing Using Distributed Hash Tables
In this method, each peer in the network holds a partition of the hash table
Each entry in the hash table maps a key in the key space to the peer where the searched file can be found
When a file is requested, the file name is hashed by a uniform hash function
Based on the hash value and the hash table, the lookup value is found and returned to the requester
The cost of this method consists of the cost of building and updating the hash table and of routing the query to the location of the searched file
Disadvantage: it does not apply to complex queries
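The idea can be sketched with a small hash ring. This is not Chord itself: a real DHT routes a lookup across peers hop by hop, whereas this sketch keeps a global view of the ring in one object to stay short. The class name `HashRing` and the peer and file names are hypothetical.

```python
import hashlib
from bisect import bisect_left
from typing import Dict, List, Optional


def ring_id(text: str) -> int:
    """Map a name onto the identifier ring with a uniform hash (SHA-1)."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


class HashRing:
    def __init__(self, peers: List[str]) -> None:
        self._ring = sorted((ring_id(peer), peer) for peer in peers)
        self._ids = [peer_id for peer_id, _ in self._ring]
        # Each peer holds only its own partition of the table.
        self._tables: Dict[str, Dict[str, str]] = {peer: {} for peer in peers}

    def owner(self, key: str) -> str:
        """First peer whose id is >= the key's id, wrapping around the ring."""
        pos = bisect_left(self._ids, ring_id(key)) % len(self._ids)
        return self._ring[pos][1]

    def publish(self, file_name: str, holder: str) -> None:
        self._tables[self.owner(file_name)][file_name] = holder

    def lookup(self, file_name: str) -> Optional[str]:
        return self._tables[self.owner(file_name)].get(file_name)


ring = HashRing(["peer1", "peer2", "peer3"])
ring.publish("data.h5", "peer3")
print(ring.lookup("data.h5"))  # exact name: found
print(ring.lookup("data"))     # partial name: hashes elsewhere, not found
```

The second lookup shows why complex queries are a problem: only the exact key hashes to the right entry.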
31
Interests-based Methods
This method is based on the interests of users
The idea is to search on the peers that seem to contain what users have requested
To achieve this, peers are organized into groups of similar interest
Search queries are therefore forwarded to the matching interest group, to get a high hit rate and reduce the time wasted searching on other peers
Disadvantages:
- A peer's interests may change over time
- Peers may have more than one interest
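A minimal sketch of interest-based forwarding, assuming interests are plain string labels. The `InterestRouter` class is hypothetical and is not the HyperCuP design; it only shows the group lookup and how the two disadvantages appear as bookkeeping.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class InterestRouter:
    """Keeps peers grouped by interest and picks the group for a query."""

    def __init__(self) -> None:
        self._groups: DefaultDict[str, Set[str]] = defaultdict(set)

    def join(self, peer: str, interest: str) -> None:
        # A peer with several interests joins several groups.
        self._groups[interest].add(peer)

    def leave(self, peer: str, interest: str) -> None:
        # Interests change over time, so membership must be maintained.
        self._groups[interest].discard(peer)

    def targets(self, interest: str) -> List[str]:
        """Peers a query on this interest is forwarded to."""
        return sorted(self._groups.get(interest, ()))


router = InterestRouter()
router.join("peer1", "bioinformatics")
router.join("peer2", "bioinformatics")
router.join("peer2", "climate")
print(router.targets("bioinformatics"))
print(router.targets("climate"))
```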
32
Ant Colony Optimizing
Ants start from their nests and wander randomly
The ants that find food return to their nests from memory and drop pheromone on the trail
Other ants that come across such a trail follow it to check for food instead of wandering randomly
If they find the food, they return home and reinforce the pheromone on the trail
A key point is that the pheromone evaporates over time
The more time it takes for an ant to travel back to its nest, the more pheromone evaporates
When an ant reaches an intersection, it has to decide which branch to take
The ants that take a short branch complete the trip faster than those that take a long branch
Therefore the pheromone density on the short branch remains higher
Other ants are more likely to choose a branch according to its pheromone density
Eventually all the ants that go to get the food take the shortest branch
33
Ant Colony Optimizing
Initially the ants may take the paths A→B→C→E, A→B→E, or A→B→D→E
After the initial stage, most of the ants will take the shortest path, A→B→E
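The convergence on the shortest path can be illustrated with a small deterministic model of the three paths above. It is a simplified sketch, not an algorithm from the cited papers: instead of simulating individual random ants, each round splits the colony across paths in proportion to pheromone, deposits pheromone inversely to path length, and evaporates a fixed fraction. The hop counts, the evaporation rate, and the function name are assumptions.

```python
from typing import Dict


def simulate_pheromone(
    path_lengths: Dict[str, int],
    rounds: int = 50,
    evaporation: float = 0.5,
) -> Dict[str, float]:
    """Return the share of ants on each path after the given rounds."""
    pheromone = {path: 1.0 for path in path_lengths}  # no preference at first
    for _ in range(rounds):
        total = sum(pheromone.values())
        shares = {path: pheromone[path] / total for path in pheromone}
        for path, length in path_lengths.items():
            # Old pheromone evaporates; shorter trips deposit more per round.
            pheromone[path] = (1.0 - evaporation) * pheromone[path] + shares[path] / length
    total = sum(pheromone.values())
    return {path: pheromone[path] / total for path in pheromone}


paths = {"A-B-C-E": 3, "A-B-E": 2, "A-B-D-E": 3}  # length = number of hops
shares = simulate_pheromone(paths)
print(max(shares, key=shares.get))
```

Starting from equal pheromone, the two-hop path gains a slightly larger deposit each round, and the proportional choice amplifies that lead until nearly all ants use it.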
34
Overview
Introduction of problems Several approaches Solution model
35
36
Reference
[1] Thien-Nga Nguyen-Vu, Information Service API Specification, VNGRID Project.
[2] Tran Vu Pham, Lydia M.S. Lau, and Peter M. Dew, An Ontology-based Adaptive Approach to P2P Resource Discovery in Distributed, School of Computing, University of Leeds, Leeds, UK.
[3] Tuan Anh Nguyen, VN-Grid_design-Oct_1, VNGrid Project.
[4] Project Overview, VN-Grid Project wiki, http://www.cse.hcmut.edu.vn/~vngrid/wiki/Project_Overview
[5] Job Submission Description Language (JSDL) Specification, Version 1.0, http://forge.gridforum.org/projects/jsdl-wg
[6] Nguyễn Quang Hùng, Nguyễn Thanh Sơn, User-Driven Grid Resource Discovery, Faculty of Computer Science and Engineering, Building A3, Ho Chi Minh City University of Technology, VNU-HCM.
[7] Yuhui Deng, Frank Wang, Adrian Ciura, Ant colony optimization inspired resource discovery in P2P Grid systems, Springer Science+Business Media, LLC, 2008.
[8] Zenggang Xiong, Yang Yang, Xuemin Zhang, Fu Chen, Li Liu, Integrating Genetic and Ant Algorithm into P2P Grid Resource Discovery, School of Information Engineering, University of Science and Technology Beijing, Beijing 100083, China.
[9] Tran Vu Pham, A Collaborative e-Science Architecture for Distributed Scientific Communities, The University of Leeds, School of Computing, October 2006.
37
Thank You
9
Summary
Good resource discovery services must:
- Provide the most exact, up-to-date, and sufficient information in a timely manner
- Be flexible with features of resources such as variety, heterogeneity, and newly added resources
- Be scalable, adapting as the number of peers in the Grid environment rises
- Reduce the expense of transmitting information in a P2P environment
10
Overview
Introduction of problems Several approaches Solution model
11
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
12
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
13
JSDL
JSDL is used to describe the requirements of computational jobs for submission to resources particularly in Grid environments
14
JSDL
ltJobDefinitiongtltJobDescriptiongt
ltJobIdentification gtltApplication gtltResources gtltDataStaging gt
ltJobDescriptiongtltxsdanyothergt
ltJobDefinitiongt
Resources
This is a complex type that defines the operating system required by the jobcomplex typeOperatingSystem
This is a boolean that designates whether the job must have exclusive access to the resources allocated to it by the consuming systemxsdbooleanExclusiveExecution
This element describes a filesystem that is required by the jobcomplex typeFileSystem
This element is a complex type specifying the set of named hosts which may be selected for running the jobcomplex typeCandidateHosts
DescriptionTypeName of attribute
Resources
Resources
This is a range value that describes the required amount of disk space for each resource allocated to the job
jsdlRangeValue_TypeIndividualDiskSpace
This element is a range value specifying the required amount of virtual memory for each of theresources to be allocated for this job submission
jsdlRangeValue_Type
IndividualVirtualMemory
This element is a range value specifying the amount of physical memory required on each indi-vidual resource
jsdlRangeValue_Type
IndividualPhysicalMemory
This element is a range value specifying the bandwidth requirements of each individual resource
jsdlRangeValue_Type
IndividualNetworkBandwidth
This element is a range value specifying the number of CPUs for each of the resources to be allocated to the job submission
jsdlRangeValue_TypeIndividualCPUCount
This element is a range value specifying the total number of CPU seconds required on each resource to execute the job
jsdlRangeValue_TypeIndividualCPUTime
This element is a range value specifying the speed of each CPU required by the job in the execution environment
jsdlRangeValue_TypeIndividualCPUSpeed
DescriptionTypeName of attribute
Resources
Resources
This element is a range value specifying the total number of resources required by the job jsdlRangeValue_TypeTotalResourceCount
This is a range value that describes the required total amount of disk space that should be allocated to the job jsdlRangeValue_TypeTotalDiskSpace
This element is a range value specifying the required total amount of virtual memory for the jobsubmission jsdlRangeValue_TypeTotalVirtualMemory
This element is a range value specifying the required amount of physical memory for the entirejob across all resources jsdlRangeValue_TypeTotalPhysicalMemory
This element is a range value specifying the total number of CPUs required for this job submission jsdlRangeValue_TypeTotalCPUCount
This element is a range value specifying total number of CPU seconds required across all CPUs used to execute the job jsdlRangeValue_TypeTotalCPUTime
DescriptionTypeName of attribute
Resources
Resources
This element is a simple type containing a single name of a hostxsdstringHostName
CandidateHosts
Resources
This is a token that describes the type of filesystem of the containing FileSystem element
jsdlFileSystemTypeEnumerationFileSystemType
This is a range value that describes the required amount of disk space on the containing FileSystem element for the job
jsdlRangeValue_TypeDiskSpace
This is a string that describes a remote location that MUST be made available locally for the jobxsdstringMountSource
This is a string that describes a local location that MUST be made available in the allocated resources for the jobxsdstringMountPoint
xsdstringDescription
FileSystem
Resources
xsdstringDescription
This element is a string that defines the version of the operating system required by the jobxsdstringOperatingSystemVersion
This is a complex type that contains the name of the operating systemcomplex typeOperatingSystemType
OperatingSystem
Resources
This element is a token specifying the CPU architecture required by the job in the execution environment
jsdlProcessorArchitectureEnumerationCPUArchitectureName
CPUArchitecture
This is a token type that contains the name of the operating system
jsdlOperatingSystemTypeEnumerationOperatingSystemName
OperatingSystemType
22
RDSResults EDAGrid
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltResource rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltWSGRAMgtltReservationAddressgt
ltOSNamegt ltOSVersiongt ltPlatformgt
ltResourcegt +ltDiscoverygt
ltRDSResultSetgt
23
RDSResults
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltIndividual rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltIndividualgtltTotalgtltInteractBandwidthgt
ltDiscoverygt ltRDSResultSetgt
24
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
25
Ranking EDAGrid
FUN CTION Ranking_AlgorithmFOR each (Ri satisfy the individual condition)Rank(Ri) = 0FOR each (Aj is the attribute of job)Rank(Ri) = Rank(Ri) + (w[j] R[ij]A[j])Next ANext RSort the Resource SetReturn list of resource with order
26
Set-matching EDAGrid
The set-matching algorithm Create an empty set Add the resources into the set with the higher
rank one after one the lower Each time check if the Total condition is met
and if the InteractBandwidth is violated Terminate the loop if these conditions are
satisfied or the number of gridnodes reaches the number user required
27
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
28
Centralized Indexing
Proposes a centralization management for the whole resources of all the grid sites
There is a server machine which holds all the index of available resources on the network
Users start the query process by sending the request to the index server
The server will send the answers to the users bases on the information stored
Advantages quickly Disadvantages
ndash The bottleneck at the server machinendash Update the information continuouslyndash Not suitable to the P2P
29
Flooding
The query will be routed from one peer to all of its neighbors By this way the query will be sent throughout the network If the peer finds out the resources in its local storage it will send the
answer to the original peer who makes the request Using Time-To-Live (TTL) to limit the number of hops a request could
be sent so that after a certain times to be sent the request will automatically disappear out of the network
30
Indexing Using Distributed Hash Tables
In this method each peer in the network has a partition of the hash table
Each entry in the hash table is the key space which point to the peer where the search file can be found
When there is a request of a file the file name will be hash by a uniform hash function
Base on the hash value and the hash table the look up value will be found and return to the requester
The cost of this method consists of the cost to build and update the hash table and route the query to the location search file
Disadvantages not apply to the complex query
31
Interests-based Methods
This method based on the interest of users The idea is to search on the peers that seem to contain what
users have required To reach this point peers are organized into groups of similar
interest Therefore the search queries will be forwarded to the interest
group to get the high hit rate and reduce the redundant time to search on other peers
Disadvantage ndash the peers interest may change over time ndash peers have more than one interest
32
Ant Colony Optimizing
Ants start from their nests and wander randomly The ants which found food will return to their nests in terms of their memory and drop
pheromone on trails Other ants which come across such a trail will follow the trail to check the food instead of
wandering randomly If they find the food they will return home and reinforce the pheromone on the trail A key point is that the pheromone evaporates over time The more time it takes for an ant to travel back to its nest the more pheromone will be
evaporated When an ant reaches an intersection the ant has to decide which branch to take The ants which take a short branch march faster than those which take a long branch Therefore the pheromone density on the short branch remains higher Other ants will more likely choose the branch in terms of the pheromone density Eventually all the ants which go to get the food will take the shortest branch
33
Ant Colony Optimizing
Initially the ants may take the
paths of ArarrBrarrCrarrE ArarrBrarrE or ArarrBrarrDrarrE After the initial stage most
of the ants will take the shortest path ArarrBrarrE
34
Overview
Introduction of problems Several approaches Solution model
35
36
Reference
[1]Thien-Nga Nguyen-Vu Information Service API Specification raquoVNGRID PROJECT [2]Tran Vu Pham Lydia MS Lau and Peter M Dew An Ontology-based Adaptive Approach
to P2P Resource Discovery in Distributed School of Computing University of Leeds Leeds UK
[3]Tuan Anh Nguyen VN-Grid_design-Oct_1 VNGrid Project[4] Project Overview_VN-Grid Project wiki httpwwwcsehcmuteduvn~vngridwikiProject_Overview
[4] [JSDL]Job Submission Description Language (JSDL) Specification Version 10 httpforgegridforumorgprojectsjsdl-wg
Nguyễn Quang Hugraveng Nguyễn Thanh Sơn USER-DRIVEN GRID RESOURCE DISCOVERY Khoa Khoa Học amp Kỹ Thuật Maacutey Tiacutenh Nhagrave A3 Trường Đại học Baacutech Khoa ndash ĐHQG TpHCM
Yuhui Deng middot FrankWang middot Adrian Ciura Ant colony optimization inspired resource discovery in P2P Grid systems Springer Science+Business Media LLC 2008
Zenggang Xiong 12 Yang Yang1 Xuemin Zhang2 Fu Chen110485791048579104857910485791048579Li Liu1 Integrating Genetic and Ant Algorithm into P2P Grid Resource Discovery School of Information Engineering University of Science and Technology Beijing Beijing 100083 China
Tran Vu Pham A Collaborative e-Science Architecture for Distributed Scientific Communities The University of Leeds School of Computing October 2006
37
Thank You
10
Overview
Introduction of problems Several approaches Solution model
11
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
12
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
13
JSDL
JSDL is used to describe the requirements of computational jobs for submission to resources particularly in Grid environments
14
JSDL
ltJobDefinitiongtltJobDescriptiongt
ltJobIdentification gtltApplication gtltResources gtltDataStaging gt
ltJobDescriptiongtltxsdanyothergt
ltJobDefinitiongt
Resources
This is a complex type that defines the operating system required by the jobcomplex typeOperatingSystem
This is a boolean that designates whether the job must have exclusive access to the resources allocated to it by the consuming systemxsdbooleanExclusiveExecution
This element describes a filesystem that is required by the jobcomplex typeFileSystem
This element is a complex type specifying the set of named hosts which may be selected for running the jobcomplex typeCandidateHosts
DescriptionTypeName of attribute
Resources
Resources
This is a range value that describes the required amount of disk space for each resource allocated to the job
jsdlRangeValue_TypeIndividualDiskSpace
This element is a range value specifying the required amount of virtual memory for each of theresources to be allocated for this job submission
jsdlRangeValue_Type
IndividualVirtualMemory
This element is a range value specifying the amount of physical memory required on each indi-vidual resource
jsdlRangeValue_Type
IndividualPhysicalMemory
This element is a range value specifying the bandwidth requirements of each individual resource
jsdlRangeValue_Type
IndividualNetworkBandwidth
This element is a range value specifying the number of CPUs for each of the resources to be allocated to the job submission
jsdlRangeValue_TypeIndividualCPUCount
This element is a range value specifying the total number of CPU seconds required on each resource to execute the job
jsdlRangeValue_TypeIndividualCPUTime
This element is a range value specifying the speed of each CPU required by the job in the execution environment
jsdlRangeValue_TypeIndividualCPUSpeed
DescriptionTypeName of attribute
Resources
Resources
This element is a range value specifying the total number of resources required by the job jsdlRangeValue_TypeTotalResourceCount
This is a range value that describes the required total amount of disk space that should be allocated to the job jsdlRangeValue_TypeTotalDiskSpace
This element is a range value specifying the required total amount of virtual memory for the jobsubmission jsdlRangeValue_TypeTotalVirtualMemory
This element is a range value specifying the required amount of physical memory for the entirejob across all resources jsdlRangeValue_TypeTotalPhysicalMemory
This element is a range value specifying the total number of CPUs required for this job submission jsdlRangeValue_TypeTotalCPUCount
This element is a range value specifying total number of CPU seconds required across all CPUs used to execute the job jsdlRangeValue_TypeTotalCPUTime
DescriptionTypeName of attribute
Resources
Resources
This element is a simple type containing a single name of a hostxsdstringHostName
CandidateHosts
Resources
This is a token that describes the type of filesystem of the containing FileSystem element
jsdlFileSystemTypeEnumerationFileSystemType
This is a range value that describes the required amount of disk space on the containing FileSystem element for the job
jsdlRangeValue_TypeDiskSpace
This is a string that describes a remote location that MUST be made available locally for the jobxsdstringMountSource
This is a string that describes a local location that MUST be made available in the allocated resources for the jobxsdstringMountPoint
xsdstringDescription
FileSystem
Resources
xsdstringDescription
This element is a string that defines the version of the operating system required by the jobxsdstringOperatingSystemVersion
This is a complex type that contains the name of the operating systemcomplex typeOperatingSystemType
OperatingSystem
Resources
This element is a token specifying the CPU architecture required by the job in the execution environment
jsdlProcessorArchitectureEnumerationCPUArchitectureName
CPUArchitecture
This is a token type that contains the name of the operating system
jsdlOperatingSystemTypeEnumerationOperatingSystemName
OperatingSystemType
22
RDSResults EDAGrid
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltResource rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltWSGRAMgtltReservationAddressgt
ltOSNamegt ltOSVersiongt ltPlatformgt
ltResourcegt +ltDiscoverygt
ltRDSResultSetgt
23
RDSResults
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltIndividual rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltIndividualgtltTotalgtltInteractBandwidthgt
ltDiscoverygt ltRDSResultSetgt
24
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
25
Ranking EDAGrid
FUN CTION Ranking_AlgorithmFOR each (Ri satisfy the individual condition)Rank(Ri) = 0FOR each (Aj is the attribute of job)Rank(Ri) = Rank(Ri) + (w[j] R[ij]A[j])Next ANext RSort the Resource SetReturn list of resource with order
26
Set-matching EDAGrid
The set-matching algorithm Create an empty set Add the resources into the set with the higher
rank one after one the lower Each time check if the Total condition is met
and if the InteractBandwidth is violated Terminate the loop if these conditions are
satisfied or the number of gridnodes reaches the number user required
27
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
28
Centralized Indexing
Proposes a centralization management for the whole resources of all the grid sites
There is a server machine which holds all the index of available resources on the network
Users start the query process by sending the request to the index server
The server will send the answers to the users bases on the information stored
Advantages quickly Disadvantages
ndash The bottleneck at the server machinendash Update the information continuouslyndash Not suitable to the P2P
29
Flooding
The query will be routed from one peer to all of its neighbors By this way the query will be sent throughout the network If the peer finds out the resources in its local storage it will send the
answer to the original peer who makes the request Using Time-To-Live (TTL) to limit the number of hops a request could
be sent so that after a certain times to be sent the request will automatically disappear out of the network
30
Indexing Using Distributed Hash Tables
In this method each peer in the network has a partition of the hash table
Each entry in the hash table is the key space which point to the peer where the search file can be found
When there is a request of a file the file name will be hash by a uniform hash function
Base on the hash value and the hash table the look up value will be found and return to the requester
The cost of this method consists of the cost to build and update the hash table and route the query to the location search file
Disadvantages not apply to the complex query
31
Interests-based Methods
This method based on the interest of users The idea is to search on the peers that seem to contain what
users have required To reach this point peers are organized into groups of similar
interest Therefore the search queries will be forwarded to the interest
group to get the high hit rate and reduce the redundant time to search on other peers
Disadvantage ndash the peers interest may change over time ndash peers have more than one interest
32
Ant Colony Optimizing
Ants start from their nests and wander randomly The ants which found food will return to their nests in terms of their memory and drop
pheromone on trails Other ants which come across such a trail will follow the trail to check the food instead of
wandering randomly If they find the food they will return home and reinforce the pheromone on the trail A key point is that the pheromone evaporates over time The more time it takes for an ant to travel back to its nest the more pheromone will be
evaporated When an ant reaches an intersection the ant has to decide which branch to take The ants which take a short branch march faster than those which take a long branch Therefore the pheromone density on the short branch remains higher Other ants will more likely choose the branch in terms of the pheromone density Eventually all the ants which go to get the food will take the shortest branch
33
Ant Colony Optimizing
Initially the ants may take the
paths of ArarrBrarrCrarrE ArarrBrarrE or ArarrBrarrDrarrE After the initial stage most
of the ants will take the shortest path ArarrBrarrE
34
Overview
Introduction of problems Several approaches Solution model
35
36
Reference
[1]Thien-Nga Nguyen-Vu Information Service API Specification raquoVNGRID PROJECT [2]Tran Vu Pham Lydia MS Lau and Peter M Dew An Ontology-based Adaptive Approach
to P2P Resource Discovery in Distributed School of Computing University of Leeds Leeds UK
[3]Tuan Anh Nguyen VN-Grid_design-Oct_1 VNGrid Project[4] Project Overview_VN-Grid Project wiki httpwwwcsehcmuteduvn~vngridwikiProject_Overview
[4] [JSDL]Job Submission Description Language (JSDL) Specification Version 10 httpforgegridforumorgprojectsjsdl-wg
Nguyễn Quang Hugraveng Nguyễn Thanh Sơn USER-DRIVEN GRID RESOURCE DISCOVERY Khoa Khoa Học amp Kỹ Thuật Maacutey Tiacutenh Nhagrave A3 Trường Đại học Baacutech Khoa ndash ĐHQG TpHCM
Yuhui Deng middot FrankWang middot Adrian Ciura Ant colony optimization inspired resource discovery in P2P Grid systems Springer Science+Business Media LLC 2008
Zenggang Xiong 12 Yang Yang1 Xuemin Zhang2 Fu Chen110485791048579104857910485791048579Li Liu1 Integrating Genetic and Ant Algorithm into P2P Grid Resource Discovery School of Information Engineering University of Science and Technology Beijing Beijing 100083 China
Tran Vu Pham A Collaborative e-Science Architecture for Distributed Scientific Communities The University of Leeds School of Computing October 2006
37
Thank You
11
Several Approaches
Resources description
- Global Grid Forum: JSDL
- EDAGrid: RDSResult
Matching method
- EDAGrid: Ranking, Set-matching
Forwarding method
- Napster: Centralized Indexing
- Gnutella: Flooding Query
- Chord: Indexing Using Distributed Hash Tables
- HyperCuP system: Interests-based
- Ant Colony Optimizing
13
JSDL
JSDL (Job Submission Description Language) is used to describe the requirements of computational jobs for submission to resources, particularly in Grid environments.
14
JSDL
<JobDefinition>
  <JobDescription>
    <JobIdentification ... />?
    <Application ... />?
    <Resources ... />?
    <DataStaging ... />*
  </JobDescription>
  <xsd:any##other/>*
</JobDefinition>
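The skeleton on this slide can be illustrated with a small Python sketch that builds a minimal JSDL document using the standard library. The namespace URI is the one published for JSDL 1.0; the helper name `build_job_definition`, the job name, and the CPU figure are hypothetical values chosen only for illustration, and the optional Application and DataStaging elements are left out.

```python
import xml.etree.ElementTree as ET

# JSDL 1.0 namespace as published by the Global Grid Forum.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"


def build_job_definition(job_name: str, min_total_cpus: int) -> ET.Element:
    """Build a minimal JobDefinition with JobIdentification and Resources.

    Hypothetical helper: it only fills in a job name and a lower bound
    on TotalCPUCount.
    """
    ET.register_namespace("jsdl", JSDL_NS)

    def tag(name: str) -> str:
        # ElementTree expects namespaced tags in "{uri}local" form.
        return f"{{{JSDL_NS}}}{name}"

    job_def = ET.Element(tag("JobDefinition"))
    job_desc = ET.SubElement(job_def, tag("JobDescription"))

    ident = ET.SubElement(job_desc, tag("JobIdentification"))
    ET.SubElement(ident, tag("JobName")).text = job_name

    resources = ET.SubElement(job_desc, tag("Resources"))
    total_cpus = ET.SubElement(resources, tag("TotalCPUCount"))
    # A range value with only a lower bound: "at least this many CPUs".
    lower = ET.SubElement(total_cpus, tag("LowerBoundedRange"))
    lower.text = str(float(min_total_cpus))
    return job_def


job = build_job_definition("demo-job", 8)
print(ET.tostring(job, encoding="unicode"))
```

The printed document nests JobIdentification and Resources inside JobDescription, in the same order as the skeleton.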
Resources
Name of attribute (Type): Description
CandidateHosts (complex type): This element is a complex type specifying the set of named hosts which may be selected for running the job.
FileSystem (complex type): This element describes a filesystem that is required by the job.
ExclusiveExecution (xsd:boolean): This is a boolean that designates whether the job must have exclusive access to the resources allocated to it by the consuming system.
OperatingSystem (complex type): This is a complex type that defines the operating system required by the job.
Resources
Name of attribute (Type): Description
IndividualCPUSpeed (jsdl:RangeValue_Type): This element is a range value specifying the speed of each CPU required by the job in the execution environment.
IndividualCPUTime (jsdl:RangeValue_Type): This element is a range value specifying the total number of CPU seconds required on each resource to execute the job.
IndividualCPUCount (jsdl:RangeValue_Type): This element is a range value specifying the number of CPUs for each of the resources to be allocated to the job submission.
IndividualNetworkBandwidth (jsdl:RangeValue_Type): This element is a range value specifying the bandwidth requirements of each individual resource.
IndividualPhysicalMemory (jsdl:RangeValue_Type): This element is a range value specifying the amount of physical memory required on each individual resource.
IndividualVirtualMemory (jsdl:RangeValue_Type): This element is a range value specifying the required amount of virtual memory for each of the resources to be allocated for this job submission.
IndividualDiskSpace (jsdl:RangeValue_Type): This is a range value that describes the required amount of disk space for each resource allocated to the job.
Resources
Name of attribute (Type): Description
TotalCPUTime (jsdl:RangeValue_Type): This element is a range value specifying the total number of CPU seconds required across all CPUs used to execute the job.
TotalCPUCount (jsdl:RangeValue_Type): This element is a range value specifying the total number of CPUs required for this job submission.
TotalPhysicalMemory (jsdl:RangeValue_Type): This element is a range value specifying the required amount of physical memory for the entire job across all resources.
TotalVirtualMemory (jsdl:RangeValue_Type): This element is a range value specifying the required total amount of virtual memory for the job submission.
TotalDiskSpace (jsdl:RangeValue_Type): This is a range value that describes the required total amount of disk space that should be allocated to the job.
TotalResourceCount (jsdl:RangeValue_Type): This element is a range value specifying the total number of resources required by the job.
Resources
CandidateHosts
HostName (xsd:string): This element is a simple type containing a single name of a host.
Resources
FileSystem
Description (xsd:string)
MountPoint (xsd:string): This is a string that describes a local location that MUST be made available in the allocated resources for the job.
MountSource (xsd:string): This is a string that describes a remote location that MUST be made available locally for the job.
DiskSpace (jsdl:RangeValue_Type): This is a range value that describes the required amount of disk space on the containing FileSystem element for the job.
FileSystemType (jsdl:FileSystemTypeEnumeration): This is a token that describes the type of filesystem of the containing FileSystem element.
Resources
OperatingSystem
OperatingSystemType (complex type): This is a complex type that contains the name of the operating system.
OperatingSystemVersion (xsd:string): This element is a string that defines the version of the operating system required by the job.
Description (xsd:string)
Resources
OperatingSystemType
OperatingSystemName (jsdl:OperatingSystemTypeEnumeration): This is a token type that contains the name of the operating system.
CPUArchitecture
CPUArchitectureName (jsdl:ProcessorArchitectureEnumeration): This element is a token specifying the CPU architecture required by the job in the execution environment.
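Most of the Resources attributes listed above are range values, so checking a grid node against a job comes down to testing whether each advertised value falls inside the requested range. The following Python sketch shows that idea under simplifying assumptions: `RangeValue` is a hypothetical stand-in that keeps only one optional lower and upper bound (the real jsdl:RangeValue_Type is richer), and the node figures are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RangeValue:
    """Simplified stand-in for jsdl:RangeValue_Type (assumption: one range only)."""
    lower: Optional[float] = None
    upper: Optional[float] = None

    def contains(self, value: float) -> bool:
        # A missing bound means "unbounded" on that side.
        if self.lower is not None and value < self.lower:
            return False
        if self.upper is not None and value > self.upper:
            return False
        return True


def satisfies_individual(resource: Dict[str, float],
                         required: Dict[str, RangeValue]) -> bool:
    """True if every required attribute is advertised and lies within its range."""
    return all(
        name in resource and rng.contains(resource[name])
        for name, rng in required.items()
    )


# Hypothetical job requirements, keyed by JSDL attribute name.
required = {
    "IndividualCPUCount": RangeValue(lower=2),
    "IndividualPhysicalMemory": RangeValue(lower=4e9),
    "IndividualDiskSpace": RangeValue(lower=1e10, upper=1e12),
}
# Hypothetical nodes: node_b has too few CPUs.
node_a = {"IndividualCPUCount": 4, "IndividualPhysicalMemory": 8e9, "IndividualDiskSpace": 5e11}
node_b = {"IndividualCPUCount": 1, "IndividualPhysicalMemory": 8e9, "IndividualDiskSpace": 5e11}

print(satisfies_individual(node_a, required))
print(satisfies_individual(node_b, required))
```

A node that does not advertise a required attribute is treated as not matching, which is a design choice of this sketch rather than something the slides specify.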
22
RDSResults EDAGrid
<RDSResultSet count="<num>">
  <Discovery rank="" id="">
    <Resource rank="" id="" nodeCount="">
      <WSGRAM/> <ReservationAddress/>
      <OSName/> <OSVersion/> <Platform/>
    </Resource> +
  </Discovery>
</RDSResultSet>
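To make the result format more concrete, here is a hedged Python sketch that parses a filled-in result set with the element and attribute names shown on this slide. The slide gives only the schematic structure, so every value in `SAMPLE` (ranks, ids, addresses, operating systems) is hypothetical, as is the assumption that rank and nodeCount are numeric.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical instance of the schematic structure; all values are invented.
SAMPLE = """
<RDSResultSet count="1">
  <Discovery rank="1" id="d-001">
    <Resource rank="1" id="site-a" nodeCount="16">
      <WSGRAM>https://site-a.example.org/gram</WSGRAM>
      <ReservationAddress>https://site-a.example.org/reserve</ReservationAddress>
      <OSName>Linux</OSName>
      <OSVersion>2.6</OSVersion>
      <Platform>x86_64</Platform>
    </Resource>
    <Resource rank="2" id="site-b" nodeCount="4">
      <WSGRAM>https://site-b.example.org/gram</WSGRAM>
      <ReservationAddress>https://site-b.example.org/reserve</ReservationAddress>
      <OSName>Linux</OSName>
      <OSVersion>2.6</OSVersion>
      <Platform>x86</Platform>
    </Resource>
  </Discovery>
</RDSResultSet>
"""


def parse_result_set(xml_text: str) -> List[Dict[str, object]]:
    """Flatten every Resource of every Discovery into one list of dictionaries."""
    root = ET.fromstring(xml_text.strip())
    rows: List[Dict[str, object]] = []
    for discovery in root.findall("Discovery"):
        for res in discovery.findall("Resource"):
            rows.append({
                "discovery_id": discovery.get("id"),
                "resource_id": res.get("id"),
                "rank": float(res.get("rank")),          # assumed numeric
                "node_count": int(res.get("nodeCount")),  # assumed integer
                "wsgram": res.findtext("WSGRAM"),
                "os_name": res.findtext("OSName"),
                "platform": res.findtext("Platform"),
            })
    return rows


for row in parse_result_set(SAMPLE):
    print(row["resource_id"], row["rank"], row["node_count"], row["platform"])
```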
23
RDSResults
<RDSResultSet count="<num>">
  <Discovery rank="" id="">
    <Individual rank="" id="" nodeCount="">
    </Individual>
    <Total/>
    <InteractBandwidth/>
  </Discovery>
</RDSResultSet>
25
Ranking EDAGrid
FUNCTION Ranking_Algorithm
  FOR each (Ri satisfying the individual condition)
    Rank(Ri) = 0
    FOR each (Aj, an attribute of the job)
      Rank(Ri) = Rank(Ri) + (w[j] R[i,j] A[j])
    Next A
  Next R
  Sort the Resource Set
  Return the list of resources in rank order
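The ranking loop above can be sketched in Python as follows. The slide text does not preserve the operators between w[j], R[i,j] and A[j], so this sketch assumes one plausible reading: the weight multiplied by the ratio of the value the resource offers to the value the job requires. The site names, attributes, and weights are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_resources(resources: Dict[str, Dict[str, float]],
                   job_attrs: Dict[str, float],
                   weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Score each candidate resource and return (id, rank) pairs, best first.

    Assumption: each attribute contributes w[j] * R[i,j] / A[j], that is,
    the weighted ratio of offered to required capacity.
    """
    ranked: List[Tuple[str, float]] = []
    for rid, offered in resources.items():
        score = 0.0
        for attr, required in job_attrs.items():
            score += weights[attr] * offered[attr] / required
        ranked.append((rid, score))
    # Highest rank first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical job requirements and weights.
job_attrs = {"cpu": 4, "mem_gb": 8}
weights = {"cpu": 0.6, "mem_gb": 0.4}
# Hypothetical resources that already satisfy the individual condition.
resources = {
    "site-a": {"cpu": 8, "mem_gb": 16},
    "site-b": {"cpu": 4, "mem_gb": 32},
    "site-c": {"cpu": 16, "mem_gb": 8},
}

for rid, score in rank_resources(resources, job_attrs, weights):
    print(rid, round(score, 2))
```

With these weights, CPU surplus counts for more than memory surplus, so site-c ranks first even though site-b offers the most memory.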
26
Set-matching EDAGrid
The set-matching algorithm:
- Create an empty set.
- Add resources to the set one at a time, from the highest rank to the lowest.
- Each time, check whether the Total condition is met and whether the InteractBandwidth condition is violated.
- Terminate the loop when these conditions are satisfied or the number of grid nodes reaches the number the user required.
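A minimal Python sketch of the set-matching steps follows. The slides do not say how the Total and InteractBandwidth conditions are evaluated, so the sketch assumes that Total is a required number of CPUs summed over the set, and that a candidate violates InteractBandwidth when its link to any already selected node is slower than a minimum. All names and figures are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def set_match(ranked: List[str],
              cpus: Dict[str, int],
              bandwidth: Dict[FrozenSet[str], float],
              total_cpus_needed: int,
              min_bandwidth: float,
              max_nodes: int) -> Tuple[List[str], bool]:
    """Greedily build a resource set from a rank-ordered list.

    Returns the selected resources and whether the Total condition was met.
    """
    selected: List[str] = []
    total = 0
    for rid in ranked:  # highest rank first
        # Assumed InteractBandwidth check: skip candidates with a slow link
        # to any resource already in the set.
        if any(bandwidth[frozenset((rid, s))] < min_bandwidth for s in selected):
            continue
        selected.append(rid)
        total += cpus[rid]
        # Stop when the Total condition holds or the node limit is reached.
        if total >= total_cpus_needed or len(selected) >= max_nodes:
            break
    return selected, total >= total_cpus_needed


# Hypothetical inputs: three sites, already sorted by rank.
ranked = ["c", "b", "a"]
cpus = {"a": 8, "b": 4, "c": 16}
bandwidth = {
    frozenset(("c", "b")): 50.0,
    frozenset(("c", "a")): 200.0,
    frozenset(("b", "a")): 200.0,
}

print(set_match(ranked, cpus, bandwidth, total_cpus_needed=24,
                min_bandwidth=100.0, max_nodes=3))
```

In this run, site b is skipped because its link to site c is too slow, and sites c and a together reach the required 24 CPUs.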
28
Centralized Indexing
Proposes centralized management of all the resources of all the grid sites.
A server machine holds the index of all available resources on the network.
Users start the query process by sending a request to the index server.
The server answers users based on the information it has stored.
Advantage: queries are answered quickly.
Disadvantages:
- The server machine is a bottleneck.
- The information must be updated continuously.
- Not suitable for P2P.
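The centralized approach can be sketched as a single in-memory index that every site registers with and every query goes through. This is only an illustration: the class `IndexServer`, its method names, and the attribute keys are hypothetical and do not describe Napster or any VN-Grid component.

```python
from typing import Dict, List


class IndexServer:
    """Hypothetical central index: one dictionary holds every site's resources."""

    def __init__(self) -> None:
        self._index: Dict[str, Dict[str, float]] = {}

    def register(self, site: str, attrs: Dict[str, float]) -> None:
        # Sites must re-register whenever their resources change, which is
        # the continuous-update cost listed among the disadvantages.
        self._index[site] = dict(attrs)

    def unregister(self, site: str) -> None:
        self._index.pop(site, None)

    def query(self, **minimums: float) -> List[str]:
        """Return the sites whose advertised values meet every minimum."""
        return sorted(
            site for site, attrs in self._index.items()
            if all(attrs.get(key, 0) >= value for key, value in minimums.items())
        )


server = IndexServer()
server.register("site-a", {"cpus": 16, "mem_gb": 64})
server.register("site-b", {"cpus": 4, "mem_gb": 8})
server.register("site-c", {"cpus": 32, "mem_gb": 16})

print(server.query(cpus=8))
print(server.query(cpus=8, mem_gb=32))
```

Every lookup is a single local scan, which is why answers are quick, but all registrations and queries converge on the one server.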
29
Flooding
The query is routed from one peer to all of its neighbors. In this way, the query is sent throughout the network.
If a peer finds the resources in its local storage, it sends the answer to the original peer that made the request.
A Time-To-Live (TTL) limits the number of hops a request can travel, so that after a certain number of hops the request automatically disappears from the network.
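The effect of the TTL can be seen in a short breadth-first sketch of flooding. It is a simplification under stated assumptions: the overlay graph and the peer holding the resource are hypothetical, and each peer is assumed to forward a query only the first time it receives it, so duplicate deliveries are not counted.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def flood_query(graph: Dict[str, List[str]],
                has_resource: Set[str],
                origin: str,
                ttl: int) -> Tuple[List[str], int]:
    """Flood a query from origin; return (peers that answered, messages sent)."""
    visited = {origin}
    hits: List[str] = []
    messages = 0
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if peer in has_resource:
            hits.append(peer)  # this peer would answer the origin
        if remaining == 0:
            continue  # TTL exhausted: the query is dropped here
        for neighbor in graph[peer]:
            if neighbor not in visited:
                visited.add(neighbor)
                messages += 1
                queue.append((neighbor, remaining - 1))
    return hits, messages


# Hypothetical overlay network; only peer E holds the requested resource.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

print(flood_query(graph, {"E"}, "A", ttl=2))
print(flood_query(graph, {"E"}, "A", ttl=3))
```

With a TTL of 2 the query dies one hop short of E and finds nothing; with a TTL of 3 it reaches E at the cost of one more message.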
30
Indexing Using Distributed Hash Tables
In this method, each peer in the network holds a partition of the hash table.
Each entry in the hash table maps a key in the key space to the peer where the searched file can be found.
When a file is requested, the file name is hashed by a uniform hash function.
Based on the hash value and the hash table, the lookup value is found and returned to the requester.
The cost of this method consists of the cost of building and updating the hash table and of routing the query to the location of the searched file.
Disadvantage: it does not apply to complex queries.
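The lookup described above can be sketched with a small hash ring in Python. This is not Chord itself: it omits finger tables and hop-by-hop routing and keeps every partition in one process. The class `HashRing`, the 16-bit ring size, and the peer and file names are assumptions made for the example.

```python
import hashlib
from bisect import bisect_left
from typing import Dict, List, Optional

RING_BITS = 16  # assumed small identifier space, for illustration only


def ring_hash(key: str) -> int:
    """Map a string onto the ring using SHA-1."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


class HashRing:
    """Each peer stores the entries whose hash falls at or before its own id."""

    def __init__(self, peers: List[str]) -> None:
        self._points = sorted((ring_hash(p), p) for p in peers)
        self._store: Dict[str, Dict[str, str]] = {p: {} for p in peers}

    def responsible_peer(self, key: str) -> str:
        ids = [point for point, _ in self._points]
        # First peer id >= hash(key), wrapping around the ring.
        idx = bisect_left(ids, ring_hash(key)) % len(ids)
        return self._points[idx][1]

    def publish(self, file_name: str, location: str) -> None:
        self._store[self.responsible_peer(file_name)][file_name] = location

    def lookup(self, file_name: str) -> Optional[str]:
        return self._store[self.responsible_peer(file_name)].get(file_name)


ring = HashRing(["peer-1", "peer-2", "peer-3"])
ring.publish("dataset.h5", "peer-2:/data/dataset.h5")

print(ring.lookup("dataset.h5"))
print(ring.lookup("dataset"))
```

The second lookup returns None: only an exact key hashes to the right entry, which illustrates why this method does not handle complex queries such as partial names or ranges.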
31
Interests-based Methods
This method is based on the interests of users.
The idea is to search on the peers that seem to contain what users have requested.
To achieve this, peers are organized into groups of similar interest.
Search queries are therefore forwarded to the matching interest group, to get a high hit rate and reduce the time wasted searching on other peers.
Disadvantages:
- A peer's interests may change over time.
- Peers may have more than one interest.
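A minimal sketch of interest-based forwarding is shown below, assuming each peer belongs to exactly one named interest group. The group names, peers, and holdings are hypothetical, and the function `route_query` is an illustration rather than the HyperCuP protocol.

```python
from typing import Dict, List, Set, Tuple


def route_query(topic: str,
                groups: Dict[str, List[str]],
                holdings: Dict[str, Set[str]],
                resource: str) -> Tuple[List[str], int]:
    """Forward a query only to the interest group for its topic.

    Returns the peers that hold the resource and the number of peers contacted.
    """
    members = groups.get(topic, [])
    hits = [peer for peer in members if resource in holdings.get(peer, set())]
    return hits, len(members)


# Hypothetical interest groups and peer contents.
groups = {
    "mpi": ["p1", "p2"],
    "storage": ["p3", "p4", "p5"],
}
holdings = {
    "p2": {"mpich"},
    "p4": {"mpich"},  # p4 also holds it but sits in another group
}

print(route_query("mpi", groups, holdings, "mpich"))
```

Only two of the five peers are contacted, which is the saving the method aims for. The copy on p4 is missed because p4 is filed under a different interest, which mirrors the disadvantage that peers may have more than one interest.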
32
Ant Colony Optimizing
Ants start from their nests and wander randomly.
The ants that find food return to their nests from memory and drop pheromone on the trail.
Other ants that come across such a trail follow it to check for food instead of wandering randomly. If they find the food, they return home and reinforce the pheromone on the trail.
A key point is that the pheromone evaporates over time: the more time it takes for an ant to travel back to its nest, the more pheromone will have evaporated.
When an ant reaches an intersection, it has to decide which branch to take. Ants that take a short branch complete the trip faster than those that take a long branch, so the pheromone density on the short branch remains higher. Other ants are more likely to choose the branch with the higher pheromone density. Eventually, all the ants going to get the food take the shortest branch.
33
Ant Colony Optimizing
Initially, the ants may take the paths A→B→C→E, A→B→E, or A→B→D→E. After the initial stage, most of the ants will take the shortest path, A→B→E.
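The convergence on the shortest path can be reproduced with a small deterministic model in Python. It is a sketch under stated assumptions, not the algorithm from the cited papers: the path lengths are hypothetical, ants are modelled as expected shares rather than random individuals, each round evaporates a fixed fraction of pheromone, and the deposit on a path is proportional to the share of ants using it divided by its length.

```python
from typing import Dict, Tuple

Path = Tuple[str, ...]

# Hypothetical lengths for the three paths named on the slide.
LENGTHS: Dict[Path, float] = {
    ("A", "B", "C", "E"): 3.0,
    ("A", "B", "E"): 2.0,
    ("A", "B", "D", "E"): 3.5,
}


def simulate(lengths: Dict[Path, float],
             rounds: int = 100,
             evaporation: float = 0.5) -> Dict[Path, float]:
    """Return the share of ants on each path after the given number of rounds."""
    pheromone = {path: 1.0 for path in lengths}  # all paths start equal
    for _ in range(rounds):
        total = sum(pheromone.values())
        # Ants choose a path in proportion to its pheromone level.
        share = {path: level / total for path, level in pheromone.items()}
        for path in pheromone:
            # Shorter paths are completed more often, so they receive
            # more pheromone per round.
            deposit = share[path] / lengths[path]
            pheromone[path] = (1.0 - evaporation) * pheromone[path] + deposit
    total = sum(pheromone.values())
    return {path: level / total for path, level in pheromone.items()}


shares = simulate(LENGTHS)
for path, share in shares.items():
    print("->".join(path), round(share, 4))
```

All three paths start with equal pheromone, but the reinforcement on A→B→E outpaces evaporation more than on the longer paths, so its share grows round after round.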
37
Thank You
14
JSDL
<JobDefinition>
  <JobDescription>
    <JobIdentification/>
    <Application/>
    <Resources/>
    <DataStaging/>
  </JobDescription>
  <xsd:any##other/>
</JobDefinition>
Resources
Attributes of the Resources element:
Name of attribute | Type | Description
OperatingSystem | complex type | Defines the operating system required by the job.
ExclusiveExecution | xsd:boolean | Designates whether the job must have exclusive access to the resources allocated to it by the consuming system.
FileSystem | complex type | Describes a filesystem that is required by the job.
CandidateHosts | complex type | Specifies the set of named hosts which may be selected for running the job.
Resources
Attributes of the Resources element (per-resource requirements):
Name of attribute | Type | Description
IndividualDiskSpace | jsdl:RangeValue_Type | The required amount of disk space for each resource allocated to the job.
IndividualVirtualMemory | jsdl:RangeValue_Type | The required amount of virtual memory for each of the resources to be allocated for this job submission.
IndividualPhysicalMemory | jsdl:RangeValue_Type | The amount of physical memory required on each individual resource.
IndividualNetworkBandwidth | jsdl:RangeValue_Type | The bandwidth requirements of each individual resource.
IndividualCPUCount | jsdl:RangeValue_Type | The number of CPUs for each of the resources to be allocated to the job submission.
IndividualCPUTime | jsdl:RangeValue_Type | The total number of CPU seconds required on each resource to execute the job.
IndividualCPUSpeed | jsdl:RangeValue_Type | The speed of each CPU required by the job in the execution environment.
Resources
Attributes of the Resources element (total requirements):
Name of attribute | Type | Description
TotalResourceCount | jsdl:RangeValue_Type | The total number of resources required by the job.
TotalDiskSpace | jsdl:RangeValue_Type | The required total amount of disk space that should be allocated to the job.
TotalVirtualMemory | jsdl:RangeValue_Type | The required total amount of virtual memory for the job submission.
TotalPhysicalMemory | jsdl:RangeValue_Type | The required amount of physical memory for the entire job across all resources.
TotalCPUCount | jsdl:RangeValue_Type | The total number of CPUs required for this job submission.
TotalCPUTime | jsdl:RangeValue_Type | The total number of CPU seconds required across all CPUs used to execute the job.
Resources
Attributes of the CandidateHosts element:
Name of attribute | Type | Description
HostName | xsd:string | A simple type containing a single name of a host.
Resources
Attributes of the FileSystem element:
Name of attribute | Type | Description
FileSystemType | jsdl:FileSystemTypeEnumeration | A token that describes the type of filesystem of the containing FileSystem element.
DiskSpace | jsdl:RangeValue_Type | The required amount of disk space on the containing FileSystem element for the job.
MountSource | xsd:string | A remote location that MUST be made available locally for the job.
MountPoint | xsd:string | A local location that MUST be made available in the allocated resources for the job.
Description | xsd:string |
Resources
Attributes of the OperatingSystem element:
Name of attribute | Type | Description
Description | xsd:string |
OperatingSystemVersion | xsd:string | The version of the operating system required by the job.
OperatingSystemType | complex type | Contains the name of the operating system.
Resources
Attributes of the CPUArchitecture element:
Name of attribute | Type | Description
CPUArchitectureName | jsdl:ProcessorArchitectureEnumeration | A token specifying the CPU architecture required by the job in the execution environment.
Attributes of the OperatingSystemType element:
Name of attribute | Type | Description
OperatingSystemName | jsdl:OperatingSystemTypeEnumeration | A token type that contains the name of the operating system.
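As a rough illustration of how the attributes listed above fit together, the sketch below builds a small JSDL Resources element with Python's standard library. The helper names (`q`, `build_resources`) and the chosen values are assumptions made for this example; the namespace URI is the one used by JSDL 1.0, and `Exact` and `LowerBoundedRange` are two of the forms a range value can take.

```python
import xml.etree.ElementTree as ET

# JSDL 1.0 namespace.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
ET.register_namespace("jsdl", JSDL_NS)


def q(name: str) -> str:
    """Qualify an element name with the JSDL namespace (hypothetical helper)."""
    return f"{{{JSDL_NS}}}{name}"


def build_resources(total_cpus: int, min_memory_bytes: int) -> ET.Element:
    """Build a Resources element asking for an exact CPU total and a
    lower bound on physical memory per resource (illustrative only)."""
    resources = ET.Element(q("Resources"))
    # Per-resource requirement: at least `min_memory_bytes` of physical memory.
    memory = ET.SubElement(resources, q("IndividualPhysicalMemory"))
    ET.SubElement(memory, q("LowerBoundedRange")).text = str(min_memory_bytes)
    # Total requirement: exactly `total_cpus` CPUs across all resources.
    cpus = ET.SubElement(resources, q("TotalCPUCount"))
    ET.SubElement(cpus, q("Exact")).text = str(total_cpus)
    return resources


if __name__ == "__main__":
    element = build_resources(total_cpus=4, min_memory_bytes=2 * 1024 ** 3)
    print(ET.tostring(element, encoding="unicode"))
```

A Resource Discovery service would read such a description as the query: the Individual values constrain each candidate resource, while the Total values constrain the selected set as a whole.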
22
RDSResults EDAGrid
<RDSResultSet count="<num>">
  <Discovery rank="" id="">
    <Resource rank="" id="" nodeCount="">
      <WSGRAM/> <ReservationAddress/>
      <OSName/> <OSVersion/> <Platform/>
    </Resource> +
  </Discovery>
</RDSResultSet>
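To make the result format more concrete, here is a minimal sketch of how a client might read such a result set. The sample document, its values, the example host names, and the assumption that the child elements carry plain text are all hypothetical; only the element and attribute names come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample result; every value below is made up for illustration.
SAMPLE = """\
<RDSResultSet count="1">
  <Discovery rank="1" id="d1">
    <Resource rank="2" id="r2" nodeCount="2">
      <WSGRAM>https://site-b.example/wsgram</WSGRAM>
      <ReservationAddress>https://site-b.example/reserve</ReservationAddress>
      <OSName>Linux</OSName>
      <OSVersion>2.6</OSVersion>
      <Platform>x86</Platform>
    </Resource>
    <Resource rank="1" id="r1" nodeCount="4">
      <WSGRAM>https://site-a.example/wsgram</WSGRAM>
      <ReservationAddress>https://site-a.example/reserve</ReservationAddress>
      <OSName>Linux</OSName>
      <OSVersion>2.6</OSVersion>
      <Platform>x86_64</Platform>
    </Resource>
  </Discovery>
</RDSResultSet>
"""


def list_resources(xml_text: str) -> list:
    """Return one dict per Resource element, ordered by its rank attribute."""
    root = ET.fromstring(xml_text)
    found = []
    for discovery in root.findall("Discovery"):
        for resource in discovery.findall("Resource"):
            found.append({
                "id": resource.get("id"),
                "rank": int(resource.get("rank")),
                "nodes": int(resource.get("nodeCount")),
                "os": resource.findtext("OSName"),
            })
    return sorted(found, key=lambda item: item["rank"])


if __name__ == "__main__":
    for item in list_resources(SAMPLE):
        print(item)
```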
23
RDSResults
<RDSResultSet count="<num>">
  <Discovery rank="" id="">
    <Individual rank="" id="" nodeCount="">
    </Individual>
    <Total/>
    <InteractBandwidth/>
  </Discovery>
</RDSResultSet>
24
Several Approaches
Resources description
- Global Grid Forum: JSDL
- EDAGrid: RDSResult
Matching method
- EDAGrid: Ranking, Set-matching
Forwarding method
- Napster: Centralized Indexing
- Gnutella: Flooding Query
- Chord: Indexing Using Distributed Hash Tables
- HyperCuP system: Interests-based
- Ant Colony Optimizing
25
Ranking EDAGrid
FUNCTION Ranking_Algorithm
  FOR each Ri that satisfies the individual condition
    Rank(Ri) = 0
    FOR each Aj that is an attribute of the job
      Rank(Ri) = Rank(Ri) + (w[j] * R[i,j] / A[j])
    NEXT A
  NEXT R
  Sort the resource set by rank
  RETURN the list of resources in rank order
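The ranking step can be sketched in Python as follows. This is an illustration under stated assumptions, not the EDAGrid implementation: the operators in the rank formula were lost in the slide text, so the sketch assumes a weighted ratio `w[j] * R[i][j] / A[j]`, and it assumes the "individual condition" means that a resource meets every requested minimum. The function and attribute names are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_resources(
    resources: Dict[str, Dict[str, float]],
    job: Dict[str, float],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (resource id, rank) pairs sorted from highest to lowest rank.

    `job` maps attribute names to the (positive) amount the job requests.
    """
    ranked = []
    for res_id, attrs in resources.items():
        # Assumed individual condition: every requested minimum is met.
        if any(attrs.get(name, 0.0) < need for name, need in job.items()):
            continue
        # Assumed formula: sum of weight * offered amount / requested amount.
        rank = sum(weights[name] * attrs[name] / need
                   for name, need in job.items())
        ranked.append((res_id, rank))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


if __name__ == "__main__":
    sites = {
        "site-a": {"cpu": 8, "mem": 16},
        "site-b": {"cpu": 4, "mem": 32},
        "site-c": {"cpu": 2, "mem": 64},  # fails the CPU minimum below
    }
    print(rank_resources(sites, job={"cpu": 4, "mem": 16},
                         weights={"cpu": 0.7, "mem": 0.3}))
```

With these weights, a resource that offers more than the job asks for scores above 1 on that attribute, so surplus capacity on heavily weighted attributes pushes a resource up the list.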
26
Set-matching EDAGrid
The set-matching algorithm:
- Create an empty set.
- Add resources to the set one at a time, from the highest rank to the lowest.
- After each addition, check whether the Total condition is met and whether the InteractBandwidth condition is violated.
- Terminate the loop when these conditions are satisfied or when the number of grid nodes reaches the number the user requires.
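A minimal sketch of this greedy loop is shown below. It makes several assumptions that the slide does not state: the Total condition is reduced to a single CPU total, a candidate that would violate the InteractBandwidth condition is skipped rather than ending the search, and the names `set_match`, `bandwidth`, and `min_bandwidth` are invented for the example.

```python
from typing import Callable, Dict, List


def set_match(
    ranked_ids: List[str],
    cpus: Dict[str, int],
    total_cpus: int,
    max_nodes: int,
    bandwidth: Callable[[str, str], float],
    min_bandwidth: float,
) -> List[str]:
    """Pick resources in rank order until the total CPU requirement is met
    or the node limit is reached."""
    chosen: List[str] = []
    have = 0
    for res_id in ranked_ids:  # assumed to be sorted, highest rank first
        if have >= total_cpus or len(chosen) >= max_nodes:
            break
        # Assumed InteractBandwidth check: every link between the candidate
        # and the nodes already chosen must offer at least min_bandwidth.
        if any(bandwidth(res_id, other) < min_bandwidth for other in chosen):
            continue
        chosen.append(res_id)
        have += cpus[res_id]
    return chosen


if __name__ == "__main__":
    slow_links = {frozenset(("a", "b")): 10.0}
    link = lambda x, y: slow_links.get(frozenset((x, y)), 100.0)
    print(set_match(["a", "b", "c", "d"], {"a": 4, "b": 4, "c": 8, "d": 2},
                    total_cpus=12, max_nodes=3,
                    bandwidth=link, min_bandwidth=50.0))
```

The caller still has to check whether the returned set actually satisfies the total requirement, since the loop also stops when the node limit is hit or the candidates run out.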
28
Centralized Indexing
This approach proposes centralized management of all the resources of all the grid sites.
A server machine holds the index of all available resources on the network.
Users start the query process by sending a request to the index server.
The server answers users based on the information it has stored.
Advantage: queries are answered quickly.
Disadvantages:
- The server machine becomes a bottleneck.
- The information must be updated continuously.
- It is not suitable for P2P.
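The idea can be sketched as a single in-memory index that every site reports to and every client queries. The class and method names below are hypothetical, and the matching rule (each requested amount must be met or exceeded) is an assumption made for the example.

```python
from typing import Dict, List


class IndexServer:
    """Toy central index: one dictionary holds every site's resources."""

    def __init__(self) -> None:
        self._index: Dict[str, Dict[str, float]] = {}

    def register(self, site: str, resources: Dict[str, float]) -> None:
        """Called by a grid site whenever its available resources change."""
        self._index[site] = dict(resources)

    def query(self, requirements: Dict[str, float]) -> List[str]:
        """Return the sites whose indexed resources meet every requirement."""
        return sorted(
            site for site, have in self._index.items()
            if all(have.get(name, 0) >= need
                   for name, need in requirements.items())
        )


if __name__ == "__main__":
    server = IndexServer()
    server.register("site-a", {"cpus": 16, "memory_gb": 64})
    server.register("site-b", {"cpus": 4, "memory_gb": 8})
    print(server.query({"cpus": 8}))
```

Both listed drawbacks are visible here: every query and every update goes through the one `IndexServer` object, and an answer is only as fresh as the last `register` call from each site.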
29
Flooding
The query is routed from one peer to all of its neighbors. In this way, the query is sent throughout the network.
If a peer finds the requested resources in its local storage, it sends the answer to the original peer that made the request.
A Time-To-Live (TTL) value limits the number of hops a request can travel, so that after being forwarded a certain number of times the request automatically disappears from the network.
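The effect of the TTL can be seen in a small simulation. This is a sketch on an in-memory graph, not a network protocol: the `seen` set stands in for the duplicate suppression a real system would need, and all names are assumptions made for the example.

```python
from collections import deque
from typing import Dict, List, Set


def flood_query(
    neighbors: Dict[str, List[str]],
    holdings: Dict[str, Set[str]],
    origin: str,
    wanted: str,
    ttl: int,
) -> List[str]:
    """Return the peers holding `wanted` that a flooded query reaches
    within `ttl` hops of `origin`."""
    hits: List[str] = []
    seen = {origin}               # peers that already received the query
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if wanted in holdings.get(peer, set()):
            hits.append(peer)     # this peer would answer the origin
        if remaining == 0:
            continue              # TTL exhausted: the query is dropped here
        for nxt in neighbors.get(peer, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits


if __name__ == "__main__":
    line = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    owns = {"d": {"cpu-cluster"}}
    print(flood_query(line, owns, "a", "cpu-cluster", ttl=2))
    print(flood_query(line, owns, "a", "cpu-cluster", ttl=3))
```

On the four-peer chain in the usage example, the resource sits three hops away, so it is found with a TTL of 3 but missed with a TTL of 2. A small TTL keeps traffic down at the price of missing distant resources.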
30
Indexing Using Distributed Hash Tables
In this method, each peer in the network holds a partition of the hash table.
Each entry in the hash table maps part of the key space to the peer where the requested file can be found.
When a file is requested, the file name is hashed with a uniform hash function.
Based on the hash value and the hash table, the lookup value is found and returned to the requester.
The cost of this method consists of the cost of building and updating the hash table and the cost of routing the query to the location of the requested file.
Disadvantage: it does not support complex queries.
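The lookup described above can be sketched with a hash ring in which each key is owned by the first peer whose identifier follows it. This is a simplified, single-process illustration in the spirit of Chord; it omits finger tables and message routing, and the class, method, and peer names are assumptions made for the example.

```python
import bisect
import hashlib
from typing import Dict, List, Optional


def key_of(name: str, bits: int = 16) -> int:
    """Hash a name uniformly onto a ring of 2**bits positions."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class HashRing:
    """Each peer stores the entries whose keys fall in its part of the ring."""

    def __init__(self, peers: List[str]) -> None:
        self._ring = sorted((key_of(peer), peer) for peer in peers)
        self._keys = [key for key, _ in self._ring]
        self._store: Dict[str, Dict[str, str]] = {peer: {} for peer in peers}

    def owner(self, name: str) -> str:
        """Return the peer responsible for `name` (its successor on the ring)."""
        index = bisect.bisect_left(self._keys, key_of(name)) % len(self._ring)
        return self._ring[index][1]

    def publish(self, name: str, location: str) -> None:
        self._store[self.owner(name)][name] = location

    def lookup(self, name: str) -> Optional[str]:
        return self._store[self.owner(name)].get(name)


if __name__ == "__main__":
    ring = HashRing(["peer-1", "peer-2", "peer-3"])
    ring.publish("data.bin", "peer-2:/files/data.bin")
    print(ring.owner("data.bin"), ring.lookup("data.bin"))
```

The stated disadvantage shows up directly: `lookup` only works for the exact name that was hashed, so a query such as "any site with at least 8 CPUs" cannot be answered by hashing alone.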
31
Interests-based Methods
This method is based on the interests of users.
The idea is to search the peers that seem likely to contain what users have requested.
To achieve this, peers are organized into groups with similar interests.
Search queries are therefore forwarded to the matching interest group, which gives a high hit rate and avoids wasting time searching other peers.
Disadvantages:
- A peer's interests may change over time.
- A peer may have more than one interest.
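A very small sketch of the forwarding decision follows. The group table, the topic names, and the fallback to all known peers when no group matches are assumptions made for illustration; the slide does not say how an unmatched query is handled.

```python
from typing import Dict, List, Set


def route_by_interest(groups: Dict[str, Set[str]], topic: str) -> List[str]:
    """Return the peers a query about `topic` should be forwarded to."""
    if topic in groups:
        # Forward only to the group that shares this interest.
        return sorted(groups[topic])
    # Assumed fallback: no matching group, so try every known peer.
    return sorted(set().union(*groups.values()))


if __name__ == "__main__":
    interest_groups = {
        "bioinformatics": {"peer-1", "peer-2"},
        "physics": {"peer-2", "peer-3"},  # peer-2 has two interests
    }
    print(route_by_interest(interest_groups, "physics"))
    print(route_by_interest(interest_groups, "chemistry"))
```

Because a peer can appear in several groups, and group membership must be revised as interests drift, keeping the `groups` table accurate is the costly part of this method.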
32
Ant Colony Optimizing
Ants start from their nests and wander randomly.
Ants that find food return to their nests from memory and drop pheromone on the trail.
Other ants that come across such a trail follow it to the food instead of wandering randomly. If they find the food, they return home and reinforce the pheromone on the trail.
A key point is that the pheromone evaporates over time. The more time it takes for an ant to travel back to its nest, the more pheromone evaporates.
When an ant reaches an intersection, it has to decide which branch to take. Ants that take a short branch complete the trip faster than those that take a long branch, so the pheromone density on the short branch remains higher.
Other ants are more likely to choose the branch with the higher pheromone density. Eventually, all the ants that go to get the food take the shortest branch.
33
Ant Colony Optimizing
Initially, the ants may take the paths A→B→C→E, A→B→E, or A→B→D→E.
After the initial stage, most of the ants will take the shortest path, A→B→E.
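The convergence on the shortest path can be reproduced with a small deterministic model. This is a simplified sketch, not the algorithm from the cited work: instead of simulating individual random ants, it tracks the expected share of ants on each path, assumes every hop has the same length, and deposits pheromone in inverse proportion to path length. The function name and the evaporation rate are assumptions.

```python
from typing import Dict, List, Tuple

Path = Tuple[str, ...]


def simulate(paths: List[Path], rounds: int = 100,
             evaporation: float = 0.5) -> Dict[Path, float]:
    """Return the share of ants on each path after `rounds` rounds."""
    pheromone = {path: 1.0 for path in paths}  # all paths start equal
    for _ in range(rounds):
        total = sum(pheromone.values())
        for path in paths:
            share = pheromone[path] / total    # ants follow pheromone density
            hops = len(path) - 1               # shorter path, faster round trip
            # Evaporate the old pheromone, then add the new deposit, which is
            # larger per ant on short paths.
            pheromone[path] = ((1.0 - evaporation) * pheromone[path]
                               + share / hops)
    total = sum(pheromone.values())
    return {path: pheromone[path] / total for path in paths}


if __name__ == "__main__":
    routes = [("A", "B", "C", "E"), ("A", "B", "E"), ("A", "B", "D", "E")]
    for route, share in simulate(routes).items():
        print("→".join(route), round(share, 4))
```

All three paths start with the same pheromone level, yet the two-hop path A→B→E gains a little more on every round and ends up carrying almost all of the traffic, which matches the behavior described on the slide.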
34
Overview
Introduction of problems
Several approaches
Solution model
35
37
Thank You
Resources
This is a complex type that defines the operating system required by the jobcomplex typeOperatingSystem
This is a boolean that designates whether the job must have exclusive access to the resources allocated to it by the consuming systemxsdbooleanExclusiveExecution
This element describes a filesystem that is required by the jobcomplex typeFileSystem
This element is a complex type specifying the set of named hosts which may be selected for running the jobcomplex typeCandidateHosts
DescriptionTypeName of attribute
Resources
Resources
This is a range value that describes the required amount of disk space for each resource allocated to the job
jsdlRangeValue_TypeIndividualDiskSpace
This element is a range value specifying the required amount of virtual memory for each of theresources to be allocated for this job submission
jsdlRangeValue_Type
IndividualVirtualMemory
This element is a range value specifying the amount of physical memory required on each indi-vidual resource
jsdlRangeValue_Type
IndividualPhysicalMemory
This element is a range value specifying the bandwidth requirements of each individual resource
jsdlRangeValue_Type
IndividualNetworkBandwidth
This element is a range value specifying the number of CPUs for each of the resources to be allocated to the job submission
jsdlRangeValue_TypeIndividualCPUCount
This element is a range value specifying the total number of CPU seconds required on each resource to execute the job
jsdlRangeValue_TypeIndividualCPUTime
This element is a range value specifying the speed of each CPU required by the job in the execution environment
jsdlRangeValue_TypeIndividualCPUSpeed
DescriptionTypeName of attribute
Resources
Resources
This element is a range value specifying the total number of resources required by the job jsdlRangeValue_TypeTotalResourceCount
This is a range value that describes the required total amount of disk space that should be allocated to the job jsdlRangeValue_TypeTotalDiskSpace
This element is a range value specifying the required total amount of virtual memory for the jobsubmission jsdlRangeValue_TypeTotalVirtualMemory
This element is a range value specifying the required amount of physical memory for the entirejob across all resources jsdlRangeValue_TypeTotalPhysicalMemory
This element is a range value specifying the total number of CPUs required for this job submission jsdlRangeValue_TypeTotalCPUCount
This element is a range value specifying total number of CPU seconds required across all CPUs used to execute the job jsdlRangeValue_TypeTotalCPUTime
DescriptionTypeName of attribute
Resources
Resources
This element is a simple type containing a single name of a hostxsdstringHostName
CandidateHosts
Resources
This is a token that describes the type of filesystem of the containing FileSystem element
jsdlFileSystemTypeEnumerationFileSystemType
This is a range value that describes the required amount of disk space on the containing FileSystem element for the job
jsdlRangeValue_TypeDiskSpace
This is a string that describes a remote location that MUST be made available locally for the jobxsdstringMountSource
This is a string that describes a local location that MUST be made available in the allocated resources for the jobxsdstringMountPoint
xsdstringDescription
FileSystem
Resources
xsdstringDescription
This element is a string that defines the version of the operating system required by the jobxsdstringOperatingSystemVersion
This is a complex type that contains the name of the operating systemcomplex typeOperatingSystemType
OperatingSystem
Resources
This element is a token specifying the CPU architecture required by the job in the execution environment
jsdlProcessorArchitectureEnumerationCPUArchitectureName
CPUArchitecture
This is a token type that contains the name of the operating system
jsdlOperatingSystemTypeEnumerationOperatingSystemName
OperatingSystemType
22
RDSResults EDAGrid
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltResource rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltWSGRAMgtltReservationAddressgt
ltOSNamegt ltOSVersiongt ltPlatformgt
ltResourcegt +ltDiscoverygt
ltRDSResultSetgt
23
RDSResults
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltIndividual rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltIndividualgtltTotalgtltInteractBandwidthgt
ltDiscoverygt ltRDSResultSetgt
24
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
25
Ranking EDAGrid
FUN CTION Ranking_AlgorithmFOR each (Ri satisfy the individual condition)Rank(Ri) = 0FOR each (Aj is the attribute of job)Rank(Ri) = Rank(Ri) + (w[j] R[ij]A[j])Next ANext RSort the Resource SetReturn list of resource with order
26
Set-matching EDAGrid
The set-matching algorithm Create an empty set Add the resources into the set with the higher
rank one after one the lower Each time check if the Total condition is met
and if the InteractBandwidth is violated Terminate the loop if these conditions are
satisfied or the number of gridnodes reaches the number user required
27
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
28
Centralized Indexing
Proposes a centralization management for the whole resources of all the grid sites
There is a server machine which holds all the index of available resources on the network
Users start the query process by sending the request to the index server
The server will send the answers to the users bases on the information stored
Advantages quickly Disadvantages
ndash The bottleneck at the server machinendash Update the information continuouslyndash Not suitable to the P2P
29
Flooding
The query will be routed from one peer to all of its neighbors By this way the query will be sent throughout the network If the peer finds out the resources in its local storage it will send the
answer to the original peer who makes the request Using Time-To-Live (TTL) to limit the number of hops a request could
be sent so that after a certain times to be sent the request will automatically disappear out of the network
30
Indexing Using Distributed Hash Tables
In this method each peer in the network has a partition of the hash table
Each entry in the hash table is the key space which point to the peer where the search file can be found
When there is a request of a file the file name will be hash by a uniform hash function
Base on the hash value and the hash table the look up value will be found and return to the requester
The cost of this method consists of the cost to build and update the hash table and route the query to the location search file
Disadvantages not apply to the complex query
31
Interests-based Methods
This method based on the interest of users The idea is to search on the peers that seem to contain what
users have required To reach this point peers are organized into groups of similar
interest Therefore the search queries will be forwarded to the interest
group to get the high hit rate and reduce the redundant time to search on other peers
Disadvantage ndash the peers interest may change over time ndash peers have more than one interest
32
Ant Colony Optimizing
Ants start from their nests and wander randomly The ants which found food will return to their nests in terms of their memory and drop
pheromone on trails Other ants which come across such a trail will follow the trail to check the food instead of
wandering randomly If they find the food they will return home and reinforce the pheromone on the trail A key point is that the pheromone evaporates over time The more time it takes for an ant to travel back to its nest the more pheromone will be
evaporated When an ant reaches an intersection the ant has to decide which branch to take The ants which take a short branch march faster than those which take a long branch Therefore the pheromone density on the short branch remains higher Other ants will more likely choose the branch in terms of the pheromone density Eventually all the ants which go to get the food will take the shortest branch
33
Ant Colony Optimizing
Initially the ants may take the
paths of ArarrBrarrCrarrE ArarrBrarrE or ArarrBrarrDrarrE After the initial stage most
of the ants will take the shortest path ArarrBrarrE
34
Overview
Introduction of problems Several approaches Solution model
35
36
Reference
[1]Thien-Nga Nguyen-Vu Information Service API Specification raquoVNGRID PROJECT [2]Tran Vu Pham Lydia MS Lau and Peter M Dew An Ontology-based Adaptive Approach
to P2P Resource Discovery in Distributed School of Computing University of Leeds Leeds UK
[3]Tuan Anh Nguyen VN-Grid_design-Oct_1 VNGrid Project[4] Project Overview_VN-Grid Project wiki httpwwwcsehcmuteduvn~vngridwikiProject_Overview
[4] [JSDL]Job Submission Description Language (JSDL) Specification Version 10 httpforgegridforumorgprojectsjsdl-wg
Nguyễn Quang Hugraveng Nguyễn Thanh Sơn USER-DRIVEN GRID RESOURCE DISCOVERY Khoa Khoa Học amp Kỹ Thuật Maacutey Tiacutenh Nhagrave A3 Trường Đại học Baacutech Khoa ndash ĐHQG TpHCM
Yuhui Deng middot FrankWang middot Adrian Ciura Ant colony optimization inspired resource discovery in P2P Grid systems Springer Science+Business Media LLC 2008
Zenggang Xiong 12 Yang Yang1 Xuemin Zhang2 Fu Chen110485791048579104857910485791048579Li Liu1 Integrating Genetic and Ant Algorithm into P2P Grid Resource Discovery School of Information Engineering University of Science and Technology Beijing Beijing 100083 China
Tran Vu Pham A Collaborative e-Science Architecture for Distributed Scientific Communities The University of Leeds School of Computing October 2006
37
Thank You
Resources
This is a range value that describes the required amount of disk space for each resource allocated to the job
jsdlRangeValue_TypeIndividualDiskSpace
This element is a range value specifying the required amount of virtual memory for each of theresources to be allocated for this job submission
jsdlRangeValue_Type
IndividualVirtualMemory
This element is a range value specifying the amount of physical memory required on each indi-vidual resource
jsdlRangeValue_Type
IndividualPhysicalMemory
This element is a range value specifying the bandwidth requirements of each individual resource
jsdlRangeValue_Type
IndividualNetworkBandwidth
This element is a range value specifying the number of CPUs for each of the resources to be allocated to the job submission
jsdlRangeValue_TypeIndividualCPUCount
This element is a range value specifying the total number of CPU seconds required on each resource to execute the job
jsdlRangeValue_TypeIndividualCPUTime
This element is a range value specifying the speed of each CPU required by the job in the execution environment
jsdlRangeValue_TypeIndividualCPUSpeed
DescriptionTypeName of attribute
Resources
Resources
This element is a range value specifying the total number of resources required by the job jsdlRangeValue_TypeTotalResourceCount
This is a range value that describes the required total amount of disk space that should be allocated to the job jsdlRangeValue_TypeTotalDiskSpace
This element is a range value specifying the required total amount of virtual memory for the jobsubmission jsdlRangeValue_TypeTotalVirtualMemory
This element is a range value specifying the required amount of physical memory for the entirejob across all resources jsdlRangeValue_TypeTotalPhysicalMemory
This element is a range value specifying the total number of CPUs required for this job submission jsdlRangeValue_TypeTotalCPUCount
This element is a range value specifying total number of CPU seconds required across all CPUs used to execute the job jsdlRangeValue_TypeTotalCPUTime
DescriptionTypeName of attribute
Resources
Resources
This element is a simple type containing a single name of a hostxsdstringHostName
CandidateHosts
Resources
This is a token that describes the type of filesystem of the containing FileSystem element
jsdlFileSystemTypeEnumerationFileSystemType
This is a range value that describes the required amount of disk space on the containing FileSystem element for the job
jsdlRangeValue_TypeDiskSpace
This is a string that describes a remote location that MUST be made available locally for the jobxsdstringMountSource
This is a string that describes a local location that MUST be made available in the allocated resources for the jobxsdstringMountPoint
xsdstringDescription
FileSystem
Resources
xsdstringDescription
This element is a string that defines the version of the operating system required by the jobxsdstringOperatingSystemVersion
This is a complex type that contains the name of the operating systemcomplex typeOperatingSystemType
OperatingSystem
Resources
This element is a token specifying the CPU architecture required by the job in the execution environment
jsdlProcessorArchitectureEnumerationCPUArchitectureName
CPUArchitecture
This is a token type that contains the name of the operating system
jsdlOperatingSystemTypeEnumerationOperatingSystemName
OperatingSystemType
22
RDSResults EDAGrid
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltResource rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltWSGRAMgtltReservationAddressgt
ltOSNamegt ltOSVersiongt ltPlatformgt
ltResourcegt +ltDiscoverygt
ltRDSResultSetgt
23
RDSResults
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltIndividual rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltIndividualgtltTotalgtltInteractBandwidthgt
ltDiscoverygt ltRDSResultSetgt
24
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
25
Ranking EDAGrid
FUN CTION Ranking_AlgorithmFOR each (Ri satisfy the individual condition)Rank(Ri) = 0FOR each (Aj is the attribute of job)Rank(Ri) = Rank(Ri) + (w[j] R[ij]A[j])Next ANext RSort the Resource SetReturn list of resource with order
26
Set-matching EDAGrid
The set-matching algorithm Create an empty set Add the resources into the set with the higher
rank one after one the lower Each time check if the Total condition is met
and if the InteractBandwidth is violated Terminate the loop if these conditions are
satisfied or the number of gridnodes reaches the number user required
27
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
28
Centralized Indexing
Proposes a centralization management for the whole resources of all the grid sites
There is a server machine which holds all the index of available resources on the network
Users start the query process by sending the request to the index server
The server will send the answers to the users bases on the information stored
Advantages quickly Disadvantages
ndash The bottleneck at the server machinendash Update the information continuouslyndash Not suitable to the P2P
29
Flooding
The query will be routed from one peer to all of its neighbors By this way the query will be sent throughout the network If the peer finds out the resources in its local storage it will send the
answer to the original peer who makes the request Using Time-To-Live (TTL) to limit the number of hops a request could
be sent so that after a certain times to be sent the request will automatically disappear out of the network
30
Indexing Using Distributed Hash Tables
In this method each peer in the network has a partition of the hash table
Each entry in the hash table is the key space which point to the peer where the search file can be found
When there is a request of a file the file name will be hash by a uniform hash function
Base on the hash value and the hash table the look up value will be found and return to the requester
The cost of this method consists of the cost to build and update the hash table and route the query to the location search file
Disadvantages not apply to the complex query
31
Interests-based Methods
This method based on the interest of users The idea is to search on the peers that seem to contain what
users have required To reach this point peers are organized into groups of similar
interest Therefore the search queries will be forwarded to the interest
group to get the high hit rate and reduce the redundant time to search on other peers
Disadvantage ndash the peers interest may change over time ndash peers have more than one interest
32
Ant Colony Optimizing
Ants start from their nests and wander randomly The ants which found food will return to their nests in terms of their memory and drop
pheromone on trails Other ants which come across such a trail will follow the trail to check the food instead of
wandering randomly If they find the food they will return home and reinforce the pheromone on the trail A key point is that the pheromone evaporates over time The more time it takes for an ant to travel back to its nest the more pheromone will be
evaporated When an ant reaches an intersection the ant has to decide which branch to take The ants which take a short branch march faster than those which take a long branch Therefore the pheromone density on the short branch remains higher Other ants will more likely choose the branch in terms of the pheromone density Eventually all the ants which go to get the food will take the shortest branch
33
Ant Colony Optimizing
Initially the ants may take the
paths of ArarrBrarrCrarrE ArarrBrarrE or ArarrBrarrDrarrE After the initial stage most
of the ants will take the shortest path ArarrBrarrE
34
Overview
Introduction of problems Several approaches Solution model
35
36
Reference
[1]Thien-Nga Nguyen-Vu Information Service API Specification raquoVNGRID PROJECT [2]Tran Vu Pham Lydia MS Lau and Peter M Dew An Ontology-based Adaptive Approach
to P2P Resource Discovery in Distributed School of Computing University of Leeds Leeds UK
[3]Tuan Anh Nguyen VN-Grid_design-Oct_1 VNGrid Project[4] Project Overview_VN-Grid Project wiki httpwwwcsehcmuteduvn~vngridwikiProject_Overview
[4] [JSDL]Job Submission Description Language (JSDL) Specification Version 10 httpforgegridforumorgprojectsjsdl-wg
Nguyễn Quang Hugraveng Nguyễn Thanh Sơn USER-DRIVEN GRID RESOURCE DISCOVERY Khoa Khoa Học amp Kỹ Thuật Maacutey Tiacutenh Nhagrave A3 Trường Đại học Baacutech Khoa ndash ĐHQG TpHCM
Yuhui Deng middot FrankWang middot Adrian Ciura Ant colony optimization inspired resource discovery in P2P Grid systems Springer Science+Business Media LLC 2008
Zenggang Xiong 12 Yang Yang1 Xuemin Zhang2 Fu Chen110485791048579104857910485791048579Li Liu1 Integrating Genetic and Ant Algorithm into P2P Grid Resource Discovery School of Information Engineering University of Science and Technology Beijing Beijing 100083 China
Tran Vu Pham A Collaborative e-Science Architecture for Distributed Scientific Communities The University of Leeds School of Computing October 2006
37
Thank You
Resources
Name of attribute | Type | Description
TotalCPUTime | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPU seconds required across all CPUs used to execute the job.
TotalCPUCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of CPUs required for this job submission.
TotalPhysicalMemory | jsdl:RangeValue_Type | This element is a range value specifying the required amount of physical memory for the entire job across all resources.
TotalVirtualMemory | jsdl:RangeValue_Type | This element is a range value specifying the required total amount of virtual memory for the job submission.
TotalDiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required total amount of disk space that should be allocated to the job.
TotalResourceCount | jsdl:RangeValue_Type | This element is a range value specifying the total number of resources required by the job.
Resources
CandidateHosts
HostName | xsd:string | This element is a simple type containing a single name of a host.
Resources
FileSystem
Description | xsd:string |
MountPoint | xsd:string | This is a string that describes a local location that MUST be made available in the allocated resources for the job.
MountSource | xsd:string | This is a string that describes a remote location that MUST be made available locally for the job.
DiskSpace | jsdl:RangeValue_Type | This is a range value that describes the required amount of disk space on the containing FileSystem element for the job.
FileSystemType | jsdl:FileSystemTypeEnumeration | This is a token that describes the type of filesystem of the containing FileSystem element.
Resources
OperatingSystem
OperatingSystemType | complex type | This is a complex type that contains the name of the operating system.
OperatingSystemVersion | xsd:string | This element is a string that defines the version of the operating system required by the job.
Description | xsd:string |
Resources
OperatingSystemType
OperatingSystemName | jsdl:OperatingSystemTypeEnumeration | This is a token type that contains the name of the operating system.
CPUArchitecture
CPUArchitectureName | jsdl:ProcessorArchitectureEnumeration | This element is a token specifying the CPU architecture required by the job in the execution environment.
22
RDSResults EDAGrid
<RDSResultSet count = <num> >
  <Discovery rank = "" id = "">
    <Resource rank = "" id = "" nodeCount = "" >
      <WSGRAM/> <ReservationAddress/>
      <OSName/> <OSVersion/> <Platform/>
    </Resource> +
  </Discovery>
</RDSResultSet>
23
RDSResults
<RDSResultSet count = <num> >
  <Discovery rank = "" id = "">
    <Individual rank = "" id = "" nodeCount = "" >
    </Individual>
    <Total/>
    <InteractBandwidth/>
  </Discovery>
</RDSResultSet>
24
Several Approaches
Resources description
– Global Grid Forum: JSDL
– EDAGrid: RDSResult
Matching method
– EDAGrid: Ranking, Set-matching
Forwarding method
– Napster: Centralized Indexing
– Gnutella: Flooding Query
– Chord: Indexing Using Distributed Hash Tables
– HyperCuP system: Interests-based
– Ant Colony Optimizing
25
Ranking EDAGrid
FUNCTION Ranking_Algorithm
  FOR each (Ri satisfying the individual condition)
    Rank(Ri) = 0
    FOR each (Aj, an attribute of the job)
      Rank(Ri) = Rank(Ri) + (w[j] * R[i][j] / A[j])
    Next A
  Next R
  Sort the resource set
  Return the list of resources in rank order
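The ranking pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the EDAGrid implementation: the operators in the score (weight times offered value divided by requested value) are reconstructed from the slide, the "individual condition" is assumed to mean that every offered value meets the requested value, and the site names and numbers are made up.

```python
from typing import Dict, List, Tuple


def rank_resources(resources: Dict[str, List[float]],
                   job_attrs: List[float],
                   weights: List[float]) -> List[Tuple[str, float]]:
    """Score each resource against the job attributes and sort by rank."""
    ranked = []
    for name, values in resources.items():
        # Assumed "individual condition": every attribute meets the request.
        if not all(r >= a for r, a in zip(values, job_attrs)):
            continue
        score = 0.0
        for w, r, a in zip(weights, values, job_attrs):
            score += w * r / a  # Rank(Ri) += w[j] * R[i][j] / A[j]
        ranked.append((name, score))
    # Highest rank first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical job asking for 4 CPUs and 8 GB of memory, equally weighted.
job = [4.0, 8.0]
weights = [0.5, 0.5]
sites = {
    "site-a": [4.0, 8.0],
    "site-b": [8.0, 16.0],
    "site-c": [6.0, 8.0],
    "site-d": [2.0, 8.0],  # too few CPUs, filtered out
}
ranking = rank_resources(sites, job, weights)
print(ranking)
```

Dividing by the requested value normalizes attributes with different units, so a resource that offers twice what the job asks for contributes twice its weight.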
26
Set-matching EDAGrid
The set-matching algorithm:
– Create an empty set.
– Add resources to the set one at a time, from the highest rank to the lowest.
– After each addition, check whether the Total condition is met and whether the InteractBandwidth condition is violated.
– Terminate the loop when these conditions are satisfied or when the number of grid nodes reaches the number the user required.
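The set-matching steps above might look like the following Python sketch. The slide does not define the two conditions precisely, so this rests on assumptions: the Total condition is taken to be an aggregate capacity the chosen set must reach, and a resource whose bandwidth falls below a minimum is taken to violate InteractBandwidth and is skipped. All names and numbers are hypothetical.

```python
from typing import List, Tuple


def set_match(ranked: List[Tuple[str, float, float]],
              total_required: float,
              min_bandwidth: float,
              max_nodes: int) -> List[str]:
    """Greedily build a resource set from a list sorted by rank.

    Each entry is (name, capacity, bandwidth), highest rank first.
    """
    chosen: List[str] = []
    total = 0.0
    for name, capacity, bandwidth in ranked:
        # Assumed InteractBandwidth check: skip resources that are too slow.
        if bandwidth < min_bandwidth:
            continue
        chosen.append(name)
        total += capacity
        # Stop once the assumed Total condition holds or the node limit is hit.
        if total >= total_required or len(chosen) >= max_nodes:
            break
    return chosen


candidates = [
    ("a", 4.0, 100.0),
    ("b", 2.0, 10.0),   # bandwidth too low, skipped
    ("c", 3.0, 100.0),
    ("d", 5.0, 100.0),
]
print(set_match(candidates, total_required=7.0, min_bandwidth=50.0, max_nodes=3))
```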
27
Several Approaches
Resources description
– Global Grid Forum: JSDL
– EDAGrid: RDSResult
Matching method
– EDAGrid: Ranking, Set-matching
Forwarding method
– Napster: Centralized Indexing
– Gnutella: Flooding Query
– Chord: Indexing Using Distributed Hash Tables
– HyperCuP system: Interests-based
– Ant Colony Optimizing
28
Centralized Indexing
Proposes centralized management of all the resources of all the grid sites.
A server machine holds the index of all available resources on the network.
Users start the query process by sending a request to the index server.
The server answers users based on the information it has stored.
Advantage: queries are answered quickly.
Disadvantages:
– The server machine becomes a bottleneck.
– The index information must be updated continuously.
– Not suitable for P2P.
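A centralized index can be sketched as a single object that every site reports to and every client queries. The class name, attribute names, and the "at least this much" matching rule below are assumptions chosen for illustration; they are not Napster's protocol or any VN-Grid API.

```python
from typing import Dict, List


class IndexServer:
    """Single server holding the resource index for all grid sites."""

    def __init__(self) -> None:
        self._index: Dict[str, Dict[str, float]] = {}

    def register(self, site: str, resources: Dict[str, float]) -> None:
        # Sites must re-register whenever their resources change,
        # which is the continuous-update cost noted above.
        self._index[site] = dict(resources)

    def query(self, **minimums: float) -> List[str]:
        # Answer from the stored information only; no site is contacted.
        return sorted(
            site for site, res in self._index.items()
            if all(res.get(key, 0) >= value for key, value in minimums.items())
        )


server = IndexServer()
server.register("site-a", {"cpus": 8, "memory_gb": 16})
server.register("site-b", {"cpus": 2, "memory_gb": 64})
print(server.query(cpus=4))
```

Every lookup is one round trip to the server, which is why answers are fast and also why the server is the bottleneck.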
29
Flooding
The query is routed from one peer to all of its neighbors. In this way, the query is sent throughout the network.
If a peer finds the resources in its local storage, it sends the answer to the original peer that made the request.
A Time-To-Live (TTL) limits the number of hops a request can travel, so that after a certain number of hops the request automatically disappears from the network.
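TTL-limited flooding can be sketched as a breadth-first traversal in which each forwarded copy carries a decremented hop budget. The graph, the peer names, and the message counter below are hypothetical, and duplicate suppression by a "seen" set is an assumption the slide does not mention.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def flood_query(graph: Dict[str, List[str]],
                has_resource: Set[str],
                origin: str,
                ttl: int) -> Tuple[List[str], int]:
    """Flood a query from origin; return (peers that answered, messages sent)."""
    seen = {origin}
    hits: List[str] = []
    messages = 0
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if peer in has_resource:
            hits.append(peer)  # this peer would reply to the origin
        if remaining == 0:
            continue  # TTL exhausted: the request is dropped here
        for neighbor in graph[peer]:
            messages += 1  # every forward costs a message, even a duplicate
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits, messages


network = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
print(flood_query(network, {"E"}, "A", ttl=2))
print(flood_query(network, {"E"}, "A", ttl=3))
```

Peer E is three hops from A, so a TTL of 2 misses it while a TTL of 3 finds it at the cost of more messages.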
30
Indexing Using Distributed Hash Tables
In this method, each peer in the network holds a partition of the hash table.
Each entry in the hash table covers part of the key space and points to the peer where the searched file can be found.
When a file is requested, the file name is hashed by a uniform hash function.
Based on the hash value and the hash table, the lookup value is found and returned to the requester.
The cost of this method consists of the cost of building and updating the hash table and of routing the query to the location of the searched file.
Disadvantage: it does not apply to complex queries.
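The partitioning idea can be sketched with a small hash ring in which each key is owned by the first peer whose identifier follows the key's hash. This is a simplified, single-process illustration under stated assumptions: the 16-bit ring size, the SHA-1 based hash, and the class and peer names are all made up, and the multi-hop routing that a system like Chord performs is not modeled.

```python
import hashlib
from bisect import bisect_left
from typing import Dict, List, Optional

RING_BITS = 16  # assumed identifier space of 2**16 positions


def ring_hash(key: str) -> int:
    """Map a string uniformly onto the ring."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (1 << RING_BITS)


class HashRing:
    def __init__(self, peers: List[str]) -> None:
        self._points = sorted((ring_hash(p), p) for p in peers)
        # Each peer stores only the entries whose keys it owns.
        self._store: Dict[str, Dict[str, str]] = {p: {} for p in peers}

    def owner(self, key: str) -> str:
        ids = [pid for pid, _ in self._points]
        # Successor of the key's hash, wrapping around the ring.
        i = bisect_left(ids, ring_hash(key)) % len(ids)
        return self._points[i][1]

    def publish(self, file_name: str, location: str) -> None:
        self._store[self.owner(file_name)][file_name] = location

    def lookup(self, file_name: str) -> Optional[str]:
        return self._store[self.owner(file_name)].get(file_name)


ring = HashRing(["peer-1", "peer-2", "peer-3"])
ring.publish("data.csv", "peer-2:/files/data.csv")
print(ring.owner("data.csv"), ring.lookup("data.csv"))
```

Because the lookup hashes the exact file name, a query such as "any site with at least 8 CPUs" has no key to hash, which illustrates why complex queries are a poor fit.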
31
Interests-based Methods
This method is based on the interests of users.
The idea is to search the peers that seem likely to contain what users have requested.
To achieve this, peers are organized into groups of similar interest.
Search queries are therefore forwarded to the matching interest group, to get a high hit rate and to reduce the time wasted searching other peers.
Disadvantages:
– A peer's interests may change over time.
– Peers may have more than one interest.
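Interest-based forwarding can be sketched as a mapping from an interest label to the peers in that group, with queries sent only to the matching group. The class, the string labels, and the peer names are hypothetical; this is not the HyperCuP protocol, only an illustration of the grouping idea and of the two disadvantages listed above.

```python
from typing import Dict, List, Set


class InterestGroups:
    """Peers grouped by declared interest; queries go to one group only."""

    def __init__(self) -> None:
        self._groups: Dict[str, Set[str]] = {}

    def join(self, peer: str, interest: str) -> None:
        # A peer may join several groups (peers can have many interests).
        self._groups.setdefault(interest, set()).add(peer)

    def leave(self, peer: str, interest: str) -> None:
        # Interests change over time, so membership must be maintained.
        self._groups.get(interest, set()).discard(peer)

    def route(self, interest: str) -> List[str]:
        """Return the peers a query about this interest is forwarded to."""
        return sorted(self._groups.get(interest, set()))


groups = InterestGroups()
groups.join("p1", "mpi")
groups.join("p2", "mpi")
groups.join("p2", "storage")
groups.join("p3", "storage")
print(groups.route("mpi"))
```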
32
Ant Colony Optimizing
Ants start from their nests and wander randomly.
The ants that find food return to their nests from memory and drop pheromone on their trails.
Other ants that come across such a trail follow it to the food instead of wandering randomly.
If they find the food, they return home and reinforce the pheromone on the trail.
A key point is that the pheromone evaporates over time. The longer an ant takes to travel back to its nest, the more pheromone evaporates.
When an ant reaches an intersection, it has to decide which branch to take.
Ants that take a short branch complete the trip faster than those that take a long branch, so the pheromone density on the short branch remains higher.
Other ants are more likely to choose a branch with a higher pheromone density.
Eventually, all the ants that go to get the food take the shortest branch.
33
Ant Colony Optimizing
Initially the ants may take the paths A→B→C→E, A→B→E, or A→B→D→E.
After the initial stage, most of the ants will take the shortest path, A→B→E.
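The convergence on the shortest path can be illustrated with a small deterministic model of the three paths above. Rather than simulating individual random ants, this sketch tracks the expected pheromone per path: ants choose a path in proportion to its pheromone, deposit an amount inversely proportional to the path length, and a fixed fraction evaporates each round. The evaporation rate, deposit amount, round count, and hop counts as path lengths are all assumptions.

```python
from typing import Dict

# Path lengths in hops, taken from the three paths named above.
PATHS = {"A-B-C-E": 3, "A-B-E": 2, "A-B-D-E": 3}


def simulate(paths: Dict[str, int],
             rounds: int = 100,
             evaporation: float = 0.1,
             deposit: float = 1.0) -> Dict[str, float]:
    """Return the share of ants expected on each path after the given rounds."""
    pheromone = {p: 1.0 for p in paths}  # all paths equally attractive at first
    for _ in range(rounds):
        total = sum(pheromone.values())
        share = {p: pheromone[p] / total for p in paths}
        for p, length in paths.items():
            # Evaporate, then add the deposit of the ants that chose this path.
            # Shorter paths receive more pheromone per ant.
            pheromone[p] = ((1 - evaporation) * pheromone[p]
                            + share[p] * deposit / length)
    total = sum(pheromone.values())
    return {p: pheromone[p] / total for p in paths}


shares = simulate(PATHS)
for path, fraction in shares.items():
    print(path, round(fraction, 3))
```

Each round multiplies a path's pheromone by a factor that is larger for shorter paths, so the share of A→B→E grows steadily while the two three-hop paths fade at the same rate.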
34
Overview
Introduction of problems Several approaches Solution model
35
36
Reference
[1] Thien-Nga Nguyen-Vu, Information Service API Specification, VNGRID PROJECT
[2] Tran Vu Pham, Lydia M.S. Lau, and Peter M. Dew, An Ontology-based Adaptive Approach to P2P Resource Discovery in Distributed, School of Computing, University of Leeds, Leeds, UK
[3] Tuan Anh Nguyen, VN-Grid_design-Oct_1, VNGrid Project
[4] Project Overview, VN-Grid Project wiki, http://www.cse.hcmut.edu.vn/~vngrid/wiki/Project_Overview
[5] [JSDL] Job Submission Description Language (JSDL) Specification, Version 1.0, http://forge.gridforum.org/projects/jsdl-wg
Nguyễn Quang Hùng, Nguyễn Thanh Sơn, User-Driven Grid Resource Discovery, Khoa Khoa Học & Kỹ Thuật Máy Tính, Nhà A3, Trường Đại học Bách Khoa, ĐHQG Tp.HCM
Yuhui Deng, Frank Wang, Adrian Ciura, Ant colony optimization inspired resource discovery in P2P Grid systems, Springer Science+Business Media LLC, 2008
Zenggang Xiong, Yang Yang, Xuemin Zhang, Fu Chen, Li Liu, Integrating Genetic and Ant Algorithm into P2P Grid Resource Discovery, School of Information Engineering, University of Science and Technology Beijing, Beijing 100083, China
Tran Vu Pham, A Collaborative e-Science Architecture for Distributed Scientific Communities, The University of Leeds, School of Computing, October 2006
37
Thank You
Resources
This element is a token specifying the CPU architecture required by the job in the execution environment
jsdlProcessorArchitectureEnumerationCPUArchitectureName
CPUArchitecture
This is a token type that contains the name of the operating system
jsdlOperatingSystemTypeEnumerationOperatingSystemName
OperatingSystemType
22
RDSResults EDAGrid
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltResource rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltWSGRAMgtltReservationAddressgt
ltOSNamegt ltOSVersiongt ltPlatformgt
ltResourcegt +ltDiscoverygt
ltRDSResultSetgt
23
RDSResults
ltRDSResultSet count = ltnumgt gtltDiscovery rank = ldquordquo id = ldquordquogt
ltIndividual rank = ldquordquo id = ldquordquo nodeCount = ldquordquo gt
ltIndividualgtltTotalgtltInteractBandwidthgt
ltDiscoverygt ltRDSResultSetgt
24
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
25
Ranking EDAGrid
FUN CTION Ranking_AlgorithmFOR each (Ri satisfy the individual condition)Rank(Ri) = 0FOR each (Aj is the attribute of job)Rank(Ri) = Rank(Ri) + (w[j] R[ij]A[j])Next ANext RSort the Resource SetReturn list of resource with order
26
Set-matching EDAGrid
The set-matching algorithm Create an empty set Add the resources into the set with the higher
rank one after one the lower Each time check if the Total condition is met
and if the InteractBandwidth is violated Terminate the loop if these conditions are
satisfied or the number of gridnodes reaches the number user required
27
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
28
Centralized Indexing
Proposes a centralization management for the whole resources of all the grid sites
There is a server machine which holds all the index of available resources on the network
Users start the query process by sending the request to the index server
The server will send the answers to the users bases on the information stored
Advantages quickly Disadvantages
ndash The bottleneck at the server machinendash Update the information continuouslyndash Not suitable to the P2P
29
Flooding
The query will be routed from one peer to all of its neighbors By this way the query will be sent throughout the network If the peer finds out the resources in its local storage it will send the
answer to the original peer who makes the request Using Time-To-Live (TTL) to limit the number of hops a request could
be sent so that after a certain times to be sent the request will automatically disappear out of the network
30
Indexing Using Distributed Hash Tables
In this method each peer in the network holds a partition of the hash table.
Each entry in the hash table maps a key to the peer where the requested file can be found.
When a file is requested, the file name is hashed with a uniform hash function.
Based on the hash value and the hash table, the lookup value is found and returned to the requester.
The cost of this method consists of the cost of building and updating the hash table and of routing the query to the location of the requested file.
Disadvantage: it does not apply to complex queries.
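The partitioning idea can be sketched with a small hash ring. This is a simplified illustration and not Chord itself: it keeps every partition in one process and omits multi-hop routing, and the names `key_of` and `HashRing` are hypothetical.

```python
import hashlib
from bisect import bisect_left


def key_of(name, bits=16):
    """Hash a name uniformly into a 2**bits identifier space."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class HashRing:
    """Each peer owns the keys between its predecessor's id and its own id."""

    def __init__(self, peers):
        self._ids = sorted((key_of(p), p) for p in peers)
        self._tables = {p: {} for p in peers}  # one partition per peer

    def owner(self, name):
        ids = [peer_id for peer_id, _ in self._ids]
        # First peer id at or after the key, wrapping around the ring.
        pos = bisect_left(ids, key_of(name)) % len(ids)
        return self._ids[pos][1]

    def publish(self, name, location):
        self._tables[self.owner(name)][name] = location

    def lookup(self, name):
        return self._tables[self.owner(name)].get(name)
```

Because the key is a hash of the exact file name, only exact-match lookups work, which is why complex queries (ranges, several attributes at once) are listed as a disadvantage.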
31
Interests-based Methods
This method is based on the interests of users.
The idea is to search the peers that seem likely to hold what users have requested.
To achieve this, peers are organized into groups of similar interest.
Search queries are therefore forwarded to the matching interest group, which gives a high hit rate and avoids time wasted searching other peers.
Disadvantages:
- a peer's interests may change over time
- a peer may have more than one interest
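A minimal sketch of interest-based forwarding, assuming the groups are already known and stored as a plain dictionary. The function name and data layout are hypothetical; HyperCuP's actual topology is not modelled.

```python
def route_by_interest(groups, holdings, topic, wanted):
    """Search only the peers in the group registered for `topic`.

    groups:   dict mapping interest topic -> list of peers
    holdings: dict mapping peer -> set of locally stored resources
    Returns (hits, peers_contacted).
    """
    candidates = groups.get(topic, [])
    hits = [p for p in candidates if wanted in holdings.get(p, set())]
    return hits, len(candidates)
```

Only the peers of one group are contacted, so the search cost depends on the group size rather than the network size. A peer with several interests would have to appear in several groups, and the groups go stale when interests change.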
32
Ant Colony Optimizing
Ants start from their nests and wander randomly.
Ants that find food return to their nests from memory and drop pheromone on the trail.
Other ants that come across such a trail follow it to check for food instead of wandering randomly.
If they find food, they return home and reinforce the pheromone on the trail.
A key point is that pheromone evaporates over time: the longer an ant takes to travel back to its nest, the more pheromone evaporates.
When an ant reaches an intersection, it has to decide which branch to take.
Ants that take a short branch complete the trip faster than those that take a long branch, so the pheromone density on the short branch remains higher.
Other ants are more likely to choose the branch with the higher pheromone density.
Eventually all the ants that go to get the food take the shortest branch.
33
Ant Colony Optimizing
Initially the ants may take the paths A→B→C→E, A→B→E, or A→B→D→E.
After the initial stage, most of the ants will take the shortest path, A→B→E.
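The convergence on the shortest path can be illustrated with a small deterministic model. This is a sketch under assumptions, not the algorithm from the cited papers: it tracks expected pheromone levels instead of simulating individual ants, it measures path length in hops, and the evaporation rate and round count are arbitrary illustrative values.

```python
def simulate_pheromone(path_lengths, rounds=50, evaporation=0.1):
    """Expected-value pheromone update over a fixed set of paths.

    path_lengths: dict mapping path name -> length in hops
    Returns the final pheromone level per path.
    """
    pheromone = {path: 1.0 for path in path_lengths}  # no initial preference
    for _ in range(rounds):
        total = sum(pheromone.values())
        for path, length in path_lengths.items():
            share = pheromone[path] / total  # fraction of ants choosing it
            deposit = share / length         # shorter trip, denser trail
            # Old pheromone evaporates, new pheromone is added.
            pheromone[path] = (1 - evaporation) * pheromone[path] + deposit
    return pheromone
```

With the three paths from the slide, `simulate_pheromone({"A-B-C-E": 3, "A-B-E": 2, "A-B-D-E": 3})` leaves the most pheromone on A-B-E, because each round the shorter path receives a larger deposit and therefore attracts a larger share of ants in the next round.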
34
Overview
Introduction of problems
Several approaches
Solution model
37
Thank You
22
RDSResults EDAGrid
<RDSResultSet count = <num> >
  <Discovery rank = "" id = "">
    <Resource rank = "" id = "" nodeCount = "" >
      <WSGRAM> <ReservationAddress>
      <OSName> <OSVersion> <Platform>
    </Resource> +
  </Discovery>
</RDSResultSet>
23
RDSResults
<RDSResultSet count = <num> >
  <Discovery rank = "" id = "">
    <Individual rank = "" id = "" nodeCount = "" >
    </Individual>
    <Total>
    <InteractBandwidth>
  </Discovery>
</RDSResultSet>
25
Ranking EDAGrid
FUNCTION Ranking_Algorithm
  FOR each (Ri satisfying the individual condition)
    Rank(Ri) = 0
    FOR each (Aj, an attribute of the job)
      Rank(Ri) = Rank(Ri) + (w[j] R[i,j] A[j])
    Next A
  Next R
  Sort the resource set
  Return the ordered list of resources
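The ranking loop can be sketched in Python. The operators inside the weighted term did not survive in the slide text, so this sketch assumes the term is w[j] * R[i,j] / A[j], that is, the weight times the ratio of the offered value to the value the job requires. That formula, the dictionary layout, and the function name are assumptions rather than the EDAGrid definition. The resources passed in are assumed to have already satisfied the individual condition.

```python
def rank_resources(resources, job, weights):
    """Return resource names ordered from highest to lowest rank.

    resources: dict mapping resource name -> dict of attribute values
    job:       dict mapping attribute -> value required by the job
    weights:   dict mapping attribute -> weight w[j]
    """
    ranks = {}
    for name, attrs in resources.items():
        rank = 0.0
        for attr, required in job.items():
            # Assumed term: weight * (offered value / required value).
            rank += weights[attr] * attrs[attr] / required
        ranks[name] = rank
    return sorted(ranks, key=ranks.get, reverse=True)
```

Under this assumed formula, a resource that offers more of a heavily weighted attribute than the job requires is ranked ahead of one that only just meets the requirement.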
27
Several Approaches
Resources descriptionndash Global Grid Forum JSDL ndash EDAGrid RDSResult
Matching methodndash EDAGrid Ranking Set-matching
Forwarding methodndash Napster Centralized Indexingndash Gnutella Flooding Queryndash Chord Indexing Using Distributed Hash Tablesndash HyperCuP system Interests-basedndash Ant Colony Optimizing
28
Centralized Indexing
Proposes centralized management of all the resources of every grid site.
A server machine holds the index of all available resources on the network.
Users start a query by sending the request to the index server.
The server answers users based on the information it has stored.
Advantage: fast lookup.
Disadvantages:
– The server machine becomes a bottleneck
– The information must be updated continuously
– Not suitable for P2P
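The register-then-query cycle described above can be sketched as follows. This is a minimal illustration, not the Napster protocol or any VN-Grid API: the class `IndexServer`, its methods, and the site and resource names are all hypothetical.

```python
from collections import defaultdict


class IndexServer:
    """Hypothetical central index: resource name -> sites that offer it."""

    def __init__(self):
        self._index = defaultdict(set)

    def register(self, site, resources):
        # Every site must push its state here, which is why the
        # server becomes the bottleneck and needs constant updates.
        for name in resources:
            self._index[name].add(site)

    def unregister(self, site):
        # Remove a departing site from every entry.
        for sites in self._index.values():
            sites.discard(site)

    def query(self, name):
        # A single lookup answers the request, hence the speed.
        return sorted(self._index.get(name, ()))


server = IndexServer()
server.register("site-a", ["MPI", "storage"])
server.register("site-b", ["MPI"])
print(server.query("MPI"))
```

Every query and every update passes through the one `IndexServer` object, which makes the lookup fast and also shows where the bottleneck comes from.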
29
Flooding
The query is routed from a peer to all of its neighbors. In this way the query spreads throughout the network.
If a peer finds the resource in its local storage, it sends the answer to the original peer that made the request.
A Time-To-Live (TTL) limits the number of hops a request can travel, so that after a certain number of hops the request automatically disappears from the network.
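A small sketch of TTL-limited flooding, assuming the overlay is given as an adjacency dictionary. The function `flood_query`, the peer names, and the `seen` set used to suppress duplicate deliveries are illustrative assumptions, not part of the Gnutella specification.

```python
from collections import deque


def flood_query(neighbors, holdings, origin, resource, ttl):
    """Return the peers that hold `resource` within `ttl` hops of `origin`."""
    hits = []
    seen = {origin}                      # peers that already got the query
    frontier = deque([(origin, ttl)])
    while frontier:
        peer, hops_left = frontier.popleft()
        if resource in holdings.get(peer, ()):
            hits.append(peer)            # this peer answers the origin
        if hops_left == 0:
            continue                     # TTL expired: the query dies here
        for nxt in neighbors.get(peer, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, hops_left - 1))
    return hits


# Hypothetical chain A - B - C - D; only D holds the resource.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
holdings = {"D": {"cpu"}}
print(flood_query(neighbors, holdings, "A", "cpu", ttl=2))
print(flood_query(neighbors, holdings, "A", "cpu", ttl=3))
```

With `ttl=2` the query stops at C and finds nothing; with `ttl=3` it reaches D. The trade-off is visible: a small TTL saves traffic but can miss resources that do exist.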
30
Indexing Using Distributed Hash Tables
In this method each peer in the network holds a partition of the hash table.
Each entry in the hash table covers part of the key space and points to the peer where the searched file can be found.
When a file is requested, the file name is hashed by a uniform hash function.
Based on the hash value and the hash table, the lookup value is found and returned to the requester.
The cost of this method consists of the cost of building and updating the hash table and of routing the query to the location of the searched file.
Disadvantage: it does not apply to complex queries.
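The lookup steps above can be sketched with a simplified hash ring. This is only an assumption-laden illustration: it keeps the whole ring in one process and omits Chord's finger tables and hop-by-hop routing, and the names `key_of`, `HashRing`, and the peer and file names are hypothetical.

```python
import bisect
import hashlib


def key_of(text, bits=16):
    """Hash a name uniformly into a small key space (2**bits keys)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class HashRing:
    """Each peer owns the keys up to and including its own key."""

    def __init__(self, peers):
        self._ring = sorted((key_of(p), p) for p in peers)
        self._keys = [k for k, _ in self._ring]
        self._store = {p: {} for p in peers}   # per-peer table partition

    def responsible_peer(self, name):
        # First peer clockwise from the key, wrapping around the ring.
        i = bisect.bisect_left(self._keys, key_of(name)) % len(self._ring)
        return self._ring[i][1]

    def publish(self, name, location):
        self._store[self.responsible_peer(name)][name] = location

    def lookup(self, name):
        return self._store[self.responsible_peer(name)].get(name)


peers = ["peer-1", "peer-2", "peer-3"]
ring = HashRing(peers)
ring.publish("data.bin", "site-c")
print(ring.lookup("data.bin"))
```

Only an exact name can be hashed, so a query such as "any site with more than four free CPUs" has no key to look up. That is the limitation on complex queries noted above.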
31
Interests-based Methods
This method is based on the interests of users.
The idea is to search the peers that seem likely to contain what users have requested.
To achieve this, peers are organized into groups with similar interests.
Search queries are therefore forwarded to the matching interest group, which gives a high hit rate and avoids wasting time searching other peers.
Disadvantages:
– A peer's interests may change over time
– Peers may have more than one interest
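As a rough sketch of interest-based forwarding, the grouping and routing can be expressed as below. The functions `build_groups` and `route_query`, the fallback to all peers when no group matches, and the peer and topic names are assumptions made for illustration; they are not taken from HyperCuP.

```python
def build_groups(peer_interests):
    """Map each interest topic to the set of peers that declare it."""
    groups = {}
    for peer, interests in peer_interests.items():
        for topic in interests:
            groups.setdefault(topic, set()).add(peer)
    return groups


def route_query(groups, topic, all_peers):
    """Forward to the matching interest group, else to every peer."""
    members = groups.get(topic)
    return sorted(members) if members else sorted(all_peers)


# Hypothetical peers; p2 belongs to two groups at once.
peer_interests = {
    "p1": {"bio"},
    "p2": {"bio", "physics"},
    "p3": {"physics"},
}
groups = build_groups(peer_interests)
print(route_query(groups, "bio", peer_interests))
print(route_query(groups, "chem", peer_interests))
```

The groups are computed once from declared interests, so they go stale when a peer's interests change, and a peer such as `p2` has to be tracked in several groups.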
32
Ant Colony Optimizing
Ants start from their nests and wander randomly.
Ants that find food return to their nests from memory and drop pheromone on the trail.
Other ants that come across such a trail follow it to the food instead of wandering randomly.
If they find the food, they return home and reinforce the pheromone on the trail.
A key point is that the pheromone evaporates over time: the longer an ant takes to travel back to its nest, the more pheromone evaporates.
When an ant reaches an intersection, it has to decide which branch to take.
Ants on a short branch complete the trip faster than those on a long branch, so the pheromone density on the short branch remains higher.
Other ants are more likely to choose the branch with the higher pheromone density.
Eventually all the ants going for the food take the shortest branch.
33
Ant Colony Optimizing
Initially the ants may take the paths A→B→C→E, A→B→E, or A→B→D→E.
After the initial stage, most of the ants will take the shortest path, A→B→E.
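The convergence on A→B→E can be reproduced with a small deterministic model. It is a sketch under stated assumptions, not the algorithm of the cited papers: every hop is assumed to have unit length, ants are modelled as expected fractions rather than random individuals, and the function `simulate`, the evaporation rate of 0.5, and the 50 rounds are illustrative choices.

```python
def simulate(paths, rounds=50, evaporation=0.5):
    """Return the final fraction of ants choosing each path."""
    pheromone = {p: 1.0 for p in paths}      # all paths start equal
    for _ in range(rounds):
        total = sum(pheromone.values())
        # Ants pick a path in proportion to its pheromone density.
        share = {p: pheromone[p] / total for p in paths}
        for p in paths:
            hops = len(p) - 1
            # Evaporate, then deposit: shorter trips deposit more per round.
            pheromone[p] = (1 - evaporation) * pheromone[p] + share[p] / hops
    total = sum(pheromone.values())
    return {p: pheromone[p] / total for p in paths}


paths = [("A", "B", "C", "E"), ("A", "B", "E"), ("A", "B", "D", "E")]
shares = simulate(paths)
best = max(shares, key=shares.get)
print(" -> ".join(best), round(shares[best], 3))
```

All three paths start with equal pheromone, yet the two-hop path gains a slightly larger deposit each round, and the proportional choice rule amplifies that advantage until almost all ants use it.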
34
Overview
Introduction of problems Several approaches Solution model
35
36
Reference
[1] Thien-Nga Nguyen-Vu. Information Service API Specification. VNGRID Project.
[2] Tran Vu Pham, Lydia M.S. Lau, and Peter M. Dew. An Ontology-based Adaptive Approach to P2P Resource Discovery in Distributed. School of Computing, University of Leeds, Leeds, UK.
[3] Tuan Anh Nguyen. VN-Grid_design-Oct_1. VNGrid Project.
[4] Project Overview, VN-Grid Project wiki. httpwwwcsehcmuteduvn~vngridwikiProject_Overview
[5] [JSDL] Job Submission Description Language (JSDL) Specification, Version 1.0. http://forge.gridforum.org/projects/jsdl-wg
[6] Nguyễn Quang Hùng, Nguyễn Thanh Sơn. User-Driven Grid Resource Discovery. Khoa Khoa Học & Kỹ Thuật Máy Tính, Nhà A3, Trường Đại học Bách Khoa – ĐHQG Tp.HCM.
[7] Yuhui Deng, Frank Wang, Adrian Ciura. Ant colony optimization inspired resource discovery in P2P Grid systems. Springer Science+Business Media, LLC, 2008.
[8] Zenggang Xiong, Yang Yang, Xuemin Zhang, Fu Chen, Li Liu. Integrating Genetic and Ant Algorithm into P2P Grid Resource Discovery. School of Information Engineering, University of Science and Technology Beijing, Beijing 100083, China.
[9] Tran Vu Pham. A Collaborative e-Science Architecture for Distributed Scientific Communities. The University of Leeds, School of Computing, October 2006.
37
Thank You