Synergies among Grid, Peer-to-Peer and Cloud Computing (Towards e-Science Communities)
Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa
Luís Antunes [email protected]
Distributed Systems Group INESC-ID Lisboa / Instituto Superior Técnico
2Encontro Ciência 2009 / Science 2009 – Luís Veiga 2009/07/30
03:48
...Why...: e-Science
Most science is becoming e-Science
large data repositories
growing every day
processed in myriads of ways
powerful applications
computationally intensive
increasing demand for resources
...Why...: e-Science Communities
Researchers form natural communities
they tend to gather around...
research areas
tools, instruments, applications
data repositories used
affiliation, geography
projects, consortia
special kind of “social” network
...Why...: Synergies...
Leverage globally available computing resources
harness resources of whatever shape or source
e.g., Clusters, Grids, multiprocessors
P2P voluntary cycle-sharing, Desktop Grids
Utility and Cloud Computing
provide uniform and easy-to-use interface to resources
data storage, sharing, transfer
resource allocation and scheduling
work/task distribution
most e-scientists will not be programmers (no-code)!
...Where...: e-Science At Large...
E-Science examples...
video coding
video and image processing
raytracing, high-res rendering
face recognition in pictures/movies
molecular modeling
chemical reaction simulation
...e-Science At Large...
...E-Science examples
network protocol simulation
financial investments
stock exchange
derivatives
statistical numerical methods, data processing
language/speech processing
...e-Science At Large
What is common to all these e-Science activities?
large amounts of data
complex methods/algorithms
long processing times, resource intensive
no software development / classical programming
– languages, API, sockets, synchronization, MPI, etc.
use mostly pre-developed/deployed applications
– scripting, customization, configuration
– possibly intricate and very advanced
comprises large numbers of parallelizable tasks
– most can be made completely independent
Synergy Vision
[Figure: Resources from P2P, Grid, Utility Computing; Deployed Tasks; Job Result]
...What:...Synergy
Application Execution Model
Gridlet concept: intuitive, simple to use, data-centered
suited to most applications used by researchers
Resource Sharing Architecture
leverage mostly any computing and storage provider
a P2P-based Cloud encompassing Clusters, Grids, PCs
Community Support
social network integration (Facebook, hi5)
deployable via BOINC (SETI@home)
...How:...Gridlet
Gridlet
uniform basis of workload division, computation off-load
chunk of data with associated operations to be performed
– parameters, scripts, configuration files, ...
cost estimate: G$: (CPU, Bandwidth, Memory, Disk)
jobs are gridlets sent to applications
– allow adaptation of unmodified applications
– operation/data transformation via XML policies
intuitive approach to
data-partitions, task-spawning, resource management
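The gridlet idea above (a chunk of data, the operations to apply to it, and a G$ cost vector) can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, field names, units, the fixed-size partitioning, and the linear cost model are all hypothetical and are not taken from the Synergy implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class CostEstimate:
    """Hypothetical G$ vector: (CPU, Bandwidth, Memory, Disk)."""
    cpu: float        # assumed unit: CPU-seconds
    bandwidth: float  # assumed unit: megabytes transferred
    memory: float     # assumed unit: megabytes resident
    disk: float       # assumed unit: megabytes stored


@dataclass
class Gridlet:
    """A chunk of data plus the operations to be performed on it."""
    data: bytes
    application: str   # name of the unmodified application that processes the chunk
    cost: CostEstimate
    parameters: Dict[str, str] = field(default_factory=dict)


def split_into_gridlets(data: bytes, chunk_size: int, application: str) -> List[Gridlet]:
    """Naive fixed-size partitioning; a real partitioner would be format-aware."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    gridlets: List[Gridlet] = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        megabytes = len(chunk) / 1e6
        # Assumed toy cost model: every component grows linearly with chunk size.
        cost = CostEstimate(cpu=len(chunk) * 0.001, bandwidth=megabytes,
                            memory=megabytes, disk=megabytes)
        gridlets.append(Gridlet(data=chunk, application=application, cost=cost))
    return gridlets
```

Concatenating the `data` fields of the returned gridlets in order gives back the original input, which is the property a partitioner needs before results can be reassembled.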
...How:...Infrastructure
Synergy Infrastructure
Extendable peer-to-peer architecture
– harness cycles of desktops, clusters, utility-computing
– gathers asymmetric participants, different capabilities
Hybrid structured/unstructured overlay
– structured: data repository, caching, results, indexes
– unstructured: execution scheduled on any node
Hierarchical overlay
super-peers aggregate information of neighbors
– resources, applications, reputation, cached data, ...
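A super-peer that aggregates information about its neighbors could look roughly like the sketch below. Everything here is an assumption made for illustration: the `PeerInfo` and `SuperPeer` names, the tracked fields, the 0..1 reputation scale, and the selection rule (highest reputation first, then most free CPU) do not come from the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class PeerInfo:
    peer_id: str
    free_cpu: float          # assumed: number of idle CPU cores
    applications: Set[str]   # applications installed on the peer
    reputation: float = 0.0  # assumed scale: 0 (unknown) to 1 (trusted)


@dataclass
class SuperPeer:
    """Keeps a summary of the neighbors registered with it."""
    neighbors: Dict[str, PeerInfo] = field(default_factory=dict)

    def register(self, info: PeerInfo) -> None:
        # A later registration from the same peer replaces the earlier one.
        self.neighbors[info.peer_id] = info

    def summary(self) -> Dict[str, object]:
        """Aggregate view that could be advertised up the hierarchy."""
        peers = list(self.neighbors.values())
        return {
            "total_free_cpu": sum(p.free_cpu for p in peers),
            "applications": set().union(*(p.applications for p in peers)),
        }

    def pick_node(self, application: str) -> Optional[str]:
        """Choose a neighbor with idle CPU that has the application installed."""
        candidates = [p for p in self.neighbors.values()
                      if application in p.applications and p.free_cpu > 0]
        if not candidates:
            return None
        best = max(candidates, key=lambda p: (p.reputation, p.free_cpu))
        return best.peer_id
```

Because execution can be scheduled on any node in the unstructured part of the overlay, `pick_node` only filters on capability and idle capacity; other policies (for example, preferring nodes that already cache the data) would slot into the same place.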
Synergy Infrastructure
[Figure: P2P overlay network; labels: gridlets submitted, gridlets served, gridlets received, gridlets returned]
cloud on overlay/mesh
oceans of gridlets flow across the overlay
lifecycle
cost estimate: G$ = (CPU, BW)
...How:...Community Support
e-Science Infrastructure driven by Communities.
Social network integration
– Facebook, hi5, widgets on web pages
– execute code on idle computers of “friends”
– discover similar interests, e.g., tools, applications
Community-driven portals
– data sets, benchmark data, results
– algorithm, topology, process descriptions
– ask/donate storage and CPU
– code deployable via BOINC (SETI@home)
...What For:...Current and Next Activities
Application Scenarios
– Video Transcoding
– Network Topology/Protocol Simulation
– Raytracing for 3D rendering
– Face Detection on Film Archives (e.g., Cinemateca)
– Synergy VM for transactional-memory applications
Execution Infrastructures (combined)
– P2P cycle-sharing, volunteer computing
– Clustered Virtual Machines (e.g., Java, .Net)
– Grids, Utility Computing Infrastructures
...What For:...Video Transcoding (1)
file splitting
semantics-aware data-partitioning
append/prepend gridlet-data
– complete frames
– movie header information
keep full (intra) & predicted frames
XML description:
– format
– headers
– boundaries
– constraints
– transformations
H I1 P1 P1 P1 I2 P2 P2 I3 P3 P3 P3 P3 I4 P4 P4 I5 P5 P5 I6 P6
H I1 P1 P1 P1
H I2 P2 P2
H I3 P3 P3 P3 P3
H I4 P4 P4
H I5 P5 P5
H I6 P6
[Figure: movie file (e.g., mpg, avi, flv, mov, wmv); XML Format Description; Gridlet Manager]
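The splitting shown on this slide, where each gridlet starts at a full (intra) frame, keeps the predicted frames that follow it, and gets the movie header prepended, can be sketched as follows. This is a toy model under stated assumptions: frames are represented as the string labels used on the slide, a label beginning with "I" is treated as an intra frame, and the function name is hypothetical. Real containers would need the boundaries and constraints given in the XML format description.

```python
from typing import List


def split_at_intra_frames(header: str, frames: List[str]) -> List[List[str]]:
    """Group frames so every gridlet starts at an intra (I) frame and carries the header."""
    gridlets: List[List[str]] = []
    for frame in frames:
        # Start a new gridlet at each intra frame (or at the very first frame).
        if frame.startswith("I") or not gridlets:
            gridlets.append([header])
        # Predicted frames stay with the intra frame they depend on.
        gridlets[-1].append(frame)
    return gridlets


movie = "I1 P1 P1 P1 I2 P2 P2 I3 P3 P3 P3 P3 I4 P4 P4 I5 P5 P5 I6 P6".split()
for gridlet in split_at_intra_frames("H", movie):
    print(" ".join(gridlet))
```

Run on the frame sequence from the slide, this yields six gridlets, one per intra frame, each beginning with a copy of the header.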
...What For:...Video Transcoding (2)
[Figure: XML Format Description; Gridlet Manager]
H' I1' P1' P1' P1' I2' P2' P2' I3' P3' P3' P3' P3' I4' P4' P4' I5' P5' P5' I6' P6'
movie file converted/processed
H' I2' P2' P2'
H' I1' P1' P1' P1'
H' I4' P4' P4'
H' I5' P5' P5'
H' I6' P6'
H' I3' P3' P3' P3' P3'
gather available gridlet-results sent by servicing peers
extract result data & discard headers
reassemble file according to semantics
– new header
– ordering
– constraints
– transformations
special cases:
– discard gridlets
– crypto-challenge
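The gathering step above can be sketched in the same toy model: results arrive in arbitrary order, the per-gridlet header copies are discarded, and the frames are put back in their original order behind a single new header. The function name, the index-keyed dictionary, and the rule of refusing to reassemble while any result is missing are assumptions made for this illustration; the special cases listed on the slide are not modeled.

```python
from typing import Dict, List


def reassemble(results: Dict[int, List[str]], new_header: str, expected: int) -> List[str]:
    """Rebuild the movie from gridlet results keyed by their original position.

    Each result is assumed to hold a per-gridlet header at position 0,
    followed by the processed frames.
    """
    missing = [i for i in range(expected) if i not in results]
    if missing:
        raise ValueError(f"missing gridlet results: {missing}")
    movie = [new_header]
    for index in range(expected):
        movie.extend(results[index][1:])  # drop the per-gridlet header copy
    return movie


# Results in the arrival order shown on the slide: 2, 1, 4, 5, 6, 3.
arrived = {
    1: "H' I2' P2' P2'".split(),
    0: "H' I1' P1' P1' P1'".split(),
    3: "H' I4' P4' P4'".split(),
    4: "H' I5' P5' P5'".split(),
    5: "H' I6' P6'".split(),
    2: "H' I3' P3' P3' P3' P3'".split(),
}
print(" ".join(reassemble(arrived, "H'", expected=6)))
```

Keying results by position rather than by arrival order is what makes the out-of-order delivery harmless.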
...What For:...Network Simulation
COGITARE addresses:
– limits on size & complexity of simulations
– inefficient resource utilization (e.g., multi-core)
– no agnostic topology description languages
– no repository for research result interchange
– absence of a teaching platform
Conclusion
e-Science is becoming dominant
increasing demand for computing resources
harness resources from various sources (P2P, Grid, Cloud)
minority of computer researchers and programmers
intuitive application and resource model
manage activities around communities
Future Work
assessment of financial derivative products
chemical reaction and process simulation
Thank you: Questions?
www.gsd.inesc-id.pt/~lveiga