![Page 1: The Ibis Project: Simplifying Grid Programming & Deployment Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam.](https://reader038.fdocuments.net/reader038/viewer/2022103005/56649d595503460f94a392d8/html5/thumbnails/1.jpg)
The Ibis Project: Simplifying Grid Programming & Deployment
Henri Bal
bal@cs.vu.nl
Vrije Universiteit Amsterdam
The ‘Promise of the Grid’
Efficient and transparent (i.e. easy-to-use) wall-socket computing over a distributed set of resources [Sunderam ICCS’2004, based on Foster/Kesselman]
Parallel computing on grids
● Mostly limited to
  ● trivially parallel applications
    ● parameter sweeps, master/worker
  ● applications that run on one cluster at a time
    ● use grid to schedule application on a suitable cluster
● Our goal: run real parallel applications on a large-scale grid, using co-allocated resources
Efficient wide-area algorithms
● Latency-tolerant algorithms with asynchronous communication
  ● Search algorithms (Awari-solver [CCGrid’08])
  ● Model checkers (DiVinE [PDMC’08])
● Algorithms with hierarchical communication
  ● Divide-and-conquer
  ● Broadcast trees
● …
Reality: ‘Problems of the Grid’
● Performance & scalability
● Heterogeneous
● Low-level & changing programming interfaces
● Connectivity issues
● Fault tolerance
● Malleability
● Writing & deploying grid applications is hard
[Figure: User and Wide-Area Grid Systems]
The Ibis Project
● Goal:
  ● drastically simplify grid programming/deployment
  ● write and go!
Approach (1)
● Write & go: minimal assumptions about execution environment
  ● Virtual Machines (Java) deal with heterogeneity
● Use middleware-independent APIs
  ● Mapped automatically onto middleware
● Different programming abstractions
  ● Low-level message passing
  ● High-level divide-and-conquer
Approach (2)
● Designed to run in dynamic/hostile grid environment
  ● Handle fault-tolerance and malleability
  ● Solve connectivity problems automatically (SmartSockets)
● Modular and flexible: can replace Ibis components by external ones
  ● Scheduling: Zorilla P2P system or external broker
Global picture
Rest of talk
Satin: divide & conquer
Communication layer (IPL)
SmartSockets
Applications
JavaGAT
Zorilla P2P
Outline
● Grid programming
  ● IPL
  ● Satin
  ● SmartSockets
● Grid deployment
  ● JavaGAT
  ● Zorilla
● Applications and experiments
Ibis Design
Ibis Portability Layer (IPL)
● Java-centric “run-anywhere” library
  ● Sent along with the application (jar-files)
  ● Point-to-point, multicast, streaming, …
● Efficient communication
  ● Configured at startup, based on capabilities (multicast, ordering, reliability, callbacks)
  ● Bytecode rewriter avoids serialization overhead
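The idea of configuring communication at startup from a set of required capabilities can be illustrated with a small selection routine. This is only a sketch: `Capability`, `CapabilitySelector`, and the two implementation names are hypothetical and are not the IPL API; the capability names are simply taken from the slide.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical capability names, taken from the list on the slide.
enum Capability { MULTICAST, ORDERING, RELIABILITY, CALLBACKS }

final class CapabilitySelector {
    // Candidate implementations and what each one offers (illustrative values only).
    private final Map<String, Set<Capability>> implementations = new LinkedHashMap<>();

    void register(String name, Set<Capability> offered) {
        implementations.put(name, offered);
    }

    // Returns the first registered implementation that offers every required
    // capability, or null if none qualifies.
    String select(Set<Capability> required) {
        for (Map.Entry<String, Set<Capability>> e : implementations.entrySet()) {
            if (e.getValue().containsAll(required)) {
                return e.getKey();
            }
        }
        return null;
    }
}

final class CapabilityDemo {
    static CapabilitySelector defaultSelector() {
        CapabilitySelector s = new CapabilitySelector();
        s.register("fast-unreliable", EnumSet.of(Capability.MULTICAST));
        s.register("tcp-like",
                EnumSet.of(Capability.RELIABILITY, Capability.ORDERING, Capability.CALLBACKS));
        return s;
    }

    public static void main(String[] args) {
        CapabilitySelector s = defaultSelector();
        // The application states what it needs once, at startup.
        System.out.println(s.select(EnumSet.of(Capability.RELIABILITY, Capability.ORDERING)));
    }
}
```

The application only names what it needs; which implementation ends up serving it is decided once, before any communication takes place.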
Serialization
● Based on bytecode-rewriting
  ● Adds (de)serialization code to serializable types
  ● Prevents reflection overhead during runtime
[Figure: source → Java compiler → bytecode → bytecode rewriter → bytecode → JVMs]
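To make the benefit concrete, the sketch below shows the kind of type-specific code that could be added to a serializable class so that its fields are written and read directly, without looking them up by reflection at run time. It is hand-written for illustration: `Point3D` and its `write`/`read` methods are hypothetical and do not reproduce the code that the Ibis rewriter actually generates.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical type; the two static methods stand in for code a rewriter could add.
final class Point3D {
    final double x, y, z;

    Point3D(double x, double y, double z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // Writes the fields directly, with no reflective field lookup at run time.
    static void write(Point3D p, DataOutputStream out) throws IOException {
        out.writeDouble(p.x);
        out.writeDouble(p.y);
        out.writeDouble(p.z);
    }

    // Reads the fields back in the same order they were written.
    static Point3D read(DataInputStream in) throws IOException {
        double x = in.readDouble();
        double y = in.readDouble();
        double z = in.readDouble();
        return new Point3D(x, y, z);
    }
}

final class SerializationDemo {
    // Serializes a point to a byte array and reads it back.
    static Point3D roundTrip(Point3D p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            Point3D.write(p, out);
            out.flush();
            DataInputStream in =
                    new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return Point3D.read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Point3D copy = roundTrip(new Point3D(1.0, 2.0, 3.0));
        System.out.println(copy.x + " " + copy.y + " " + copy.z);
    }
}
```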
Membership Model
● JEL (Join-Elect-Leave) model
  ● Simple model for tracking resources, supports malleability & fault-tolerance
  ● Notifications of nodes joining or leaving
  ● Elections
● Supports all common programming models
● Centralized and distributed implementations
  ● Broadcast trees, gossiping
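A minimal, centralized, in-memory sketch of the three JEL operations may help to picture the model. All names here (`JelRegistry`, `Listener`, `elect`) are hypothetical; the rule that the first candidate wins an election, and that a leaving node loses the elections it had won, is an assumption made for this example rather than a description of the Ibis implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Centralized, in-memory sketch of a Join-Elect-Leave registry (hypothetical API).
final class JelRegistry {
    interface Listener {
        void joined(String node);
        void left(String node);
    }

    private final List<String> members = new ArrayList<>();
    private final List<Listener> listeners = new ArrayList<>();
    private final Map<String, String> electionWinners = new HashMap<>();

    void addListener(Listener l) {
        listeners.add(l);
    }

    void join(String node) {
        members.add(node);
        for (Listener l : listeners) {
            l.joined(node);
        }
    }

    // A leave (or a detected crash) also invalidates elections the node had won.
    void leave(String node) {
        if (!members.remove(node)) {
            return;
        }
        electionWinners.values().removeIf(node::equals);
        for (Listener l : listeners) {
            l.left(node);
        }
    }

    // The first candidate to ask wins; later callers are told who the winner is.
    String elect(String electionName, String candidate) {
        if (!members.contains(candidate)) {
            throw new IllegalArgumentException("not a member: " + candidate);
        }
        return electionWinners.computeIfAbsent(electionName, k -> candidate);
    }

    List<String> members() {
        return new ArrayList<>(members);
    }
}

final class JelDemo {
    public static void main(String[] args) {
        JelRegistry registry = new JelRegistry();
        registry.join("a");
        registry.join("b");
        System.out.println(registry.elect("master", "a")); // prints a
        System.out.println(registry.elect("master", "b")); // prints a
        registry.leave("a");
        System.out.println(registry.elect("master", "b")); // prints b
    }
}
```

Join and leave notifications are what give an application malleability and fault tolerance hooks; elections give it a simple way to agree on a coordinator.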
Programming models
● Remote Method Invocation (RMI)
● Group Method Invocation (GMI)
● MPJ (MPI Java 'standard')
● Satin (Divide & Conquer)
Satin: divide-and-conquer
● Divide-and-conquer is inherently hierarchical
● More general than master/worker
● Cilk-like primitives (spawn/sync) in Java
● Supports malleability and fault-tolerance
● Supports data-sharing between different branches through Shared Objects
[Figure: fib(5) call tree, with subtrees assigned to cpu 1, cpu 2, and cpu 3]
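The spawn/sync style can be approximated on a single machine with the standard `java.util.concurrent` fork/join framework. The sketch below is not Satin code and does not use Satin's primitives: `fork()` merely plays the role of spawn and `join()` the role of sync, and the work is spread over local threads rather than over grid nodes.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// fork() plays the role of spawn and join() the role of sync; this is the
// standard java.util.concurrent fork/join framework, not Satin.
final class FibTask extends RecursiveTask<Long> {
    private final int n;

    FibTask(int n) {
        this.n = n;
    }

    @Override
    protected Long compute() {
        if (n < 2) {
            return (long) n;
        }
        FibTask left = new FibTask(n - 1);
        left.fork();                                // "spawn": may be stolen by another worker
        long right = new FibTask(n - 2).compute();  // run the other branch in this thread
        return left.join() + right;                 // "sync": wait for the spawned branch
    }

    static long fib(int n) {
        return ForkJoinPool.commonPool().invoke(new FibTask(n));
    }

    public static void main(String[] args) {
        System.out.println(fib(5)); // prints 5
    }
}
```

Each call splits into two independent subproblems, which is what makes the call tree a natural unit for distributing work over CPUs.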
Satin implementation
● Load-balancing is done automatically
  ● Cluster-aware Random Stealing (CRS)
  ● Combines Cilk’s Random Stealing with asynchronous wide-area steals
● Self-adaptive malleability and fault-tolerance
  ● Add/remove machines on the fly
  ● Survive crashes by efficient recomputations/checkpointing
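The combination of local random stealing with asynchronous wide-area steals can be sketched as the victim-selection logic of one idle node. This is a simplified illustration, assuming that a node keeps at most one wide-area steal request in flight while it continues to steal inside its own cluster; `CrsStealer` and its methods are hypothetical names, and no real messages are sent.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Simplified victim selection for one idle node (hypothetical names).
final class CrsStealer {
    private final List<String> localPeers;   // other nodes in this node's cluster
    private final List<String> remotePeers;  // nodes in other clusters
    private final Random random;
    private boolean wideAreaStealPending = false;

    CrsStealer(List<String> localPeers, List<String> remotePeers, Random random) {
        this.localPeers = localPeers;
        this.remotePeers = remotePeers;
        this.random = random;
    }

    // Picks a random remote victim for an asynchronous steal, but only if no
    // wide-area steal is already in flight; returns null otherwise.
    String startWideAreaSteal() {
        if (wideAreaStealPending || remotePeers.isEmpty()) {
            return null;
        }
        wideAreaStealPending = true;
        return remotePeers.get(random.nextInt(remotePeers.size()));
    }

    // Called when the reply (work or "no work") to the wide-area steal arrives.
    void wideAreaReplyReceived() {
        wideAreaStealPending = false;
    }

    // Picks a random local victim; local steals are cheap, so the node keeps
    // issuing them while the wide-area steal is outstanding.
    String nextLocalVictim() {
        if (localPeers.isEmpty()) {
            return null;
        }
        return localPeers.get(random.nextInt(localPeers.size()));
    }
}

final class CrsDemo {
    public static void main(String[] args) {
        CrsStealer stealer = new CrsStealer(
                Arrays.asList("local-1", "local-2"),
                Arrays.asList("remote-1", "remote-2"),
                new Random());
        System.out.println("wide-area victim: " + stealer.startWideAreaSteal());
        System.out.println("local victim:     " + stealer.nextLocalVictim());
        System.out.println("second wide-area: " + stealer.startWideAreaSteal()); // null while pending
    }
}
```

The point of the asynchronous request is that the long wide-area latency is overlapped with useful local stealing instead of leaving the node idle.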
Self-adaptation with Satin
● Adapt #CPUs to level of parallelism
● Migrate work from overloaded to idle CPUs
● Remove CPUs with poor network connectivity
● Add CPUs dynamically when
  ● Level of parallelism increases
  ● CPUs were removed or crashed
● Can also remove/add entire clusters
  ● E.g., for network problems
[Wrzesinska et al., PPoPP’07]
Approach
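● Weighted Average Efficiency (WAE):
  $\mathrm{WAE} = \frac{1}{\#\mathrm{CPUs}} \sum_{i} \mathrm{speed}_i \cdot (1 - \mathrm{overhead}_i)$
  ● $\mathrm{overhead}_i$ is the fraction of idle + communication time of CPU $i$
  ● $\mathrm{speed}_i$ is the relative speed of CPU $i$ (measured periodically)
● General idea: keep WAE between $E_{\min}$ (30%) and $E_{\max}$ (50%)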
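The WAE formula translates directly into a few lines of code. In the sketch below the formula itself follows the slide; the decision rule (release CPUs when WAE drops below Emin, request more when it rises above Emax) is an interpretation of "keep WAE between Emin and Emax", and the class and method names are hypothetical.

```java
final class Wae {
    enum Action { REMOVE_CPUS, KEEP, ADD_CPUS }

    // speed[i]: relative speed of CPU i.
    // overhead[i]: fraction of time CPU i spent idle or communicating.
    static double weightedAverageEfficiency(double[] speed, double[] overhead) {
        if (speed.length == 0 || speed.length != overhead.length) {
            throw new IllegalArgumentException("need one speed and one overhead per CPU");
        }
        double sum = 0.0;
        for (int i = 0; i < speed.length; i++) {
            sum += speed[i] * (1.0 - overhead[i]);
        }
        return sum / speed.length;
    }

    // Assumed policy: too little useful work per CPU means CPUs can be released,
    // very high efficiency means the application could use more CPUs.
    static Action decide(double wae, double eMin, double eMax) {
        if (wae < eMin) {
            return Action.REMOVE_CPUS;
        }
        if (wae > eMax) {
            return Action.ADD_CPUS;
        }
        return Action.KEEP;
    }

    public static void main(String[] args) {
        double[] speed = {1.0, 1.0};
        double[] overhead = {0.5, 0.7};
        double wae = weightedAverageEfficiency(speed, overhead);
        System.out.println(decide(wae, 0.30, 0.50)); // prints KEEP
    }
}
```

With two equally fast CPUs that spend 50% and 70% of their time idle or communicating, WAE is about 0.4, which lies inside the 30% to 50% band.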
Overloaded network link
● Uplink of 1 cluster reduced to 100 KB/s
● Remove badly connected cluster, get new one
[Figure: iteration duration vs. iteration]
Connectivity Problems
● Firewalls & Network Address Translation (NAT) restrict incoming traffic
● Addressing problems
  ● Machines with >1 network interface (IP address)
  ● Machine on a private network (e.g., NAT)
● No direct communication allowed
  ● E.g., between compute nodes and external world
SmartSockets library
● Detects connectivity problems
● Tries to solve them automatically
  ● With as little help from the user as possible
● Integrates existing and several new solutions
  ● Reverse connection setup, STUN, TCP splicing, SSH tunneling, smart addressing, etc.
● Uses network of hubs as a side channel
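One way to picture "tries to solve them automatically" is a chain of connection strategies that are attempted in turn until one succeeds. The sketch below is illustrative only: `ConnectionSetup` is a hypothetical class, not the SmartSockets API, and the two strategies are stand-in predicates that do not open real sockets.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

final class ConnectionSetup {
    // Strategy name -> attempt; an attempt returns true when it manages to
    // connect to the given target.
    private final Map<String, Predicate<String>> strategies = new LinkedHashMap<>();

    void addStrategy(String name, Predicate<String> attempt) {
        strategies.put(name, attempt);
    }

    // Tries each strategy in registration order and returns the name of the
    // first one that succeeds, or null if all of them fail.
    String connect(String target) {
        for (Map.Entry<String, Predicate<String>> e : strategies.entrySet()) {
            if (e.getValue().test(target)) {
                return e.getKey();
            }
        }
        return null;
    }
}

final class ConnectionDemo {
    static ConnectionSetup defaultSetup() {
        ConnectionSetup setup = new ConnectionSetup();
        // Stand-ins for real attempts: a direct connect fails for targets behind
        // a firewall, while a reverse connect (the target dials back) works.
        setup.addStrategy("direct", target -> !target.startsWith("firewalled:"));
        setup.addStrategy("reverse", target -> true);
        return setup;
    }

    public static void main(String[] args) {
        ConnectionSetup setup = defaultSetup();
        System.out.println(setup.connect("open:node1"));       // prints direct
        System.out.println(setup.connect("firewalled:node2")); // prints reverse
    }
}
```

The caller asks for a connection and never has to know which mechanism produced it, which is the sense in which user involvement stays minimal.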
Example
Example
[Maassen et al., HPDC’07 ]
Overview
JavaGAT
Zorilla P2P
JavaGAT
● GAT: Grid Application Toolkit
  ● Makes grid applications independent of the underlying grid infrastructure
● Used by applications to access grid services
  ● File copying, resource discovery, job submission & monitoring, user authentication
● API is currently standardized (SAGA)
  ● SAGA implemented on JavaGAT
Grid Applications with GAT
[Figure: a Grid Application calls File.copy(...) and submitJob(...) on the GAT; the GAT Engine (Remote Files, Monitoring, Info service, Resource Management) uses intelligent dispatching to map the calls onto GridLab, Globus, Unicore, SSH, P2P, or Local, e.g. gridftp and globus]
[van Nieuwpoort et al., SC’07]
Zorilla: Java P2P supercomputing middleware
Zorilla components
● Job management
  ● Handling malleability and crashes
● Robust Random Gossiping
  ● Periodic information exchange between nodes
  ● Robust against Firewalls, NATs, failing nodes
● Clustering: nearest neighbor
● Flood scheduling
  ● Incrementally search for resources at more and more distant nodes
[Drost et al., HPDC’07]
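Flood scheduling, as described above, searches for resources at more and more distant nodes. A minimal sketch of that idea is a hop-limited breadth-first flood over the neighbor graph that is repeated with a growing radius until enough free nodes are found. The class `FloodScheduler`, its methods, and the in-memory graph are assumptions made for illustration, not Zorilla code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

final class FloodScheduler {
    private final Map<String, List<String>> neighbors = new HashMap<>();
    private final Set<String> freeNodes = new HashSet<>();

    // Adds an undirected neighbor link between two nodes.
    void link(String a, String b) {
        neighbors.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        neighbors.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    void markFree(String node) {
        freeNodes.add(node);
    }

    // Returns the free nodes within `radius` hops of `origin`
    // (breadth-first flood with a hop limit).
    List<String> flood(String origin, int radius) {
        List<String> found = new ArrayList<>();
        Map<String, Integer> distance = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        distance.put(origin, 0);
        queue.add(origin);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            int d = distance.get(node);
            if (freeNodes.contains(node)) {
                found.add(node);
            }
            if (d == radius) {
                continue; // hop limit reached, do not forward further
            }
            for (String next : neighbors.getOrDefault(node, Collections.emptyList())) {
                if (!distance.containsKey(next)) {
                    distance.put(next, d + 1);
                    queue.add(next);
                }
            }
        }
        return found;
    }

    // Repeats the flood with a growing radius until enough free nodes are
    // found or maxRadius is reached.
    List<String> findResources(String origin, int needed, int maxRadius) {
        List<String> found = new ArrayList<>();
        for (int radius = 1; radius <= maxRadius; radius++) {
            found = flood(origin, radius);
            if (found.size() >= needed) {
                break;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        FloodScheduler s = new FloodScheduler();
        s.link("a", "b");
        s.link("b", "c");
        s.link("c", "d");
        s.markFree("b");
        s.markFree("d");
        System.out.println(s.findResources("a", 2, 3)); // prints [b, d]
    }
}
```

Starting with a small radius keeps jobs on nearby nodes when possible and only widens the search when local resources are insufficient.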
Overview
Ibis applications
● e-Science (VL-e)
  ● Brain MEG-imaging
  ● Mass spectroscopy
● Multimedia content analysis
● Various parallel applications
  ● SAT-solver, N-body, grammar learning, …
● Other programming systems
  ● Workflow engine for astronomy (D-grid), grid file system, ProActive, Jylab, …
Overview experiments
● DAS-3: Dutch Computer Science grid
● Satin applications on DAS-3
● Zorilla desktop grid experiment
● Multimedia content analysis
● High resolution video processing
DAS-3
272 nodes (AMD Opterons)
792 cores
1 TB memory
LAN: Myrinet 10G, Gigabit Ethernet
WAN (StarPlane): 20-40 Gb/s OPN
Heterogeneous: 2.2-2.6 GHz, single/dual-core, no Myrinet at Delft
Gene sequence comparison in Satin (on DAS-3)
[Figures: speedup on 1 cluster; run times on 5 clusters]
• Divide&conquer scales much better than master-worker
• 78% efficiency on 5 clusters (with 1462 WAN-msgs/sec)
Barnes-Hut (Satin) on DAS-3
• Shared object extension to D&C model improves scalability
• 57% efficiency on 5 clusters (with 1371 WAN-msgs/sec)
[Figures: speedup on 1 cluster; run times on 5 clusters]
Zorilla Desktop Grid Experiment
● Small experimental desktop grid setup
  ● Student PCs running Zorilla overnight
  ● PCs with 1 CPU, 1GB memory, 1Gb/s Ethernet
● Experiment: gene sequence application
  ● 16 cores of DAS-3 with Globus
  ● 16 core desktop grid with Zorilla
  ● Combination, using Ibis-Deploy
Ibis-Deploy deployment tool
[Figure: run times of 3574 sec, 1099 sec, and 877 sec]
• Easy deployment with Zorilla, JavaGAT & Ibis-Deploy
Multimedia content analysis
● Analyzes video streams to recognize objects
● Extract feature vectors from images
  ● Describe properties (color, shape)
  ● Data-parallel task implemented with C++/MPI
● Compute on consecutive images
  ● Task-parallelism on a grid
MMCA application
[Figure: a Client (Java) on a local desktop machine connects through Ibis (Java) to a Broker (Java) on any machine world-wide, which connects through Ibis (Java) to Parallel Horus Servers (C++) on the grid]
MMCA with Ibis
● Initial implementation with TCP was unstable
● Ibis simplifies communication, fault tolerance
● SmartSockets solves connectivity problems
● Clickable deployment interface
● Demonstrated at many conferences (SC’07)
  ● 20 clusters on 3 continents, 500-800 cores
● Frame rate increased from 1/30 to 15 frames/sec
[Seinstra et al., IEEE Multimedia’07]
MMCA
‘Most Visionary Research’ award at AAAI 2007 (Frank Seinstra et al.)
High Resolution Video Processing
● Realtime processing of CineGrid movie data
  ● 3840x2160 (4xHD) @ 30 fps = 1424 MB/sec
● Multi-cluster processing pipeline
  ● Using DAS-3, StarPlane and Ibis
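The 1424 MB/sec figure can be reproduced with a short calculation. The slide does not state the pixel format; the value of 6 bytes per pixel used below is an assumption, chosen because it matches the quoted rate when 1 MB is taken as $2^{20}$ bytes.

```latex
\[
3840 \times 2160\ \text{pixels/frame} \times 30\ \text{frames/s} \times 6\ \text{bytes/pixel}
  = 1\,492\,992\,000\ \text{bytes/s}
\]
\[
\frac{1\,492\,992\,000\ \text{bytes/s}}{2^{20}\ \text{bytes/MB}} \approx 1424\ \text{MB/s}
\]
```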
CineGrid with Ibis
● Use of StarPlane requires no configuration
  ● StarPlane is connected to local Myrinet network
  ● Detected & used automatically by SmartSockets
● Easy setup of application pipeline
  ● Connection administration of application is simplified by the IPL election mechanism
● Simple multi-cluster deployment (Ibis-Deploy)
● Uses Ibis serialization for high throughput
Summary
● Goal: Simplify grid programming/deployment
● Key ideas in Ibis
  ● Virtual machines (JVM) deal with heterogeneity
  ● High-level programming abstractions (Satin)
  ● Handle fault-tolerance, malleability, connectivity problems automatically (Satin, SmartSockets)
  ● Middleware-independent APIs (JavaGAT)
  ● Modular
Acknowledgements
Current members: Rob van Nieuwpoort, Jason Maassen, Thilo Kielmann, Frank Seinstra, Niels Drost, Ceriel Jacobs, Kees Verstoep, Roelof Kemp, Kees van Reeuwijk
Past members: John Romein, Gosia Wrzesinska, Rutger Hofman, Maik Nijhuis, Olivier Aumage, Fabrice Huet, Alexandre Denis
More information
● Ibis can be downloaded from
  ● http://www.cs.vu.nl/ibis
● Papers:
  ● Satin [PPoPP’07], SmartSockets [HPDC’07], Gossiping [HPDC’07], JavaGAT [SC’07], MMCA [IEEE Multimedia’07]
● Ibis tutorials
  ● Next one at CCGrid 2008 (19 May, Lyon)