Post on 30-Dec-2015

Resource Management in Volunteer Computing Grids

An analysis of the different approaches to maximizing throughput on a BOINC grid

Presented by Geoffrey Oxholm and Beata Chrulkiewicz
CS-575 Position Paper Presentation, Fall 2007

Volunteer Grids

• A Type of Grid Computer
  – Decentralized, volunteer nodes
• Supercomputing for free
  – 1.1 PetaFLOPS vs. 360 TeraFLOPS

Image: http://www.di.unipi.it/groups/architetture/images/grid.gif
Image: http://holistic.com.mt/h/?Page=Article&Ref=107

• Unreliable Nodes
  – Users can disconnect their computers anytime
  – Amount of donated resources is subject to change
  – Evil jerks can upload malicious data

Berkeley Open Infrastructure for Network Computing

• Duplicate work to ensure validity
  – R: the “Redundancy Factor”
• Validate computation results. If the validation fails, repeat the computation.
  – Validation Methods:
    • Majority Voting
      – More than R/2 nodes must agree
    • M-First Voting
      – First M nodes must agree
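The two validation methods above can be sketched in a few lines of Python. This is only an illustration of the rules as worded on the slide, not BOINC's actual validator code; the function names `majority_vote` and `m_first_vote` are hypothetical, and results are assumed to be directly comparable values.

```python
from collections import Counter

def majority_vote(results, r):
    """Return the value reported by more than r/2 of the r replicas, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > r / 2 else None

def m_first_vote(results, m):
    """Return the common value if the first m returned results all agree, else None.

    This follows the slide's wording ("first M nodes must agree"); `results`
    is assumed to be ordered by arrival time.
    """
    first = results[:m]
    if len(first) < m:
        return None  # not enough results returned yet
    return first[0] if all(x == first[0] for x in first) else None
```

A return value of None stands for "validation failed", in which case the work unit would be handed out again.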

Image: http://en.wikipedia.org/wiki/Image:BOINC_logo_July_2007.png

Success and Limitations of BOINC

• With proper configuration high throughput can be achieved

• Still quite difficult to get volunteers

• Proper configuration is difficult
• Fixed configurations cannot account for constantly changing grid characteristics

Image: http://www.baseacid.com/imagesRR/workBand.jpg

Fix: User Encouragement Through Feedback and Reward

• Each node generates statistics
• Teams can be formed
• Sense of pride in commitment
• Encourages users to donate more time, resources

Image: http://teamocuk.com/cprojectcred1.php?p=PAH

Team OCUK, Predictor@home total credit. Go team!

Fix: Maximizing Configuration Through Usage Simulation

• Enumerate a set of possible configurations
• Test configurations in a fraction of the time
• Avoid disturbing volunteers by simulating
• Zero in on an effective configuration
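As a rough sketch of the simulation idea, the snippet below enumerates candidate redundancy factors and scores each one against a toy grid model instead of real volunteers. Everything in the model is an assumption made for illustration: each replica is correct with a fixed probability `node_reliability`, incorrect replicas are assumed to agree with each other (a worst case for majority voting), and the names `simulate` and `best_redundancy` are hypothetical.

```python
import random

def simulate(redundancy, node_reliability, work_units=5000, seed=0):
    """Simulate majority voting; return (throughput, error_rate).

    throughput = correctly validated work units per node assignment.
    error_rate = share of accepted work units whose accepted result was wrong.
    """
    rng = random.Random(seed)
    correct = wrong = 0
    for _ in range(work_units):
        good = sum(rng.random() < node_reliability for _ in range(redundancy))
        bad = redundancy - good
        if good > redundancy / 2:
            correct += 1   # correct result wins the vote
        elif bad > redundancy / 2:
            wrong += 1     # colluding bad results win the vote
        # a tie (even redundancy) is rejected and counts toward neither
    accepted = correct + wrong
    throughput = correct / (work_units * redundancy)
    error_rate = wrong / accepted if accepted else 0.0
    return throughput, error_rate

def best_redundancy(candidates, node_reliability, max_error):
    """Pick the candidate with the highest throughput whose error rate is acceptable."""
    best = None
    for r in candidates:
        throughput, error_rate = simulate(r, node_reliability)
        if error_rate <= max_error and (best is None or throughput > best[1]):
            best = (r, throughput)
    return best  # (redundancy, throughput), or None if nothing qualifies
```

For example, `best_redundancy([1, 3, 5], node_reliability=0.9, max_error=0.05)` sweeps three configurations without involving a single volunteer machine.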

Image: http://www.cyberroach.com/tron/tron3_circuit.jpg

Fix: Dynamic Redundancy Through Reliability Prediction

• Wait for a minimum number of nodes before assigning work

• Choose nodes which have higher reliability
• Higher reliability means less need for redundancy

• Successful completion yields higher reliability rating for the node
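The steps above might look roughly like the following. This is a minimal sketch under stated assumptions, not the scheme from any particular paper: reliability is kept as a number in [0, 1] and nudged toward 1 or 0 after each result, nodes are assumed to fail independently, and replicas are added until the chance that at least one is correct reaches a `target`. The names `Node`, `update_reliability`, and `pick_replicas` are hypothetical.

```python
class Node:
    def __init__(self, name, reliability=0.5):
        self.name = name
        self.reliability = reliability  # estimated probability of a correct result

def update_reliability(node, success, step=0.1):
    """Move the rating toward 1.0 after a success and toward 0.0 after a failure."""
    outcome = 1.0 if success else 0.0
    node.reliability += step * (outcome - node.reliability)

def pick_replicas(pool, min_pool, target):
    """Choose the most reliable nodes until the target confidence is reached.

    Returns None while fewer than min_pool nodes are waiting for work.
    """
    if len(pool) < min_pool:
        return None
    chosen = []
    all_fail = 1.0  # probability that every chosen node returns a bad result
    for node in sorted(pool, key=lambda n: n.reliability, reverse=True):
        chosen.append(node)
        all_fail *= 1.0 - node.reliability
        if 1.0 - all_fail >= target:
            break
    return chosen
```

With ratings of 0.9, 0.8, and 0.5 and a target of 0.95, two replicas are enough; with less reliable nodes the same target would need more, which is the sense in which redundancy becomes dynamic.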

Image: http://image.compusa.com/prodimages/44/8537c95c-8027-4840-b976-67deb0690e13.gif

Evaluation

• User Encouragement
  – Encourages cheating
  – Does nothing to maximize efficient use of resources
• Usage Simulation
  – Still requires researchers to configure system
  – Static configuration fails to match dynamic grid
• Reliability Rating
  – Subject to further exploitation
  – Further minimizes the value of slow nodes, working against incentives

Image: GPL Licensed

Conclusion

• Build on existing methods
  – Continue to encourage users
  – Create a starting point by using simulation
  – Update reliability system to avoid conflict with system of incentives
• Develop new technologies
  – Blacklist malicious nodes
  – Develop a more comprehensive reliability system which uses past schedules to predict future availability
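One simple way to picture the last proposal, predicting availability from past schedules, is a per-node table of how often the node was online at each hour of the week. The sketch below is purely illustrative: the `AvailabilityHistory` class, its methods, and the 0.5 default for unseen time slots are assumptions, not part of BOINC or of the presenters' design.

```python
from collections import defaultdict

class AvailabilityHistory:
    """Tracks how often a node was online in each (weekday, hour) slot."""

    def __init__(self):
        self._seen = defaultdict(int)    # observations per slot
        self._online = defaultdict(int)  # observations where the node was online

    def record(self, weekday, hour, online):
        slot = (weekday, hour)
        self._seen[slot] += 1
        if online:
            self._online[slot] += 1

    def predict(self, weekday, hour, default=0.5):
        """Estimate the probability that the node is online in this slot."""
        slot = (weekday, hour)
        if self._seen[slot] == 0:
            return default  # no history yet for this slot
        return self._online[slot] / self._seen[slot]
```

A scheduler could then prefer nodes whose predicted availability covers the expected run time of a work unit.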

Image: http://pixels.dessgeega.com/wp-content/uploads/2006/10/organize_big.gif

Questions?

Image: http://www.grid.phys.uvic.ca/

Geoff Oxholm Beata Churkiewicz