Nebula: Distributed Edge Cloud for Data-Intensive Computing
Students: Mathew Ryden, Kwangsung Oh
PIs: Abhishek Chandra, Jon Weissman
Motivation
MapReduce is a standard framework for data-intensive computing, but it is designed for a single datacenter.
Widely distributed data does not fit its design assumptions.
Future Work
Explore how external data can be inserted into Nebula for aggregation and decomposition.
Expand the range of data-intensive frameworks and applications ported to Nebula.
Partition resources across frameworks and applications.
Evaluation
Figure: Node Scalability of Nebula. Run time (s) of the MAP and REDUCE phases at 10, 20, and 30 nodes.
Nebula provides scalability.
Figure: Comparison to CSCI and CSDI. Run time (s) of the MAP and REDUCE phases for CSCI, CSDI, and Nebula at 500MB and 1GB.
Nebula outperforms centralized infrastructure.
Figure: Nebula scheduler comparison. Run time (s) of the MAP and REDUCE phases for the Random and LA schedulers at 500MB, 1GB, and 2GB.
Nebula’s location-awareness improves performance.
Figure: Fixed proportion of compute node failures. Run time (s) of the MAP and REDUCE phases with 0% to 70% of compute nodes failing.
Nebula provides fault tolerance.
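The claim that location-awareness improves performance can be illustrated with a small sketch. It is not Nebula's scheduler: the node names, the bandwidth table, and the functions `pick_random` and `pick_location_aware` are hypothetical, and "location" is approximated here by the bandwidth between a data node and a compute node.

```python
import random


def transfer_time_s(size_mb, bandwidth_mbps):
    """Seconds needed to move size_mb megabytes over a link of bandwidth_mbps megabits/s."""
    return size_mb * 8 / bandwidth_mbps


def pick_random(compute_nodes, rng):
    """Baseline: choose any compute node, ignoring where the data lives."""
    return rng.choice(compute_nodes)


def pick_location_aware(data_node, compute_nodes, bandwidth_mbps):
    """Choose the compute node with the best link to the node holding the data."""
    return max(compute_nodes, key=lambda c: bandwidth_mbps[(data_node, c)])


# Hypothetical measurements (Mbit/s) between one data node and three compute nodes.
bandwidth_mbps = {("d1", "c1"): 100.0, ("d1", "c2"): 10.0, ("d1", "c3"): 50.0}
compute_nodes = ["c1", "c2", "c3"]

best = pick_location_aware("d1", compute_nodes, bandwidth_mbps)
best_time = transfer_time_s(500, bandwidth_mbps[("d1", best)])
worst_time = transfer_time_s(500, bandwidth_mbps[("d1", "c2")])
rand = pick_random(compute_nodes, random.Random(0))

print(best, best_time, worst_time)
```

With these made-up numbers, the location-aware choice moves a 500MB input in a tenth of the time of the slowest link, while a random choice may land on any of the three nodes.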
Nebula
A location- and context-aware distributed edge cloud infrastructure.
Uses the computation and storage resources of edge volunteer and dedicated nodes.
Design Goal & Architecture
Volunteer compute and storage nodes can join, leave, or fail at any time.
Distributed data-intensive computing.
Location-aware.
Sandbox for security.
Fault tolerance.
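Because volunteer nodes can leave or fail at any time, one common way to tolerate failures is to re-run a task on another node. The sketch below shows that idea only. It makes no claim about how Nebula implements fault tolerance, and `run_with_retries`, the node names, and the `fails` predicate are all hypothetical.

```python
def run_with_retries(task, nodes, fails):
    """Try the task on each candidate node in order until one succeeds.

    task(node)  -> result of running the task on that node
    fails(node) -> True if the (simulated) node is unavailable
    Returns the result and the list of nodes that were tried.
    """
    attempts = []
    for node in nodes:
        attempts.append(node)
        if not fails(node):
            return task(node), attempts
    raise RuntimeError("all candidate nodes failed")


# Hypothetical scenario: two of three volunteer nodes have dropped out.
failed_nodes = {"v1", "v2"}
result, attempts = run_with_retries(
    task=lambda node: f"done on {node}",
    nodes=["v1", "v2", "v3"],
    fails=lambda node: node in failed_nodes,
)
print(result, attempts)
```

The run completes as long as one candidate node survives, at the cost of the extra attempts.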
MapReduce on Nebula
Nebula MapReduce reduces network traffic.
The data is already dispersed on Nebula.
Only intermediate files need to be sent to produce results.
Nebula MapReduce provides fault tolerance and scalability.
Nebula Application
Dispersed data can be processed in situ, or by nodes close to the data, on the Nebula cloud.
Example: analyzing video feeds from distributed airports for suspicious activity.
Results are sent only when suspicious activity is detected.
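The airport example can be sketched as a map step that runs where the data lives and a reduce step that only sees the small intermediate records. This is an illustration under assumptions, not Nebula code: the site names, the `motion` score, the 0.7 threshold, and the functions `map_in_situ` and `reduce_alerts` are hypothetical.

```python
from collections import Counter


def is_suspicious(frame):
    """Hypothetical detector: flag frames whose motion score exceeds a threshold."""
    return frame["motion"] > 0.7


def map_in_situ(site, frames):
    """Runs at the site holding the frames; emits only small (site, frame_id) records."""
    return [(site, frame_id) for frame_id, frame in frames if is_suspicious(frame)]


def reduce_alerts(intermediate):
    """Runs wherever results are collected; counts alerts per site."""
    return Counter(site for site, _ in intermediate)


# Hypothetical video frames stored at three sites.
sites = {
    "site_a": [(0, {"motion": 0.1}), (1, {"motion": 0.9})],
    "site_b": [(0, {"motion": 0.2}), (1, {"motion": 0.3})],
    "site_c": [(0, {"motion": 0.95}), (1, {"motion": 0.8}), (2, {"motion": 0.1})],
}

intermediate = []
for site, frames in sites.items():
    intermediate.extend(map_in_situ(site, frames))

alerts = reduce_alerts(intermediate)
total_frames = sum(len(frames) for frames in sites.values())
print(dict(alerts), len(intermediate), total_frames)
```

Only the flagged records leave each site, so a site with no suspicious frames sends nothing.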
Figure: Nebula architecture. Labels: Nebula Services, Dedicated Nodes, DataStoreMaster, NebulaCentral, NebulaMonitor, ComputePoolMaster, Volunteer Nodes, Data Nodes, Compute Nodes, Nebula Users.
dcsg.cs.umn.edu