A Map-Reduce System with an Alternate API for Multi-Core Environments
Wei Jiang, Vignesh T. Ravi and Gagan Agrawal
April 20, 2023

Outline
• Introduction
• MapReduce
• Generalized Reduction
• System Design and Implementation
• Experiments
• Related Work
• Conclusion
Motivation
• Growing need for analysis of large-scale data, both scientific and commercial
• Data-Intensive Supercomputing (DISC)
• Map-Reduce has received a lot of attention in the database, data mining, and high-performance computing communities (e.g., this conference!)
Map-Reduce: Positives and Questions
• Positives:
  - Simple API, based on functional languages; very easy to learn
  - Support for fault tolerance, important for very large-scale clusters
• Questions:
  - Performance? Comparison with other approaches?
  - Suitability for different classes of applications?
Class of Data-Intensive Applications
• Many different types of applications
  - Data-center applications: data scans, sorting, indexing
  - More "compute-intensive" data-intensive applications: machine learning, data mining, NLP; Map-Reduce/Hadoop is widely used for this class
  - Standard database operations: a SIGMOD 2009 paper compares Hadoop with databases and OLAP systems
• What is Map-Reduce suitable for? What are the alternatives?
  - MPI/OpenMP/Pthreads: too low-level?
This Paper
• Proposes MATE (a Map-Reduce system with an AlternaTE API) based on Generalized Reduction
  - Phoenix implemented Map-Reduce for shared-memory systems
  - MATE adopts Generalized Reduction, first proposed in FREERIDE, developed at Ohio State (2001-2003)
  - Observed API similarities and subtle differences between MapReduce and Generalized Reduction
• Comparison for data mining applications
  - Compare performance and API
  - Understand performance overheads
• Will an alternative API be better for "Map-Reduce"?
Map-Reduce Execution
Phoenix Implementation
• Based on the same principles as MapReduce, but targets shared-memory systems
• Consists of:
  - A simple API visible to application programmers: users define functions such as splitter, map, and reduce
  - An efficient runtime that handles low-level details: parallelization, resource management, fault detection and recovery
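The split/map/reduce division of labor that Phoenix exposes can be sketched with a sequential word count. This is an illustrative Python model of the programming model only, not Phoenix's actual C API; all function names here are placeholders:

```python
from collections import defaultdict

def splitter(data, num_splits):
    # Divide the input into roughly equal chunks for the workers.
    n = max(1, len(data) // num_splits)
    return [data[i:i + n] for i in range(0, len(data), n)]

def map_func(chunk):
    # Emit one (key, 1) pair per word in the chunk.
    return [(word, 1) for word in chunk]

def reduce_func(key, values):
    return key, sum(values)

def map_reduce(data, num_splits=4):
    # Map phase: each split produces intermediate (key, value) pairs,
    # which the runtime groups by key.
    intermediate = defaultdict(list)
    for chunk in splitter(data, num_splits):
        for key, value in map_func(chunk):
            intermediate[key].append(value)
    # Reduce phase: each key's value list is reduced independently.
    return dict(reduce_func(k, v) for k, v in intermediate.items())
```

In Phoenix the map and reduce invocations run on worker threads and the grouping of intermediate pairs is done by the runtime; the sketch above keeps only the dataflow.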
Phoenix Runtime
Comparing Processing Structures
• The Reduction Object represents the intermediate state of the execution
• The reduce function is commutative and associative
• Sorting and grouping overheads are eliminated by the reduction function/object
Observations on Processing Structure
• Map-Reduce is based on a functional idea: it does not maintain state
  - This can lead to overheads in managing intermediate results between map and reduce
  - Map could generate intermediate results of very large size
• The MATE API is based on a programmer-managed reduction object
  - Not as 'clean', but avoids sorting of intermediate results
  - Can also help shared-memory parallelization
  - Helps better fault recovery
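The generalized-reduction structure can be sketched as a fold of each input element directly into a programmer-managed reduction object, so no intermediate (key, value) pairs need to be grouped or sorted. A minimal Python sketch, with illustrative names:

```python
def reduction(element, reduction_object):
    # Fold one input element directly into the shared state. Because the
    # update is commutative and associative, elements can be processed in
    # any order (and, in MATE, by any thread) without grouping or sorting.
    reduction_object[element] = reduction_object.get(element, 0) + 1

def generalized_reduction(data):
    reduction_object = {}  # programmer-managed intermediate state
    for element in data:
        reduction(element, reduction_object)
    return reduction_object
```

Compared with map-reduce, the intermediate state here is a single fixed structure updated in place, which is the source of the memory and sorting savings claimed on this slide.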
An Example
Apriori pseudo-code using MATE
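As a hedged sketch of the idea behind this example, one Apriori counting pass in the generalized-reduction style keeps all candidate support counts in a single reduction object that each transaction updates in place; the function and data shapes below are illustrative assumptions, not the paper's code:

```python
def apriori_pass(transactions, candidates):
    # One Apriori iteration, MATE-style: the support counts for all
    # candidate itemsets form one reduction object, and every transaction
    # accumulates into it directly (no intermediate pairs are emitted).
    support = {cand: 0 for cand in candidates}
    for txn in transactions:
        items = set(txn)
        for cand in candidates:
            if set(cand) <= items:     # candidate contained in transaction
                support[cand] += 1
    return support
```

A later step would keep only the candidates whose count meets the support threshold and generate the next generation of candidates, as in standard Apriori.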
Example – Now with Phoenix
Apriori pseudo-code using MapReduce
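For contrast, the same counting pass in map-reduce style first materializes intermediate (candidate, 1) pairs and then reduces them by key, which is the source of the intermediate-result management overhead discussed earlier. Again a hedged Python sketch, not Phoenix's actual code:

```python
from collections import defaultdict

def apriori_pass_mapreduce(transactions, candidates):
    # Map phase: each transaction emits one (candidate, 1) pair per
    # candidate itemset it contains; these pairs are the intermediate
    # results that the runtime must later group by key.
    intermediate = defaultdict(list)
    for txn in transactions:
        items = set(txn)
        for cand in candidates:
            if set(cand) <= items:
                intermediate[cand].append(1)
    # Reduce phase: sum the emitted counts for every candidate.
    return {cand: sum(ones) for cand, ones in intermediate.items()}
```

The results match the generalized-reduction version, but the size of `intermediate` grows with the number of matches rather than staying fixed at one counter per candidate.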
System Design and Implementation
• Basic dataflow of MATE (MapReduce with AlternaTE API) in the Full Replication scheme
• Data structures in the MATE scheduler: used to communicate between the user code and the runtime
• Three sets of functions in MATE:
  - Internally used functions
  - MATE API provided by the runtime
  - MATE API to be defined by the users
• Implementation considerations
MATE Runtime Dataflow
• Basic one-stage dataflow (Full Replication scheme)
Data Structures (I): scheduler_args_t basic fields

Field       Description
Input_data  Input data pointer
Data_size   Input dataset size
Data_type   Input data type
Stage_num   Computation-stage number (iteration number)
Splitter    Pointer to the splitter function
Reduction   Pointer to the reduction function
Finalize    Pointer to the finalize function
Data Structures (II): scheduler_args_t optional fields for performance tuning

Field                  Description
Unit_size              # of bytes for one element
L1_cache_size          # of bytes of L1 data cache
Model                  Shared-memory parallelization model
Num_reduction_workers  Max # of reduction worker threads
Num_procs              Max # of processor cores used
Functions (I): transparent to users (R = required; O = optional)

Function                                                     R/O
static inline void * schedule_tasks(thread_wrapper_arg_t *)  R
static void * combination_worker(void *)                     R
static int array_splitter(void *, int, reduction_args_t *)   R
void clone_reduction_object(int num)                         R
static inline int isCpuAvailable(unsigned long, int)         R
Functions (II): APIs provided by the runtime

Function                                                      R/O
int mate_init(scheduler_args_t * args)                        R
int mate_scheduler(void * args)                               R
int mate_finalize(void * args)                                O
void reduction_object_pre_init()                              R
int reduction_object_alloc(int size) (returns the object id)  R
void reduction_object_post_init()                             R
void accumulate(int id, int offset, void * value)             O
void reuse_reduction_object()                                 O
void * get_intermediate_result(int iter, int id, int offset)  O
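The reduction-object calls in the table above can be modeled with a toy store: reduction_object_alloc returns an object id, and accumulate applies a commutative update at an (id, offset) slot. The Python sketch below only illustrates these semantics; MATE's actual runtime is C and manages per-thread object copies and combination, which this toy omits:

```python
class ReductionObjectStore:
    # Toy model of the reduction-object API: objects are flat buffers
    # addressed by (id, offset), mirroring the names in the table above.
    # The implementation is an illustrative guess, not MATE's code.
    def __init__(self):
        self._objects = []

    def reduction_object_alloc(self, size):
        # Allocate a zero-filled object and return its id.
        self._objects.append([0] * size)
        return len(self._objects) - 1

    def accumulate(self, obj_id, offset, value):
        # Commutative, associative update of one slot.
        self._objects[obj_id][offset] += value

    def get(self, obj_id, offset):
        return self._objects[obj_id][offset]
```

Because every accumulate is commutative and associative, any interleaving of updates from different worker threads yields the same final object, which is what lets MATE skip sorting and grouping.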
Functions (III): APIs defined by the user

Function                                            R/O
int (*splitter_t)(void *, int, reduction_args_t *)  O
void (*reduction_t)(reduction_args_t *)             R
void (*combination_t)(void *)                       O
void (*finalize_t)(void *)                          O
Implementation Considerations
• Focus on the API differences in the programming models
• Data partitioning: splits are dynamically assigned to worker threads, as in Phoenix
• Buffer management: two temporary buffers, one for reduction objects and the other for combination results
• Fault tolerance: re-executes failed tasks; checkpointing may be a better solution, since the reduction object maintains the computation state
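The dynamic data-partitioning scheme described above (splits handed out to worker threads on demand) can be sketched as follows; the split size, the shared counter, and the sum-reduction payload are illustrative assumptions, not MATE's actual implementation:

```python
import threading

def dynamic_partition(data, num_threads, split_size):
    # Splits are claimed dynamically: each worker atomically takes the next
    # split index, so faster threads simply process more splits.
    splits = [data[i:i + split_size] for i in range(0, len(data), split_size)]
    next_split = [0]
    lock = threading.Lock()
    totals = [0] * num_threads   # one private partial result per thread

    def worker(tid):
        while True:
            with lock:
                idx = next_split[0]
                next_split[0] += 1
            if idx >= len(splits):
                return
            totals[tid] += sum(splits[idx])   # stand-in for the reduction

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Final combination of the per-thread partial results.
    return sum(totals)
```

Because the per-split reduction is commutative and associative, the final combined result is the same regardless of which thread processed which split.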
Experiments Design
• For comparison, we used three data mining applications: KMeans, PCA, Apriori
• Also evaluated the single-node performance of Hadoop on KMeans and Apriori
• Experiments on two multi-core platforms:
  - 8 cores on one WCI node (Intel CPUs)
  - 16 cores on one AMD node (AMD CPUs)
Results: Data Mining (I)
• K-Means: 400MB dataset, 3-dimensional points, k = 100, on one WCI node with 8 cores
• [Chart: avg. time per iteration (sec) vs. # of threads (1, 2, 4, 8) for Phoenix, MATE, and Hadoop; 2.0 speedup]
Results: Data Mining (II)
• K-Means: 400MB dataset, 3-dimensional points, k = 100, on one AMD node with 16 cores
• [Chart: avg. time per iteration (sec) vs. # of threads (1, 2, 4, 8, 16) for Phoenix, MATE, and Hadoop; 3.0 speedup]
Results: Data Mining (III)
• PCA: 8000 x 1024 matrix, on one WCI node with 8 cores
• [Chart: total time (sec) vs. # of threads (1, 2, 4, 8) for Phoenix and MATE; 2.0 speedup]
Results: Data Mining (IV)
• PCA: 8000 x 1024 matrix, on one AMD node with 16 cores
• [Chart: total time (sec) vs. # of threads (1, 2, 4, 8, 16) for Phoenix and MATE; 2.0 speedup]
Results: Data Mining (V)
• Apriori: 1,000,000 transactions, 3% support, on one WCI node with 8 cores
• [Chart: avg. time per iteration (sec) vs. # of threads (1, 2, 4, 8) for Phoenix, MATE, and Hadoop]
Results: Data Mining (VI)
• Apriori: 1,000,000 transactions, 3% support, on one AMD node with 16 cores
• [Chart: avg. time per iteration (sec) vs. # of threads (1, 2, 4, 8, 16) for Phoenix, MATE, and Hadoop]
Observations
• The MATE system achieves reasonable to significant speedups for all three data mining applications
• In most cases it outperforms Phoenix and Hadoop significantly; it is only slightly better in one case
• The reduction object helps reduce the memory overhead associated with managing the large set of intermediate results between map and reduce
Related Work
• Lots of work on improving and generalizing MapReduce's API
  - Industry: Dryad/DryadLINQ from Microsoft, Sawzall from Google, Pig/Map-Reduce-Merge from Yahoo!
  - Academia: CGL-MapReduce, Mars, MITHRA, Phoenix, Disco, Hive, ...
• These address MapReduce limitations:
  - The one-input, two-stage dataflow is extremely rigid
  - Only two high-level primitives
Conclusions
• MapReduce is simple and robust in expressing parallelism
• The two-stage computation style may cause performance losses for some subclasses of data-intensive applications
• MATE provides an alternate API based on generalized reduction
• This variation can reduce the overheads of data management and communication between Map and Reduce