Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017....
Transcript of Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017....
![Page 1: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/1.jpg)
Parallel Databasesand
Map ReduceIntroduction to DatabasesCompSci 316 Spring 2017
![Page 2: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/2.jpg)
Announcements (Wed., Apr. 17)• Homework #4 due Monday, April 24, 11:55 pm
• No other extra credit problem
• Project Presentations• A few project groups still have not signed up – please sign up
soon
• Google Cloud code• Please redeem your code asap, by May 11
• Wednesday April 19• Guest Lecture by Prof. Jun Yang• OLAP, Data warehousing, Data mining• Included in the final
![Page 3: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/3.jpg)
Where are we now?We learntüRelational Model, Query
Languages, and Database Designü SQLüRAü E/R diagramüNormalization
üDBMS Internalsü Storageü IndexingüQuery Evaluationü External sortü Join AlgorithmsüQuery Optimization
ü Transactionsü Basic concepts and SQLü Concurrency controlü Recovery
ü XMLü DTD and XML schemaü XPath and XQueryü Relational Mapping
Next• Parallel DBMS• Map Reduce• Data Warehousing,
OLAP, mining• Distributed DBMS• NOSQL
Today
Wednesday
Next Monday
![Page 4: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/4.jpg)
Parallel DBMS
![Page 5: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/5.jpg)
Why Parallel Access To Data?
At 10 MB/s1.2 days to scan
1,000 x parallel1.5 minute to scan.
Parallelism:divide a big problem into many smaller onesto be solved in parallel.
1 TB Data
1 TB Data
![Page 6: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/6.jpg)
Parallel DBMS• Parallelism is natural to DBMS processing
• Pipeline parallelism: many machines each doing one step in a multi-step process.
• Data-partitioned parallelism: many machines doing the same thing to different pieces of data.
• Both are natural in DBMS!
PipelineAny
SequentialProgram
AnySequentialProgram
Partition SequentialSequential SequentialSequential Any SequentialProgram
Any SequentialProgram
outputs split N ways, inputs merge M ways
![Page 7: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/7.jpg)
DBMS: The parallel Success Story
• DBMSs are the most successful application of parallelism
• Teradata (1979), Tandem (1974, later acquired by HP),..• Every major DBMS vendor has some parallel server
• Reasons for success:• Bulk-processing (= partition parallelism)• Natural pipelining• Inexpensive hardware can do the trick• Users/app-programmers don’t need to think in parallel
![Page 8: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/8.jpg)
Some || Terminology
• Speed-Up• More resources means
proportionally less time for given amount of data.
• Scale-Up• If resources increased in
proportion to increase in data size, time is constant.
#CPUs(degree of ||-ism)
#ops
/sec
.(t
hrou
ghpu
t)
Ideal:linear speed-up
sec.
/op
(res
pons
e tim
e)
#CPUs + size of databasedegree of ||-ism
Ideal:linear scale-up
Ideal graphs
![Page 9: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/9.jpg)
Some || Terminology
• Due to overhead in parallel processing
• Start-up cost
Starting the operation on many processor, might need to distribute data
• Interference
Different processors may compete for the same resources
• Skew
The slowest processor (e.g. with a huge fraction of data) may become the bottleneck
#CPUs(degree of ||-ism)
#ops
/sec
.(t
hrou
ghpu
t)
#CPUs + size of databasedegree of ||-ism
#ops
/sec
(res
pons
e tim
e)
In practice
Ideal:linear speed-up
Ideal:linear scale-up
Actual: sub-linear speed-up
Actual: sub-linear scale-up
![Page 10: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/10.jpg)
Architecture for Parallel DBMS
• Units: a collection of processors• assume always have local cache• may or may not have local memory or disk (next)
• A communication facility to pass information among processors
• a shared bus or a switch
• Among different computing units• Whether memory is shared• Whether disk is shared
![Page 11: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/11.jpg)
Shared Memory
sharedmemory
e.g. SMP Server
• Easy to program
• Expensive to build
• Low communication overhead: shared memory
• Difficult to scaleup(memory contention)
Interconnection Network
P P P
D D D
Global Shared Memory
![Page 12: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/12.jpg)
Shared Nothing
localmemoryand disk
no two CPU can access the samestorage area
all communicationthrough a network connection
Interconnection Network
P P P
M
D
M
D
M
D
• Hard to program and design parallel algos
• Cheap to build
• Easy to scaleup and speedup
• Considered to be the best architecture
e.g. Greenplum
![Page 13: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/13.jpg)
Shared Disk
localmemory
shared disk
P P P
M
D
M
D
M
D
Interconnection Network
e.g. ORACLE RAC
• Trade-off but still interference like shared-memory (contention of memory and nwbandwidth)
![Page 14: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/14.jpg)
Different Types of DBMS Parallelism• Intra-operator parallelism
• get all machines working to compute a given operation (scan, sort, join)
• OLAP (decision support)
• Inter-operator parallelism• each operator may run concurrently on a
different site (exploits pipelining)• For both OLAP and OLTP (transactiond)
• Inter-query parallelism• different queries run on different sites• For OLTP
• We’ll focus on intra-operator parallelism
⨝
𝝲
⨝
𝝲
⨝
𝝲
⨝
𝝲
Ack:Slide by Prof. Dan Suciu
![Page 15: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/15.jpg)
Data PartitioningHorizontally Partitioning a table (why horizontal?):Range-partition Hash-partition Block-partition
or Round Robin
Shared disk and memory less sensitive to partitioning, Shared nothing benefits from "good" partitioning
A...E F...J K...N O...S T...Z A...E F...J K...N O...S T...Z A...E F...J K...N O...S T...Z
• Good for equijoins, range queries, group-by• Can lead to data skew
• Good for equijoins• But only if hashed
on that attribute• Can lead to data
skew
• Send i-th tuple to i-mod-n processor
• Good to spread load
• Good when the entire relation is accessed
![Page 16: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/16.jpg)
Example
• R(Key, A, B)
• Can Block-partition be skewed?• no, uniform
• Can Hash-partition be skewed?• on the key: uniform with a good hash function• on A: may be skewed,
• e.g. when all tuples have the same A-value
![Page 17: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/17.jpg)
Parallelizing Sequential Evaluation Code
• “Streams” from different disks or the output of other operators
• are “merged” as needed as input to some operator• are “split” as needed for subsequent parallel processing
• Different Split and merge operations appear in addition to relational operators
• No fixed formula for conversion• Next: parallelizing individual operations
![Page 18: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/18.jpg)
Parallel Scans
• Scan in parallel, and merge.• Selection may not require all sites for range or hash
partitioning• but may lead to skew• Suppose sA = 10R and partitioned according to A• Then all tuples in the same partition/processor
• Indexes can be built at each partition
![Page 19: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/19.jpg)
Parallel Sorting
Idea: • Scan in parallel, and range-partition as you go
• e.g. salary between 10 to 210, #processors = 20• salary in first processor: 10-20, second: 21-30, third: 31-40, ….
• As tuples come in, begin “local” sorting on each• Resulting data is sorted, and range-partitioned• Visit the processors in order to get a full sorted order• Problem: skew!• Solution: “sample” the data at start to determine partition
points.
![Page 20: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/20.jpg)
Parallel Joins
• Need to send the tuples that will join to the same machine
• also for GROUP-BY
• Nested loop:• Each outer tuple must be compared with each inner
tuple that might join• Easy for range partitioning on join cols, hard otherwise
• Sort-Merge:• Sorting gives range-partitioning• Merging partitioned tables is local
![Page 21: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/21.jpg)
Parallel Hash Join
• In first phase, partitions get distributed to different sites:
• A good hash function automatically distributes work evenly
• Do second phase at each site.• Almost always the winner for equi-join
Original Relations(R then S)
OUTPUT
2
B main memory buffers DiskDisk
INPUT1
hashfunction
hB-1
Partitions
12
B-1. . .
Phas
e 1
![Page 22: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/22.jpg)
Dataflow Network for parallel Join
• Good use of split/merge makes it easier to build parallel versions of sequential join code.
![Page 23: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/23.jpg)
Jim Gray & Gordon Bell: VLDB 95 Parallel Database Systems Survey
Parallel Aggregates
A...E F...J K...N O...S T...Z
A Table
Count Count Count Count Count
Count
• For each aggregate function, need a decomposition:• count(S) = S count(s(i)), ditto for sum()• avg(S) = (S sum(s(i))) / S count(s(i))• and so on...
• For group-by:• Sub-aggregate groups close to the source.• Pass each sub-aggregate to its group’s site.
• Chosen via a hash fn.
Which SQL aggregate operators are not good for parallel execution?
![Page 24: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/24.jpg)
• Why?• Trivial counter-example:
• Table partitioned with local secondary index at two nodes
• Range query: all of node 1 and 1% of node 2.• Node 1 should do a scan of its partition.• Node 2 should use secondary index.
Best serial plan may not be best ||
N..Z
TableScan
A..M
Index Scan
![Page 25: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/25.jpg)
Example problem: Parallel DBMSR(a,b) is horizontally partitioned across N = 3 machines.
Each machine locally stores approximately 1/N of the tuples in R.
The tuples are randomly organized across machines (i.e., R is block partitioned across machines).
Show a RA plan for this query and how it will be executed across the N = 3 machines.
Pick an efficient plan that leverages the parallelism as much as possible.
• SELECT a, max(b) as topb• FROM R• WHERE a > 0• GROUP BY a
![Page 26: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/26.jpg)
1/3 of R 1/3 of R 1/3 of R
Machine 1 Machine 2 Machine 3
SELECT a, max(b) as topbFROM RWHERE a > 0GROUP BY a
R(a, b)
![Page 27: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/27.jpg)
1/3 of R 1/3 of R 1/3 of R
Machine 1 Machine 2 Machine 3
SELECT a, max(b) as topbFROM RWHERE a > 0GROUP BY a
R(a, b)
scan scan scan
If more than one relation on a machine, then “scan S”, “scan R” etc
![Page 28: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/28.jpg)
1/3 of R 1/3 of R 1/3 of R
Machine 1 Machine 2 Machine 3
SELECT a, max(b) as topbFROM RWHERE a > 0GROUP BY a
R(a, b)
scan scan scan
sa>0 sa>0 sa>0
![Page 29: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/29.jpg)
1/3 of R 1/3 of R 1/3 of R
Machine 1 Machine 2 Machine 3
SELECT a, max(b) as topbFROM RWHERE a > 0GROUP BY a
R(a, b)
scan scan scan
sa>0 sa>0 sa>0
ga, max(b)->b
ga, max(b)->b
ga, max(b)->b
![Page 30: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/30.jpg)
1/3 of R 1/3 of R 1/3 of R
Machine 1 Machine 2 Machine 3
SELECT a, max(b) as topbFROM RWHERE a > 0GROUP BY a
R(a, b)
scan scan scan
sa>0 sa>0 sa>0
ga, max(b)->b
ga, max(b)->b
ga, max(b)->b
Hash on a Hash on a Hash on a
![Page 31: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/31.jpg)
1/3 of R 1/3 of R 1/3 of R
Machine 1 Machine 2 Machine 3
SELECT a, max(b) as topb FROM RWHERE a > 0 GROUP BY aR(a, b)
scan scan scan
sa>0 sa>0 sa>0
ga, max(b)->b
ga, max(b)->b
ga, max(b)->b
Hash on a Hash on a Hash on a
![Page 32: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/32.jpg)
1/3 of R 1/3 of R 1/3 of R
Machine 1 Machine 2 Machine 3
SELECT a, max(b) as topb FROM RWHERE a > 0 GROUP BY aR(a, b)
scan scan scan
sa>0 sa>0 sa>0
ga, max(b)->b
ga, max(b)->b
ga, max(b)->b
Hash on a Hash on a Hash on a
ga, max(b)->topb
ga, max(b)->topb
ga, max(b)->topb
![Page 33: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/33.jpg)
Same Example Problem: Map ReduceExplain how the query will be executed in MapReduce
(recall Lecture-3)
• SELECT a, max(b) as topb• FROM R• WHERE a > 0• GROUP BY a
Specify the computation performed in the map and the reduce functions
![Page 34: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/34.jpg)
Map
• Each map task• Scans a block of R• Calls the map function for each tuple• The map function applies the selection predicate to the
tuple• For each tuple satisfying the selection, it outputs a
record with key = a and value = b
SELECT a, max(b) as topb FROM RWHERE a > 0GROUP BY a
•When each map task scans multiple relations, it needs to output something like key = a and value = (‘R’, b) which has the relation name ‘R’
![Page 35: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/35.jpg)
Shuffle
• The MapReduce engine reshuffles the output of the map phase and groups it on the intermediate key, i.e. the attribute a
SELECT a, max(b) as topb FROM RWHERE a > 0GROUP BY a
•Note that the programmer has to write only the map and reduce functions, the shuffle phase is done by the MapReduce engine (although the programmer can rewrite the partition function), but you should still mention this in your answers
![Page 36: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/36.jpg)
ReduceSELECT a, max(b) as topbFROM RWHERE a > 0GROUP BY a
• Each reduce task• computes the aggregate value max(b) = topb for each group
(i.e. a) assigned to it (by calling the reduce function) • outputs the final results: (a, topb)
• Multiple aggregates can be output by the reduce phase likekey = a and value = (sum(b), min(b)) etc.
• Sometimes a second (third etc) level of Map-Reduce phase might be needed
A local combiner can be used to compute local max before data gets reshuffled (in the map tasks)
![Page 37: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/37.jpg)
Benefit of hash-partitioning
• What would change if we hash-partitioned R on R.abefore executing the same query on the previous parallel DBMS
SELECT a, max(b) as topbFROM R
WHERE a > 0GROUP BY a
![Page 38: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/38.jpg)
1/3 of R 1/3 of R 1/3 of R
Machine 1 Machine 2 Machine 3
SELECT a, max(b) as topb FROM RWHERE a > 0 GROUP BY aPrev: block-partition
scan scan scan
sa>0 sa>0 sa>0
ga, max(b)->b
ga, max(b)->b
ga, max(b)->b
Hash on a Hash on a Hash on a
ga, max(b)->topb
ga, max(b)->topb
ga, max(b)->topb
![Page 39: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/39.jpg)
• It would avoid the data re-shuffling phase• It would compute the aggregates locally
SELECT a, max(b) as topbFROM R
WHERE a > 0GROUP BY a
![Page 40: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/40.jpg)
1/3 of R 1/3 of R 1/3 of R
Machine 1 Machine 2 Machine 3
SELECT a, max(b) as topb FROM RWHERE a > 0 GROUP BY aHash-partition on a for R(a, b)
scan scan scan
sa>0 sa>0 sa>0
ga, max(b)->topb
ga, max(b)->topb
ga, max(b)->topb
![Page 41: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/41.jpg)
Map Reduce
Slides by Junghoon Kang
![Page 42: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/42.jpg)
Big Datait cannot be stored
in one machine
store the data sets on multiple machines
e.g. Google File SystemSOSP 2003
it cannot be processed in one machine
parallelize computation on multiple machines
e.g. MapReduceOSDI 2004
![Page 43: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/43.jpg)
Where does Google use MapReduce?
MapReduce
Input
Output
● crawled documents● web request logs
● inverted indices● graph structure of web documents● summaries of the number of pages
crawled per host● the set of most frequent queries in a
day
![Page 44: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/44.jpg)
What is MapReduce?It is a programming model
that processes large data by:apply a function to each logical record in the input (map)
categorize and combine the intermediate results into summary values (reduce)
![Page 45: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/45.jpg)
Google’s MapReduce is inspired by map and reduce functions in functional programming
languages.
![Page 46: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/46.jpg)
For example, in Scala functional
programming language,
scala> val lst = List(1,2,3,4,5)
scala> lst.map( x => x + 1 )
res0: List[Int] = List(2,3,4,5,6)
![Page 47: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/47.jpg)
For example, in Scala functional
programming language,
scala> val lst = List(1,2,3,4,5)
scala> lst.reduce( (a, b) => a + b )
res0: Int = 15
![Page 48: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/48.jpg)
Ok, it makes sense in one machine.
Then, how does Google extend the functional idea
to multiple machinesin order to
process large data?
![Page 49: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/49.jpg)
Understanding MapReduce(by example)
I am a class president
![Page 50: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/50.jpg)
An English teacher asks you:“Could you count the number of occurrences of
each word in this book?”
![Page 51: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/51.jpg)
Um… Ok...
This image cannot currently be displayed.
![Page 52: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/52.jpg)
Let’s divide the workload among classmates.
map
![Page 53: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/53.jpg)
And let few combine the intermediate results reduce
I will collect from A ~ G
H ~ Q R ~ Z
A:6B: 2….G: 12
H:8….Q: 1
R:6….Z: 2A:5
B:7….G:2
A: 11B: 9….G:14
![Page 54: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/54.jpg)
Why did MapReduce become so popular?
![Page 55: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/55.jpg)
Is it because Google uses it?
![Page 56: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/56.jpg)
Distributed Computation Before MapReduce
Things to consider:• how to divide the workload among multiple machines?
• how to distribute data and program to other machines?
• how to schedule tasks?
• what happens if a task fails while running?
• … and … and ...
![Page 57: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/57.jpg)
Distributed Computation After MapReduce
Things to consider:• how to write Map function?
• how to write Reduce function?
![Page 58: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/58.jpg)
MapReduce lowered the knowledge barrier
in distributed computation.
Developers needed before MapReduce
Developers needed after MapReduce
![Page 59: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/59.jpg)
Map-Reduce Steps
• Input is typically (key, value) pairs• but could be objects of any type
• Map and Reduce are performed by a number of processes• physically located in some processors
same key
Map ReduceShuffleInputkey-value pairs
outputlistssort by key
![Page 60: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/60.jpg)
Map-Reduce Steps
1. Read Data2. Map – extract some info of
interest in (key, value) form3. Shuffle and sort
• send same keys to the same reduce process
same key
Map ReduceShuffleInputkey-value pairs
outputlistssort by key
4. Reduce– operate on the values of the same
key– e.g. transform, aggregate,
summarize, filter
5. Output the results (key, final-result)
![Page 61: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/61.jpg)
Simple Example: Map-Reduce
• Word counting• Inverted indexes
Ack:Slide by Prof. Shivnath Babu
![Page 62: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/62.jpg)
Map Function
• Each map process works on a chunk of data -- master selects available workers• Input: (input-key, value)• Output: (intermediate-key, value) -- may not be the same as input key value• Example: list all doc ids containing a word
• output of map (word, docid) – emits each such pair• word is key, docid is value• duplicate elimination can be done at the reduce phase
same key
Map ReduceShuffleInputkey-value pairs
outputlistssort by key
![Page 63: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/63.jpg)
Reduce Function
• Input: (intermediate-key, list-of-values-for-this-key) – list can include duplicates• each map process can leave its output in the local disk, reduce process can retrieve
its portion• master selects idle workers for reduce
• Output: (output-key, final-value)
• Example: list all doc ids containing a word• output will be a list of (word, [doc-id1, doc-id5, ….])• if the count is needed, reduce counts #docs, output will be a list of (word, count)
same key
Map ReduceShuffleInputkey-value pairs
outputlistssort by key
![Page 64: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/64.jpg)
Key Players in MapReduce
• A Map-Reduce “Job”• e.g. count the words in all docs• complex queries can have multiple MR jobs
• Map or Reduce “Tasks”• A group of map or reduce “functions”• scheduled on a single “worker”
• Worker • a process that executes one task at a time• one per processor, so 4-8 per machine• Follows whatever the Master asks to do
• One Master● coordinates many workers.
● assigns a task to each worker
![Page 65: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/65.jpg)
Fault Tolerance
Although the probability of a machine failure is low,
the probability of a machine failing among thousands of
machines is common.
![Page 66: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/66.jpg)
How does MapReduce handle machine failures?
Worker Failure
● The master sends heartbeat to each worker node.
● If a worker node fails, the master reschedules the tasks
handled by the worker.
Master Failure
● The whole MapReduce job gets restarted through a
different master.
End of Lecture 24
![Page 67: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/67.jpg)
Same Example Problem: Map ReduceExplain how the query will be executed in MapReduce
• SELECT a, max(b) as topb• FROM R• WHERE a > 0• GROUP BY a
Specify the computation performed in the map and the reduce functions
![Page 68: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/68.jpg)
Map
• Each map task• Scans a block of R• Calls the map function for each tuple• The map function applies the selection predicate to the
tuple• For each tuple satisfying the selection, it outputs a
record with key = a and value = b
SELECT a, max(b) as topb FROM RWHERE a > 0GROUP BY a
•When each map task scans multiple relations, it needs to output something like key = a and value = (‘R’, b) which has the relation name ‘R’
![Page 69: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/69.jpg)
Shuffle
• The MapReduce engine reshuffles the output of the map phase and groups it on the intermediate key, i.e. the attribute a
SELECT a, max(b) as topb FROM RWHERE a > 0GROUP BY a
•Note that the programmer has to write only the map and reduce functions, the shuffle phase is done by the MapReduce engine (although the programmer can rewrite the partition function)
![Page 70: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/70.jpg)
ReduceSELECT a, max(b) as topbFROM RWHERE a > 0GROUP BY a
• Each reduce task• computes the aggregate value max(b) = topb for each group
(i.e. a) assigned to it (by calling the reduce function) • outputs the final results: (a, topb)
• Multiple aggregates can be output by the reduce phase likekey = a and value = (sum(b), min(b)) etc.
• Sometimes a second (third etc) level of Map-Reduce phase might be needed
A local combiner can be used to compute local max before data gets reshuffled (in the map tasks)
![Page 71: Parallel Databases and Map Reduce · Map Reduce Introduction to Databases CompSci316 Spring 2017. Announcements (Wed., Apr. 17) •Homework #4due Monday, April 24, 11:55 pm •No](https://reader034.fdocuments.net/reader034/viewer/2022042307/5ed374c935091779073b1ac5/html5/thumbnails/71.jpg)
Any benefit of hash-partitioning for Map-Reduce?• For MapReduce
• Logically, MR won’t know that the data is hash-partitioned
• MR treats map and reduce functions as black-boxes and does not perform any optimizations on them
• But, if a local combiner is used• Saves communication cost:
• fewer tuples will be emitted by the map tasks• Saves computation cost in the reducers:
• the reducers would have to do anything
SELECT a, max(b) as topbFROM R
WHERE a > 0GROUP BY a