Big Data, Fast Data - MapReduce in Hazelcast
Uploaded by: christoph-engelbert
Category: Engineering
Transcript of Big Data, Fast Data - MapReduce in Hazelcast
![Page 1: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/1.jpg)
BIG DATA - FAST DATA
USING MAPREDUCE IN HAZELCAST
Source: http://www.newscientist.com/gallery/dn17805-computer-museums-of-the-world/11
www.hazelcast.com
![Page 2: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/2.jpg)
WHO AM I
Christoph Engelbert (@noctarius2k)
8+ years of Java Weirdoness
Performance, GC, traffic topics
Apache DirectMemory PMC
Previous companies incl. Ubisoft and HRS
CastMapR MapReduce for Hazelcast 3
![Page 3: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/3.jpg)
TOPICS
Hazelcast
Distributed Computing
Map & Reduce
Demonstration
Questions
![Page 4: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/4.jpg)
HAZELCAST
A SHORT SPACE TRIP
![Page 5: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/5.jpg)
WHAT IS HAZELCAST?
In-Memory Data-Grid
Data Partitioning (Sharding)
Java Collections Implementation
Distributed Computing Platform
![Page 6: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/6.jpg)
WHY HAZELCAST?
![Page 7: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/7.jpg)
WHY IN-MEMORY COMPUTING?
![Page 8: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/8.jpg)
TREND OF PRICES
Data Source: http://www.jcmit.com/memoryprice.htm
![Page 9: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/9.jpg)
SPEED DIFFERENCE
Data Source: http://i.imgur.com/ykOjTVw.png
![Page 10: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/10.jpg)
DISTRIBUTED COMPUTING
OR
MULTICORE CPU ON STEROIDS
![Page 11: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/11.jpg)
THE IDEA OF DISTRIBUTED COMPUTING
Source: https://www.flickr.com/photos/stefan_ledwina/1853508040
![Page 12: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/12.jpg)
THE BEGINNING
Source: http://en.wikipedia.org/wiki/File:KL_Advanced_Micro_Devices_AM9080.jpg
![Page 13: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/13.jpg)
MULTICORE IS NOT NEW
Source: http://en.wikipedia.org/wiki/File:80386with387.JPG
![Page 14: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/14.jpg)
CLUSTER IT
Source: http://rarecpus.com/images2/cpu_cluster.jpg
![Page 15: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/15.jpg)
SUPER COMPUTER
Source: http://www.dkrz.de/about/aufgaben/dkrz-geschichte/rechnerhistorie-1
![Page 16: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/16.jpg)
CLOUD COMPUTING
Source: https://farm6.staticflickr.com/5523/11407118963_e0e0870846_b_d.jpg
![Page 17: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/17.jpg)
MAP & REDUCE
THE BLACK MAGIC FROM PLANET GOOGLE
![Page 18: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/18.jpg)
USE CASES
Log Analysis
Data Querying
Aggregation and summing
Distributed Sort
ETL (Extract Transform Load)
and more...
![Page 19: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/19.jpg)
SIMPLE STEPS
Read
Map / Transform
Reduce
![Page 20: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/20.jpg)
FULL STEPS
Read
Map / Transform
Combining
Grouping / Shuffling
Reduce
Collating
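The six steps can be walked through in a single JVM. The following is only a sketch under stated assumptions, not Hazelcast's MapReduce API: the class `FullStepsSketch`, its methods, and the idea of modelling each cluster node as a plain list of documents are all hypothetical, and word counting is used as the example job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-JVM walk-through of the six steps; not Hazelcast API.
class FullStepsSketch {

    // Each inner list stands in for the documents stored on one node (assumption).
    static Map<String, Integer> wordCount(List<List<String>> nodes) {
        List<Map<String, Integer>> partials = new ArrayList<>();
        for (List<String> documents : nodes) {                  // Read
            Map<String, Integer> partial = new LinkedHashMap<>();
            for (String document : documents) {
                for (String word : document.split("\\W+")) {    // Map / Transform: emit (word, 1)
                    if (!word.isEmpty()) {
                        partial.merge(word, 1, Integer::sum);    // Combining: pre-sum on the node
                    }
                }
            }
            partials.add(partial);
        }

        // Grouping / Shuffling: collect the partial counts of every word in one place
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            for (Map.Entry<String, Integer> e : partial.entrySet()) {
                grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
            }
        }

        // Reduce: one final count per word
        Map<String, Integer> reduced = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int count : e.getValue()) {
                sum += count;
            }
            reduced.put(e.getKey(), sum);
        }
        return reduced;
    }

    // Collating: fold the reduced result into a single value, here the total number of words
    static int collateTotal(Map<String, Integer> reduced) {
        int total = 0;
        for (int count : reduced.values()) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> nodes = Arrays.asList(
                Arrays.asList("to be or not", "to be"),
                Arrays.asList("be quick"));
        Map<String, Integer> counts = wordCount(nodes);
        System.out.println(counts);
        System.out.println(collateTotal(counts));
    }
}
```

In a real cluster the per-node loop runs in parallel and only the combined partial counts cross the network, which is the reason the combining step exists.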
![Page 21: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/21.jpg)
MAPREDUCE WORKFLOW
![Page 22: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/22.jpg)
SOME PSEUDO CODE (1/3)
MAPPING
Data is mapped / transformed into a set of key-value pairs
map( key:String, document:String ):Void ->
    for each w:word in document:
        emit( w, 1 )
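A plain-Java rendering of the mapping pseudo code could look roughly like this. It is a sketch, not Hazelcast's Mapper interface: the class name `MappingSketch` is hypothetical, the returned list stands in for `emit`, and splitting on non-word characters is an assumed tokenization.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the map step; the returned list replaces emit( w, 1 ).
class MappingSketch {

    static List<Map.Entry<String, Integer>> map(String key, String document) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        // Assumption: words are separated by any run of non-word characters
        for (String w : document.split("\\W+")) {
            if (!w.isEmpty()) {
                emitted.add(new SimpleEntry<>(w, 1)); // one (word, 1) pair per occurrence
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(map("doc1", "to be or not to be"));
    }
}
```

The document key is unused here, as in the pseudo code; it only identifies which input record is being mapped.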
![Page 23: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/23.jpg)
SOME PSEUDO CODE (2/3)
COMBINING
Multiple values are combined into an intermediate result to reduce network traffic
combine( word:String, counts:List[Int] ):Void ->
    emit( word, sum( counts ) )
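To make the traffic saving concrete, here is a hedged Java sketch of a combiner that runs on one node before anything is sent over the network. The class `CombiningSketch` and its input shape (the list of pairs emitted locally by the mapper) are assumptions for illustration, not Hazelcast API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the combine step on a single node.
class CombiningSketch {

    // Collapses all locally emitted (word, count) pairs into one partial sum per word.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> emittedOnNode) {
        Map<String, Integer> partial = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : emittedOnNode) {
            partial.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return partial;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> emitted = Arrays.asList(
                new SimpleEntry<>("to", 1),
                new SimpleEntry<>("be", 1),
                new SimpleEntry<>("to", 1));
        // Three emitted pairs shrink to two entries before leaving the node
        System.out.println(combine(emitted));
    }
}
```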
![Page 24: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/24.jpg)
SOME PSEUDO CODE (3/3)
REDUCING
Values are reduced / aggregated to the requested result
reduce( word:String, counts:List[Int] ):Int ->
    return sum( counts )
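The reducing pseudo code translates almost one to one into Java. In this sketch the class name `ReducingSketch` is hypothetical, and the incoming counts are assumed to be the partial sums that each node's combiner produced for the same word.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the reduce step for word counting.
class ReducingSketch {

    // Sums the partial counts collected for one word across all nodes.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Partial counts for "to" arriving from three nodes (made-up sample values)
        System.out.println(reduce("to", Arrays.asList(2, 1, 3)));
    }
}
```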
![Page 25: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/25.jpg)
FOR MATHEMATICIANS
Process: (K × V)* → (L × W)* ⇒ [(l1, w1), …, (lm, wm)]
Mapping: (K × V) → (L × W)* ⇒ (k, v) → [(l1, w1), …, (ln, wn)]
Reducing: L × W* → X* ⇒ (l, [w1, …, wn]) → [x1, …, xn]
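These signatures map fairly directly onto Java generics. The sketch below is one possible reading, not Hazelcast API: `SignaturesSketch.process` is a hypothetical helper, the input (K × V)* is modelled as a `Map<K, V>` (which assumes unique keys), and the result keeps each key l next to its reduced values [x1, …, xn].

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical generic driver mirroring the Mapping and Reducing signatures.
class SignaturesSketch {

    static <K, V, L, W, X> Map<L, List<X>> process(
            Map<K, V> input,                                        // (K x V)*
            BiFunction<K, V, List<Map.Entry<L, W>>> mapper,         // (K x V) -> (L x W)*
            BiFunction<L, List<W>, List<X>> reducer) {              // L x W* -> X*

        // Group every mapped value w under its key l
        Map<L, List<W>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> in : input.entrySet()) {
            for (Map.Entry<L, W> out : mapper.apply(in.getKey(), in.getValue())) {
                grouped.computeIfAbsent(out.getKey(), l -> new ArrayList<>()).add(out.getValue());
            }
        }

        // Apply the reducer once per key
        Map<L, List<X>> result = new LinkedHashMap<>();
        for (Map.Entry<L, List<W>> group : grouped.entrySet()) {
            result.put(group.getKey(), reducer.apply(group.getKey(), group.getValue()));
        }
        return result;
    }
}
```

For word counting, K and V would be the document name and text, L the word, W the emitted 1, and X the final count.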
![Page 26: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/26.jpg)
MAPREDUCE PROGRAMS IN GOOGLE SOURCE TREE
Source: http://research.google.com/archive/mapreduce-osdi04-slides/index-auto-0005.html
![Page 27: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/27.jpg)
DEMONSTRATION
![Page 28: Big Data, Fast Data - MapReduce in Hazelcast](https://reader036.fdocuments.net/reader036/viewer/2022062304/55d4975fbb61eba4698b45c8/html5/thumbnails/28.jpg)
@noctarius2k
@hazelcast
http://www.sourceprojects.com
http://github.com/noctarius
THANK YOU!
ANY QUESTIONS?
Images: All images are licensed under Creative Commons