Ruby for soul of BigData Nerds
Uploaded by abhishek-parolkar
![Page 1: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/1.jpg)
Ruby for the soul of BigData Nerds
![Page 2: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/2.jpg)
Who Am I?
● Engineering Team Lead, Analytics & Data Platforms @ Viki.com
● Founder of http://BigData.SG
● Contributor to fluentd, pfeed, cartographer, watir
![Page 3: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/3.jpg)
BigData & Its Challenges
"Big data" is when the size of the data itself becomes part of the problem. - Mike Loukides
● Twitter produces over 230 million tweets per day
● Wal-Mart is logging one million transactions per hour
● Facebook creates over 30 billion pieces of content ranging from web links, news, blogs, photo
![Page 4: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/4.jpg)
Everyone has a big data problem
![Page 5: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/5.jpg)
Evolving Trends
Batch Processing: Hadoop, HPCC, Google BigQuery
Stream Processing: STORM (Twitter) & S4 (Yahoo)
![Page 6: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/6.jpg)
Common Engineering Challenges
● Data Collection
● Filtering / Segmentation
● Data Storage
● Analysis
● Visualization
● Prediction / Extrapolation
![Page 7: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/7.jpg)
Data Collection + Filtering / Segmentation
http://fluentd.org/
![Page 8: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/8.jpg)
Data Collection + Filtering / Segmentation
http://fluentd.org/
You send events as: http://domain:8080/namespace?key1=value1&key2=value2
Fluentd forwards the data as: <timestamp> <namespace> {key1:value1,key2:value2}
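To make the event shape on this slide concrete, here is a minimal Ruby sketch that builds a URL in that form and splits it back into the timestamp, namespace, and record parts. It only mirrors the simplified format shown on the slide; it does not talk to a real Fluentd instance, and the host `domain`, port `8080`, and the helper names `event_url` and `forwarded_event` are placeholders chosen for illustration.

```ruby
require "uri"
require "time"

# Build an event URL in the shape shown on the slide.
# Host, port, and namespace are placeholders, not a real endpoint.
def event_url(host, port, namespace, fields)
  query = URI.encode_www_form(fields)
  "http://#{host}:#{port}/#{namespace}?#{query}"
end

# Split such a URL into the three parts the slide says are forwarded:
# a timestamp, the namespace, and the key/value record.
# The timestamp format (ISO 8601, UTC) is an assumption for this sketch.
def forwarded_event(url, now = Time.now.utc)
  uri = URI.parse(url)
  namespace = uri.path.sub(%r{\A/}, "")
  record = URI.decode_www_form(uri.query.to_s).to_h
  [now.iso8601, namespace, record]
end

url = event_url("domain", 8080, "namespace", { "key1" => "value1", "key2" => "value2" })
puts url

timestamp, namespace, record = forwarded_event(url)
puts "#{timestamp} #{namespace} #{record}"
```

In a real deployment the input plugin's configuration decides the port and the accepted payload format, so the URL layout above should be checked against the Fluentd version in use.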
![Page 11: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/11.jpg)
Analysis
Hadoop Streaming (Ruby)
Hadoop Hive (Using rbhive)
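Hadoop Streaming runs any executable that reads lines from standard input and writes tab-separated key/value lines to standard output, which is why plain Ruby scripts work as mappers and reducers. The sketch below counts events per namespace. It assumes input lines in the `<timestamp> <namespace> {record}` shape from the data collection slide; the function names `map_events` and `reduce_counts` and the `map` / `reduce` command-line switch are illustrative choices, not part of Hadoop.

```ruby
# Mapper: emit "<namespace>\t1" for every input line.
# Assumes whitespace-separated lines: <timestamp> <namespace> <record>.
def map_events(input, output)
  input.each_line do |line|
    _timestamp, namespace, _record = line.chomp.split(" ", 3)
    next if namespace.nil? || namespace.empty?
    output.puts "#{namespace}\t1"
  end
end

# Reducer: the framework sorts mapper output by key before the reduce
# step, so lines with the same key arrive next to each other and can be
# summed with a single running total.
def reduce_counts(input, output)
  current_key = nil
  total = 0
  input.each_line do |line|
    key, value = line.chomp.split("\t", 2)
    next if key.nil? || value.nil?
    if key != current_key
      output.puts "#{current_key}\t#{total}" unless current_key.nil?
      current_key = key
      total = 0
    end
    total += value.to_i
  end
  output.puts "#{current_key}\t#{total}" unless current_key.nil?
end

# Run as "ruby <script> map" or "ruby <script> reduce".
if __FILE__ == $PROGRAM_NAME
  case ARGV.first
  when "map" then map_events($stdin, $stdout)
  when "reduce" then reduce_counts($stdin, $stdout)
  end
end
```

The same pair can be tried locally without a cluster by piping a log file through the map step, `sort`, and then the reduce step, which imitates the shuffle phase.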
![Page 12: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/12.jpg)
Visualization
Custom Dashboard (Rails + Google Charts / d3.js)
Some Hosted Services: tableaupublic.com, geckoboard.com, splunkstorm.com
![Page 13: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/13.jpg)
Stream Computing
![Page 14: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/14.jpg)
What is STORM?
![Page 15: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/15.jpg)
STORM terminology●Streams●Spouts●Bolts●Topologies
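The four terms can be illustrated with a toy pipeline in plain Ruby. This is a conceptual sketch only: it uses neither Storm's nor RedStorm's API, runs in a single process, and the sample events and the names `filter_bolt`, `average_bolt`, and `run_topology` are made up for illustration. A stream is modelled as a lazy sequence of tuples, the spout as the source of that sequence, each bolt as a transformation, and the topology as the wiring between them. The computation is a running average bandwidth per country, echoing the use case shown later in the deck.

```ruby
# Spout stand-in: a fixed list of made-up tuples. A real spout would
# emit an unbounded stream from a queue or log.
SAMPLE_EVENTS = [
  { country: "SG", kbps: 4000 },
  { country: "US", kbps: 6000 },
  { country: "JP", kbps: nil },   # unusable measurement, dropped below
  { country: "SG", kbps: 2000 }
].freeze

# Bolt 1: pass through only tuples with a positive numeric measurement.
def filter_bolt(stream)
  stream.select { |t| t[:kbps].is_a?(Numeric) && t[:kbps] > 0 }
end

# Bolt 2: keep per-country running totals and emit the updated average
# each time a tuple arrives.
def average_bolt(stream)
  sums = Hash.new(0.0)
  counts = Hash.new(0)
  stream.map do |t|
    sums[t[:country]] += t[:kbps]
    counts[t[:country]] += 1
    { country: t[:country], avg_kbps: sums[t[:country]] / counts[t[:country]] }
  end
end

# Topology: spout -> filter bolt -> average bolt.
def run_topology(events)
  stream = events.lazy
  average_bolt(filter_bolt(stream)).to_a
end

run_topology(SAMPLE_EVENTS).each { |t| puts "#{t[:country]} #{t[:avg_kbps]}" }
```

In Storm itself the bolts run in parallel across machines and state such as the running totals has to be partitioned by key, which this single-process sketch does not attempt to show.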
![Page 16: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/16.jpg)
RedStorm (https://github.com/colinsurprenant/redstorm)
$ rvm use jruby-1.6.3
$ bundle install redstorm
$ bundle exec redstorm install
![Page 17: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/17.jpg)
Visualizing average bandwidth experienced by users while watching videos on viki.com across the globe.
![Page 18: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/18.jpg)
![Page 19: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/19.jpg)
![Page 20: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/20.jpg)
![Page 21: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/21.jpg)
![Page 22: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/22.jpg)
![Page 23: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/23.jpg)
![Page 24: Ruby for soul of BigData Nerds](https://reader034.fdocuments.net/reader034/viewer/2022051609/547a505eb4af9fda158b4aaa/html5/thumbnails/24.jpg)
Thank you!
Let's stay in touch :)
● Sign up for my newsletter at http://parolkar.com
● Visit the BigData.SG Meetup in Singapore.