Big Data and Hadoop - key drivers, ecosystem and use cases
© Wikibon 2008 © Wikibon 2011 | Confidential www.wikibon.org
The Wikibon Project
Big Data and Hadoop: Key Drivers, Ecosystem and Use Cases
November 2011
What is Big Data?
Big Data (n.): Data sets whose size, type and/or speed make them impractical to process and analyze with traditional database technologies and related data management tools.
Why is Big Data Important?
Big Data is the new definitive source of competitive advantage across industries …
… For those organizations that embrace Big Data, the possibilities for innovation, improved agility, and increased profitability are nearly endless.
Three Key Big Data Drivers
1. Volume, Variety, Velocity
2. Hardware Commoditization
3. Cloud Computing
![Page 5: Big Data and Hadoop - key drivers, ecosystem and use cases](https://reader034.fdocuments.net/reader034/viewer/2022042813/54c3857e4a7959297b8b4578/html5/thumbnails/5.jpg)
Characteristics of Big Data
![Page 6: Big Data and Hadoop - key drivers, ecosystem and use cases](https://reader034.fdocuments.net/reader034/viewer/2022042813/54c3857e4a7959297b8b4578/html5/thumbnails/6.jpg)
Sources of Big Data
Hadoop
Open source framework for storing, processing and analyzing Big Data.
Fundamental concept: Rather than banging away at one huge block of data with a single machine, Hadoop breaks up Big Data into multiple parts so that each part can be processed and analyzed in parallel.
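The split-and-process-in-parallel idea can be illustrated with a small word count written in the map/reduce style. This is only a minimal sketch, not Hadoop's actual API: the function names (`split_into_parts`, `map_part`, `reduce_counts`, `word_count`) are hypothetical, and local threads stand in for the cluster nodes that Hadoop would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(lines, num_parts):
    """Divide the input into roughly equal parts, one per worker."""
    return [lines[i::num_parts] for i in range(num_parts)]


def map_part(part):
    """Map step: count the words in one part, independently of the others."""
    counts = Counter()
    for line in part:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partial_counts):
    """Reduce step: merge the per-part counts into a single result."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def word_count(lines, num_parts=4):
    """Split the input, process each part in parallel, then combine."""
    parts = split_into_parts(lines, num_parts)
    # Assumption: threads play the role of cluster nodes in this sketch.
    # A real Hadoop job would ship each part to a different machine.
    with ThreadPoolExecutor(max_workers=num_parts) as pool:
        partial_counts = list(pool.map(map_part, parts))
    return reduce_counts(partial_counts)


if __name__ == "__main__":
    sample = [
        "big data is big",
        "hadoop processes big data",
        "data in parallel",
    ]
    print(word_count(sample, num_parts=2).most_common(2))
```

Because each part is counted without reference to any other part, adding more workers (or, in Hadoop's case, more machines) lets the same job cover more data in roughly the same time.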
Hadoop: The Pros and Cons
First the pros … Hadoop is a time- and cost-effective approach to storing, processing and analyzing large volumes of unstructured data, allowing for new and unprecedented types of analytics.
Now the cons … Hadoop is complex and difficult to deploy and manage; there’s a dearth of Hadoop-savvy engineers and Data Scientists on the job market; the risk of forking and vendor lock-in remains.
Hadoop: The Pros and Cons cont.
More pros … Many bright minds are contributing to Hadoop, resulting in rapid development, and an ecosystem of vendors is emerging to make Hadoop enterprise-ready.
![Page 10: Big Data and Hadoop - key drivers, ecosystem and use cases](https://reader034.fdocuments.net/reader034/viewer/2022042813/54c3857e4a7959297b8b4578/html5/thumbnails/10.jpg)
The Big Data Ecosystem
Big Data Pioneers
• Largest Hadoop instance on the planet … 40,000 nodes handling 200+ PB of data.
• Used to support research for ad systems and Web search.
• Match ads with users, detect spam in Yahoo! Mail, pick relevant top stories.
Big Data Pioneers cont.
• Two major clusters processing and storing over 30 PB of data.
• Uses HDFS to store copies of internal log and dimension data.
• Developed Hive to perform large-scale analytics on user data.
• Uses HBase to store, manage and retrieve Facebook Messenger data.
Big Data Pioneers cont.
• Uses Hadoop to support its “People You May Know” feature.
• Tailors its search engine to return the most relevant results for recruiters, employers and job seekers.
• Created a visualization tool that lets users explore their professional networks and discover hidden patterns.
Big Data in Financial Services
• Over 30,000 databases and 15,000 applications spread across 7 business units.
• Using Hadoop as the basis of its Common Data Platform.
• Looking to establish a 360-degree view of the customer to identify upsell and cross-sell opportunities.
Big Data in Financial Services cont.
• Risk management and analysis to understand financial exposure.
• Detecting fraudulent transactions and potentially criminal activity.
• Conducting sentiment analysis on social media data.
Thank You
Jeffrey F. Kelly, Principal Research Contributor
The Wikibon Project
[email protected] @jeffreyfkelly
www.wikibon.org www.siliconangle.com