Marketing intelligence for managers – data lake or data warehouse?
Data Lake, beyond the Data Warehouse
Data Lake, beyond the Warehouse
Cheow Lan Lake, Thailand
โกเมษ จันทวิมล
February 3, 2016
Komes Chandavimol
Data Science Thailand Meetup #4
Shifting to the 3rd gen platform with Data Lake
http://www.adweek.com/prnewser/how-many-times-do-the-worlds-social-media-users-click-every-minute/117427
https://www.domo.com/learn/data-never-sleeps-3-0
The Growth of Data
http://www.adweek.com/prnewser/how-many-times-do-the-worlds-social-media-users-click-every-minute/117427
https://www.domo.com/learn/data-never-sleeps-3-0
Can these tools support Big Data?
Spreadsheet? Database? Data Mart? Data Warehouse?
Source: Forrester Research’s James Kobielus
The Emergence of Big Data Tools
http://blogs.forrester.com/category/hadoop
http://solutions.forrester.com/Global/FileLib/webinars/Big_Data_-_Gold_Rush_or_Illusion.pdf
HADOOP
http://opensource.com/life/14/8/intro-apache-hadoop-big-data
Analytics 3.0
Data Mining Tools
Data Discovery and Visualization Tools
Tableau.com, RapidMiner.com
How do we apply this to the current environment?
http://hortonworks.com/blog/optimize-your-data-architecture-with-hadoop/
Traditional Data Warehouse
http://hortonworks.com/blog/optimize-your-data-architecture-with-hadoop/
New Data Management Architecture
http://hortonworks.com/blog/optimize-your-data-architecture-with-hadoop/
Data Lake
https://www.digitalnewsasia.com/business/forget-data-warehousing-its-data-lakes-now
Data Lake
A single place to store every type of data in its native format with no fixed limits on account size or file size, high throughput to increase analytic performance and native integration with the Hadoop ecosystem.
Reference: James Serra's Blog
Data Lake Development with Big Data, Pradeep Pasupuleti (2015)
https://www.digitalnewsasia.com/business/forget-data-warehousing-its-data-lakes-now
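The definition above (every type of data kept in its native format) can be illustrated with a small sketch. It is only an illustration under stated assumptions: a local temporary directory stands in for the lake, and the `raw/<dataset>` layout, the `clickstream` dataset, and the helper names `land_raw_file` and `read_clicks` are hypothetical, not part of the talk or of any Hadoop API. The file is copied unchanged, and a schema is applied only when the data is read.

```python
from __future__ import annotations

import json
import shutil
import tempfile
from pathlib import Path


def land_raw_file(source: Path, lake_root: Path, dataset: str) -> Path:
    """Copy a file into the lake unchanged, keeping its native format."""
    # Hypothetical layout: <lake_root>/raw/<dataset>/<original file name>
    target_dir = lake_root / "raw" / dataset
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.copyfile(source, target)
    return target


def read_clicks(path: Path) -> list[dict]:
    """Apply a schema only at read time (schema-on-read)."""
    rows = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            record = json.loads(line)
            # Types and defaults are decided by the reader, not at load time.
            rows.append({
                "user": str(record["user"]),
                "clicks": int(record.get("clicks", 0)),
            })
    return rows


def demo() -> list[dict]:
    """Land a small JSON Lines file in a temporary lake and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        source = tmp_path / "clicks.jsonl"
        source.write_text(
            '{"user": "a", "clicks": "3"}\n{"user": "b"}\n',
            encoding="utf-8",
        )
        landed = land_raw_file(source, tmp_path / "lake", "clickstream")
        return read_clicks(landed)
```

Calling `demo()` lands the file and reads it back with the reader's schema applied; a different consumer could read the same raw file with a different schema.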
Data Lake Processes
www.emc.com
Data Lake and Data Warehouse
Hadoop Distributed Compared, BlazeClan Technology, 2015
Data Lakes
http://www.kdnuggets.com/2015/09/data-lake-vs-data-warehouse-key-differences.html
Data Lake
Type of Data: Raw Data, Derived Data, Aggregated Data
Type of Environment: Discovery Environment, Production Environment
The Definition of Data Lake, John O’Brien (2015)
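The three data types on this slide can be read as successive refinements of the same records. The following is a minimal sketch under that assumption; the sample events and the function names `derive` and `aggregate` are invented for illustration and do not come from the cited source.

```python
from collections import defaultdict

# Raw data: kept exactly as it arrived, including a malformed value.
RAW_EVENTS = [
    {"user": "a", "amount": "10.5"},
    {"user": "b", "amount": "n/a"},
    {"user": "a", "amount": "4.5"},
]


def derive(raw_events):
    """Raw -> derived: keep only records whose amount parses as a number."""
    derived = []
    for event in raw_events:
        try:
            amount = float(event["amount"])
        except ValueError:
            # The bad record is skipped here but still exists in the raw data.
            continue
        derived.append({"user": event["user"], "amount": amount})
    return derived


def aggregate(derived_events):
    """Derived -> aggregated: total amount per user."""
    totals = defaultdict(float)
    for event in derived_events:
        totals[event["user"]] += event["amount"]
    return dict(totals)
```

Because the raw records are never modified, the derivation rules can be changed later and rerun over the same input.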
How does the Data Lake work?
http://www.clearpeaks.com/blog/category/tableau
Traditional Enterprise Data Warehouse
New Data Management Architecture
http://hortonworks.com/blog/optimize-your-data-architecture-with-hadoop/
http://www.kdnuggets.com/2014/05/big-data-landscape-v30-analyzed.html
Data Lake Maturity
The Definition of Data Lake, John O’Brien (2015)
4 Maturity Stages of Data Lake
Stage 1 – Pilot Project (Understand the Technology)
Stage 2 – Productionize Hadoop and its capabilities
Stage 3 – Proactively consolidate data for (Big) Data Analytics
Stage 4 – Platform the Data Lake to Core Competency
The Definition of Data Lake, John O’Brien (2015)
Putting the Data Lake to Work, Teradata, Hortonworks (2015)
Stage 1 – Pilot Project
Handling data at scale
This stage involves getting the plumbing in place and learning to acquire and transform data at scale. The analytics may be quite simple, but much is learned about making Hadoop work the way you desire.
The Definition of Data Lake, John O’Brien (2015)
Putting the Data Lake to Work, Teradata, Hortonworks (2015)
Stage 2 – Productionize Hadoop and its capabilities
This stage involves improving the ability to transform and analyze data. Teams find the tools most appropriate to their skill sets, acquire more data, and build applications.
The Definition of Data Lake, John O’Brien (2015)
Putting the Data Lake to Work, Teradata, Hortonworks (2015)
Stage 3 – Proactively consolidate data for (Big) Data Analytics
Involves getting data and analytics into the hands of as many people as possible.
It is in this stage that the data lake and the enterprise data warehouse start to work in unison, each playing its role.
A company that started with a data lake may eventually add an enterprise data warehouse to operationalize its data.
The Definition of Data Lake, John O’Brien (2015)
Putting the Data Lake to Work, Teradata, Hortonworks (2015)
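One way to picture the lake and the warehouse working in unison is a job that reads raw records from the lake, aggregates them, and loads the result into a warehouse table with a fixed schema. The sketch below is an assumption-laden illustration, not the method described by the cited sources: an in-memory SQLite database stands in for the enterprise data warehouse, and the `store_sales` table, the sample records, and the function names are hypothetical.

```python
import json
import sqlite3

# Stand-in for raw JSON Lines records already sitting in the data lake.
RAW_LINES = [
    '{"store": "BKK", "sales": 120}',
    '{"store": "CNX", "sales": 80}',
    '{"store": "BKK", "sales": 30}',
]


def load_into_warehouse(raw_lines):
    """Aggregate raw lake records and load them into a fixed-schema table."""
    totals = {}
    for line in raw_lines:
        record = json.loads(line)
        totals[record["store"]] = totals.get(record["store"], 0) + record["sales"]

    # In-memory SQLite plays the role of the warehouse in this sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE store_sales ("
        "store TEXT PRIMARY KEY, total_sales INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO store_sales (store, total_sales) VALUES (?, ?)",
        sorted(totals.items()),
    )
    conn.commit()
    return conn


def report(conn):
    """Query the warehouse table the way a BI tool might."""
    cursor = conn.execute(
        "SELECT store, total_sales FROM store_sales ORDER BY store"
    )
    return cursor.fetchall()
```

The lake side stays schema-free and cheap to extend, while the warehouse side exposes a stable table for reporting.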
Big Data Analytics
http://dataofthings.blogspot.com/2014/04/the-bbbt-sessions-hortonworks-big-data.html
Data Lake and Big Data Analytics
http://hortonworks.com/blog/big-data-refinery-fuels-next-generation-data-architecture/
Stage 4 – Platform the Data Lake to Core Competency
Enterprise capabilities are added to the data lake. Few companies have reached this level of maturity, but many will as the use of big data grows. This stage requires data governance, compliance, security, and auditing (incorporated into the company data strategy).
The Technology of the Business Data Lake, Capgemini (2013)
Business Data Lake
The Technology of the Business Data Lake, Capgemini (2014)
https://shefsite.files.wordpress.com/2014/04/where.jpg
http://image.slidesharecdn.com/mapr-db-in-hadoop-nosql-overview-150929062856-lva1-app6892/95/maprdb-the-first-inhadoop-document-database-12-638.jpg?cb=1443536326
http://www.predictiveanalyticstoday.com/waterline-data-self-service-for-the-hadoop-data-lake/
The Data Lake Unifies Data Discovery, Data Science, and BI 3.0
Topics: Big Data, Self Serve Business, Data Science, Machine Learning, Visual Analytics, Business Discovery, Deep Learning, Hadoop, Feature Engineering, Spark, Business Intelligence 3.0, YARN, Predictive Analytics, Hive, Data Lake, Data Visualization, Graph Analytics
20+ posts related to “Data Lake”: search for “Data Science Thailand” “Data Lake”
http://www.clearpeaks.com/blog/category/tableau
Traditional Enterprise Data Warehouse
Questions?