Big Data Wonderland: Two Views on the Big Data Revolution
Mark Madsen, Third Nature, Inc., @markmadsen
Marc Demarest, Noumenal, Inc.
Strata New York October 2012
2 Third Nature, Inc. || Noumenal, Inc.
Preamble Twenty Years On
• We came up together in this industry in the early 1990s, as pointy-headed advocates of data and star schema design, trained by the deity himself, Ralph Kimball
• Back then, it was a simpler world...big iron, big DBMS, hand-coded ETL, star schema, a thousand rinky-dink query tools
• Mostly, conversation was dominated by ETL and schema design
• “There will never be a decisional database larger than 10 GB...”
St. Ralph
Our Alma Mater
Preamble Twenty Years On
• Twenty years on, we find ourselves with opposing views on what is either the biggest con, or the biggest sea-change, in our data warehousing odyssey
• Question: Is the big data revolution big, or a revolution?
• Question: Do we have to change? And if so, how?
• Not a round table. A slugfest....
Demarest as Shana Alexander?
Madsen as Jack Kilpatrick?
Regular Programming Is Suspended
Demarest Madsen
Compromise
Demarest Madsen
You take the blue pill. The story ends, you wake up in your bed and believe whatever you want to believe.
You take the red pill, you stay in Wonderland, and I show you how deep the rabbit hole goes.
Remember, all I am offering is the truth: nothing more.
The Issues 1. Data As A Factor of Production
RED: Amen. This change has been in process for more than a decade. Social media leads the way, but we’re all affected.
BLUE: Hype. For most companies, data is an asset supporting process, not a factor in the production of its products or services. Execute vs manage the business.
The Issues 2. The Reality of Big Data
RED: No company escapes. Text, social, sensors, streaming -- the instrumentation of the real world transforms company decision-making processes.
BLUE: Few companies transformed. Social media held up a lot with no quantification of benefits. Management consultants? Asleep at the switch.
The Issues 3. The Commodity Hardware Revolution & Radical Scale-Out
RED: The new topology. Cheap compute, unintelligent direct-attach storage and free comms make large scale-out grids the future.
BLUE: The current topology is alive and well. These commodity building blocks are, after all, just SMP platforms. Real problems are under-investment, bad design.
The Issues 4. Merchant DBMSs
RED: Increasingly irrelevant. We’ve been over-structured and under-resourced for 20 years. CSV is still the international standard.
BLUE: Will rise to the challenge. Any worthwhile innovation will be absorbed by the merchant DBMS players. Even the big players use these things.
The Issues 5. Query, Reporting & Dashboarding Tools
RED: Ineffective, now and in the future. Can’t do real-time, can’t visualize large data sets, can’t support discovery and exploration.
BLUE: Will rise to the challenge. We have two generations of analysts trained to feed using these tools. Big data offers no last-mile answers.
The Issues 6. Structured Query Language
RED: Toast. Too complex, too hard to code, too hard to debug. A way of ensuring dependency on merchant DBMSs.
BLUE: Tasty. Powerful, expressive language for complex analytical problems. Why do NoSQL vendors reinvent it all the time?
The Issues 7. New Programming Models
RED: Say hello to Pig. New analytical problems (decisioning, discovery, exploration) require new languages, new tools and new programming models.
BLUE: The “new model” looks a lot like SAS, only with Java and no support. Open source doesn’t mean free. Or easy. The skills gap here is huge, we can’t fill it.
The Issues 8. Conventional DW Architecture
RED: A relic. Overly complex. Difficult to implement. Controlled by the supply side of the market, anyway.
BLUE: Perfectly viable. No need to change. Some new technologies may play roles, but we’re good to go, generally. Built by developers for users. The new is built by developers for developers.
The Issues 9. The Cloud
RED: We all go there. Most of the interesting data is there; it’s more effective to move our data, and our analyses, to where the data is, already.
BLUE: Don’t go there. Your inside-the-firewall apps remain the core information asset. Where is “there” anyway?
The Issues 10. Emerging Technologies
RED: Save Us. Best of breed integration led by in-house designers is back, with a vengeance.
BLUE: Distract Us. We’ve already seen what best-of-breed gives us: a circus.
What We Really Think
1. Data As A Factor of Production
2. The Reality of Big Data
3. The Commodity Hardware Revolution
4. Merchant DBMSs
5. Query, Reporting & Dashboarding Tools
6. Structured Query Language
7. New Programming Models
8. Conventional DW Architecture
9. The Cloud
10. New Technologies