Big data wonderland

Big Data Wonderland: Two Views on the Big Data Revolution Mark Madsen Third Nature, Inc. [email protected] @markmadsen Marc Demarest Noumenal, Inc. [email protected] Strata New York October 2012

Transcript of Big data wonderland

Page 1: Big data wonderland

Big Data Wonderland: Two Views on the Big Data Revolution

Mark Madsen Third Nature, Inc. [email protected] @markmadsen

Marc Demarest Noumenal, Inc. [email protected]

Strata New York October 2012

Page 2: Big data wonderland

2 Third Nature, Inc. || Noumenal, Inc.

Preamble Twenty Years On

•  We came up together in this industry in the early 1990s, as pointy-headed advocates of data and star schema design, trained by the deity himself, Ralph Kimball

•  Back then, it was a simpler world...big iron, big DBMS, hand-coded ETL, star schema, a thousand rinky-dink query tools

•  Mostly, conversation was dominated by ETL and schema design

•  “There will never be a decisional database larger than 10 GB...”

St. Ralph

Our Alma Mater

Page 3: Big data wonderland

Preamble Twenty Years On

•  Twenty years on, we find ourselves with opposing views on what is either the biggest con, or the biggest sea-change, in our data warehousing odyssey

•  Question: Is the big data revolution big, or a revolution?

•  Question: Do we have to change? And if so, how?

•  Not a round table. A slugfest....

Demarest as Shana Alexander?

Madsen as Jack Kilpatrick?

Page 4: Big data wonderland

Regular Programming Is Suspended

Demarest Madsen

Page 5: Big data wonderland

Compromise

Demarest Madsen

You take the blue pill. The story ends, you wake up in your bed and believe whatever you want to believe.

You take the red pill, you stay in Wonderland, and I show you how deep the rabbit hole goes.

Remember, all I am offering is the truth: nothing more.

Page 6: Big data wonderland

The Issues 1. Data As A Factor of Production

RED BLUE

Hype.

For most companies, data is an asset supporting process, not a factor in the production of their products or services.

Execute vs manage the business.

Page 7: Big data wonderland

The Issues 1. Data As A Factor of Production

RED BLUE

Amen.

This change has been in process for more than a decade. Social media leads the way, but we’re all affected.

Hype.

For most companies, data is an asset supporting process, not a factor in the production of their products or services.

Execute vs manage the business.

Page 8: Big data wonderland

The Issues 2. The Reality of Big Data

RED BLUE

No company escapes.

Text, social, sensors, streaming -- the instrumentation of the real world transforms company decision-making processes.

Page 9: Big data wonderland

The Issues 2. The Reality of Big Data

RED BLUE

No company escapes.

Text, social, sensors, streaming -- the instrumentation of the real world transforms company decision-making processes.

Few companies transformed.

Social media held up a lot with no quantification of benefits.

Management consultants? Asleep at the switch.

Page 10: Big data wonderland

The Issues 3. The Commodity Hardware Revolution & Radical Scale-Out

RED BLUE

The current topology is alive and well.

These commodity building blocks are, after all, just SMP platforms.

Real problems are under-investment, bad design.

Page 11: Big data wonderland

The Issues 3. The Commodity Hardware Revolution & Radical Scale-Out

RED BLUE

The new topology.

Cheap compute, unintelligent direct-attach storage and free comms make large scale-out grids the future.

The current topology is alive and well.

These commodity building blocks are, after all, just SMP platforms.

Real problems are under-investment, bad design.

Page 12: Big data wonderland

The Issues 4. Merchant DBMSs

RED BLUE

Increasingly irrelevant.

We’ve been over-structured and under-resourced for 20 years.

CSV is still the international standard.

Page 13: Big data wonderland

The Issues 4. Merchant DBMSs

RED BLUE

Increasingly irrelevant.

We’ve been over-structured and under-resourced for 20 years.

CSV is still the international standard.

Will rise to the challenge.

Any worthwhile innovation will be absorbed by the merchant DBMS players.

Even the big players use these things.

Page 14: Big data wonderland

The Issues 5. Query, Reporting & Dashboarding Tools

RED BLUE

Will rise to the challenge.

We have two generations of analysts trained to feed using these tools.

Big data offers no last-mile answers.

Page 15: Big data wonderland

The Issues 5. Query, Reporting & Dashboarding Tools

RED BLUE

Ineffective, now and in the future.

Can’t do real-time, can’t visualize large data sets, can’t support discovery and exploration.

Will rise to the challenge.

We have two generations of analysts trained to feed using these tools.

Big data offers no last-mile answers.

Page 16: Big data wonderland

The Issues 6. Structured Query Language

RED BLUE

Toast.

Too complex, too hard to code, too hard to debug. A way of ensuring dependency on merchant DBMSs.

Page 17: Big data wonderland

The Issues 6. Structured Query Language

RED BLUE

Toast.

Too complex, too hard to code, too hard to debug. A way of ensuring dependency on merchant DBMSs.

Tasty.

Powerful, expressive language for complex analytical problems.

Why do NoSQL vendors reinvent it all the time?

Page 18: Big data wonderland

The Issues 7. New Programming Models

RED BLUE

The “new model” looks a lot like SAS, only with Java and no support.

Open source doesn’t mean free. Or easy.

The skills gap here is huge; we can’t fill it.

Page 19: Big data wonderland

The Issues 7. New Programming Models

RED BLUE

Say hello to Pig.

New analytical problems (decisioning, discovery, exploration) require new languages, new tools and new programming models.

The “new model” looks a lot like SAS, only with Java and no support.

Open source doesn’t mean free. Or easy.

The skills gap here is huge; we can’t fill it.

Page 20: Big data wonderland

The Issues 8. Conventional DW Architecture

RED BLUE

A relic.

Overly complex. Difficult to implement.

Controlled by the supply side of the market, anyway.

Page 21: Big data wonderland

The Issues 8. Conventional DW Architecture

RED BLUE

A relic.

Overly complex. Difficult to implement.

Controlled by the supply side of the market, anyway.

Perfectly viable. No need to change.

Some new technologies may play roles, but we’re good to go, generally.

Built by developers for users. The new is built by developers for developers.

Page 22: Big data wonderland

The Issues 9. The Cloud

RED BLUE

Don’t go there.

Your inside-the-firewall apps remain the core information asset.

Where is “there” anyway?

Page 23: Big data wonderland

The Issues 9. The Cloud

RED BLUE

We all go there.

Most of the interesting data is there; it’s more effective to move our data, and our analyses, to where the data is, already.

Don’t go there.

Your inside-the-firewall apps remain the core information asset.

Where is “there” anyway?

Page 24: Big data wonderland

The Issues 10. New Technologies

RED BLUE

Save Us.

Best-of-breed integration led by in-house designers is back, with a vengeance.

Page 25: Big data wonderland

The Issues 10. New Technologies

RED BLUE

Save Us.

Best-of-breed integration led by in-house designers is back, with a vengeance.

Distract Us.

We’ve already seen what best-of-breed gives us: a circus.

Page 26: Big data wonderland

What We Really Think

1. Data As A Factor of Production

2. The Reality of Big Data

3. The Commodity Hardware Revolution

4. Merchant DBMSs

5. Query, Reporting & Dashboarding Tools

6. Structured Query Language

7. New Programming Models

8. Conventional DW Architecture

9. The Cloud

10. New Technologies