Big Data. New Physics. And Why Geospatial Data is Analytic SuperFood



Page 1: Big Data. New Physics. And Why Geospatial Data is Analytic SuperFood

© 2011 IBM Corporation

Big Data. New Physics. And Why Geospatial Data is Analytic SuperFood

Jeff Jonas, IBM Distinguished Engineer
Chief Scientist, IBM Entity Analytics

[email protected]

January 18th, 2011

Page 2

The data will find the data … and the relevance will find you.

Page 3

My Background

Early 80’s: Founded Systems Research & Development (SRD), a custom software consultancy

1989 – 2003: Built numerous systems for Las Vegas casinos including a technology known as Non-Obvious Relationship Awareness (NORA)

2005: IBM acquires SRD, now chief scientist of IBM Entity Analytics

Personally designed and deployed +/- 100 systems, a number of which contained multi-billions of transactions describing 100’s of millions of entities

Today: My focus is in the area of ‘sensemaking on streams’ with special attention towards privacy and civil liberties protections

Page 4

Sensemaking on Streams

1) Evaluate new information against previous information … as it arrives.

2) Determine if what is being observed is relevant.

3) Deliver this relevant, actionable insight fast enough to do something about it … as it’s happening.

4) Do this with sufficient accuracy and scale to really matter.
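The first three steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the design of any IBM system: the class name `StreamSensemaker`, the `observe` method, the source names, and the relevance rule (a key already reported by a different source) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StreamSensemaker:
    """Toy sketch: keep prior observations and check each new one on arrival."""

    # Hypothetical store: key feature value -> sources that have reported it.
    seen: dict = field(default_factory=dict)

    def observe(self, source: str, key: str) -> list:
        """Compare a new observation with history and return what it relates to."""
        earlier_sources = self.seen.setdefault(key, [])
        # Assumed relevance rule: another source has already reported this key.
        related = [s for s in earlier_sources if s != source]
        # Remember the observation so later arrivals can be evaluated against it.
        earlier_sources.append(source)
        return related


maker = StreamSensemaker()
first = maker.observe("job_applicants", "555-0100")
second = maker.observe("fraud_cases", "555-0100")
print(first)
print(second)
```

The second call returns the earlier source at the moment the new record arrives, which is the point of evaluating data as it streams in rather than in a later batch query.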

Page 5

Trend: Organizations Are Getting Dumber

[Chart: Computing Power Growth over Time, with labels Available Observation Space, Sensemaking Algorithms, and Context]

• Your transactional data (inc. logs)
• Available reference data
• Plus, shared third party data
• And an avalanche of open source

Page 6

Simply Overwhelming

“Every two days now we create as much information as we did from the dawn of civilization up until 2003.”

~ Eric Schmidt, CEO Google

Page 7

Trend: Organizations Are Getting Dumber

[Chart: Computing Power Growth over Time, with labels Available Observation Space, Sensemaking Algorithms, and Context]

WHY?

Page 8

Algorithms at Dead End.

You Can’t Squeeze Knowledge Out of a Pixel.

Page 9

[email protected]

No Context

Page 10

Context, definition

Better understanding something by taking into account the things around it.

Page 11

Information in Context … and Accumulating

Top 200 Customer

Job Applicant

Identity Thief

Criminal Investigation

[email protected]

Page 12

From Pixels to Pictures to Insight

Observations

Contextualization

Information in Context

Relevance

Consumer (An analyst, a system, the sensor itself, etc.)

Page 13

The Puzzle Metaphor

Imagine an ever-growing pile of puzzle pieces of varying sizes, shapes and colors

What it represents is unknown (there is no picture on hand)

Is it one puzzle, 15 puzzles, or 1,500 different puzzles?

Some pieces are duplicates, missing, incomplete, low quality, or have been misinterpreted

Some pieces may even be professionally fabricated lies

Point being: Until you take the pieces to the table and attempt assembly, you don’t know what you are dealing with

Page 14

How Context Accumulates

With each new observation … one of three assertions is made: 1) un-associated; 2) placed near like neighbors; or 3) connected

Must favor the false negative

New observations sometimes reverse earlier assertions

Some observations produce novel discovery

As the working space expands, computational effort increases

Given sufficient observations, there can come a tipping point, at which time: 1) confidence begins to improve; and 2) computational effort begins to decrease!
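The three assertions can be sketched as a small function. This is a minimal sketch under stated assumptions: observations are modeled as sets of feature strings, and the rule "connect only when at least two features are shared" is a hypothetical threshold chosen to favor the false negative, not the author's actual method.

```python
def assert_observation(entities: list, features: set, connect_at: int = 2) -> str:
    """Place one observation; returns 'connected', 'near', or 'unassociated'."""
    best_index, best_overlap = -1, 0
    for index, known in enumerate(entities):
        overlap = len(known & features)
        if overlap > best_overlap:
            best_index, best_overlap = index, overlap
    if best_overlap >= connect_at:
        # Connected: enough shared features to merge into the existing entity.
        entities[best_index] |= features
        return "connected"
    # Weak or no evidence: keep the piece separate (favor the false negative).
    entities.append(set(features))
    return "near" if best_overlap > 0 else "unassociated"


entities = []
results = [
    assert_observation(entities, {"name:bob jones", "phone:555-0100"}),
    assert_observation(entities, {"name:bob jones", "email:bj@example.com"}),
    assert_observation(entities, {"name:bob jones", "phone:555-0100", "dob:1967-04-13"}),
]
print(results)
print(len(entities))
```

The sketch leaves out the harder parts named above: it never revisits an earlier assertion when new evidence arrives, and it scans every entity, so effort grows with the working space.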

Page 15

One Form of Context Is “Expert Counting”

Is it 5 people each with 1 account … or is it 1 person with 5 accounts?

Is it 20 cases of H1N1 in 20 cities … or one case reported 20 times?

If one cannot count … one cannot estimate vector or velocity (direction and speed).

Without vector and velocity … prediction is nearly impossible.
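A small worked example shows why counting comes first. The data and the `case_id` field are hypothetical; `case_id` stands in for whatever resolved identity an expert counting system would assign to each report.

```python
reports = [
    # (day, case_id): hypothetical reports, several describing the same case
    (1, "case-A"), (1, "case-A"), (1, "case-A"),
    (2, "case-A"), (2, "case-B"),
    (3, "case-B"), (3, "case-C"), (3, "case-D"),
]


def new_cases_per_day(rows):
    """Count each case once, on the first day it was reported."""
    first_seen = {}
    for day, case_id in rows:
        first_seen[case_id] = min(day, first_seen.get(case_id, day))
    counts = {}
    for day in first_seen.values():
        counts[day] = counts.get(day, 0) + 1
    return dict(sorted(counts.items()))


naive_count = len(reports)           # counts reports, not cases
daily = new_cases_per_day(reports)   # counts distinct cases by first appearance
distinct_count = sum(daily.values())
print(naive_count, distinct_count, daily)
```

Counting raw reports gives 8; counting distinct cases gives 4, and only the per-day series of distinct cases says anything about how fast the count is changing.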

Page 16

Counting: Degrees of Difficulty

Exactly Same            Bob Jones, 123455    Bob Jones, 123455
Fuzzy                   Bob Jones, 123455    Robert T Jonnes, 000123455
Incompatible Features   Bob Jones, 123455    bjones@hotmail
Deceit                  Bob Jones, 123455    Ken Wells, 550119
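The four record pairs on this slide can be run through a toy comparison. This is a sketch only: the function `degree`, the dictionary layout, and the rule of stripping leading zeros from an ID are assumptions made for illustration, not a description of NORA or any IBM product.

```python
def degree(a: dict, b: dict) -> str:
    """Classify how hard it is to decide whether two records are the same entity."""
    shared = sorted(set(a) & set(b))
    if not shared:
        # No feature in common, so the records cannot be compared directly.
        return "incompatible features"
    if all(a[key] == b[key] for key in shared):
        return "exactly same"
    # Assumed fuzzy rule: IDs that differ only by leading zeros are treated as equal.
    if "id" in shared and a["id"].lstrip("0") == b["id"].lstrip("0"):
        return "fuzzy"
    return "no evidence of sameness"


bob = {"name": "Bob Jones", "id": "123455"}
outcomes = [
    degree(bob, {"name": "Bob Jones", "id": "123455"}),
    degree(bob, {"name": "Robert T Jonnes", "id": "000123455"}),
    degree(bob, {"email": "bjones@hotmail"}),
    degree(bob, {"name": "Ken Wells", "id": "550119"}),
]
print(outcomes)
```

The last pair is the deceit case: nothing in these features ties the two records together, so a comparison like this one cannot detect it without additional observations.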

Page 17

“Key Features” Enable Expert Counting

People          Cars                Router
Name            Make                Device ID
Address         Model               Make
Date of Birth   Year                Model
Phone           License Plate No.   Firmware Vers.
Passport        VIN                 Asset ID
Nationality     Owner               Etc.
Biometric       Etc.
Etc.

Page 18

Consider Lying Identical Twins

PASSPORT: #123, Sue, 3/3/84, Uberstan, Exp 2011

PASSPORT: #123, Sue, 3/3/84, Uberstan, Exp 2011

Fingerprint

DNA

Most Trusted Authority: “Same person – trust me.”

Page 19

The same thing cannot be in two places … at the same time.

Two different things cannot occupy the same space … at the same time.

Page 20

Space & Time Enables Absolute Disambiguation

People          Cars                Router
Name            Make                Device ID
Address         Model               Make
Date of Birth   Year                Model
Phone           License Plate No.   Firmware Vers.
Passport        VIN                 Asset ID
Nationality     Owner               Etc.
Biometric       Etc.
Etc.

When            When                When
Where           Where               Where
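The rule that the same thing cannot be in two places at the same time can be turned into a simple feasibility test. This is a minimal sketch: the function names, the tuple layout (latitude, longitude, hour), the approximate coordinates, and the 1,000 km/h speed ceiling are all assumptions chosen for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def could_be_same_entity(obs_a, obs_b, max_speed_kmh=1000.0):
    """False when no plausible travel speed connects the two observations."""
    (lat_a, lon_a, hour_a), (lat_b, lon_b, hour_b) = obs_a, obs_b
    distance = haversine_km(lat_a, lon_a, lat_b, lon_b)
    elapsed_hours = abs(hour_b - hour_a)
    return distance <= max_speed_kmh * elapsed_hours


salem_noon = (44.94, -123.04, 12.0)     # approximate coordinates, Salem, Oregon
seattle_noon = (47.61, -122.33, 12.0)   # approximate coordinates, Seattle, Washington
seattle_midnight = (47.61, -122.33, 24.0)

same_moment = could_be_same_entity(salem_noon, seattle_noon)
half_day_apart = could_be_same_entity(salem_noon, seattle_midnight)
print(same_moment, half_day_apart)
```

A failed test is strong evidence that two records describe different entities, however similar their other key features look. A passed test proves nothing by itself; it only leaves the question open.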

Page 21

“Life Arcs” Are Also Telling

Bill Smith, 4/13/67, Salem, Oregon

Bill Smith, 4/13/67, Seattle, Washington

Address History

Tampa, FL 2008-2008

Biloxi, MS 2005-2008

NY, NY 1996-2005

Tampa, FL 1984-1996

Address History

San Diego, CA 2005-2009

San Fran, CA 2005-2005

Phoenix, AZ 1990-2005

San Jose, CA 1982-1990
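The two address histories on this slide can be compared year by year. This is a rough sketch with loud assumptions: a person is treated as being in one city for every year strictly inside a stay, the first and last year of each stay are ignored because a move may have happened then, and the function names are hypothetical.

```python
arc_one = [("Tampa, FL", 2008, 2008), ("Biloxi, MS", 2005, 2008),
           ("NY, NY", 1996, 2005), ("Tampa, FL", 1984, 1996)]
arc_two = [("San Diego, CA", 2005, 2009), ("San Fran, CA", 2005, 2005),
           ("Phoenix, AZ", 1990, 2005), ("San Jose, CA", 1982, 1990)]


def interior_years(arc):
    """Map each year strictly inside a stay to the city of that stay."""
    years = {}
    for city, start, end in arc:
        for year in range(start + 1, end):
            years[year] = city
    return years


def conflicting_years(arc_a, arc_b):
    """Years in which the two arcs place their subject in different cities."""
    years_a, years_b = interior_years(arc_a), interior_years(arc_b)
    return sorted(y for y in years_a.keys() & years_b.keys() if years_a[y] != years_b[y])


conflicts = conflicting_years(arc_one, arc_two)
print(len(conflicts), conflicts[0], conflicts[-1])
```

Under these assumptions the two histories disagree in 20 separate years, so a shared name and date of birth is not enough to treat the two records as one person.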

Page 22

Space-Time-Travel

Page 23

Space-Time-Travel

Cell phones are generating a staggering amount of geo-locational data – 600B transactions per day being created in the US alone

This data is being “de-identified” and shared with third parties – in volume and in real-time

Your movement quickly reveals where you spend your time (e.g., evenings vs. working hours) and who you spend your time with

Re-identification (figuring out who is who) is somewhat trivial
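How little it takes for movement data to reveal where someone spends their time can be shown with a toy example. The ping format (hour of day, cell ID), the hour ranges for "evening" and "working", and the function name are assumptions for illustration; real location data is far richer than this.

```python
from collections import Counter


def likely_anchor_points(pings):
    """pings: list of (hour_of_day, cell_id). Returns (evening_cell, working_cell)."""
    # Assumed hour ranges: evenings are 20:00-06:00, working hours are 09:00-17:00.
    evening = Counter(cell for hour, cell in pings if hour >= 20 or hour < 6)
    working = Counter(cell for hour, cell in pings if 9 <= hour < 17)
    home = evening.most_common(1)[0][0] if evening else None
    work = working.most_common(1)[0][0] if working else None
    return home, work


pings = [
    (22, "cell-17"), (23, "cell-17"), (3, "cell-17"), (21, "cell-40"),
    (10, "cell-88"), (11, "cell-88"), (15, "cell-88"), (13, "cell-17"),
]
home, work = likely_anchor_points(pings)
print(home, work)
```

No name or account number appears anywhere in the input, yet an evening location paired with a working-hours location is often enough to narrow the data to a single household, which is why "de-identified" offers limited protection here.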

Page 24

Analytic Superfood for Prediction

Route suggestions pushed to drivers, just-in-time, to avert significant traffic events

Search results optimized using personalized life arc forecasts

A nation able to work right through an extreme global pandemic

Page 25

And Other Predictions …

Prediction with 87% certainty where you will be next Thursday at 5:35pm

Names of the top 10 people you co-locate with, not at home and not at work

The Uberstan intelligence service preempts the next mass protest in real-time

A political opponent is crushed and resigns two days after announcing their candidacy
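The first prediction in this list can be illustrated with the simplest possible estimator: the share of past observations for the same weekday and hour. This is a hypothetical sketch, not the method behind the figure quoted above; the history data, the field layout, and the function name are invented for illustration.

```python
from collections import Counter


def predict_location(history, weekday, hour):
    """history: list of (weekday, hour, place). Returns (place, share) or None."""
    matches = Counter(place for d, h, place in history if d == weekday and h == hour)
    if not matches:
        return None
    place, count = matches.most_common(1)[0]
    # Share of matching past observations, used as a naive confidence estimate.
    return place, count / sum(matches.values())


history = (
    [("Thu", 17, "gym")] * 3
    + [("Thu", 17, "office")] * 1
    + [("Fri", 17, "home")] * 4
)
prediction = predict_location(history, "Thu", 17)
print(prediction)
```

Even this naive frequency count becomes sharper as more history accumulates, which is the sense in which space-time data feeds prediction.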

Page 26

Consequences

Space-time-travel data is the ultimate biometric

It will enable enormous opportunity

It will unravel one’s secrets

It will challenge existing notions of privacy

And it’s here now, with more to come

Page 27

Surveillance society is irresistible.

And you are doing it.

GPS-enhanced search, free email, Facebook, etc.

Page 28

Responsible innovation

Privacy by design

Better data protection

Data anonymization, active audit logs, etc.
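One way to picture data anonymization combined with an audit log is a keyed one-way hash plus a record of every comparison. This is a sketch under assumptions, not the technique used in any IBM product: `SECRET_KEY`, `anonymize`, `match_anonymized`, and the in-memory `audit_log` list are hypothetical names, and a real deployment would need managed keys and tamper-resistant log storage.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-managed-secret"  # hypothetical key, illustration only

audit_log = []  # stand-in for a tamper-resistant audit store


def anonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace a sensitive value with a keyed one-way token (HMAC-SHA256)."""
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def match_anonymized(token_a: str, token_b: str, who: str) -> bool:
    """Compare two tokens and record who asked, without exposing the raw values."""
    audit_log.append((who, token_a, token_b))
    return hmac.compare_digest(token_a, token_b)


token_one = anonymize("Bob Jones ")
token_two = anonymize("bob jones")
token_three = anonymize("Ken Wells")
same = match_anonymized(token_one, token_two, who="analyst-1")
different = match_anonymized(token_one, token_three, who="analyst-1")
print(same, different, len(audit_log))
```

Records can still be matched on their tokens, so counting remains possible, while the raw values stay with the party that holds the key.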

Page 29

Closing Thoughts

Page 30

Wish This On The Adversary

[Chart: Computing Power Growth over Time, with labels Available Observation Space, Sensemaking Algorithms, and Context]

Page 31

Context Accumulation: The Way Forward

[Chart: Computing Power Growth over Time, with labels Available Observation Space, Sensemaking Algorithms, Context, and Context Accumulation]

Page 32

Geospatial-Enabled Intelligence ... Today

Geospatial Analytics

Geospatial Visualization

Current Focus

Page 33

Geospatial-Enabled Intelligence … Tomorrow

Geospatial Visualization

Geospatial Analytics

Future Focus

Page 34

Big Data. New Physics.

More Data: Better prediction
– Fewer false positives
– Fewer false negatives

More Data: Bad data good

More Data: Less compute effort

Page 35

Related Blog Posts

Algorithms At Dead-End: Cannot Squeeze Knowledge Out Of A Pixel

Puzzling: How Observations Are Accumulated Into Context

Big Data. New Physics.

Smart Sensemaking Systems, First and Foremost, Must be Expert Counting Systems

Your Movements Speak for Themselves: Space-Time Travel Data is Analytic Super-Food!

Big Data Flows vs. Wicked Leaks

Data Finds Data

“Macro Trends: The Privacy and Civil Liberties Consequences … and Comments on Responsible Innovation” – My DHS DPIAC Testimony, September 2008

Page 36

Big Data. New Physics. And Why Geospatial Data is Analytic SuperFood

Jeff Jonas, IBM Distinguished Engineer
Chief Scientist, IBM Entity Analytics

[email protected]

January 18th, 2011