Why Big Data is Really about Small Data

Why Big Data is Really About Small Data: The Big Data Paradox. Judith Hurwitz, President & CEO, Hurwitz & Associates

description

This presentation explains how big data is transforming the way data is managed and provides a context on why it is essential to get to the data that matters.

Transcript of Why Big Data is Really about Small Data

Page 1: Why Big Data is Really about Small Data

Why Big Data is Really About Small Data: The Big Data Paradox

Judith Hurwitz

President & CEO, Hurwitz & Associates

Page 2: Why Big Data is Really about Small Data

Agenda

§  What is so big about Big Data?
§  What is a data scientist?
§  Data at rest, data in motion
§  Is Big Analytics more important?
§  Rethinking data modeling in a big data world
§  A couple of examples
§  What you should think about
§  Questions?

Page 3: Why Big Data is Really about Small Data

Meet the Speaker

§  Judith Hurwitz
§  President and CEO of Hurwitz & Associates, Inc., a strategy consulting and research firm focused on distributed computing technologies. A pioneer in anticipating technology innovation and adoption, Judith advocates for a pragmatic adoption of an architectural and business approach to the emerging market for cloud computing, service orientation, and service management. She has served as a trusted advisor to many industry leaders over the years. Judith has helped these companies make the transition to a new business model focused on the business value of emerging platforms. Judith is an accomplished author and most recently co-author of Big Data for Dummies.

Page 4: Why Big Data is Really about Small Data


Our Team’s Latest Book

Page 5: Why Big Data is Really about Small Data

What is so big about big data?

§  Definition of Big Data
    §  Volume – How much data
    §  Variety – Various types of data (structured, unstructured)
    §  Velocity – Speed at which data moves from one location to another
    §  Veracity – Accuracy (Do the results of a big data analysis make sense?)

§  Big Data is not new

§  So, why now?

§  Impacting the way you collect, store, manage, analyze, and visualize data

Page 6: Why Big Data is Really about Small Data

What is the Purpose of Big Data?

§  Gather, store, manage, and manipulate vast amounts of data at the right speed, at the right time to get the right results

§  Gather enough data so that you can find patterns

§  Put those patterns to work to gain insights in context


Page 7: Why Big Data is Really about Small Data

Examples of Big Data

§  Analyze multiple data sources to detect and protect against insider trading, money laundering, credit card theft

§  Monitoring market feeds
§  Managing risk models
§  Log files
§  Spatial data from sensors
§  Medical device data – data from sensors connected to medical equipment
§  GPS data
§  Unstructured data in emails, text messages, call center notes

Page 8: Why Big Data is Really about Small Data

Why do we need to think about Big Data?

§  What big data means to business

§  More data for better decision making

§  Integration of data across business units and silos

§  Detecting risks in real time

§  Focus on putting information in context to support business decisions

§  Improving the customer experience by leveraging customer feedback from many different sources


Page 9: Why Big Data is Really about Small Data

From Big to Small

•  Big data is only the first step in the journey

•  Big data requires that you reduce the amount of data to a subset so that your organization can take a deeper look

•  Once this subset of data is cleansed and verified, it can help analyze, predict, and prepare to address the future
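As a rough illustration of the "big to small" step, the sketch below reduces a raw record set to one slice, then cleanses and de-duplicates it. The field names (`id`, `region`, `amount`) and the function `reduce_to_subset` are hypothetical, chosen only for this example; real cleansing rules depend on the data.

```python
from typing import Iterable

def reduce_to_subset(records: Iterable[dict], region: str) -> list[dict]:
    """Keep only records for one region, then cleanse and de-duplicate them."""
    seen_ids = set()
    subset = []
    for rec in records:
        if rec.get("region") != region:
            continue  # reduce: outside the slice we want to look at
        amount = rec.get("amount")
        if rec.get("id") is None or not isinstance(amount, (int, float)):
            continue  # cleanse: drop incomplete or malformed rows
        if rec["id"] in seen_ids:
            continue  # verify: drop duplicate ids
        seen_ids.add(rec["id"])
        subset.append(rec)
    return subset

# Hypothetical raw data with the usual problems mixed in.
raw = [
    {"id": 1, "region": "east", "amount": 120.0},
    {"id": 1, "region": "east", "amount": 120.0},  # duplicate
    {"id": 2, "region": "west", "amount": 75.5},   # other region
    {"id": 3, "region": "east", "amount": None},   # missing value
    {"id": 4, "region": "east", "amount": 42.0},
]
small = reduce_to_subset(raw, "east")
```

Only the cleansed subset in `small` would move on to deeper analysis.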


Page 10: Why Big Data is Really about Small Data

The Role of a Data Scientist?

§  Combining computing science, math, statistics, and business (domain) knowledge

§  Looking for answers when you don’t know the question you want to ask

§  Asking new types of questions: finding nuggets of actionable information in huge volumes of data

§  Making analytics consumable: real-time analysis to help the business take the right action at the right time

§  Predictive analytics: What is the next best action?


Page 11: Why Big Data is Really about Small Data

Representation Technology Stack

[Diagram: Big Data Tech Stack. Components shown:]

§  Redundant Physical Infrastructure
§  Security Infrastructure
§  “Organizing” Databases and Tools
§  Analytical Data Warehouses and Data Marts
§  Interfaces and feeds from/to the Internet
§  Interfaces and feeds from/to internal applications
§  Big Data Applications
§  Reporting & Visualization
§  Analytics (Traditional and Advanced)
§  Operational Databases (Structured, Unstructured, Semi-structured)

Page 12: Why Big Data is Really about Small Data


Where Most of This Began

[Diagram: Transactional System (Production Data); Data Warehouse; Data Mart]

Page 13: Why Big Data is Really about Small Data

Then It Got “Better”

[Diagram: two separate sets of Transactional System (Production Data); Data Warehouse; Data Mart]

Page 14: Why Big Data is Really about Small Data

Then It Got “More Better”

[Diagram: two Operational Systems; Transactional System(s); Data Warehouse; three LOB Data Marts]

Page 15: Why Big Data is Really about Small Data

And Better Still

[Diagram: two Operational Systems; Transactional System(s); Staging Area; Data Warehouse; three LOB Data Marts]

Page 16: Why Big Data is Really about Small Data

Oops. Data at rest vs. data in motion

[Diagram: two Operational Systems; Transactional System(s); Staging Area; ????]

Page 17: Why Big Data is Really about Small Data

Data At Rest, Data In Motion

§  Data in motion is no longer a bad thing
§  Trend is combining “traditional” with streaming
§  Instant analysis isn’t fast enough

§  It’s all about real-time

§  What data to keep?
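One way to picture combining "traditional" data at rest with streaming data is the sketch below: events are processed one at a time as they arrive, only a small rolling window is kept, and each alert is enriched from a static lookup table. The `CUSTOMER_TIER` table, the event fields, and the threshold are all assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, Iterator

# Data at rest: a small reference table (hypothetical customer tiers).
CUSTOMER_TIER = {"c1": "gold", "c2": "standard"}

def rolling_alerts(events: Iterable[dict], window: int = 3,
                   threshold: float = 100.0) -> Iterator[dict]:
    """Yield an alert when the mean of the last `window` amounts exceeds `threshold`."""
    recent = deque(maxlen=window)  # only the window is kept; older data is dropped
    for event in events:           # data in motion: one event at a time
        recent.append(event["amount"])
        mean = sum(recent) / len(recent)
        if len(recent) == window and mean > threshold:
            yield {
                "customer": event["customer"],
                # Enrich the streaming event with data at rest.
                "tier": CUSTOMER_TIER.get(event["customer"], "unknown"),
                "rolling_mean": mean,
            }

# Hypothetical event stream.
stream = [
    {"customer": "c1", "amount": 50.0},
    {"customer": "c2", "amount": 90.0},
    {"customer": "c1", "amount": 200.0},
    {"customer": "c3", "amount": 40.0},
]
alerts = list(rolling_alerts(stream))
```

The bounded `deque` is also a simple answer to "what data to keep?": here, only the last few values survive.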

Page 18: Why Big Data is Really about Small Data

Is Big Analytics More Important?

§  In a word: YES
§  We are looking for answers to questions we haven’t asked yet
§  Patterns, patterns, patterns
§  But…
    §  Current generation analytics engines can be overwhelmed
    §  Results may be too difficult to understand even with visualization
    §  You may be looking in the wrong place or at the wrong things

Page 19: Why Big Data is Really about Small Data

Is Hadoop the New EDW?

§  No one type of Big Data platform is optimal for all requirements

§  Hadoop is changing the economics of storing and analyzing large volumes and variety of data

§  Results of Hadoop analytics need to be understood in context

§  Increasing importance of hybrid big data architectures – combine Hadoop with your systems of record

§  Hadoop for specific roles
    §  Exploratory data-science sandboxes
    §  Staging platform for unstructured data

Page 20: Why Big Data is Really about Small Data

Rethinking Data Modeling

§  Traditional data models assume:
    §  Relational data
    §  Clean data
    §  A few clearly identifiable data sources

§  Next generation data model – the rules have changed
    §  Some relational data, some NoSQL
    §  Some of the data is dirty
    §  Lots of data sources coming from many different places
    §  Some of the data you will keep and some you will not

§  Design your data model to account for the new world of large and varied data sources
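A minimal sketch of what such a model might look like in code: one shared record type with optional fields, fed by a relational row and by a schemaless JSON document, with dirty input rejected instead of crashing the load. The `CustomerEvent` type and both mapper functions are hypothetical names invented for this example.

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class CustomerEvent:
    customer_id: str
    source: str
    text: Optional[str] = None      # free text may be absent
    amount: Optional[float] = None  # numeric value may be absent

def from_sql_row(row: tuple) -> CustomerEvent:
    """Map a relational row (customer_id, amount) to the shared model."""
    customer_id, amount = row
    return CustomerEvent(str(customer_id), "sql", amount=float(amount))

def from_json_doc(doc: str) -> Optional[CustomerEvent]:
    """Map a schemaless JSON document to the shared model, or None if unusable."""
    try:
        data = json.loads(doc)
    except json.JSONDecodeError:
        return None  # dirty data: not valid JSON
    if not isinstance(data, dict) or "customer" not in data:
        return None  # dirty data: no way to identify the customer
    return CustomerEvent(str(data["customer"]), "nosql", text=data.get("note"))

events = [
    from_sql_row((17, "19.99")),
    from_json_doc('{"customer": 17, "note": "late delivery"}'),
    from_json_doc("{broken"),  # rejected, returns None
]
kept = [e for e in events if e is not None]
```

Records from both sources end up keyed by the same `customer_id`, so later analysis does not need to know where each one came from.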


Page 21: Why Big Data is Really about Small Data

Big Data Use Cases

§  “Voice of the Customer”, 360-degree view of customer
§  Strengthen brand and increase customer loyalty
§  Improve operational analytics
§  Target and reduce fraud and improve security
§  Use sensors to provide real-time information about rivers and oceans to predict impact of environmental changes

Page 22: Why Big Data is Really about Small Data

Correlating Varied Data Sources in Finance

§  Financial services is highly competitive and highly regulated. Firms need to create innovative customer experiences while protecting IP, and they need to anticipate the next best action.

§  What type of data is needed?
    §  Transaction data
    §  Threat data
    §  Log data
    §  Customer survey data
    §  Customer support data
    §  Customer social media data
    §  Partner data
    §  News and event data, …

§  Need to be able to correlate all types of structured and unstructured data to predict the future and provide opportunities for growth and expansion


Page 23: Why Big Data is Really about Small Data

Advanced Security Analytics to Predict and Protect

§  Government agency needed more visibility into all system traffic

§  Concern about the unknown – needed to look for and protect from malicious activity

§  Used advanced security analytics to correlate data across seemingly unrelated events

§  Real-time
§  Analyze a variety of data sources: emails, documents, social media data, business process data, DNS transactions
§  Analyze massive amounts of structured and unstructured data
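Correlating "seemingly unrelated events" can be sketched as grouping events by a shared key and flagging keys that show up in more than one source within a short time window. This is only an illustrative approach, not the agency's actual method; the event fields (`ts`, `user`, `source`, `detail`) and the 60-second window are assumptions.

```python
from collections import defaultdict

def correlate(events: list[dict], window_seconds: int = 60) -> dict[str, list[dict]]:
    """Return users whose events span 2+ sources within `window_seconds`."""
    by_user = defaultdict(list)
    for ev in sorted(events, key=lambda e: e["ts"]):
        by_user[ev["user"]].append(ev)

    suspicious = {}
    for user, evs in by_user.items():
        for i, first in enumerate(evs):
            # Events for this user that fall inside the window starting at `first`.
            close = [e for e in evs[i:] if e["ts"] - first["ts"] <= window_seconds]
            if len({e["source"] for e in close}) >= 2:
                suspicious[user] = close
                break
    return suspicious

# Hypothetical events from two sources; timestamps are seconds.
events = [
    {"ts": 0,   "user": "alice", "source": "email", "detail": "large attachment"},
    {"ts": 30,  "user": "alice", "source": "dns",   "detail": "rare domain lookup"},
    {"ts": 10,  "user": "bob",   "source": "email", "detail": "newsletter"},
    {"ts": 500, "user": "bob",   "source": "dns",   "detail": "rare domain lookup"},
]
flagged = correlate(events)
```

Here only `alice` is flagged: her email and DNS events fall within the same minute, while `bob`'s are far apart.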

Page 24: Why Big Data is Really about Small Data

Matching Capabilities to Business Problems

Capabilities:

§  Text Analytics
§  Next Best Action
§  Data in Motion
§  Adding business process and rules
§  Anomaly Detection
§  Data Visualization

Business problems:

§  Correlation between customer service, comments in the market, customer management
§  Putting a lot of data types together to determine best actions
§  Detecting Fraud

Page 25: Why Big Data is Really about Small Data

How Do You Manage Big Data?

§  Big data is not clean – it is massive and much is unstructured

§  Resulting patterns from big data analytics need to be culled, cleaned, and matched to enterprise data

§  Culled data now must be analyzed in context with your systems of record

§  Apply data visualization and best practices to determine how to apply data to actions
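The cull-and-match step can be sketched as follows: weak patterns are dropped, and the rest are joined to a system of record so they can be read in context. The `SYSTEM_OF_RECORD` table, the `score` field, and the 0.8 cutoff are hypothetical values picked for this example.

```python
# Hypothetical system of record, keyed by customer id.
SYSTEM_OF_RECORD = {
    "c1": {"name": "Acme Corp", "segment": "enterprise"},
    "c2": {"name": "Bolt LLC", "segment": "smb"},
}

def match_to_records(patterns: list[dict], min_score: float = 0.8) -> list[dict]:
    """Cull weak patterns, then attach system-of-record context to the rest."""
    matched = []
    for p in patterns:
        if p["score"] < min_score:
            continue  # cull: too weak to act on
        record = SYSTEM_OF_RECORD.get(p["customer"])
        if record is None:
            continue  # no known customer, so no context to analyze it in
        matched.append({**p, **record})
    return matched

# Hypothetical output of a big data analytics run.
patterns = [
    {"customer": "c1", "pattern": "churn risk", "score": 0.92},
    {"customer": "c2", "pattern": "churn risk", "score": 0.40},
    {"customer": "c9", "pattern": "upsell", "score": 0.95},
]
actionable = match_to_records(patterns)
```

What remains in `actionable` is the small, contextualized data set that visualization and business rules can then turn into actions.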


Page 26: Why Big Data is Really about Small Data

You need to think about the following:

§  Where are the sources of the data that could be important?

§  How often do you need access to particular types of data?

§  How much data do you need to keep, and for how long?

§  Can you trust the data and its sources?

§  Use Big Data analytics to overcome conventional wisdom and conventional thinking.

§  If you already know the questions to ask, you aren’t moving forward.


Page 27: Why Big Data is Really about Small Data

Q & A

§  Thank you!

§  Contact info:
    §  Judith Hurwitz: [email protected]
