What is Big Data? - AMA San Diego

Q2 Insights Presentation to San Diego AMA Art of Marketing 2015 Speaker: Kirsty Nunez Big Data January 16, 2015

Page 1

Q2 Insights Presentation to San Diego AMA Art of Marketing 2015

Speaker: Kirsty Nunez

Big Data

January 16, 2015

Page 2

What is Big Data?

Page 3

What is Big Data?

The exact definition of Big Data is open to interpretation. Fuzzy thinking about Big Data goes something like this: it’s something big, and it also has to do with data.

Big Data

Any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications

Large data sets that may be analyzed to reveal patterns, trends, and associations, especially relating to human behavior and interactions

Analysis is usually planned and may involve multiple sources of data (data sets)

By 2015, 4.4 million IT jobs will be created globally to support big data, with 1.9 million in the US

Source: http://www.ibmbigdatahub.com/enlarge-infographic/1642

Page 4

Big Data is a Rapidly Growing Industry

Source: Big Data Universe v3.. Matt Turck, Sutian Dong & FirstMark Capital, 2013

Page 5

The Four V’s of Big Data

The four key characteristics of Big Data are:

Volume

Velocity

Variety

Veracity

Source: http://www.ibmbigdatahub.com/enlarge-infographic/1642

Page 6

Volume: Scale of Data

40 Zettabytes (43 Trillion Gigabytes) of data will be created by 2020, an increase of 300 times from 2005

It is estimated that 2.5 Quintillion Bytes (about 2.3 billion Gigabytes) of data are created each day

Most companies in the US have at least 100 Terabytes (100,000 Gigabytes) of data stored

6 billion people have cell phones (world population: 7 billion)

Source: http://www.ibmbigdatahub.com/enlarge-infographic/1642
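The gigabyte figures in parentheses are unit conversions, and the daily one can be checked in a few lines. The sketch below is an illustration, not part of the original slides. The function and constant names are made up for the example. It shows both the decimal convention (1 gigabyte = 10^9 bytes) and the binary one (2^30 bytes), because the slide does not say which is used.

```python
# Two common conventions for "gigabyte"; the slide does not state which it uses.
DECIMAL_GIGABYTE = 10**9   # bytes
BINARY_GIGABYTE = 2**30    # bytes


def to_gigabytes(n_bytes: int, bytes_per_gigabyte: int) -> float:
    """Convert a byte count to gigabytes under the given convention."""
    return n_bytes / bytes_per_gigabyte


# 2.5 quintillion bytes, the daily figure quoted on the slide.
daily_bytes = 2_500_000_000_000_000_000

daily_decimal = to_gigabytes(daily_bytes, DECIMAL_GIGABYTE)
daily_binary = to_gigabytes(daily_bytes, BINARY_GIGABYTE)

print(f"decimal convention: {daily_decimal / 1e9:.1f} billion gigabytes per day")
print(f"binary convention:  {daily_binary / 1e9:.1f} billion gigabytes per day")
```

Under either convention, 2.5 quintillion bytes per day works out to a few billion gigabytes.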

Page 7

Velocity: Analysis of Streaming Data

The New York Stock Exchange captures 1 Terabyte of trade information during each trading session

Modern cars have close to 100 sensors that monitor items such as fuel level and tire pressure

By 2016, it is projected there will be 18.9 Billion network connections (almost 2.5 connections per person on earth)

Source: http://www.ibmbigdatahub.com/enlarge-infographic/1642

Page 8

Variety: Different Forms of Data

As of 2011, the global size of data in healthcare was estimated to be 150 Exabytes (161 Billion Gigabytes)

By 2014, it is anticipated there will be 420 million wearable, wireless health monitors

400 million tweets are sent per day by about 200 million monthly active users

30 billion pieces of content are shared on Facebook every month

Source: http://www.ibmbigdatahub.com/enlarge-infographic/1642

Page 9

Veracity: Uncertainty of Data

One in three business leaders do not trust the information they use to make decisions

Poor data quality costs the US economy around $3.1 trillion a year

27% of respondents in one survey were unsure of how much of their data was inaccurate

Source: http://www.ibmbigdatahub.com/enlarge-infographic/1642

Page 10

Veracity

Big Data Veracity refers to the biases, noise and abnormality in data. Is the data that is being stored and mined meaningful to the problem being analyzed?

Source: http://www.ibmbigdatahub.com/enlarge-infographic/1642

Data Accuracy

Data Fidelity

Data Truthfulness

Page 11

Types of Big Data

Page 12

Types of Big Data

Structured Data

Unstructured Data

Page 13

Structured Data

A data model is necessary to create structured data; this includes defining what will be stored and how.

Relational Databases

Spreadsheets

Data in a Fixed Field Within a Record or File

Page 14

Structured Data

Structured data is generated by computers or machines and humans.

Machines: Sensory Data, Web Log Data, Point of Sale Data, Financial Data

Humans: Input Data, Click-stream Data, Gaming-Related Data

Page 15

Unstructured Data

Unstructured data is generated by computers or machines and humans.

Machines: Satellite Images, Scientific Data, Photographs and Video, Radar or Sonar Data

Humans: Text Internal to a Company, Social Media Data, Mobile Data, Website Content

Page 16

Unstructured Data

Text documents

Email

Video

Audio

Stock Ticker Data

Financial Transactions

Unstructured data is growing faster than structured data, and it is predicted to account for 90% of all data created this decade.

Page 17

Uses of Big Data

Page 18

Uses of Big Data

Customer Engagement

Customer Retention and Loyalty

Marketing Optimization

Profiling Consumers

Tailored Advertising

How to Gain a Competitive Edge

More Strategic, Actionable Insights

Identify Valuable Marketing Opportunities

Monitoring Google Trends

Clearly Define Your Audience

Create Real-Time Personalization to Buyers

Identify Specific Content that Moves Buyers Down the Sales Funnel

Tailored Pricing

Tracking and Extracting Meaning From Social Network Information

Retail Habits

Page 19

Big Data Analysis Applications

Big Data Analytics

Social Media Analytics

Online Advertising

Display Marketing

Test Analytics

Retail Analytics

Customer Analytics

Forecasting

Pricing and Revenue Optimization

Predictive Modeling

Custom Insights

Custom Reporting

Custom Dashboards

Page 20

Big Data Analysis

Page 21

Data Sciences Involved in Big Data Analysis

Statistics

Econometrics

Machine Learning

Data Mining

Artificial Intelligence

Operations Research

Natural Language Processing

Page 22

Big Data Analysis Road Map

Big Data

Structured Big Data: Predictive Analysis (Classification, Regression, Clustering)

Unstructured Big Data: Tracking and Extraction of Meaning (Pattern Recognition / Clustering)

Page 23

Structured Data Analysis

Big Data is increasingly utilized in an attempt to predict consumer behavior. This is called Predictive Analytics. Predictive models exploit patterns in historical and transactional data to identify opportunities and risks.

A mortgage company uses its data to generate a list of good loan candidates

Amazon predicts customer product interests based on past behavior

Page 24

Structured Data Analysis Techniques

Classification Regression Clustering

Page 25

Structured Data Analysis: Classification

Classification is a data mining application where the variable of interest, the variable we want to predict, is categorical in nature.

Categorical data is used to distinguish between groups. Classification data mining techniques often take on descriptive and predictive aspects.

For example, we might want to find new categories of behavior that are strongly related to the main variable of interest.

Gender

Age Group

Location
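As a rough illustration of classification, the sketch below predicts a categorical outcome from gender, age group, and location. Everything in it is an assumption made for the example: the records are invented, the "purchased" target is hypothetical, and naive Bayes with add-one smoothing is just one of many classification techniques, not the presenter's method.

```python
from collections import Counter, defaultdict

# Hypothetical training records: (gender, age group, location) -> purchased?
TRAINING = [
    (("F", "18-34", "urban"), "yes"),
    (("F", "35-54", "urban"), "yes"),
    (("M", "18-34", "urban"), "yes"),
    (("M", "35-54", "rural"), "no"),
    (("F", "55+", "rural"), "no"),
    (("M", "55+", "rural"), "no"),
    (("F", "18-34", "rural"), "yes"),
    (("M", "55+", "urban"), "no"),
]


def train(records):
    """Count class frequencies and per-feature value frequencies per class."""
    class_counts = Counter(label for _, label in records)
    value_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    values_seen = defaultdict(set)        # feature index -> distinct values
    for features, label in records:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, features):
    """Return the most probable class using naive Bayes with add-one smoothing."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total
        for i, value in enumerate(features):
            score *= (value_counts[(i, label)][value] + 1) / (n + len(values_seen[i]))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(predict(model, ("F", "18-34", "urban")))
print(predict(model, ("M", "55+", "rural")))
```

The variable being predicted is categorical ("yes" or "no"), which is what makes this a classification problem rather than a regression problem.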

Page 26

Structured Data Analysis: Regression

Regression analysis is a tool used to understand the effect of one variable on another variable and the relative strengths of those effects. An example would include:

Determining the effect on sales if prices were increased by 5%

The goals of a regression problem are similar to those of a classification project. We would like to find the best predictors related to the variable of interest.
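The pricing example can be sketched with a simple least squares fit. The weekly price and sales figures are hypothetical, and a straight-line relationship between price and units sold is assumed purely to keep the illustration short.

```python
# Hypothetical weekly observations: (price in dollars, units sold).
OBSERVATIONS = [(8.0, 120.0), (9.0, 110.0), (10.0, 100.0), (11.0, 90.0), (12.0, 80.0)]


def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(OBSERVATIONS)

current_price = 10.0
new_price = current_price * 1.05   # a 5% price increase
current_sales = intercept + slope * current_price
new_sales = intercept + slope * new_price

print(f"slope: {slope:.1f} units per dollar")
print(f"predicted change in units sold: {new_sales - current_sales:.1f}")
```

The fitted slope is the estimated effect of price on sales, so multiplying it by the price change gives the predicted change in units sold.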

Page 27

Structured Data Analysis: Clustering

Clustering has quite a different goal than classification or regression. With clustering, a variable of interest does not exist. Instead, we attempt to sort the data into clusters. Examples include:

Cluster individuals for a marketing campaign

Cluster products purchased based on customer survey responses

The Netflix model clusters customers into movie categories and makes recommendations based on their movie watching history
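A minimal sketch of clustering, assuming k-means as the technique (the slides do not name one) and a handful of invented customers described by two numbers. No variable of interest is supplied; the algorithm only groups similar records.

```python
import math

# Hypothetical customers: (purchases per month, average basket in dollars).
# In practice the features would usually be rescaled to comparable ranges first.
CUSTOMERS = [(1, 20), (2, 25), (1, 22), (8, 90), (9, 95), (10, 85)]


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]


def k_means(points, centroids, max_iterations=100):
    """Plain k-means: alternate assignment and centroid update until stable."""
    labels = assign(points, centroids)
    for _ in range(max_iterations):
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        new_labels = assign(points, new_centroids)
        centroids = new_centroids
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Seed with the first and last customers to keep the example deterministic.
labels, centroids = k_means(CUSTOMERS, [CUSTOMERS[0], CUSTOMERS[-1]])
print(labels)
print(centroids)
```

Each label is a cluster index, which a marketer could then interpret, for example as occasional small-basket shoppers versus frequent large-basket shoppers.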

Page 28

Unstructured Data Analysis

To make sense of unstructured data, different methodologies are employed to identify patterns or clusters in the data, including Data Text Mining, Text Analysis, Neural Network Analysis and Social Network Analysis. For example, Social Network Analysis might be deployed to:

See who is talking about the brand

Determine who are major influencers or connectors and what they are saying

Understand not only what is being said in the social media sphere but also identify the most efficient messengers

Produce social network maps or hyperlink maps
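The first two uses of Social Network Analysis can be sketched on a toy mention network. The posts, account names, and the choice of mention counts (in-degree) as a stand-in for influence are all assumptions for illustration; real analyses typically use richer centrality measures and much larger graphs.

```python
from collections import Counter

# Hypothetical posts: (author, accounts mentioned, text).
POSTS = [
    ("ana", ["brand"], "Loving the new release from @brand"),
    ("ben", ["ana", "brand"], "Agree with @ana about @brand"),
    ("cy", ["ana"], "Good point @ana"),
    ("dee", ["ana", "ben"], "@ana @ben what do you think?"),
    ("ben", ["cy"], "Thanks @cy"),
]

BRAND = "brand"

# Who is talking about the brand?
brand_talkers = sorted({author for author, mentions, _ in POSTS if BRAND in mentions})

# In-degree: how often each person (not the brand account) is mentioned by others.
mention_counts = Counter(
    target
    for author, mentions, _ in POSTS
    for target in mentions
    if target != BRAND and target != author
)

print("Talking about the brand:", brand_talkers)
print("Most mentioned:", mention_counts.most_common(1))
```

The same list of (author, mention) pairs is also the edge list a network map would be drawn from.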

Page 29

Big Data Analysis

Structured Data Unstructured Data

Classification

Regression

Pattern Recognition / Clustering

Other

Pattern Recognition / Clustering

Factor Analysis

Principal Components Analysis

Linear / Non-Linear Regressions

Logistic Regression

Neural Network Analysis

Clustering

Data Text Mining

Text Analysis

Neural Network Analysis

Social Network Analysis

A / B Testing

Time Series Analysis

Optimization

Structural Equation Models

Discrete Choice Models

Page 30

One Approach to Big Data Analysis: Data Mining

Page 31

Data Mining and Big Data

Data Mining is an analytic process designed to explore data in search of consistent patterns and / or relationships, and then to validate findings by applying them to new subsets of data

Page 32

Data Mining

To decipher Big Data, we have to data mine. Data mining is a powerful set of methodologies that, when successfully applied, will:

Increase business revenue

Cut costs

Support other actions to improve the bottom line

Page 33

Ultimate Goal of Data Mining

Predictive data mining is the most common type of data mining and the one that has the most direct business applications.

Prediction

Page 34

Big Data and Prediction

Modeling predicts customer actions and interactions

Statistical Techniques predict market size for a product or service

Analytics predicts market share, products and pricing

Brand Perceptions predict brand choice

Discrete Choice Conjoint Techniques predict the best mix of products or services

Advanced Analytics and Modeling predict specific facets of a company’s customers

Page 35

Stages of Data Mining

The process of data mining consists of three stages:

Exploration: Business Understanding, Data Understanding, Data Preparation

Model Building: Modeling, Validation, Predictive Data Mining

Deployment: Evaluation, Implementation

Page 36

Exploration: Business Understanding

Outline Project Objectives and Requirements

Convert Objectives and Requirements into a Data Mining Problem Definition

Using the Problem Definition, a preliminary plan to achieve the business objectives is developed

Page 37

Exploration: Data Understanding

Before working with any data set, the data mining expert must become familiar with the data by:

Identifying data quality problems

Detecting preliminary insights into the data

Exploring the possibility of interesting data subsets in which useful information may be hidden
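Identifying data quality problems usually starts with a simple profile of the data set. The sketch below assumes hypothetical customer records and a few illustrative checks (missing values, duplicate identifiers, implausible ages, negative spend). The field names and thresholds are invented for the example.

```python
# Hypothetical customer records; None marks a missing value.
RECORDS = [
    {"customer_id": 1, "age": 34, "region": "West", "spend": 120.0},
    {"customer_id": 2, "age": None, "region": "West", "spend": 80.0},
    {"customer_id": 3, "age": 41, "region": None, "spend": -15.0},
    {"customer_id": 3, "age": 41, "region": "South", "spend": 60.0},
    {"customer_id": 4, "age": 230, "region": "East", "spend": 95.0},
]


def profile(records):
    """Summarize common data quality problems in a list of dict records."""
    fields = records[0].keys()
    missing = {f: sum(1 for r in records if r[f] is None) for f in fields}
    ids = [r["customer_id"] for r in records]
    duplicate_ids = sorted({i for i in ids if ids.count(i) > 1})
    implausible_age = [r["customer_id"] for r in records
                       if r["age"] is not None and not 0 <= r["age"] <= 120]
    negative_spend = [r["customer_id"] for r in records if r["spend"] < 0]
    return {"missing": missing, "duplicate_ids": duplicate_ids,
            "implausible_age": implausible_age, "negative_spend": negative_spend}


report = profile(RECORDS)
for check, result in report.items():
    print(check, result)
```

A report like this tells the analyst which fields need attention before any modeling begins.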

Page 38

Exploration: Data Preparation

A final data set is developed that will be used during modeling. Raw data may be manipulated multiple times to achieve the final data set. Techniques in this phase include:

Tabling

Recording

Attribute Selection

Transformation of Data

Cleaning of Data

Depending on the analytic problem, the process of data mining ranges from choosing straightforward predictors for a regression model to exploratory analyses using a variety of graphical and statistical methods to identify the most relevant variables and determine the complexity and / or nature of models.
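Three of the preparation techniques named on this slide (cleaning, transformation, and attribute selection) can be sketched on a tiny raw data set. The records, field names, and the specific rules (trimming text, a log transform, dropping always-empty fields) are assumptions chosen for illustration.

```python
import math

# Hypothetical raw records, as they might arrive from several sources.
RAW = [
    {"id": "1", "region": " west ", "spend": "120.50", "fax": ""},
    {"id": "2", "region": "WEST", "spend": "", "fax": ""},
    {"id": "3", "region": "East", "spend": "80", "fax": ""},
]


def clean(record):
    """Cleaning: trim and normalize text, turn empty strings into None."""
    region = record["region"].strip().title() or None
    spend = float(record["spend"]) if record["spend"] else None
    return {"id": int(record["id"]), "region": region, "spend": spend,
            "fax": record["fax"] or None}


def transform(record):
    """Transformation: add a log-scaled spend where a spend value exists."""
    out = dict(record)
    out["log_spend"] = math.log1p(record["spend"]) if record["spend"] is not None else None
    return out


def select_attributes(records):
    """Attribute selection: drop fields that are missing in every record."""
    keep = [f for f in records[0] if any(r[f] is not None for r in records)]
    return [{f: r[f] for f in keep} for r in records]


prepared = select_attributes([transform(clean(r)) for r in RAW])
for row in prepared:
    print(row)
```

In practice these steps are repeated several times, since problems found during modeling often send the analyst back to preparation.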

Page 39

Model Building: Validation

Modeling techniques are dictated by the nature of the data. Typically, several modeling techniques will be employed. As some techniques have specific requirements on the form of the data, returning to the data preparation phase is often required.

Validation is the process of considering various models and choosing the best one based on predictive performance.
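Choosing among models by predictive performance can be sketched as follows. The data, the two candidate models (a constant average and a straight line), and the error measure (mean absolute error on a holdout subset) are all assumptions made for the example.

```python
# Hypothetical (ad spend, sales) pairs, split into training and holdout subsets.
TRAIN = [(1.0, 12.0), (2.0, 14.0), (3.0, 17.0), (4.0, 19.0), (5.0, 22.0)]
HOLDOUT = [(6.0, 24.0), (7.0, 27.0)]


def fit_mean(points):
    """Candidate 1: always predict the average of the training targets."""
    mean_y = sum(y for _, y in points) / len(points)
    return lambda x: mean_y


def fit_line(points):
    """Candidate 2: least squares line through the training points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return lambda x: mean_y + slope * (x - mean_x)


def mean_absolute_error(model, points):
    """Average absolute gap between predictions and observed values."""
    return sum(abs(model(x) - y) for x, y in points) / len(points)


candidates = {"mean": fit_mean(TRAIN), "line": fit_line(TRAIN)}
scores = {name: mean_absolute_error(m, HOLDOUT) for name, m in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

The key point is that the candidates are scored on data they were not fitted to, so the comparison reflects predictive performance rather than fit to the training subset.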

Page 40

Model Building: Predictive Data Mining

There are a variety of techniques developed to achieve validation, many of which are based on “competitive evaluation of models,” which applies different models to the same data set and then compares performance to choose the best.

Bagging (Voting, Averaging)

Boosting

Stacking

Meta-Learning
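Of the techniques listed, bagging with voting is the simplest to sketch: fit the same kind of model on many bootstrap samples and let the fitted models vote. The data set, the one-threshold "stump" model, and the parameter values below are hypothetical choices for illustration only.

```python
import random
from collections import Counter

# Hypothetical data: (monthly visits, responded to campaign?).
DATA = [(1, "no"), (2, "no"), (3, "no"), (4, "no"),
        (6, "yes"), (7, "yes"), (8, "yes"), (9, "yes")]


def fit_stump(sample):
    """Pick the threshold that best separates 'yes' (above) from 'no' (at or below)."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({x for x, _ in sample}):
        correct = sum((x > threshold) == (label == "yes") for x, label in sample)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def bagged_stumps(data, n_models=25, seed=0):
    """Bagging: fit one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(data, k=len(data))) for _ in range(n_models)]


def vote(thresholds, x):
    """Combine the stumps by majority vote."""
    votes = Counter("yes" if x > t else "no" for t in thresholds)
    return votes.most_common(1)[0][0]


thresholds = bagged_stumps(DATA)
print(vote(thresholds, 2), vote(thresholds, 8))
```

For a numeric target the same idea applies with averaging instead of voting.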

Page 41

Deployment

The application of the model to new data in order to generate predictions

Page 42

Evaluation and Implementation

Data mining and the resulting knowledge are powerful tools. This process can provide marketers with actionable knowledge to inform and drive marketing strategy, which in turn can have significant impact on business profitability.

Before deploying a model it is critical to thoroughly evaluate and review the steps executed to be certain it addresses business objectives.

The knowledge gained from the modeling must be organized and presented in a way that management can understand.

Deployment can be as simple as generating a report or as complex as implementing a repeatable data mining process.

Page 43

Big Data and Market Research

Page 44

Big Data and Market Research

Marketers need to ensure their insights and results are valid and determine whether they are being applied in a valid manner. Whenever we need to gather information, there are essentially five questions we answer.

Who What When Where Why

Page 45

Big Data and Market Research

Four components of the Five W’s are provided by structured Big Data and other data sources.

Who: The who question identifies the various players in a problem or solution.

What: The what question tries to ascertain what consumers are buying, trends, and services used.

When: The when question considers various time-based events and activities, such as when customers are buying products or services (e.g. day part, date range, or life stage).

Where: The where question addresses geographic and/or logistical aspects of a solution.

Page 46

Big Data and Market Research

With the increasing prevalence and accessibility of Big Data, businesses are already provided the Who, What, When, and Where. But Big Data cannot provide the Why. This is where Market Research comes in. As long as humans continue to be inconsistent, impulsive, dynamic, and subtle, breakthroughs solely dependent on Big Data will be elusive.

The emotional, rational, and irrational drivers of customers cannot be explained by Big Data. And as the prevalence of Big Data increases, the number of questions that are raised will increase; these questions are best addressed by traditional market research.

Who What When Where Why

Page 47

Qualitative Methods to Answer Why

Answering the why question is most effectively achieved by the integration of quantitative and qualitative research methodologies, with qualitative being used to answer the why question and quantitative being used to verify and quantify the findings.

Applied appropriately, these methods will result in a collection of textual, visual and oral data that will need to be analyzed through textual analysis. This qualitative analysis provides insight into customers’ attitudes, behaviors, and thought processes.

Focus Groups

In-Depth Interviews

Observation (Ethnography)

Social Networks

Guided Online Chats

Page 48

Big Data and Market Research

Nothing beats knowing why people make the choices they do.

Big Data finds the patterns; market researchers test the hypotheses.

Page 49

Big Data and Market Research

While Big Data is sometimes touted as the magic bullet to address all market research questions, it is not the answer to all questions and insights to be obtained. Big Data has its place in the array of market research methodologies and is an ever-growing presence.

Before Big Data, primary research conducted by market researchers focused on what was happening.

Now, with that requirement increasingly solved by Big Data, market researchers can focus on why there are deviations from trends.

Page 50

Thank You

Q2 Insights, Inc.

San Diego

2236 Encinitas Blvd., Suite F

Encinitas, CA 92024

Phone: 760-230-2950

Fax: 760-230-2951

New Orleans

1070-B West Causeway Approach

Mandeville, LA 70471

Phone: 985-867-9494

[email protected]