Analytics from 330 million smartphones Sean Byrnes CTO & Co-founder

Post on 22-Feb-2016


Analytics from 330 million smartphones

Sean Byrnes

CTO & Co-founder

Flurry Overview

Flurry Analytics: Better apps on iOS, Android, BB, WP, HTML5

App Developers: 60,000

Live Applications: 160,000

Devices per month: 480M

Sessions per month: 33B

Events per month: 300B

AppCircle Network: Acquisition & Monetization on iOS, Android

App Developers: 6,200

Devices per month: 200M

Daily Completed Views: 3M

How Flurry Works

Flurry’s Scale

1.2 Billion Sessions / Day

900 Servers

1.56 PB

Topics

1. Big Data Collection (HDFS)

2. Big Data Processing (Hadoop)

3. Data Mining at Scale (HBase)

BIG DATA COLLECTION

Incoming Data

Peak Connections per Second: 25,000

Data per day: 1.5 TB

Data Collection

[Diagram, top to bottom: Reports; Load Balancer; Data Collector (x3); File (x3); HDFS]

Data Collection

[Diagram: Reports and HDFS shown at each of two sites, Location A and Location B]

BIG DATA PROCESSING


Agent Report

Normalization: Identify Device, Country, Carrier, etc.

Data Correction: Bad Phone Clocks, Partial Session Reports

De-duplication: Handle duplicate reports

Metrics Computation: Flexible calculation, Configurable Dimensions

Portfolio Analysis, Benchmarking, Clustering: Data mining and analysis

Audience Segmentation

Industry Trends

Application Analytics

Merchandising Analytics

Analytics Processing

Large-scale Data Processing

[Diagram: Input Data; Collectors; Real-Time and Batch paths; Consumer/Producer Systems; MapReduce (jobs); NoSQL DataStore; External Action (x2)]

Map/Reduce Management

• Challenge: Task Starvation

• Challenge: Task Roadblocking

• Challenge: Network Connection Waiting

Network Topology: Chained

[Diagram: Rack 1, Rack 2 and Rack 3, with Switch 1, Switch 2 and Switch 3 connected in a chain]

Network Topology: Star

[Diagram: racks on Switch 1, Switch 2, Switch 3 and Switch 4, connected through a trunk]

DATA MINING AT SCALE

Stages of Data

Normalized         OLAP Cube           Raw Data

80 Billion Rows    160 Billion Rows    500 Billion Records

NoSQL Tables

Index        Column Family A    Column Family B
111111111    Data               Data
222222222    Data               Data
333333333    Data               Data

NoSQL OLAP

metric.dimension

Index                                      Column Family A
metric.dimensionA                          #
metric.dimensionB                          #
metric.dimensionC                          #
metric.dimensionA.dimensionB.dimensionC    #
metric.dimensionA.dimensionB               #
metric.dimensionA.dimensionC               #
...
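The key list on the NoSQL OLAP slide suggests one row per combination of dimensions under a metric. A minimal sketch of how such row keys could be enumerated follows. The function name cube_row_keys is hypothetical, and the assumption that every non-empty combination gets a row (including ones the slide elides with "...") is ours, not stated in the deck.

```python
from itertools import combinations


def cube_row_keys(metric, dimensions):
    """Return one dotted row key per non-empty combination of dimensions.

    Hypothetical helper: the slide only shows the key shapes, not how
    they are generated.
    """
    keys = []
    for size in range(1, len(dimensions) + 1):
        # combinations() keeps the dimensions in their given order,
        # so dimensionA always precedes dimensionB in a key.
        for combo in combinations(dimensions, size):
            keys.append(".".join((metric,) + combo))
    return keys


keys = cube_row_keys("metric", ["dimensionA", "dimensionB", "dimensionC"])
for key in keys:
    print(key)
```

With three dimensions this yields seven keys, which illustrates why the row count of a cube grows quickly as dimensions are added.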

Lexicographical Ordering

metric.dimensionA.dimensionB

metric    dimensionA    dimensionB    index
3         1             1             311
3         1             11            3111
3         11            1             3111

Lexicographical Ordering

metric.dimensionA.dimensionB

metric    dimensionA    dimensionB    index
3         001           001           3001001
3         001           011           3001011
3         011           001           3011001
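The two Lexicographical Ordering slides contrast unpadded and zero-padded dimension IDs. The sketch below reproduces the slide's numbers to show the effect: without padding, two different dimension pairs collapse to the same index, while fixed-width padding keeps them distinct and sorted. The function name row_key and the width parameter are illustrative assumptions, not names from the deck.

```python
def row_key(metric, dim_a, dim_b, width=0):
    """Concatenate metric and dimension IDs into one index string.

    width is the number of digits each dimension ID is zero-padded to;
    width=0 leaves the IDs unpadded (hypothetical parameter).
    """
    return str(metric) + str(dim_a).zfill(width) + str(dim_b).zfill(width)


rows = [(3, 1, 1), (3, 1, 11), (3, 11, 1)]

unpadded = [row_key(*r) for r in rows]
padded = [row_key(*r, width=3) for r in rows]

print(unpadded)  # (1, 11) and (11, 1) produce the same index
print(padded)    # every row has a distinct index, in sorted order
```

The padding width has to cover the largest dimension ID that will ever be stored, since widening it later would change every existing key.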

NoSQL OLAP

metric.dimension.date

Index

metric.dimension.1_1_12

metric.dimension.3_1_12

Row Scan: metric, 1/1/12 to 3/1/12
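Appending the date to the row key means a date-range query becomes a contiguous scan over sorted keys. The sketch below imitates that with an in-memory sorted list and binary search. It is a stand-in for a sorted key-value store, not Flurry's code or a real client API. The function row_scan, the sample keys, and the inclusive stop key are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def row_scan(sorted_keys, start_key, stop_key):
    """Return all keys in [start_key, stop_key] from a sorted key list.

    Hypothetical helper: the stop key is inclusive here by choice.
    """
    lo = bisect_left(sorted_keys, start_key)
    hi = bisect_right(sorted_keys, stop_key)
    return sorted_keys[lo:hi]


# Sample keys in the slide's metric.dimension.date shape (invented data).
keys = sorted([
    "metric.dimension.1_1_12",
    "metric.dimension.2_1_12",
    "metric.dimension.3_1_12",
    "metric.dimension.4_1_12",
    "other.dimension.1_1_12",
])

hits = row_scan(keys, "metric.dimension.1_1_12", "metric.dimension.3_1_12")
for key in hits:
    print(key)
```

Note that date strings in this unpadded form only sort correctly while each field has the same number of digits, which is the same issue the Lexicographical Ordering slides address with zero-padding.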

blog.flurry.com

Sean Byrnes

sean@flurry.com

Flurry, Inc. 282 2nd St. Suite 202

San Francisco, CA 94105

http://www.flurry.com