Phases of Big Data Challenges @ Nokia

Yekesa Kosuru, HERE.com, Nokia. Hadoop Innovation Summit, February 20 & 21, 2013, San Diego.

description

In this presentation, Yekesa Kosuru discusses the challenges Nokia faces with Big Data and the solutions Nokia uses across its platforms. The solutions are broken into phases, which Kosuru walks through in detail with statistics and flow charts.

Transcript of Phases of Big Data Challenges @ Nokia

Page 1: Phases of Big Data Challenges @ Nokia

Yekesa Kosuru, HERE.com

Nokia

Hadoop Innovation Summit February 20 & 21, San Diego 2013

Phases of Big Data Challenges @ Nokia


Page 2: Phases of Big Data Challenges @ Nokia

• Phases of Big Data Challenges @Nokia

– Who we are

– Big data platform

– Use case data flows

– High level architecture

– Challenges

• Phases of challenges

Agenda


Page 3: Phases of Big Data Challenges @ Nokia

Accelerometer

GPS

Waterproof

12h Battery

Bluetooth

2GB Storage

Barometer

NFC

Gyroscope

Magnetometer

Who we are – disrupting the future


Page 4: Phases of Big Data Challenges @ Nokia

Apps

Smart Data

Platform

Content

Positions, Maps, Traffic, Places, Directions, Guidance

Location Platform, Enabling Contextually Rich Mobile Experiences


Page 5: Phases of Big Data Challenges @ Nokia


Big Data Flows and Differentiates

Nokia Account

We Collect User Data…

…on All Supported Platforms…

…to Be Made Available for Analysis

Big Data Analytics

Enabling feedback loops for continuous improvement, Location Optimized Experience, CRM, etc.


Page 6: Phases of Big Data Challenges @ Nokia


Phase 0


2008 – '10: Build Technology Platform, Get Data

Page 7: Phases of Big Data Challenges @ Nokia


Business Challenges

• Data silos, no unique identifiers, missing semantics

• Multiple sources - overlapping, conflicting

• Timely processing of large volumes & velocity of data

• Partial, insufficient, inaccurate, inconsistent data

• Data/wire formats, Security, privacy and other policies unknown

Central Big Data Platform created
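
The "no unique identifiers" and "overlapping, conflicting" bullets describe a record-matching problem. The sketch below is one illustrative way to approach it, not the method used at Nokia: `match_key`, the rounding precision, the priority rule, and the sample place records are all hypothetical.

```python
def match_key(record, precision=3):
    """Hypothetical matching key: normalized name plus rounded coordinates.

    Rounding latitude to 3 decimals groups points within roughly 100 m,
    standing in for a unique identifier that the sources do not share.
    """
    name = " ".join(record["name"].lower().split())
    return (name, round(record["lat"], precision), round(record["lon"], precision))


def merge_sources(sources):
    """Merge records from sources given in priority order.

    When two sources disagree on a field, the earlier source wins;
    fields that only a later source has are still kept.
    """
    merged = {}
    for source in sources:
        for record in source:
            key = match_key(record)
            combined = dict(record)
            combined.update(merged.get(key, {}))  # higher-priority fields win
            merged[key] = combined
    return list(merged.values())


# Made-up sample data: the same place reported by two sources.
source_a = [{"name": "Joe's  Cafe", "lat": 32.71571, "lon": -117.16112,
             "phone": "555-0100"}]
source_b = [{"name": "joe's cafe", "lat": 32.71574, "lon": -117.16109,
             "phone": "555-0199", "hours": "7-19"}]

places = merge_sources([source_a, source_b])
print(places)
```

Here the two records collapse into one: the phone number comes from the higher-priority source and the opening hours from the other. Rounding-based keys miss matches that straddle a rounding boundary, so a real system would need a more careful matching step.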

Page 8: Phases of Big Data Challenges @ Nokia


Using different big data sets…

…to verify Map accuracy and create Motion Graph

Page 9: Phases of Big Data Challenges @ Nokia

Big Data Analytics Platform Data Flows

Web Applications

Device Applications

Probes

3rd Party

Activity Logs

Reference Data

Device

User Profile

POI, Map

Activity, Sensor

Data Intake

HDFS

ETL, data crunching, attribution, ML Algorithms

Aggregation

Analytics Cluster

Analytical DBMS

VShards (NoSQL)

Data Asset Catalog

Reports

Dashboards

Data Discovery

Interactive Queries

Batch Queries
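
The intake, ETL, and aggregation stages named on this slide can be illustrated with a toy in-memory flow. This is only a sketch under assumptions: the record fields, the `etl` and `aggregate` helpers, and the sample events are invented for illustration and say nothing about the actual jobs run on the platform.

```python
from collections import defaultdict

# Hypothetical raw records, standing in for what data intake might land in HDFS.
RAW_EVENTS = [
    {"device": "d1", "type": "probe", "speed_kmh": "42"},
    {"device": "d2", "type": "probe", "speed_kmh": "n/a"},  # malformed value
    {"device": "d1", "type": "probe", "speed_kmh": "38"},
    {"device": "d3", "type": "search", "query": "coffee"},
]


def etl(events):
    """Keep probe events whose speed parses as a number; drop the rest."""
    cleaned = []
    for event in events:
        if event.get("type") != "probe":
            continue
        try:
            speed = float(event["speed_kmh"])
        except (KeyError, ValueError):
            continue
        cleaned.append({"device": event["device"], "speed_kmh": speed})
    return cleaned


def aggregate(records):
    """Average speed per device, the kind of summary a query layer could serve."""
    totals = defaultdict(lambda: [0.0, 0])
    for record in records:
        bucket = totals[record["device"]]
        bucket[0] += record["speed_kmh"]
        bucket[1] += 1
    return {device: total / count for device, (total, count) in totals.items()}


summary = aggregate(etl(RAW_EVENTS))
print(summary)
```

The partial and inconsistent records are filtered in the ETL step, so the aggregate only reflects rows that parsed cleanly.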

Page 10: Phases of Big Data Challenges @ Nokia

Technology Platform


Hadoop

R

VShards (KV)

SDK, Scribe, FTP

Hive, Pig

Analytical DBMS

Export/Import

Workflow Engine

Config./Deploy

Monitor

Alerts

Data Pipeline

Scheduler

Security/Kerberos & ACL

On-Premise & Cloud Infrastructure

Page 11: Phases of Big Data Challenges @ Nokia


Data Platform

Self Serve Tools

ETL, Agg

Machine Learning

Data Quality

Data Asset Catalog

Data, Metadata, Operational Data

Collect Ingest Organize Analyze Deliver

Technology Platform

Page 12: Phases of Big Data Challenges @ Nokia


Phase 1 – 2012


2008 – '10: Build Technology Platform, Get Data

2011: Enhance Platform, More Data, Simple Analytics, Data Crunching

2012: PB's of Data, Hundreds of Users, Thousands of Jobs, Complex Analytics, Multiple Clusters

Page 13: Phases of Big Data Challenges @ Nokia


2012 Production Statistics

• Tens of PB of data across Nokia

• Multi-tenant, multi-petabyte analytics cluster

• 10-20K+ jobs per day

• 600+ internal users

• 300M+ KV queries

• Terabytes flowing in every day

• Multiple data centers around the world
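
To put the "10-20K+ jobs per day" figure in perspective, it can be converted to an average rate. The calculation below assumes jobs are spread evenly over the day, which the slide does not claim, and the workloads are described elsewhere in the deck as spiky, so real peaks would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day):
    """Average events per second, assuming an even spread over the day."""
    return per_day / SECONDS_PER_DAY


low = per_second(10_000)   # lower bound from the slide
high = per_second(20_000)  # upper bound from the slide
print(f"{low:.3f} to {high:.3f} jobs per second on average")
print(f"one job every {1 / high:.1f} to {1 / low:.1f} seconds")
```

Under that assumption the cluster starts a job roughly every 4 to 9 seconds around the clock.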

Page 14: Phases of Big Data Challenges @ Nokia


Challenges With Big Data

• Complex eco-system of technologies - many moving parts, slower deploy cycles, data integration is complex

• Capacity & Scale Issues – Provision for peaks or sustained, storage or compute?

• DBMS great for performance & data management, but can't scale - price/performance & ACIDity

• Hadoop great for ETL, but poor on query performance & data management, not interactive

• Data and Metadata fragmentation

Page 15: Phases of Big Data Challenges @ Nokia


Big Data Capacity Issues

• Spiky Workloads

• Capacity Provisioning

– Peaks

– Sustained loads

• How many clusters?

– SLA/Adhoc/Research

– Multiple data centers

– Data duplication

• Tenancy – single/multi

• TCO – Hadoop can get expensive - storage & compute tightly coupled, idle machines
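
The trade-off between provisioning for peaks and provisioning for sustained load can be made concrete with a small calculation. The demand profile below is made up for illustration, and the `utilization` and `unmet` helpers are hypothetical names, not part of any tool mentioned in the talk.

```python
# Hypothetical hourly demand (machines needed) for one spiky day:
# 20 quiet hours and 4 peak hours. These numbers are invented.
HOURLY_DEMAND = [20] * 20 + [100] * 4


def utilization(demand, provisioned):
    """Fraction of provisioned machine-hours that are actually used."""
    used = sum(min(d, provisioned) for d in demand)
    return used / (provisioned * len(demand))


def unmet(demand, provisioned):
    """Machine-hours of demand that a fixed-size cluster cannot serve."""
    return sum(max(d - provisioned, 0) for d in demand)


peak_size = max(HOURLY_DEMAND)       # provision for peaks
sustained_size = min(HOURLY_DEMAND)  # provision for sustained load only

print(f"peak-sized cluster: {utilization(HOURLY_DEMAND, peak_size):.0%} utilized")
print(f"sustained-sized cluster: {unmet(HOURLY_DEMAND, sustained_size)} machine-hours unmet")
```

With these invented numbers, the peak-sized cluster is about one third utilized (the idle machines on the slide), while the sustained-sized cluster leaves 320 machine-hours of demand unserved.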

Page 16: Phases of Big Data Challenges @ Nokia


Cloud helps with some issues

• Operational & IT complexity reduced – API based spin up & tear down – rapid deployments, faster cycles

• Pay for what is used

• Capacity issues mitigated - idle machines or peaks not an issue – elastically scale up and down

• De-coupled Storage and Compute makes sense

• Stateless architecture, recycle slow/bad machines, no need for rolling upgrades, instead do rolling replace
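
The "rolling replace" idea from the last bullet can be sketched as follows. This is a simplified model under assumptions: `launch` is a hypothetical stand-in for a cloud API call, and machines are plain dictionaries rather than real instances.

```python
import itertools

_ids = itertools.count(1)


def launch(version):
    """Stand-in for an API call that spins up a fresh machine (hypothetical)."""
    return {"id": next(_ids), "version": version, "healthy": True}


def rolling_replace(pool, new_version, batch_size=1):
    """Swap machines for fresh ones batch by batch instead of upgrading in place.

    Because the machines are assumed stateless, nothing has to be migrated:
    a replacement is launched and the old machine is simply dropped.
    """
    pool = list(pool)
    for start in range(0, len(pool), batch_size):
        for i in range(start, min(start + batch_size, len(pool))):
            pool[i] = launch(new_version)  # the old machine is retired
    return pool


old_pool = [launch("v1") for _ in range(4)]
old_pool[2]["healthy"] = False  # a slow or bad machine is recycled the same way

new_pool = rolling_replace(old_pool, "v2", batch_size=2)
print([(m["id"], m["version"]) for m in new_pool])
```

The same loop handles both upgrades and bad machines, which is why a stateless design makes in-place rolling upgrades unnecessary.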

Page 17: Phases of Big Data Challenges @ Nokia


Phase 2


2008 – '10: Build Technology Platform, Get Data

2011: Enhance Platform, More Data, Simple Analytics

2012: PB's of Data, Hundreds of Users, Thousands of Jobs, Simple & Complex Analytics

2013: Still Pending Challenges

Page 18: Phases of Big Data Challenges @ Nokia


Still Pending

• Data and Metadata fragmentation, need deeper integration into all tools/frameworks

• Advanced Analytics - Data science problems are hard & inefficient to implement in Map Reduce/RDBMS

Page 19: Phases of Big Data Challenges @ Nokia


Complex Analytics

• Mathematicians think in terms of arrays, not MapReduce

• Data science tools can't efficiently handle big data

• Data partitioning is naïve, indexing won't scale
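
The gap between array thinking and MapReduce thinking shows up even in a matrix-vector product. The sketch below writes the same computation both ways in plain Python; the function names and sample data are illustrative, and the map, shuffle, and reduce steps are simulated in memory rather than run on Hadoop.

```python
from collections import defaultdict

MATRIX = [[1, 2], [3, 4], [5, 6]]
VECTOR = [10, 1]


def matvec_array(matrix, vector):
    """Array view: each output entry is a row dotted with the vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def matvec_mapreduce(matrix, vector):
    """MapReduce view: the same product recast as key-value steps."""
    # Map: emit (row_index, partial_product) for every matrix cell.
    emitted = [(i, a * vector[j])
               for i, row in enumerate(matrix)
               for j, a in enumerate(row)]
    # Shuffle: group the emitted values by key.
    groups = defaultdict(list)
    for key, value in emitted:
        groups[key].append(value)
    # Reduce: sum the partial products for each row.
    return [sum(groups[i]) for i in sorted(groups)]


print(matvec_array(MATRIX, VECTOR))
print(matvec_mapreduce(MATRIX, VECTOR))
```

Both functions return the same vector, but the second needs three explicit phases and an intermediate key-value representation for what is a one-line expression in array terms.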

Page 20: Phases of Big Data Challenges @ Nokia

Big Data Technologies for Future

Page 21: Phases of Big Data Challenges @ Nokia


THANK YOU

Yekesa Kosuru