Vantrix hunk

Mobile “Big” Data Analytics. Mark Hopper – Vantrix; Raanan Dagan – Splunk

description

Vantrix use case of Hadoop with Hunk - Splunk Analytics for Hadoop

Transcript of Vantrix hunk

Page 1: Vantrix hunk

Mobile “Big” Data Analytics

Mark Hopper – Vantrix

Raanan Dagan – Splunk

Page 2: Vantrix hunk

Confidential ©2014 Vantrix – All rights reserved

Hunk - Integrated Analytics Platform

Full-featured, Integrated Product

Insights for Everyone

Works with What You Have Today

Explore, Visualize, Dashboards

Share, Analyze

Hadoop Clusters; NoSQL and Other Data Stores

Hadoop Client Libraries; Streaming Resource Libraries

Page 3: Vantrix hunk

Hunk – Unique

1. Run Natively in Hadoop: Use Hadoop MapReduce

2. Mixed Mode: Allows for data preview

3. Auto-deploy SplunkD to DataNodes: On-the-fly indexing

4. Access Control: Allows for many users / many Hadoop directories / supports Kerberos

5. Schema on the Fly

Page 4: Vantrix hunk

What is New in Hunk 6.1

1. Report Acceleration: Get results in seconds

2. Hive Schema: Expose user-created schema

3. Multiple File Formats: Parquet, Sequence, ORC, RC

4. Pass-Through Authentication: Splunk users identified in Hadoop

5. Streaming Resource Libraries

Page 5: Vantrix hunk

Hunk Demo

Page 6: Vantrix hunk

Our Company: Global Capability

• Established 2004

• Mobile media experts – transcoding, delivery, optimization

• 40+ patent families – leading media university research relationship

• HQ in Montreal; offices in Seattle, London, Sydney

• 65+ operator deployments globally

Page 7: Vantrix hunk

Our Product Lines

Bandwidth Optimizer: Optimize delivery and Quality of Experience of ‘Over The Top’ services across Mobile Networks

Mobile Message Optimizer: Assure Quality of Experience and interoperability of Mobile Multimedia Messaging

Multiscreen Video Platform: High Density Transcoding and Optimized Delivery of Video across all Devices and Networks

Page 8: Vantrix hunk

The opportunity for Analytics

Vantrix Gateway

KPI Dashboard

Analytics

Average Operator – 25 Million records / day

Page 9: Vantrix hunk

KPIs answer WHAT?

Date and Time

Device

Session Volumes

Data Volume

Video Session Quality

Bitrates

Media Codec

Media Container

Media Size

Web & Video sites

Video resolution

Video frame rates

Video dimensions

Video length

Delivery protocol

Location

Media Types

Session Length

Video stall time

Page 10: Vantrix hunk

Analytics explores WHY?

Date and Time

Device

Session Volumes

Data Volume

System Topology

Bitrates

Media Codec

Media Container

Media Size

Web & Video sites

Video resolution

Video frame rates

Video dimensions

Video stall time

Video length

Delivery protocol

Location

Media Types

Session Length

Helping our customers plan their business strategies

Video Session Quality

Page 11: Vantrix hunk

Product Management requirements to Engineering

Scale to Manage lots and lots of data

High Performance

Low Hardware Footprint

Highly Flexible Data Structure

Flexible and Simple Reporting UI

Low Cost of Goods

Oh.. and be ready to go commercial in 90 days

Manage 1yr of data: 80 GB or 25M Records/Day; 30TB or 10 Bil Recs/year

Fit within ½ Telco Rack

TL – <1hr for a day of data; Queries average < 30 secs

Future proof design for new use cases

Support solution margin targets

Easily explore data in new ways via a simple UI

Use Cases Required
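The yearly sizing on this slide follows from the daily figures. A minimal arithmetic sketch, assuming decimal units (1 TB = 1000 GB) and a 365-day year; the variable names are illustrative and do not come from the deck:

```python
# Sanity-check the slide's yearly sizing from its daily figures.
GB_PER_DAY = 80                 # from the slide: 80 GB/day
RECORDS_PER_DAY = 25_000_000    # from the slide: 25M records/day
DAYS_PER_YEAR = 365             # assumption: one calendar year of retention

tb_per_year = GB_PER_DAY * DAYS_PER_YEAR / 1000        # decimal TB (assumption)
records_per_year = RECORDS_PER_DAY * DAYS_PER_YEAR
avg_record_bytes = GB_PER_DAY * 1e9 / RECORDS_PER_DAY  # implied mean record size

print(f"{tb_per_year:.1f} TB/year")
print(f"{records_per_year:,} records/year")
print(f"{avg_record_bytes:.0f} bytes/record on average")
```

Under these assumptions the totals come to about 29 TB and 9.1 billion records per year, which the slide rounds up to 30TB and 10 Bil.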

Page 12: Vantrix hunk

Enter Splunk – Product Manager’s best friend

Out of the box worked on our inconsistent record structure

Immediately identified the key record fields

Automatically created new fields (e.g. URL tags)

Automatically indexed and counted field value occurrences

Proved invaluable for identifying and exploring use cases
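To illustrate what "created new fields (e.g. URL tags)" and "counted field value occurrences" can mean in practice, here is a small Python sketch. It is not how Splunk implements field extraction; the URLs and field names are hypothetical, and only the standard library is used:

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit, parse_qsl

# Hypothetical request URLs; the real Vantrix records are not shown in the deck.
URLS = [
    "http://video.example.com/watch?fmt=mp4&res=720p",
    "http://video.example.com/watch?fmt=mp4&res=360p",
    "http://img.example.com/get?fmt=jpg",
]

# field name -> Counter of observed values, built without a predefined schema
field_values = defaultdict(Counter)
for url in URLS:
    parts = urlsplit(url)
    field_values["host"][parts.hostname] += 1
    # Each query-string tag becomes a field the first time it is seen.
    for name, value in parse_qsl(parts.query):
        field_values[name][value] += 1

for name, counts in field_values.items():
    print(name, counts.most_common())
```

The point of the sketch is that fields are discovered from the data at read time, so records with inconsistent structure can still be counted and explored.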

Page 13: Vantrix hunk

Example of the exploratory power

No query scripting, no pre-determined searches: point and click exploration

Page 14: Vantrix hunk

Point and click exploration of complex query results

Rapidly explore and uncover the story behind the story

Page 15: Vantrix hunk

The Vantrix BigData platform architecture

High Density computing cluster

Virtualization, Orchestration

SymKloud Cluster (16RU): 144 CPU (Hadoop) Nodes, 69 TB SSD Storage

Big Data Processing

Analytics Application

Private cloud management layer

BigData Filesystem and MapReduce architecture

Data exploration and reporting application layer

Page 16: Vantrix hunk

Hadoop and Hunk

[Diagram labels: Hadoop Cluster, Log Files, Hunk (repeated)]

• Hadoop

§ An open-source framework

§ Used by Facebook, Google, Yahoo

§ Manage and query vast amounts of unstructured data

§ Highly scalable

§ Built-in redundancy

• 2 Key Capabilities

§ Distributed Filesystem

§ MapReduce

Transparent UI migration from Splunk to Hunk
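The MapReduce capability named on this slide can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle and reduce steps, not Hadoop or Hunk code; the log format and field names are hypothetical:

```python
from collections import defaultdict
from itertools import chain

# Hypothetical session-log lines; the real record format is not shown in the deck.
LOG_LINES = [
    "device=iPhone site=youtube.com bitrate_kbps=800",
    "device=Galaxy site=instagram.com bitrate_kbps=1200",
    "device=iPhone site=instagram.com bitrate_kbps=1200",
]

def map_phase(line):
    """Mapper: parse one record and emit a (site, 1) pair."""
    fields = dict(pair.split("=", 1) for pair in line.split())
    yield fields["site"], 1

def shuffle(pairs):
    """Group mapper output by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reducer: sum the counts emitted for one key."""
    return key, sum(values)

mapped = chain.from_iterable(map_phase(line) for line in LOG_LINES)
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

In a real cluster the mappers run in parallel on the nodes that hold the data blocks, which is why the distributed filesystem and MapReduce are listed together as the two key capabilities.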

Page 17: Vantrix hunk

Performance - Small Scale system

• 10 Million subscribers generate:

§ 80GB of raw session log data / day

§ 26 Million video data session records

• Transform and Load

§ 1.6TB in 27 mins

§ 1GB/Second

§ Projecting 2GB/Second after tuning

• Hunk Query

§ 20 sec – search through 27M events

§ Returning 4.7M events

Virtualization, Orchestration

SymKloud Cluster (4RU): 28 CPU Nodes, 14 TB SSD Storage

Big Data Processing

Analytics Application
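The rates quoted on this slide can be cross-checked from the raw figures. A short arithmetic sketch, assuming decimal units (1 TB = 10^12 bytes); the variable names are illustrative:

```python
# Transform-and-load throughput from the slide's figures.
LOAD_BYTES = 1.6e12          # 1.6 TB, decimal units assumed
LOAD_SECONDS = 27 * 60       # 27 minutes
load_gb_per_s = LOAD_BYTES / LOAD_SECONDS / 1e9

# Query figures from the slide.
EVENTS_SCANNED = 27_000_000
EVENTS_RETURNED = 4_700_000
QUERY_SECONDS = 20
scan_rate = EVENTS_SCANNED / QUERY_SECONDS        # events searched per second
match_fraction = EVENTS_RETURNED / EVENTS_SCANNED # share of events returned

# How long one day of data (80 GB) would take to load at the measured rate.
day_load_seconds = 80 / load_gb_per_s

print(f"load rate: {load_gb_per_s:.2f} GB/s")
print(f"scan rate: {scan_rate:,.0f} events/s")
print(f"returned:  {match_fraction:.1%} of scanned events")
print(f"one day of data loads in about {day_load_seconds:.0f} s")
```

The measured load rate works out to just under 1 GB/second, consistent with the slide, and a day of data loads in under two minutes at that rate.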

Page 18: Vantrix hunk

The End Result

Page 19: Vantrix hunk

Live Event impact

The key Olympic Hockey games drove significant peaks

Page 20: Vantrix hunk

Video site characteristics

YouTube dominates the consumption of low bitrate encoded video and higher encoding rates. A number of sites clearly focus on a narrow set of encoding rates, such as Instagram dominating the 1.2Mbps bracket.

Page 21: Vantrix hunk

Mobile Network Performance

Page 22: Vantrix hunk

Video session quality

Generally, subscribers are receiving good video session quality; however, some video sites are problematic

Page 23: Vantrix hunk

Session quality distribution by encoding rate

[Chart panels: Off Peak, Peak]

On average 10% of sessions below 1Mbps (majority) experience stalling. However comparing off-peak vs peak

Page 24: Vantrix hunk

Video Stall times

Low Quality

HD

HQ

Stall time for HD Videos twice that of HQ

Page 25: Vantrix hunk

User engagement/abandonment

Watch ratio declines for videos encoded beyond 1.6Mbps

Page 26: Vantrix hunk

Relationship between engagement and video session quality

Subscribers abandon video less than halfway through when video session quality is bad

Page 27: Vantrix hunk

Thank you

Mark Hopper – VP Product Line Management