Marianne Faro, Daniel Koops, Gijs Wobben - SplunkConf

Copyright © 2016 Splunk Inc. Marianne Faro, Daniel Koops, Gijs Wobben (Itility). Machine Learning Using Splunk And R


Page 1: Marianne Faro, Daniel Koops, Gijs Wobben - SplunkConf, 9/21/2016 10:13:10

Copyright © 2016 Splunk Inc.

Marianne Faro, Daniel Koops, Gijs Wobben (Itility)

Machine Learning Using Splunk And R

Disclaimer

During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.

Agenda

• About Itility
• Machine Learning Use Case
• Architecture And Challenges
• The New App
• Demo


About Itility

Our business

• IT consultancy
  – Data Science
  – IT infrastructure
  – Software development
  – Datacenter automation
  – IoT
  – …

Our Customers

Data Science Goals

• We help our customers gain visibility into their data
• We use data to diagnose incidents and find root causes
• We predict and forecast to anticipate upcoming issues and problems
• We automate decisions to act on upcoming issues and problems

… and we do so by using everything between basic statistics and the most advanced Machine Learning algorithms out there

Data Science Goals

[Figure: chart with axes VALUE and DIFFICULTY and stages DATA, DECISION, ACTION. Source: Gartner]

• Descriptive: What happened?
• Diagnostic: Why did it happen?
• Predictive: What will happen?
• Prescriptive: What should I do?

Flow shown: Analytics, Human input, Decision, Action (Decision Support, Decision Automation)
Steps shown: Aggregation, Analysis, Modeling, Prediction, Visualization


Use Case: Mishandled Bags (MHBs)

• MHBs are bags that:
  – Are lost during handling
  – Miss their flight
  – Are put on the wrong airplane
• On a yearly basis the total cost of MHBs is $2.4 bn.
• Collaboration with a large airport and a manufacturer of baggage handling systems on how to reduce these costs

The Data

Data from multiple sources in Splunk:

• Sample of over 30,000 unique bags
• 3-month period
• 125 variables for each bag, e.g.:
  – Weight
  – Carrier
  – System entry point
  – Check-in to departure delta
  – Etc.
• 2% were considered MHBs
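A quick back-of-the-envelope check of what a 2% MHB rate means for modeling, written as a small Python sketch. The sample size of exactly 30,000 is an assumption (the slide only says "over 30,000"), and the variable names are illustrative.

```python
# Assumption: exactly 30,000 bags; the slide only states "over 30,000".
n_bags = 30_000
mhb_rate = 0.02  # 2% of bags were considered MHBs (from the slide)

# Approximate number of mishandled bags in the sample.
n_mhb = round(n_bags * mhb_rate)

# A trivial model that labels every bag "ok" is right for all non-MHB bags,
# so plain accuracy is a misleading metric on data this imbalanced.
baseline_accuracy = 1 - mhb_rate

print(f"MHBs in sample: about {n_mhb}")
print(f"Accuracy of always predicting 'ok': {baseline_accuracy:.0%}")
```

With so few positive examples, a per-bag probability score is more useful than a hard ok/MHB label, because it lets you rank bags by risk.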

The Approach

1. Use Splunk to combine and format the data
2. Use R to train a Boosted Decision Tree model, capable of performing binary logistic classification (with 0 = ok, 1 = MHB)
3. Export the trained model as an R package
4. Use the R app to combine Splunk and R code
5. Import the model and quickly classify new bags with a risk score
6. Extract the feature importance to assess which variables have the biggest impact on bags becoming MHBs
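To make step 2 more concrete, here is a minimal sketch of boosted trees for binary logistic classification. It is not the presenters' R model: it is written in plain Python, uses one-split trees (stumps) instead of full decision trees, and the two features (connection minutes, weight) and all data values are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Find the one-feature threshold split that best fits the residuals (least squares)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    # (feature index, threshold, left value, right value)
    return best[1:] if best else None

def train_boosted_stumps(X, y, rounds=50, learning_rate=0.3):
    """Gradient boosting on log loss: each stump fits the residuals y - p."""
    p0 = min(max(sum(y) / len(y), 1e-6), 1 - 1e-6)
    bias = math.log(p0 / (1 - p0))  # start from the log-odds of the base rate
    scores = [bias] * len(X)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        j, t, lv, rv = stump
        scores = [s + learning_rate * (lv if row[j] <= t else rv)
                  for row, s in zip(X, scores)]
    return bias, stumps, learning_rate

def risk_score(model, row):
    """Return the predicted probability (0..1) that a bag is an MHB."""
    bias, stumps, lr = model
    z = bias + sum(lr * (lv if row[j] <= t else rv) for j, t, lv, rv in stumps)
    return sigmoid(z)

# Hypothetical toy data: [connection_minutes, weight_kg]; 1 = MHB, 0 = ok.
X = [[25, 18], [30, 22], [35, 12], [90, 20], [120, 15], [150, 23], [200, 19], [240, 11]]
y = [1, 1, 1, 0, 0, 0, 0, 0]

model = train_boosted_stumps(X, y)
print(round(risk_score(model, [20, 17]), 2))   # short connection: high risk
print(round(risk_score(model, [180, 17]), 2))  # long connection: low risk
```

Each round fits a stump to the gap between the labels and the current predicted probabilities, so the summed score behaves like a log-odds. That is why the output can be read as a risk score between 0 and 1.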

Risk Assessment

• The model returns a score between 0 and 1
• This is the probability of a bag becoming an MHB
• Use Splunk to filter low-risk bags
• Use Splunk alerting for high-risk bags
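The filter-and-alert idea can be sketched as a simple thresholding step. This is a Python illustration only: the threshold values, the bag IDs, and the function name `triage` are assumptions, and in the setup described here the filtering and alerting would be done in Splunk itself.

```python
def triage(scored_bags, low=0.05, high=0.5):
    """Split (bag_id, score) pairs into low-risk, watch, and alert groups.

    The thresholds are illustrative defaults, not values from the talk.
    """
    groups = {"low": [], "watch": [], "alert": []}
    for bag_id, score in scored_bags:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {bag_id} is not a probability: {score}")
        if score >= high:
            groups["alert"].append(bag_id)   # candidates for alerting
        elif score < low:
            groups["low"].append(bag_id)     # can be filtered out
        else:
            groups["watch"].append(bag_id)
    return groups

# Hypothetical bag IDs and scores.
groups = triage([("A1", 0.01), ("B2", 0.20), ("C3", 0.90)])
print(groups)
```

Where to put the thresholds is a business decision: a lower alert threshold catches more MHBs but raises more false alarms.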

Feature Importance

• "Reverse-engineered" the model to extract the information gain from each variable
• Several interesting finds:
  – Next station / Last location
  – Seat number
  – Number of Bag ID codes per unique bag
• Next steps: further investigate important variables with domain experts
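The slide does not say how the gain was computed from the model. As a point of reference, the classic entropy-based information gain for a single categorical variable can be sketched in Python as follows. The variable names and toy values are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction from splitting the labels by the distinct values of one variable."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: 1 = MHB, 0 = ok.
labels       = [1, 1, 0, 0, 0, 0, 0, 0]
next_station = ["T1", "T1", "T2", "T2", "T2", "T3", "T3", "T3"]  # separates MHBs perfectly
carrier      = ["A", "B", "A", "B", "A", "B", "A", "B"]          # carries no signal

print(information_gain(next_station, labels))
print(information_gain(carrier, labels))
```

A variable with high gain is worth discussing with domain experts, but gain alone does not show whether the relationship is causal.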

Architecture And Challenges

• The R app on Splunkbase runs R on the Search Head
• Training models can be very CPU and memory intensive
• Scaling a Search Head cluster becomes more complex

[Diagram: search head, search head, search head]

Architecture And Challenges

• The new R app communicates with a remote R server
• Training a model no longer impacts search performance
• Scale R independently from the Search Head cluster

[Diagram: search head, search head, search head, R server]

The New App

• Is released TODAY!
• Contains several example dashboards and a new R code editor
• Contains new search commands for interacting with R
• Makes R an extension of SPL and allows you to create business value even faster

The New App

DEMO


THANK YOU