Don't Re-write Code to Get Better Analytics

Copyright © 2012, Splunk Inc. Listen to your data. Don’t Rewrite Code to Get Better Analytics. Archana Ganapathi, Research Engineer

description

Almost all developers face the challenge of reactively debugging failed business transaction processes. This requires navigating enormous volumes of log data, and determining the root cause becomes laborious and time-consuming. Additionally, business managers often ask developers and operations teams to provide analytics on applications, which leads to the tedious task of charting the information, usually from intangible data. Learn how to capture, extract and analyze your event data by embedding analytics in the application. Download the white paper that details how to gain Application Intelligence through effective logging. Check out the webinar here: http://www.splunk.com/goto/analytics_webcast

Transcript of Don't Re-write Code to Get Better Analytics

Page 1: Don't Re-write Code to Get Better Analytics

Don’t Rewrite Code to Get Better Analytics

Archana Ganapathi, Research Engineer

Page 2: Don't Re-write Code to Get Better Analytics

Analytics Can Be Challenging!

• Modern systems are distributed and heterogeneous
• Consolidate information
• Analyzing across a distributed architecture

• Analytics is limited to information that is made “available”

Page 3: Don't Re-write Code to Get Better Analytics

Typical Architecture

[Diagram labels: Applications; Direct Insert; ETL; Connector; Database; Data Warehouse; BI, Analytics, Reporting Tool]

Page 4: Don't Re-write Code to Get Better Analytics

Development Cycle: Early Structure Binding

Decide the questions you want to ask

Design the schema

Normalize the data and write DB insertion code

Create SQL and feed it into the analytics tool:

SELECT customers.* FROM customers WHERE customers.customer_id NOT IN (SELECT customer_id FROM orders WHERE year(orders.order_date) = 2004)
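The early-binding cycle can be sketched end to end in a few lines. This is only an illustration using Python's built-in sqlite3 module: the table layout, the sample rows, and the customers_without_orders_in helper are assumptions made for the example, and SQLite's strftime stands in for the year() function used in the slide's query.

```python
import sqlite3

def customers_without_orders_in(conn, year):
    """Return names of customers with no order in the given year."""
    rows = conn.execute(
        """
        SELECT customers.name FROM customers
        WHERE customers.customer_id NOT IN (
            SELECT customer_id FROM orders
            WHERE strftime('%Y', orders.order_date) = ?
        )
        ORDER BY customers.name
        """,
        (str(year),),
    )
    return [name for (name,) in rows]

# Step 1: the schema must exist before any data can be stored (hypothetical layout).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER, order_date TEXT);
    """
)

# Step 2: insertion code written against that schema (made-up sample rows).
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, "2004-03-15"), (11, 2, "2005-07-01")])

# Step 3: the question is answered in SQL.
inactive_2004 = customers_without_orders_in(conn, 2004)
print(inactive_2004)
```

Any question that needs a field missing from the schema means revisiting all three steps, which is the cost the rest of the talk argues against.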

Page 5: Don't Re-write Code to Get Better Analytics

A Paradigm Change: Use Your Log Files

Page 6: Don't Re-write Code to Get Better Analytics

Using Log Files

Log.debug("orderstatus=error,errorcode=454,user=%s,transactionid=%d", userId, transId)

✓ You already log key information

Page 7: Don't Re-write Code to Get Better Analytics

Using Log Files

They contain a gold mine of information

• Definitive record of activity and behavior
• Ensure system security
• Meet compliance mandates

10.2.1.44 - [25/Sep/2009:09:52:30 -0700] type=USER_LOGIN msg=audit(1253898008.056:199891): user pid=25702 uid=0 auid=4294967295 msg='acct="TAYLOR": exe="/usr/sbin/sshd" (hostname=?, addr=10.2.1.48, terminal=sshd res=failed)'

Highlighted fields: User IP, Action, Login, Result
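To show how much structure is already in such an event, here is a minimal sketch that pulls key=value pairs out of the audit line above. The regular expression and the extract_fields helper are assumptions for illustration, not how Splunk extracts fields.

```python
import re

AUDIT_LINE = (
    "10.2.1.44 - [25/Sep/2009:09:52:30 -0700] type=USER_LOGIN "
    "msg=audit(1253898008.056:199891): user pid=25702 uid=0 "
    "auid=4294967295 msg='acct=\"TAYLOR\": exe=\"/usr/sbin/sshd\" "
    "(hostname=?, addr=10.2.1.48, terminal=sshd res=failed)'"
)

# key=value where the value is either double-quoted or a run of
# characters up to whitespace, comma, quote, or parenthesis.
PAIR = re.compile(r"""(\w+)=(?:"([^"]*)"|([^\s,'()"]+))""")

def extract_fields(line):
    fields = {}
    for key, quoted, bare in PAIR.findall(line):
        # Keep the first value seen if a key repeats (msg appears twice here).
        fields.setdefault(key, quoted if quoted else bare)
    return fields

fields = extract_fields(AUDIT_LINE)
print(fields["type"], fields["acct"], fields["addr"], fields["res"])
```

The action, the account, the source address, and the result all fall out of the raw line without any schema defined in advance.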

Page 8: Don't Re-write Code to Get Better Analytics

Using Log Files

They contain a gold mine of information

• Important insight for IT and the business
• Customer behavior and experience
• Product and service usage
• End-to-end transaction visibility

10.2.1.80 - - [25/Jan/2010:09:52:30 -0700] "GET /petstore/product.screen?product_id=AV-CB-01 HTTP/1.1" 200 9967 "http://10.2.1.224/petstore/category.screen?category_id=BIRDS" "Mozilla/5.0 (compatible; Konqueror/3.1; Linux)" "JSESSIONID=xZDTK81Gjq9gJLGWnt2NXrJ2tpGZb1HyHHV8hJGYFj1DFByvL5L!-1539148667"

Highlighted fields: User IP, Product, Category
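The highlighted fields can be recovered from the web access line with a short parser. This is a sketch under assumptions: the regular expression covers only the layout of this one sample line, and parse_access_line and query_param are hypothetical helper names.

```python
import re
from urllib.parse import urlsplit, parse_qs

ACCESS_LINE = (
    '10.2.1.80 - - [25/Jan/2010:09:52:30 -0700] '
    '"GET /petstore/product.screen?product_id=AV-CB-01 HTTP/1.1" 200 9967 '
    '"http://10.2.1.224/petstore/category.screen?category_id=BIRDS" '
    '"Mozilla/5.0 (compatible; Konqueror/3.1; Linux)" '
    '"JSESSIONID=xZDTK81Gjq9gJLGWnt2NXrJ2tpGZb1HyHHV8hJGYFj1DFByvL5L!-1539148667"'
)

# client IP, timestamp, request line, status, size, referrer
ACCESS = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+) "(?P<referrer>[^"]*)"'
)

def query_param(url, name):
    """Return the first value of a query-string parameter, or None."""
    values = parse_qs(urlsplit(url).query).get(name)
    return values[0] if values else None

def parse_access_line(line):
    m = ACCESS.match(line)
    if m is None:
        return None
    return {
        "ip": m.group("ip"),
        "status": int(m.group("status")),
        "product": query_param(m.group("url"), "product_id"),
        "category": query_param(m.group("referrer"), "category_id"),
    }

event = parse_access_line(ACCESS_LINE)
print(event)
```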

Page 9: Don't Re-write Code to Get Better Analytics

They Help You Find Problems

Apr 29 19:13:01 45.2.98.7 SentriantGenericAlert: Time="04/29/06 07:12 PM PDT",Host="roach_motel.enet.interop.net",Category="fabric_network_activity",Generator="Response:Slow Scan",Type="NOTICE",Priority="High",Body="Appliance=roach_motel.enet.interop.net,Reporting Segment=ENET network,Action=Response disabled,Response=Slow Scan,Duration=90 seconds,Source Segment=Unprotected,Source IP=88.73.39.200,Source MAC=00:01:30:BC:93:90,Current Target Count=0"

Apr 29 19:13:01 45.2.98.7 SentriantGenericAlert: Time="04/29/06 07:12 PM PDT",Host="roach_motel.enet.interop.net",Category="fabric_network_activity",Generator="Response:Slow Scan",Type="NOTICE",Priority="High",Body="Appliance=roach_motel.enet.interop.net,Reporting Segment=ENET network,Action=Response disabled,Response=Slow Scan,Duration=69 seconds,Source Segment=Unprotected,Source IP=68.163.20.95,Source MAC=00:01:30:BC:93:90,Current Target Count=0"

Apr 29 19:13:01 45.2.98.7 SentriantGenericAlert: Time="04/29/06 07:12 PM PDT",Host="roach_motel.enet.interop.net",Category="fabric_network_activity",Generator="Response:Slow

Page 10: Don't Re-write Code to Get Better Analytics

Machine-generated Events are Everywhere

[Diagram labels: Core IT; Customer-facing IT; Additional Sources]

Page 11: Don't Re-write Code to Get Better Analytics

Copyright © 2012, Splunk Inc. Burlingame, March 8, 2012

Splunk: The Platform for Machine Data

Applications: Web logs; Log4J, JMS, JMX; .NET events; Code and scripts

Networking: Configurations; syslog; SNMP; netflow

Databases: Configurations; Audit/query logs; Tables; Schemas

Virtualization & Cloud: Hypervisor; Guest OS, Apps; Cloud

Linux/Unix: Configurations; syslog; File system; ps, iostat, top

Windows: Registry; Event logs; File system; sysinternals

Logfiles, Configs, Messages, Traps, Alerts, Metrics, Scripts, Tickets, Changes

Customer Facing Data: Click-stream data; Shopping cart data; Online transaction data

Outside the Datacenter: Manufacturing, logistics…; CDRs & IPDRs; Power consumption; RFID data; GPS data

Page 12: Don't Re-write Code to Get Better Analytics

Splunk Collects and Indexes Any Machine Data

[Diagram repeats the machine data sources from the previous slide.]

No upfront schema

No custom connectors

No RDBMS

No need to filter/forward

• Any amount, any location, any source.

Page 13: Don't Re-write Code to Get Better Analytics

A Single Platform for Operational Intelligence

Three Primary Capabilities

Search / Navigation
• Data drilldown
• “Needle in a haystack”
• Root cause analysis / troubleshooting
• Incident investigations

Real-time Visibility
• Live dashboards
• Event correlation
• Monitoring and alerting
• Performance issues
• Transaction levels
• SLA tracking

Historical Analytics
• Baseline and thresholds
• Trending
• Operational insights
• Historical patterns
• Compliance reports

Single Data Store, Single UI, Across Use Cases

Page 14: Don't Re-write Code to Get Better Analytics

Real Business Value with Operational Metrics

Page 15: Don't Re-write Code to Get Better Analytics

Intelligence on your Applications with Splunk

[Diagram labels: Application; Java EE Server; Database; Unix based OS; Log Files; Operational Intelligence + Analytics + Reporting]

Page 16: Don't Re-write Code to Get Better Analytics

An Alternative Development Cycle: Late Structure Binding

Write events to your log files

Collect log files

Create searches, graphs and reports

[Illustration: the SentriantGenericAlert log events shown on page 9]

Page 17: Don't Re-write Code to Get Better Analytics

“Semantic Logging”

Events which are written explicitly for the gathering of analytics

Page 18: Don't Re-write Code to Get Better Analytics

A Simple Example

void submitPurchase(transactionID)
{
    log.info("action=submitPurchaseStart, transactionId=%d, productId=%s, listPrice=%d\n", transactionID, productId, listPrice)

    // these calls throw an exception on error
    submitToCreditCard(...)
    generateInvoice(...)
    generateFullfillmentOrder(...)

    log.info("action=submitPurchaseStop, transactionId=%d\n", transactionID)
}
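The same pattern can be written as runnable code. The sketch below uses Python's standard logging module rather than the unnamed logger on the slide; the logger name, the steps parameter, and the in-memory ListHandler are assumptions added so the example is self-contained.

```python
import logging

log = logging.getLogger("purchases")  # hypothetical logger name

def submit_purchase(transaction_id, product_id, list_price, steps):
    """Run the purchase steps, bracketing them with start/stop events."""
    log.info("action=submitPurchaseStart, transactionId=%d, productId=%s, listPrice=%d",
             transaction_id, product_id, list_price)
    # Each step is expected to raise an exception on error, so the
    # stop event is only written when every step succeeded.
    for step in steps:
        step(transaction_id)
    log.info("action=submitPurchaseStop, transactionId=%d", transaction_id)

class ListHandler(logging.Handler):
    """Collects formatted events in memory; a real setup would write to a file."""
    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))

handler = ListHandler()
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
log.addHandler(handler)
log.setLevel(logging.INFO)

def fail(transaction_id):
    raise RuntimeError("credit card declined")

submit_purchase(1001, "AV-CB-01", 25, steps=[lambda tid: None])
try:
    submit_purchase(1002, "AV-CB-01", 25, steps=[fail])
except RuntimeError:
    pass

for line in handler.lines:
    print(line)
```

A start event with no matching stop event marks a failed purchase, with no extra error-reporting code in the application.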

Page 19: Don't Re-write Code to Get Better Analytics

Analytics Questions Enabled

✓ Purchase volume by hour, by day, by month
✓ How long are purchases taking?
✓ Are my purchases taking longer than they did last month?
✓ Are my systems getting slower?
✓ How many purchases are failing?
✓ Which specific purchases are failing?
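Several of these questions reduce to pairing start and stop events. The following sketch answers three of them (volume by hour, duration, and which purchases failed) over a handful of made-up events; the event lines, the parse and summarize helpers, and the timestamp layout are assumptions for illustration.

```python
from datetime import datetime

# Made-up events in the key=value style of the example; timestamps lead the line.
EVENTS = [
    "2012-03-08 10:00:00 action=submitPurchaseStart, transactionId=1, productId=A, listPrice=10",
    "2012-03-08 10:00:02 action=submitPurchaseStop, transactionId=1",
    "2012-03-08 10:05:00 action=submitPurchaseStart, transactionId=2, productId=B, listPrice=20",
    "2012-03-08 11:00:00 action=submitPurchaseStart, transactionId=3, productId=A, listPrice=10",
    "2012-03-08 11:00:05 action=submitPurchaseStop, transactionId=3",
]

def parse(line):
    """Split a line into its 19-character timestamp and key=value fields."""
    stamp, rest = line[:19], line[20:]
    fields = dict(pair.strip().split("=", 1) for pair in rest.split(","))
    fields["time"] = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return fields

def summarize(lines):
    starts, durations = {}, {}
    for event in map(parse, lines):
        tid = event["transactionId"]
        if event["action"] == "submitPurchaseStart":
            starts[tid] = event["time"]
        elif event["action"] == "submitPurchaseStop" and tid in starts:
            durations[tid] = (event["time"] - starts[tid]).total_seconds()
    # A start without a stop means one of the purchase steps raised an error.
    failed = sorted(tid for tid in starts if tid not in durations)
    by_hour = {}
    for started in starts.values():
        hour = started.strftime("%Y-%m-%d %H:00")
        by_hour[hour] = by_hour.get(hour, 0) + 1
    return durations, failed, by_hour

durations, failed, by_hour = summarize(EVENTS)
print(durations, failed, by_hour)
```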

Page 20: Don't Re-write Code to Get Better Analytics

Analytics Dashboard

Page 21: Don't Re-write Code to Get Better Analytics

Streaming Radio Example

Page 22: Don't Re-write Code to Get Better Analytics

Group Transactions

sourcetype=radiolog | transaction IPAddress startswith="play" endswith="stop"

Page 23: Don't Re-write Code to Get Better Analytics

Calculate Concurrency

sourcetype=radiolog | transaction IPAddress startswith="play" endswith="stop" | concurrency duration=duration
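For readers unfamiliar with these two search commands, the idea behind them can be sketched in plain Python. This is not Splunk's implementation: the sample events are made up, and the counting rule used here (sessions in progress at the moment a session starts) is an assumption about what the concurrency value represents.

```python
# Made-up radio log events: (epoch seconds, client IP, action).
EVENTS = [
    (0, "10.0.0.1", "play"),
    (10, "10.0.0.2", "play"),
    (30, "10.0.0.1", "stop"),
    (40, "10.0.0.3", "play"),
    (50, "10.0.0.2", "stop"),
    (70, "10.0.0.3", "stop"),
]

def group_transactions(events):
    """Pair each 'play' with the next 'stop' from the same IP address."""
    open_plays, sessions = {}, []
    for time, ip, action in sorted(events):
        if action == "play":
            open_plays[ip] = time
        elif action == "stop" and ip in open_plays:
            start = open_plays.pop(ip)
            sessions.append({"ip": ip, "start": start, "duration": time - start})
    return sorted(sessions, key=lambda s: s["start"])

def add_concurrency(sessions):
    """Count the sessions in progress at the moment each session starts."""
    for s in sessions:
        s["concurrency"] = sum(
            1 for o in sessions
            if o["start"] <= s["start"] < o["start"] + o["duration"]
        )
    return sessions

sessions = add_concurrency(group_transactions(EVENTS))
for s in sessions:
    print(s)
```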

Page 24: Don't Re-write Code to Get Better Analytics

Add Lookups and Statistics

sourcetype=radiolog | transaction IPAddress startswith="play" endswith="stop" | concurrency duration=duration | eval key=1 | lookup songs key | stats first(song) as song max(concurrency) as concurrency by id | stats sum(concurrency) by song

Page 25: Don't Re-write Code to Get Better Analytics

Developer Concerns

Page 26: Don't Re-write Code to Get Better Analytics

Developer Concern: Performance

[Chart values: 15 sec, 92 sec]

Page 27: Don't Re-write Code to Get Better Analytics

Developer Concern: Infrastructure Cost

✓ Splunk requires standard hardware
✓ Start with an easy download
✓ Free Apps for domain-specific analytics
✓ Proven in Big Data

Page 28: Don't Re-write Code to Get Better Analytics

Developer Concern: Refactoring Code

✓ Start gradually and grow organically
✓ Develop future applications with analytics and Splunk in mind
✓ Build closer relationships with Ops, Support and QA
✓ ROI can be priceless

Page 29: Don't Re-write Code to Get Better Analytics

Developer Concern: How Much to Log

Two approaches to event logs:
✓ Log what is evidently required
✓ Open the flood-gates

Quantity and granularity can vary based on task:
- Diagnosis
- Reporting
- Analytics

Page 30: Don't Re-write Code to Get Better Analytics

Logging Best Practices

Page 31: Don't Re-write Code to Get Better Analytics

Create Human Readable Events

✓ Log in text
✓ Make it easy for humans
✓ Categorize
✓ Avoid XML or JSON

Page 32: Don't Re-write Code to Get Better Analytics

Clearly Time Stamp Every Event

✓ Do not use time offsets
✓ Use human readable timestamps
✓ Favor the beginning of the line
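A minimal sketch of these three rules: the timestamp is absolute, readable, and placed at the start of the line. The format_event helper and the chosen timestamp layout are assumptions; any unambiguous layout that includes the date and a UTC offset would serve.

```python
from datetime import datetime, timezone

def format_event(message, when=None):
    """Prefix the message with an absolute, human-readable timestamp."""
    when = when or datetime.now(timezone.utc)
    # Full date, time, and UTC offset at the start of the line,
    # rather than an offset such as "12.5 seconds since startup".
    return f"{when.strftime('%Y-%m-%d %H:%M:%S %z')} {message}"

line = format_event(
    "action=login, user=taylor",
    when=datetime(2012, 3, 8, 9, 52, 30, tzinfo=timezone.utc),
)
print(line)
```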

Page 33: Don't Re-write Code to Get Better Analytics

Use Clear Key/Value Pairs

Example (Bad):

Log.debug("error 454 - %s %d", userId, transId)

Example (Good):

Log.debug("orderstatus=error,errorcode=454,user=%s,transactionid=%d", userId, transId)
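One way to make the good form the default is a small formatting helper, so nobody writes the pairs by hand. The kv_event function below is a hypothetical helper, and its quoting rule for values that contain spaces or commas is an assumption rather than a documented convention.

```python
def kv_event(**fields):
    """Render fields as key=value pairs, quoting values that contain separators."""
    parts = []
    for key, value in fields.items():
        text = str(value)
        if any(ch in text for ch in ' ,="'):
            # Quote the value so the separators inside it are not read as new pairs.
            text = '"' + text.replace('"', "'") + '"'
        parts.append(f"{key}={text}")
    return ",".join(parts)

good = kv_event(orderstatus="error", errorcode=454, user="jsmith", transactionid=8841)
print(good)
```

The user name and transaction number here are made-up sample values.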

Page 34: Don't Re-write Code to Get Better Analytics

Break Multi-Value Information Into Separate Events

Example (Good):
<TS> phonenumber=415-555-1212, app=angrybirds, installdate=xx/xx/xx
<TS> phonenumber=415-555-1212, app=facebook, installdate=yy/yy/yy

Example (Bad):
<TS> phonenumber=415-555-1212,app=angrybirds,facebook
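In code, the good form simply means emitting one event per value instead of joining the values. A sketch, with a hypothetical install_events helper and made-up timestamps and install dates:

```python
def install_events(timestamp, phonenumber, installs):
    """Yield one key=value event per installed app instead of one multi-value line."""
    for app, installdate in installs:
        yield f"{timestamp} phonenumber={phonenumber}, app={app}, installdate={installdate}"

events = list(install_events(
    "2012-03-08 10:00:00",
    "415-555-1212",
    [("angrybirds", "2012-01-05"), ("facebook", "2012-02-11")],  # made-up dates
))
for event in events:
    print(event)
```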

Page 35: Don't Re-write Code to Get Better Analytics

Log Unique Identifiers

✓ Lets you track transactions in detail
✓ Use Transitive Closure if you need to:

transid=abcdef
transid=abcdef, otherid=qrstuv
. . . . .
otherid=qrstuv

(all of these events are grouped into one Transaction)
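Transitive closure here means that an event carrying both identifiers links the events that carry only one of them. The grouping can be sketched with a small union-find structure; the group_by_linked_ids function, the step field, and the sample events are assumptions for illustration, not Splunk's algorithm.

```python
def group_by_linked_ids(events, id_keys=("transid", "otherid")):
    """Group events that share an identifier directly or through a linking event."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # First pass: identifiers that appear in the same event belong together.
    for event in events:
        ids = [(k, event[k]) for k in id_keys if k in event]
        for other in ids[1:]:
            union(ids[0], other)

    # Second pass: collect events under the root of their first identifier.
    groups = {}
    for event in events:
        ids = [(k, event[k]) for k in id_keys if k in event]
        if ids:
            groups.setdefault(find(ids[0]), []).append(event)
    return list(groups.values())

EVENTS = [
    {"transid": "abcdef", "step": "order"},
    {"transid": "abcdef", "otherid": "qrstuv", "step": "handoff"},
    {"otherid": "qrstuv", "step": "shipping"},
    {"transid": "zzzzzz", "step": "unrelated"},
]
transactions = group_by_linked_ids(EVENTS)
print(transactions)
```

The "shipping" event never mentions transid=abcdef, yet it lands in the same transaction because the "handoff" event logged both identifiers once.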

Page 36: Don't Re-write Code to Get Better Analytics

Using Header Lines for Keys

<TS>
USER   PID  %CPU %MEM      VSZ    RSS  TT  STAT STARTED      TIME COMMAND
root    41  21.9  1.7  3233968 143624  ??  Rs   7Jul11   48:09.67 /System/Library/foo
rdas   790   4.5  0.4  4924432  32324  ??  S    8Jul11    9:00.57 /System/Library/baz
. . . . . . . .

• Splunk will interpret the column headers as keys and each line as values
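The header-as-keys idea is easy to reproduce outside Splunk. The sketch below is an illustration only (parse_table is a hypothetical helper, and it assumes only the last column may contain spaces); it does not describe how Splunk itself parses such output.

```python
def parse_table(text):
    """Turn a header line plus whitespace-separated rows into dicts keyed by column."""
    lines = [line for line in text.splitlines() if line.strip()]
    keys = lines[0].split()
    # The last column may contain spaces, so split each row at most len(keys) - 1 times.
    return [dict(zip(keys, row.split(None, len(keys) - 1))) for row in lines[1:]]

PS_OUTPUT = """\
USER   PID  %CPU %MEM      VSZ    RSS  TT  STAT STARTED      TIME COMMAND
root    41  21.9  1.7  3233968 143624  ??  Rs   7Jul11   48:09.67 /System/Library/foo
rdas   790   4.5  0.4  4924432  32324  ??  S    8Jul11    9:00.57 /System/Library/baz
"""

rows = parse_table(PS_OUTPUT)
print(rows[0]["USER"], rows[0]["%CPU"], rows[1]["COMMAND"])
```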

Page 37: Don't Re-write Code to Get Better Analytics

Top Takeaways

Log anything that can add value when aggregated and/or visualized

Page 38: Don't Re-write Code to Get Better Analytics

Top Takeaways

Simplify your life… Splunk logs for Analytics

Page 39: Don't Re-write Code to Get Better Analytics

Thanks! Questions?