Getting Started with IT Service Intelligence
Transcript of Getting Started with IT Service Intelligence
Copyright © 2016 Splunk Inc.
Getting Started with IT Service Intelligence
Ahmed Kira – Sr. Sales Engineer, SF Bay Area
What is a Service?
[Diagram: Requests -> Service -> Responses]
In ITSI, a Service is a logical group of technology components that a user deems need to be monitored together.
It can often be generalized as a “black box” to which we send requests and from which we expect responses.
What is a Service?
[Diagram: Technical Services (DNS, Auth, Web), each receiving Requests and returning Responses]
Services can be lower level (technical)…
What is a Service?
[Diagram: Technical Services (DNS, Auth, Web) and Business Services (Customer Transactions, Support Desk), each receiving Requests and returning Responses]
Services can also be higher level (business)…
What is a Service?
[Diagram: Packet Network, Hypervisor and Hosts, RBM DBs, Storage Tier, API Services, Web Services, Customer Transactions, Mobile, API/Middleware, Partner Portal, DNS]
Services can encompass multiple tiers of the IT domain. Services may also depend upon other services.
What is a KPI?
DNS (Requests -> Responses)
● KPI: Number of requests
● KPI: Error rate
● KPI: Average response time
● KPI: Server CPU load
● KPI: Server network I/F errors
Customer Transactions (Requests -> Responses)
● KPI: Number of transactions
● KPI: Error rate
● KPI: Average response time
● KPI: Count of Incident Tickets
● KPI: Synthetic Transx Health
KPIs and Health scores constitute the means by which Services are monitored.
Key Performance Indicators (KPIs)
A Key Performance Indicator (KPI) is a Splunk saved search created within the ITSI UI that helps monitor a specific field such as CPU, Memory, Number of Errors and so on. KPIs are contained within Services.
Service Health Scores
A Health score is a score from 0-100 (0 being critical and 100 being normal) that helps determine the health of a Service. It is calculated once every minute from the importance and status (e.g. green, orange, red) of all the Service's KPIs.
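The idea of a score driven by KPI importance and status can be sketched in a few lines of Python. This is only an illustration and not ITSI's actual formula: the status-to-score mapping, the importance weights and the name `health_score` are all assumptions.

```python
# Hypothetical mapping from KPI status to a 0-100 score (assumed values).
STATUS_SCORE = {"green": 100, "orange": 50, "red": 0}


def health_score(kpis):
    """Importance-weighted average of KPI status scores.

    kpis: list of (status, importance) tuples, where importance is an
    assumed non-negative weight. Returns a number between 0 and 100.
    """
    total_weight = sum(importance for _, importance in kpis)
    if total_weight == 0:
        return 100.0  # nothing weighted: treat the service as normal
    weighted = sum(STATUS_SCORE[status] * importance for status, importance in kpis)
    return weighted / total_weight
```

With three KPIs, `health_score([("green", 5), ("orange", 5), ("red", 10)])` gives 37.5: the heavily weighted red KPI pulls the service score down more than the two lighter ones can lift it.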
Let’s Talk Entities
● Entities are the relevant components that support a service (often but not always hosts)
● Select the correct entities with filters, ANDs, ORs
● Entity list can come from a CMDB, a spreadsheet, a Splunk search…
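Selecting entities with ANDs and ORs might look like the sketch below. The entity records, field names and rule format are hypothetical and stand in for whatever a CMDB export or spreadsheet would supply. This is not how ITSI stores its filters.

```python
# Hypothetical entity records, e.g. rows from a CMDB export or a CSV file.
ENTITIES = [
    {"host": "web-01", "role": "web", "site": "sfo"},
    {"host": "web-02", "role": "web", "site": "nyc"},
    {"host": "db-01", "role": "db", "site": "sfo"},
]


def matches(entity, rule_groups):
    """Rule groups are ORed together; rules inside one group are ANDed."""
    return any(
        all(entity.get(field) == value for field, value in group.items())
        for group in rule_groups
    )


def select_entities(entities, rule_groups):
    """Return the host names of all entities matching the rule groups."""
    return [e["host"] for e in entities if matches(e, rule_groups)]
```

For example, `select_entities(ENTITIES, [{"role": "web", "site": "sfo"}, {"role": "db"}])` reads as "(role is web AND site is sfo) OR role is db" and returns `["web-01", "db-01"]`.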
Service Decomposition in ITSI
Identify the process flow and underlying sub-services (Web -> Middleware -> DB -> Middleware -> Web)
Service Decomposition in ITSI
For each sub-service, identify KPIs that will show health and status (Requests, response time, errors, OS health…)
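Before building anything in the UI, the decomposition can be written down as a small data model. The sketch below is a planning aid under assumed names (`Service`, `flatten`) and an assumed KPI assignment per sub-service. It does not reflect an ITSI API.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    kpis: list = field(default_factory=list)
    depends_on: list = field(default_factory=list)


def flatten(service, seen=None):
    """List service names top-down (depth-first), each name only once."""
    seen = [] if seen is None else seen
    if service.name not in seen:
        seen.append(service.name)
        for dep in service.depends_on:
            flatten(dep, seen)
    return seen


# Web -> Middleware -> DB, with illustrative KPIs for each sub-service.
db = Service("DB", kpis=["errors", "OS health"])
middleware = Service("Middleware", kpis=["response time", "errors"], depends_on=[db])
web = Service("Web", kpis=["requests", "response time", "errors"], depends_on=[middleware])
```

`flatten(web)` returns `["Web", "Middleware", "DB"]`, which doubles as a checklist of the sub-services that still need KPIs.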
Typical ITSI configuration
● Create Services & entities
  – Import via CSV or Splunk Search or Clone
  – Manual
● Define KPIs
  – Select from available template, Data Model based KPIs, or Splunk Search
● Populate Glass Tables
● Extra capabilities
  – Adaptive Thresholding
  – Anomaly Detection
  – Multi KPI Alerting
  – Custom Correlation Searches
  – Notable Events
Anomaly Detection
● Machine Learning
● Works well for data with patterns
● Requires some “training”
● Can detect entities that aren’t behaving the same as the others, not just a single KPI having issues
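The last bullet, spotting an entity that behaves differently from its peers, can be illustrated with a simple statistic. The sketch below uses a modified z-score based on the median absolute deviation. It is a stand-in for the idea and is not the machine learning algorithm ITSI uses. The function name, threshold and sample values are assumptions.

```python
from statistics import median


def outlier_entities(values, threshold=3.5):
    """Flag entities whose KPI value is far from the peer median.

    values: dict of entity name -> latest KPI value.
    Returns a sorted list of entity names whose modified z-score
    (0.6745 * |value - median| / MAD) exceeds the threshold.
    """
    readings = list(values.values())
    med = median(readings)
    mad = median(abs(v - med) for v in readings)
    if mad == 0:
        return []  # deviations cannot be scaled; flag nothing in this sketch
    return sorted(
        name for name, v in values.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )
```

Given CPU readings of 40, 42, 41 and 95 for `web-01` through `web-04`, only `web-04` is flagged, even though no fixed threshold was ever defined for the KPI.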
Use ITSI as a Manager of Managers, processing 3rd party ALERTS
http://docs.splunk.com/Documentation/ITSI/2.4.1/User/Ingestthird-partyalertsasnotableevents
Notable Events
Wrap Up - Review
● High-value services can be decomposed and modeled in ITSI, using machine data from the relevant systems
● Services and KPIs can be created in minutes, with sophisticated thresholding techniques to distinguish “normal” from “not normal”
● Glass Tables allow service health and KPI metrics to be displayed in a way that makes sense to specific groups, such as Executive Leadership, Business Service Owners, the NOC, DevOps & Others
● Deep Dives allow KPIs to be compared side-by-side across any time range, accelerating root cause analysis and significantly reducing MTTR
● Multi-KPI Alerts and Notable Events reduce alert noise, producing actionable events and a means to manage them
● …and it’s fun to build!
Want to explore on your own?
Sign up for your very own seven-day free sandbox! http://splunk.com/ITSI
Then click:
You’ll find a Sandbox Guide in the Dashboards! In the ITSI app of your sandbox, go to Search -> Dashboards -> Splunk ITSI Sandbox Guide
Sign Up for a Glass Table Exercise
Harness the creativity and domain knowledge of your organization to unlock the value of data and solve an important service problem through a joint service intelligence workshop with key stakeholders
Define methods for:
• Proactive service monitoring
• Reduced risk and failures
• Faster issue resolution
• Increased business performance
What is it?
• 1 Day Onsite Workshop
• Tightly linked with value
• Collaborative approach
• Build your own Splunk ITSI Glass Table…
The 8th Annual Splunk Worldwide Users’ Conference
SEPT 25-28, 2017, Walter E. Washington Convention Center, Washington, D.C.
CONF.SPLUNK.COM
• 5,000+ IT and Business Professionals
• 175+ Sessions
• 80+ Customer Speakers
PLUS Splunk University
• Three days: Sept 23-25, 2017
• Get Splunk Certified for FREE!
• Get CPE credits for CISSP, CAP, SSCP
New Requirements!
● Create a new KPI for the DB Service:
  ● Network Utilization
● Modify the Executive Glass Table in order to show off the services you slave over
“WE only have about 15 min TO DO WHAT???!!???”
Think about how long this would take you today?
A KPI in 5 minutes? Absolutely!
Click New – Generic KPI
Select Data Model
● Host Operating System
● Network
● # bytes
● Next
KPIs Continued….
Splunk Builds Searches for you – Oh Yeah, that’s happening :)
● Select Yes for Split by & Filter options
● Select host for Entity Lookup & Alias options
● Click Next
Almost There…
Select
● KPI Search Schedule: Every Minute
● Entity Calculation: Average
● Service/Agg Calculation: Average
● Calculation Window: Last Minute
● Click Next
● Unit: Bps
● Click Next
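To see what the two Average settings mean when a KPI is split by host, consider this sketch. It mimics the arithmetic only, under the assumption that the Entity Calculation averages each host's samples in the window and the Service/Agg Calculation then averages those per-host results. The function and variable names are hypothetical.

```python
from statistics import mean


def kpi_values(samples):
    """Compute per-entity and aggregate KPI values for one calculation window.

    samples: non-empty list of (host, bytes_per_second) readings
    from the last minute.
    Returns (per-entity averages, aggregate average of those averages).
    """
    per_entity = {}
    for host, value in samples:
        per_entity.setdefault(host, []).append(value)
    entity_avg = {host: mean(vals) for host, vals in per_entity.items()}
    aggregate = mean(list(entity_avg.values()))
    return entity_avg, aggregate
```

With readings `[("db-01", 100), ("db-01", 300), ("db-02", 400)]` the per-entity values are 200 Bps for `db-01` and 400 Bps for `db-02`, and the aggregate is 300 Bps. Per-entity thresholds would be compared with the first result and aggregate thresholds with the second.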
Final Steps…
Set your thresholds:
● Aggregate (All)
● Per Entity
● Click “Add Threshold” TWICE
● Make the Neapolitan ice cream colors Yellow, Green, Yellow
● Drag the sliders around in order to get the current data graph entirely inside the Green (normal) band
● Click Finish
● Other options are also available, including adaptive thresholds and anomaly detection
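Adding two thresholds produces three bands, which is why the colors come out Yellow, Green, Yellow. A small sketch of how a KPI value maps to a band follows. The boundary values are invented for illustration and the function name `band` is hypothetical.

```python
from bisect import bisect_right

# Two thresholds split the value range into three bands.
# These boundary values (in Bps) are assumptions chosen for illustration.
BOUNDARIES = [1_000, 50_000]
COLORS = ["yellow", "green", "yellow"]


def band(value, boundaries=BOUNDARIES, colors=COLORS):
    """Return the color of the band that a KPI value falls into."""
    return colors[bisect_right(boundaries, value)]
```

Here `band(500)` is yellow (suspiciously low traffic), `band(20_000)` is green, and `band(60_000)` is yellow again. Dragging the sliders in the UI corresponds to changing the two boundary values until current data sits in the green band.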
Name that KPI!
From the list of KPIs, select your new one (at the bottom)
● Click on the little pencil next to the name
● Call it “Network Utilization”, with your username up front
● Click on Save at bottom right when finished!
Clone the Glass Table
Return to Saved Glass Tables page (click on Glass Tables in the upper menu bar)
CLICK Edit for “Buttercup Games Business Process (IN PROGRESS)”
• Select Clone
• Title: Add your username to the front
• Permissions: Shared in App
• Click Clone Page
• Click on your new Glass Table from the list, to view it
Edit & Have Fun!
Click on Edit in the upper right corner of your Glass Table
Use the “Services” panel on the left to select Individual KPIs, or Aggregate Service Health Scores
• Choose 2 KPIs from Online Store that would be useful in the “Order Process” section
• Drag the selected widgets onto the canvas, positioning in the gray oval
• What’s the difference between the [icon] and [icon] tools at the top left?
More Fun with the Glass Table Editor…
Use the Configurations panel on the right to edit a selected widget
• Can change the visualization type, drilldown behavior, and other settings
• You should hit Save frequently
• I wonder what Auto Layout does?
• (YIKES!) Revert All Changes might be helpful
Finishing up…
• Add a Service Health Score widget for Online Store under Buttercup
• Choose a Viz Type with a sparkline graph, then resize to make it look pretty
• Modify the Custom Drilldown action to go to the saved glass table, Buttercup Games Online Store
• Bonus Points: Make the label bigger, more readable
• Click Save
• View when done
A Troubleshooting Exercise
Let’s use ITSI to troubleshoot an outage
● Start at your Glass Table, “<UserName> Buttercup Business Process”
● Customer Care reports that unhappy customers are complaining of failures and long delays when trying to purchase
● The calls began coming in at around ten minutes after the hour.
● In the upper right corner of the Glass Table, change the time picker from Now to XX:10:00.0, where XX is the appropriate hour. For example, if it is currently 14:05, set the time picker to 13:10:00.0, then Apply
● This is how we can “time travel” back to see conditions at a particular outage – oh yeah!
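Working out the “appropriate hour” is easy to get wrong near the top of the hour. The sketch below encodes one reading of the rule: use the most recent ten-past-the-hour that has already happened. The slide only gives the 14:05 example, so that rule and the function name `outage_time` are assumptions.

```python
from datetime import datetime, timedelta


def outage_time(now):
    """Return the most recent XX:10:00 that is not later than `now`."""
    candidate = now.replace(minute=10, second=0, microsecond=0)
    if candidate > now:
        # Ten past this hour has not happened yet: go back one hour.
        candidate -= timedelta(hours=1)
    return candidate
```

At 14:05 this yields 13:10:00, matching the example in the exercise. At 14:30 it yields 14:10:00, and shortly after midnight it rolls back to 23:10:00 on the previous day.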
A Troubleshooting Exercise, cont’d
● The Online Store seems to be degraded, just as Customer Care reported. Click on the widget under Buttercup to drill down further
A Troubleshooting Exercise, cont’d.
● The Online Store Glass Table shows a much more detailed view, including the impacted customer-facing KPIs at the far left (Revenue, etc)
● Based on this view of all the relevant services, where do you think the root cause lies?
● Which service should we troubleshoot first?
● Click on Health widget for that service, to drill down to a Deep Dive
Deep Dive
● Deep Dive shows multiple KPIs and Health Scores in parallel “swim lanes”.
● The Health Score for this Service is the top swim lane. Can you see when it begins to degrade from 100%?
● Mousing over this point in time, can you spot the KPI with the leading fault indication, i.e., what failed first?
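Finding the leading fault indication by eye amounts to asking which swim lane crossed into a bad state first. The same question can be put to the data, as in this sketch. The lane names, sample values and breach conditions are invented, and `leading_indicator` is a hypothetical helper, not a Deep Dive feature.

```python
def leading_indicator(lanes, is_bad):
    """Find the KPI that breached first.

    lanes: dict of KPI name -> list of (minute, value), sorted by minute.
    is_bad: dict of KPI name -> function returning True for a bad value.
    Returns (kpi_name, minute_of_first_breach), or None if nothing breached.
    """
    first_breach = {}
    for kpi, series in lanes.items():
        for minute, value in series:
            if is_bad[kpi](value):
                first_breach[kpi] = minute
                break  # only the earliest breach per lane matters
    if not first_breach:
        return None
    kpi = min(first_breach, key=first_breach.get)
    return kpi, first_breach[kpi]


# Illustrative swim lanes around ten past the hour (values are made up).
LANES = {
    "db_response_ms": [(8, 20), (9, 250), (10, 400)],
    "web_error_pct": [(8, 0.1), (9, 0.2), (10, 7.5)],
}
IS_BAD = {
    "db_response_ms": lambda v: v > 200,
    "web_error_pct": lambda v: v > 5,
}
```

`leading_indicator(LANES, IS_BAD)` returns `("db_response_ms", 9)`: the database slowed down a minute before web errors spiked, which points troubleshooting at the DB service first.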
Multi-KPI Alerts and Notable Events
● Click on Notable Events Review
● Multiple KPIs and Health scores can be combined in sophisticated ways to create Multi-KPI alerts
● When a Multi-KPI alert fires, one of the outcomes is the creation of a Notable Event
● Notable Events allow NOC personnel and others to triage and coordinate event management efforts
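One simple way to combine KPIs is to fire only when several of them are unhealthy at once, which is how a Multi-KPI alert cuts noise compared with alerting on each KPI separately. The sketch below shows that idea under assumed severity names and a hypothetical rule format. It is not the ITSI alert definition syntax, and the returned dictionary is only a stand-in for a Notable Event.

```python
# Assumed severity ordering, lowest to highest.
SEVERITY = {"normal": 0, "medium": 1, "high": 2, "critical": 3}


def multi_kpi_alert(statuses, rules, min_matches):
    """Fire when at least `min_matches` KPI rules are satisfied.

    statuses: dict of KPI name -> current severity name.
    rules: dict of KPI name -> minimum severity that counts as a match.
    Returns a dict standing in for a notable event, or None if not fired.
    """
    matched = sorted(
        kpi for kpi, minimum in rules.items()
        if SEVERITY[statuses.get(kpi, "normal")] >= SEVERITY[minimum]
    )
    if len(matched) < min_matches:
        return None
    worst = max(matched, key=lambda kpi: SEVERITY[statuses[kpi]])
    return {"kpis": matched, "severity": statuses[worst]}
```

With `error_rate` at high and `response_time` at critical, a rule set requiring both to be at least high produces one event of severity critical that names both KPIs. If only one of them degrades, nothing fires.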