Top 5 Vm Performance Problems

  • 8/2/2019 Top 5 Vm Performance Problems

    The Top 5 VM Performance Problems

    A system metrics-based approach to detecting, diagnosing and resolving VMware performance issues

    Whitepaper by David Davis and Alex Rosemblat

    Table of Contents

    Introduction
    Virtualization Increases Infrastructure Complexity for VMware Performance Troubleshooting
    Isolating VM Performance Problem Areas within a Layer of the Virtualization Stack
        Guest OS and Application Issues
        Hypervisor Issues
    Identifying the Root Cause of a Performance Issue
    The Top 5 VM Performance Problems
        1. Memory Contention
            Symptoms
            Causes
            Diagnosis
            Resolution
        2. Host to SAN Fabric Issues
            Symptoms
            Causes
            Diagnosis
            Resolution
        3. Storage I/O and Disk Processing Contention
            Symptoms
            Causes
            Diagnosis
            Resolution
        4. Insufficient CPU Resources
            Symptoms
            Causes
            Diagnosing
            Resolution
        5. Network Contention
            Symptoms
            Causes
            Diagnosis

    2011 VKernel Corporation. All rights reserved. http://www.vkernel.com 1

            Resolution
    Accessing Metrics to Diagnose an Issue
    Documenting a Resolution
    Conclusion
        About the Author
        About the Sponsor

    Introduction

    It is not uncommon for the first indication of a virtual machine performance problem to be a call from an end user complaining of slow performance with their application or virtual desktop. Solving end-user application issues quickly and efficiently requires knowledge of an environment's applications, infrastructure, potential bottlenecks, and the common virtual machine performance problems. Being prepared and knowing what to look for can make the difference between solving a VM performance problem quickly and letting a performance problem bring down the entire virtual infrastructure. This whitepaper will first explain why locating performance issues in a virtualized environment can be complicated, and will then describe the top VM performance problems with a focus on issue diagnosis and resolution.

    Virtualization Increases Infrastructure Complexity for VMware Performance Troubleshooting

    When it comes to performance, the virtual infrastructure is more complex to understand than the physical infrastructure. Think of the virtual infrastructure like a funnel with the applications on top, running inside the guest OS which is installed in the virtual machines. Those virtual machines are configured with virtual hardware like virtual CPU, virtual memory, virtual disk, and virtual network. The virtual machines run on the hypervisor that provides shared access to the physical hardware of the server. The scarce physical hardware is burdened with huge pressure from the substantial amount of virtual hardware, virtual machines, guest operating systems, and applications that are riding on top of it.

    Figure 1: The Virtual Infrastructure Funnel That Increases the Efficient Use of Physical Hardware

    Unlike the typical physical server infrastructure design, where a 1-to-1 mapping exists between the application, operating system, physical server, and disk storage, virtualized servers are much different. Virtualization typically requires shared storage and creates a virtual network when implemented. Thus, where it was once easy to determine which admin or group was responsible for what, virtualization forces the server, storage, network, virtualization, and application/database admins to work together in solving performance problems. The ideal capacity management tool will isolate performance issues to one of these particular areas, and allow data center staff to prevent finger-pointing before it begins.

    The whole point of server consolidation is to maximize physical resources to get the most return on investment. However, there is a fine line between maximizing the number of virtual machines on a physical server and overcommitting the physical server (which would result in poor end-user application performance). The VMware administrator must "walk a tightrope" between over- and under-provisioning resources in an attempt to practice "consolidation in moderation". Luckily, there are specific metrics to analyze and tools to help you walk this fine line as safely as possible.

    Isolating VM Performance Problem Areas within a Layer of the Virtualization Stack

    Besides the traditional CPU, RAM, Disk, and Network that data center staff typically analyze to solve performance problems, virtualized environments introduce additional layers (like the hypervisor and its virtual hardware) which allow data centers to place multiple virtual machines, guest operating systems, and applications on a host.

    Figure 2: The Virtualization Stack

    Because of these new layers, solving performance problems in the virtual environment requires an IT team to gain knowledge about what exactly is occurring within each layer and how these computing actions interrelate. Without this understanding, analyzing additional data about what is occurring within each layer to pinpoint a problem can be challenging.

    To solve a performance problem, the data center team must first identify the layer that the problem is in. For instance, is there a physical hardware issue? Is the physical hardware unable to fulfill the resource demand? Is there a problem with the hypervisor? Are virtual machines under-provisioned? If so, which virtual machine is under-provisioned and what resource is the bottleneck? What about the Storage Area Network (SAN) or NAS? Has it reached maximum capacity or did the environment just hit an I/O bottleneck? Then there is the network to fret over...

    The depth and breadth of the areas that need to be examined to diagnose a performance problem can seem overwhelming at first. However, successfully troubleshooting performance issues is a skill that can be quickly learned. In fact, after finishing this whitepaper, you will have walked through the steps to know just how to isolate which layer of the virtualization stack is experiencing issues and what particular area within that layer is causing a performance problem. Better yet, you'll know what to do to solve it.

    Guest OS and Application Issues

    Much of this whitepaper covers performance issues related to particular virtual machine resources. However, it is important to take a moment to mention potential issues with the guest OS and applications.

    Of course, the guest OS without applications does not cause any issues on its own, and very few resources are consumed by a powered-on guest OS that is doing nothing.

    Once applications are initiated and begin demanding resources, problems can begin to occur. The team monitoring performance for the environment should become familiar with the applications running within the virtual infrastructure, tracking resource demands from these applications as they can become a significant source of pressure on physical resources.

    Here are potential issues to watch out for, related to the guest OS and applications:

    • Guest Operating System Issues: Hung processes, failed services, misconfigurations and other issues within the guest OS can tax physical resources unnecessarily.

    • Application-level issues:

        o Resource Hog Applications: Certain applications such as SQL Server can take up all of a resource that is allocated to them, even though the application will not actively use that resource. In the case of SQL Server, the application will take up all memory that it is allocated, no matter the amount given.

        o Changes in Applications: Code-level changes at the application level resulting from updates or other maintenance actions can change resource needs.

        o Changes in User Behavior or Demands: Users can suddenly use applications differently, or demand can suddenly increase for particular applications.

    When it comes to "knowing your apps", administrators should spend time with the power users, application developers, database managers, or whoever has insight into how an application works and how it is used in the infrastructure to understand how application issues may impact performance. These people may be able to explain why demand suddenly increases, why an application uses the resources that it does, or how multi-tiered applications are intertwined.

    Hypervisor Issues

    Other potential causes of trouble may lie in the hypervisor itself. While generally very reliable, this is always a layer to be aware of and to check for potential issues.

    For example:

    • Different Hypervisors are Designed Differently: For example, ESX Server has a service console but ESXi Server does not. Within ESX Server, a backup application could be running in the service console, using a tremendous amount of resources that could slow down all VMs on that host.

    • Hypervisor Configuration: The possibility always exists that a hypervisor could be misconfigured or there could be a software bug. These flaws in the hypervisor or its configuration could cause performance issues for guest VMs. For example, settings at the hypervisor level can affect how resource distribution occurs on large memory pages in ESX or dynamic memory functionality in Hyper-V.

    Identifying the Root Cause of a Performance Issue

    To solve a performance problem, it is important to find the root cause of the issue. For example, if the solution to a performance issue becomes to reboot an ESXi server, the problem may go away temporarily, but that host may or may not have been the root cause and the issue could reappear. Perhaps a VM on that host was over-utilizing the SAN, causing all VMs to slow down. Once the host is rebooted, the performance problem could easily return if the root cause was never identified and dealt with.

    Performance metrics found in the vSphere client and ESXtop can be used to identify root causes at the hypervisor level. Other root causes that are outside of vSphere may be identified with SAN or network performance monitoring and troubleshooting tools.

    The Top 5 VM Performance Problems

    The sections below will go into detail for each of the top 5 VM performance problems. Each section will describe how issues occur within VM functioning, what the symptoms and causes are for each kind of issue, which VM performance metrics to use to diagnose the problem, and then what the possible solutions are once the root cause of the issue has been identified.

    1. Memory Contention

    In VMware vSphere, memory allocations and management are handled by the VMkernel (in the hypervisor). Physical memory is virtualized and provided to the guest virtual machine, which believes that it is receiving real memory. Guest OS memory page table maps are mapped to a pmap and shadow page tables by the hypervisor.

    Memory virtualization, in itself, is not as impressive as the memory overcommitment that vSphere provides and the memory reclamation techniques that are used to reclaim memory from the guest OS (without the guest OS ever being aware of virtualization).

    When memory utilization is nearing its capacity (contention is imminent), vSphere performs memory reclamation by using transparent page sharing (TPS), ballooning (thanks to VMware Tools), memory compression, and host swapping (in that order). Each successive technique adds more latency than the one before it, so the hope is that you rarely get to the point where the later ones have to be used.

    Symptoms

    Identify insufficient memory resources by looking for:

    • Virtual machine slowness
    • Applications freezing in the guest OS
    • VM not responding on the network

    Causes

    What causes insufficient memory conditions?

    • Misbehaving applications: Runaway applications or misconfigured applications can use more memory than they need to.
    • Under-configured VM RAM: This results in VM guest OS swapping.
    • Host memory overhead: Due to over-allocation of RAM across many VMs.
    • Under-configured hosts: Too little RAM, which results in host swapping.
    • Resource constraints set by an administrator
    • DRS not enabled or broken (misbalanced load)

    Diagnosis

    Diagnosing memory contention is not hard when comparing a VM's memory consumed vs memory granted. Here is more detail on those statistics:

    • Average Memory Active in KB (mem.active.average): The amount of physical memory on the ESX/ESXi host that the guest VM is actively using, based on recently touched pages (the smallest of active, consumed, and granted). Watch out for a high memory active value that is approaching total memory capacity on the host.

    • Average Memory Consumed in KB (mem.consumed.average): The amount of guest physical memory consumed by the virtual machine for guest memory, calculated as the amount of memory granted (configured for the VM) less the amount of memory saved by memory sharing techniques. Similar to memory active, you should watch out for a high memory consumed value that is approaching total memory capacity on the host.

    Figure 6: A Memory Utilization Chart from within VMware vCenter

    • Average Overhead for a VM in KB (mem.overhead.average): The amount of memory that the ESX/ESXi host VMkernel uses to manage a running VM. While there isn't a particular threshold for this value, you do need to keep in mind the amount of memory that is being used just to manage the virtual machines' memory (memory that cannot be used by the VMs).

    • Average Memory Swapped In in KB (mem.swapin.average): Virtual memory swapped in from disk to memory. No amount of swapping is a good thing; it indicates that the host is taking one of its last-resort actions to control memory contention.

    • Average Memory Swapped Out in KB (mem.swapout.average): Virtual memory swapped out from memory to disk. No amount of swapping is a good thing; it indicates that the host is taking one of its last-resort actions to control memory contention.

    Figure 7: vSphere counters for measuring memory performance

    • Average Memory Swapped in KB (mem.swapped.average): The amount of memory swapped out to the virtual machine's swap file. Like the swap-in and swap-out numbers, the total amount swapped should, ideally, be zero. Periodic swapping is normal in some environments, but constant swapping is bad. Remember, swapping is a last-ditch memory management technique and will negatively affect performance.

    • Average Memory Reclaimed by Ballooning (mem.vmmemctl.average): The amount of memory that the host has had to reclaim through ballooning. This can indicate that the host is running out of memory (memory contention) and is having to reclaim unused memory from a VM. Ideally, this number should be zero in environments where you want to ensure the best performance. Still, other lower-priority environments may always have some amount of ballooning happening.
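
    The memory counters above can be turned into a simple automated check. The following is a minimal sketch, not part of any VMware tool: the function name, the dictionary layout, and the 90% warning ratio are assumptions made for illustration. Only the counter names and the "ideally zero" guidance for swapping and ballooning come from the list above.

```python
def memory_contention_findings(counters, host_capacity_kb, warn_ratio=0.9):
    """Return human-readable findings for one VM's sampled memory counters.

    counters maps vSphere counter names to sampled averages in KB.
    warn_ratio is an illustrative assumption, not a VMware-recommended value.
    """
    findings = []
    # Any swapping or ballooning suggests the host is reclaiming memory.
    for name in ("mem.swapin.average", "mem.swapout.average",
                 "mem.swapped.average", "mem.vmmemctl.average"):
        if counters.get(name, 0) > 0:
            findings.append(f"{name} is {counters[name]} KB (ideally 0)")
    # Active or consumed memory approaching total host capacity.
    for name in ("mem.active.average", "mem.consumed.average"):
        value = counters.get(name, 0)
        if value >= warn_ratio * host_capacity_kb:
            findings.append(
                f"{name} is at {value / host_capacity_kb:.0%} of host capacity")
    return findings


# Hypothetical sample values for a single VM on a host with 8,000,000 KB of RAM.
sample = {
    "mem.active.average": 2_000_000,
    "mem.consumed.average": 7_600_000,
    "mem.swapin.average": 0,
    "mem.swapout.average": 128,
    "mem.vmmemctl.average": 0,
}
for finding in memory_contention_findings(sample, host_capacity_kb=8_000_000):
    print(finding)
```

    With the sample values, the check reports the non-zero swap-out counter and the consumed memory that is close to host capacity. In practice the counter values would come from the vSphere client or the vCenter API rather than a hard-coded dictionary.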

    Resolution

    These actions can resolve memory over-utilization issues:

    • Migrating the VM to a host with more RAM available
    • Increasing memory for a VM
    • Reducing memory overhead (reclaim memory) by reconfiguring VMs that have over-allocated memory
    • Adding more hosts to a DRS cluster
    • Eliminating the misbehaving application
    • Removing or adding limits and reservations
    • Placing similar VMs together to take advantage of transparent page sharing (TPS)

    For more information on vSphere memory resource management, see VMware: Understanding Memory Resource Management in VMware ESX Server.

    http://www.vmware.com/files/pdf/perf-vsphere-memory_management.pdf

    2. Host to SAN Fabric Issues

    vSphere hosts are connected to storage area networks (SAN) and network attached storage (NAS) through a Fibre Channel (FC) fabric or Ethernet network. These fabrics are designed to perform well and perform with resilience to failure, but they can also be the cause of potential capacity bottlenecks.

    Fibre Channel uses FC switches and HBAs to connect to fibre-connected storage arrays. iSCSI uses software-based or hardware-based initiators to connect to iSCSI targets (the storage array) through standard Ethernet switches and connections. Finally, NFS also uses standard Ethernet adaptors, connections, and switches to connect to an NFS server.

    No matter whether it is NFS, iSCSI, or FC, commands are sent over the fabric to the storage. If storage contention is occurring, these commands will queue up and virtual machines and applications will run slowly (or, in a worse case, freeze).

    In many cases, there is a dedicated storage administrator to handle SAN contention. However, in small to medium-size companies, the virtualization admin and the storage admin may be the same person.

    Figure 8: The Path for Application Commands from Server to Spindle (within the SAN)

    Symptoms

    What are the symptoms of host-to-SAN fabric issues?

    • VM slowness
    • VM freezing
    • Slow file transfers

    Causes

    The fabric connects the host to the disk, and errors, latency, and contention are all major issues.

    • Ethernet or FC switch limitations causing latency
    • Errors on fabric connections
    • Ethernet or FC links maxed out

    Diagnosis

    Identifying storage I/O contention has a lot to do with latency and queue depths.

    • High disk read/write rates (disk.read.average) and (disk.write.average): High read and write rates are going to be relative to your storage infrastructure, so a baseline of normal rates would be required to indicate a problem. You would create a manual baseline by looking at these types of storage statistics during times of average demand to get an idea of what the "normal" load and normal read/write rates are for your infrastructure.
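
    One way to make the manual baseline described above concrete is to summarize rates sampled during normal demand and flag later readings that sit well above them. This is a minimal sketch under stated assumptions: the function names, the sample values, and the three-standard-deviation cutoff are illustrative choices, not a method from this whitepaper or from VMware.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize read or write rates collected during periods of normal demand."""
    return {"mean": mean(samples), "stdev": stdev(samples)}


def exceeds_baseline(value, baseline, num_stdevs=3.0):
    """Return True if value is more than num_stdevs deviations above the mean."""
    return value > baseline["mean"] + num_stdevs * baseline["stdev"]


# Hypothetical disk.read.average samples taken during average demand.
normal_read_rates = [4100, 3900, 4250, 4000, 3850, 4150]
baseline = build_baseline(normal_read_rates)

print(exceeds_baseline(4200, baseline))  # within the normal range
print(exceeds_baseline(9000, baseline))  # far above the baseline
```

    The cutoff is a judgment call. A tighter multiple catches smaller deviations at the cost of more false alarms, so it should be tuned to how variable the normal load is in your infrastructure.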

    Figure 9: vSphere counters for measuring storage performance

    Resolution

    To resolve storage I/O contention, first identify where the contention is; then you can take action.

    • Identify where contention is occurring in the SAN/NAS and resolve it (likely using tools from the storage vendor)
    • Use Storage vMotion (svMotion) to move the virtual disk to a LUN with less contention
    • Repair the SAN or rebuild the disk if you had a hardware failure

    3. Storage I/O and Disk Processing Contention

    Besides the fabric that connects the hosts to the SAN/NAS, storage requests can also experience contention on the actual disk array (aka "disk processing contention"). Storage arrays can only handle a set number of "I/Os per second" (or IOPS). When there are more requests for I/Os than the array can handle, virtual machines (and their applications) will slow down or freeze.

    Figure 10: A Visual Comparison of Disk I/O Traffic

    Storage arrays are made up of storage processors (which have CPU and cache), a backplane that connects everything together, and actual disks (bonded together into RAID groups or LUNs). The performance of all these pieces has an effect on the performance that the end user's application (running on a VM) receives.

    It is particularly important to match the type of RAID selected for a LUN and the characteristics of the workload that will be using it (read-intensive vs write-intensive).

    In many cases, there is a dedicated storage administrator to handle SAN contention. However, in medium-size and small companies, the virtualization admin and the storage admin may be the same person.

    Symptoms

    Storage array contention and storage I/O latency share the same symptoms.

    • VM slowness
    • VM freezing
    • Slow file transfers

    Causes

    Causes of storage array contention can be anything from a hardware issue to a misconfiguration.

    • VM command overload to an area of the SAN
    • Hardware issues
    • RAID configuration
    • Too many virtual machines on a LUN
    • SAN hardware: need more spindles or more cache to increase IOPS

    Diagnosis

    There are several common statistics you would analyze to diagnose storage array contention:

    • Device latency (disk.deviceLatency.average): Host-to-device latency can seriously degrade VM performance. The common threshold that indicates storage device contention is 25 ms or greater.

    • Kernel latency (disk.kernelLatency.average): Time that storage requests spend in the kernel. Ideally this is close to 0; if it is > 2 ms, there is likely a disk queuing issue.

    • Average Disk Queue Latency (disk.queueLatency.average): Time that storage commands spend in the storage request queue. This is included in kernel latency (above). As kernel latency shouldn't be > 2 ms, disk queue latency shouldn't be > 2 ms either.

    • Average Total Disk Latency (disk.totalLatency.average): Average amount of time to process a disk request. This is the average of the sum of all kernel and disk latency for both read and write requests. Ideally, this shouldn't be > 25 ms.

    • Commands Aborted (disk.commandsAborted.summation): Disk command aborts issued by virtual machines because storage is not responding. For Windows VMs this happens after 60 seconds by default. It can be caused, for instance, when paths have failed or the array is not accepting any I/O for whatever reason. Any number > 0 for commands aborted indicates a storage issue.

    Figure 11: ESXTop showing disk performance statistics

    • Disk Bus Resets (disk.busResets.summation): Also on the physical disk, disk bus resets show a problem with the storage or storage overload. This shouldn't be greater than zero.
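
    The thresholds in the list above lend themselves to a table-driven check. The sketch below is illustrative only: the function name and input format are assumptions, and the 2 ms limit for kernel and queue latency follows the queue-latency bullet above. Latency values are assumed to be in milliseconds.

```python
import operator

# (counter name, comparison, limit) taken from the thresholds quoted above.
CHECKS = [
    ("disk.deviceLatency.average", operator.ge, 25),   # 25 ms or greater
    ("disk.kernelLatency.average", operator.gt, 2),    # should not exceed 2 ms
    ("disk.queueLatency.average", operator.gt, 2),     # should not exceed 2 ms
    ("disk.totalLatency.average", operator.gt, 25),    # should not exceed 25 ms
    ("disk.commandsAborted.summation", operator.gt, 0),
    ("disk.busResets.summation", operator.gt, 0),
]


def storage_findings(counters):
    """Return the names of sampled counters that breach their threshold."""
    return [name for name, compare, limit in CHECKS
            if name in counters and compare(counters[name], limit)]


# Hypothetical sample for one host/LUN pair.
sample = {
    "disk.deviceLatency.average": 25,
    "disk.kernelLatency.average": 1,
    "disk.commandsAborted.summation": 3,
}
print(storage_findings(sample))
```

    Keeping the limits in one table makes it easy to adjust them for a particular array, since acceptable latency depends on the storage hardware in use.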

    Resolution

    How do you resolve storage array contention?

    • svMotion the virtual disk to a LUN with less contention
    • Solve the storage hardware issue
    • Change the RAID configuration to match the workload
    • Upgrade storage hardware with more spindles or more cache
    • Reconfigure the application
    • Implement Storage I/O Control (vSphere 4.1 and later) to ensure that critical applications receive the bandwidth they need based on your resource allocations

    4. Insufficient CPU Resources

    In VMware vSphere, CPU instructions from the guest operating systems move from the virtual CPUs to the physical CPU by way of the scheduler in the VMkernel (the hypervisor). Just as a Windows guest OS has its own CPU scheduler, so does the hypervisor. However, the hypervisor faces a more difficult challenge because it is scheduling processes not just from one OS but from 10 to 30 or more operating systems, all of which can have one or more vCPUs.

    If there aren't sufficient CPU resources, the scheduler will have to tell the vCPUs to "wait" until a slot becomes available. While the vCPU of a VM is waiting for CPU time, the guest OS and the applications will be slow or non-responsive.

    Symptoms

    Identify insufficient CPU resources by looking for:

    • Virtual machine slowness
    • Applications freezing in the guest OS
    • VM not responding on the network

    Causes

    The causes of insufficient virtual machine CPU resources are:

    • Under-configured hosts: Not enough pCPUs on the host hardware for the demands of the vCPUs (and their associated VM, guest OS, and applications).
    • Under-configured VMs: Not enough vCPUs on a VM to fulfill the needs of a multi-threaded application or multiple applications demanding a large amount of CPU resources.
    • Virtual machines over-provisioned with unnecessary vCPUs: i.e., multiple SMP vCPUs that are not being used.
    • Poor VM placement: Virtual machines placed on hosts that don't have the resources they need (and DRS not being enabled).
    • Misbehaving applications: Unnecessary, misconfigured, or malicious applications consuming CPU.
    • Resource constraints set by an administrator: A VM CPU limit could be set by an admin.
    • DRS not enabled or broken (misbalanced load).

    Diagnosing

    In the graph below, you see a host that has maxed out its CPU capacity, likely due to demands from VMs.

    Figure 3: A CPU Utilization Chart from within VMware vCenter

    VMs suffering from insufficient VM CPU resource issues can be diagnosed by looking at the following vSphere Client statistics (or any application pulling statistics using the vCenter API):

    • Average CPU Usage in Percent (cpu.usage.average): Shows the average usage of a vCPU or the entire VM (depends on the object selected). Utilization between 70 and 90% should be considered a prime suspect contributing to a performance bottleneck.

    Figure 4: vSphere counters for measuring CPU performance

    • Average CPU Usage in MHz (cpu.usagemhz.average): Shows the average number of megahertz used by a vCPU or the entire VM (depends on the object selected). The threshold where this is considered a bottleneck is dependent on the number of pCPUs you have and the number of MHz available.

    • Average CPU Ready in ms (cpu.ready.summation): A threshold of 10% CPU ready is considered a bottleneck. However, the vSphere client (and vCenter) shows CPU ready in milliseconds and takes samples every 20 seconds (20,000 ms). Thus, to convert a vSphere client CPU ready value to a percentage, divide it by 20,000 and multiply by 100. CPU ready is one of the best indicators of a CPU bottleneck, but you have to be careful that you are calculating the % value correctly and that you are looking at either a system with just 1 vCPU or that you divide a multi-vCPU VM's CPU ready by the number of vCPUs.

    ESXtop does show CPU Ready as a percentage by default. Notice how there are two VMs in the ESXtop output below that have CPU Ready values > 10%.
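
    The CPU ready conversion described above can be written as a small helper. This is a sketch under stated assumptions: the function name and parameters are hypothetical, and the default assumes the 20-second (20,000 ms) real-time sampling interval, so longer chart intervals would need a different interval_s value.

```python
def cpu_ready_percent(ready_ms, interval_s=20, num_vcpus=1):
    """Convert a cpu.ready.summation value (ms) to a per-vCPU percentage.

    ready_ms   -- summed ready time reported for the sample interval
    interval_s -- length of the sample interval in seconds (assumed 20)
    num_vcpus  -- number of vCPUs the ready time was summed across
    """
    if interval_s <= 0 or num_vcpus <= 0:
        raise ValueError("interval_s and num_vcpus must be positive")
    interval_ms = interval_s * 1000
    return ready_ms * 100 / (interval_ms * num_vcpus)


# 2,000 ms of ready time in a 20 s sample on a 1-vCPU VM is at the 10% threshold.
print(cpu_ready_percent(2000))
# The same per-vCPU calculation for a 4-vCPU VM reporting 4,000 ms in total.
print(cpu_ready_percent(4000, num_vcpus=4))
```

    Dividing by the number of vCPUs matters because the VM-level summation adds up ready time across all vCPUs, which would otherwise overstate contention on multi-vCPU VMs.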

    Figure 5: ESXTop Showing High Ready Time on Virtual Machines

    Resolution

    How is a VM performance issue stemming from insufficient CPU resources solved? Here are the options:

    • Migrate the VM to a host with more CPU available (likely with vMotion so that there is no downtime)
    • Increase the number of vCPUs on a VM that has legitimate CPU demands
    • Remove unneeded vCPUs on a VM that has low CPU utilization and is over-provisioned
    • Add more hosts to a DRS cluster
    • Eliminate the misbehaving application or reduce its utilization of CPU

    5. Network Contention

    vSphere's hypervisor creates virtual network adaptors for each VM (the vNICs) that connect to virtual switches (the virtual network). Those virtual switches likely have uplinks to the physical network using pNICs. Network requests go from the guest OS to the vNIC and then to the pNIC to go to the physical network.


    If there aren't enough network resources available to meet the demands of the applications in the VMs, then the network response will slow. All VMs could be affected if the network contention affects iSCSI traffic.

    Symptoms

    How do you know if you have network congestion?

    VM slowness

    VM freezing (because commands run across a SAN or NAS)

    Inability to ping VMs

    A loss of connectivity between vCenter and the ESX/ESXi servers

    Causes

    What could cause network congestion?

    Maxed out network paths

    Switch hit maximum throughput

    Physical NIC/cable issue

    Not enough pNIC uplinks to the virtual network

    Application traffic storm (possibly an end user running a torrent?)

    Diagnosis

    What do you do to find out if you have a network congestion problem? Check out the following statistics in the vSphere client:

    Figure 12: vSphere counters for measuring network performance


    Average Network KB Received (net.received.average): Check the kilobytes (KB) received to see if the traffic is inbound. Make sure that you aren't exceeding the throughput of your network adaptor (e.g., a 1 Gb Ethernet adaptor can't receive more than 1 gigabit per second, roughly 125,000 KB per second).

    Average Network KB Transmitted (net.transmitted.average): Check the KB sent to see if the traffic is outbound. Make sure that you aren't exceeding the throughput of your network adaptor (e.g., a 1 Gb Ethernet adaptor can't transmit more than 1 gigabit per second, roughly 125,000 KB per second).

    Average Network Usage in KB (net.usage.average): Verify the average network utilization and make sure that you aren't exceeding your physical NIC throughput capacity (e.g., a 1 Gb Ethernet adaptor carries at most 1 gigabit per second, roughly 125,000 KB per second, in each direction).
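    Because the network counters report kilobytes per second while NICs are rated in gigabits per second, a unit conversion is needed before comparing the two. The sketch below shows one way to do it, assuming decimal units (1 Gbps = 125,000 KB per second); the function names and the 80% warning level are illustrative assumptions, not vSphere settings.

```python
def link_capacity_kbps(link_gbps):
    """Capacity of a link in kilobytes per second, one direction.

    Assumes decimal units: 1 Gbps = 1,000,000 kilobits/s = 125,000 KB/s.
    """
    return link_gbps * 1_000_000 / 8


def link_utilization_percent(observed_kbps, link_gbps=1.0):
    """Percentage of one direction of the link used by the observed rate."""
    return observed_kbps * 100 / link_capacity_kbps(link_gbps)


def is_near_saturation(observed_kbps, link_gbps=1.0, warn_pct=80.0):
    """Return True when the observed rate meets or exceeds warn_pct of capacity.

    warn_pct is a hypothetical warning level chosen for illustration.
    """
    return link_utilization_percent(observed_kbps, link_gbps) >= warn_pct


if __name__ == "__main__":
    # A net.received.average reading of 110,000 KBps on a 1 Gb uplink
    print(link_utilization_percent(110_000))
    print(is_near_saturation(110_000))
```

    A reading of 110,000 KB per second on a 1 Gb uplink is 88% of the link's one-way capacity, so this sketch would flag it; the same reading on a 10 Gb uplink would be under 9%.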

    Resolution

    Solving network congestion is relatively straightforward and usually requires hardware changes:

    Upgrade Network Hardware: Replace 1 Gb Ethernet with 10 Gb Ethernet and upgrade Ethernet switches to match

    Add more NICs and other host hardware: Add more pNICs to the server and, potentially, bond them together

    Resolve vSwitch Issues: Reconfigure vSwitch security and bandwidth policies as needed

    Accessing Metrics to Diagnose an Issue

    The metrics you'll use to identify and solve performance problems are found in:

    vCenter through the vSphere Client: The performance tab is available at just about every level of the vSphere client. The custom performance charts allow you to graph every statistic shown in this whitepaper


    Figure 13: Creating a custom vSphere performance chart for measuring network performance

    ESXtop run on an ESX/ESXi host console: Available on both ESX and ESXi, ESXtop gives greater detail into the performance statistics than what was covered in this whitepaper


    Figure 14: ESXtop showing vSphere performance statistics

    Third-party application: Third-party applications like VKernel's vOperations Suite can pull out the necessary VM metrics data and then automatically identify current performance problems, forecast emerging issues, and provide actionable recommendations on how to solve the issue.

    Documenting a Resolution

    Once a VM performance problem has been identified, diagnosed, and resolved, a virtualization administrator needs to ensure that the resolution is understood by all stakeholders and documented. By doing so, the datacenter staff has a better chance at preventing the problem in the future and ensuring that other administrators can solve similar issues quickly if they occur.

    For example, some resolution documentation may mention details such as:

    VM running production database reached 80% memory utilization

    Used metrics to determine how much additional memory needed to be added (factor in some additional memory for expected growth rate of the VM)

    Used vSphere hot add to add additional memory to running VM

    Reviewed resource utilization in the cluster to ensure that addition of resources to this VM didn't negatively affect others
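    The sizing step in the example above can be written down as simple arithmetic. The sketch below is one possible approach, not a method prescribed by this whitepaper: it inflates current usage by an expected growth rate, sizes the VM so that the projected usage sits at a target utilization, and rounds the shortfall up to an allocation step. The 70% target, 20% growth rate, and 256 MB step are all hypothetical values chosen for illustration.

```python
import math


def memory_to_add_mb(configured_mb, used_mb, target_util=0.70,
                     growth=0.20, step_mb=256):
    """Estimate how much memory (MB) to add to a VM.

    configured_mb: memory currently assigned to the VM
    used_mb:       observed memory in use
    target_util:   desired utilization after the change (assumed 70%)
    growth:        expected growth in usage (assumed 20%)
    step_mb:       round the result up to a multiple of this size
    """
    if not 0 < target_util <= 1:
        raise ValueError("target_util must be in (0, 1]")
    projected_mb = used_mb * (1 + growth)
    required_mb = projected_mb / target_util
    shortfall_mb = max(0.0, required_mb - configured_mb)
    return int(math.ceil(shortfall_mb / step_mb) * step_mb)


if __name__ == "__main__":
    # An 8 GB VM running at 80% memory utilization
    print(memory_to_add_mb(8192, 0.80 * 8192))
```

    Under these assumptions, an 8 GB VM at 80% utilization would need about 3 GB added. Recording the inputs alongside the result makes the resolution documentation easier for other administrators to verify later.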

    Conclusion

    VM performance problems are a top concern for many application administrators when a virtualization initiative begins and can become a quagmire for a virtualization administrator if an issue strikes. With the right knowledge about where problems may lie, how to diagnose them and then resolve them, virtualized datacenters and cloud initiatives can quickly eliminate any VM performance issues that occur. Further, by instituting processes to monitor changes in the VM system metrics described in this whitepaper, emerging issues can be identified and resolved before they become noticeable performance issues.


    About the Author

    David Davis is the author of the bestselling VMware vSphere video training library from Train Signal. He has written hundreds of virtualization articles on the Web, is a vExpert, VCP, VCAP-DCA, and CCIE #9369 with more than 18 years of enterprise IT experience. His personal Website is VMwareVideos.com.

    About the Sponsor

    VKernel is the number one provider of performance and capacity management products for virtualized data centers and cloud environments. The company's award-winning, easy-to-use and powerful products simplify the complex and critical tasks of real-time VMware performance monitoring, capacity planning, resource optimization, reporting and chargeback for virtual environments. Used by over 50,000 virtualization administrators, VKernel's products have proven their ability to immediately identify and resolve VM performance problems, maximize capacity utilization, and reduce virtualization costs.

    http://www.trainsignal.com/VMware-Training.aspx

    http://www.vmwarevideos.com/