Drs Performance Best Practices Wp

download Drs Performance Best Practices Wp

of 19

Transcript of Drs Performance Best Practices Wp

  • 8/7/2019 Drs Performance Best Practices Wp

    1/19

    Performance Study

    Copyright 2008 VMware, Inc. All rights reserved. 1

    DRS Performance and Best PracticesVMware Infrastructure 3

    VMwareInfrastructure3providesasetofdistributedinfrastructureservicesthatmaketheentireIT

    environmentmoreserviceable,available,andefficient.WorkingwithVMwareESX3,VMwareVirtualCenter

    2,andVMwareVMotion,VMwareDistributedResourceScheduler(DRS)dynamicallyallocatesresourcesto

    enforceresourcemanagementpolicieswhilebalancingresourceusageacrossmultipleESXhosts.

    ThispaperisintendedforreaderswhounderstandthearchitectureandbasicconceptsofDRS.Seethepaper

    ResourceManagementwithVMwareDRSforthesedetails.(SeeResourcesonpage 19foralink.)This

    paperidentifiesvariousscenariosinwhichyoucanbenefitfromDRSandexplainshowtoconfigureyour

    environmenttotakebestadvantageofDRS.

    Thepapercoversthefollowingtopics:

    OverviewofDRSPerformanceonpage 1

    ResourceAllocationforVirtualMachinesandResourcePoolsonpage 3

    ScalingtheNumberofHostsandVirtualMachinesonpage 9

    DRSAggressivenessonpage 13

    IntervalBetweenDRSInvocationsonpage 14

    CostBenefitAnalysisonpage 15

    HeterogeneousHostEnvironmentsonpage 15

    ImpactofIdleVirtualMachinesonpage 16

    DRSPerformanceBestPracticesonpage 16

    MonitoringDRSClusterHealthonpage 17

    Conclusionsonpage 19

    Resourcesonpage 19

    Overview of DRS PerformanceVMwareDistributedResourceSchedulerimprovesresourceallocationacrossallhostsandresourcepoolsin

    acluster.WhenyouenableaclusterforDRS,VirtualCentercontinuouslymonitorsthedistributionofCPUand

    memoryresourceusageforallhostsandvirtualmachinesinthatcluster.DRScomparesthesemetricstoideal

    resourceutilizationthatis,thevirtualmachinesentitlements.Theseentitlementsaredeterminedbasedon

    theresourcepoliciesoftheresourcepoolsandvirtualmachinesintheclusterandtheircurrentdemands.

    VirtualCenterusesthisanalysistoperforminitialplacementofvirtualmachines,virtualmachinemigration

    forloadbalancing,enforcementofrulesandpolicies,anddistributedpowermanagement,ifdistributed

    powermanagementisenabled.

    ThisstudyfocusesonunderstandingtheeffectivenessandscalabilityofDRSalgorithms.

  • 8/7/2019 Drs Performance Best Practices Wp

    2/19

    Copyright 2008 VMware, Inc. All rights reserved. 2

    DRS Performance and Best Practices

    Effectiveness of DRS Algorithms

    OurgoalinthesetestswastoseeifDRSiseffectiveinimprovingperformancewhenthereisresource

    contentionandifDRScanallocateresourceswhilebalancingloadinproportiontotheresourcepolicies

    allocatedtovirtualmachines.WeevaluatedvariousconfigurationsoftheDRSalgorithmandcomparedhow

    effectivetheyareinvarioususecases.

    Weexaminedwhethertheshares,reservations,andlimitsofavirtualmachineorresourcepoolareeffectively

    enforced

    by

    DRS

    and

    how

    the

    migrations

    recommended

    by

    DRS

    affect

    the

    performance

    of

    the

    affected

    cluster

    asawhole.Weusedvirtualmachinethroughput(orthethroughputofmultiplevirtualmachines)asa

    measureofhoweffectiveDRSis,inthepresenceofvariousworkloadswithvariouspolicysettings.

    TheloadsshouldachievethroughputthatisapproximatelyproportionaltotheirCPUandmemory

    entitlements.TheseresourceentitlementsareameasureofhowmanyCPUcyclesandhowmuchmemorythe

    virtualmachineorresourcepoolshouldreceiveandarecalculatedbasedontheshares,reservations,and

    limitstheresourcepoliciessetinVirtualCenterandtheestimatedCPUandmemoryresourcesdemanded

    bythevirtualmachineorresourcepool.Byplacingormigratingthevirtualmachineseffectivelybasedon

    theseentitlements,DRSensuresthatthevirtualmachineorresourcepoolgetswhatitdeserveswithout

    violatingtheshares,reservationsandlimitpolicy.

    Scalability of DRS Algorithms

    DRSsupportsacertainnumberofhostsandvirtualmachinesinaDRScluster.Thegoalinourstudywasto

    seehowDRSscaledasweincreasedthenumberofhostsandvirtualmachines.Asthenumberofvirtual

    machinesandhostincreases,thepossibilitiesforbalancingwithvirtualmachinemigrationsalsoincrease.On

    theotherhand,eachbalancingmigrationdoeshavearesourceoverhead.Thismighteffectivelyimproveor

    worsenclusterthroughputaswescaledtheenvironmenttoincludemorehostsandvirtualmachines.

    Furthermore,throughputimprovementdependsheavilyonthecharacteristicsoftheworkloadforexample,

    placementofthevirtualmachinesandvariabilityofworkloads.Weexploredanumberofdifferentworkloads,

    atvariousscales,toseehowDRShandledeachofthem.

    Dimensions for DRS Performance Analysis

    TobetterunderstandtheperformanceofDRS,wecharacterizedDRSalongseveraldimensions.These

    dimensionsinclude:

    Resourceallocationforvirtualmachinesandresourcepools

    Numberofhostsandvirtualmachines

    Degreeofaggressiveness

    DRSperiodicfrequency

    Heterogeneoushostenvironments

    CostbenefitanalysisofVMotion

    Impactofidlevirtualmachines

  • 8/7/2019 Drs Performance Best Practices Wp

    3/19

    Copyright 2008 VMware, Inc. All rights reserved. 3

    DRS Performance and Best Practices

    Resource Allocation for Virtual Machines and Resource Pools

    DRSensuresthatresourcepoliciesaresatisfiedandbalancesloadacrossaclusterwhenappropriate.Inorder

    toevaluateitsperformanceimpact,weusedthroughputofvariousworkloadsasameansofcomparisonin

    variouscaseswitharealisticload.DRSalgorithmsshouldresponddynamicallytotheresourceallocation

    changesforvirtualmachinesandresourcepools.Wesetupseveralexperimentstoverifythisbehaviorand

    evaluateDRSeffectiveness,seekingtoanswerthefollowingquestions:

    Can

    DRS

    automatically

    reduce

    the

    load

    on

    an

    overcommitted

    host

    when

    additional

    resources

    are

    added

    oravailableintheclusterandconsequentlyalsoimprovethethroughputfortheworkloads?

    HowdoesDRSrespondtoahighlyimbalancedcluster?HowdoesDRSaggressivenessaffectthe

    placementofvirtualmachines?

    CanDRSeffectivelyenforceresourceallocationpoliciesacrosshostsinaDRSclusterbysettingdifferent

    sharevaluesforindividualvirtualmachines?

    Test Methodology

    Thissectiondescribesthetypesofbenchmarkworkloadsused,resourcesdemanded,andmetricsusedtotest

    theimpactofresourceallocationonDRS.

    We

    setup

    a

    homogenous

    cluster

    of

    hosts,

    each

    configured

    with

    two

    2.8GHz

    Intel

    Xeon

    CPUs

    with

    hyperthreadingenabled.Eachhosthad4GBofmemory.WeusedadedicatedGigabitEthernetnetworkfor

    VMotionandconfiguredDRStorunatthedefaultmigrationthreshold.

    Forthisfirstsetoftests,weusedbenchmarkstoreflectrealisticworkloads.Thestableperiodfortheseloads

    wasanhourduringwhichwetookmeasurements.Table1liststheseworkloads.

    Test Results

    ToverifythatDRScanautomaticallyreducetheloadsonanovercommittedhost,weranthefiveworkloads

    showninTable1,eachinitsownvirtualmachine,ononehost.Allvirtualmachineshadequalsharesandno

    reservationsorlimits.Figure1showsthiscase.Eachcirclerepresentsonevirtualmachine,withdifferent

    lettersrepresentingthedifferentworkloads,andtheboxesrepresentingseparatephysicalhosts.Thesizeof

    eachcirclerepresentshowmanysharesaregiventothatvirtualmachine,nottheCPUormemoryresource

    demandsofthevirtualmachine.DetailsontheresourcedemandsarelistedinTable1.

    Tests for Load Balancing with Increase in Available Resources

    WeusedamixofvirtualmachinesthatdemandedmoreCPUresourcesthanwereavailableonthehoston

    whichweplacedthem.Wemeasuredthethroughputforeachvirtualmachineasweaddedorpoweredona

    newhostinthecluster.

    Table 1. Workloads for Resource Allocation Tests

    Virtual Machine Workload Benchmark Resource Demand Metric

    Databaseserver Swingbench CPU:850MHzMemory:700MB

    Transactions/sec

    Webserver SPECweb2005 NetworkI/O:30Mb/sCPU:1000MHz

    Accesses/sec

    Fileserver Dbench DiskI/O:120Mb/sCPU:550MHz

    MB/sec

    Javaserver SPECjbb2005 Memory:1.7GBCPU:550MHz

    Neworders/sec

    Idlevirtualmachine None None None

  • 8/7/2019 Drs Performance Best Practices Wp

    4/19

    Copyright 2008 VMware, Inc. All rights reserved. 4

    DRS Performance and Best Practices

    ThevirtualmachineworkloadsinFigure1andsubsequentdiagramsareidentifiedusingthefollowingkey:

    Ddatabaseserver

    Ffileserver

    Iidle

    JJavaapplicationserver

    WWebserver

    Figure 1. Initial Workload Distribution with DRS Disabled

    Figure 2. Workload Distribution with DRS Enabled

    WhenweenabledDRS,itmovedsomeoftheloadontheovercommittedhosttotheidlehostasshownin

    Figure2.Usingthedefaultmigrationthreshold(moderate),DRSimprovedoverallsystemthroughput

    (geometricmeanoftheindividualthroughputimprovements)by27percent.Theseresultsconfirmthatsimple

    balancingprovidedbyDRScanautomaticallybringsignificantimprovementsinthroughputwithouttheneedtoplacevirtualmachinesmanually.

    CPUintensiveloadsimprovedsignificantly,becausesystemCPUresourceswereovercommittedbeforewe

    enabledDRS.Figure3showsthenormalizedthroughputimprovementforindividualvirtualmachinesused

    inthistest.Thedatabaseserverthroughputimprovedby25percentbecausemoreCPUresourceswere

    availableafterweenabledDRS.WebserverthroughputimprovedsignificantlyafterweenabledDRS.Because

    theWebserverisanetworkintensiveload,itbenefitedfromDRSbecausethenetworkI/Ocausesincreased

    CPUload.Ontheotherhand,theJavaapplicationserverthroughputremainedthesame,becausetheJava

    serverworkloadismemoryintensive,andthesystemdidnotexperienceanymemorycontentionduringthe

    test.Wedidnotuseafileserverworkloadforthistest,becausefileserverdemandsarelativelylowlevelof

    CPUresources.Instead,weusedasecondinstanceoftheWebserverworkloadforthistest,whichrequires

    moreCPUresourcesthanthefileserverworkloaddoes.

    D W

    J

    IW

    D

    I

    W

    J

    W

  • 8/7/2019 Drs Performance Best Practices Wp

    5/19

    Copyright 2008 VMware, Inc. All rights reserved. 5

    DRS Performance and Best Practices

    Figure 3. Throughput Normalized to Baseline Individual Workload Throughput

    Tests for Robust Response to Highly Imbalanced Cluster

    TofurthervalidatetherobustnessoftheDRSalgorithms,wethenran10workloads,twoofeachoftheloads

    describedinTable1,ontwoseparateESXhosts.Onceagain,wegaveallvirtualmachinesequalsharesandno

    reservationsorlimits.Figure4depictsthescenariosthatweattemptedtovalidateduringthistest.

    Figure 4. Distribution of Workloads When DRS Aggressiveness Is Adjusted

    Scenario1showsthebaselineorbalancedthroughputconfiguration.Wechosethisasthebalanced

    arrangementbecauseputtingoneofeachvirtualmachineperhostbalancestheCPU,memory,andI/O

    demandsevenlywhileignoringlocalityorcachingeffect.WeusedthisasourbaselineandweexpectedDRS

    toachieveathroughputclosetothis.

    Scenario2depictstheworstpossibleconfiguration.WestartwiththisconfigurationtoseehowDRSbalances

    theloadofthesevirtualmachines.

    Weperformedthetestusingdefault(moderate)andaggressivemigrationthresholdsettingstotestwhether

    theresourceallocationthatDRSgeneratescanmatchthebaselineconfiguration.Theworstpossible

    configurationcouldnotbesustainedononehostwithoutfailuresinthebenchmarkmetricsbecauseof

    resourceconstraints.SoweinsertedadelaybetweenthestartofonevirtualmachineandthenexttoallowDRS

    tobalanceresourcesasweaddedeachnewloadtothefirsthost.

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    1.2

    DB server Java server Web server 1 Web server 2

    DRS disabled DRS enabled

    D W

    J

    IF

    Scenario 2 DRS enabled (before)Scenario 1 baseline (optimal)

    D W

    J

    IF

    D W

    J

    IF

    D W

    J

    FS

    D W

    J

    IF

    Scenario 2b aggressive DRS (after)

    DD W

    J

    W

    J

    IF

    Scenario 2a moderate DRS (after)

  • 8/7/2019 Drs Performance Best Practices Wp

    6/19

    Copyright 2008 VMware, Inc. All rights reserved. 6

    DRS Performance and Best Practices

    OurresultsshowedthatthevirtualmachineplacementaswellassystemthroughputwithDRSwasveryclose

    tothatofthebaselineconfiguration.Theoverallsystemthroughputwas95percentofthatofthebaseline

    configurationfortheDRSconfigurationwithamoderatethresholdand99percentofthatofthebaseline

    configurationwiththeaggressivethresholdsetting.Thisincludestheimpactonthroughputcausedbythe

    virtualmachinemigrationsneededtogettothefinalstateofthecluster.Becausetheloadswereconstant,the

    finalstatewasachievedquicklyaftertheloadsreachedstablestateandnofurthermigrationsweremade.

    Forthetestwiththedefaultthresholdsetting(moderate),DRSperformedfourmigrationsthatis,itmoved

    oneWebserver,oneJavaserver,andthetwodatabaseservers.

    Thetestusingtheaggressivethresholdyieldedanalmostbaselineconfiguration.Thistestperformedmore

    migrationsatotalofeighttogettothisstate.ThenumberofmigrationsreflectsthefactthatDRSload

    balancingoccurswhiletheloadisrampingup,andhenceinthemoreaggressivecase,somemigrationswere

    revertedafterallloadsreachedsteadystate.Theidlevirtualmachinewasnotmigratedbecausethecostof

    migratingthevirtualmachinecouldnotbejustified,giventhattherewasnoload.Becausetheloadsare

    relativelystableovertime,weobtainedbetterthroughputwithanaggressivethreshold.However,thedefault

    settinginDRSismoderateinordertoaccommodatechangingworkloads.

    Figure5comparesthesystemthroughputforindividualworkloadsinthebaselinecase,usingDRSwitha

    moderatethreshold,andusingDRSwithanaggressivethreshold.Wenormalizedallthroughputresultstothe

    baselinecaseforeasiercomparisonandmeasuredtheresultsoverastableperiodofanhour.

    Inthemoderatecase,allindividualthroughputsarewithin80percentofbaselineandoverallwithin95

    percent(geometricmeanofalltherelativethroughputs)ofbaseline.ThisistruedespitethefactsthatDRShas

    anopportunitytorebalanceonlyonceeveryfiveminutes(default)andthatitisaffectedbyVMotionoverhead.

    WhenweusedDRSwithamoderatethreshold,thedatabaseserverswerethemostaffectedbecauseDRS

    placedbothdatabasevirtualmachinesonthesamehost.Migratingoneofthedatabaseloadstoadifferenthost

    wasnotjustifiedunderthemoderatethresholdbecausetheimbalanceacrosshostswasnotgreatenoughto

    justifyit.Someworkloadsachievedbetterthroughputthanthebaselinebecauseadditionalresourceswere

    availableonthehostonwhichtheworkloadwasrunning.

    Inthemoreaggressivecase,theresultsaredifferent.Becauseeachworkloadwasfairlyconstant,overtime

    DRSaggressiverebalancingproducedbetterthroughput.

    Iftheloadsvary,DRSwithamoderatethresholdmayachievebetterthroughputinthelongrunbecauseit

    would

    recommend

    fewer

    migrations

    and

    hence

    incur

    a

    lower

    VMotion

    cost

    compared

    to

    DRS

    with

    an

    aggressivethreshold.

    Whenweincreasedthenumberofhosts,asdescribedinScalingtheNumberofHostsandVirtualMachines

    onpage 9,DRSwithamoderatethresholdachievedthroughputcomparabletothatofDRSwithanaggressive

    thresholdanddidsowithfewermigrations.

    Figure 5. Relative Change in Throughput of Workloads

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    1.2

    File server 1 File server 2 DB server 1 DB server 2 Java server 1 Java server 2 Web server 1 Web server 2

    Baseline DRS moderate (default) DRS aggressive

  • 8/7/2019 Drs Performance Best Practices Wp

    7/19

    Copyright 2008 VMware, Inc. All rights reserved. 7

    DRS Performance and Best Practices

    Enforcing Resource Allocation Policies

    InaVMwareInfrastructure3environment,youcanallocateresourcesusingabsolutememoryandCPU

    bounds(reservationandlimit),proportionalvalues(shares),oracombinationofthetwoforthevirtual

    machinesandresourcepoolsinthecluster.Areservationenforcesaguaranteedminimumforresourcesand

    alimitenforcesahardmaximum.Sharesettingscanbelow(1),normal(2),high(4),orcustom.Youcan

    assigndifferentsharevaluesforCPU,memory,orbothtoensureproportionalresourceallocation.ESX

    providesresourceswhileensuringtheallocationdoesnotviolatetheassignedreservationsandlimits(inHz

    orbytes).

    TotesttheeffectivenessofDRSincombinationwithresourceallocationfeatures,weranvarioustypesof

    workloadvirtualmachineswithdifferentsharevalues.Figure6showstheninevirtualmachinesrunningon

    twohostsbeforeDRSisenabled.Thesizeofeachcircleinthisfigurereflectsthenumberofstaticshares

    assignedtotheworkload,nottheactualresourceusageofeachworkload.Hence,workloadswithdifferent

    resourceusagecouldhavecirclesofthesamesize.Thesharesareassignedintheratioof1:2:4.Welocatedall

    low(1)sharevirtualmachinesonthesamehostatthestartofthetestandlocatedthevirtualmachineswith

    normal(2)andhigh(4)sharesonadifferenthost.

    Figure 6. Initial Distribution of Virtual Machines with High Resource Contention

    WeassignedthethreeWebservervirtualmachineshigh,normal,andlowshares.WeplacedthetwoWeb

    serverswithhighandnormalsharesonthesamehost,hencetheyshareresourcefromonehost.Thevirtual

    machinewiththelowsharesettinghasnootherWebservertocontendwith,henceitgetsrelativelymoreCPU

    resourcesbecauseitencounterslesscontention.

    Afterweenabledit,DRSbalancedtheworkloadstogivethefinalconfigurationasshowninFigure7.DRS

    detectedthattheWebserverwithhighshareswasnotreceivingtheresourcesitwasentitledtoreceive.Infact,

    asshowninFigure8,itwasreceivingalowerallocationthanthevirtualmachineconfiguredforalowshare

    wasreceiving.DRSswappedtheWebservervirtualmachineswithlowandhighshares.ThisallowedtheWeb

    serverconfiguredforahighsharetotakeadvantageoftheresourcesavailableontheotherhost.Theswapalso

    freedresourcesfortheothervirtualmachineswithhighshares.Thus,DRSbalancedtheWebserverloads

    acrossthetwohostsandsatisfiedtheoverallsystemsharesforallninevirtualmachines.

    Figure 7. Balanced Distribution of Virtual Machines After DRS Swaps Two Web Servers

    D W

    J I

    W

    D

    J

    W

    I

    D

    W

    J I

    WD

    J

    W

    I

  • 8/7/2019 Drs Performance Best Practices Wp

    8/19

    Copyright 2008 VMware, Inc. All rights reserved. 8

    DRS Performance and Best Practices

    Figure 8. Web Server Throughput

    Figure8showsthechangeinWebservervirtualmachinethroughputbeforeandafterweenabledDRS.Before

    weenabledDRS,theWebservervirtualmachinewithlowsharesachievedtwicethethroughputoftheWeb

    servervirtualmachinewithnormalshares.WithoutDRS,thescopeofresourcesharingislimitedtoasingle

    host.BecausetheWebservervirtualmachineswithhighandnormalsharesranonthesamehost,their

    respectivethroughputnumbersreflectedtheirrelativesharesasmanagedbythelocalESXresourcescheduler,

    whichcannotscheduleresourcesacrosshosts.However,afterweenabledDRS,thevirtualmachine

    throughputnumbersreflectedtherelativevirtualmachinesharesforallvirtualmachines.DRSenforces

    resourcesharingacrosshostsbycontrollingthehostlevelresourcesettings.

    TheidleandJavaapplicationworkloadsdidnotchangebecausetheydonothaveanysignificantCPUusage

    andthereisnomemoryovercommitmentorimbalance.BecauseCPUistheresourcefacingcontentioninthis

    scenario,theWebserveranddatabaseworkloadsweregoodcandidatesformigration.DRSmigratedtheWeb

    serverwithhighsharestothehostwherevirtualmachineswereentitledtofewerresourcesbecauseDRS

    estimatedthatthemigrationwouldbringbetterbalanceatminimalmigrationcost.Later,DRSbalancedCPU

    usagebymovingthelowsharesWebservervirtualmachinetothesecondhosttofurtherbalancethe

    entitlements.

    Resource Allocation Recommendations

    ThefollowingrecommendationscanhelpyouimproveresourceallocationinyourDRSclusters.

    Sharestakeeffectonlywhenthereiscontentionforresources.Hence,ifCPUresourcesonahostarenot

    fullycommitted,eachvirtualmachineorresourcepoolonthathostgetsalltheCPUresourcesitdemands.

    Similarly,ifresourcesonaDRSclusterarenotfullycommitted,virtualmachinesorresourcepoolsreceive

    theresourcestheydemand,solongasthefollowingconditionsaremet:

    TheDRSclusterhasnounappliedmanualrecommendations.

    Novirtualmachineorresourcepoollimitsaresettovaluesbelowwhatthevirtualmachineworkload

    demands.

    Novirtualmachineorresourcepoolreservationsaresetinsuchawaythattheypreventany

    balancingmigrationfromanovercommittedhost.

    Noaffinityandantiaffinityrulesaresetinsuchawaythattheypreventanybalancingmigrationfrom

    anovercommittedhost.

    NohostVMotionincompatibilitiessuchasCPUincompatibilities,lackofasharedvirtualmachine

    orVMotionnetwork,orlackofasharedvirtualmachinedatastorepreventanybalancingmigration

    fromanovercommittedhost.

    0

    200

    400

    600

    800

    1000

    1200

    1400

    1600

    1800

    Web server 1 (high) Web server 2 (normal) Web server 3 (low)

    Throughput without DRS Throughput with DRS

  • 8/7/2019 Drs Performance Best Practices Wp

    9/19

    Copyright 2008 VMware, Inc. All rights reserved. 9

    DRS Performance and Best Practices

    Youmightwasteidleresourcesifyouspecifyaresourcelimitforoneormorevirtualmachines.The

    systemdoesnotallowvirtualmachinestousemoreresourcesthanthelimit,evenwhenthesystemis

    underutilizedandidleresourcesareavailable.Specifyalimitonlyifyouhavespecificreasonsfordoing

    so.

    Forresourcepools,resourceallocationsaredistributedamongtheirsiblingvirtualmachinesorresource

    poolsbasedontheirrelativeshares,reservations,andlimits.

    Theexpandablereservationsettingforresourcepoolsallowsthereservationtogoaboveitsinitialvalue

    ifthereservationofthechildvirtualmachinesorresourcepoolsincrease.Settingafixedreservation

    preventsanyincreaseinthechildvirtualmachineorresourcepoolreservations.

    Virtualmachineshavesomeoverheadmemorythatthesystemmustaccountforinadditiontothe

    memoryusedbytheguestoperatingsystem.Thismemoryischargedasadditionalreservationontopof

    theuserconfiguredvirtualmachinereservation.Hence,whenyousetthereservationsandlimitsfor

    resourcepools,youshouldkeepabufferfortheextraoverheadreservationneededbythevirtual

    machines.SeetheResourceManagementGuideforestimatedmemoryoverhead.(SeeResourcesonpage 19foralink.)Theamountdependsonvirtualmachinememorysize,thenumberofvirtualCPUs,

    andwhetherthevirtualmachineisrunninga32bitor64bitguestoperatingsystem.Forinstance,a

    virtualmachinewithonevirtualCPUand1024MBofmemoryhasavirtualmachineoverheadof

    approximately98MBfora32bitguestoperatingsystemorapproximately118MBfor64bitguest

    operatingsystem.Also,theoverheadreservationmaygrowwithtimedependingontheworkload

    runninginthevirtualmachine.Thisoverheadgrowthisdesignedtoensureoptimalperformancewithin

    theguestoperatingsystem.

    Resourcepoolsmakemanagingresourceallocationsmucheasierbygroupingvirtualmachines,resource

    pools,orbothwithineachresourcepool.Thisallowsadministratorstoapplysettingsacrossmany

    resourceentitieseasilyandtobuildahierarchicalstructurethatmakesiteasytoallocatetheirresources.

    Thispaperdoesnotaddresstheuseofresourcepoolhierarchiesbecausetheydonothaveanyimpacton

    virtualmachineperformance.

    Scaling the Number of Hosts and Virtual Machines

    Scalingofthenumberofhostsandvirtualmachinesaffectstheresourcedistributionandloadbalancing

    opportunities

    available

    to

    DRS.

    Our

    testing

    examined

    the

    impact

    of

    size

    of

    the

    cluster

    on

    the

    throughput

    of

    the

    system.IncreasingthenumberofhostsinaclustergivesDRSmorelocationswhereitcanplacevirtual

    machines,andconsequentlyaffectstheoutputofDRSalgorithms.Furthermore,asystemwithmorehosts

    normallyhasmorevirtualmachines,aswell.IncreasingthenumberofvirtualmachinesmeansDRScanmove

    moreentitiesaround.Eachofthesevirtualmachinesmightberunningdifferentworkloadsatdifferenttimes.

    IncreasingthenumberofhostsandvirtualmachinesdoesincreasethecomplexityofvariousDRS

    computations,butatthesametime,theincreaseprovidesopportunityformoreefficientresourcedistribution.

    However,astheseopportunitiesincrease,thenumberofmigrationsmayincreaseandasaresultmayincur

    additionaloverhead.

    Weaccountedforthisoverheadinourexperimentsinordertoevaluatethescaling.Resourcepoolscan

    simplifythetaskofallocatingresourcesacrosstheDRScluster,especiallywhendealingwithalargenumber

    ofvirtualmachinesandhosts.However,scalingthenumberofresourcepoolsdoesnotaffectperformanceof

    thevirtualmachinessolongastheresourcepoliciesareeffectivelyallocatedidentically.Hence,thissection

    doesnotconsiderresourcepoolscaling.

    Test Methodology

    TheobjectiveofthetestwastomeasuretheeffectivenessoftheDRSalgorithmsaswescaledtolarger

    environments,whileensuringthattheoverallclusterutilizationremainedconstantthatis,thattheratioof

    overallvirtualmachinedemandtototalclusterresourcesremainedconstant.

  • 8/7/2019 Drs Performance Best Practices Wp

    10/19

    Copyright 2008 VMware, Inc. All rights reserved. 10

    DRS Performance and Best Practices

    Wesetupahomogenousclusterofhostseachconfiguredwithtwo2.8GHzIntelXeonCPUswith

    hyperthreadingenabled.Eachhosthad4GBofmemory.WededicatedaGigabitEthernetnetworkfor

    VMotionandconfiguredDRStorunatthedefaultmigrationthreshold(moderate).Eachhosthadthree

    workloadvirtualmachines,eachwithdifferentworkloads.

    WeconfiguredeachvirtualmachinewithasinglevirtualCPUand1GBofRAMandinstalledRedHat

    EnterpriseLinux3astheguestoperatingsystem.Forsimplicity,wegaveeachvirtualmachinethesameshares

    andsetnoreservationsorlimits.Weranoneofthefollowingcategoriesofworkloadsinsideeachvirtual

    machine:

    ConstantworkloadusedafixedamountofCPUresourcesof10percent,50percent,or90percentof

    CPU(ofonevirtualCPU).

    VaryingworkloadusedavaryingamountofCPUresources,changingwithtime.Foreasyanalysis,the

    workloadvariedamongthreesteps,using10percent,50percent,or90percentofCPU,witheachstep

    lastingfiveminutes.Weusedallpossiblecombinationsofthethreesteploadtogeneratesixdifferent

    workloadtypes,asshowninFigure9.OurresultswiththesevaryingworkloadsdemonstratethatDRS

    performseffectivelyevenifloadsconstantlychangeovertimeandnostablestateisreached.Asourtests

    showed,thisisoneofthemainadvantagesofusingDRS.

    Figure 9. Patterns of Varying Workloads During Tests

    Wedistributedvariouscombinationsofthevaryingworkloadvirtualmachinesacrosshoststoachieve

    differentpatternsthatreflectthreehostusagescenarios.

    Table 2. Test Environment for Scaling Tests

    Number of

    Hosts

    Number of Virtual

    Machines

    Number of Each Constant

    Workload

    Number of Each Varying

    Workload

    2 6 2 1

    4 12 4 2

    8 24 8 4

    16 48 16 8

    1

    2

    3

    4

    5

    6

  • 8/7/2019 Drs Performance Best Practices Wp

    11/19

    Copyright 2008 VMware, Inc. All rights reserved. 11

    DRS Performance and Best Practices

    Figure 10. Workload on a Host with Optimal Placement

    Intheoptimalplacementscenario,showninfigure10,weplacedcompensatingloadstogetheroneachhost.

    Withthisplacement,allpeaksmatchedthetroughsofothervirtualmachinesonthesamehost.Asaresult,

    noneofthehostsareovercommittedandtheloadisbalancedacrossallhostsinthecluster.Thethroughput

    achievedinthisscenarioisusedasthebaselineforoptimalthroughputintheteststhatfollow.

    Figure 11. Workload on a Host with Bad Placement

    Inthebadplacementscenario,showninFigure11,weplacedsimilarworkloadstogetheroneachhost

    whereverpossible.Asaresult,peaksandvalleysoccurtogetheronahost,resultinginahighlyimbalanced

    cluster.Consequently,thehostsareeitherovercommittedorhighlyundercommitted.Thisresultsinnotonly

    majorloadimbalanceacrosshostsbutalsopoorperformanceinthevirtualmachinesrunningonovercommittedhosts.AsshowninTable2,weusedincreasingnumbersofvirtualmachineswithsimilar

    workloadsasweincreasedthesizeofthecluster.Asaresult,wesawgreaterimbalanceforabadplacement

    astheclusterscaledup.

    Figure 12. Workload on a Host with Symmetric Placement

    Inthesymmetricplacementscenario,theclusterismoderatelyimbalanced,withamixofhoststhatare

    somewhatovercommittedandhoststhataresomewhatundercommitted.Wecallthissymmetricplacement

    becauseweusesthesameloaddistributionforeachpairofhostsaswescaledup.Asaresult,theload

    imbalanceremainedthesameforallstagesofourtests.

    Withthesmallerconfigurations,therearefewerworkloadsofthesametypeandhencetheloaddistribution

    cannotbeasflexibleasitiswiththelargersetups.

    1

    3

    5

    +

    +

    =

    Underutilized

    Host capacity

    Total virtual

    machine load

    Host loadVirtual machine loads

    1

    1

    1

    +

    +

    =

    Underutilized

    Host capacity

    Total virtualmachine load

    Host load

    Overcommitted

    Virtual machine loads

    1

    2

    3

    +

    +

    =

    Underutilized

    Host capacity

    Overcommitted

    Host loadVirtual machine loads

  • 8/7/2019 Drs Performance Best Practices Wp

    12/19

    Copyright 2008 VMware, Inc. All rights reserved. 12

    DRS Performance and Best Practices

    Test Results

    OurresultsshowthatDRSimproveshostloadbalanceacrossacluster.Theloadbalanceimprovement

    improvesasthenumberofhostsandvirtualmachinesintheclusterincreases.Furthermore,throughputalso

    improvesasthenumberofhostsandvirtualmachinesincreases.DRSperformsslightlybetterwithmorehosts

    becausetherearemoreopportunitiesforbalancingloads.Asthesizeoftheclusterincreased,thenumberof

    migrationsincreasedproportionally,butDRSensuredthatthemigrationswouldbenefitthroughput,hence

    theaveragevirtualmachinethroughputimproved.DRSdidnotaffectthroughputwhenthevirtualmachines

    wereplacedintheoptimalconfiguration,becausetheclusterwasalreadybalanced.Thekeyresultspresented

    inthissectionhighlightthesefindings.

    Figure13showstheresultswhenvirtualmachinesweredistributedinabadplacementthatis,ahighly

    imbalancedclusteratthebeginningoftheexperimentwitheachofthevirtualmachineworkloadsvarying

    everyfiveminutes.BecauseDRScomputedloadsandoptionsforbalancingtheloadseveryfiveminutes,this

    wasaworstcaseworkload.SoonafterDRSfinishedbalancingtheload,thevirtualmachineloadschanged.

    AsshowninFigure13,DRSimprovesvirtualmachinethroughputevenwhenitmustbalanceworkloadsthat

    varygreatly.Atallenvironmentsizes,theaveragevirtualmachinethroughputstayedwithin96percentofthe

    optimalthroughput.DRSachievedtheseresultsdespitethefactthatitwasusingVMotion,withitsattendant

    overhead,tobalancetheload.Furthermore,thethroughputimprovementcomparedtotheinitialcondition,

    withoutDRS,increasedaswescaledtolargerenvironments.Thisresultmainlyreflectsthefactthataswe

    scaled

    up,

    the

    number

    of

    similar

    workloads

    increased.

    The

    higher

    number

    of

    similar

    workloads,

    in

    turn,

    causedgreaterimbalancewithbadplacementinlargerenvironments.

    Figure13showstheresultswhenwesetDRStothedefaultaggressiveness(moderate).WhenwesetDRStoa

    moreaggressivethreshold,weobservedmanymoremigrations,butthroughputimprovementsweresimilar

    tothoseachievedwiththemoderatethreshold.Inaseparatetest,inwhichthevirtualmachineswererunning

    constantloads,weobservedevenbetterthroughputimprovements.

    Figure 13. Throughput with Varying Load and Bad Placement

    Figure14showstheresultswhenwedistributedvirtualmachinesinasymmetricplacementthatis,a

    somewhatimbalancedclusteratthebeginningofthetestwitheachofthevirtualmachineworkloadsvarying

    everyfiveminutes.AsFigure14shows,theinitialcondition,withoutDRS,achievesthesamevirtualmachine

    throughputatallenvironmentsizesbecausethevirtualmachinedistributionissymmetric.Ineachsize

    environment,whenweenabledDRSatamoderatethreshold,itbroughttheaveragethroughputtowithin98

    percentofoptimal.Increasingtheenvironmentsizebroughtminimalimprovementinthroughput,butthe

    mainreasonoptimalthroughputwasnotachievedwastheoverheadofVMotionasDRSbalancedtheload.

    Overall,DRAachievedsomeperformancegainsbytakingadvantageofsmalleropportunitiesforavoiding

    overcommitment.

    0.82

    0.84

    0.86

    0.88

    0.90

    0.92

    0.94

    0.96

    0.98

    1.00

    1.02

    2 hosts / 6 VMs 4 hosts / 12 VMs 8 hosts / 24 VMs 16 hosts / 48 VMs

    Averagevirtualmachinethrough

    put

    No DRS DRS moderate Optimal placement

  • 8/7/2019 Drs Performance Best Practices Wp

    13/19

    Copyright 2008 VMware, Inc. All rights reserved. 13

    DRS Performance and Best Practices

    Figure 14. Throughput with Varying Load and Symmetric Placement

    DRS Aggressiveness

    DegreeofaggressivenessforDRScontrolshowaggressivelyDRSactstobalanceresourcesacrosshostsina

    cluster.Thedegreeofaggressivenessissetfortheentirecluster.Thefiveoptionsrangefromconservative(level

    one)tomoderate(levelthree)toaggressive(levelfive)andcontrolthedegreetowhichDRStriestobalance

    resourcesacrosshostswithinthecluster.

    DRSmakesVMotionrecommendationsbasedonwhatisreferredtoastheirgoodnessvaluethatis,how

    muchimprovementinloadbalancingtheactioncanachieve.Thegoodnessvalueistranslatedtoarating

    betweenonestarandfivestars,andDRSexecutesmigrationsbasedontheaggressivenessthresholdsetfora

    cluster.InVMwareInfrastructure3,amigrationreceivesagoodnessvalueoffivestarsifitisrecommendedto

    resolve

    affinity,

    antiaffinity,

    or

    reservation

    rules.

    For

    each

    lower

    star

    rating

    the

    balancing

    impact

    of

    the

    migrationisless.Soamigrationwitharatingoffourstarswouldimprovetheloadbalancemorethana

    migrationwitharatingofthreestars,twostars,oronestar.

    Figure15showshowthroughputcomparesforthemoderateandaggressivelevelsforthemixofconstantload

    virtualmachinesandbadplacementdescribedinScalingtheNumberofHostsandVirtualMachineson

    page 9.Theresultsshowthattheaggressivethresholdachievedslightlybetterthroughputcomparedtothe

    moderatesetting.However,aswescaledtolargerconfigurations,thethroughputwasalmostthesameforboth

    settings.Whenweusedtheaggressivesetting,wesawmoremigrationsthanwedidwhenweusedthe

    moderatesetting.Whenweusedvaryingworkloads,notonlyweretheremoremigrationsintheaggressive

    casebutthethroughputdifferencesforthemoderateandaggressivesettingswereevensmaller.

    0.82

    0.84

    0.86

    0.88

    0.90

    0.92

    0.94

    0.96

    0.98

    1.00

    1.02

    2 hosts / 6 VMs 4 hosts / 12 VMs 8 hosts / 24 VMs 16 hosts / 48 VMs

    Averagevirtualmachine

    throughput

    No DRS DRS moderate Optimal placement

  • 8/7/2019 Drs Performance Best Practices Wp

    14/19

    Copyright 2008 VMware, Inc. All rights reserved. 14

    DRS Performance and Best Practices

    Figure 15. Throughput for Constant Load and Bad Placement

    Interval Between DRS Invocations

    VirtualCenteractivatestheDRSalgorithmafterafixedinterval(thedefaultisfiveminutes)soDRScanmake

    recommendationsbasedonthepastperformancemetricsofthevirtualmachinesintheDRScluster.This

    sectiondiscussestheimpactontheDRSalgorithmofchangingthisinterval.TheDRScalculationsareinvoked

    betweentheregularlyscheduledinvocationsiftherearechangesthatrequireDRSdecisionsforexample,if

    youchangeresourceallocationsettings.Ourdiscussionignorestheseintermediateinvocationsbecausethey

    dependonthefrequencywithwhichyouperformadministrativeoperationsinthecluster.

    OurresultsshowthatchoosingtheappropriateDRSperiodicfrequencydependsontheperiodicityofthe

    workloadswithinthevirtualmachinesinthecluster.FairlyconstantloadsmaybenefitfromfrequentDRS

    invocationssothatimmediateactionistakenwhentheloaddoeschange.However,wedonotrecommendan

    intervaloflessthanfiveminutesbecauseofthewaythealgorithmusesstatistics.Weobservednonotable

    benefitwhenweincreasedfrequencytomorethanonceeveryfiveminutes,andrunningtheDRSalgorithm

    veryoftencausesunnecessaryoverhead.Inaddition,anintervaloflessthanfiveminutesmightbetoo

    aggressivegiventhatVMotionmigrationscantakeontheorderoftensofsecondstocomplete,dependingon

    theloadrunninginthevirtualmachine.(Theguestoperatingsystemrunsnormallyforthemostofthe

    VMotionoperationandgenerallyincursadowntimeoflessthanonesecond.)

    Ifyounoticeveryfrequentmigrations,youmightfeeltheneedtoincreasetheinvocationinterval.However,

    beforedoingso,youshouldchecktheDRSaggressivenesssetting.Decreasingtheaggressivenessmightbethe

    betterchoice.WhenahighnumberofmigrationsisrecommendedinoneDRSinvocation,thisisusuallythe

    resultofanaggressivethresholdsetting.

    Ifaclusterincludesvirtualmachineswithmemoryintensiveworkloads,aninfrequentinvocationsettingmightnotallowDRStorespondtomemorypressuresoonenoughandthevirtualmachinemightstart

    ballooningorswappingunnecessarily.Thiscansignificantlydegradeperformance.However,becausean

    increaseinmemoryuseistypicallyamoregradualprocess,keepingthedefaultoffiveminutesdoesnotaffect

    performanceforamajorityofworkloads.DRSalsoprioritizesmemorybalancingmigrationsoverCPU

    balancingmigrationswhenmemorycommitmentishighinordertopreventunnecessaryballooningor

    swapping.

    0.82

    0.84

    0.86

    0.88

    0.90

    0.92

    0.94

    0.96

    0.98

    1.00

    1.02

    2 hosts / 6 VMs 4 hosts / 12 VMs 8 hosts / 24 VMs 16 hosts / 48 VMs

    Averagevirtualmachinethroughput

    No DRS DRS moderate DRS aggressive Optimal placement

  • 8/7/2019 Drs Performance Best Practices Wp

    15/19

    Copyright 2008 VMware, Inc. All rights reserved. 15

    DRS Performance and Best Practices

    Althoughwerecommendkeepingthedefaultvalueoffiveminutes,youcanchangethefrequencybyadding

    thefollowingoptionsinthevpxd.cfgfile,changing300tothedesirednumberofseconds.

    .

    300

  • 8/7/2019 Drs Performance Best Practices Wp

    16/19

    Copyright 2008 VMware, Inc. All rights reserved. 16

    DRS Performance and Best Practices

    Highlyskewedhostsizes

    Virtualmachinesmightnotfitonallhosts.Forexample,virtualmachineswithfourvirtualCPUs

    cannotmigratetohostswithonlytwophysicalCPUs.

    DRStendstofavormigratingvirtualmachinestoahostwithmorememoryorCPUresourcesto

    balancememoryorCPUloads.Essentially,thealgorithmtriestobalancetherelativeutilizationsfor

    eachhost.Soifaclusterhasvirtualmachineswithidenticalloads,DRStypicallyplacesmorevirtual

    machinesonahostthathasmoreCPUormemoryresources.Ifthevirtualmachineloadsare

    primarilyCPUintensive,thehostmemorysizesdonotreallymatter.Conversely,iftheloadsare

    memoryintensive,CPUcapacitiesdonotmatter.

    HostconfigurationTheconfigurationofthehostnetworkordatastoremaydisallowmigrationusing

    VMotionbecausenotallhostssharethesamevirtualmachineorVMotionnetworkordatastore.

    DRSdoesareasonablejobinbalancingloadacrossheterogeneoushostssolongastheclusterismadeupofa

    subsetofhoststhatareVMotioncompatible.Thefactorsdiscussedinthissectionmaylimitbenefits.We

    recommendahomogenousclusterwhereverpossible.

    Impact of Idle Virtual Machines

    BasedonthetestsweperformedinourlabswithDRS,inenvironmentsrangingfromsmalltolarge,idle

    virtualmachinesweregenerallynotmigratedbecausetheydidnotusesignificantresources.However,inan

    imbalancedDRSclusterwithaggressivemode,alargenumberofidlevirtualmachinesperhost,orboth,DRS

    wouldmigratetheidlevirtualmachinesbecausetheyhavesomememoryoverhead.Furthermore,some

    operatingsystemsuseCPUcyclesforidlingforexample,fordeliveringguesttimerinterruptsandthiscan

    alsoaffectDRSmigrationrecommendations.

    DRS Performance Best Practices

    ClusteringconfigurationscanhavesignificantimpactonDRSperformance.VMwarerecommendsthe

    followingDRSconfigurationsandpracticesforoptimalperformance:

    WhendecidingwhichhoststogroupintoaDRScluster,trytochoosehoststhatareashomogeneousas

    possibleinCPUandmemory.Thisensureshigherperformancepredictabilityandstability.VMotionis

    notsupportedacrosshostswithincompatibleCPUs.Hencewithheterogeneoussystemsthathave

    incompatibleCPUs,DRSislimitedinthenumberofopportunitiesforimprovingtheloadbalanceacross

    thecluster.ToensureCPUcompatibility, systemsshouldbeconfiguredwithCPUsfromthesamevendor,

    withsimilarCPUfamilyandSSE3status.However,inVMwareInfrastructure3beginningwithversion

    3.5Update2,EnhancedVMotionCompatibilitygivesyoutheabilitytoensureVMotioncompatibilityfor

    thehostsintheclusterbypresentingthesameCPUfeaturesetacrosshosts,evenwhentheactualCPUs

    onthehostsdiffer.SeeVMwareVMotionandCPUCompatibilityfordetails.SeeResourceson

    page 19foralink.

    DRSperformsinitialplacementofvirtualmachinesacrossallhostseveniftheyarenotVMotion

    compatible.

    IfheterogeneoussystemsdohavecompatibleCPUsbuthavedifferentCPUfrequencies,differing

    amountsofmemory,orboth,thesystemswithmorememoryandhigherCPUfrequenciesaregenerallypreferred,allotherthingsbeingequal,ashostsforvirtualmachines,becausetheyhavemoreroomto

    accommodatepeakload.

    Usingmachineswithdifferentcacheormemoryarchitecturesmaycausesomeinconsistencyin

    performanceofvirtualmachinesacrossthosehosts.

    WhenmoreESXhostsinaDRSclusterareVMotioncompatible,DRShasmorechoicestobetterbalance

    workloadsacrossthecluster.Werecommendclustersofupto32hosts.

  • 8/7/2019 Drs Performance Best Practices Wp

    17/19

    Copyright 2008 VMware, Inc. All rights reserved. 17

    DRS Performance and Best Practices

    BesidesCPUincompatibilities,somemisconfigurationscanmaketwoormorehostsincompatible.For

    instance,ifthehostsVMotionnetworkadaptersarenotconnectedbyaGigabitEthernetlink,the

    VMotionmigrationmightnotoccurbetweenthehosts.YoushouldalsobesuretheVMotiongatewayis

    configuredcorrectly,VMotionnetworkadaptersonthesourceanddestinationhostshavecompatible

    securitypolicies,andthevirtualmachinenetworkisavailableonthedestinationhost.SeetheResourceManagementGuideandtheESXServer3ConfigurationGuidefordetails.SeeResourcesonpage 19forlinks.

    Thedefaultmigrationthreshold(moderate)worksformostconfigurations.Youcansetthemigrationthresholdtomoreaggressivelevelswhenallofthefollowingconditionsaresatisfied:

    Thehostsintheclusterarerelativelyhomogeneous.

    Thevirtualmachinesresourceutilizationremainsfairlyconstant.

    Theclusterhasrelativelyfewconstraintsonwhereavirtualmachinecanbeplaced

    Youshouldsetthemigrationthresholdtomoreconservativelevelswhentheconverseistrue.

    ThedefaultDRSfrequencyisonceeveryfiveminutes,butyoucansetittoanyperiodbetweenoneand

    60minutes.Youshouldavoidchangingthedefaultvalue.Ifyouareconsideringachangeinthesetting,

    seeIntervalBetweenDRSInvocationsonpage 14.

    Ingeneral,donotspecifyaffinityrulesunlessyouhaveaspecificneedtodoso.Insomecases,however,

    specifyingaffinityrulescanimproveperformance.

    Keepingvirtualmachinestogethercanimproveperformanceifthevirtualmachinesneedto

    communicatewitheachother,becausenetworkcommunicationbetweenvirtualmachinesonthe

    samehostenjoyslowerlatencies.

    Separatingvirtualmachinesmaintainsmaximalavailabilityofthevirtualmachines.

    Forexample,iftwovirtualmachinesarebothWebserverfrontendstothesameapplication,you

    wanttomakesurethattheydonotbothgodownatthesametime.

    Anotherexampleofvirtualmachinesthatmightneedtobeseparatedisvirtualmachineswith

    I/Ointensiveworkloads.Iftheyshareasinglehost,theymightsaturatethehostsI/Ocapacity,

    leadingtoperformancedegradation.DRSdoesnotmakevirtualmachineplacementdecisionsbased

    ontheirusageofI/Oresources.

    Assignresourceallocationstovirtualmachinesandresourcepoolscarefully.Bemindfuloftheimpactof

    limits,reservationsandvirtualmachinememoryoverhead.SeeResourceAllocationRecommendations

    onpage 8fordetails.

    VirtualmachineswithsmallermemorysizesorfewervirtualCPUsprovidemoreopportunitiesforDRS

    tomigratetheminordertoimprovebalanceacrossthecluster.Virtualmachineswithlargermemorysize

    ormorevirtualCPUsaddmoreconstraintsinmigratingthevirtualmachines.Henceyoushould

    configureonlyasmanyvirtualCPUsandasmuchmemoryforavirtualmachineasneeded.

    YoucanspecifyDRSmodesofautomatic,manual,orpartiallyautomatedattheclusterlevelaswellasthe

    virtualmachinelevel.Werecommendthatyoukeeptheclusterinautomaticmode.Forvirtualmachines

    sensitivetoVMotion,youcansetmanualmodeatthevirtualmachinelevel.Thissettingallowsyouto

    decideifandwhenthevirtualmachinecanbemigrated.KeepvirtualmachinesinDRSautomaticmode

    asmuchaspossible,becausevirtualmachinesinautomaticmodeareconsideredforclusterloadbalance

    migrationsacrossESXhostsbeforevirtualmachinesthatcannotbemigratedwithoutuseractions.

    Monitoring DRS Cluster Health

    InVMwareInfrastructure3,youcanmonitoraDRSclustershealthbytrackingresourceutilizationofthe

    cluster,loaddistributionacrosshosts,andwhetherthevirtualmachinesarereceivingtheresourcetowhich

    theyareentitled.Thisinformationisdisplayedintwographsthatarepartoftheclustersummarydisplay.

  • 8/7/2019 Drs Performance Best Practices Wp

    18/19

    Copyright 2008 VMware, Inc. All rights reserved. 18

    DRS Performance and Best Practices

    Host Utilization Percentage Graph

    Thehostutilizationgraph,thetopgraphintheVMwareDRSResourceDistributionsection,isahistogramthat

    showsthenumberofhostsontheYaxisandtheutilizationpercentageontheXaxis.Iftheclusteris

    unbalanced,youseemultiplebars,correspondingtodifferentutilizationlevels.ThebluebarsrepresentCPU

    usageandtheorangerepresentmemoryusage.Forexample,inthescreencapturebelow,twohostsareat010

    CPUutilization,thethirdat8090percentCPUutilization,eachrepresentedbyabluebar.Inthiscluster,DRS

    isinmanualmode(moderateaggressiveness)andismakingarecommendation.Aftertherecommendationis

    appliedthebluebarsbecomeclosertoeachotherontheXaxis.Theclosertheblueandorangebarsaretoeach

    other,themorebalancedtheclusteris.

    Percent of Entitled Resources Delivered Graph

    Chart(BottomDRSResourceDistributionChart)

    Theentitledresourcesdeliveredgraph,thebottomgraphintheVMwareDRSResourceDistributionsection,

    isahistogramthatshowsthenumberofhostsontheYaxisandthepercentageofentitledresourcesdelivered

    foreachhostontheXaxis.Thetopgraphreportsrawresourceutilizationvalues.Thebottomgraph

    incorporatesadditionalinformationaboutresourcesettingsforvirtualmachinesandresourcepools.

    DRScomputesaresourceentitlementforeachvirtualmachine,basedonvirtualmachineandresourcepool

    configuredshares,reservations,andlimitssettings,aswellasthecurrentdemandsofthevirtualmachinesand

    resourcepools.Whatthevirtualmachinedemandsisnotnecessarilywhatitdeservesorisentitledtobecause

    ofthevirtualmachineandresourcepoolconfigurations.

    Aftercomputingtheentitlements,DRScomputesaresourceentitlementforeachhostbyaddingtheresource

    entitlementsforallvirtualmachinesrunningonthathost.Thepercentageofentitledresourcesdeliveredis

    equaltothehostscapacitydividedbytheentitlementsofthevirtualmachines.Thismeansthatinaclusterin

    whichalltheentitledresourcesaredelivered,thegraphshouldhaveasinglebarforeachresourceinthe

    90100percenthistogramrange.ThescreenshotintheprevioussectionshowsthateventhoughtheDRS

    clusterwasnotbalanced,eachofthevirtualmachinesgottheresourcestowhichitwasentitled.

    However,theremaybecasesinwhichhostsdonotdeliverallentitledresources.Forexample,inthescreen

    capturebelow,oneofthehostscoulddeliveronly8090percentofentitledresourcesbecauseCPUresources

    wereovercommittedonthehost.

    Insummary,ifthebarsinthisgraphappearintherightmostcategory,allvirtualmachinesaregettingthe

    resourcestowhichtheyareentitled.

  • 8/7/2019 Drs Performance Best Practices Wp

    19/19

    DRS Performance and Best Practices

    If you have comments about this documentation, submit your feedback to: [email protected]

    VMware, Inc. 3401 Hillview Ave., Palo Alto, CA 94304 www.vmware.com

    Copyright 2008 VMware, Inc. All rights reserved. Protected by one or more of U.S. Patent Nos. 6,397,242, 6,496,847, 6,704,925, 6,711,672, 6,725,289, 6,735,601, 6,785,886,6,789,156, 6,795,966, 6,880,022, 6,944,699, 6,961,806, 6,961,941, 7,069,413, 7,082,598, 7,089,377, 7,111,086, 7,111,145, 7,117,481, 7,149, 843, 7,155,558, 7,222,221, 7,260,815,7,260,820, 7,269,683, 7,275,136, 7,277,998, 7,277,999, 7,278,030, 7,281,102, 7,290,253, and 7,356,679; patents pending. VMware, the VMware boxes logo and design,Virtual SMP and VMotion are registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions. All other marks and names mentionedherein may be trademarks of their respective companies.

    Revision 20090109 Item: PS-060-PRD-01-01

    Conclusions

    TheeffectivenessandscalabilitytestsweperformedwithDRSdemonstratethatresourcesaredistributed

    acrosshostsinaDRSclusteraccordingtotheresourceallocationsandthatbetterthroughputsareachieved

    especiallyinthepresenceofvaryingloadsandrandomvirtualmachineplacementonhosts.TheDRS

    frameworkalsoprovidestheabilitytoadjusttheaggressivenessoftheDRSalgorithmandintervalbetween

    invocationsofthealgorithm.Inaddition,DRSavoidswastefulmigrationswithcostbenefitanalysisand

    minimizesmigrationofidlevirtualmachines.

    Basedonourexperienceandcustomerexperienceinthefield,weprovidedalistofbestpracticesthatyoucan

    followtoavoidpitfalls.

    Finally,DRSalsoprovidesafewmechanismsyoucanusetomonitortheDRSclusterperformanceand

    potentially

    provide

    feedback

    on

    how

    the

    resources

    are

    being

    delivered

    and

    whether

    the

    DRS

    settings

    need

    to

    beadjusted.

    Resources

    ESXServer3ConfigurationGuidehttp://www.vmware.com/pdf/vi3_35/esx_3/r35/vi3_35_25_3_server_config.pdf

    ResourceManagementwithVMwareDRS

    http://www.vmware.com/resources/techresources/401

    ResourceManagementGuidehttp://www.vmware.com/pdf/vi3_35/esx_3/r35/vi3_35_25_resource_mgmt.pdf

    VMwareVMotionandCPUCompatibilityhttp://www.vmware.com/resources/techresources/1022

    mailto:[email protected]:[email protected]:[email protected]