Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools...

21
Performance Tools and Holistic HPC Workflows Karen L. Karavanic Portland State University Work Performed with: Holistic HPC Workflows: David Montoya (LANL) PSU Drought Project: Yasodha Suriyakumar (CS), Hongjiang Yan (CEE), PI: Hamid Moradkhani (CEE), co-PI: Dacian Daescu (Math) PPerfG PSU Undergraduate Programmers: Jiaqi Luo, Le Tu

Transcript of Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools...

Page 1: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

PerformanceToolsandHolisticHPCWorkflows

KarenL.KaravanicPortlandStateUniversity

WorkPerformedwith:HolisticHPCWorkflows:DavidMontoya(LANL)

PSUDroughtProject:Yasodha Suriyakumar (CS),Hongjiang Yan(CEE),PI:HamidMoradkhani (CEE),co-PI:DacianDaescu (Math)PPerfG PSUUndergraduateProgrammers:Jiaqi Luo,LeTu

Page 2: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

Slide 2

Operated by Los Alamos National Security, LLC for the U.S. Department of Energy's NNSA

UNCLASSIFIED - LA-UR-16-23542

WhatisanHPCWorkflow?HolisticView

– Onescienceeffortacrossaperiodoftime/campaign,orfor1specificgoal– mayincludemultipleplatformsorlabs

– Trackresourceutilization,performance,andprogress,datamovement

– IncludesSystemServices– power,resourcebalance,scheduling,monitoring,datamovement,etc.

– IncludesDataCenter– power,cooling,physicalplacementofdataandjobs

– Informedby&InterfaceswiththeApplicationandExperimentViews

– Includeshardware,systemsoftwarelayers,application

Page 3: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

Slide 3

Operated by Los Alamos National Security, LLC for the U.S. Department of Energy's NNSA

UNCLASSIFIED - LA-UR-16-20222

Foundational Work: All Layers of Workflow and their RelationshipsLayer 0 – Campaign• Process through time of repeated Job Runs• Changes to approach, physics and data needs as a campaign or

project is completed - Working through phasesLayer 1 – Job Run• Application to application that constitute a suite job run series• May include closely coupled applications and decoupled ones that

provide an end-to-end repeatable process with differing input parameters

• User and system interaction, to find an answer to a specific science question.

Layer 2 – Application• One or more packages with differing computational and data requirements

Interacts across memory hierarchy to archival targets• The subcomponents of an application {P1..Pn} are meant to model various

aspects of the physics Layer 3 – Package• The processing of kernels within a phase and associated interaction with

various levels of memory, cache levels and the overall underlying platform• The domain of the computer scientist

Page 4: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

Slide 4

Operated by Los Alamos National Security, LLC for the U.S. Department of Energy's NNSA

UNCLASSIFIED - LA-UR-16-20222

We described a layer above the application layer (2) that posed use cases that used the application in potential different ways. This also allowed the entry of environment based entities that impact a given workflow and also allow impact of scale and processing decisions. At this level we can describe time, volume and speed requirements.

Layer 1 – Ensemble of applications – Use Case – example template

Page 5: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

Slide 5

Operated by Los Alamos National Security, LLC for the U.S. Department of Energy's NNSA

UNCLASSIFIED - LA-UR-16-23542

OurGoal

MeasurementinfrastructureinsupportofHolisticHPCWorkflowPerformanceAnalysisandValidation

Page 6: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

Goal#1:PPerfG

• Motivation:Howcanweautomaticallygeneratetheworkflowlayerdiagrams?

• InitialFocus:• Layer2(Application):OneormorepackageswithdifferingcomputationalanddatarequirementsInteractsacrossmemoryhierarchytoarchivaltargets

• Approach:• ImplementsimpleprototypeusingpythonandTkInter• Investigatedatacollectionoptions• Evaluatewithacasestudy

KarenL.Karavanic7/9/18 6

Page 7: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

PPerfG

• PPerfG:AVisualizationToolforHolisticHPCWorkflowsforuseinbothperformancediagnosisandprocurement

• Capturesthedatamovementbehaviorbetweenstoragelayers,andbetweendifferentstagesofanapplication

• Challenges:MeasurementandDataintegrationtogeneratethedisplay• InitialprototypedevelopedwithPythonandTkInter

KarenL.Karavanic7/9/18 7

Page 8: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

PPerfG Prototype

KarenL.Karavanic7/9/18 8

Page 9: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

PPerfG Prototype

KarenL.Karavanic7/9/18 9

Page 10: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

PPerfG Prototype:simplejson inputfile

KarenL.Karavanic7/9/18 10

Page 11: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

CaseStudy:TheDroughtHPC1 ProjectGoals

• DevelopaperformantimplementationofDroughtHPC,anovelapproachtodroughtpredictiondevelopedatPortlandStateUniversity

• Scaletheapplicationtodofiner-grainedsimulations,andtosimulatealargergeographicalarea

o DroughtHPCo improvespredictionaccuracyforatargetgeographicalareao usesdataassimilationtechniquesthatintegratedatafromhydrologicmodelsandsatellite

datao UsesMonteCarlomethodstogenerateanumberofsamplespercello Inputsspanavarietyofdata:soilconditions,snowaccumulation,vegetationlayers,canopy

coverandmeteorologicaldatao UsesVariableInfiltrationCapacity(VIC)MacroscaleHydrologicModel2

1 https://hamid.people.ua.edu/research.html2 Liang,X.,D.P.Lettenmaier,E.F.Wood,andS.J.Burges(1994),Asimplehydrologicallybasedmodeloflandsurfacewaterandenergyfluxesforgeneralcirculationmodels, J.Geophys.Res., 99(D7),14415–14428, doi:10.1029/94JD00483

KarenL.Karavanic7/9/18 11

Page 12: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

CaseStudy:DroughtHPC Code

• ApplicationiswritteninPython,andusestwohydrologicmodelsVIC[2]writteninC,andPRMS[3]writteninFORTRANandC

• Themodelingcodesaretreatedas“blackboxes”bythedomainscientists• Landsurfaceofthetargetgeographicalareaismodeledasagridof

uniformcells,andsimulationdividesitintojobs,withgroupof25cellsineachjob

• DataisSmallbyourstandards:Forajobthatsimulates50meteorologicalsamplesandonemonthtimeperiod:

• inputdatasize:144.5MB• satellitedata:132MB

• Runtimefor1 job(25cells)onsingle-nodeisapproximatelytwohourswiththeinitialPythonprototype

KarenL.Karavanic7/9/18 12

Page 13: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

KarenL.Karavanic7/9/18 13

Yan,H.,C.M.DeChant,andH.Moradkhani(2015), ImprovingSoilMoistureProfilePredictionwiththeParticleFilter-MarkovChainMonteCarloMethod,IEEETransactiononGeoscienceandRemoteSensing,DOI:10.1109/TGRS.2015.2432067

Page 14: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

InitializationOverheads

• Meanof30runs,simulationof24hours(onehourtimesteps)• Columbiariverbasin(CRB)has5359cellsinVIC4dataset,butithas11280cellsinVIC5dataset.Thedatausedinthemeteorologicalforcingisdifferentbetweenthetwoversions.VIC5dataincludesprecipitation,pressure,temperature,vaporpressure,andwindspeed.VIC4dataspecifiesmaximumtemperature,minimumtemperature,precipitationandwindspeed.

KarenL.Karavanic7/9/18 14

Model Data Initialization(Milliseconds)

Work(Milliseconds)

WriteOutput(Milliseconds)

Total

VIC4–ASCIItextfiles

Sample–singlecell

177.592 (99%)

0.241 0.144 177.977

CRB25cells

4,079.126 (98%)

70.990 10.774 4,170.89

VIC5–NetCDFfiles

Sample–Stehekin data – 20 cells

19,088.990 (99%)

196.116 29.065 19,314.171

CRB–11280cells

26,277.904 (47%)

29,001.285 80.398 55,359.587

Page 15: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

DroughtHPC /VICcallingpatterns

• InitialDroughtHPC prototypecode(python)calledVICversion4 (“classicdriver”):

• Foreachgridcell• Foreachsimulationtimestep

• Foreachprobabilisticsample• CallVIC• Useresultstocomputeinputsfornexttimestep

• VIC4isTime-before-space• NewVIC5“imagedriver”isSpace-before-time,designedforcall-once

• UsesMPI,embarassingly parallelmodel(eachcellcomputationisindependent)• SinglecalltoVICcannowcomputeoveralldata,reducingcalloverhead

• Oursolution:addextensibilitytoVIC,injectourcodeintothemodel

KarenL.Karavanic7/9/18 15

Page 16: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

PPerfG:VisualizingDataPatternsAcrossSeparateCodes

KarenL.Karavanic7/9/18 16

drawing(notscreenshot)

Page 17: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

PPerfG:Illustratingthechangeincallingpattern

KarenL.Karavanic7/9/18 17

drawing(notscreenshot)

Page 18: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

PPerfG DataCollection

• PerformanceDatawascollectedwithavarietyofperformancetools• Nosingleperformancetoolprovidesallofthedataweneed• Notoolcharacterizesthecallingpattern/interactionsbetweenPythonandVIC

• PerfTrack performancedatabase1 usedtointegratethedatapostmortembutsomeintegrationwasdonemanually

• InterfaceoverPostGreSQL relationaldatabase• Multiplerunsfordifferentmeasurementtools

• Json filewasgeneratedmanually

1KarenL.Karavanic,JohnMay,KathrynMohror,BrianMiller,KevinHuck,RashawnKnapp,BrianPugh,"IntegratingDatabaseTechnologywithComparison-basedParallelPerformanceDiagnosis:ThePerfTrack PerformanceExperimentManagementTool," SC2005.

KarenL.Karavanic7/9/18 18

Page 19: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

PPerfG FutureWork

• HowtoeasecomparisonofdifferentversionswithPPerfG?• Slidertomoveforwardovertimefromstarttofinish?• Canwegeneratethejson automaticallyfromPerfTrack?• Howtointegrateapplication/developersemanticswithmeasurementdata?

• Howtolinkdatastructuresinmemorywithfiles?• Howtolabelthephases?• Howtocollecttheloopinformationatthebottom?

• Howtoshowscalingbehaviors?• Numberoffilespersimulationday?• Sizeoffilespersimulationcell?• TrafficMapidea:useedgecolorstoshowdatacongestion

KarenL.Karavanic7/9/18 19

Page 20: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

ConclusionsandFutureWork

• Weproposeanewperformancemetric:WorkflowCriticalPath• WCP:WhatpartoftheEntireWorkflow tofocuson?• DroughtHPC casestudy:patternoffileactivity,callingpattern,overheadofVICinitialization

• WehavedesignedPPerfG,avisualizationforWorkflowLayer2:Application• WorkflowLayers:differentperspectivesofanHPCworkflowusedinHolisticHPCPerformanceDiagnosis

• DataCollectionischallenging• NeedtointegrateLayer3(Package)– drillingdownintoDroughtHPC andVIC

• PerfTrack isusefultogatheranddosomeintegrationofdata• Currentlythegenerationofjson filesismostlymanual

KarenL.Karavanic7/9/18 20

Page 21: Performance Tools and Holistic HPC Workflows · •Multiple runs for different measurement tools •Jsonfile was generated manually 1Karen L. Karavanic, John May, Kathryn Mohror,

Acknowledgments

• PSUstudentsHenryCooney, Tu Le,Jiaqi Luo,andKristinaFryecontributedideasanddiscussionsandimplementedsoftwareusedinthisproject. StudentsinKaravanic’s AcceleratedComputingandIntroductiontoPerformancecoursesperformedanalysisandparallelizationoftheVICmodelingcode.

• ThismaterialisbaseduponworksupportedbytheNationalScienceFoundationunderGrantNo.1539605. Anyopinions,findings,andconclusionsorrecommendationsexpressedinthismaterialarethoseoftheauthor(s)anddonotnecessarilyreflecttheviewsoftheNationalScienceFoundation.

• ThisworksupportedinpartbyagenerousgiftfromIntelCorp.• PartofthisworkwasconductedattheUltrascale SystemsResearchCenter(USRC)supportedbyLosAlamosNationalLaboratoryunderContractNo.DE-AC52-06NA25396withtheU.S.DepartmentofEnergy.TheU.S.Governmenthasrightstouse,reproduce,anddistributethisinformation.ThisworksupportedinpartbyPortlandStateUniversityandbytheNewMexicoConsortium.

Contact:[email protected]

KarenL.Karavanic7/9/18 21