Post on 19-Dec-2015
Sameer Shende and Alan Morris{sameer, amorris}@cs.uoregon.edu
Department of Computer and Information Science
NeuroInformatics Center
University of Oregon
Advances in the TAU Performance System
2
Acknowledgement
Jaideep Ray, SNL Nick Trebon, U. Oregon Allen D. Malony, U. Oregon Manish Parashar, Rutgers Maria Liu, Rutgers
4
TAU Performance System Framework
Tuning and Analysis Utilities Performance system framework for scalable parallel and distributed high-
performance computing Targets a general complex system computation model
nodes / contexts / threads Multi-level: system / software / parallelism Measurement and analysis abstraction
Integrated toolkit for performance instrumentation, measurement, analysis, and visualization Portable, configurable performance profiling/tracing facility Open software approach
University of Oregon, LANL, FZJ Germany http://www.cs.uoregon.edu/research/paracomp/tau
6
Enhancements in TAU Instrumentation
Automatic generation of proxy components for SIDL & Classic CCA Malloc/free wrapper interposition library Support for MPI-2, SHMEM in wrapper interposition library TAU_COMPILER – improves TAU’s integration into Makefiles
Profile Measurement Phase based profiling Callpath profiling featuring user defined callpath depth Support for memory profiling Compensation of measurement overhead (-COMPENSATE)
Trace Measurement Online trace analysis, automatic merging and conversion of traces Support for hierarchical trace merging Support for binary VTF3 format (-vtf=<dir> configuration) Support for hardware performance counters in traces (Vampir) Trace to profile converter (vtf2profile) Trace input library
7
Enhancements in TAU (contd.)
Analysis PerfDMF (Performance Data Management Framework)
Oracle, PostgreSQL, MySQL supported
Paraprof profile browser Normalized/non-normalized views Callpath profile view (immediate parents, routine, immediate children) Scalable histogram display PerfDMF integration – load, update performance data Support for gprof, mpiP, Dynaprof, hpmtoolkit, psrun (besides TAU) Callgraph display with clickable callpaths
VNG (Vampir Next Generation, TU Dresden) Online/offline trace visualization Support for binary TAU format in VNG
CUBE (UTK, FZJ) calltree visualizer
8
TAU Performance Measurement
TAU supports profiling and tracing measurement TAU supports tracking application memory utilization Robust timing and hardware performance support using
PAPI Support for online performance monitoring
Profile and trace performance data export to file system Selective exporting
Extension of TAU measurement for multiple counters Creation of user-defined TAU counters Access to system-level metrics
Support for callpath and phase measurement Integration with system-level performance data
9
Memory Profiling in TAU
Configuration option –PROFILEMEMORY Records global heap memory utilization for each function Takes one sample at beginning of each function and
associates the sample with function name Independent of instrumentation/measurement options
selected No need to insert macros/calls in the source code User defined atomic events appear in profiles/traces For Traces, see Vampir’s
Global Displays->CounterTimeline to view memory samples
10
Memory Profiling in TAU
Instrumentation based observation of global heap memory (not per function) call TAU_TRACK_MEMORY()
Triggers one sample every 10 secs call TAU_TRACK_MEMORY_HERE()
Triggers sample at a specific location in source code call TAU_SET_INTERRUPT_INTERVAL(seconds)
To set inter-interrupt interval for sampling call TAU_DISABLE_TRACKING_MEMORY()
To turn off recording memory utilization call TAU_ENABLE_TRACKING_MEMORY()
To re-enable tracking memory utilization
11
TAU’s malloc/free wrapper for C/C++
#include <TAU.h>
#include <malloc.h>
int main(int argc, char **argv)
{
TAU_PROFILE(“int main(int, char **)”, “ ”, TAU_DEFAULT);
int *ary = (int *) malloc(sizeof(int) * 4096);
// TAU’s malloc wrapper library replaces this call automatically
// when $(TAU_MEMORY_INCLUDE) is used in the Makefile.
…
free(ary);
// other statements in foo …
}
13
Using TAU’s Malloc Wrapper Library for C/C++
include /usr/common/acts/TAU/tau-2.14.1/rs6000/lib/Makefile.tau-pdt
CC=$(TAU_CC)
CFLAGS=$(TAU_DEFS) $(TAU_INCLUDE) $(TAU_MEMORY_INCLUDE)
LIBS = $(TAU_LIBS)
OBJS = f1.o f2.o ...
TARGET= a.out
TARGET: $(OBJS)
$(F90) $(LDFLAGS) $(OBJS) -o $@ $(LIBS)
.c.o:
$(CC) $(CFLAGS) -c $< -o $@
14
Profile Measurement – Three Flavors
Flat profiles Time (or counts) spent in each routine (nodes in callgraph). Exclusive/inclusive time, no. of calls, child calls E.g,: MPI_Send, foo, …
Callpath Profiles Flat profiles, plus Sequence of actions that led to poor performance Time spent along a calling path (edges in callgraph) E.g., “main=> f1 => f2 => MPI_Send” shows the time spent in MPI_Send
when called by f2, when f2 is called by f1, when it is called by main. Depth of this callpath = 4 (TAU_CALLPATH_DEPTH environment variable)
Phase based profiles Flat profiles, plus Flat profiles under a phase (nested phases are allowed) Default “main” phase has all phases and routines invoked outside phases Supports static or dynamic (per-iteration) phases E.g., “IO => MPI_Send” is time spent in MPI_Send in IO phase
15
Flat Profile – Pprof Profile Browser
Intel Linux cluster
F90 + MPICH
Profile - Node - Context - Thread
Events - code - MPI
21
TAU’s CCA Performance Component Measurement port and interfaces
Timer set name/type/group start/stop
Phase set name/type/group start/stop
Control enable/disable groups
Query get timer names, get metric names, get user-defined event names get timer data, get user-defined event data, dump data to disk
Event set name, trigger event
MemoryTracker enable interrupt tracking, track memory here, set interrupt interval enable/disable tracking memory
22
Performance evaluation using Performance component Uses underlying TAU library for measurement Timer, Phase, Event, Control, Query, MemoryTracker
interfaces Lightweight instrumentation option
Performance modeling using Mastermind component Tracks per-invocation performance data Associates performance data with application data Method arguments logged with performance data Callpath information Helps us build performance models [IPDPS’04]
TAU’s CCA Interfaces
23
Phase Interface
interface Timer { /* Start/stop the Timer */ void start(); void stop();
/* Set/get the Timer name */ void setName(in string name); string getName();
/* Set/get Timer type information (e.g., signature of the routine) */
void setType(in string name); string getType();
/* Set/get the group name associated with the Timer */
void setGroupName(in string name); string getGroupName();
/* Set/get the group id associated with the Timer */
void setGroupId(in long group); long getGroupId(); }
interface Measurement extends gov.cca.Port { /* Create a Timer */ Timer createTimer(); Timer createTimerWithName(in string name); Timer createTimerWithNameType(in string name,
in string type); Timer createTimerWithNameTypeGroup(in
string name, in string type, in string group);
interface Phase { /* Start/stop the Phase */ void start(); void stop();
/* Set/get the Phase name */ void setName(in string name); string getName();
/* Set/get Phase type information (e.g., signature of the routine) */
void setType(in string name); string getType();
/* Set/get the group name associated with the Phase */
void setGroupName(in string name); string getGroupName();
/* Set/get the group id associated with the Phase */
void setGroupId(in long group); long getGroupId(); }
interface Measurement extends gov.cca.Port { /* Create a Phase */
Phase createPhase(); Phase createPhaseWithName(in string name); Phase createPhaseWithNameType(in string name,
in string type); Phase createPhaseWithNameTypeGroup(in
string name, in string type, in string group);
24
Measurement Proxy Component
Interpose a proxy component for each port Inside the proxy
Make calls to Performance component for each invocation
MidpointIntegrator
IntegratorPortGo
Driver
IntegratorPort
IntegratorProxy Component
IntegratorPortUsesIntegratorPortProvides
MeasurementPort
Performance
MeasurementPort
25
MasterMind Component
Idea: Create a performance model for the component by tracking performance per invocation
Uses Monitor Port Outputs:
Times per invocation, e.g.
Component call path Regular performance data (uses performance component)
# integ_proxy::integrate(double, double, int)# MPI_TIME Time count lowBound upBound 72420 336 10000 0 1 407 449 1000 0 1 364 540 100 0 1 64838 844 10000 0 1 381 945 1000 0 1 332 1027 100 0 1
26
Monitor Proxy Component
Same idea (from the user’s point of view)
MidpointIntegrator
IntegratorPortGo
Driver
IntegratorPort
Integrator Monitor Proxy
IntegratorPortUsesIntegratorPortProvides
MonitorPort
MonitorPort
MasterMind
MeasurementPort MeasurementPort
Performance
27
Tree pruner Input:
Callgraph generated by Mastermind component User specified rules
Output: Pruned callgraph with insignificant nodes removed
Performance modeling library – brute force Tries all possible permutations of component instances Input: performance model of each component Selects optimal component assembly for the ensemble
Optimizer Swaps one component instance with another
Tools Included with MasterMind Component
28
Generate regular measurement proxy or monitor (MasterMind) proxy
Arguments:
Options:
TAU’s Proxy Generator for SIDL/Classic CCA
-c <component name> Full name of the component-t <type name> Type of component-p <port name> Name of port to generate proxy for-d <pdbfile name> Name of pdb file created from cxxparse-h <header file> Header file for this port
-n <proxy name> Name of the proxy component (default: base of component name + Proxy)-o <output filename> Name of output file (default: proxy.cc)-f <selective instrumentation file> Use Pre-generated Selective instrumentation file-x <tag> Namespace Tag-m Generate MasterMind component proxy
29
TAU’s Proxy Generator for Classic C++ Interface
Creating PDB Files:
Merging PDB Files:
Invoking tau_pg (example)
pdbmerge -o merged.pdb file1.pdb file2.pdb …
cxxparse <file.cpp> -I<dir> -D<flags>
tau_pg -c integrators::ccaports::Integrator -t integrators.ccaports.Integrator -p IntegratorPort -d ParallelIntegrator_CCA.pdb -o Proxy.cc -h ports/Integrator_CCA.h -f select.dat
30
What’s Going On Here?
Alternative implementationsof performance componentruntime TAU
performance data
TAU API other API
…
ApplicationComponent
ApplicationComponent
PerformanceComponent
TAU API
ApplicationComponent
ApplicationComponent
31
Using TAU
Install TAU% configure ; make clean install
Instrument application TAU Profiling API
Typically modify application makefile include TAU’s stub makefile, modify variables
Set environment variables directory where profiles/traces are to be stored name of merged trace file, retain intermediate trace files, etc.
Execute application% mpirun –np <procs> a.out;
Analyze performance data paraprof, vampir/traceanalyzer, pprof, paraver …
32
AutoInstrumentation using TAU_COMPILER
$(TAU_COMPILER) stub Makefile variable in 2.14+ release
Invokes PDT parser, TAU instrumentor, compiler through tau_compiler.sh shell script
Requires minimal changes to application Makefile Compilation rules are not changed User adds $(TAU_COMPILER) before compiler name
F90=mpxlf90Changes toF90= $(TAU_COMPILER) mpxlf90
Passes options from TAU stub Makefile to the four compilation stages
Uses original compilation command if an error occurs
33
TAU_COMPILER – Improving Integration in Makefiles
OLDinclude /usr/tau-2.14/include/MakefileCXX = mpCCF90 = mpxlf90_rPDTPARSE = $(PDTDIR)/
$(PDTARCHDIR)/bin/cxxparseTAUINSTR = $(TAUROOT)/$(CONFIG_ARCH)/
bin/tau_instrumentorCFLAGS = $(TAU_DEFS) $(TAU_INCLUDE)LIBS = $(TAU_MPI_LIBS) $(TAU_LIBS) -lmOBJS = f1.o f2.o f3.o … fn.o
app: $(OBJS)$(CXX) $(LDFLAGS) $(OBJS) -o $@
$(LIBS).cpp.o:
$(PDTPARSE) $<$(TAUINSTR) $*.pdb $< -o
$*.i.cpp –f select.dat$(CC) $(CFLAGS) -c $*.i.cpp
NEWinclude /usr/tau-2.14/include/Makefile
CXX = $(TAU_COMPILER) mpCC
F90 = $(TAU_COMPILER) mpxlf90_r
CFLAGS =
LIBS = -lm
OBJS = f1.o f2.o f3.o … fn.o
app: $(OBJS)
$(CXX) $(LDFLAGS) $(OBJS) -o $@ $(LIBS)
.cpp.o:
$(CC) $(CFLAGS) -c $<
34
TAU_COMPILER Options Optional parameters for $(TAU_COMPILER):
-optVerbose Turn on verbose debugging messages -optPdtDir="" PDT architecture directory. Typically $(PDTDIR)/$(PDTARCHDIR) -optPdtF95Opts="" Options for Fortran parser in PDT (f95parse) -optPdtCOpts="" Options for C parser in PDT (cparse). Typically
$(TAU_MPI_INCLUDE) $(TAU_INCLUDE) $(TAU_DEFS) -optPdtCxxOpts="" Options for C++ parser in PDT (cxxparse). Typically
$(TAU_MPI_INCLUDE) $(TAU_INCLUDE) $(TAU_DEFS) -optPdtF90Parser="" Specify a different Fortran parser. For e.g., f90parse instead of f95parse -optPdtUser="" Optional arguments for parsing source code -optPDBFile="" Specify [merged] PDB file. Skips parsing phase. -optTauInstr="" Specify location of tau_instrumentor. Typically
$(TAUROOT)/$(CONFIG_ARCH)/bin/tau_instrumentor -optTauSelectFile="" Specify selective instrumentation file for tau_instrumentor -optTau="" Specify options for tau_instrumentor -optCompile="" Options passed to the compiler. Typically
$(TAU_MPI_INCLUDE) $(TAU_INCLUDE) $(TAU_DEFS) -optLinking="" Options passed to the linker. Typically
$(TAU_MPI_FLIBS) $(TAU_LIBS) $(TAU_CXXLIBS) -optNoMpi Removes -l*mpi* libraries during linking (default) -optKeepFiles Does not remove intermediate .pdb and .inst.* files
e.g., OPT=-optTauSelectFile=select.tau –optPDBFile=merged.pdbF90 = $(TAU_COMPILER) $(OPT) mpxlf90_r
35
Program Database Toolkit
Componentsource/ Library
C / C++parser
Fortran 77/90/95parser
C / C++IL analyzer
Fortran 77/90/95IL analyzer
ProgramDatabase
Files
IL IL
DUCTAPE
tau_pg
SILOON
CHASM
TAU_instr
Proxy Component
Applicationcomponent glue
C++ / F90interoperability
Automatic sourceinstrumentation
36
TAU Tracing Enhancements
Configure TAU with -TRACE –vtf=dir option% configure –TRACE –vtf=<dir>
-MULTIPLECOUNTERS –papi=<dir> -mpi –pdt=dir …
Set environment variables% setenv TAU_TRACEFILE foo.vpt.gz% setenv COUNTER1 GET_TIME_OF_DAY (required)% setenv COUNTER2 PAPI_FP_INS…% setenv COUNTER2 PAPI_NATIVE_<event>
for IBM, see /usr/pmapi/lib/POWER4.evs e.g., PAPI_NATIVE_PM_FPU0_FDIV for FPU0 executed FDIV instruction (for using native events)
Execute application (automatic merge/convert)% poe a.out –procs 4 % traceanalyzer foo.vpt.gz
NOTE: COUNTER1 must be GET_TIME_OF_DAY
40
ParaProf
TAU Performance Data Management FrameworkPerformance
analysis programs
PerfDMF Java API
. . .
JDBC
PostgreSQLOracleMySQL
Database
Profile meta-data
Raw performance data
HpmtoolkitPsrunDynaprofmpiPGprof …
…C API …
47
Current Status (Jan 2005)
Released TAU v2.14.1 and PDT v3.3.1 PerfDMF (Performance Database Framework) http://www.cs.uoregon.edu/research/paracomp/tau
Released Performance Component v1.5 MasterMind Component
Tree Pruner Performance Modeling Library Optimizer
Supports SIDL, Classic C++, Classic Neo interfaces Previous versions of CCAFE, BABEL supported (1.0-1.5)
http://www.cs.uoregon.edu/research/paracomp/tau/cca
48
Support Acknowledgements
Department of Energy (DOE) Office of Science contracts University of Utah DOE ASCI Level 1
sub-contract DOE ASC/NNSA Level 3 contract
NSF Software and Tools for High-EndComputing Grant
Research Centre Juelich John von Neumann Institute for
Computing Dr. Bernd Mohr
Los Alamos National Laboratory
49
SIDL Performance Interface
package Performance version 1.5.0{ interface Timer { /* Start/stop the Timer */ void start(); void stop();
/* Set/get the Timer name */ void setName(in string name); string getName();
/* Set/get Timer type information (e.g., signature of the routine) */ void setType(in string name); string getType();
/* Set/get the group name associated with the Timer */ void setGroupName(in string name); string getGroupName();
/* Set/get the group id associated with the Timer */ void setGroupId(in long group); long getGroupId(); }
interface Phase { /* Start/stop the Phase */ void start(); void stop();
/* Set/get the Phase name */ void setName(in string name); string getName();
/* Set/get Phase type information (e.g., signature of the routine) */ void setType(in string name); string getType();
/* Set/get the group name associated with the Phase */ void setGroupName(in string name); string getGroupName();
/* Set/get the group id associated with the Phase */ void setGroupId(in long group); long getGroupId(); }
50
SIDL Performance Interface
interface Query { /* Get the list of Timer and Counter names */ array<string> getTimerNames(); array<string> getCounterNames();
/* Get the timer data */ void getTimerData(in array<string> timerList, out array<double, 2> counterExclusive, out array<double, 2> counterInclusive, out array<int> numCalls, out array<int> numChildCalls, out array<string> counterNames, out int numCounters);
/* User Event query interface */ array<string> getEventNames(); void getEventData(in array<string> eventList, out array<int> numSamples, out array<double> max, out array<double> min, out array<double> mean, out array<double> sumSqr);
/* Writes instantaneous profile to disk in a dump file. */ void dumpProfileData();
/* Writes instantaneous profile to disk in a dump file with a specified prefix. */ void dumpProfileDataPrefix(in string prefix);
/* Writes the instantaneous profile to disk in a dump file whose name * contains the current timestamp. */ void dumpProfileDataIncremental();
/* Writes the list of timer names to a dump file on the disk */ void dumpTimerNames();
/* Writes the profile of the given set of timers to the disk. */ void dumpTimerData(in array<string> timerList);
/* Writes the profile of the given set of timers to the disk. The dump * file name contains the current timestamp when the data was dumped. */ void dumpTimerDataIncremental(in array<string> timerList); }
51
SIDL Performance Interface
/* Memory Tracker interface */ interface MemoryTracker { /* track heap memory at a given place */ void trackHere(); /* enable interrupt driven memory tracking */ void enableInterruptTracking(); /* set the interrupt interval, default is 10 seconds */ void setInterruptInterval(in int value); /* disable tracking (both interrupt driven and manual) */ void enable(); /* enable tracking (both interrupt driven and manual)*/ void disable(); } /* User defined event profiles for application specific events */ interface Event { /* Set the name of the event */ void setName(in string name);
/* Trigger the event */ void trigger(in double data); } /* Interface for runtime instrumentation control based on groups */ interface Control { /* Enable/disable group id */ void enableGroupId(in long id); void disableGroupId(in long id);
/* Enable/disable group name */ void enableGroupName(in string name); void disableGroupName(in string name);
/* Enable/disable all groups */ void enableAllGroups(); void disableAllGroups(); }
52
SIDL Performance Interface
/* Interface to create performance component instances */ interface Measurement extends gov.cca.Port { /* Create a Timer */ Timer createTimer(); Timer createTimerWithName(in string name); Timer createTimerWithNameType(in string name, in string type); Timer createTimerWithNameTypeGroup(in string name, in string type, in string group);
Phase createPhase(); Phase createPhaseWithName(in string name); Phase createPhaseWithNameType(in string name, in string type); Phase createPhaseWithNameTypeGroup(in string name, in string type, in string group); /* Create a Query interface */ Query createQuery();
/* Create a MemoryTracker interface */ MemoryTracker createMemoryTracker();
/* Create a User Defined Event interface */ Event createEvent(); Event createEventWithName(in string name);
/* Create a Control interface for selectively enabling and disabling * the instrumentation based on groups */ Control createControl(); } /* Monitor Port for MasterMind component */ interface Monitor extends gov.cca.Port { void startMonitoring(in string rname); void stopMonitoring(in string rname, in array<string> paramNames, in array<double> paramValues); void setFileName(in string rname, in string fname); void dumpData(in string rname); void dumpDataFileName(in string rname, in string fname); void destroyRecord(in string rname); } interface PerfParam extends gov.cca.Port { int getPerformanceData(in string rname, out array<double, 2> data, in bool reset); int getCompMethNames(out array<string> cm_names); }}
53
SIDL Performance Interface
/* Interface to create performance component instances */ interface Measurement extends gov.cca.Port { /* Create a Timer */ Timer createTimer(); Timer createTimerWithName(in string name); Timer createTimerWithNameType(in string name, in string type); Timer createTimerWithNameTypeGroup(in string name, in string type, in string group);
Phase createPhase(); Phase createPhaseWithName(in string name); Phase createPhaseWithNameType(in string name, in string type); Phase createPhaseWithNameTypeGroup(in string name, in string type, in string group); /* Create a Query interface */ Query createQuery();
/* Create a MemoryTracker interface */ MemoryTracker createMemoryTracker();
/* Create a User Defined Event interface */ Event createEvent(); Event createEventWithName(in string name);
/* Create a Control interface for selectively enabling and disabling * the instrumentation based on groups */ Control createControl(); } /* Monitor Port for MasterMind component */ interface Monitor extends gov.cca.Port { void startMonitoring(in string rname); void stopMonitoring(in string rname, in array<string> paramNames, in array<double> paramValues); void setFileName(in string rname, in string fname); void dumpData(in string rname); void dumpDataFileName(in string rname, in string fname); void destroyRecord(in string rname); } interface PerfParam extends gov.cca.Port { int getPerformanceData(in string rname, out array<double, 2> data, in bool reset); int getCompMethNames(out array<string> cm_names); }}
54
Sample Driver
voidsample::Driver_impl::setServices ( /*in*/ ::gov::cca::Services services )throw ( ::gov::cca::CCAException){ // DO-NOT-DELETE splicer.begin(sample.Driver.setServices)
frameworkServices = services;
gov::cca::TypeMap tm = frameworkServices.createTypeMap(); gov::cca::Port p = self;
frameworkServices.addProvidesPort (p, "Go", "gov.cca.ports.GoPort", tm);
frameworkServices.registerUsesPort ("MeasurementPort", "Performance.Measurement", tm);
// DO-NOT-DELETE splicer.end(sample.Driver.setServices)}
55
Sample Driver
int32_tsample::Driver_impl::go ()throw ()
{ // DO-NOT-DELETE splicer.begin(sample.Driver.go)
::gov::cca::Port port;
port = frameworkServices.getPort ("MeasurementPort"); if (port._is_nil()) { std::cerr << "MeasurementPort is not connected" << std::endl; return -1; }
Performance::Measurement measurement = port;
for (int i = 0; i < 4; i++) {
std::ostringstream os; os << "Iteration " << i; std::string phaseName = os.str();
// Create and start a phase ::Performance::Phase phase = measurement.createPhaseWithName(phaseName); phase.start();
// Create and start a timer static ::Performance::Timer tautimer = measurement.createTimerWithNameTypeGroup("go", "int32_t ()", "TAU_GROUP_CCA"); tautimer.start();
// Create a memory tracker and start interrupt driven memory tracking ::Performance::MemoryTracker tracker = measurement.createMemoryTracker(); tracker.enableInterruptTracking();
56
Sample Driver
sleep(i);
// Manually track memory here tracker.trackHere();
tautimer.stop();
phase.stop(); } // Create a query interface ::Performance::Query query = measurement.createQuery(); // Get the event names ::sidl::array< ::std::string> eventNames = query.getEventNames();
::sidl::array<int32_t> numSamples; ::sidl::array<double> max, min, mean, sumSqr; // Get the event data query.getEventData(eventNames, numSamples, max, min, mean, sumSqr); int numEvents = eventNames.upper(0) - eventNames.lower(0) + 1; for (int i = 0; i < numEvents; i++) { std::cout << "User Event: " << eventNames.get(i) << std::endl; std::cout << "Number of Samples: " << numSamples.get(i) << std::endl; std::cout << "Maximum Value: " << max.get(i) << std::endl; std::cout << "Minimim Value: " << min.get(i) << std::endl; std::cout << "Mean Value: " << mean.get(i) << std::endl; std::cout << "Sum Squared: " << sumSqr.get(i) << std::endl << std::endl; } frameworkServices.releasePort("MeasurementPort"); return 0; // DO-NOT-DELETE splicer.end(sample.Driver.go)}
57
CCA Classic C++ Performance interface
#include <string>using std::string;
namespace performance { class Timer { public: /** * The destructor should be declared virtual in an interface class. */ virtual ~Timer() { }
/** * Start the Timer. * Implement this function in * a derived class to provide required functionality. */ virtual void start(void) = 0;
/** * Stop the Timer. */ virtual void stop(void) = 0;
/** * Set the name of the Timer. */ virtual void setName(string name) = 0;
/** * Get the name of the Timer. */ virtual string getName(void) = 0;
/** * Set the type information of the Timer * (e.g., signature of the routine) */ virtual void setType(string name) = 0;
/** * Get the type information of the Timer * (e.g., signature of the routine) */ virtual string getType(void) = 0;
58
CCA Classic C++ Performance interface
/** * Set the group name associated with the Timer * (e.g., All MPI calls can be grouped into an "MPI" group) */ virtual void setGroupName(string name) = 0;
/** * Get the group name associated with the Timer */ virtual string getGroupName(void) = 0;
/** * Set the group id associated with the Timer */ virtual void setGroupId(unsigned long group ) = 0;
/** * Get the group id associated with the Timer */ virtual unsigned long getGroupId(void) = 0;
};
class Phase { public:
/** * The destructor should be declared virtual in an interface class. */ virtual ~Phase() { }
/** * Start the Phase. * Implement this function in * a derived class to provide required functionality. */ virtual void start(void) = 0;
/** * Stop the Phase. */ virtual void stop(void) = 0;
/**
* Set the name of the Phase.
59
CCA Classic C++ Performance interface
virtual void setName(string name) = 0;
/** * Get the name of the Phase. */ virtual string getName(void) = 0;
/** * Set the type information of the Phase * (e.g., signature of the routine) */ virtual void setType(string name) = 0;
/** * Get the type information of the Phase * (e.g., signature of the routine) */ virtual string getType(void) = 0;
/** * Set the group name associated with the Phase * (e.g., All MPI calls can be grouped into an "MPI" group) */ virtual void setGroupName(string name) = 0; /** * Get the group name associated with the Phase */ virtual string getGroupName(void) = 0;
/** * Set the group id associated with the Phase */ virtual void setGroupId(unsigned long group ) = 0;
/** * Get the group id associated with the Phase */ virtual unsigned long getGroupId(void) = 0; };
/** * Query the timing information */ class Query { public:
60
CCA Classic C++ Performance interface
virtual ~Query() { }
/** * Get the list of Timer names */ virtual void getTimerNames(const char **& functionList, int& numFuncs) = 0; /** * Get the list of Counter names */ virtual void getCounterNames(const char **& counterList, int& numCounters) = 0;
/** * getTimerData. Returns lists of metrics. */ virtual void getTimerData(const char **& inTimerList, int numTimers, double **& counterExclusive, double **& counterInclusive, int*& numCalls, int*& numChildCalls, const char **& counterNames, int& numCounters) = 0;
/* * Get the list of User Event names */ virtual void getEventNames(const char **&eventList, int &numEvents) = 0;
/* * Get User Event data */ virtual void getEventData(const char **&inEventList, int numEvents, int* &numSamples, double* &max, double* &min, double* &mean, double* &sumSqr) = 0;
/** * dumpProfileData. Writes the entire profile to disk in a dump file. * It maintains a consistent state and represents the instantaneous * profile data had the application terminated at the instance this call * is invoked. */ virtual void dumpProfileData(void) = 0;
61
CCA Classic C++ Performance interface
/** * dumpProfileDataPrefix. Writes the entire profile to disk in a dump * file prefixed by 'prefix'. It maintains a consistent state and * represents the instantaneous profile data had the application * terminated at the instance this call is invoked. */ virtual void dumpProfileDataPrefix(const char *prefix) = 0;
/** * dumpProfileDataIncremental. Writes the entire profile to disk in a * dump file whose name contains the current timestamp. * It maintains a consistent state and represents the instantaneous * profile data had the application terminated at the instance this call * is invoked. This call allows us to build a set of timestamped profile * files. */ virtual void dumpProfileDataIncremental(void) = 0;
/** * dumpTimerNames. Writes the list of timer names to a dump file on the * disk. */ virtual void dumpTimerNames(void) = 0;
/** * dumpTimerData. Writes the profile of the given set of timers to the * disk. This allows the user to select the set of routines to dump and * periodically write the performance data of a subset of timers to disk * for monitoring purposes. */ virtual void dumpTimerData(const char **& inTimerList, int numTimers) = 0;
/** * dumpTimerDataIncremental. Writes the profile of the given set of * timers to the disk. The dump file name contains the current timestamp * when the data was dumped. This allows the user to select the set of * routines to dump and periodically write the performance data of a * subset of timers to the disk and maintain a timestamped set of values * for post-mortem analysis of how the performance data varied for a * given set of routimes with time. */ virtual void dumpTimerDataIncremental(const char **& inTimerList, int numTimers) = 0; }; /** * dumpProfileDataPrefix. Writes the entire profile to disk in a dump * file prefixed by 'prefix'. It maintains a consistent state and * represents the instantaneous profile data had the application * terminated at the instance this call is invoked. */ virtual void dumpProfileDataPrefix(const char *prefix) = 0;
/** * dumpProfileDataIncremental. Writes the entire profile to disk in a * dump file whose name contains the current timestamp. * It maintains a consistent state and represents the instantaneous * profile data had the application terminated at the instance this call * is invoked. This call allows us to build a set of timestamped profile * files. */ virtual void dumpProfileDataIncremental(void) = 0;
/** * dumpTimerNames. Writes the list of timer names to a dump file on the * disk. */ virtual void dumpTimerNames(void) = 0;
/** * dumpTimerData. Writes the profile of the given set of timers to the * disk. This allows the user to select the set of routines to dump and * periodically write the performance data of a subset of timers to disk * for monitoring purposes. */ virtual void dumpTimerData(const char **& inTimerList, int numTimers) = 0;
/** * dumpTimerDataIncremental. Writes the profile of the given set of * timers to the disk. The dump file name contains the current timestamp * when the data was dumped. This allows the user to select the set of * routines to dump and periodically write the performance data of a * subset of timers to the disk and maintain a timestamped set of values * for post-mortem analysis of how the performance data varied for a * given set of routimes with time. */ virtual void dumpTimerDataIncremental(const char **& inTimerList, int numTimers) = 0; }; /** * disable tracking (both interrupt driven and manual) */ virtual void disable() = 0; };
/** * This class implements the runtime instrumentation control based on groups */ class Control { public: /** * Destructor */ ~Control () { } /** * Control instrumentation. Enable group Id. */ virtual void enableGroupId(unsigned long id) = 0;
/** * Control instrumentation. Disable group Id. */ virtual void disableGroupId(unsigned long id) = 0;
/** * Control instrumentation. Enable group name. */ virtual void enableGroupName(string name) = 0;
/** * Control instrumentation. Disable group name. */ virtual void disableGroupName(string name) = 0;
/** * Control instrumentation. Enable all groups. */ virtual void enableAllGroups(void) = 0;
/** * Control instrumentation. Disable all groups. */ virtual void disableAllGroups(void) = 0; };
namespace ccaports { /** * This abstract class declares the Measurement interface. * Inherit from this class to provide functionality. */ class Measurement: public virtual classic::gov::cca::Port { public:
/** * The destructor should be declared virtual in an interface class. */ virtual ~Measurement() { }
/** * Create a Timer */ virtual performance::Timer* createTimer(void) = 0; virtual performance::Timer* createTimer(string name) = 0; virtual performance::Timer* createTimer(string name, string type) = 0; virtual performance::Timer* createTimer(string name, string type, string group) = 0;
/** * Create a Phase */ virtual performance::Phase* createPhase(void) = 0; virtual performance::Phase* createPhase(string name) = 0; virtual performance::Phase* createPhase(string name, string type) = 0; virtual performance::Phase* createPhase(string name, string type, string group) = 0;
/** * Create a MemoryTracker interface */ virtual performance::MemoryTracker* createMemoryTracker(void) = 0;
/** * Create a Query interface */ virtual performance::Query* createQuery(void) = 0;
/** * Create a User Defined Event interface */ virtual performance::Event* createEvent(void) = 0; virtual performance::Event* createEvent(string name) = 0; /** * Create a Control interface for selectively enabling and disabling * the instrumentation based on groups */ virtual performance::Control* createControl(void) = 0; }; }}