Interaction of Architecture and Algorithm
in the Domain-based Parallelization of an
Unstructured Grid Incompressible Flow Code
Dinesh K. Kaushik & David E. Keyes
CS Department, Old Dominion University &
ICASE, NASA Langley Research Center
Barry F. Smith
MCS Division, Argonne National Laboratory

Organization of Presentation
• Issues for unstructured grid domain decomposition methods
• Background of FUN3D
• Background of PETSc
• Illustrations of general porting issues
• Summary of serial and parallel performance
• Conclusions

Solving Unstructured Grid Problems in Parallel:
Main Issues
• SPMD parallelization of unstructured grid solvers is complicated by the fact that no two interprocessor data dependency patterns are alike
• The user-provided global ordering may be incompatible with the subdomain-contiguous ordering required for high performance and convenient SPMD coding
• Loss of regularity in unstructured grid solvers makes them more memory and integer-op intensive; nevertheless, a library-based solver should be competitive in serial with a legacy solver in terms of memory and execution time

Implications of the Memory Hierarchy
on Computational Efficiency
• Storage/use patterns should follow memory hierarchy
  - Blocks for Registers: block storage format for multicomponent systems; saves CPU cycles
  - Interlaced Data Structures for Cache: choose
      u1, v1, w1, p1, u2, v2, w2, p2, ...
    in place of
      u1, u2, ..., v1, v2, ..., w1, w2, ..., p1, p2, ...
  - Subdomains for Distributed Memory: "chunky" domain decomposition for optimal surface-to-volume (communication-to-computation) ratio
• This hierarchy is concerned with different issues than the algorithmic efficiency issues associated with hierarchies of grids

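A minimal C sketch of the two storage layouts named above, for four unknowns (u, v, w, p) per vertex. The function names and the `NCOMP` constant are illustrative assumptions, not FUN3D or PETSc identifiers.

```c
#include <assert.h>
#include <stddef.h>

#define NCOMP 4  /* unknowns per vertex: u, v, w, p */

/* Interlaced layout: u1,v1,w1,p1,u2,v2,w2,p2,...
 * All unknowns of one vertex are adjacent in memory. */
size_t idx_interlaced(size_t vertex, size_t comp)
{
    assert(comp < NCOMP);
    return vertex * NCOMP + comp;
}

/* Non-interlaced layout: u1,u2,...,v1,v2,...,w1,...,p1,...
 * The unknowns of one vertex are nvertices entries apart. */
size_t idx_segregated(size_t vertex, size_t comp, size_t nvertices)
{
    assert(comp < NCOMP && vertex < nvertices);
    return comp * nvertices + vertex;
}

/* Span, in array entries, from the first to the last unknown of a
 * single vertex. A smaller span touches fewer cache lines. */
size_t span_interlaced(void)
{
    return NCOMP;
}

size_t span_segregated(size_t nvertices)
{
    return (NCOMP - 1) * nvertices + 1;
}
```

On a 1,000-vertex grid, the four unknowns of one vertex occupy 4 consecutive entries in the interlaced layout, but are spread over a span of 3,001 entries in the non-interlaced one.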
Optimal Granularity of Decomposition
For cache-based microprocessors, granularity is determined by three forces:
• Convergence Rate: usually deteriorates with increased granularity
• Communication Volume: increases with increased granularity
• Size of Local Working Set: fits better into successively smaller cache levels with increased granularity

Description of the Legacy Code - FUN3D
• FUN3D is a tetrahedral vertex-centered unstructured grid code developed by W. K. Anderson (LaRC) for compressible and incompressible Euler and Navier-Stokes equations
• Parallel experience is with incompressible Euler so far, but nothing in the algorithms or software changes for the other cases; only convergence rate will vary with conditioning, as determined by Mach and Reynolds numbers (and mesh)
• FUN3D uses 1st- or 2nd-order Roe for convection and Galerkin for diffusion, and false timestepping with backwards Euler for nonlinear continuation towards steady state
• Solver is Newton-Krylov-Schwarz; timestep is advanced towards infinity by the switched evolution/relaxation (SER) heuristic of Van Leer & Mulder

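SER is commonly written as growing the CFL number in inverse proportion to the current nonlinear residual norm. The sketch below assumes that common form, with exponent 1 and an upper cap. The function name, its arguments, and the exact rule used inside FUN3D are assumptions, not taken from the slides.

```c
#include <assert.h>

/* Switched evolution/relaxation (SER), in an assumed common form:
 *   CFL_new = CFL_0 * ||R_0|| / ||R_current||, capped at cfl_max.
 * As the residual falls, the pseudo-timestep grows and the
 * backward Euler step approaches a pure Newton step. */
double ser_cfl(double cfl0, double res0, double res_now, double cfl_max)
{
    assert(cfl0 > 0.0 && res0 > 0.0 && res_now > 0.0 && cfl_max >= cfl0);
    double cfl = cfl0 * (res0 / res_now);
    return cfl > cfl_max ? cfl_max : cfl;
}
```

Under this assumed rule, halving the residual doubles the CFL number, and once the residual has dropped far enough the cap takes over.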
PETSc:
a Portable Extensible Toolkit for Scientific Computing
• Gives relatively high-level expression to preconditioned iterative linear solvers, and Newton iterative methods
• Supports complex arithmetic
• Ports wherever MPI ports; committed to progressive MPI tuning
• Permits great flexibility (through object-oriented philosophy) for algorithmic innovation
• Freely available
• Callable from FORTRAN77, C, and C++; written in C
• Includes diagnostic, monitoring, and visualization GUIs

The PETSc Philosophy
• Library approach: compiler can't do all; users shouldn't do all more than once
• Distributed data structures as fundamental objects: index sets, vectors, and matrices (grid functions coming)
• Iterative linear and nonlinear solvers, combinable modularly and recursively, and extensible
• Portable
• Uniform Application Programmer Interface (API)
• Multi-layered entry
• Message-passing detail suppressed

Conversion of Legacy FUN3D
into PETSc/MPI version
• Project begun 10/96, completed 3/97, undergoing continual enhancement
• Five-month (part-time) effort included:
  - learning FUN3D and the PUNS3D mesh preprocessor
  - learning the MeTiS partitioner
  - adding and testing new functionality in PETSc
  - restructuring FUN3D from vector to cache orientation
• Approximately 3,300 of 14,400 F77 lines of FUN3D retained (primarily as "node code" for flux and Jacobian evaluations); PETSc solvers used for the rest
• Effort has not yet included:
  - Parallel I/O and post-processing
• Next unstructured mesh code port should require significantly less time

Solving Unstructured Grid Problems in Parallel:
Basic Outline of the Solution Strategy
• Follow the "owner computes" rule under the dual constraints of minimizing the number of messages and overlapping communication with computation
• Each processor "ghosts" its stencil dependences in its neighbors
• Ghost nodes ordered after contiguous owned nodes
• Domain mapped from (user) global ordering into local orderings
• Scatter/gather operations created between local sequential vectors and global distributed vectors, based on runtime connectivity patterns
• Newton-Krylov-Schwarz operations translated into local tasks and communication tasks (nonblocking for overlap where hardware supports)

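The ordering rules above (owned nodes first and contiguous, ghost nodes after them) can be sketched in plain C. This is a serial illustration under stated assumptions: the whole `owner` array and edge list are visible to the routine, and the function name and argument layout are hypothetical rather than part of FUN3D or PETSc.

```c
#include <assert.h>
#include <stdlib.h>

/* Build the local ordering for one processor ("rank").
 *   owner[g] : rank that owns global vertex g (0 <= g < nglobal)
 *   edges    : 2*nedges global vertex numbers, one pair per edge
 *   l2g      : output, local -> global map; must hold nglobal entries
 * Owned vertices are numbered first, then the ghost vertices that
 * share an edge with an owned vertex. Returns the total local count
 * and stores the owned count in *nowned. */
int build_local_ordering(int rank, int nglobal, const int *owner,
                         int nedges, const int *edges,
                         int *l2g, int *nowned)
{
    assert(nglobal > 0);
    int *g2l = malloc((size_t)nglobal * sizeof *g2l);
    assert(g2l != NULL);
    int n = 0;

    for (int g = 0; g < nglobal; ++g) {
        g2l[g] = -1;                       /* not yet numbered locally */
    }
    for (int g = 0; g < nglobal; ++g) {    /* pass 1: owned vertices */
        if (owner[g] == rank) { g2l[g] = n; l2g[n++] = g; }
    }
    *nowned = n;
    for (int e = 0; e < nedges; ++e) {     /* pass 2: ghost vertices */
        int a = edges[2 * e], b = edges[2 * e + 1];
        if (owner[a] == rank && g2l[b] < 0) { g2l[b] = n; l2g[n++] = b; }
        if (owner[b] == rank && g2l[a] < 0) { g2l[a] = n; l2g[n++] = a; }
    }
    free(g2l);
    return n;
}
```

For a four-vertex chain 0-1-2-3 split so that rank 0 owns vertices 0 and 1, rank 1 gets the local-to-global map 2, 3, 1: its two owned vertices followed by one ghost.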
Three Different Orderings - In Focus
[Figure: a 16-vertex grid shown in four panels. "Application Ordering" and "PETSc Ordering" each number the vertices 0-15; "Local Ordering for Processor 0" and "Local Ordering for Processor 1" each number 12 local vertices 0-11.]

Scattering Between the Orderings
• After establishing different orderings, establish the "scatter" between the global and local vectors in the following way:
    ISCreateStride(MPI_COMM_SELF, bs*nvertices, 0, 1, &islocal);
    ISCreateBlock(MPI_COMM_SELF, bs, nvertices, svertices, &isglobal);
    VecScatterCreate(x, isglobal, user.localX, islocal, &user.scatter);
• Next, before using the local vector in any subroutine, carry out the scatter operation:
    VecScatterBegin(X, localX, INSERT_VALUES, SCATTER_FORWARD, scatter);
    VecScatterEnd(X, localX, INSERT_VALUES, SCATTER_FORWARD, scatter);

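Conceptually, the forward scatter in insert mode fills block i of the local vector from the global block that `svertices` names for local vertex i. The loop below models that data movement on a single process. It is not PETSc code: the function name is hypothetical, and it assumes `svertices[i]` is a block (vertex) number, so check the `ISCreateBlock` documentation for the index convention of your PETSc version.

```c
#include <assert.h>
#include <stddef.h>

/* Serial model of a forward scatter with insert semantics:
 * for each local vertex i, copy its bs unknowns from global block
 * svertices[i] into local block i. */
void scatter_forward_model(const double *global_x, double *local_x,
                           const int *svertices, int nvertices, int bs)
{
    assert(bs > 0 && nvertices >= 0);
    for (int i = 0; i < nvertices; ++i) {
        const double *src = global_x + (size_t)svertices[i] * (size_t)bs;
        double *dst = local_x + (size_t)i * (size_t)bs;
        for (int c = 0; c < bs; ++c) {
            dst[c] = src[c];
        }
    }
}
```

In the parallel code the same copy pattern is split into a begin call and an end call, so that local work can proceed while off-processor (ghost) values are in transit.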
Sample Serial Performance Comparison:
PETSc vs. Legacy Code
For both codes
• same optimization level (-O3) was used
• same timer was used
• time measurement started after reading all the input files
• no output was written during timing measurements
• platform used was IBM SP at Argonne with enough memory to avoid page faults after loading

            Execution (s)           Memory (MB)
Vertices    original     PETSc      original    PETSc
   2,800      122.71     27.88         10.22    12.08
  22,700     2905.30    381.09         74.74    83.67

Percentage difference in memory requirement reduces with problem size

Sample Memory Conservation Techniques
and Successive Effects in FUN3D Porting History
• Precisely sized preallocation of sparse matrix objects (77 → 47 MB of RAM)
• Pruning of legacy code solver data structures (47 → 34 MB of RAM)
• In-place factorization of preconditioner (34 → 21 MB of RAM)
• Moving "MatSetValues" calls into legacy subroutines (21 → 16 MB of RAM)
• Making partitioning stage scalable (16 → 12 MB of RAM)
• Size of legacy code on same problem: 10 MB
• Size of parallel single-node code on same problem: 12 MB

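The first item, precisely sized preallocation, depends on knowing each row's nonzero count before the matrix is created. A minimal sketch for a vertex-based discretization, assuming nearest-neighbor coupling only (one diagonal block plus one block per edge neighbor) and an edge list that names each edge once. The function name is hypothetical; the resulting array is the kind of per-row count that PETSc matrix creation routines accept for preallocation.

```c
#include <assert.h>

/* Count block nonzeros per row for a vertex-based sparse matrix:
 * one diagonal block plus one block per edge neighbor.
 *   edges : 2*nedges vertex numbers, each edge listed once
 *   nnz   : output, nvertices entries */
void count_row_blocks(int nvertices, int nedges, const int *edges, int *nnz)
{
    for (int v = 0; v < nvertices; ++v) {
        nnz[v] = 1;                        /* diagonal block */
    }
    for (int e = 0; e < nedges; ++e) {
        int a = edges[2 * e], b = edges[2 * e + 1];
        assert(a >= 0 && a < nvertices);
        assert(b >= 0 && b < nvertices && a != b);
        nnz[a] += 1;                       /* coupling a -> b */
        nnz[b] += 1;                       /* coupling b -> a */
    }
}
```

Exact counts avoid both over-allocation and repeated reallocation while values are being inserted.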
Summary of Parallel Performance on Cray T3E and IBM SP
• 1.4 million degree-of-freedom problem converged to machine precision in approximately 6 minutes with approximately 1600 flux balance operations (work units) on 128 processors of a T3E or 80 processors of an SP
• Relative efficiencies of 75% to 85% over this range
• Algorithmic efficiency (ratio of iteration count of less decomposed grid to more decomposed grid, using the "best" algorithm for each processor granularity) is in excess of 90% over this range; iteration count is only weakly dependent upon granularity
• Implementation efficiency (ratio of the cost per vertex per iteration) is in excess of 80% over this range and can be superunitary
• Superunitary implementation efficiency derives from improved cache locality at higher granularity (smaller working sets on each processor), in spite of greater nearest neighbor communication volume
• Properly sizing working set to cache largely overcomes convergence and communication penalties of concurrency

Cray T3E Scalability - Fixed Size
FUN3D-PETSc M6 Wing Test Case, Incompressible Euler
2nd-order Roe Scheme, 1-layer Halo
Tetrahedral grid of 357,900 vertices (1,431,600 unknowns)

procs  its  exe        speedup  η_alg  η_impl  η_overall
   16   77  2587.95s      1.00   1.00    1.00       1.00
   24   78  1792.34s      1.44   0.99    0.97       0.96
   32   75  1262.01s      2.05   1.03    1.00       1.03
   40   75  1043.55s      2.48   1.03    0.97       0.99
   48   76   885.91s      2.92   1.01    0.96       0.97
   64   75   662.06s      3.91   1.03    0.95       0.98
   80   78   559.93s      4.62   0.99    0.94       0.92
   96   79   491.40s      5.27   0.97    0.90       0.88
  128   82   382.30s      6.77   0.94    0.90       0.85

85% relative efficiency at 128 nodes

IBM SP Scalability - Fixed Size
FUN3D-PETSc M6 Wing Test Case, Incompressible Euler
2nd-order Roe Scheme, 1-layer Halo
Tetrahedral grid of 357,900 vertices (1,431,600 unknowns)

procs  its  exe        speedup  η_alg  η_impl  η_overall
    8   70  2897.46s      1.00   1.00    1.00       1.00
   10   73  2405.66s      1.20   0.96    1.00       0.96
   16   78  1670.67s      1.73   0.90    0.97       0.87
   20   73  1233.06s      2.35   0.96    0.98       0.94
   32   74   797.46s      3.63   0.95    0.96       0.91
   40   75   672.90s      4.31   0.93    0.92       0.86
   48   75   569.94s      5.08   0.93    0.91       0.85
   64   74   437.72s      6.62   0.95    0.87       0.83
   80   77   386.83s      7.49   0.91    0.82       0.75

75% relative efficiency at 80 nodes

Cray T3E Scalability - Gustafson
FUN3D-PETSc M6 Wing Test Case, Incompressible Euler
2nd-order Roe Scheme, 1-layer Halo
Tetrahedral grid

vert      procs  vert/proc  its  exe       exe/it
357,900      80       4474   78  559.93s   7.18s
 53,961      12       4497   36  265.72s   7.38s
  9,428       2       4714   19  131.07s   6.89s

Less than 7% variation in performance over factor of nearly 40 in problem size

Notes on Efficiency
Conflicting definitions of parallel efficiency abound, depending upon two choices:
• What scaling is to be used as the number of processors is varied?
  - overall fixed-size problem
  - varying size problem with fixed memory per processor
  - varying size problem with fixed work per processor
• What form of the algorithm is to be used as the number of processors is varied?
  - reproduce the sequential arithmetic exactly
  - adjust parameters to perform best on each given number of processors
Our charts include both overall fixed-size scaling and approximately fixed memory per processor (Gustafson) scaling
We always adjust the subdomain blocking parameter to match the number of processors, one subdomain per processor; this causes the number of iterations to vary

Notes on Efficiency, cont.
Effect of changing-strength preconditioner and effect of parallel overhead are often separated into algorithmic and implementation factors
• Customary definition of overall efficiency in going from q to p processors (p > q):
  $$\eta(p|q) = \frac{q \cdot T(q)}{p \cdot T(p)}$$
  where T(p) is the overall execution time on p processors (measured)
• Factor T(p) into I(p), the number of iterations, and C(p), the average cost per iteration.
• Algorithmic efficiency is measure of preconditioning quality (measured):
  $$\eta_{alg}(p|q) = \frac{I(q)}{I(p)}$$
• Implementation efficiency is remaining (inferred, not directly measurable) factor:
  $$\eta_{impl}(p|q) = \frac{q \cdot C(q)}{p \cdot C(p)}$$

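Since T = I · C, the overall efficiency is the product of the algorithmic and implementation factors, so the implementation factor follows from the two measured quantities. The C sketch below applies the three definitions to a pair of table rows; the struct and function names are illustrative assumptions.

```c
#include <assert.h>

/* One row of a fixed-size scalability table. */
typedef struct {
    int    procs;   /* number of processors */
    int    its;     /* number of iterations, I(p) */
    double time_s;  /* overall execution time T(p), in seconds */
} run_t;

/* eta(p|q) = q*T(q) / (p*T(p)) */
double eta_overall(run_t q, run_t p)
{
    assert(p.procs > 0 && p.time_s > 0.0);
    return (q.procs * q.time_s) / (p.procs * p.time_s);
}

/* eta_alg(p|q) = I(q) / I(p) */
double eta_alg(run_t q, run_t p)
{
    assert(p.its > 0 && q.its > 0);
    return (double)q.its / (double)p.its;
}

/* eta_impl(p|q) = q*C(q) / (p*C(p)) with C = T / I,
 * which equals eta_overall / eta_alg. */
double eta_impl(run_t q, run_t p)
{
    return eta_overall(q, p) / eta_alg(q, p);
}
```

With the Cray T3E rows for 16 processors (77 iterations, 2587.95 s) and 128 processors (82 iterations, 382.30 s), these functions give approximately 0.94, 0.90, and 0.85 for the algorithmic, implementation, and overall efficiencies, in agreement with the tabulated values.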
Footnotes on Scalability Tables
• "its" represents the number of pseudo-transient Newton steps: one Newton step per timestep, with SER growth in timestep up to a CFL of 100,000, and with a maximum number (20) of Schwarz-preconditioned GMRES steps per Newton step with relative tolerance of $10^{-2}$
• Convergence defined as a relative reduction in the norm of the steady-state nonlinear residual by a factor of $10^{-10}$
• Convergence rate typically degrades slightly as number of processors is increased, due to introduction of concurrency in preconditioner; highly partition-dependent
• Implementation efficiency may improve slightly as processors are added, due to smaller working sets (better cache residency)
• Implementation efficiency ultimately degrades as communication-to-computation ratio increases

Our View of the "State-of-the-Art":
Architecture and Programming Environment
• Vector-awareness is out; cache-awareness is in; vector-awareness will return in subtle ways
• Except for Tera and installed vector base, all near-term large-scale computers will be based on commodity processors
• HPF and parallel compilers not yet up to performance
• Some useful parallel libraries, like PETSc
• Need for better memory bandwidth to harness the full capability of future (& current) chips

Our View of the "State-of-the-Art":
Algorithms
• Explicit time integration is solved problem, except for dynamic mesh adaptivity
• Implicit methods remain a major challenge:
  - Today's algorithms leave something to be desired in convergence rate
  - All good algorithms have global synchronization
• Data parallelism from domain decomposition is unquestionably the main source of locality-preserving concurrency, but good smoothers and preconditioners violate locality
• New forms of algorithmic latency tolerance must be found
• Exotic methods should be considered at ASCI scales

Our View of the "State-of-the-Art":
Application-Algorithm-Architecture Interaction
• Ripest remaining advances are interdisciplinary
• Application-Algorithm
  - Ordering, partitioning, and coarsening must adapt to coefficients (grid spacing and flow magnitude and direction)
  - Trade-offs between pseudo-time iteration, nonlinear iteration, linear iteration, and preconditioner iteration must be understood and exploited
• Algorithm-Architecture
  - Algorithmicists must think natively in parallel and avoid introducing unnecessary sequential constraints
  - Algorithmicists should inform their choices with what their machine is good at and what it is bad at

Conclusions
• Hierarchy of domain decomposition should follow distributed memory hierarchy for computational efficiency
  - blocking for registers: gives a factor of 2 in performance for multicomponent systems
  - interlaced data structure for cache: reduces execution time by more than a factor of 4
  - subdomains for processor memory: migrates the sequential code to SPMD parallelism
• In addition to convergence rate and communication volume, working set size is another parameter to consider for "preferred" granularity of domain decomposition
• PETSc-ported FUN3D gives nice scalability results (parallel efficiency ranges from 75% to 85%) on two platforms: IBM SP and Cray T3E
• Library (PETSc) based solver is competitive with the legacy solver: outperforms by a factor of 9 even in serial mode; percentage difference in memory reduces with problem size

Reference URLs
• FUN3D
  http://fmad-www.larc.nasa.gov/~wanderso/Fun/fun.html
• PETSc
  http://www.mcs.anl.gov/petsc/petsc.html
• Pointers and related papers
  http://www.cs.odu.edu/~keyes/keyes.html