Architecture and Programming Model for
Interactive Real Time Computing
Jack Dennis, Arvind: MIT CSAIL
Xiaoming Li, Lian-Ping Wang, Guang Gao: University of Delaware
Supported by AFOSR Grant FA9550-13-1-0213. Program Manager: Dr. Frederica Darema
Project Goal
Demonstrate effectiveness of Fresh Breeze technology for DDDAS
Motivating DDDAS Applications
Domain application: turbulent particle-laden flows
• Turbulent flows laden with solid particles or liquid droplets in a complex geometry: flow drag, device erosion and damage, visibility, etc.
• Multiple scales and coupled multi-physics processes
− Varying scales: physics at fluid-particle interfaces → particle-particle interactions → system scale (particle distribution, turbulence modulation by particles)
− Particle-wall interactions: turbulence and drag modulation, erosion
• Related application systems
− High-speed gas turbine combustor
− Operation of aircraft and engines in polluted environments
− Explosions in military operations
− Treatment of particulate nuclear waste
• Environmental applications: sediment transport, rain formation, air-sea interactions
The mesoscopic simulation system of multiphase turbulent flows (based on the kinetic Boltzmann equation)
Physical considerations:
Interface and particle scales:
• No-slip boundary condition
• Hydrodynamic force / torque
• Viscous boundary layer
• Vortex shedding and wakes
Channel domain scale:
• Turbulent boundary layer with solid particles
• Distribution of particles
• Impact of particles on the drag force at the channel wall
• Effects of particle size, particle inertia, and sedimentation
[Figure: local flow vorticity and particle positions on a 2D slice]
Computational approach and challenges:
• Resolving multi-scale physics
• Scalable computation, visualization, and analysis
Wang’s group at UD pioneered this new approach. Wang will host the International Conference for Mesoscopic Methods in Engineering and Science (ICMMES) in 2018 at UD.
The measurement system: plenoptic (light field) particle tracking velocimetry (with Jingyi Yu at UD; a CRI (CISE Research Infrastructure) project recently funded by NSF)
• A plenoptic camera is a single-shot, multi-view acquisition device.
• Allows 3D reconstruction of particles via stereo matching.
• Robustly handles heavy occlusions.
• Two plenoptic cameras with color pattern-coded particles allow observation of position and of translational and rotational velocities.
Measurements at the particle scale:
• Translational and rotational accelerations
• RMS fluctuations
• Particle-particle and particle-wall interactions
Measurements at the channel scale:
• Mean velocity profiles of both phases
• Turbulent statistics of both phases
• Particle distributions
Particulate Multiphase Flow as a DDDAS
[Diagram labels]
• Sensors: the channel and the light field camera. Wide parameter ranges, but limited instrument resolution and incomplete data.
• Simulation: interface-resolved DNS on the Fresh Breeze system architecture. Limited parameter ranges, but complete space/time 4D data.
• Reduced model: coarse-grained models, D = f(Re, dp/H, ϕ, …)
• Applications
• Key questions
Particulate Multiphase Flow as a DDDAS: data flow
[Diagram labels]
• Input data: particle size, volume fraction, density ratio, grid resolution, etc.
• Coding choices: numerical method, computer algorithm, implementation
• DNS simulator, run with different input data
• Light field camera: PTV Views 1 to 4
• Merge and Distribute stages
• Results data: fluid flow data, particle data, reference single-phase flow data
• Interactive visualization
• Comparison, analysis, decision, and control
Mahali Space Weather Monitoring
Fluctuations in the propagation of GPS signals are used to measure electron density in the ionosphere.
[Figure: Satellites 1 and 2 and Receivers 1 to 4. A) Normal atmospheric condition]
Receivers need to be redirected in real time (pointing angle, satellite signal band) according to satellite motion, time of day, and analysis of received data.
Mahali Data Collection
[Diagram labels: Receivers 0 to 7, Global Data Network, Data Processing System]
DDDAS Requirements
• Real-time interaction with sensors and effectors: hence input/output facilities capable of rapid response to large numbers of independent events.
• Ability to execute intense processing functions to react appropriately to a changing situation.
• These requirements imply the need for effective dynamic management of memory and processing resources.
• Data streams are the natural programming interface to input/output devices.
DDDAS Development Support
• The funJava programming language for modular programming of DDDAS.
• The Fresh Breeze architecture for parallel computing with fine-grain execution of myriad codelets.
• The Kiva system simulator, capable of cycle-accurate simulation of systems with thousands of components.
• The Fresh Breeze compiler for generating codelets for highly parallel computation from funJava programs.
funJava: A Functional Programming Language for DDDAS
• A language in which all forms of parallelism are readily expressed: expression parallel, data parallel, producer-consumer, and transaction processing.
• A high-level programming language in which data streams are first-class data objects.
• Retains the type security feature of the Java language.
Data Uniformity: Trees of Chunks
[Diagram labels]
• Root chunk
• Data chunks (e.g. 128 bytes)
• Cycle-free heap
• Arrays as trees of chunks
• Stream as a chain of chunks
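The "array as a tree of chunks" idea can be illustrated with a minimal sketch. This is not Fresh Breeze or funJava code: the slot count (sixteen 8-byte slots in a 128-byte chunk), the `Chunk` class, and the `build` and `get` helpers are all assumptions made for illustration. Chunks are immutable here, so a parent can only be created after its children, which is one way a heap of chunks stays cycle-free.

```python
from dataclasses import dataclass

SLOTS = 16  # assumption: a 128-byte chunk holds sixteen 8-byte slots


@dataclass(frozen=True)
class Chunk:
    """Immutable fixed-size node: slots hold values (leaf) or child chunks."""
    slots: tuple  # at most SLOTS entries


def build(values):
    """Pack values into leaf chunks, then group chunks until one root remains."""
    level = [Chunk(tuple(values[i:i + SLOTS]))
             for i in range(0, len(values), SLOTS)] or [Chunk(())]
    depth = 0  # number of interior levels above the leaves
    while len(level) > 1:
        level = [Chunk(tuple(level[i:i + SLOTS]))
                 for i in range(0, len(level), SLOTS)]
        depth += 1
    return level[0], depth


def get(root, depth, index):
    """Walk from root to leaf, using one base-SLOTS digit of index per level."""
    node = root
    for d in range(depth, 0, -1):
        node = node.slots[(index // SLOTS ** d) % SLOTS]
    return node.slots[index % SLOTS]


root, depth = build(list(range(1000)))
print(depth, get(root, depth, 999))
```

With 16 slots per chunk, a 1000-element array needs 63 leaf chunks, 4 interior chunks, and one root, so any element is reached in three chunk accesses.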
Application Composability: Codelet
§ A block of instructions scheduled for execution when needed data objects are available.
§ Results made available to successor codelets.
§ Data objects are trees of chunks.
[Diagram labels: Codelet, Object A, Object B]
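The firing rule described above (a codelet runs once its input objects are available, and its result enables successors) can be sketched in software. The sketch below is a sequential illustration only, not the Fresh Breeze hardware scheduler; the `Codelet` class, the `run` function, and the example codelet names are hypothetical.

```python
from collections import deque


class Codelet:
    """A named block of work plus the names of the objects it needs."""
    def __init__(self, name, inputs, fn):
        self.name, self.inputs, self.fn = name, inputs, fn


def run(codelets, initial):
    """Fire each codelet once all of its named inputs are available."""
    available = dict(initial)
    by_name = {c.name: c for c in codelets}
    # For every codelet, the set of inputs that have not been produced yet.
    waiting = {c.name: set(c.inputs) - available.keys() for c in codelets}
    # For every object name, the codelets that consume it.
    consumers = {}
    for c in codelets:
        for i in set(c.inputs):
            consumers.setdefault(i, []).append(c.name)
    ready = deque(n for n, w in waiting.items() if not w)
    order = []
    while ready:
        name = ready.popleft()
        c = by_name[name]
        # The result is published under the codelet's own name.
        available[name] = c.fn(*(available[i] for i in c.inputs))
        order.append(name)
        for succ in consumers.get(name, []):
            waiting[succ].discard(name)
            if not waiting[succ]:
                ready.append(succ)
    return available, order


# Listed out of order on purpose: execution order follows data availability.
codelets = [
    Codelet("report", ["sum", "double"], lambda s, d: (s, d)),
    Codelet("double", ["sum"], lambda s: 2 * s),
    Codelet("sum", ["a", "b"], lambda a, b: a + b),
]
results, order = run(codelets, {"a": 3, "b": 4})
print(order, results["report"])
```

In a parallel system, every codelet in the ready queue could run at once; the sketch drains the queue one at a time only to stay deterministic.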
Illustration: Non-deterministic Stream Processing
[Diagram labels: Timer; GPS Signal Strength; "Not synchronized"; Filtering & Analysis; GPS Receiver Control]
funJava: natively supports non-deterministic stream merging.
Vs.
Synchronous stream processing: potential timing hazards.
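A non-deterministic merge passes elements on in arrival order, so a slow source never blocks a fast one. The sketch below illustrates the semantics with Python threads and a queue; it is not funJava, and the `merge` function and the `"timer"` and `"gps"` tags are hypothetical. The relative order of elements from different sources may change from run to run, while the order within each source is preserved.

```python
import queue
import threading


def merge(sources):
    """Yield (tag, item) pairs from several iterables in arrival order."""
    out = queue.Queue()
    done = object()  # sentinel: one per finished source

    def pump(tag, source):
        for item in source:
            out.put((tag, item))
        out.put(done)

    threads = [threading.Thread(target=pump, args=(tag, src))
               for tag, src in sources.items()]
    for t in threads:
        t.start()
    remaining = len(threads)
    while remaining:
        item = out.get()
        if item is done:
            remaining -= 1
        else:
            yield item
    for t in threads:
        t.join()


merged = list(merge({"timer": range(5), "gps": ["s0", "s1", "s2"]}))
print(merged)
```

A synchronous merge would instead wait for one element from each source per step, which stalls the pipeline whenever the timer and the receiver run at different rates.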
Advantages of Fresh Breeze Stream Processing
• First instance of support for high-level modular programming with streams.
• Fine-grain concurrency without timing hazards.
• High performance: pipeline processing of ten million stream elements per second.
A DDDAS Application Implemented in Fresh Breeze
Mahali in Stream Processing Representation
[Diagram: GPS Receivers 0 to n each produce a Stream<Data>, labeled by Tag 0 to Tag n. Stream Merge combines the tagged streams and a Timer stream into a Stream<Tagged Data> for Filter & Analysis, which produces a Stream<Tagged Command>. Select 0 to Select n extract a Stream<Command> for each receiver.]
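The Tag, Filter & Analysis, and Select stages of this representation can be sketched as ordinary functions over lists. Everything specific here is assumed for illustration: the signal-strength readings, the threshold rule, and the `"retune"` command are hypothetical, the Timer stream is omitted, and the merge is replaced by simple concatenation as a stand-in for an arbitrary arrival order.

```python
def tag(receiver_id, data_stream):
    """Tag i: label every reading with the receiver that produced it."""
    return [(receiver_id, reading) for reading in data_stream]


def filter_and_analyze(tagged_data, threshold):
    """Emit a tagged command for each weak reading (hypothetical rule)."""
    for receiver_id, strength in tagged_data:
        if strength < threshold:
            yield (receiver_id, "retune")


def select(receiver_id, tagged_commands):
    """Select i: keep only the commands addressed to receiver i."""
    return [cmd for rid, cmd in tagged_commands if rid == receiver_id]


# Hypothetical signal-strength readings per receiver.
streams = {0: [0.9, 0.2], 1: [0.8, 0.7], 2: [0.1]}

# Stand-in for Stream Merge: any interleaving of the tagged streams would do.
tagged = [pair for rid, data in streams.items() for pair in tag(rid, data)]

commands = list(filter_and_analyze(tagged, threshold=0.5))
for rid in streams:
    print(rid, select(rid, commands))
```

Because each element carries its receiver tag, the analysis stage does not depend on how the merge interleaves the inputs, and each Select stage can route commands back without any shared state.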
A Fresh Breeze Mahali Simulation
[Diagram labels: IO Processors 0 to 3; Memory Units 0 to 3; 4x4 Packet Routing Network (Commands); 4x4 Packet Routing Network (Responses); Receivers 0 to 7]
• IO Processors are Fresh Breeze processing units with capabilities for communicating with GPS receivers.
• Each IO Processor has IO ports for two GPS receivers.
• The Load Balancer and Task Schedulers are not shown.
A Mahali Scenario
• Kiva simulation runs have been performed for systems with four, eight, and sixteen processors.
• The simulations confirm the ability of each processor to handle input data at rates up to 6 GB/s or 50M packets per second.
• The simulations demonstrate the ability to perform real-time interactions together with analytic computation.
funJava Matrix Multiply
Multiplication of square matrices

Matrix Size    Number of Tasks
16             1,558
32             6,812
64             34,600
128            202,304
256            1,330,288
512            10,563,804

For each matrix size the computation is run on nine configurations ranging from one processor to 256 processors. These simulation runs stressed the Kiva simulator, with target systems of as many as 64,515 Kiva components.
[Figure: speedup (log scale, 1.0 to 100.0) versus number of processors (1 to 256), for matrix sizes 16 to 512]
No manual or runtime code optimization. No system adaptation.
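The task counts reported for the matrix multiply can be examined with a short calculation. The snippet below uses only the numbers from the table on this slide and computes how much the task count grows each time the matrix size doubles. The reading offered after it is an interpretation, not a statement from the authors.

```python
# Task counts copied from the slide's table (matrix size -> number of tasks).
tasks = {16: 1_558, 32: 6_812, 64: 34_600,
         128: 202_304, 256: 1_330_288, 512: 10_563_804}

sizes = sorted(tasks)
pairs = list(zip(sizes, sizes[1:]))
ratios = [tasks[b] / tasks[a] for a, b in pairs]

for (a, b), r in zip(pairs, ratios):
    print(f"{a:>3} -> {b:>3}: task count grows by x{r:.2f}")
```

The growth factor rises with every doubling and stays just under 8 at the largest sizes. That pattern would be consistent with a task count dominated by the cubic amount of work in a square matrix multiply once matrices are large, though the slide does not state how the compiler partitions the computation.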
Fresh Breeze Benefits for DDDAS
• Hardware support for fine-grain tasking allows computation to be distributed over many processing cores.
• Direct communication with IO devices in support of low-latency real-time interaction.
• Low energy: no runtime software overhead, no cost for cache consistency.
• Expression of DDDAS in a high-level programming language supporting component-based software.
• High performance: pipeline processing of ten million stream elements per second.
Achievements
• funJava: a high-level functional programming language with stream processing and real-time IO.
• Fresh Breeze architecture for efficient execution of DDDAS.
• Incorporation of task scheduling and load balancing for 500 or more processing cores.
• Demonstration of linear speedup to over 500 processing cores.
• A codelet compiler that generates high-performance code without manual tuning.
• Demonstration of real-time interactive data collection derived from Mahali space weather monitoring.
• Parallel multiscale mesoscopic simulations of multiphase flows on thousands of processors.
• Light field reconstruction of multiphase flows at various scales.
Publications
Published:
• Jack B. Dennis. IFIP Working Group 2.8, Functional Programming. Meetings: October 14-18, 2013, Aussois, France; August 10-15, 2014, Estes Park, Colorado; May 25-29, 2015, Kefalonia, Greece.
• Wang L-P, Peng C, Guo ZL, Yu ZS, 2016, Lattice Boltzmann Simulation of Particle-Laden Turbulent Channel Flow, Computers & Fluids, 124:226-236.
• Wang L-P, Ardila OGC, Ayala O, Gao H, Peng C, 2016, Study of local turbulence profiles relative to the particle surface in particle-laden turbulent flows, ASME J. of Fluids Engr., 138:041203.
• Wang L-P, Peng C, Guo ZL, Yu ZS, 2016, Flow Modulation by Finite-Size Neutrally Buoyant Particles in a Turbulent Channel Flow, ASME J. of Fluids Engr., 138:041103.
• Yu ZS, Lin ZW, Shao XM, Wang L-P, A parallel fictitious domain method for the interface-resolved simulation of particle-laden flows and its application to the turbulent channel flow, Engr. Appl. Comput. Fluid Mech. 10:160-170.
• Peng C, Teng Y, Hwang B, Guo ZL, Wang L-P, 2016, Implementation issues and benchmarking of lattice Boltzmann method for moving particle simulations in a viscous flow, Computers & Mathematics with Applications. doi:10.1016/j.camwa.2015.08.027
• Zong Y, Guo ZL, Wang L-P, 2015, Designing Correct Fluid Hydrodynamics on A Rectangular Grid using MRT Lattice Boltzmann Approach, Computers & Mathematics with Applications. doi:10.1016/j.camwa.2015.05.021
• Lin ZW, Shao XM, Yu ZS, Wang L-P, 2016, Effects of finite-size heavy particles on the turbulent flows in a square duct, J. Hydrodynamics, accepted.
Submitted:
• Chen SY, Peng C, Teng YH, Wang L-P, 2015, Improving lattice Boltzmann simulation of moving particles in a viscous flow using local grid refinement, Computers & Fluids.
• Lin ZW, Shao XM, Yu ZS, Wang L-P, 2015, Effects of finite-size neutrally buoyant particles on the turbulent flows in a square duct, J. Fluid Mech.
• Lin ZW, Shao XM, Yu ZS, Wang L-P, 2015, Effects of particle inertia on the interactions between the turbulent channel flow and the finite-size particles, Physics of Fluids.
• Min HD, Peng C, Guo ZL, Wang L-P, 2016, An inverse design analysis of mesoscopic implementation of non-uniform forcing in MRT lattice Boltzmann models, Computers & Mathematics with Applications.
• Peng C, Guo ZL, Wang L-P, 2016, A lattice-BGK model for the Navier-Stokes equations based on a rectangular grid, Computers & Mathematics with Applications.
• Peng C, Min HD, Guo ZL, Wang L-P, 2016, A hydrodynamically-consistent MRT lattice Boltzmann model on a 2D rectangular grid, J. Comp. Phys.
• Wang P, Guo ZL, Xu K, Wang L-P, 2016, A comparative study of DUGKS and LBE methods for laminar flows and decaying homogeneous isotropic turbulence flows, Phys. Rev. E.
• Bo YT, Wang P, Guo ZL, Wang L-P, 2016, Parallel implementation and validation of DUGKS for three-dimensional Taylor-Green vortex flow and turbulent channel flow, Computers & Mathematics with Applications.
• Wang L-P, Min HD, Peng C, Geneva N, Guo ZL, 2016, A lattice-Boltzmann scheme of the Navier-Stokes equation on a three-dimensional cuboid lattice, Computers & Fluids.
Further Work
• Develop a complete simulation model for a typical fielded Mahali data collection network.
• Extend Fresh Breeze simulation to model a system with multiple multi-core nodes and a shared DRAM memory.
• Implement garbage collection.
• Extend the expressive power of funJava and the Fresh Breeze compiler.
• Evaluate energy efficiency of Fresh Breeze systems.
• Study other DDDAS to assess any limitations of our approach.