
Transcript of "Machine Learning for Memory Reduction in the Implicit Monte Carlo Simulations of the Thermal Radiative Transfer"

Page 1: Machine Learning for Memory Reduction in the Implicit Monte Carlo Simulations of the Thermal Radiative Transfer [1]

Anna Matsekh, Luis Chacon, HyeongKae Park, Guangye Chen
Theoretical Division, Los Alamos National Laboratory

[1] LA-UR-18-26613

Page 2: Thermal Radiative Transfer

• Thermal Radiative Transfer (TRT) equations
  • describe the propagation and interaction of photons with the surrounding material
  • are challenging to solve due to the stiff non-linearity and high dimensionality of the problem

• TRT applications at LANL include simulations of
  • Inertial Confinement Fusion (ICF) experiments
  • astrophysical events

Figure: TRT applications. (a) ICF, (b) supernova.

Page 3: Implicit Monte Carlo Simulations

• Advantages, compared to the deterministic case
  • easier to extend to complex geometries and higher dimensions
  • easier to parallelize

• Disadvantages
  • Monte Carlo solutions to the IMC equations exhibit statistical variance, and the IMC convergence rate is estimated to be O(1/√Np), where Np is the number of simulation particles (see the sketch after this list)
  • Even when advanced variance reduction techniques are employed, Monte Carlo simulations
    • exhibit slow convergence
    • are prone to statistical errors
    • require a very large number of simulation particles
• Implicit Monte Carlo codes are typically very large, long-running codes with large memory requirements at checkpointing & restarting
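To make the O(1/√Np) estimate above concrete, here is a minimal, generic Monte Carlo sketch (not an IMC calculation): estimating a known mean from Np samples and watching the error shrink roughly as 1/√Np. The distribution and sample sizes are illustrative assumptions.

```python
import numpy as np

# Minimal illustration of the O(1/sqrt(Np)) Monte Carlo convergence rate:
# estimate the mean of an exponential distribution (true mean = 1.0)
# with increasing particle counts and watch the error decay.
rng = np.random.default_rng(0)
true_mean = 1.0
for n_p in [10**3, 10**4, 10**5, 10**6]:
    samples = rng.exponential(scale=true_mean, size=n_p)
    error = abs(samples.mean() - true_mean)
    print(f"Np = {n_p:>8d}   |error| = {error:.2e}   1/sqrt(Np) = {n_p**-0.5:.2e}")
```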

Page 4: Machine Learning for IMC

Project Goal: use parametric Machine Learning methods in order to reduce memory requirements at checkpointing & restarting in the IMC simulations of Thermal Radiative Transfer using

• Expectation Maximization and the Weighted Gaussian Mixture Model-based approach for ‘particle-data compression’, introduced in Plasma Physics to model Maxwellian particle distributions by Luis Chacon and Guangye Chen

• Expectation Maximization with the Weighted Hyper-Erlang Model in order to compress isotropic IMC particle data in the frequency domain

• Expectation Maximization and von Mises Mixture Models for compression of anisotropic directional IMC data on a circle (work-in-progress)

Page 5: TRT Equations

Consider the 1-d Transport Equation without scattering and without external sources in Local Thermodynamic Equilibrium (LTE):

\[
  \frac{1}{c}\,\frac{\partial I_\nu}{\partial t} + \mu\,\frac{\partial I_\nu}{\partial x} + \sigma_\nu\, I_\nu = \frac{1}{2}\,\sigma_\nu\, B_\nu
\]   (1)

coupled to the Material Energy Equation

\[
  c_v\,\frac{\partial T}{\partial t} = \iint \sigma_\nu\, I_\nu \, d\nu \, d\mu - \int \sigma_\nu\, B_\nu \, d\nu
\]   (2)

where the emission term

\[
  B_\nu(T) = \frac{2 h \nu^3}{c^2}\,\frac{1}{e^{h\nu/(kT)} - 1}
\]   (3)

is the Planckian (Blackbody) distribution and

• Iν = I(x, µ, t, ν) - radiation intensity
• ν - frequency, T - temperature
• σν - opacity, cv - material heat capacity
• k - Boltzmann constant, h - Planck's constant, c - speed of light
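A minimal numerical sketch of the emission term (3); the SI constant values are standard, while the example frequencies and temperature are assumptions for illustration, not values from the slides.

```python
import numpy as np

# Planckian (blackbody) spectral intensity B_nu(T) from Eq. (3), in SI units.
H = 6.62607015e-34   # Planck constant [J s]
C = 2.99792458e8     # speed of light [m/s]
K = 1.380649e-23     # Boltzmann constant [J/K]

def planck_intensity(nu, temperature):
    """B_nu(T) = (2 h nu^3 / c^2) / (exp(h nu / (k T)) - 1)."""
    nu = np.asarray(nu, dtype=float)
    return (2.0 * H * nu**3 / C**2) / np.expm1(H * nu / (K * temperature))

# Example: intensity at a few frequencies for an illustrative temperature.
nu = np.array([1e16, 1e17, 1e18])                 # Hz (placeholder values)
print(planck_intensity(nu, temperature=1.0e7))    # T in kelvin (placeholder value)
```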

Page 6: Implicit Monte Carlo Method of Fleck and Cummings

Transport Equation

\[
  \frac{1}{c}\,\frac{\partial I_\nu}{\partial t} + \mu\,\frac{\partial I_\nu}{\partial x} + (\sigma_{\nu a} + \sigma_{\nu s})\, I_\nu
  = \frac{1}{2}\,\sigma_{\nu a}\, c\, u_r^{n} + \frac{1}{2}\,\sigma_{\nu s}\,\frac{b_\nu}{\sigma_p} \iint \sigma_{\nu'}\, I_{\nu'}\, d\nu'\, d\mu
\]   (4)

Material Temperature Equation (T^n = T(t^n) ≈ T(t), t^n ≤ t ≤ t^{n+1})

\[
  c_v\, T^{n+1} = c_v\, T^{n} - f\, \sigma_p\, c\, \Delta t\, u_r^{n}
  + f \int_{t^n}^{t^{n+1}} dt \iint \sigma_{\nu'}\, I_{\nu'}\, d\nu'\, d\mu
\]   (5)

• f = 1/(1 + α β c ∆t σp) - Fleck factor (evaluated in the sketch after this list)
• σνa = f σν - effective absorption opacity
• σνs = (1 − f) σν - effective scattering opacity
• ur - radiation energy density, bν(T) - normalized Planckian
• σp = ∫ σν bν dν - Planck opacity
• α ∈ [0, 1] s.t. ur ≈ α ur^{n+1} + (1 − α) ur^n; β = ∂ur/∂um
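The sketch referenced in the list above simply evaluates the Fleck factor and the effective opacities from these definitions; every numerical input is a made-up placeholder, not a value from the slides.

```python
import numpy as np

# Fleck factor and effective opacities from the definitions above.
def fleck_factor(alpha, beta, dt, sigma_p, c=1.0):
    """f = 1 / (1 + alpha * beta * c * dt * sigma_p)."""
    return 1.0 / (1.0 + alpha * beta * c * dt * sigma_p)

def effective_opacities(f, sigma_nu):
    """sigma_a = f * sigma_nu (absorption), sigma_s = (1 - f) * sigma_nu (scattering)."""
    sigma_nu = np.asarray(sigma_nu, dtype=float)
    return f * sigma_nu, (1.0 - f) * sigma_nu

f = fleck_factor(alpha=1.0, beta=4.0, dt=1e-2, sigma_p=10.0)        # placeholder inputs
sigma_a, sigma_s = effective_opacities(f, sigma_nu=[5.0, 10.0, 20.0])
print(f, sigma_a, sigma_s)
```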

Page 7: Monte Carlo Implementation of the IMC method

On each simulation time step:

• Particle Sourcing - calculating the total energy in the system from different sources due to the
  • boundary conditions and initial conditions
  • external sources and emission sources
• Particle Tracking - tracking distance to an event in time (see the sketch after this list):
  • distance to the spatial cell boundary
  • distance to the end of the time-step
  • distance to the next collision
• Tallying - computing sample means of such quantities as
  • energy deposited due to all effective absorptions
  • volume-averaged energy density
  • fluxes
• Calculate the next time-step temperature approximation T^{n+1}
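A schematic sketch of the tracking step described above: the next event is whichever of the three candidate distances is smallest. The function name, the c = 1 normalization, and the input values are illustrative assumptions, not code from an IMC production code.

```python
import numpy as np

def distance_to_event(x, mu, cell_left, cell_right, t_remaining, sigma_total, rng, c=1.0):
    # Three candidate distances; the smallest one decides the event (assumes mu != 0).
    d_boundary = (cell_right - x) / mu if mu > 0 else (cell_left - x) / mu
    d_census = c * t_remaining                           # path length until the end of the time step
    d_collision = -np.log(rng.random()) / sigma_total    # exponentially distributed free path
    distances = {"boundary": d_boundary, "census": d_census, "collision": d_collision}
    event = min(distances, key=distances.get)
    return distances[event], event

rng = np.random.default_rng(1)
print(distance_to_event(x=0.3, mu=0.8, cell_left=0.0, cell_right=1.0,
                        t_remaining=0.5, sigma_total=5.0, rng=rng))
```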

Page 8: The concept of an IMC particle

• An IMC particle is a simulation abstraction representing a ‘radiation energy bundle’ characterized by (sketched below)
  • the energy-weight (relative number of photons represented by an IMC particle)
  • spatial location
  • angle of flight
  • frequency group it belongs to
• On each simulation time step t^n ≤ t ≤ t^{n+1} an IMC particle can undergo the following events:
  • escape through the boundaries
  • get absorbed by the material
  • scattering / re-emission
  • survive (particle goes to census at t^{n+1})
• Surviving particles are called census particles and have to be stored in memory to be reused on the next time-step

Can we ‘learn’ the probability distribution function describing census particles at the end of each time-step and store in memory only this distribution?
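To make the memory question concrete, here is a rough sketch of a census-particle record and the storage it implies; the field list, field sizes, and particle count are assumptions for illustration (production IMC particles typically carry more state).

```python
from dataclasses import dataclass

# A rough sketch of a census-particle record (fields are illustrative assumptions).
@dataclass
class CensusParticle:
    energy_weight: float   # relative number of photons represented by the particle
    x: float               # spatial location
    mu: float              # angle of flight (direction cosine)
    group: int             # frequency group index

# With ~4 eight-byte fields per particle, 10^8 census particles need on the order of
# gigabytes, whereas a fitted mixture model needs only a handful of
# (p_k, alpha_k, beta_k) triplets.
n_particles = 10**8
bytes_per_particle = 4 * 8
print(f"~{n_particles * bytes_per_particle / 1e9:.1f} GB for raw census particles")
```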

Page 9: Expectation Maximization Method

• Expectation Maximization (EM) is an iterative method for estimating parameters of probabilistic models. It is typically applied to Finite Mixture Models

\[
  P(x_j, \theta) = \sum_{i=1}^{k} p_i\, F(x_j, \theta_i)
\]   (6)

  • F(x_j, θ_i) - pdf with the parameter vector θ_i
  • x_j - data points from the sample X^n = (x_1, x_2, ..., x_n)
  • p_i - probability of F(x_j, θ_i) in the mixture, with \(\sum_{i=1}^{k} p_i = 1\)

• The EM algorithm alternates between the Expectation and Maximization steps:
  • Expectation (E) step - computing priors (probabilities)
  • Maximization (M) step - updating the model parameters θ_i that maximize the expected log-likelihood function

\[
  \mathcal{L}(X^n) = \sum_{j=1}^{n} \ln \sum_{i=1}^{k} p_i\, F(x_j, \theta_i)
\]   (7)
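A minimal sketch of the E/M alternation for a finite mixture, using a 1-d Gaussian mixture purely as a generic illustration of Eqs. (6)-(7); the Erlang-mixture updates actually used for IMC appear on the next slides, and the synthetic data here are an assumption.

```python
import numpy as np

# Minimal EM for a 1-d Gaussian mixture: E-step computes responsibilities (priors),
# M-step updates the parameters that maximize the expected log-likelihood (7).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(5.0, 0.5, 500)])

k = 2
p = np.full(k, 1.0 / k)                   # mixture probabilities p_i
mu = np.array([x.min(), x.max()])         # component means
var = np.full(k, x.var())                 # component variances

for _ in range(100):
    # E-step: gamma_ji = p_i F(x_j, theta_i) / sum_i p_i F(x_j, theta_i)
    pdf = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    gamma = p * pdf
    gamma /= gamma.sum(axis=1, keepdims=True)
    # M-step: closed-form updates for p_i, mu_i, var_i
    n_i = gamma.sum(axis=0)
    p = n_i / x.size
    mu = (gamma * x[:, None]).sum(axis=0) / n_i
    var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / n_i

print(p, mu, var)
```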

Page 10: EM and Hyper-Erlang Model

• A Hyper-Erlang Model H(ν, α, β) is a mixture of Erlang distributions E(ν, α, β):

\[
  \sum_{k=1}^{m} p_k\, E(\nu_i, \alpha_k, \beta_k)
  = \sum_{k=1}^{m} p_k\, \frac{1}{(\alpha_k - 1)!}\, \nu_i^{\alpha_k - 1}\, \beta_k^{-\alpha_k}\, e^{-\nu_i/\beta_k}
\]

where
  • {ν_1, ..., ν_n} is an iid data sample
  • α_k > 0 - integer shape parameter, β_k > 0 - real scale parameter

• Expectation Maximization priors

\[
  \gamma_{ik} = \frac{p_k\, E(\nu_i, \alpha_k, \beta_k)}{\sum_{l=1}^{m} p_l\, E(\nu_i, \alpha_l, \beta_l)},
  \qquad i = 1, \ldots, n, \quad k = 1, \ldots, m
\]   (8)

• Maximum Likelihood parameter estimates

\[
  \beta_k = \frac{\sum_{i=1}^{n} \gamma_{ik}\, \nu_i}{\alpha_k \sum_{i=1}^{n} \gamma_{ik}},
  \qquad p_k = \frac{\sum_{i=1}^{n} \gamma_{ik}}{n},
  \qquad k = 1, \ldots, m
\]   (9)
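A minimal sketch of the EM iteration (8)-(9) for a Hyper-Erlang mixture with fixed integer shapes αk; the synthetic frequency sample, the number of components, and the initialization are illustrative assumptions.

```python
import numpy as np
from scipy.special import gammaln

# EM updates (8)-(9) for a Hyper-Erlang mixture with fixed integer shapes alpha_k.
def erlang_pdf(nu, alpha, beta):
    """E(nu, alpha, beta) = nu^(alpha-1) * beta^(-alpha) * exp(-nu/beta) / (alpha-1)!"""
    return np.exp((alpha - 1) * np.log(nu) - alpha * np.log(beta)
                  - nu / beta - gammaln(alpha))

rng = np.random.default_rng(0)
nu = rng.gamma(shape=3.5, scale=10.0, size=5000)   # stand-in frequency sample

alphas = np.arange(1, 6)                 # fixed shapes alpha_k = 1..5 (illustrative)
m = alphas.size
p = np.full(m, 1.0 / m)
beta = np.full(m, nu.mean() / alphas.mean())

for _ in range(200):
    # E-step: priors gamma_ik, Eq. (8)
    comp = p * erlang_pdf(nu[:, None], alphas, beta)      # shape (n, m)
    gamma = comp / comp.sum(axis=1, keepdims=True)
    # M-step: maximum-likelihood updates, Eq. (9)
    g_sum = gamma.sum(axis=0)
    beta = (gamma * nu[:, None]).sum(axis=0) / (alphas * g_sum)
    p = g_sum / nu.size

print(np.round(p, 4), np.round(beta, 3))
```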

Page 11: Planckian and Erlang Distributions

• In LTE the emission term in the Transport Equation is given by the Planckian Bν
• Frequency-normalized Planckian density function

\[
  b_\nu(T) = \frac{B_\nu(T)}{\int_0^{\infty} B_\nu(T)\, d\nu}
  = \frac{15}{\pi^4}\, \frac{\nu^3}{T^4\,(e^{\nu/T} - 1)}
\]   (10)

• There are no closed-form EM estimates of T for bν(T ) mixtures

• Can we model Planckian-like distributions with Erlang mixtures?

• Erlang distribution

\[
  E(\nu, \alpha, T) = \frac{1}{(\alpha - 1)!}\, \frac{\nu^{\alpha - 1}}{T^{\alpha}\, e^{\nu/T}}
\]   (11)

• Consider Erlang distributions with shapes α = {3, 4}

\[
  E(\nu, 3, T) = \frac{1}{2}\, \frac{\nu^2}{T^3\, e^{\nu/T}},
  \qquad
  E(\nu, 4, T) = \frac{1}{6}\, \frac{\nu^3}{T^4\, e^{\nu/T}}
\]   (12)

Page 12: Planckian and Erlang Distributions

Figure: Planckian frequency data (sample size 200,000). (a) Planckian and Erlang pdfs: Planckian, Erlang(ν, 3, β3), Erlang(ν, 4, β4); (b) Hyper-Erlang pdf (α = {3, 4} mixture) vs. the Planckian.

Page 13: Hyper-Erlang Model for IMC

• IMC particles in group g are characterized by the same average frequency, i.e. have the same likelihood to be drawn from g
• The cumulative energy-weight \(\omega_g = \sum_{j=1}^{n_{p_g}} \omega_{g_j}\) of the \(n_{p_g}\) particles in g is the relative number of photons represented by all particles in g

• Weighted Log-Likelihood of the Hyper-Erlang IMC Model

\[
  \ln \prod_{g=1}^{n} \left( \sum_{k=1}^{m} p_k\, E(\nu_g, \alpha_k, \beta_k) \right)^{\omega_g}
  = \sum_{g=1}^{n} \omega_g \ln \sum_{k=1}^{m} p_k\, E(\nu_g, \alpha_k, \beta_k)
\]

• Weighted Maximum Likelihood / EM parameter estimates (a code sketch follows)

\[
  \gamma_{gk} = \frac{p_k\, E(\nu_g, \alpha_k, \beta_k)}{\sum_{l=1}^{m} p_l\, E(\nu_g, \alpha_l, \beta_l)},
  \qquad g = 1, \ldots, n, \quad k = 1, \ldots, m
\]

\[
  \beta_k = \frac{\sum_{g=1}^{n} \omega_g\, \gamma_{gk}\, \nu_g}{\alpha_k \sum_{g=1}^{n} \omega_g\, \gamma_{gk}},
  \qquad
  p_k = \frac{\sum_{g=1}^{n} \omega_g\, \gamma_{gk}}{\sum_{g=1}^{n} \omega_g}
\]
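A minimal sketch of one weighted EM iteration under the reconstruction above: the E-step is unchanged, and the cumulative energy-weights ωg enter every M-step sum. The synthetic frequencies, weights, shapes, and initialization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.special import gammaln

# Weighted EM for the Hyper-Erlang IMC model: same E-step as the unweighted sketch,
# with the cumulative energy-weights omega_g entering the M-step sums.
def erlang_pdf(nu, alpha, beta):
    return np.exp((alpha - 1) * np.log(nu) - alpha * np.log(beta)
                  - nu / beta - gammaln(alpha))

rng = np.random.default_rng(0)
nu = rng.gamma(shape=4.0, scale=50.0, size=2000)   # group frequencies (placeholder)
w = rng.uniform(0.5, 2.0, size=nu.size)            # cumulative energy-weights omega_g (placeholder)

alphas = np.arange(1, 6)
p = np.full(alphas.size, 1.0 / alphas.size)
beta = np.full(alphas.size, nu.mean() / alphas.mean())

for _ in range(200):
    comp = p * erlang_pdf(nu[:, None], alphas, beta)   # p_k E(nu_g, alpha_k, beta_k)
    gamma = comp / comp.sum(axis=1, keepdims=True)     # priors gamma_gk
    wg = w[:, None] * gamma                            # omega_g * gamma_gk
    beta = (wg * nu[:, None]).sum(axis=0) / (alphas * wg.sum(axis=0))
    p = wg.sum(axis=0) / w.sum()

print(np.round(p, 4), np.round(beta, 3))
```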

Page 14: Weighted Hyper-Erlang Model for compressing IMC particles

Figure: Planckian IMC data sample from the time step t0 = 0 (initial temperature T0 = 100, 500 groups), the Normalized Planckian bν(T) at T = 100, and the Hyper-Erlang Model H(ν, α, β) of the sample with 20 mixture elements. (Curves: IMC data; Normalized Planckian, T = 100; Weighted Hyper-Erlang; x-axis: energy.)

Note: for x ∼ bν(T) and y ∼ H(ν, α, β): ‖x − y‖∞ = 6.2 × 10^{-6}

Page 15: Weighted Hyper-Erlang Model

Table: Parameters of the Hyper-Erlang Model of the Planckian IMC data sample with initial temperature T0 = 100 from the time step t0 = 0. Shape parameters αk are fixed, scale parameters βk model the radiation temperature, and k = 1, 2, ..., 20.

  pk           αk    βk
  0.           1    358.246
  0.0248371    2    182.592
  0.2026990    3     97.1897
  0.2191770    4     71.5419
  0.1673600    5     73.3836
  0.1283270    6     69.3727
  0.0927585    7     64.5944
  0.0622957    8     61.4264
  0.0394477    9     60.4135
  0.0244054   10     60.7461
  0.0151412   11     61.0587
  0.0094205   12     60.8283
  0.0058112   13     59.8523
  0.0035262   14     58.2133
  0.0020989   15     56.3901
  0.0012253   16     54.7773
  0.0007035   17     53.6442
  0.0004003   18     53.2266
  0.0002299   19     53.6196
  0.0001359   20     54.2246
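To illustrate what restarting from the compressed representation could look like, the sketch below rebuilds the mixture density from stored (pk, αk, βk) triplets (only the first rows of the table are typed out and renormalized) and regenerates particle frequencies by sampling it; the sampling step is an assumption about how a restart might repopulate census particles, not the authors' implementation.

```python
import numpy as np
from scipy.special import gammaln

# Rebuild the Hyper-Erlang density from stored (p_k, alpha_k, beta_k) triplets
# (first rows of the table above; the full model needs only 20 such triplets).
p     = np.array([0.0,      0.0248371, 0.2026990, 0.2191770, 0.1673600])
alpha = np.array([1,        2,         3,         4,         5        ])
beta  = np.array([358.246,  182.592,   97.1897,   71.5419,   73.3836  ])
p = p / p.sum()   # renormalize the truncated mixture for this illustration

def hyper_erlang_pdf(nu):
    log_comp = ((alpha - 1) * np.log(nu[:, None]) - alpha * np.log(beta)
                - nu[:, None] / beta - gammaln(alpha))
    return (p * np.exp(log_comp)).sum(axis=1)

# Illustrative 'restart': regenerate particle frequencies by sampling the mixture.
rng = np.random.default_rng(0)
k = rng.choice(alpha.size, size=100_000, p=p)
nu_new = rng.gamma(shape=alpha[k], scale=beta[k])
print(hyper_erlang_pdf(np.array([100.0, 300.0, 600.0])), nu_new.mean())
```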

Page 16: Summary and Future Work

• Memory usage reduction during the code checkpointing and restarting steps is of great importance in the IMC simulations
• Innovation: storing only parameters of the probability distributions of census particles in place of the data structures describing them in order to reduce IMC storage requirements
• Innovation: Expectation Maximization and Weighted Hyper-Erlang Models can accurately model the Planckian and are therefore appropriate for compression of isotropic IMC census data in the frequency domain in LTE
• Work-in-Progress: we are currently researching the applicability of Expectation Maximization and von Mises Mixture Models for compression of anisotropic directional IMC census data on a circle