Automatic Numeric Inversion - boun.edu.tr

Automatic Numeric Inversion
Wolfgang Hörmann, Gerhard Derflinger, Josef Leydold

[email protected]

IE-Department, Bogazici University Istanbul
Department of Statistics, WU Wien

Hörmann & Derflinger & Leydold – MCQMC-2008/07/11 – Automatic Numeric Inversion – p.0/41

Non-Uniform Random Variate Generation

Aim: Generate a sequence $X_i$ of IID random variates with a given distribution.

Solution: Transform a sequence $U_i$ of IID $U(0,1)$ random numbers into the sequence $X_i$.

$U_1, U_2, U_3, U_4, \ldots \longrightarrow X_1, X_2, X_3, \ldots$

For inversion this transformation is one-to-one.


Inversion Method

Theorem: Let $F(x)$ be the CDF of the given distribution. If $U$ is a $U(0,1)$ random number, then

$X = F^{-1}(U) = \inf\{x : F(x) \ge U\}$

is a random variate with CDF $F$.


Inversion Method

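Required: (Inverse of) CDF $F$ of the distribution.

[Figure: graph of the CDF; a value U on the vertical axis (which runs up to 1) is mapped to X on the horizontal axis.]

$U \sim U(0,1) \longrightarrow X = F^{-1}(U)$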


Inversion Method // Example

Exponential distribution:

$F(x) = 1 - \exp(-x)$

$F^{-1}(u) = -\log(1 - u)$
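
The closed-form inverse above can be turned into a generator directly. The following is a minimal sketch using only the Python standard library; the function names and the seed are illustrative choices, not part of the talk.

```python
import math
import random

def exponential_inverse_cdf(u):
    """Inverse CDF of the standard exponential distribution: -log(1 - u)."""
    # log1p(-u) evaluates log(1 - u) accurately, also for very small u.
    return -math.log1p(-u)

def sample_exponential(rng):
    """One variate per uniform random number (inversion method)."""
    return exponential_inverse_cdf(rng.random())

rng = random.Random(12345)  # arbitrary seed, chosen only for reproducibility
samples = [sample_exponential(rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)  # should be close to the true mean 1
```

Because `rng.random()` returns values in [0, 1), the argument of the logarithm never reaches zero.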


Inversion Method // Algorithm

Required: Inverse of CDF

Generate $U \sim U(0,1)$.
Compute $X = F^{-1}(U)$. ⇐ Problem (?)
Return $X$.


Inversion Method // Advantages

The most general method for generating non-uniform random variates.
Works for all distributions provided that the CDF is given.

Get one random variate X for each uniform U.

Preserves the structural properties of the underlying uniform PRNG.
Easy to use for QMC.
Required for copula generation.


Inversion Method // Advantages

Consequently . . .

Can be used for variance reduction techniques (common / antithetic variates, stratified sampling, ...).

Sampling from truncated distributions.

Quality of the generated random numbers depends only on the underlying uniform PRNG, not on the distribution.

Hence the preferred method for MC.

Often considered the only possible alternative for QMC.
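
Two of the consequences listed above are easy to illustrate with the exponential distribution, whose CDF and inverse are known in closed form. This is a hedged sketch: the helper names are hypothetical, and `u` is assumed to lie strictly inside (0, 1) for the antithetic pair.

```python
import math

def exp_cdf(x):
    """CDF of the standard exponential distribution, 1 - exp(-x)."""
    return -math.expm1(-x)

def exp_inverse_cdf(u):
    """Inverse CDF of the standard exponential distribution."""
    return -math.log1p(-u)

def truncated_exp_variate(u, lower, upper):
    """Map u in [0, 1] to a variate of the distribution truncated to [lower, upper].

    The uniform number is rescaled to the interval (F(lower), F(upper))
    before the inverse CDF is applied.
    """
    u_lower, u_upper = exp_cdf(lower), exp_cdf(upper)
    return exp_inverse_cdf(u_lower + u * (u_upper - u_lower))

def antithetic_pair(u):
    """Return the variates generated from u and from 1 - u (requires 0 < u < 1)."""
    return exp_inverse_cdf(u), exp_inverse_cdf(1.0 - u)
```

Since inversion is monotone in `u`, the pair returned by `antithetic_pair` is negatively correlated, which is what antithetic variates rely on.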


Inversion Method // Disadvantages

CDF and its inverse are often not given in closed form (or are unknown).

Algorithms based on well-known numerical methods (e.g. Newton's method or regula falsi) are slow and can only be sped up by using (large) tables.

Numerical methods are not exact, i.e., they produce random numbers whose distribution only approximates the given distribution.
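
To make the cost of the classical approach concrete, here is a small sketch of inversion by Newton's method for the standard normal distribution. It is an illustration under assumptions, not the method of the talk: the starting point 0, the tolerance and the iteration cap are arbitrary choices, and starting at the mode happens to be safe for the normal CDF but is not a robust strategy in general.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def inverse_cdf_newton(u, cdf, pdf, x_start=0.0, tol=1e-12, max_iter=100):
    """Solve cdf(x) = u for x with Newton's method.

    Every iteration needs one CDF and one PDF evaluation, which is why
    this kind of inversion is slow compared with a table-based method.
    """
    x = x_start
    for _ in range(max_iter):
        step = (cdf(x) - u) / pdf(x)
        x -= step
        if abs(step) < tol:
            break
    return x

quantile = inverse_cdf_newton(0.975, normal_cdf, normal_pdf)
```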


Automatic Methods

Automatic algorithms have been developed for non-standard distributions.

Also called black-box or universal methods.

Idea: One algorithm works for a large class of distributions.

Today these algorithms have properties that make them attractive also for generating from standard distributions, even for sampling from Gaussian distributions.


Automatic Inversion Method // Why ?

Reasons for development:

In many simulation situations the application of standard distributions is not adequate.
Development of inversion methods for special distributions is too “expensive”.

New Automatic Inversion Algorithm:

User provides the PDF and one point in the domain.
Set-up calculates all necessary tables and thus “designs” the algorithm.


Basics of Numerical Inversion

Devroye’s Notion of “Exact R.V.G. Algorithms”:

Inversion is only possible if $F^{-1}(u)$ is available in closed form.

Practically all MC or QMC experiments use 64-bit floating point numbers.

We call an algorithm “exact” if its precision is close to machine precision.

Problem: Definition of the approximation error


Definition of Approximation Error

Without inverting the CDF we can only control the “u-error” of the approximate inverse CDF, denoted by $F_a^{-1}(u)$:

$\varepsilon_u(u) = |u - F(F_a^{-1}(u))|$

The “x-error”, i.e., $\varepsilon_x(u) = |F^{-1}(u) - F_a^{-1}(u)|$, can be quite large in the tails, as

$\varepsilon_x(u) = \varepsilon_u(u)/f(x) + O(\varepsilon_u(u)^2)$

u-error useful for bound of “F-discrepancy”.

Small u-error enough for MC and QMC applications.

For exact quantiles in the far tails the x-error is necessary!
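
The relation between the two error measures can be checked numerically. The sketch below uses the exponential distribution and a hypothetical approximate inverse that is deliberately constructed to have a constant u-error of `SHIFT`; it is only meant to show how the x-error grows like $\varepsilon_u(u)/f(x)$ in the tail.

```python
import math

def exp_pdf(x):
    return math.exp(-x)

def exp_cdf(x):
    return -math.expm1(-x)

def exp_inverse_cdf(u):
    return -math.log1p(-u)

SHIFT = 1e-8  # artificial u-error, an assumption made for this illustration

def approx_inverse_cdf(u):
    """Hypothetical approximate inverse CDF with constant u-error SHIFT."""
    return exp_inverse_cdf(u + SHIFT)

def u_error(u):
    return abs(u - exp_cdf(approx_inverse_cdf(u)))

def x_error(u):
    return abs(exp_inverse_cdf(u) - approx_inverse_cdf(u))

for u in (0.5, 0.999, 0.999999):
    x = exp_inverse_cdf(u)
    # Columns: u, u-error, x-error, first-order prediction u-error / f(x)
    print(u, u_error(u), x_error(u), u_error(u) / exp_pdf(x))
```

The u-error stays at `SHIFT` for all three values of `u`, while the x-error increases by several orders of magnitude as `u` approaches 1.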


Requirements for the new Algorithm

Maximal accepted u-error can be selected by the user.
Maximal u-errors close to machine precision can be reached with medium-sized tables.
Sampling algorithm (very) fast.

To reach the above we accept:

Slow set-up.
Medium-sized tables.


Main Ingredients of Numerical Inversion

Numeric integration to obtain the CDF $F(x)$: Gauss-Lobatto integration
Numerical approximation of $F^{-1}(u)$: Newton interpolation
Decomposition into many subintervals: indexed search to find the subinterval
Important synergies: We use the same subintervals for integration and interpolation.


Numerical integration

As quadrature rule we use: Gauss-Lobatto(5)

(similar to Gauss-Legendre(4) but uses the interval endpoints as nodes)

Error bound for Lobatto(5) for an interval of length $\Delta$:

$\text{integration error} \le 7.03 \cdot 10^{-10}\, \Delta^9 \max_\xi |f^{(8)}(\xi)|$

Thus: Numeric integration is no problem for a smooth PDF $f(x)$ with bounded 8th derivative.
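
For reference, the 5-point Gauss-Lobatto rule can be written in a few lines. The nodes on $[-1, 1]$ are $0, \pm\sqrt{3/7}, \pm 1$ with weights $32/45$, $49/90$ and $1/10$; the function name below is an illustrative choice.

```python
import math

def lobatto5(f, a, b):
    """Integrate f over [a, b] with the 5-point Gauss-Lobatto rule.

    The rule uses both interval endpoints as nodes and is exact for
    polynomials up to degree 7.
    """
    mid = 0.5 * (a + b)
    half = 0.5 * (b - a)
    t = half * math.sqrt(3.0 / 7.0)
    return half * ((f(a) + f(b)) / 10.0
                   + 49.0 / 90.0 * (f(mid - t) + f(mid + t))
                   + 32.0 / 45.0 * f(mid))

# Example: integral of exp over an interval of length 0.5
approx = lobatto5(math.exp, 0.0, 0.5)
exact = math.exp(0.5) - 1.0
```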


CDF

Start with a point $x_0$ in the far left tail and select subinterval borders $x_i$.
Calculate $\int_{x_i}^{x_{i+1}} f(x)\,dx$ with Gauss-Lobatto(5).
Store the CDF values of $x_i$.

To evaluate the CDF at $x$:

Find the maximal $i$ with $x_i \le x$.
Use Gauss-Lobatto(5) to evaluate $\int_{x_i}^{x} f(t)\,dt$.
Return $F(x) = F(x_i) + \int_{x_i}^{x} f(t)\,dt$.
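
The steps above can be sketched as a table of CDF values plus one extra quadrature per evaluation. In this illustration the subinterval borders are a fixed equidistant grid on $[-8, 8]$ for the standard normal density, which is an assumption made for simplicity; the talk selects the borders adaptively.

```python
import bisect
import math

def lobatto5(f, a, b):
    """5-point Gauss-Lobatto rule on [a, b]."""
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    t = half * math.sqrt(3.0 / 7.0)
    return half * ((f(a) + f(b)) / 10.0
                   + 49.0 / 90.0 * (f(mid - t) + f(mid + t))
                   + 32.0 / 45.0 * f(mid))

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def build_cdf_table(pdf, borders):
    """Store the CDF value at every border, starting with 0 at borders[0]."""
    cdf_values = [0.0]
    for left, right in zip(borders, borders[1:]):
        cdf_values.append(cdf_values[-1] + lobatto5(pdf, left, right))
    return cdf_values

def evaluate_cdf(pdf, borders, cdf_values, x):
    """Evaluate the CDF at x, assuming borders[0] <= x <= borders[-1]."""
    # Largest i with borders[i] <= x, clamped to a valid subinterval index.
    i = bisect.bisect_right(borders, x) - 1
    i = max(0, min(i, len(borders) - 2))
    return cdf_values[i] + lobatto5(pdf, borders[i], x)

borders = [-8.0 + 0.25 * k for k in range(65)]
cdf_values = build_cdf_table(normal_pdf, borders)
```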


Lobatto(5) Integration error on (0, x)

[Figure: two panels showing the Lobatto(5) integration error on (0, x) for x from 0.0 to 0.5. Left panel: PDF gamma(2), $f^{(8)}$ bounded, error axis from −8e−12 to 0. Right panel: PDF gamma(1.5), $f^{(8)}$ unbounded, error axis from −0.0015 to 0.]


Polynomial Interpolation

Idea: Approximate a function $g(x)$ on some interval $[x_0, x_n]$ by a polynomial $p(x)$ such that

$g(x_i) = p(x_i)$ for $x_0 < x_1 < \cdots < x_n$ and $i = 0, \ldots, n$.

Newton interpolation uses a numerically stable representation of the polynomial.

Note that we use $x_0$ and $x_n$ as borders of the subintervals to avoid discontinuities.


Newton Interpolation

For smooth functions the error of the Newton interpolation is:

$|g(x) - p(x)| = \left| \frac{g^{(n+1)}(\xi)}{(n+1)!} \prod_{i=0}^{n} (x - x_i) \right|$ for $x$ and $\xi$ in $(x_0, x_n)$.

Thus the approximation error is $O((x_n - x_0)^{n+1})$ and can be made as small as desired by using shorter subintervals.
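
As a reminder of how the Newton form is computed and evaluated, here is a short sketch with divided differences and a Horner-like evaluation. The function names are illustrative.

```python
def newton_coefficients(xs, ys):
    """Divided-difference coefficients of the interpolating polynomial."""
    coef = list(ys)
    n = len(xs)
    for level in range(1, n):
        # Work backwards so that lower-order differences are still available.
        for i in range(n - 1, level - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - level])
    return coef

def newton_eval(xs, coef, x):
    """Evaluate c0 + (x - x0) * (c1 + (x - x1) * (c2 + ...))."""
    result = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[i]) + coef[i]
    return result

# A cubic is reproduced exactly by interpolation at four nodes.
nodes = [0.0, 0.5, 1.5, 2.0]
cubic = lambda x: x**3 - 2.0 * x + 1.0
coef = newton_coefficients(nodes, [cubic(x) for x in nodes])
```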


Interpolation of the Inverse CDF

We can evaluate the CDF $F(x)$ using numerical integration.
Thus it is easy to specify the interval and construction points $x_i$ and $u_i = F(x_i)$.
Use the pairs $(u_i, x_i)$ for Newton interpolation of $F^{-1}$.

We have two alternatives for point selection:

Use the “rescaled” Chebyshev points to select $x_i$.
Advantage: simple.
Disadvantage: interpolation may be poor in the tails.

Use the “rescaled” Chebyshev points for $u_i$.
Advantage: optimal interpolation.
Disadvantage: numerical inversion in set-up.
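
One way to read “rescaled” Chebyshev points is as the roots of the Chebyshev polynomial of degree $n+1$, stretched so that the outermost roots coincide with the interval borders. The sketch below follows that reading, which is an assumption on my part; the closed form with sines is obtained from $\cos((2k+1)\varphi)/\cos\varphi$ with $\varphi = \pi/(2(n+1))$.

```python
import math

def rescaled_chebyshev_points(a, b, n):
    """Return n + 1 points in [a, b] whose first and last points are a and b.

    The points are the Chebyshev roots of degree n + 1, divided by
    cos(phi) so that the extreme roots land on the interval borders.
    """
    phi = math.pi / (2 * (n + 1))
    points = [a + (b - a) * math.sin(k * phi) * math.sin((k + 1) * phi) / math.cos(phi)
              for k in range(n + 1)]
    points[-1] = b  # remove the rounding error at the right border
    return points

points = rescaled_chebyshev_points(2.0, 3.0, 5)
```

The points are denser near the borders than equidistant points, which is what damps the oscillation of the interpolating polynomial there.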


$F^{-1}$: Normal distribution, equidistant $x_i$

[Figure: interpolation of $F^{-1}$ with n=5 for N(0,1) on (2,3), $x_i$ equidistant; horizontal axis u from 0.980 to 0.995, vertical axis x from 2.0 to 3.0.]


$F^{-1}$: Normal distribution, Chebyshev $x_i$

[Figure: interpolation of $F^{-1}$ with n=5 for N(0,1) on (2,3), $x_i$ Chebyshev; horizontal axis u from 0.980 to 0.995, vertical axis x from 2.0 to 3.0.]


$F^{-1}$: Normal distribution, equidistant $u_i$

[Figure: interpolation of $F^{-1}$ with n=5 for N(0,1) on (2,3), plot caption “ui Chebyshev”; horizontal axis u from 0.980 to 0.995, vertical axis x from 2.0 to 3.0.]


u-error: Normal distribution, equidistant xi

[Figure: u-error with n=5 for N(0,1) on (2,2.25), $x_i$ equidistant; horizontal axis u from 0.978 to 0.988, vertical axis ticks from −2.5e−07 to 5.0e−08.]


u-error: Normal distribution, Chebyshev xi

[Figure: u-error with n=5 for N(0,1) on (2,2.25), $x_i$ Chebyshev; horizontal axis u from 0.978 to 0.988, vertical axis ticks from −1.5e−07 to 1.0e−07.]


u-error: Normal distribution, equidistant ui

[Figure: u-error with n=5 for N(0,1) on (2,2.25), plot caption “ui Chebyshev”; horizontal axis u from 0.978 to 0.988, vertical axis ticks from −6e−08 to 6e−08.]


Error for interpolation of inverse CDF

For a Newton polynomial of order n the error bound for the x-error is:

$|F^{-1}(u) - F_a^{-1}(u)| \le \frac{\max_v |(F^{-1})^{(n+1)}(v)|}{(n+1)!} \prod_{i=0}^{n} |u - u_i|$

Thus the x-error depends on the n-th derivative of $1/f(x)$ and is large for the tails.

For short intervals: the maximal error is attained for $u \approx u_i$, the local extrema of $\prod_{i=0}^{n} (u - u_i)$.
We use:

$\text{u-error} \approx \max_i |u_i - F(F_a^{-1}(u_i))|$


Polynomial Inversion Algo. // Preprocessing

User input: PDF (need not integrate to 1), point $x_s$, maximal accepted u-error $\varepsilon_u$.

Assumption: PDF is smooth, support of PDF is an interval.
Otherwise: Decomposition into such intervals known.

Steps before constructing the subintervals:

Find $a < b$ with $f(a)$ and $f(b) < f(x_s) \cdot 10^{-8}$.
Calculate approximate $I_{(a,b)} = \int_a^b f(x)\,dx$ numerically.
Find cut-off values $a$, $b$ with tail areas smaller than $I_{(a,b)} \cdot \varepsilon_u \cdot 0.1$.
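
The first preprocessing step can be sketched as an outward search from $x_s$. The doubling strategy, the initial step length and the function name are assumptions made for this illustration; the talk does not say how the points are found.

```python
import math

def find_boundary(pdf, x_start, direction, rel_threshold=1e-8,
                  first_step=1.0, max_doublings=200):
    """Walk away from x_start until pdf(x) < rel_threshold * pdf(x_start).

    direction is +1 for the right boundary b and -1 for the left boundary a.
    The distance to x_start is doubled in every trial (hypothetical strategy).
    """
    target = rel_threshold * pdf(x_start)
    step = first_step
    for _ in range(max_doublings):
        x = x_start + direction * step
        if pdf(x) < target:
            return x
        step *= 2.0
    raise ValueError("no boundary found; the density may decay too slowly")

def unnormalized_normal_pdf(x):
    # The PDF need not integrate to 1, so the constant factor is omitted.
    return math.exp(-0.5 * x * x)

a = find_boundary(unnormalized_normal_pdf, 0.0, -1)
b = find_boundary(unnormalized_normal_pdf, 0.0, +1)
```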


The area below the density

Standard method for numerically integrating long intervals:

Halve the intervals until the total integral no longer changes.

Gauss-Lobatto is well suited, as just 6 new points have to be evaluated for both new sub-intervals.

(Gauss-Legendre(4) requires 8 new points.)

It is easy to implement the above idea recursively. Thus only sub-intervals that are not yet integrated accurately are split.
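
A recursive version of this idea might look as follows. It is a simplified sketch: the absolute tolerance, the depth limit and the names are assumptions, and for brevity it re-evaluates the shared nodes instead of reusing them, so it does not achieve the "6 new points" economy mentioned above.

```python
import math

def lobatto5(f, a, b):
    """5-point Gauss-Lobatto rule on [a, b]."""
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    t = half * math.sqrt(3.0 / 7.0)
    return half * ((f(a) + f(b)) / 10.0
                   + 49.0 / 90.0 * (f(mid - t) + f(mid + t))
                   + 32.0 / 45.0 * f(mid))

def adaptive_lobatto5(f, a, b, tol=1e-12, whole=None, depth=0):
    """Split [a, b] recursively until halving no longer changes the result."""
    if whole is None:
        whole = lobatto5(f, a, b)
    mid = 0.5 * (a + b)
    left, right = lobatto5(f, a, mid), lobatto5(f, mid, b)
    # Accept when both halves together agree with the unsplit estimate.
    if abs(left + right - whole) <= tol or depth >= 40:
        return left + right
    return (adaptive_lobatto5(f, a, mid, tol, left, depth + 1)
            + adaptive_lobatto5(f, mid, b, tol, right, depth + 1))

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

area = adaptive_lobatto5(normal_pdf, -8.0, 8.0)
```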


Cut-off Values

The “local concavity” $lc_f(x)$ of a density is defined as:

$lc_f(x) = 1 - \frac{f''(x)\, f(x)}{f'(x)^2}$

Assuming that $lc_f(x)$ is constant in the far tails of the distribution we get the approximate tail integral:

$T(x) \approx \frac{f(x)^2}{f'(x)\,(1 + lc_f(x))}$

The first and the second derivative are estimated using centered differences. Then $T(x) = \varepsilon_u \cdot 0.1$ can be solved by a root-finding algorithm.
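
The tail approximation can be tried out numerically. In this sketch the derivatives are estimated with centered differences of width `h`, and the absolute value of $f'(x)$ is used so that the same code gives a positive area in the right tail, where $f'(x) < 0$; the step width, the bisection search and all names are assumptions.

```python
import math

def local_concavity_and_tail(pdf, x, h=1e-3):
    """Return (lc_f(x), T(x)) using centered differences for f' and f''."""
    f0 = pdf(x)
    d1 = (pdf(x + h) - pdf(x - h)) / (2.0 * h)
    d2 = (pdf(x + h) - 2.0 * f0 + pdf(x - h)) / (h * h)
    lc = 1.0 - d2 * f0 / (d1 * d1)
    tail = f0 * f0 / (abs(d1) * (1.0 + lc))
    return lc, tail

def find_right_cutoff(pdf, x_lo, x_hi, target, iterations=60):
    """Bisection for T(x) = target, assuming T is decreasing on [x_lo, x_hi]."""
    for _ in range(iterations):
        mid = 0.5 * (x_lo + x_hi)
        if local_concavity_and_tail(pdf, mid)[1] > target:
            x_lo = mid
        else:
            x_hi = mid
    return 0.5 * (x_lo + x_hi)

def exp_pdf(x):
    return math.exp(-x)

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Cut-off for the exponential distribution with eps_u = 1e-10.
cutoff = find_right_cutoff(exp_pdf, 1.0, 50.0, target=0.1 * 1e-10)
```

For the exponential density the local concavity is 0 and the approximation reproduces the exact tail area $e^{-x}$; for the normal density it is only approximate, since $lc_f$ is not constant there.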


Polynomial Inversion Algorithm // Set-up

[Function Sub-Interval ($x_0$, $x_n$)]

Select the $x_i$ for $i = 1, 2, \ldots, n$.
Using n Lobatto(5) integrations calculate the $u_i = F(x_i)$ for i = 1 to n.
Calculate the coefficients of the Newton interpolation for the pairs $(u_i, x_i)$, $i = 0, 1, \ldots, n$.
Calculate the control points for the u-error.
If the u-error at all control points is a bit smaller than $\varepsilon_u$, return $x_n$;
otherwise retry with a smaller/larger value for $x_n$.


Polynomial Inversion Algorithm // Set-up Cont.

Use $a$, $b$ and $\varepsilon_u = I \cdot \varepsilon_u$ from the preprocessing.

[Loop]
Set $x_0 \leftarrow a$, step $\leftarrow (b - a)/128$, $j \leftarrow 0$.
Repeat while $x < b$:
    $x \leftarrow$ Sub-Interval($x_j$, $x_j$ + step)
    Set step $\leftarrow x - x_j$, $j \leftarrow j + 1$, $x_j \leftarrow x$.
Store coefficients of the Newton polynomials for all subintervals.
Store cumulative probabilities of the subintervals.
Make guide table.
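
The guide table mentioned in the last step is the data structure behind indexed search. The following sketch shows one common way to build and use such a table; the table size and the names are illustrative, and `cum_probs` is assumed to hold the cumulative probabilities at the right borders of the subintervals, ending with 1.0.

```python
import bisect
import random

def make_guide_table(cum_probs, size):
    """guide[k] is the first subinterval whose right border exceeds k / size."""
    guide, j = [], 0
    for k in range(size):
        while cum_probs[j] <= k / size:
            j += 1
        guide.append(j)
    return guide

def indexed_search(cum_probs, guide, u):
    """Return the smallest j with u < cum_probs[j], for 0 <= u < 1."""
    j = guide[min(int(u * len(guide)), len(guide) - 1)]
    while j > 0 and cum_probs[j - 1] > u:  # guard against rounding in u * size
        j -= 1
    while cum_probs[j] <= u:  # short sequential search from the guide entry
        j += 1
    return j

cum_probs = [0.1, 0.15, 0.6, 0.61, 1.0]
guide = make_guide_table(cum_probs, 8)
index = indexed_search(cum_probs, guide, 0.605)
```

With a table size proportional to the number of subintervals, the expected number of comparisons per lookup stays bounded, which is why the sampling step is fast.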


Polynomial Inversion Algorithm // Sampling

[Generator]

Generate $U \sim U(0,1)$.
Use indexed search to calculate index $J$ of the corresponding subinterval.
Evaluate the Newton polynomial of interval $J$ at $U$ to calculate $X$.
Return $X$.


Checking the Approximation

For an inversion algorithm a statistical test is not necessary.

We need a (possibly slow) implementation of the CDF (not using any parts of our program!).

Check the u-error for many values of u in (0, 1).

Values close to 0 and 1 are especially important to check.

It is also important to check that the algorithm returns a and b for u = 0 and u = 1.
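
Such a check needs only an independent CDF and a grid of u-values that is dense near 0 and 1. The sketch below is one possible form; the grid layout, the names and the artificially perturbed inverse used as a test subject are assumptions.

```python
import math

def tail_heavy_grid(decades=12):
    """u-values in (0, 1): a regular grid plus points approaching 0 and 1."""
    grid = [k / 1000.0 for k in range(1, 1000)]
    for d in range(3, decades + 1):
        for m in range(1, 10):
            t = m * 10.0 ** (-d)
            grid.append(t)
            grid.append(1.0 - t)
    return grid

def max_u_error(approx_inverse, cdf, u_values):
    """Largest observed |u - F(F_a^{-1}(u))| over the given u-values."""
    return max(abs(u - cdf(approx_inverse(u))) for u in u_values)

def reference_cdf(x):
    """Independent CDF implementation (standard exponential distribution)."""
    return -math.expm1(-x)

def perturbed_inverse(u):
    """Hypothetical approximate inverse: exact inverse with a relative error of 1e-9."""
    return -math.log1p(-u) * (1.0 + 1e-9)

observed = max_u_error(perturbed_inverse, reference_cdf, tail_heavy_grid())
```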


Test Results

We checked our implementation for:
Normal, Gamma, and t distributions
Many different parameters
For n = 3 to 11.
For $\varepsilon_u = 10^{-6}, 10^{-8}, 10^{-10}, 10^{-12}$.
For $3 \cdot 10^6$ different u-values, many of them in the tails.

Results:
For $\varepsilon_u$ not smaller than $10^{-10}$ the maximal observed u-error was always below $\varepsilon_u$.

For $10^{-12}$ some moderately larger u-errors were observed. Probably the error is too close to machine precision.


u-error for normal distribution

[Figure: observed u-error (axis label “uerror”) plotted against u (axis label “uval”) over (0, 1); vertical axis from 0 to 8e−11; plot title “ue=1.e−10, n=5”.]

$\varepsilon_u = 10^{-10}$, $n = 5$.

Number of Intervals: Linear Interpolation

distribution     εu = 10^-6   εu = 10^-8   εu = 10^-10   εu = 10^-12

Linear Interpolation
Normal           1063         11533        117875        -
Cauchy           1849         17491        185335        -
Exponential      1012         10406        101959        -
Gamma(5)         1072         11225        109336        -
Gamma(1/2)       1546         15432        154291        -
Beta(2,2)        823          8009         88179         -
Beta(0.3,3)      1884         18783        187786        -

$100^{1/2} = 10$ (approximate growth factor of the number of intervals when $\varepsilon_u$ is reduced by a factor of 100)


Number of Intervals: Newton Interpolation

distribution     εu = 10^-6   εu = 10^-8   εu = 10^-10   εu = 10^-12

n = 3
Normal           58           171          521           1616
Cauchy           103          281          820           2512
Exponential      41           121          371           1145
Gamma(5)         62           177          530           1636

$100^{1/4} = 3.16$


Number of Intervals: Newton Interpolation

distribution           εu = 10^-6   εu = 10^-8   εu = 10^-10   εu = 10^-12

n = 5
Normal                 31           61           123           253
Cauchy                 56           103          197           388
Exponential            20           38           75            156
Gamma(5)               33           73           135           265
Gamma(1.6)             29           57           118           240
Hyperbolic(ζ = 1)      33           65           130           267
Hyperbolic(ζ = 0.1)    35           68           136           281

$100^{1/6} = 2.15$


New Algorithm // Advantages

Requires only the (unnormed) PDF and no CDF.
Accuracy can be easily checked. We can thus use the maximal accepted approximation error $\varepsilon_u$ as parameter of the algorithm.

There is no need to compute $F^{-1}(u)$ for any u.
Only a moderate number of subintervals is required.
Very fast.


New Algorithm // Disadvantages

Have to cut off the tails of distributions with unbounded domains.

We can only control the “u-error”, i.e., $\varepsilon_u(u) = |u - F(F_a^{-1}(u))|$.
The “x-error”, i.e., $\varepsilon_x(u) = |F^{-1}(u) - F_a^{-1}(u)|$, can be quite large in the tails.
Performance depends on the floating point numbers used.
For future new standards higher order polynomials may be necessary.

Note: The floating point standard has not changed for 20 years.
