Product Autoregressive Models with Log-Laplace and Double...

CHAPTER 6

Product Autoregressive Models with

Log-Laplace and Double Pareto

Log-normal Marginal Distributions

6.1 Introduction

The relationships between Laplace and log-Laplace distributions and double Pareto log-

normal and normal Laplace distributions are analogous to the relationship between the

Normal and lognormal distributions. These models appeared in the statistical, economic

as well as science literature over the past seventy years. Most often they appeared as

models for data sets with particular properties or were derived as the most natural mod-

els based on the properties of the studied processes. Kozubowski and Podgorski (2003)

review many uses of the log-Laplace distribution. The asymmetric log-Laplace distribution

Some results included in this chapter form part of the paper Jose and Manu (2012).

112

CHAPTER 6. PRODUCT AUTOREGRESSIVE MODELS WITH LOG-LAPLACE AND DOUBLE PARETO LOG-NORMAL

MARGINAL DISTRIBUTIONS

has been fit to pharmacokinetic and particle size data. Particle size studies often show

the log size to follow a tent-shaped distribution like the Laplace, see Julia and Vives-Rego

(2005) for more details. It has been used to model growth rates, stock prices, annual gross

domestic production, interest and forex rates. Some explanation for the goodness of fit

of the Log-Laplace has been suggested because of its relationship to Brownian motion

stopped at a random exponential time.

In this chapter we consider log-Laplace distributions, double Pareto lognormal distri-

butions and their multivariate extensions along with applications in time series modelling

using product autoregression. Various divisibility properties like infinite divisibility and ge-

ometric infinite divisibility are studied. Multiplicative infinite divisibility and geometric mul-

tiplicative infinite divisibility are introduced and studied. Additive autoregressive models

with these marginal distribution are also developed. The generation of the process, sam-

ple path properties and estimation of parameters are considered.

6.2 The log-Laplace distribution and its properties

Symmetric and asymmetric forms of log-Laplace distribution were used for modelling var-

ious phenomena by a number of researchers. Inoue (1978) derived the symmetric log-

Laplace distribution from his stochastic model for income distribution, fitted it to income

data by maximum likelihood and reported a better fit than that of a lognormal model tra-

ditionally used in this area. Uppuluri (1981) obtained an axiomatic characterization of this

distribution and derived the distribution from a set of properties about the dose-response

curve for radiation carcinogenesis. Barndorff-Nielsen (1977) and Bagnold and Barndorff-

Nielsen (1980) proposed the log-hyperbolic models, of which log-Laplace is a limiting case

for particle size data. Log-Laplace models have been recently proposed for growth rates of

diverse processes such as annual gross domestic product, stock prices, interest or foreign

currency exchange rates, company sizes, and other processes. Log-Laplace distributions

are mixtures of lognormal distributions and have asymptotically linear tails. These two

features makes them particularly suitable for modelling size data.

113



Figure 6.1: (a) β = 1.3, δ = 1 (b) β = 1, δ = 1

Definition 6.2.1. A random variable Y is said to have a log-Laplace distribution with

parameters δ > 0, α > 0, and β > 0 (LL(δ, α, β)) if its probability density function

(p.d.f.) is

g(y) =1

δ

αβ

α + β

(yδ

)β−1for 0 < y < δ(

δy

)α+1

for y ≥ δ. (6.2.1)

The cumulative density function has the form

G(y) =

0 for x < 0

αβα+β

(yδ

)βfor 0 < y ≥ δ

1− βα+β

(δy

)αfor y ≥ δ

. (6.2.2)

This distribution can be derived by combining the two power laws and has power tails at

zero and at infinity. This density has a distinct ‘tent’ shape when plotted on the log-log

scale. The graphs of log-Laplace distribution for fixed β and for various values of α are

given in Figure 6.1

The log-Laplace pdf (6.2.1) can be derived as the distribution of eX where X is an

114



asymmetric Laplace (AL) variable with density

f(x) =αβ

α + β

exp(−α(x− θ)) for x ≥ θ

exp(β(x− θ)) for x < θ. (6.2.3)

Therefore if X has an AL distribution given by (6.2.3), then the density of Y = eX is given

by (6.2.1) with δ = eθ.

Kozubowski and Podgorski (2003) studied some important properties of LL(δ, α, β). It

has Pareto-type tails at zero and infinity, that is

P (Y > x) ∼ C1x−α as x→∞ and

P (0 < Y ≤ x) ∼ C2xβ as x→ 0+.

It also possesses invariance property with respect to scaling and exponentiation which

is natural property of variables describing multiplicative processes such as growth. The

distribution has a representation as an exponential growth-decay process over random

exponential time which extends a similar property of the Pareto distribution by allowing

decay in addition to growth. Its simplicity allows for efficient practical applications and

thus gives an advantage over many other models for heavy power tails, such as stable or

geometric stable laws. The upper tail index is not bounded from above which adds flexibility

over some other models for heavy tail data such as stable or geometric stable laws where

its value is limited by two. Maximum entropy property of LL distribution is desirable in

many applications. Stability with respect to geometric multiplication which may play a

fundamental role in modelling growth rates. Limiting distribution of geometric products

of LL random variables leads to useful approximations. Its straightforward extension to

the multivariate setting allows modelling of correlated multivariate rate data, such as joint

returns on portfolios of securities.

115



Let Y d= LL(δ, α, β), and let c > 0, r 6= 0, then cY d

= LL(cδ, α, β),

Y r d=

LL(δr, α/r, β/r) for r > 0

LL(δr, β/|r|, α/|r|) for r < 0.

In particular, if Y d=LL(δ, α, β) then 1

Y

d=LL(1/δ, β, α), and if α = β, we have the

reciprocal property1

Yd= Y.

The characteristic function of the LL(1, α, β) random variable Y has the form

E(eitY ) =α

α + βM(β, β + 1, it) +

αβ

α + βtα[C(t,−α) + iS(t,−α)], (6.2.4)

whereM(a, b, z) = 1+∞∑n=1

(a)nzn

(b)nn!, for a > 0, b > 0, z ∈ C, (a)n = a(a+1) · · · (a+n−1)

is the confluent hypergeometric function and C(x, a) =

∫ ∞x

ta−1 cos tdt and

S(x, a) =

∫ ∞x

ta−1 sin tdt are the generalized Fresnel integrals.

LL distributions are heavy tailed and some moments do not exist.The mean and the

variance are finite only if α > 1 and α > 2 respectively. Due to reciprocal properties of

these laws, the harmonic mean is of the same form as the reciprocal of the mean. We also

note that LL distributions are unimodal with the mode at δ when β > 1 and the mode at

zero when β < 1.

Mean = δαβ

(α− 1)(β + 1), α > 1

rth moment = δrαβ

(α− r)(β + r),−β < r < α

Variance = δ2

(αβ

(α− 2)(β + 2)−[

αβ

(α− 1)(β + 1)

]2), α > 2.

Log-Laplace distributions can be represented in terms of other well-known distribu-

116



tions, including the lognormal, exponential, uniform, Pareto and beta distributions. The

log-Laplace distribution LL(δ, α, β) can be viewed as a lognormal distribution LN(µ, σ),

where the parameters µ and σ are random. More specifically, the variable Y ∼ LL(δ, α, β)

has the representation

Yd= eµRσ,

where R is standard lognormal random variable,

µ = log δ +

(1

α− 1

β

)E and

σ =

√2E

αβ,

where E is a standard exponential variable independent of R.

As a direct consequence of the fact that a skew Laplace variable arises as a difference

of two independent exponential variables. We have

Yd= δe

1αE1− 1

βE2 ,

where E1 and E2 are two i.i.d. standard exponential variables. Let U1 and U2 be indepen-

dent random variables distributed uniformly on [0, 1]. Then we have

Yd= δ

U1/β1

U1/α2

.

LL random variable can also be represented as the ratio of two Pareto random variables.

Yd= δ

P1

P2

,

where P1 and P2 are independent Pareto random variables with parameters α and β re-

spectively, for more details, see Kozubowski and Podgorski (2003).

117



Entropy, basic concept in information theory is a measure of uncertainty associated

with the distribution of a random variable Y is defined as

H(Y ) = E[− log f(Y )].

It has found applications in a variety of fields, including statistical mechanics, queuing

theory, stock market analysis, image analysis and reliability. The entropy of Y ∼ LL(δ, α, β)

is given by

H(Y ) = 1 + log δ +1

α− 1

β+ log

(1

α+

1

β

).

For an AL random variable X , entropy is given by

H(X) = log

(1

α+

1

β

).

The entropy is maximized for an AL distribution and hence the same property holds for LL

distribution also. Jose and Naik (2008) introduced an asymmetric pathway distributions

and showed that the model maximizes various entropies.

The estimation of parameters of the log-Laplace distribution is given by Hinkley and

Revankar (1977). They given Fisher information matrix of the LL random variable. The

maximum likelihood estimates of the parameters are given by Hartley and Revankar (1974).

They showed that these estimators are asymptotically normal and efficient.

6.2.1 Multivariate Extension

Let X = (X1, . . . , Xd)′ have a multivariate asymmetric Laplace distribution with character-

istic function

ψ(t) =

[1 +

1

2t′Σt− im′t

]−1

, (6.2.5)

where t′ denotes transpose of t,m ∈ Rd and Σ is a d× d non-negative definite symmetric

matrix. A d-dimensional log-Laplace variable can be defined as a random vector of the

118



form

Y = eX = (eX1 , . . . , eXd)′.

If Σ is positive-definite, then the distribution is d-dimensional and the corresponding density

function can be derived easily from that of the Laplace distribution, see Kotz et al. (2001).

g(Y) = f(log y)

(d∏i=1

yi

)−1

, y = (y1, . . . , yd) > 0,

where log y = (log y1, . . . , log yd) is defined componentwise and

f(x) =2ex′Σ−1m

(2π)d/2|Σ|1/2

(x′Σ−1x

2 + m′Σ−1m

)ν/2Kν

(√(2 + m′Σ−1m)(x′Σ−1x)

), x 6= 0

is the density of multivariate asymmetric Laplace distribution. Here ν = 1− d/2 and Kν is

the modified Bessel function of the third kind.

Similar to the univariate case, multivariate LL distributions also possesses the stability

and limiting properties with respect to geometric multiplication. Each component of a

multivariate LL random vector is univariate LL.

6.2.2 Divisibility properties

Klebanov et al. (1984) introduced geometric infinite divisibility (g.i.d.) and obtained several

characterizations in terms of characteristic functions. The class of g.i.d. distributions form

a subclass of infinitely divisible (i.d.) distributions and contain the class of distributions with

complete monotone derivative (c.m.d.). They also introduced and characterized the related

concept of geometric strict stability (g.s.s.) for real valued random variables. The exponen-

tial and geometric distributions are examples of distributions that possess the g.i.d. and

the g.s.s. properties. Mittag-Leffler distributions, Laplace distributions etc are g.i.d., see

Pillai (1990), Pillai and Sandhya (1990). Fujitha (1993) constructed a larger class of g.i.d.

distributions with support on the non-negative half-line. Bondesson (1979) and Shanbhag

and Sreehari (1977) have established the self-decomposability of many of the most com-

119



monly occurring distributions in practice. Kozubowski and Podgorski (2010) introduced a

notion of random self-decomposability and discussed its relation to the concepts of self-

decomposability and g.i.d.. The notions of infinite divisibility (i.d.) and geometric infinite

divisibility (g.i.d.) play a fundamental role in the study of central limit theorem and Levy

processes. Variables appearing in many applications in various sciences can often be

represented as sums of larger number of tiny variables, often i.i.d. The theory of infinite

divisible distributions developed primarily during the period from 1920 to 1950.

Remark 6.2.1. AL distributions are i.d..

Remark 6.2.2. LL distributions are not i.d..

Definition 6.2.2. A random variable X and its probability distribution is said to be

g.i.d. if for any p ∈ (0, 1) it satisfies the relation

Xd=

νp∑i=1

X(i)p ,

where νp is a geometric random variable with mean 1/p, the random variables X(i)p are

i.i.d. for each p, and νp and (X(i)p ) are independently distributed, see Klebanov et al.

(1984).

Characterization of g.i.d.. A random variable X is g.i.d. if and only if (iff)

ϕX(t) =1

1 + ψ(t),

where ψ(t) is a non-negative function with complete monotone derivative (c.m.d.) and

ψ(0) = 0.

Now we consider the divisibility properties with respect to multiplication. Kozubowski

and Podgorski (2003) discussed the multiplicative divisibility and geometric divisibility.

There is no further developments on this area in the literature.

120



Definition 6.2.3. A random variable Y is said to be multiplicative infinitely divisible

(m.i.d.) if it has the representation

Yd=

n∏i=1

Yi, n = 1, 2, 3, . . . ,

for some i.i.d. random variables Yi.

Theorem 6.2.1. LL distributions are m.i.d..

Proof. Let Y ∼LL. We have to prove that

Yd=

n∏i=1

Yi,

where Yi’s are i.i.d. random variables.

Taking logarithm on both sides we get log Y =∑n

i=1 log Yi. Since we know that

X = log Y ∼AL distributions, we need only to prove that X is i.d.. AL distributions are i.d..

Therefore it follows that LL distributions are m.i.d..

Definition 6.2.4. A random variable Y is said to be geometric multiplicative infinitely

divisible (g.m.i.d.) if for any p ∈ (0, 1), it satisfies the relation

Yd=

νp∏i=1

Y (i)p ,

where νp is a geometric random variable with mean 1/p, the random variables Y(i)p are

i.i.d. for each p, and νp and (Y(i)p ) are independently distributed.

Characterization of g.i.d.. A random variable X is g.m.i.d. if and only if

ϕlogX(t) =1

1 + ψ(t),

121



where ψ(t) is a non-negative function with complete monotone derivative (c.m.d.) and

ψ(0) = 0.

Theorem 6.2.2. LL distributions are g.m.i.d..

The log-Laplace laws arise as limits of the products Y1Y2 . . . Yνp of i.i.d. random vari-

ables with geometric number of terms. As Laplace distributions are limits of sums of ran-

dom variables X1 +X2 + · · ·+Xνp with a geometric number of terms.

6.2.3 Product Autoregression

McKenzie (1982) introduced a product autoregression structure. A product autoregression

structure of order one (PAR(1)) has the form

Yn = Y an−1εn, 0 < a ≤ 1, n = 0,±1,±2, . . . , (6.2.6)

where {Yn} and {εn} are sequence of positive random variables and they are indepen-

dently distributed. In the usual non-linear autoregressive models, we have an additive

noise. But in product autoregressive models, we have a non-additive noise. We may

determine the correlation structure as follows.

Yn = Y an−1εn

=

{k−1∏i=0

εai

n−i

}Y ak

n−k.

Assuming stationarity,

E (YnYn−k) =

{k−1∏i=0

E(εa

i)}

E(Y ak

n−k

).

122



From (6.2.6), E(Y s) = E(Y as)E(εs) and therefore

E (YnYn−k) =k−1∏i=0

{E(Y ai)

E(Y ai+1)

}E(Y ak+1)

=E(Y )E(Y ak+1)

E(Y ak). (6.2.7)

Now consider the autocorrelation function ρY (k) = Corr(Yn, Yn−k), when Y has a

log-Laplace distribution.

ρY (k) =(α− 2)(β + 2)

(α− s− 1)(β + s+ 1)

×[

(α− s)(β + s)(α− 1)(β + 1)− αβ(α− s− 1)(β + s+ 1)

(α− 1)2(β + 1)2 − αβ(α− 2)(β + 2)

], α > 2.(6.2.8)

The usual additive first order autoregressive model is given by

Xn = aXn−1 + εn, 0 < a < 1, n = 0,±1,±2, . . . , (6.2.9)

where {εn} is the innovation sequence of independent and identically distributed random

variables. Its autocorrelation function is given by ρX(k) = ak, k = 0,±1,±2, . . .. From

(6.2.8), it is clear that the correlation structure is not preserved in the case of log-Laplace

distribution. It is well known that the correlation structure is not preserved in going from

the lognormal to the normal distributions. McKenzie (1982) showed that the gamma distri-

bution is the only one for which the PAR(1) process has the Markov correlation structure.

6.2.4 Self-decomposability

The basic problem in time series analysis is to find the distribution of {εn}. The class

of s.d. distributions form a subset of the class of infinitely divisible distributions and they

include the stable distributions as proper subset. A number of authors have examined the

L-class in detail and many of its members are now well known.

Definition 6.2.5. (Kozubowski and Podgorski (2010)) A distribution with charac-

123



teristic function ϕ is randomly self-decomposable (r.s.d.) if for each p, c ∈ [0, 1]

there exists a probability distribution with characteristic function ϕc,p satisfying ϕ(t) =

ϕc,p(t)[p+ (1− p)ϕ(ct)].

Definition 6.2.6. A characteristic function ϕ is multiplicative self-decomposable (m.s.d.)

if for every 0 < a < 1, there exists a characteristic function ϕlog a such that ϕlogX(t) =

ϕlogX(at)ϕlog a(t), t ∈ R.

The distributions in the L-class are several which are the distributions of the natural

logarithms of random variables whose distributions are also self-decomposable. These

include the normal, the log gamma and the log F distributions. This phenomenon is very

interesting in a time series point of view because the logarithmic transformation is the

commonest of all transformations used in time series analysis.

6.2.5 Autoregressive model

If we take logarithms of Yn in (6.2.6), let Xn = log Yn, then the stationary process of {Xn}has the form

Xn = aXn−1 + ηn,where ηn = log εn, (6.2.10)

which has the form of linear additive autoregressive model of order one. Then we can

proceed as in the case of AR(1) processes. Now from (6.2.10), under the assumption of

stationarity, we can obtain the characteristic function of η as

φη(t) =φX(t)

φX(at). (6.2.11)

We know that X follows an AL distribution with characteristic function,

φ(t) =eiθt

(1 + 12σ2t2 − iµt)

, −∞ < t <∞, σ > 0, −∞ < µ <∞.

124



This characteristic function can be factored as

φ(t) = eiθt

(1

1 + i σ√2κt

)(1

1− i σ√2κt

), (6.2.12)

where κ > 0, µ = σ√2

(1κ− κ), see Kotz et al. (2001).

Then substituting (6.2.12) in (6.2.11), we get

φη(t) =eiθt

eiθat

(1 + i σ√2κat)

(1 + i σ√2κt)

(1− i σ√2κat)

(1− i σ√2κt)

= eiθ(1−a)t

[a+ (1− a)

1

(1 + i σ√2κt)

][a+ (1− a)

1

(1− i σ√2κt)

].(6.2.13)

This implies that η has a convolution structure of the following form.

ηd= U + V1 − V2, (6.2.14)

where U is a degenerate random variable taking value θ(1 − a) with probability one and

V1 and V2 are convolutions of T1 and T2 where,

T1 =

0, with probability a

E1, with probability 1− a,

T2 =

0, with probability a

E2, with probability 1− a,

where E1 and E2 are exponential random variables with means σ√2κ

and σ√2κ respectively.

6.2.6 Sample path properties

Sample path properties of the process are studied by generating 100 observations each

from the process with various parameter (θ, κ, σ) combinations. In Figure 6.2, we take

125



(a) (b)

Figure 6.2: (a) Sample path of the process for a=0.70 and (θ, κ, σ) = (0, 1, 1) (b)Sample path of the process for a=0.70 and (θ, κ, σ) =(0, 10, 10)

a = 0.7 and the values of (θ, κ, σ) as (0, 1, 1) and (0, 10, 10) respectively. In Figure 6.3,

we take a = 0.4 and the values of (θ, κ, σ) as (0, 1, 1) and (0, 2, 2) respectively. The

process exhibits both positive and negative values with upward as well as downward runs

as seen from the figures.

6.2.7 Estimation of parameters

The moments and cumulants of the sequence of innovations {ηn} can be obtained directly

from (6.3.6) as

E(ηn) = (1− a)

(θ +

σ√2

(1

κ− κ))

, Var(ηn) = (1− a2)

(σ2

2

(1

κ2+ κ2

))

and kn =

(n− 1)!(1− an)(

σ√2

)n (1κn− κn

)if n > 1 is odd

(n− 1)!(1− an)(

σ√2

)n (1κn

+ κn)

if n is even.

Since the mean and variance of AL distribution are

E(X) = θ +σ√2

(1

κ− κ)

and Var(X) =σ2

2

(1

κ2+ κ2

)

126



(a) (b)

Figure 6.3: (a) Sample path of the process for a=0.70 and (θ, κ, σ) = (0, 1, 1) (b)Sample path of the process for a=0.40 and (θ, κ, σ) =(0, 2, 2)

and the higher order cumulants are given by

kn =

(n− 1)!(

σ√2

)n (1κn− κn

)if n > 1 is odd

(n− 1)!(

σ√2

)n (1κn

+ κn)

if n is even.

From the cumulants the higher order moments can be obtained easily since k3 = µ3, k4 =

µ4 − 3µ22 and k5 = µ5 − 10µ2µ3. Hence the problem estimation of parameters of the

process can be tackled in a way similar to the method of moments.

6.2.8 Multivariate product autoregression

A multivariate product autoregression structure of order one (PAR(1)) has the form

Yn = Yan−1εn, 0 < a ≤ 1, n = 0,±1,±2, . . . , (6.2.15)

where {Xn} and {εn} are sequence of positive d-variate random vectors and they are

independently distributed.Here also we have a non-additive noise.

For further analysis, we can take logarithms of Yn in (6.2.15), let Xn = log Yn. Then

127



obtain a multivariate linear AR(1) model,

Xn = aXn−1 + ηn, 0 < a < 1, (6.2.16)

where Xn and innovations ηn = log εn are d- variate random vectors. Clearly we know

that Xn follows a multivariate asymmetric Laplace distribution having the characteristic

function given in (6.2.5). Then the characteristic function of ηn can be obtained from

ψη(t) =ψX(t)

ψX(at),

where

ψX(t) = E(exp it′X).

By inverting the characteristic function, we can obtain the density function of η.

6.3 The double Pareto lognormal distribution and its properties

We know that the logarithmic scaling in the normal distribution results in the lognormal dis-

tribution. Logarithmic scaling in the Laplace distribution leads to log-Laplace distribution,

which is also known as double-Pareto distribution. The double Pareto lognormal (DPLN)

distribution is an exponentiated version of normal Laplace random variable.

A double Pareto lognormal random variable can be defined using a geometric Brow-

nian motion (GBM): dX = µXdt + σXdw where X(t) denotes an individual’s income at

time t, µ and σ are mean drift and variance parameters and dw is white noise. Suppose

that the distribution of starting incomes, say X0(t) at time t is distributed lognormally, so

that lnX0 ∼ N(ν, τ 2). After T time units the state X(T ) will also be lognormally dis-

tributed so that lnX(T ) ∼ N(ν+(µ−σ2/2)T, τ 2 +σ2T ). To find the distribution of X , it is

easiest to work in the logarithmic scale. Let Y = lnX (so that Y is the state of an ordinary

Brownian motion after an exponentially distributed time). The distribution of Y can easily

be shown (see, Reed (2003)) to be that of the sum of independent random variables W

128



and Z, where Z follows a normal distribution with parameters ν and τ 2 and W follows an

asymmetric Laplace distribution with parameters λ1 and λ2. The p.d.f. of Y is given by

gY (y) =λ1λ2

λ1 + λ2

φ

(y − ντ

)[R

(λ1τ −

y − ντ

)+R

(λ2τ +

y − ντ

)], (6.3.1)

where R(z) = 1−Φ(z)φ(z)

is the Mills’ ratio where Φ(z) and φ(z) are the cumulative distribution

function (cdf) and the pdf of standard normal distribution. The distribution of Y is known

as Normal-Laplace distribution (NL(λ1, λ2, ν, τ2)).

A normal Laplace (NL(λ1, λ2, ν, τ2)) random variable has the characteristic function,

which is the product of the characteristic functions of its normal and Laplace components

(see, Reed and Jorgensen (2004)),

φY (t) =

[exp

(iνt− τ 2

2t2)](

λ1λ2

(λ1 − it)(λ2 + it)

)(6.3.2)

Reed and Jorgensen (2004) established that the normal Laplace distribution is infinitely

divisible. Expanding the cumulant generating function of normal-Laplace distribution, it is

easy to show that

E(Y ) = ν +1

λ1

− 1

λ2

and V ar(Y ) = τ 2 +1

λ21

+1

λ22

. (6.3.3)

The third and fourth cumulants are given by

κ3 =2

λ31

− 2

λ32

and κ4 =6

λ41

− 6

λ42

. (6.3.4)

Now the pdf of X can be easily obtained from (6.3.1). It can be expressed in terms of

Mills’ ratio as

f(x) =1

xg(lnx).

129



A random variable X is said to have a DPLN distribution if its pdf is

fX(x) =λ1λ2

λ1 + λ2

[exp

(λ1ν +

λ21τ

2

2

)x−λ1−1Φ

(lnx− ν − λ1τ

2

τ

)+xλ2−1 exp

(−λ2ν +

λ21τ

2

2

)Φc

(lnx− ν + λ2τ

2

τ

)],

where Φ is the cdf and Φc is the complementary cdf of N(0, 1). We can write X ∼DPLN(λ1, λ2, ν, τ

2) to denote a random variable follows double Pareto lognormal distribu-

tion.

The DPLN distribution and log-Laplace distribution share some properties. Similar to

the log-Laplace distributions, the DPLN distribution can be represented as a continuous

mixture of lognormal distributions with different variances. A DPLN(λ1, λ2, ν, τ2) random

variable can be expressed as

Xd= U

V1

V2

,

where U, V1 and V2 are independent. U is lognormally distributed and V1 and V2 are Pareto

random variables with parameters λ1 and λ2 respectively. We can express X also as

Xd= UQ,

where Q is the ratio of the Pareto random variables, known as double Pareto random

variable, which has the pdf

f(q) =

λ1λ2

λ1 + λ2

qλ2−1, for 0 < q ≤ 1

λ1λ2

λ1 + λ2

q−λ1−1, for q > 1.

The DPLN(λ1, λ2, ν, τ2) can be represented as a mixture of the form

f(x) =λ2

λ1 + λ2

f1(x) +λ1

λ1 + λ2

f2(x),

130



where

f1(x) = λ1x−λ1−1 exp

(λ1ν +

λ21τ

2

2

)Φ


2

τ

)and

f2(x) = λ2xλ2−1 exp

(−λ2ν +

λ21τ

2

2

)Φc

(lnx− ν + λ2τ

2

τ

),

which are the limiting forms when λ2 → ∞ and λ1 → ∞ of the DPLN(λ1, λ2, ν, τ2). For

details, see Reed Jorgensen (2004).

The cumulative distribution function of DPLN(λ1, λ2, ν, τ2) can be written as

F (x) = Φ

(lnx− ν

τ

)− 1

λ1 + λ2

[λ2x

−λ1 exp

(λ1ν +

λ21τ

2

2

)Φ


2

τ

)+ λ1x

λ2 exp

(−λ2ν +

λ21τ

2

2

)Φc

(lnx− ν + λ2τ

2

τ

)].

Like the log-Laplace distribution, This distribution also exhibits power law behavior in

both tails for values near zero and large positive values.

i.e.

f(x) ∼ k1x−λ1−1 when x→∞

f(x) ∼ k2xλ2−1 when x→ 0,

where k1 = λ1 exp(λ1ν +

λ21τ

2

2

)and k2 = λ2 exp

(−λ2ν +

λ21τ

2

2

).

The parameters of the DPLN distribution include a lognormal mean (ν) and variance

(τ 2) parameter which control the location and spread of the body of the distribution, and

power-law scaling exponents for the left (λ2) and right (λ1) tails. Special cases of the DPLN

include the right Pareto lognormal (RPLN) distribution, with a power law tail on the right

but not the left side (λ2 →∞); the left Pareto lognormal (LPLN) distribution, with a power-

law tail only near zero (λ1 → ∞); and the lognormal distribution, with no power-law tails

(λ1 →∞, λ2 →∞).

The DPLN(λ1, λ2, ν, τ2) pdf is unimodal if λ2 > 1 and is monotonically decreasing

131



when 0 < λ2 < 1. The moment generating function does not exist for a DPLN distribution.

The lower order moments about zero are given by

µ′r = E(Xr) =λ1λ2

(λ1 − r)(λ2 + r)exp

(rν +

r2τ 2

2

)for r < λ1.

µ′r does not exist for r ≥ λ1. The mean (for λ1 > 1) is

E(X) =λ1λ2

(λ1 − 1)(λ2 + 1)eν+ τ2

2

and the variance (for λ1 > 2) is

V ar(X) =λ1λ2e2ν+τ2

(λ1 − 1)2(λ2 + 1)2

[(λ1 − 1)2(λ2 + 1)2

(λ1 − 2)(λ2 + 2)eτ

2 − λ1λ2

].

The DPLN family of distributions is closed under power-law transformation. If

X ∼ DPLN(λ1, λ2, ν, τ2), then for constants a > 0, b > 0,W = aXb will also follow a

DPLN distribution. In other words, W ∼ DPLN(λ1/b, λ2/b, bν + ln a, b2τ 2).

Various applications of DPLN distribution are discussed by Reed and Jorgensen (2004).

Reed (2003) showed the distribution of income sizes at a point in time to be the product of

independent double Pareto and log normal components. This distribution fits the distribu-

tion of size of cities within a particular country very well, since it has the power-law behavior

in both tails (see, Reed (2002)). It also fits the grouped data on aeolian sand particle size

and diamond particle size given in Barndorff-Nielsen (1977). The similarity between bi-

ological and language evolution has attracted the interest of researchers familiarized to

analyze genetic properties in biological populations with the aim of describing problems of

linguistics. Languages continuously evolve, changing, for example, their lexicon, phonetic

and grammatical structure. This evolution is similar to the evolution of species driven by

mutations and natural selection. Schwammle et al. (2009) show that the distribution of

language sizes in terms of speaker population can be well modelled by the DPLN distribu-

132



tion. For the Internet, many studies have shown that traffic patterns in the Internet appear

to have self-similarity. This self-similarity can possibly be explained if the underlying distri-

bution of file sizes obeys an appropriate power law. Ramırez et al. (2008) assumed that

claim sizes in insurance contexts can be modelled as DPLN random variables.

Mitzenmacher (2004) introduced and analyzed a new dynamic generative user model

to explain the behavior of web file size distributions. The file sizes tend to have a lognormal

body but a Pareto tail. It is also showed that DPLN gives a good fit for the file size distribu-

tions. Seshadri et al. (2008) analyzed a massive social network gathered from the records

of a large mobile phone operator with more than a million users and tens of millions of

calls. They examined the distributions of the number of phone calls per customer, the total

talk minutes per customer and the distinct number of calling partners per customer and

found better fits using the DPLN distribution.

6.3.1 Product Autoregression

Let us consider the PAR(1) structure with DPLN marginal distribution. The autocorrelation

function ρX(k) = Corr(Xn, Xn−k) has the form,

ρX(k) =

eakτ2

(λ1−ak)(λ2+ak)(λ1−1)(λ2+1)(λ1−ak−1)(λ2+ak+1)

− λ1λ2

eτ2 (λ1−1)2(λ2+1)2

(λ1−2)(λ2+2)− λ1λ2

. (6.3.5)

From(6.3.5), it is clear that the correlation structure is not preserved in the case of DPLN

distribution also.

6.3.2 Autoregressive model

If we take logarithms of Yn in (6.2.6), let Xn = lnYn, then the stationary process of {Yn}has the form

Xn = aXn−1 + ηn,where ηn = ln εn, (6.3.6)

which has the form of linear additive autoregressive model of order one. Then we can pro-

ceed as in the case of AR(1) processes. Now from (6.3.6), we can obtain the characteristic

133



function of η as

φη(t) =φY (t)

φY (at).

We know Y has a normal Laplace distribution (NL(λ1, λ2, ν, τ2)). Then we can write the

above expression as

φη(t) =

[exp

(iν(1− a)t− τ 2(1− a2)

2t2)][

(λ1 − iat)(λ1 − it)

(λ2 + iat)

(λ2 + it)

]. (6.3.7)

Also

λ1 − iatλ1 − it

= a+ (1− a)λ1

λ1 − itλ2 − iatλ2 − it

= a+ (1− a)λ2

λ2 − it.

Gaver and Lewis (1980) obtained the innovation distribution of the EAR(1) process as the

exponentially tailed distribution with parameters a and λ1(ET(a, λ1)). Hence the innovation

sequence {ηn} can be treated as a sequence of random variables of the form

ηd= Z + E∗1 − E∗2 , (6.3.8)

where Z ∼ N(ν(1− a), τ 2(1− a2)), E∗1 ∼ ET (a, λ1) and E∗2 ∼ ET (a, λ2).

6.3.3 Estimation of parameters

Reed and Jorgensen (2004) discussed the method of moments and maximum likelihood

estimation procedures for the DPLN distribution. Assuming that the data is from the DPLN

distribution, one can obtain method of moments estimates (MMEs) of λ1, λ2, ν and τ 2 using

the first four moments of either the DPLN distribution or having the first log-transformed

data, of the NL distribution. The estimates are not the same. Use of the DPLN moments

however is not recommended, since population moments of order one or more do not exist.

To find MMEs of λ1 and λ2 using the NL distribution, one needs only to solve (6.3.4),

134



with κ3 and κ4 set to their sample equivalents. Estimates of ν and τ 2 can then be ob-

tained from (6.3.3). Reed and Jorgensen (2004) show that estimates can be obtained by

maximum likelihood as (6.3.4) has no real solution and recommend the use of the method

of moments only for finding starting values for iterative procedures for finding maximum

likelihood estimates. In their absence trial and error method can be used.

Unlike MMEs, maximum likelihood estimates (MLEs) are the same whether one fits

the DPLN to data x1, x2, . . . , xn or fits the NL to y1 = ln x1, . . . , yn = ln xn. The log

likelihood function is

L = n lnλ1 + n lnλ2 − n ln(λ1 + λ2)

+n∑i=1

φ

(yi − ντ

)+

n∑i=1

ln

[R

(λ1τ −

yi − ντ

)+R

(λ2τ +

yi − ντ

)].

This can be maximized analytically over ν to yield

ν = y − 1

λ1

+1

λ2

.

Then

L(λ1, λ2, τ) = lnλ1 + n lnλ2 − n ln(λ1 + λ2) +n∑i=1

φ

(yi − y − 1

λ1+ 1

λ2

τ

)

+n∑i=1

ln

[R

(λ1τ −

yi − y − 1λ1

+ 1λ2

τ

)+R

(λ2τ +

yi − y − 1λ1

+ 1λ2

τ

)].

This can be maximized numerically using the MMEs as starting values. Reed and Jor-

gensen (2004) also describe the EM algorithm which uses a likelihood function based on

augmented data as a stepping stone towards the maximization of the likelihood based on

the observed data.

Ramırez et al. (2008) described a method for carrying out Bayesian inference for

the DPLN distribution and have illustrated that this model can capture both the heavy

135



tail behavior and also the body of the distribution for real data examples. They applied the

Bayesian approach to inference for the DPLN|M|1 and M|DPLN|1 queueing systems where

the inter-arrival and service times respectively are modelled using the DPLN distribution.

Since the DPLN distribution does not possess a moment generating function or Laplace

transform in closed form as most of heavy-tailed distributions, it is impossible to apply the

usual queueing theory techniques to obtain the equilibrium distributions of the DPLN|M|1and M|DPLN|1 queueing systems using standard techniques. Ramırez et al. (2008) ap-

plied a direct approximation of the non-analytical Laplace transform using a variant of the

transform approximation method (TAM). They have applied TAM to estimate the Laplace

transform of the DPLN distribution and the waiting time distribution in the M|DPLN|1 sys-

tem. By combining TAM with Bayesian inference for the DPLN distribution, they obtained

numerical predictions of queueing properties such as the probability of congestion. They

have illustrated this methodology with real data sets, estimating first waiting times and con-

gestion in internet and computing the probability of ruin in the insurance context, making

use of the duality between queues and risk theory.

6.4 Conclusion

The log-Laplace distribution and its important properties and its extension to multivariate

case are studied. The double Pareto lognormal distributions also studied. Some divisi-

bility properties like infinite divisibility, geometric infinite disability and divisibility properties

with respect to multiplication, namely multiplicative infinite divisibility, geometric multiplica-

tive infinite divisibility properties are explored. Product autoregression structures with log-

Laplace and double Pareto lognormal maraginals are developed. Self-decomposability

property is studied. A linear AR(1) model is developed along with the sample path prop-

erties and the estimation of parameters of the process. A multivariate extension of the

product autoregression structure is also considered.

136



References

Bagnold, R.A., Barndorff-Nielsen, O. (1980). The pattern of natural size distribution,

Sedimentology 27 , 199–207.

Barndorff-Nielsen, O. (1977). Exponentially decreasing distributions for the logarithm of

particle size, Proceedings of the Royal Society of London A 353, 401–419.

Bondesson, L. (1979). A general result on infinite-divisibility, Annals of Probability 7,

965–979.

Fujitha, Y. (1993). A generalization of the results of Pillai, Annals of the Institute of Statis-

tical Mathematics 45, 361–365.

Hartley, M.J., Revankar, N.S. (1974). On the estimation of the Pareto law from underre-

ported data, Journal of Econometrics 2, 327–341.

Hinkley, D.V., Revankar, N.S. (1977). Estimation of the Pareto law from underreported

data, Journal Econometrics 5, 1–11.

Inoue, T. (1978). On Income Distribution: The Welfare Implications of the General Equilib-

rium Model, and the Stochastic Processes of Income Distribution Formation, Ph.D.

Thesis, University of Minnesota .

Jose, K.K., Naik, S.R. (2008). A class of asymmetric pathway distributions and an entropy

interpretation, Physica A: Statistical Mechanics and its Applications 387(28), 6943–

6951.

Jose, K.K., Manu, M.T. (2012). A Product Autoregressive Model with Log-Laplace Marginal

Distribution, Statistica, anno LXXII, n.3, 312–327.

Julia, O., Vives-rego, J. (2005). Skew-Laplace distribution in Gram-negative bacterial

axenic cultures: new insights into intrinsic cellular heterogeneity, Microbiology 151,

749-755.

137



Klebanov, L.B., Maniya, G.M., Melamed, I.A. (1984). A problem of Zolotarev and analogs

of infinitely divisible and stable distribution in a scheme for summing a random num-

ber of random variables, Theory of Probabability and its Applications 29, 791–794.

Kozubowski, T., Podgorski, K. (2003). Log-Laplace distributions, International Journal of

Mathematics 3 (4), 467–495.

Kozubowski, T., Podgorski, K. (2010). Random self-decomposability and autoregressive

processes, Statistics and Probability Letters 80, 1606–1611.

Kotz, S., Kozubowski, T.J., Podgorski, K. (2001). The Laplace Distribution and General-

izations - A Revisit with Applications to Communications, Economics, Engineering

and Finance, Birkhauser, Boston.

McKenzie, ED. (1982). Product autoregression: a time-series characterization of the

gamma distribution, Journal of Applied Probability 19, 463–468.

Mitzenmacher, M. (2004). Dynamic models for file sizes and double Pareto distributions,

Internet Mathematics 1(3), 305–334.

Pillai, R.N. (1990). Harmonic mixtures and geometric infinite divisibility, Journal of Indian

Statistical Association 28, 87–98.

Pillai, R.N., Sandhya, E. (1990). Distributions with complete monotone derivative and

geometric infinite divisibility, Advances in Applied Probability 22, 751–754.

Ramırez, P., Lillo, R.E., Wiper, M.P., Wilson, S.P. (2008). Inference for double Pareto log-

normal queues with applications, Working Paper 08-04, Statistics and Econometrics

Series 02 , Universidad Carlos III de Madrid, Spain.

Reed, W.J., Jorgensen, M. (2004). The double Pareto lognormal distribution-a new para-

metric model for size distributions, Communications in Statistics - Theory and Meth-

ods 33(8), 1733–1753.

138



Reed, W.J. (2003). The Pareto law of incomes- an explanation and an extension, Physica

A 319, 579–597.

Reed, W.J. (2002). On the rank-size distribution for human settlements, Journal of Re-

gional Science 42, 1–17.

Schwammle, V., Queiros, S.M.D., Brigatti, E., Tchumatchenko, T. (2009). Competition

and fragmentation: a simple model generating lognormal-like distributions, New

Journal of Physics 11, 093006

Seshadri, M., Machiraju, S., Sridharan, A., Bolot, J., Faloutos, C., Leskovec, J. (2008).

Mobile call graphs: beyond power law and lognormal distributions, preprint.

Shanbhag, D.N., Sreehari, M. (1977). On certain self-decomposable distributions, Zeitschrift

fur Warhscheinlichkeitstheorie und Verwandte Gebiete 38, 217–222.

Uppuluri, V.R.R. (1981). Some properties of log-Laplace distribution, Statistical Distribu-

tions in Scientific Work 4, (eds., G.P. Patil, C. Taillie and B. Baldessari), Dordrecht:

Reidel, 105–110.

139

Product Autoregressive Models with Log-Laplace and Double...

Documents

Transcript of Product Autoregressive Models with Log-Laplace and Double...