Necessary and sufficient conditions for representing general distributions by Coxians

Takayuki Osogami*    Mor Harchol-Balter†

February 24, 2003

CMU-CS-02-178

School of Computer Science

Carnegie Mellon University

Pittsburgh, PA 15213

Abstract

A common analytical technique involves using a Coxian distribution to model a general distribution $G$, where the Coxian distribution agrees with $G$ on the first three moments. This technique is motivated by the analytical tractability of the Coxian distribution. Algorithms for mapping an input distribution $G$ to a Coxian distribution largely hinge on knowing a priori the necessary and sufficient number of stages in the representative Coxian distribution. In this paper, we formally characterize the set of distributions $G$ which are well-represented by an $n$-stage Coxian distribution, in the sense that the Coxian distribution matches the first three moments of $G$. We also discuss a few common, practical examples. Lastly, we derive a partial characterization of the set of busy period durations which are well-represented by an $n$-stage Coxian distribution.

*Carnegie Mellon University, Computer Science Department. Email: [email protected]

†Carnegie Mellon University, Computer Science Department. Email: [email protected]. This work was supported by NSF Career Grant CCR-0133077, by NSF ITR Grant 99-167 ANI-0081396 and by Spinnaker Networks via Pittsburgh Digital Greenhouse Grant 01-1.

Keywords: moment matching, Coxian distribution, busy period, phase-type distribution, normalized

moment, queueing, matrix analytic

1 Introduction

Background

Approximating general distributions by phase-type (PH) distributions has significant application in the analysis of stochastic processes. Many fundamental problems in queueing theory are hard to solve when general distributions are allowed as inputs. For example, the waiting time for an M/G/c queue has no nice closed formula when $c \ge 2$, while the waiting time for an M/M/c queue is trivially solved. Tractability of M/M/c queues is attributed to the memoryless property of the exponential distribution. A popular approach to analyzing queueing systems involving a general distribution $G$ is to approximate $G$ by a PH distribution. A PH distribution is a very general mixture of exponential distributions, as shown in Figure 1 [19]. The Markovian nature of the PH distribution frequently allows a Markov chain representation of the queueing system. Once the system is represented by a Markov chain, this chain can often be solved by matrix-analytic methods [20, 16], or other means.

When fitting a general distribution $G$ to a PH distribution, it is common to look for a PH distribution which matches the first three moments of $G$. In this paper, we say that:

Definition 1 A distribution $G$ is well-represented by a distribution $F$ if $F$ and $G$ agree on their first three moments.

It has been shown that matching three moments is sufficient for accurate modeling of many computer

systems [9, 23]. Matching fewer moments is less desirable since some queueing systems, e.g. the��

queue, have response times which are heavily dependent on the third moment of�� [31, 11].

Most existing algorithms for fitting a general distribution $G$ to a PH distribution restrict their attention to a subset of PH distributions, since general PH distributions have so many parameters that it is difficult to find time-efficient algorithms for fitting to them [30, 13, 12, 26, 18]. The most commonly chosen subset is the class of Coxian distributions, shown in Figure 2. Coxian distributions have the advantage of being much simpler than general PH distributions, while including a large subset of PH distributions without needing additional stages. For example, for any acyclic PH distribution, there exists a Coxian distribution with the same number of stages and the same distribution function [5]. In this paper we will restrict our attention to Coxian distributions.

Motivation and Goal

When finding a Coxian distribution which well-represents a given distribution $G$, it is desirable that the Coxian distribution be minimal, i.e., that its number of stages be as small as possible. This is important because it minimizes the additional states necessary in the resulting Markov chain for the queueing system. Unfortunately, it is not known what the minimal number of phases necessary to represent a given distribution $G$ by a Coxian distribution is. This makes it difficult to evaluate the effectiveness of different algorithms and also makes the design of fitting algorithms open-ended.

Figure 1: A 4-stage PH distribution. There are $n = 4$ states, where the $i$-th state has exponentially distributed service time with rate $\mu_i$. With probability $p_{0i}$ we start in the $i$-th state, and the next state is state $j$ with probability $p_{ij}$. Each state $i$ has probability $p_{i0}$ that it will be the last state. The value of the distribution is the sum of the times spent in each of the states.

Figure 2: An $n$-stage Coxian distribution. Observe the recursive definition: with probability $1 - p_1$, the value is zero, and with probability $p_1$, the value is an exponential random variable with rate $\mu_1$ followed by an $(n-1)$-stage Coxian distribution.

The primary goal of this paper is to characterize the set of distributions which are well-represented by an $n$-stage Coxian distribution, for each positive integer $n$.

Definition 2 Let $S^{(n)}$ denote the set of distributions that are well-represented by an $n$-stage Coxian distribution for positive integer $n$.

Our characterization of $S^{(n)}$ will allow one to determine, for any distribution $G$, the minimal number of stages that are needed to well-represent $G$ by a Coxian distribution.¹ Such a characterization will be a useful guideline for designing algorithms which fit general distributions to Coxian distributions. Another application of this characterization is that some existing fitting algorithms, such as Johnson and Taaffe's nonlinear programming approach [13], require knowing the number of stages $n$ in the minimal

¹ One might initially argue that $S^{(2)}$, the set of distributions well-represented by a two-stage Coxian distribution, should include all distributions, since a 2-stage Coxian distribution has four parameters ($p_1$, $p_2$, $\mu_1$, $\mu_2$), whereas we only need to match three moments of $G$. A simple counterexample shows this argument to be false: let $G$ be a distribution whose first three moments are 1, 2, and 12. The system of moment-matching equations gives two solutions for three of the parameters as functions of the fourth. However, in both solutions, one of the parameters is not positive for any possible choice of the free parameter.

Coxian distribution. The current approach involves simply iterating over all choices for $n$ [13], whereas our characterization would immediately specify $n$.

A secondary goal of this paper is to specify the necessary and sufficient number of stages needed to well-represent busy period durations by Coxian distributions. Fitting busy period durations to Coxian distributions has become relevant recently in the solution of common computer systems problems involving cycle stealing; see [9, 23]. In [9, 23], transitions between states in a Markov chain represent busy period durations, which are modeled via Coxian distributions for tractability. In addition to standard busy periods, it is also common to model the busy period started by a number of jobs. This paper will specify the number of stages needed to well-represent such busy periods by Coxian distributions.

Providing sufficient and necessary conditions for a distribution to be in $S^{(n)}$ does not always immediately give one a sense of which distributions satisfy those conditions, or of the magnitude of the set of distributions which satisfy the conditions. A third goal of this paper is to provide examples of practical distributions which are included in $S^{(n)}$ for particular integers $n$.

In finding simple characterizations of $S^{(n)}$, it will be very helpful to start by defining an alternative to the standard moments, which we refer to as normalized moments.

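Definition 3 Let $G$ be a distribution and $\mu_k^G$ be the $k$-th moment of $G$ for $k = 1, 2, 3$. The normalized second moment $m_2^G$ of $G$ and the normalized third moment $m_3^G$ of $G$ are defined to be

$$m_2^G = \frac{\mu_2^G}{(\mu_1^G)^2} \quad\text{and}\quad m_3^G = \frac{\mu_3^G}{\mu_1^G\,\mu_2^G}.$$

Notice the correspondence to the squared coefficient of variability, $C^2$, and the skewness factor, $\gamma$: $m_2^G = C^2 + 1$ and $m_3^G = \gamma\sqrt{m_2^G}$, taking $\gamma = \mu_3^G / (\mu_2^G)^{3/2}$.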
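As a small illustration of Definition 3, the sketch below computes the normalized moments $m_2 = \mu_2/\mu_1^2$ and $m_3 = \mu_3/(\mu_1\mu_2)$ from raw moments. The function names `normalized_moments` and `exponential_moments` are hypothetical helpers introduced here, not part of the paper; exact rational arithmetic is used so that boundary cases are not blurred by rounding.

```python
from fractions import Fraction


def normalized_moments(mu1, mu2, mu3):
    """Return (m2, m3) for a distribution with raw moments mu1, mu2, mu3.

    m2 = mu2 / mu1^2 and m3 = mu3 / (mu1 * mu2), as in Definition 3.
    """
    mu1, mu2, mu3 = Fraction(mu1), Fraction(mu2), Fraction(mu3)
    if mu1 <= 0 or mu2 <= 0:
        raise ValueError("first and second moments must be positive")
    return mu2 / mu1**2, mu3 / (mu1 * mu2)


def exponential_moments(rate):
    """First three raw moments of an exponential distribution: k! / rate^k."""
    rate = Fraction(rate)
    return 1 / rate, 2 / rate**2, 6 / rate**3


if __name__ == "__main__":
    # Every exponential distribution maps to the same point of the (m2, m3) plane.
    print(normalized_moments(*exponential_moments(3)))
    # The moments 1, 2, 12 used in the two-stage counterexample.
    print(normalized_moments(1, 2, 12))
```

Because both ratios are scale-free, rescaling a distribution leaves $(m_2, m_3)$ unchanged, which is why only two numbers are needed once the mean has been matched.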

Relevant previous work

All prior work on characterizing $S^{(n)}$ has focused only on characterizing $S^{(2)*}$, where $S^{(2)*}$ is the set of distributions which are well-represented by a 2-stage Coxian distribution under a definition of the two-stage Coxian distribution that is more restrictive than ours: specifically, there is no mass probability at zero, i.e. $p_1 = 1$. Observe $S^{(2)*} \subset S^{(2)}$. Altiok [2] showed a sufficient condition for a distribution $G$ to be in $S^{(2)*}$. More recently, Telek and Heindl [29] expanded Altiok's condition and proved the necessary and sufficient condition for a distribution $G$ to be in $S^{(2)*}$. While neither Altiok nor Telek and Heindl expressed these conditions in terms of normalized moments, the results can be expressed much more simply with our normalized moments, as shown in Theorem 1. In this paper, we extend these results to characterize $S^{(2)}$, as well as $S^{(n)}$ for larger integers $n$.

Our results

While the goal of the paper is to characterize the set $S^{(n)}$, this characterization turns out to be ugly. One of the key ideas in the paper is that there is a set $S_V^{(n)} \subset S^{(n)}$ which is very close to $S^{(n)}$ in size, such that $S_V^{(n)}$ has a very simple specification via normalized moments. Thus, many of the proofs in this paper revolve around $S_V^{(n)}$.

Definition 4 For integers $n \ge 2$, let $S_V^{(n)}$ denote the set of distributions with the following property on their normalized moments $m_2$ and $m_3$:

$$m_2 > \frac{n}{n-1} \quad\text{and}\quad m_3 \ge \frac{n+2}{n+1}\, m_2. \qquad (1)$$
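The membership test of Definition 4 is simple enough to sketch in code. The block below assumes that condition (1) reads $m_2 > n/(n-1)$ and $m_3 \ge \frac{n+2}{n+1} m_2$, which agrees with the labels $m_2 > 2$ and $m_3/m_2 \ge 4/3$ that Figure 8 uses for $n = 2$. The names `in_SV` and `smallest_n_in_SV` are hypothetical, and the search bound `n_max` is an arbitrary choice made for illustration.

```python
from fractions import Fraction


def in_SV(n, m2, m3):
    """Check the assumed form of condition (1) for S_V^(n), n >= 2."""
    if n < 2:
        raise ValueError("S_V^(n) is defined for integers n >= 2")
    m2, m3 = Fraction(m2), Fraction(m3)
    return m2 > Fraction(n, n - 1) and m3 >= Fraction(n + 2, n + 1) * m2


def smallest_n_in_SV(m2, m3, n_max=10_000):
    """Smallest n in [2, n_max] with (m2, m3) in S_V^(n), or None if there is none.

    Both thresholds in (1) decrease as n grows, so the sets are nested and
    the first n that passes is the smallest one.
    """
    for n in range(2, n_max + 1):
        if in_SV(n, m2, m3):
            return n
    return None


if __name__ == "__main__":
    print(smallest_n_in_SV(3, 5))   # a high-variability point
    print(smallest_n_in_SV(2, 3))   # normalized moments of an exponential
```

Membership in $S_V^{(n)}$ is only a sufficient condition for $n$ stages: the exponential distribution, for instance, sits on the boundary $m_2 = 2$ and is therefore not in $S_V^{(2)}$, although a single stage represents it exactly.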

The main contribution of this paper is a derivation of the nested relationship between $S^{(n)}$ and $S_V^{(n)}$ for all $n \ge 2$. This relationship is illustrated in Figure 3 and proven in Section 3. There are three points to observe: (i) $S^{(n)}$ is a proper subset of $S^{(n+1)}$ for all integers $n \ge 2$, and likewise $S_V^{(n)}$ is a proper subset of $S_V^{(n+1)}$; (ii) $S_V^{(n)}$ is contained in $S^{(n)}$ and close to $S^{(n)}$ in size, providing a simple characterization for $S^{(n)}$; (iii) $S^{(n)}$ is almost contained in $S_V^{(n+1)}$ for all integers $n \ge 2$ (more precisely, we will show $S^{(n)} \subset S_V^{(n+1)} \cup E^{(n)}$, where $E^{(n)}$ is the set of distributions well-represented by an Erlang-$n$ distribution). This result yields a necessary number and a sufficient number of stages for a given distribution to be well-represented by a Coxian distribution. Additional contributions of the paper are described below.

With respect to the set $S^{(2)}$, we derive the exact necessary and sufficient condition for a distribution $G$ to be in $S^{(2)}$ as a function of the normalized moments of $G$. This extends the results of Telek and Heindl, who analyzed $S^{(2)*}$, which is a subset of $S^{(2)}$. (See Section 2).

We next investigate the fitting of M/G/1 busy periods by Coxian distributions. Consider an M/G/1 busy period where $G$ is an arbitrary distribution with finite third moment and where the size of the job starting the busy period is in $S_V^{(n)}$. We prove that the duration of any such busy period has distribution in $S_V^{(n)}$. This is surprising in that the number of stages which suffice to represent the busy period is determined solely by the first job starting the busy period, which may be a simple setup cost, and it is not required to consider the distributions of the other jobs in the busy period. Furthermore, consider an M/G/1 busy period where $G$ is an arbitrary distribution with finite third moment and where the busy period is started by a random number of jobs with service time distribution in $S_V^{(n)}$, the number of jobs being the number of Poisson arrivals during a random variable with distribution in $S_V^{(n)}$. We prove that the duration of any such busy period is in $S_V^{(n)}$. (See Section 4).

Lastly, we provide a few examples of common, practical distributions included in the set $S_V^{(n)} \subset S^{(n)}$. All distributions we consider have finite third moment. The Pareto distribution and the Bounded Pareto distribution (as defined in [7]) have been shown to fit many recent measurements of job service requirements in computing systems, including HTTP requests [3, 4], UNIX jobs [17, 8], and the duration of FTP transfers [24]. We show that the Bounded Pareto with high variability is in $S_V^{(2)}$. We also provide conditions under which the Pareto and uniform distributions are in $S_V^{(n)}$ for each $n \ge 2$. (See Section 5).²

Figure 3: The main contribution of this paper: a simple characterization of $S^{(n)}$ by $S_V^{(n)}$. Solid lines delineate $S^{(n)}$ (which is irregular) and dashed lines delineate $S_V^{(n)}$ (which is regular, i.e., has a simple specification). Observe the nested structure of $S^{(n)}$ and $S_V^{(n)}$. $S_V^{(n)}$ is close to $S^{(n)}$ in size and is contained in $S^{(n)}$. $S^{(n)}$ is almost contained in $S_V^{(n+1)}$.

2 Full characterization of $S^{(2)}$

The Telek and Heindl [29] result may be expressed in terms of normalized moments as follows:

Theorem 1 (Telek, Heindl) $G \in S^{(2)*}$ iff $G$ is in the following union of sets:³

��

��

��

��

��

��

� ��

� ��

��

��

² Our results show that the first three moments of the Bounded Pareto distribution are matched by a two-stage Coxian distribution and the first three moments of the Pareto distribution with high variability are matched by a Coxian distribution with a small number of stages. Note however that this does not necessarily imply that the shape of these distributions is well-matched by a Coxian distribution with few stages, since the tail of these distributions is not exponential. Recently, fitting the shape of heavy-tailed distributions by phase-type distributions such as hyperexponential distributions has been studied [6, 28, 15].

�Throughout this paper,�conditions on normalized moments� denotes the set of distributions that satisfy the conditions. Forexample,

��

�denotes set

��

��

� � �� and� � ��

�

�.


Since only an outline of a proof is given in [29], we derive our own proof of Theorem 1 in [21] for completeness. We now show a simpler characterization of $S^{(2)}$:

Theorem 2 $G \in S^{(2)}$ iff $G$ is in the following union of sets:

��

��

��

��

��

� �� (2)

where recall �� is the set:��

�.

A summary of Theorems 1 and 2 is shown in Figure 4. Figure 4(a) illustrates how close $S_V^{(2)}$ and $S^{(2)}$ are in size. Figure 4(b) shows the distributions which are in $S^{(2)}$ but not $S^{(2)*}$.

Figure 4: (a) The thick solid lines delineate $S^{(2)}$. The dashed lines (striped region) show $S_V^{(2)} \subset S^{(2)}$. (b) Again, the thick solid lines delineate $S^{(2)}$. The shaded area shows the region $S^{(2)*}$. Both panels are drawn in the plane of the normalized moments $m_2$ and $m_3$.

Proof: [Theorem 2] The theorem will be proved by reducing $S^{(2)}$ to $S^{(2)*}$ and employing Theorem 1. The proof hinges on the following observation: an arbitrary distribution $G \in S^{(2)}$ iff $G$ is well-represented by some distribution $X$, where

$$X = \begin{cases} Z & \text{with probability } p \\ 0 & \text{with probability } 1 - p \end{cases}$$

for some $Z \in S^{(2)*}$. It therefore suffices to show that $X$ is in set (2).

Let�� be the normalized�-th moment of� and�� be the normalized�-th moment of� for � � �� .

Observe that�� for � � �� and��

��

for � � �� .
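The effect of adding mass at zero on the normalized moments, which this step of the proof relies on, can be checked directly. The derivation below is a sketch in notation assumed here for illustration: $X$ equals $Z$ with probability $p$ and $0$ otherwise, $\mu_k$ denotes a raw moment, and $m_k$ a normalized moment as in Definition 3.

```latex
\begin{align*}
\mu_k^X &= p\,\mu_k^Z, \qquad k = 1, 2, 3, \\
m_2^X &= \frac{\mu_2^X}{(\mu_1^X)^2}
       = \frac{p\,\mu_2^Z}{p^2\,(\mu_1^Z)^2}
       = \frac{m_2^Z}{p}, \\
m_3^X &= \frac{\mu_3^X}{\mu_1^X\,\mu_2^X}
       = \frac{p\,\mu_3^Z}{p^2\,\mu_1^Z\,\mu_2^Z}
       = \frac{m_3^Z}{p}.
\end{align*}
```

Since $0 < p \le 1$, mixing with an atom at zero scales both normalized moments by the same factor $1/p \ge 1$, so the ratio $m_3/m_2$ is preserved.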

By Theorem 1, since� � �� ,� is in the following union of sets:

��

��

��

��

��

��

� ��

� ��

��

��


Thus,� is in the following union of sets:�

��

��

��

��

��

��

�

��

� ��

� �

� ��

��

��

� � ��

� �

�

�(3)

We want to show that� is in set (2). To do this, we rewrite set (2) as��

��

��

��

� � ��

��

��

� � ��

��

�(4)

Observe that (3) and (4) are now in similar forms. We now prove that set (3) is a subset of set (4), and set

(4) is a subset of set (3). The technical details are postponed to Appendix B, Lemma 9.1.

3 A characterization of $S^{(n)}$

In this section, we prove that $S_V^{(n)}$ is contained in $S^{(n)}$, where $S_V^{(n)}$ is the set of distributions whose normalized moments satisfy (1), and that $S^{(n)}$ is almost contained in $S_V^{(n+1)}$. Figure 6 provides a graphical view of the $S_V^{(n)}$ sets with respect to the normalized moments. We prove the following theorem:

Theorem 3 $S_V^{(n)} \subset S^{(n)} \subset S_V^{(n+1)} \cup E^{(n)}$, where $E^{(n)}$ is the set of distributions that are well-represented by an Erlang-$n$ distribution for integers $n \ge 2$.

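An Erlang-$n$ distribution refers to the distribution shown in Figure 5. Notice that the normalized moments of distributions in $E^{(n)}$, $m_2^{E^{(n)}}$ and $m_3^{E^{(n)}}$, satisfy the following conditions:

$$m_2^{E^{(n)}} = \frac{n+1}{n} \quad\text{and}\quad m_3^{E^{(n)}} = \frac{n+2}{n}. \qquad (5)$$

Figure 5: An Erlang-$n$ distribution: $n$ exponential stages in series, each with rate $\mu$.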
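The Erlang values in (5) follow from the standard raw moments of an Erlang-$n$ distribution with rate $\mu$, namely $n/\mu$, $n(n+1)/\mu^2$ and $n(n+1)(n+2)/\mu^3$. The short check below is a sketch; `erlang_normalized_moments` is a hypothetical helper name.

```python
from fractions import Fraction


def erlang_normalized_moments(n, rate=1):
    """Normalized moments (m2, m3) of an Erlang-n distribution with the given rate."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    rate = Fraction(rate)
    mu1 = Fraction(n) / rate
    mu2 = Fraction(n * (n + 1)) / rate**2
    mu3 = Fraction(n * (n + 1) * (n + 2)) / rate**3
    # Definition 3: m2 = mu2 / mu1^2, m3 = mu3 / (mu1 * mu2).
    return mu2 / mu1**2, mu3 / (mu1 * mu2)


if __name__ == "__main__":
    for n in (2, 3, 31):
        print(n, erlang_normalized_moments(n))
```

The rate cancels in both ratios, so each $E^{(n)}$ is a single point in the plane of normalized moments, as the Erlang markers in Figure 6 suggest.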

Theorem 3 tells us that $S^{(n)}$ is "sandwiched between" $S_V^{(n)}$ and $S_V^{(n+1)}$. From Figure 6, we see that $S_V^{(n)}$ and $S_V^{(n+1)}$ are quite close for higher $n$. Thus we have a very accurate representation of $S^{(n)}$. Theorem 3 follows from the next two lemmas:

��conditions on normalized moments in terms of�� denotes the set of distributions that satisfy the conditions for some�. Forexample,

��

��

�denotes set

�� s.t. � � � � � and �

��

� � �� and �

��

�

�.

Figure 6: Depiction of $S_V^{(n)}$ sets for $n = 2, 3, 4, 32$ as a function of the normalized moments $m_2$ and $m_3$. The outermost dotted lines ($m_2 \ge 1$ and $m_3 \ge m_2$) delineate the set of all the possible nonnegative distributions (that is, any nonnegative distribution $G$ satisfies $m_2^G \ge 1$ and $m_3^G \ge m_2^G$) [14]. $S_V^{(n)}$ for $n = 2, 3, 4, 32$ are delineated by dashed lines. The points marked $E^{(2)}$, $E^{(3)}$ and $E^{(31)}$ are Erlang distributions.

Lemma 3.1 $S^{(n)} \subset S_V^{(n+1)} \cup E^{(n)}$.

Lemma 3.2 $S_V^{(n)} \subset S^{(n)}$.

Proof:[Lemma 3.1] The proof proceeds by induction. When� � �, the lemma follows from (1), (5), and

Theorem 2. Assume that�� for � �� . For any distribution� � ��, there exists

a�-stage Coxian distribution� by which� is well-represented, where� can be expressed as

� �

� � ��

��

and where� is an exponential distribution and� is a��-stage Coxian distribution. By the assumption

of induction,� � �� . We prove that (i) if� � �� , then� � �� and (ii) if � � ��,

then� � �� .

(i) Suppose� � �� : We first prove that��

��

. First observe that

��

� ��

��

� ��

��

where the inequality follows from� � �� . The derivative of the right hand side with respect to�� is

��

��


which is minimized when�� . Thus,

��

� �

��

��

Next, we prove that��

��

�� for all��

��

. Notice that��

��

is independent of :

��

��

��

��

Since��

��

is an increasing function of�� , it is minimized at��

��

�� , since� � �� .

Thus,

��

��

� ��

�� (6)

Denoting the r.h.s. of (6) by�� , we now find�� which minimizes�� . Since

��

� ��

��

��

�

the infimum of�� occurs at:

��

��

� ��

� � ��

��

By evaluating��

��

at��

�, we have

��

��

� ��

��

� �

By Lemma 9.4 in Appendix B,��

��

� �� . By evaluating

��

��

at

� ��

� ��

we have��

�

��

� � ��

��

� ��

where the last inequality holds iff�� . However,��

�� holds if

��

� ��

�

� � ��


since

��

� ��

� � ��

��

and�� and

�

��

� �

��

��

��

for � � �.

(ii) Suppose� � ��: We will prove that (a) if�� and � �, then� � ��,

and (b) if�� or � �, then� � �� : For part (a), observe that if� � ��,

�� , and � �, then we have already seen that��

in part (i). It is also easy to

see that��

�, and hence� � ��. For part (b), if�� or � �, then first notice

that��

��

, since�� is minimized when�� and � �. Also, since

� ��

��

� �

� ��

��

��

�� by part (i), and hence� � �� .

Proof: [Lemma 3.2] When $n = 2$, the lemma follows from Theorem 2. The remainder of the proof assumes $n \ge 3$. We prove that for any distribution $G \in S_V^{(n)}$, there exists an $n$-stage Coxian distribution whose normalized moments agree with those of $G$. Notice that the mean of this Coxian distribution is easily matched to that of $G$ without changing its normalized moments, by multiplying its rates $\mu_1, \ldots, \mu_n$ by a constant. The proof

consists of two parts: (i) the case when the normalized moments of� satisfy�� ; (ii) the case

when the normalized moments of� satisfy�� .

(i) Suppose� � �� and��

� � �: We need to show that� is well-represented by some�-

stage Coxian distribution. We will prove something stronger: that� is well-represented by a distribution

� where� � � � , and� is a particular two-stage Coxian distribution with no mass probability at

zero and� is a particular Erlang-(� � �) distribution. (For the intuition behind this particular way of

representing�, please refer to [22]). The normalized moments of� are chosen as follows:

��

��

��

�

��

��

� � ��

��

� ��

��

��

��

� ��

�

��

�


The mean of� is chosen as follows:�� . It is easy to see that the normalized

moments of� and� agree:

��

��

� ��

��

� �

��

��

��

� � ��

� ��

��

�

��

� ��

� ��

where��

�� and��

�� are the normalized moments of� , and� � �� . Finally, we will

show that there exists a two-stage Coxian distribution with no mass probability at zero, with normalized

moments�� and��

� : By Theorem 1, it suffices to show that�� and��

� � ��

�� . The first

condition,�� , can be shown using�

�� , which follows from� � �� . It can also be shown

that��

� � � � ��

�� using �

�� and��

� � �� , which is the assumption that we

made at the beginning of (i).

(ii) Suppose:� � �� and��

� � �: We again must show that� is well-represented by an

�-stage Coxian distribution. We will show that� is well-represented by a distribution�:

� �



where � ��

� ��

and the normalized moments of� satisfy�� and��

� � �� . It is

easy to see that the normalized moments of� and� agree. Therefore, it suffices to show that� is well-

represented by an�-stage Coxian distribution� , since then� is well represented by an�-stage Coxian

distribution�:

� �



We will prove that� is well-represented by an�-stage Coxian distribution� � � � , where� is

a two-stage Coxian distribution with no mass probability at zero and� is an Erlang-(� � �) distribution.

The normalized moments of� are chosen as follows:

��

��

��

��

� � ��

the mean of� is chosen as follows:�� . It is easy to see that the normalized

moments of� and� agree:

��

��

� ��

��

� �

��

��

��

� � ��

� ��

��

�

��

� ��

� ��

� �


where��

�� and��

�� are the normalized moments of� , and� � �� . Finally, we will show

that there exists a two-stage Coxian distribution with normalized moments�� and�� : By Theorem 2,

it suffices to show that�� , since

�

��

� � ��

� � � � ��

��

�

Since� � �� , ��

�� . Thus,��

� � ��

��

��

�

� �� Finally, ��

� � �� follows from

��

�.

4 A characterization of busy periods

In this section we characterize the set of M/G/1 busy period durations which are in $S_V^{(n)}$, and hence in $S^{(n)}$. As explained earlier, the tractability of queueing problems often relies not just on representing general distributions by Coxian distributions, but also on representing busy period durations by Coxian distributions [9, 23]. This includes busy periods started by a random number of jobs, namely the number of arrivals during a period of time. This section provides sufficient conditions on the number of stages needed to represent common types of busy period durations by Coxian distributions. Formally, we will prove the following theorems:

Theorem 4 Consider an M/G/1 busy period where $G$ is an arbitrary distribution with finite third moment and where the size of the job starting the busy period has distribution in $S_V^{(n)}$. Then the busy period duration has distribution in $S_V^{(n)}$.

The above theorem states that the number of stages which suffice for a busy period duration to be well-represented by a Coxian distribution is, surprisingly, determined solely by the distribution of the first job in the busy period.
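A numerical instance may help make the statement of Theorem 4 concrete. The sketch below uses the standard moment formulas for an M/G/1 busy period started by an initial amount of work $K$, obtained by differentiating the transform $\tilde{K}(s + \lambda - \lambda\tilde{B}(s))$; these formulas are textbook results supplied here as an assumption, not taken from the proof in this paper. The function names and the example numbers are hypothetical, and membership in $S_V^{(2)}$ is tested under the assumed form $m_2 > 2$, $m_3 \ge \frac{4}{3} m_2$ of condition (1).

```python
from fractions import Fraction


def busy_period_moments(k, g, lam):
    """First three moments of an M/G/1 busy period started by initial work K.

    k = (E[K], E[K^2], E[K^3]), g = (E[G], E[G^2], E[G^3]), lam = arrival rate.
    """
    k1, k2, k3 = (Fraction(x) for x in k)
    g1, g2, g3 = (Fraction(x) for x in g)
    lam = Fraction(lam)
    rho = lam * g1
    if not 0 <= rho < 1:
        raise ValueError("need load rho = lam * E[G] in [0, 1)")
    c = 1 - rho
    b1 = k1 / c
    b2 = k2 / c**2 + lam * k1 * g2 / c**3
    b3 = (k3 / c**3 + 3 * lam * k2 * g2 / c**4
          + k1 * (lam * g3 / c**4 + 3 * lam**2 * g2**2 / c**5))
    return b1, b2, b3


def normalized(mu1, mu2, mu3):
    """Normalized moments (m2, m3) as in Definition 3."""
    return mu2 / mu1**2, mu3 / (mu1 * mu2)


if __name__ == "__main__":
    # Starting job K with m2 = 3, m3 = 5; other jobs exponential with rate 2.
    k = (1, 3, 15)
    g = (Fraction(1, 2), Fraction(1, 2), Fraction(3, 4))
    m2, m3 = normalized(*busy_period_moments(k, g, lam=1))
    print(m2, m3, m2 > 2 and m3 >= Fraction(4, 3) * m2)
```

In this instance the starting job satisfies $m_2 = 3 > 2$ and $m_3 = 5 \ge 4$, and the busy period moves further into the region: its second normalized moment can only grow, since the second term of `b2` is nonnegative.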

Lemma 4.1 Let �� !�� be the distribution of the sum of � i.i.d. random variables with distribution

� � �� where � is the number of Poisson arrivals with rate " during a random time ! � �� . Then,

�� !�� .

The following theorem follows from Theorem 4 and Lemma 4.1.

Theorem 5 Let �� denote the duration of an M/G/1 busy period where � is an arbitrary distribution

with finite third moment and where the busy period is started by �� !�� as defined in Lemma 4.1. Then,

�� .


We will now prove Theorem 4 and Lemma 4.1.

Proof:[Theorem 4] When# � �, � � � �� , and hence the theorem is true. In the following we

assume� � # � �. Let $� �� and$� ��

� �� . We prove that$� � �� and$� � ��

�$�� .

Observe that together these imply the conditions in (1). Notice that the first three moments of� are

��

��

��

��

�

��

��

��

��

��

��

��

�

��

��

��

��

��

It is easy to see that$� � �� :

��

��

��

�

��

where�� and��

�� . Next, we prove that$� � ��

�$�� . Note that

��

��

��

��

�

��

��

��

��

��

��

��

��

��

�

where�� and��

�� . Thus,

��

��

� �

��

��

�

�

��

��

��

�

��

��

��

��

��

�

��

��

��

��

��

� ��

��

Proof:[Lemma 4.1] Let � �� and � ��

� �� . We prove that � � �� and

� � �� .

Notice that the Laplace transform of�� !�� is �� !�� ! �"�� %��. Thus, the first three

moments of��!�� are

��

��

��


It is easy to see that � � �� :

��

��

�

��

where&� � �� and'� �

�� . Next, we prove that � � ��

� �� . Note that

��

��

��

where&� � �� and'� �

�� . Thus,

��

��

��

� �

��

�

��

� �

�

��

��

��

��

5 Examples of some common distributions in $S^{(n)}$

In this section, we give examples of distributions that are in $S_V^{(n)} \subset S^{(n)}$, and hence are well-represented by an $n$-stage Coxian distribution. A summary is shown in Figure 7.

5.1 Distributions in $S^{(2)}$

It is well known that for every two-phase PH distribution, there exists a two-phase Coxian distribution with the same distribution function, and hence every two-phase PH distribution is in $S^{(2)}$. In the following, we show that the Bounded Pareto distributions with high variability are also in $S^{(2)}$.

A Bounded Pareto distribution has density function

$$f(x) = \frac{\alpha k^{\alpha}}{1 - (k/p)^{\alpha}}\, x^{-\alpha-1}, \qquad k \le x \le p,$$

where $0 < \alpha < 2$. Bounded Pareto distributions have been empirically shown to fit many recent measurements of computing workloads. These include Unix process CPU requirements measured at Bellcore:

� ( �� [17], Unix process CPU requirements measured at UC Berkeley:( � � [8], sizes of files

transferred through the Web:�� ( �� [3, 4], sizes of files stored in Unix filesystems [10], I/O times

[25], sizes of FTP transfers in the Internet:�� ( �� [24], and Pittsburgh Supercomputing Center

Figure 7: A summary of the results in Section 5. A few particular distributions are shown in relation to $S_V^{(n)}$ (the plot shows $S_V^{(2)}$, $S_V^{(4)}$ and $S_V^{(8)}$ in the plane of the normalized moments $m_2$ and $m_3$). BP refers to the class of bounded Pareto distributions with high variability described in Definition 5. All of these are contained in $S_V^{(2)}$. UNIFORM refers to the class of all uniform distributions described in Definition 6. We find that the larger the range of the UNIFORM distribution, the fewer the number of stages that suffices. TRIANGULAR refers to the set of symmetric triangular distributions, described in Definition 7. These interestingly have the same behavior as the UNIFORM distribution. Finally, PARETO refers to the class of Pareto($\alpha$) distributions with finite third moment, described in Definition 8. For this class, we find that the lower the value of the $\alpha$-parameter, the fewer the number of stages that are needed.

workloads for distributed servers consisting of Cray C90 and Cray J90 machines [27]. In this section, we prove the necessary and sufficient condition for a Bounded Pareto distribution to be in $S_V^{(2)}$. Formally, we use the following definition:

Definition 5 $BP$ is the set of Bounded Pareto distributions satisfying $0 < \alpha < 2$ for which $p/k$ is greater than the maximum of the two lines shown in Figure 8.

Figure 8: The maximum of the two lines illustrates the lower bound needed on $p/k$ in the definition of the BP distribution. These lines are derived from (7) and (8). The plot shows the minimum $p/k$ as a function of $\alpha$ for $0 < \alpha < 2$, with one curve for the condition $m_2 > 2$ and one for the condition $m_3/m_2 \ge 4/3$.

With this definition, the theorem proven in this section is stated:

Theorem 6 $BP \subset S_V^{(2)}$ and $\overline{BP} \cap S_V^{(2)} = \emptyset$, where $\overline{BP}$ is the set of Bounded Pareto distributions not in $BP$ and $\emptyset$ is the empty set.

Proof: Let�� be the normalized�-th moment of a distribution� � � for � � �� . When( � �, the

moments of the Bounded Pareto are

��

��

�

��

�

��

Thus,

��

��

��

� ��

��

��

��

��

�� (7)

When� � ( � � or � � ( � �, the moments of the Bounded Pareto are

��

��

��

� � �

��

� � ��

��

��

� � ��

��

� � ��

��

��

� � ��

��

� �

Thus,

��

��

��

��

��

� ��

��

��

�� (8)

and

��

��

��

��

��

��

By Lemmas 9.5-9.8 in Appendix B, both�� and �

�

��

are increasing functions of). This makes

intuitive sense since the higher moments are likely to increase as the upper bound (and thus)) increases.

Thus, the minimum) such that�� and�

�

��

� �� can be obtained numerically for all( if there

exists such a finite). In the following we prove that there is such an) for � � ( � �. When( � �,

it is easy to see that as) goes to infinity, both�� and ��

��

go to infinity. Thus, there is a finite) such

that�� and�

�

��

� �� . Next, consider the case with� � ( � � or � � ( � �. By observing that

� � ( � �, it is easy to see that�� goes to infinity as) does. Thus, there is a finite) such that�� .

Next, we consider��

��

. Observe that

��!��

��

��

�

�

�� "� � � � � ��

� ��"� ��

Thus, there is a finite) such that��

��

� for � � ( � �. When� � ( � �, a finite) gives��

��

� if


and only if ��

�� . However, since

��

��

�

��

There is a finite) such that��

��

� �� for � � ( � �, too.

5.2 Distributions in $S^{(n)}$

In this section, we give examples of distributions in $S^{(n)}$. It is known that for every acyclic PH distribution, there exists a Coxian distribution with the same number of phases and the same distribution function [5]. Therefore, all the $n$-phase acyclic PH distributions are in $S^{(n)}$.

In the rest of this section, we discuss uniform distributions, symmetric triangular distributions, and Pareto distributions. In particular, we derive the necessary and sufficient condition for these distributions to be in $S_V^{(n)}$. Formally, we use the following definitions:

Definition 6 *�+�,-��.� /� refers to the distribution with lower bound . and upper bound / � �

having density function ��'� � �� in the region . ' / and zero otherwise.

Definition 7 !-+0��*10-�.� /� is the distribution with density functions of the form

��

��

��

�� # � � � � ��

�

��

��

�� # ��

��"��$"�

where � . / and / � �.

Definition 8 0-�!,�(� is the distribution with density functions of the form ��'� � ('��, where

( � �.

With these definitions, the three distributions are formally characterized as follows:

Theorem 7 The normalized moments of *�+�,-��.� /� satisfy � �� and ��

�for

all � . / and / � �.

Theorem 8 The normalized moments of !-+0��*10-�.� /� satisfy � �� and ��

�

for all � . / and / � �.


Theorem 9 The normalized moments of distributions in 0-�!,�(� satisfy

� � ��

��

��

��

��

for all ( � �.

Simple consequences of the theorems are:

Corollary 1 *�+�,-��.� /� � �� if and only if � � �� , where ) � �

�. In particu-

lar, � � � if . � � and / � �, and � � � for all � � . /.

Corollary 2 !-+0��*10-�.� /� � �� if and only if � � �� , where ) � �

�. In

particular, � � � if . � � and / � �, and � � � for all � � . /.

Corollary 3 0-�!,�(� � �� if and only if � � �(� ��. In particular, � � � for all ( � �.

Below we prove Theorem 7. Theorems 8 and 9 can be proved very similarly (see Appendix A).

Proof:[Theorem 7] The first three moments of� � *�+�,-��.� /� are�� , ��

� �� , and��

�� . and the normalized second and third moments of� are

��

�

�

� � ��

��

� ��

�

� � ��

��

where) � ��

. Since ��

� � ��

�� for � ) �,��

� is a nonincreasing function of). So, the

minimum of�� is given by evaluating it at) � � and the maximum is given by evaluating it at) � �.

Thus,� � ��

�� . Also, it is easy to see that��

� and�� satisfy��

� � ��

�.
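The uniform case can be reproduced with a few lines of exact arithmetic. The sketch below uses the standard raw moments of a uniform distribution on $[l, u]$, namely $(l+u)/2$, $(l^2 + lu + u^2)/3$ and $(l+u)(l^2+u^2)/4$; the helper name `uniform_normalized_moments` is hypothetical.

```python
from fractions import Fraction


def uniform_normalized_moments(l, u):
    """Normalized moments (m2, m3) of the uniform distribution on [l, u], 0 <= l < u."""
    l, u = Fraction(l), Fraction(u)
    if not 0 <= l < u:
        raise ValueError("need 0 <= l < u")
    mu1 = (l + u) / 2
    mu2 = (l * l + l * u + u * u) / 3
    mu3 = (l + u) * (l * l + u * u) / 4
    return mu2 / mu1**2, mu3 / (mu1 * mu2)


if __name__ == "__main__":
    # m2 is largest when l = 0 and shrinks toward 1 as l approaches u.
    for l in (0, 1, 2, Fraction(29, 10)):
        print(l, uniform_normalized_moments(l, 3))
```

Writing $r = l/u$, the second normalized moment is $\frac{4}{3}\,\frac{1 + r + r^2}{(1+r)^2}$, which decreases from $4/3$ at $r = 0$ toward $1$ as $r \to 1$. A wider range (smaller $r$) thus gives more variability, matching the remark in Figure 7 that a larger range needs fewer stages.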

6 Conclusion

The contribution of this paper is a characterization of the set $S^{(n)}$ of distributions $G$ which are well-represented by an $n$-stage Coxian distribution. Prior work has only analyzed $S^{(2)*} \subset S^{(2)}$, and this characterization was messy. We introduce several ideas which help in creating a simple formulation of $S^{(n)}$. The first is the concept of normalized moments. The second is the notion of $S_V^{(n)}$, a nearly complete subset of $S^{(n)}$ with an extremely simple representation. The arguments required in proving the above results have an elegant structure which repeatedly makes use of the recursive nature of the Coxian distributions.

Our characterization of $\mathcal{S}^{(n)}$ provides the minimum number of necessary phases and the sufficient number of phases for a given distribution to be well-represented by a Coxian distribution, and these bounds are nearly tight. This result has several practical uses. First, in designing algorithms which fit general distributions to Coxian distributions (fitting algorithms), the goal is to come up with a minimal (fewest number of stages) Coxian distribution. Our characterization allows algorithm designers to determine how close their Coxian distribution is to the minimal Coxian distribution, and provides intuition for coming up with improved algorithms. We have ourselves benefitted from exactly this point: in a companion paper [22], we develop an algorithm for finding a minimal Coxian distribution that well-represents a given distribution. We find that the simple characterization of $\mathcal{S}^{(n)}$ provided herein is very useful in this task. Our results are also useful as an input to some existing fitting algorithms, such as Johnson and Taaffe's nonlinear programming approach [13], which require knowing a priori the number of stages $n$ in the minimal Coxian distribution.

In addition to characterizing those distributions in $\mathcal{S}^{(n)}$, we also consider which M/G/1 busy periods have durations in $\mathcal{S}^{(n)}$. We find that the number of stages which suffice for a busy period duration to be well-represented by a Coxian distribution is, surprisingly, determined solely by the distribution of the first job in the busy period. Furthermore, we classify a few examples of common and practical distributions as being subsets of $\mathcal{S}^{(n)}$ for some $n$.

Future work includes a simple characterization of the set of distributions that are well-represented by general $n$-phase PH distributions. It is known that the Erlang distribution has the lowest normalized second moment among all the $n$-phase PH distributions [1]. However, a lower bound on the normalized third moment of $n$-phase PH distributions is not known.
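To make the Erlang bound concrete, the sketch below computes the normalized moments of an Erlang distribution with $n$ phases from its raw moments $E[X^k] = n(n+1)\cdots(n+k-1)/\lambda^k$. It assumes the normalized moments are defined as $m_2 = E[X^2]/(E[X])^2$ and $m_3 = E[X^3]/(E[X]\,E[X^2])$, so the rate $\lambda$ cancels; the function name is illustrative only.

```python
from fractions import Fraction


def erlang_normalized_moments(n):
    """Normalized second and third moments of an Erlang distribution with n phases.

    The rate is set to 1 without loss of generality, since it cancels in
    m2 = mu2 / mu1^2 and m3 = mu3 / (mu1 * mu2).
    """
    if n < 1:
        raise ValueError("the number of phases must be at least 1")
    n = Fraction(n)
    mu1 = n
    mu2 = n * (n + 1)
    mu3 = n * (n + 1) * (n + 2)
    return mu2 / mu1 ** 2, mu3 / (mu1 * mu2)


if __name__ == "__main__":
    for n in (1, 2, 3, 10):
        m2, m3 = erlang_normalized_moments(n)
        # Closed forms: m2 = (n + 1) / n and m3 = (n + 2) / n.
        assert m2 == Fraction(n + 1, n)
        assert m3 == Fraction(n + 2, n)
        print(f"n={n}: m2={m2}, m3={m3}")
```

By the result cited as [1], $(n+1)/n$ is then the smallest normalized second moment attainable with $n$ phases, so a target distribution with $m_2$ below this value needs more than $n$ phases.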

References

[1] D. Aldous and L. Shepp. The least variable phase type distribution is Erlang. Communications in Statistics - Stochastic Models, 3:467–473, 1987.

[2] T. Altiok. On the phase-type approximations of general distributions. IIE Transactions, 17:110–116, 1985.

[3] M. E. Crovella and A. Bestavros. Self-similarity in World Wide Web traffic: Evidence and possible causes. IEEE/ACM Transactions on Networking, 5(6):835–846, December 1997.

[4] M. E. Crovella, M. S. Taqqu, and A. Bestavros. Heavy-tailed probability distributions in the world wide web. In A Practical Guide To Heavy Tails, chapter 1, pages 1–23. Chapman & Hall, New York, 1998.

[5] A. Cumani. On the canonical representation of homogeneous Markov processes modelling failure-time distributions. Microelectronics and Reliability, 22:583–602, 1982.

[6] A. Feldmann and W. Whitt. Fitting mixtures of exponentials to long-tail distributions to analyze network performance models. Performance Evaluation, 32:245–279, 1998.

[7] M. Harchol-Balter. Task assignment with unknown duration. Journal of the ACM, 49(2), 2002.

[8] M. Harchol-Balter and A. Downey. Exploiting process lifetime distributions for dynamic load balancing. In Proceedings of SIGMETRICS ’96, pages 13–24, 1996.

[9] M. Harchol-Balter, C. Li, T. Osogami, A. Scheller-Wolf, and M. Squillante. Task assignment with cycle stealing under central queue. To appear in Proceedings of 23rd International Conference on Distributed Computing Systems (ICDCS ’03), May 2003. For a copy see http://www.cs.cmu.edu/harchol/.


[10] G. Irlam. Unix file size survey - 1993. Available at http://www.base.com/gordoni/ufs93.html, September 1994.

[11] M. A. Johnson and M. F. Taaffe. A graphical investigation of error bounds for moment-based queueing approximations. Queueing Systems, 8:295–312, 1991.

[12] M. A. Johnson and M. R. Taaffe. Matching moments to phase distributions: Density function shapes. Communications in Statistics - Stochastic Models, 6:283–306, 1990.

[13] M. A. Johnson and M. R. Taaffe. Matching moments to phase distributions: Nonlinear programming approaches. Communications in Statistics - Stochastic Models, 6:259–281, 1990.

[14] S. Karlin and W. Studden. Tchebycheff Systems: With Applications in Analysis and Statistics. John Wiley and Sons, 1966.

[15] R. E. A. Khayari, R. Sadre, and B. Haverkort. Fitting world-wide web request traces with the EM-algorithm. In Proceedings of SPIE: Internet Performance and Control of Network Systems II, volume 4523, pages 211–220, 2001.

[16] G. Latouche and V. Ramaswami. Introduction to Matrix Analytic Methods in Stochastic Modeling. ASA-SIAM, Philadelphia, 1999.

[17] W. E. Leland and T. J. Ott. Load-balancing heuristics and process behavior. In Proceedings of Performance and ACM Sigmetrics, pages 54–69, 1986.

[18] R. Marie. Calculating equilibrium probabilities for $\lambda(n)/C_k/1/N$ queues. In Proceedings of Performance 1980, pages 117–125, 1980.

[19] M. F. Neuts. Matrix-Geometric Solutions in Stochastic Models. Johns Hopkins University Press, 1981.

[20] M. F. Neuts. Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach. The Johns Hopkins University Press, 1981.

[21] T. Osogami and M. Harchol-Balter. Necessary and sufficient conditions for representing general distributions by Coxians. Technical Report CMU-CS-02-178, School of Computer Science, Carnegie Mellon University, 2002.

[22] T. Osogami and M. Harchol-Balter. A closed-form solution for mapping general distributions to minimal PH distributions. Technical Report CMU-CS-03-114, School of Computer Science, Carnegie Mellon University, 2003.

[23] T. Osogami, M. Harchol-Balter, and A. Scheller-Wolf. Analysis of cycle stealing with switching cost. To appear in ACM Sigmetrics 2003 Conference on Measurement and Modeling of Computer Systems (SIGMETRICS ’03), June 2003. For a copy see http://www.cs.cmu.edu/harchol/.

[24] V. Paxson and S. Floyd. Wide-area traffic: The failure of Poisson modeling. IEEE/ACM Transactions on Networking, pages 226–244, June 1995.

[25] D. L. Peterson and D. B. Adams. Fractal patterns in DASD I/O traffic. In CMG Proceedings, December 1995.

[26] C. Sauer and K. Chandy. Approximate analysis of central server models. IBM Journal of Research and Development, 19:301–313, 1975.

[27] B. Schroeder and M. Harchol-Balter. Evaluation of task assignment policies for supercomputing servers: The case for load unbalancing and fairness. In Proceedings of HPDC 2000, pages 211–219, 2000.

[28] D. Starobinski and M. Sidi. Modeling and analysis of power-tail distributions via classical teletraffic methods. Queueing Systems, 36:243–267, 2000.

[29] M. Telek and A. Heindl. Moment bounds for acyclic discrete and continuous phase type distributions of second order. In Proceedings of UK Performance Evaluation Workshop, UKPEW 2002, 2002.

[30] W. Whitt. Approximating a point process by a renewal process: Two basic methods. Operations Research, 30:125–147, 1982.

[31] W. Whitt. On approximations for queues, III: Mixtures of exponential distributions. AT&T Bell Laboratories Technical Journal, 63:163–175, 1984.


A Proof of Theorems 8 and 9

Proof:[Theorem 8] The first three moments of� � !-+0��*10-�.� /� are

��

��

� � � � ��

��

� � � ��

��

and the normalized second and third moments are

��

� ��

��

� ��

��

where) � ��

.Since �

��

� � ��

�� for � ) �, ��

� is a nonincreasing function of). So, the minimum

of �� is given by evaluating it at) � � and the supremum is given by evaluating at) � �. Thus,

� ��

. Also, it is easy to see that�� and��

� satisfy��

��

.

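Proof:[Theorem 9] The first three moments of $G = \mathrm{PARETO}(\alpha)$ are $\frac{\alpha}{\alpha-1}$, $\frac{\alpha}{\alpha-2}$, and $\frac{\alpha}{\alpha-3}$, and the normalized second and third moments are

$$m_2^G = \frac{(\alpha-1)^2}{\alpha(\alpha-2)}, \qquad m_3^G = \frac{(\alpha-1)(\alpha-2)}{\alpha(\alpha-3)}.$$

Since $\frac{d m_2^G}{d\alpha} = -\frac{2(\alpha-1)}{\alpha^2(\alpha-2)^2} < 0$ for $\alpha > 3$, $m_2^G$ is a decreasing function of $\alpha$. So, the supremum of $m_2^G$ is given by evaluating it at $\alpha = 3$ and the infimum is given by letting $\alpha \to \infty$. Thus, $1 < m_2^G < \frac{4}{3}$.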

Also, it is easy to see that�� and��

� satisfy

��

��

� ��

��

� ��

��

�
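The Pareto case can be checked numerically in the same spirit. The sketch below assumes the density $\alpha x^{-\alpha-1}$ is supported on $x \ge 1$, so that $E[X^k] = \alpha/(\alpha-k)$ for $\alpha > k$, and that the normalized moments are defined as $m_2 = E[X^2]/(E[X])^2$ and $m_3 = E[X^3]/(E[X]\,E[X^2])$. The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction


def pareto_raw_moment(alpha, k):
    """k-th raw moment of the density alpha * x**(-alpha - 1) on x >= 1."""
    alpha = Fraction(alpha)
    if alpha <= k:
        raise ValueError("the k-th moment is finite only when alpha > k")
    return alpha / (alpha - k)


def pareto_normalized_moments(alpha):
    """Assumed definition: m2 = mu2 / mu1^2 and m3 = mu3 / (mu1 * mu2)."""
    mu1, mu2, mu3 = (pareto_raw_moment(alpha, k) for k in (1, 2, 3))
    return mu2 / mu1 ** 2, mu3 / (mu1 * mu2)


if __name__ == "__main__":
    for alpha in (Fraction(31, 10), Fraction(4), Fraction(10), Fraction(100)):
        m2, m3 = pareto_normalized_moments(alpha)
        # Closed forms obtained from the raw moments above.
        assert m2 == (alpha - 1) ** 2 / (alpha * (alpha - 2))
        assert m3 == (alpha - 1) * (alpha - 2) / (alpha * (alpha - 3))
        # m2 decreases from 4/3 (alpha -> 3) toward 1 (alpha -> infinity).
        assert 1 < m2 < Fraction(4, 3)
        print(f"alpha={alpha}: m2={m2}, m3={m3}")
```

Note that $m_3$ grows without bound as $\alpha$ approaches 3 from above, which is why the third moment, rather than the second, drives the difficulty of matching heavy-tailed distributions of this form.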

B Technical lemmas

Lemma 9.1 This lemma proves that (3) and (4) are equivalent sets.

Proof: Recall that set (3) is the union of the following three sets:

#� �

��

��

��

��

��

��

��

�

�

��

#� �

��

�

��

�

�

��

#� �

��

��

�

��

��


set (4) is the union of the following three sets:

��

��

��

��

� ��

��

��

��

��

��

��

��

��

��

�

It suffices to prove that (i)0� � �� , (ii) 0� � �� , and (iii) 0� � ��. (ii) and (iii) areimmediate from definition. To prove (i), we prove that0� � �� and�� 0�.

Consider a distribution� � 0�. We first show that� � �� . Let/� � be the upper bound of�� :

��

� � ��

��

�

and let.� � be the lower bound of�� :

��

� � � ��

� ��

��

�

�

Then,/� � and.� � are both continuous and increasing functions of for ��

�

��

�

by Lemmas

9.2 and 9.3. When�� , the range of is �

��

�. Thus,

�

��

� � �

��

��

��

� � ��

� � ��

��

�

and hence� � ��. When� � �� , the range of is �

��

��

�

. Thus,

�

��

� � �

��

��

��

� �

��

��

��

�

��

� �

and hence� � ��. Therefore,0� � �� . However, since/� � and.� � are continuous functions of ,��

� can take any value between the lower and upper bounds. Therefore,�� 0�.

Lemma 9.2 Let 2 � �. Then, ��

is an increasing function of for � � � ��

.

Proof: Note that� ��

� �. The inequality holds when� � � ��.

Lemma 9.3 Let 2 � �. Then, ��

is an increasing function of for ��

�.

Proof:

� �� $��

�� $�� $��

��

��


Let �� 2 �� 2 �� 2 � �� . Then,

�� $

��

�

�� $��

�� $��

� �

�� $

��

�

�� $��

��

��

� �

�� $

��

�

��

��

��

� �

��

The first inequality follows from�� and the second inequality follows from �

�. So,�� is a

non-decreasing function of . Thus,�� 2� �� .

Lemma 9.4 Let � � � and � � �. Then,

��

��

��

� � � �

� ��

Proof: Let

��

��

��

��

� ��

��

We prove that�� . Let

3��

It suffices to prove3�� . Observe that��

� � iff � ��

��

�� , where

%��

Notice that%�� (9)

Thus,� ��

�%��

��

��

for � � �. Therefore,3�� is minimized when

� ��

�%��

�� (10)

Let

&�� '

��

� �� %��

��

�

�� %�� %��

��

��

It suffices to prove�� . Let�� be the numerator of��. It suffices to prove�� . Notice


that�� . Thus, it suffices to prove�� for � � �.

�� %��

� ��

where

� �� %��

�� %��

� ��

�� %��

� ��

� �

Lemma 9.5 ��)� � ��

�� is an increasing function for ) � �.

Proof: Note that

� ��

��

Note that �� for ) � �, and let��)� � �� ) �� )� �� ). Then,��)� is positive for) � �,

since�� , ��)� � �� ) � �, �� , and��)� � ��

�� .

Lemma 9.6 ��)� � �� ) is an increasing function for ) � �.

Proof: Note that

� ��

��

Note that �� for ) � �, and let��)� � )� � �) �� ) � �. Then,��)� is positive for) � �, since

�� , ��)� � �) � � �� ) � �, �� , and��)� � ��

� �.

Lemma 9.7 Let � � ( � � or � � ( � �. Then, ��)� � �� is an increasing function for

) � �.

Proof: Note that

� ��

��

�

Note that��

��

�� "� � � � �� "� � � � � ��


for ) � �. Let ��)� � () �� (� �(� ��)� � ()��. Then,

��

�� "� � � � �� "� � � � � ��

for ) � �, since�� , ��)� � ( �(� �� (�)�� (�� (�)��, �� , and

�� "� � � � �� "� � � � � ��

Lemma 9.8 Let ��)� � �� . If � � ( � �, ��)� is an increasing function for ) � �. If

� � ( � �, ��)� is a decreasing function for ) � �.

Proof: Note that

� ��)� ��) � ��)��

�)� � )�� (�� )�� (� ��) � )��

��

Since ��

�� for ) � �, it is easy to see that��)� � � for ) � � when� � ( � �. To see that

� ��)� � � for ) � � when� � ( � �, we let

��)� � �� (�� )�� (� ��) � )��

Then,��)� � � for ) � �, since�� , ��)� � �� (��( � ��)�� ( � �� ( � ��)��,�� , and

��)� � ��(� ��( � ��(� ��)��)��) � �� ( � �� ( � ��


This research was sponsored in part by National Science Foundation (NSF) grant no. CCR-0122581.
