Production-Inventory Systems with Imperfect Advance Demand Information and Updating

Saif Benjaafar∗ William L. Cooper∗ Setareh Mardan†

October 25, 2010

Abstract

We consider a supplier with finite production capacity and stochastic production times. Customers provide advance demand information (ADI) to the supplier by announcing orders ahead of their due dates. However, this information is not perfect, and customers may request an order be fulfilled prior to or later than the expected due date. Customers update the status of their orders, but the time between consecutive updates is random. We formulate the production-control problem as a continuous-time Markov decision process and prove there is an optimal state-dependent base-stock policy, where the base-stock levels depend upon the numbers of orders at various stages of update. In addition, we derive results on the sensitivity of the state-dependent base-stock levels to the number of orders in each stage of update. In a numerical study, we examine the benefit of ADI, and find that it is most valuable to the supplier when the time between updates is moderate. We also consider the impact of holding and backorder costs, numbers of updates, and the fraction of customers that provide ADI. In addition, we find that while ADI is always beneficial to the supplier, this may not be the case for the customers who provide the ADI.

Keywords: Advance demand information, production-inventory systems, make-to-stock queues, continuous-time Markov decision processes

∗Program in Industrial and Systems Engineering, University of Minnesota, 111 Church Street S.E., Minneapolis, MN 55455

†PROS, 3100 Main Street, Suite #900, Houston, TX 77002

1 Introduction

It is increasingly common for members of the same supply chain to share advance demand information (ADI). This practice has been facilitated by information technologies such as the Internet, electronic data interchange (EDI), and radio frequency identification (RFID). It has also been supported by initiatives such as the inter-industry consortium on Collaborative Planning, Forecasting and Replenishment (CPFR), which provides a framework for participating companies to share future demand projections and coordinate ordering decisions. Large manufacturers, such as Toyota and Boeing, are tightly integrated with their first-tier suppliers, with whom they share production status, inventory usage, and even future design plans. Large retailers, such as Wal-Mart and Best Buy, have invested in sophisticated information collecting and processing infrastructure that enables them to share real-time inventory usage and point-of-sale (POS) data with thousands of their suppliers. Several manufacturers that sell directly to the consumer, such as Dell, and online retailers, such as Amazon, encourage their customers to place their orders early by offering discounts to those that accept later delivery dates. In some industries, suppliers allow their long-term customers to place soft orders far ahead of due dates, which may then later be firmed up, modified, or canceled.

Although ADI can take on different forms and may be enabled by a variety of technologies, it typically reduces to customers providing advance notice to their suppliers about the timing and size of future orders. This information can be perfect (exact information about future orders) or imperfect (estimates of timing or quantity of future orders). The information can also be explicit, with customers directly stating their intent about future orders, or implicit, with customers allowing suppliers to observe their internal operations and to determine estimates of future orders (the systems we consider in this paper are in part motivated by settings with such implicit information; we provide examples and further discussion later in this section). It is generally believed that ADI, even if imperfect, can improve supply chain performance. In particular, with information about future demand, a supplier may be able to reduce the need for inventory or excess capacity. Customers may also benefit through improved service quality or lower costs.

However, the availability of ADI raises questions. How should a supplier use ADI to make decisions? How valuable is ADI to suppliers and customers and how is this value affected by operating characteristics of the supplier and the quality of information provided by customers? How significant are benefits from receiving information further in advance or increasing the portion of customers that provide ADI? Is ADI equally beneficial to all parties in the supply chain and could it be harmful, particularly to the customer who provides it?

We address these and other related questions for a supplier that produces a single product. Customers furnish the supplier with ADI by announcing orders ahead of their due dates. However, this information is not perfect, and orders may become due prior to or later than the announced expected due date or they can be canceled altogether. Hence, the demand leadtime (the time between when an order is announced and when it is requested or canceled) is random. Customers provide status updates as their orders progress towards becoming due, but the time between consecutive updates is also random and independent of updates for other orders. In this paper, we are primarily motivated by settings where customers implicitly provide ADI to the supplier by allowing the supplier to observe their internal operations (e.g., order fulfillment, manufacturing, inventory usage), thereby enabling the supplier to estimate when customers will eventually place orders. We refer to such internal operations as the demand leadtime system.

For most of the paper, we assume that the actual due dates of different orders are independent of each other and orders that are announced later can become due before (or after) those that are announced earlier. Updates are also independent and do not follow a first-announced, first-updated rule. We refer to this as a system with independent due dates (IDD). In Section 5, we show how our treatment can be extended to systems where announced orders are updated and the orders become due in the sequence in which they are announced. We refer to this as a system with sequential due dates (SDD).

The following examples illustrate the types of settings we model in this paper. Consider a supplier that provides a component to a manufacturer, such as Boeing, of a large and complex product (e.g., an aircraft). The manufacturer informs the supplier each time it initiates the production of a new product and each time it completes a stage of the production process. The component provided by the supplier is not immediately needed and is required only at a later stage of the production process. The manufacturer does not accept early deliveries, but wishes to receive the component as soon as it is needed in a just-in-time fashion. The supplier uses the information about the progression of the product through the manufacturer’s production process to estimate when it will need to make a delivery to the manufacturer. To make such estimates, the supplier uses its knowledge of the manufacturer’s operations and available data from past interactions. However, the estimates are imperfect and the manufacturer (due to inherent variability) may complete a production stage sooner or later than expected. The manufacturer may initiate, in response to its own demand, the production of multiple products simultaneously (e.g., an aircraft manufacturer may assemble multiple airplanes in parallel). The evolution of these products through the production process is largely independent, so that a product that enters a particular stage of production later than another product may complete it sooner.

This type of ADI may also arise in settings other than manufacturing. For example, van Donselaar et al. (2001) present a case study of how builders provide material suppliers with ADI about the start and progress of construction projects. The suppliers use the information to estimate when a builder will need materials. This estimation is not perfect because progress on a construction project can be variable and because design specifications may change over the course of a project, sometimes leading the builders ultimately not to place orders.

In this paper, we provide a general framework for modeling systems with this type of ADI. The framework is broad enough to model a wide range of demand leadtime systems with various assumptions regarding due date updating. The demand leadtime system can be viewed in general as a queueing system composed of parallel servers, with service times consisting of multiple stages of random duration and the completion of a stage corresponding to an update. Arrivals to the demand leadtime system correspond to orders being announced; similar to service times, interarrival times are random and may have multiple stages, with the completion of a stage indicating an update. Departures from the demand leadtime system correspond to orders becoming due.

In the systems we study, the supplier has finite capacity, producing items one at a time with stochastic production times. Hence, the supplier itself can be viewed as a single-server queue, with arrivals corresponding to orders becoming due (i.e., an arrival to the supplier is a departure from the demand leadtime system). The supplier has the ability to produce items ahead of their due dates in a make-to-stock fashion. However, items in inventory incur a holding cost. When an order becomes due and cannot be immediately satisfied from inventory, it is backordered and incurs a backorder cost. The supplier’s objective is to find a production control policy to minimize the expected total discounted cost or the expected average cost per unit time.

We formulate the problem as a continuous-time Markov decision process (MDP). We show that there is an optimal production policy that is a state-dependent base-stock policy, wherein the supplier produces if and only if the net inventory is below a base-stock level, which depends only upon the numbers of orders in various stages of update. We also derive results on the sensitivity of the state-dependent base-stock levels to the numbers of orders in each stage of update. For SDD systems, we obtain similar results. In our analysis, we develop a method for proving structural properties of optimal policies of continuous-time Markov decision processes (CTMDPs) with unbounded jump rates. The derived structure is useful as it allows one to compute and store an optimal policy in terms of just the base-stock levels, simplifying the policy’s implementation. The structure of the optimal policy can also guide the construction of simpler heuristics if needed, or help in assessing the effectiveness of heuristics that may already be in use.

We also conduct a numerical study to examine the benefits of ADI to both suppliers and customers by comparing systems with ADI, without ADI, and with partial ADI. The study yields several managerial insights, a few of which we now summarize. Increasing the average demand leadtime by increasing the number of updates always reduces the supplier’s cost. However, given a fixed number of updates, increasing the average time between updates may increase or decrease cost. ADI is most valuable when the average time between updates is moderate. ADI is less valuable when the average time between updates is short, because there is little time to react to information. It is also less valuable when the average time between updates is long, because the earlier notice comes with an increase in variability of the demand leadtime. This points out that obtaining earlier notice (on average) of orders is not necessarily desirable, and that when evaluating the benefit of ADI, it is important to account for the mechanism by which this ADI might be obtained. The incremental cost reduction from updating is often small compared to that from announcing orders ahead of their due dates. Typically, much of the benefit of ADI can be realized if customers provide just initial advance order announcements and few or no updates. Although ADI leads to an overall reduction in cost, in some cases it may be used by the supplier to reduce inventory at the expense of more backorders. Therefore, customers that provide ADI may witness a decline in service levels. However, in exchange for ADI, customers are in position to negotiate an increase in the backorder penalty they apply to the supplier. Higher backorder penalties can serve as a mechanism for customers to deter suppliers from reducing service levels, or as a mechanism to share, indirectly, the cost savings from ADI.

The remainder of the paper is organized as follows. In Section 2, we give a brief literature review and summarize our contribution. In Section 3, we formulate the problem and describe the structure of an optimal policy. We also discuss extensions including systems with variable numbers of updates, order cancelations, multiple customer classes, and lost sales. In Section 4, we present numerical results. In Section 5, we extend our analysis to systems with sequential updating. In Section 6, we offer a summary and concluding comments. Proofs are in the Appendix and Online Supplement.

2 Literature Review and Summary of Contributions

There is a growing literature on inventory systems with ADI. A review of much of this work can be found in Gallego and Ozer (2002). Models can be broadly classified into two categories based on whether inventory is reviewed periodically or continuously.

For systems with periodic review, ADI is typically modeled as information available about demand in future periods. Under varying assumptions, Gallego and Ozer (2001), Ozer and Wei (2004), and Schwarz et al. (1997) have shown the existence of optimal state-dependent base-stock policies for periodic-review problems with ADI. In these papers, the base-stock levels depend upon a vector of advance orders for future periods. Ozer (2003) extends the analysis to distribution systems with multiple retailers, Gallego and Ozer (2003) to serial systems, and Wang and Toktay (2008) to systems with flexible delivery. Other papers that consider periodic-review systems with ADI include Thonemann (2002), Gavirneni et al. (1999), and DeCroix and Mookerjee (1997).

For continuous-review inventory systems with ADI, Buzacott and Shanthikumar (1994) consider production-inventory systems with ADI and evaluate policies that use two parameters: a base-stock level and a release leadtime. Hariharan and Zipkin (1995) introduced the notion of demand leadtime in a system where orders are announced a fixed amount of time before they are due. For constant supply leadtimes and Poisson order arrivals, they show that there is an optimal base-stock policy with a fixed base-stock level. Karaesmen et al. (2002) analyze a discrete-time model with constant demand leadtimes that is similar to our SDD model with no due-date updating (see Section 5). They prove the optimality of state-dependent base-stock policies. Gallego and Ozer (2002, Section 2.4) consider a system similar to a special case of our SDD setting, but with exogenous load-independent supply leadtimes. Gayon et al. (2009) study a system similar to our IDD scheme but with multiple demand classes, lost sales, and no due-date updates. Other papers that deal with continuous-review systems include Liberopoulos et al. (2003) and Karaesmen et al. (2004).

It is possible to view the demand leadtime system in our model as a Markovian demand-modulating process with transition probabilities between states determined by the dynamics of order announcements and due dates. Previous literature (see, e.g., Chen and Song 2001) has established the optimality of state-dependent base-stock policies for periodic-review inventory problems with Markov-modulated demand and exogenous leadtimes. The results of Chen and Song do not directly apply to our setting of endogenous leadtimes and continuous review. Nevertheless, it might be possible to develop an alternate analysis of the systems considered herein using techniques from the study of inventory models with Markov-modulated demand.

Advance demand information can be viewed as a form of forecast updating. Examples of papers that deal with inventory systems with periodic forecast updates include Graves et al. (1986), Heath and Jackson (1994), Gullu (1996), Sethi et al. (2001), Zhu and Thonemann (2004), and references therein. The models we present in this paper can be viewed as dealing with forecast updates. However, in our case the updates are with respect to the timing of future demand.

Finally, there is a literature that deals with how a supplier should quote delivery leadtimes to its customers; see, for example, Duenyas and Hopp (1995), Hopp and Sturgis (2001), and references therein. The setting studied in this literature is quite different from ours and typically concerns make-to-order systems where no finished goods inventory is held in advance of customer orders.

Relative to the above literature, we make the following contributions. Our paper is the first to consider imperfect ADI with updates for continuous-review production-inventory systems. It also appears to be the first to directly model stochastic demand leadtimes and distinguish between systems with independent and sequential due date updates, and to derive the structure of an optimal policy for each. The paper offers one of the most general models of ADI in the literature (e.g., systems with no updates or with a single update can be treated as special cases). The modeling framework is flexible and can accommodate additional features such as random numbers of updates, order cancelations, multiple demand classes, and lost sales. Moreover, the numerical study yields new insights on the benefit of ADI to suppliers, highlighting important effects due to capacity, demand leadtime, and cost parameters. It also contrasts the impact of (i) increasing the number of updates, (ii) increasing the fraction of customers who give ADI, and (iii) increasing the length of individual update stages. The numerical results also shed light on effects of ADI on customers. We show that customers may see their service quality deteriorate if they provide ADI to their suppliers.

Beyond the context of ADI, our paper also contains an approach for proving structural properties of optimal policies of CTMDPs with unbounded jump rates. (The IDD model has unbounded jump rates.) The usual approach for proving structural properties for CTMDPs with bounded jump rates is to first uniformize (see, e.g., Lippman 1975) the CTMDP to get an equivalent discrete-time Markov decision process (DTMDP), and then to show that certain properties of functions are preserved by the DTMDP transition operator. Results then follow using induction and the convergence of value iteration. With unbounded jump rates, uniformization cannot be applied, and hence the “usual approach” does not work. Our method for CTMDPs with unbounded jump rates involves proving the desired structural properties for each of a sequence of problems with bounded jump rates, and then extending to the problem with unbounded jump rates by passing to a limit via a suitably chosen subsequence and appealing to results of Guo and Hernandez-Lerma (2003). Although the method is somewhat intuitive, it involves resolving a number of non-trivial technical issues, some of which are problem specific. A variant of the approach was used in the paper by Gayon et al. (2009) cited above. The approach may prove useful in other problems with unbounded jump rates.

3 Problem Formulation and Structure of an Optimal Policy

Consider a supplier of a single product, who can produce at most one unit of the product at a time. The supplier may hold completed units of the product in inventory. Any such unit of inventory incurs a holding cost of $h$ per unit time.

We model ADI through the notion of a demand leadtime system. (As mentioned in the introduction, such a system may represent the internal operations of customers; ADI is provided implicitly by allowing the supplier to view these internal operations.) Orders for the product are announced before their due dates. Such announcements may be viewed as arrivals to the demand leadtime system. We assume that the announcements arrive continuously over time according to a Poisson process with rate $\lambda$. The amount of time between when an order is announced and when it becomes due is random. We refer to this random variable as the demand leadtime. The demand leadtime of an order is the amount of time it spends in the demand leadtime system. We assume orders are homogeneous in the sense that demand leadtimes have the same distribution for all orders, and hence the expected demand leadtime is the same for all orders.

After an order is announced, it progresses through the demand leadtime system before becoming due. Specifically, it undergoes a series of $k-1$ updates ($k \ge 1$). For $i = 1, \dots, k-1$, the time between the $(i-1)$th and $i$th update is exponentially distributed with mean $\nu_i^{-1}$. (The 0th update is the order’s initial announcement.) The time between the $(k-1)$th update and when the order becomes due is exponentially distributed with mean $\nu_k^{-1}$. Hence, each demand leadtime consists of $k$ exponentially distributed stages with the expected demand leadtime of each order equal to $\nu_1^{-1} + \cdots + \nu_k^{-1}$. (The case with $k = 1$ represents a situation with no updates and exponential demand leadtimes.) When an order has undergone exactly $i-1$ updates we say that it is in stage $i$. Viewed in this fashion, the $i$th update corresponds to an order moving from stage $i$ to stage $i+1$. When an order undergoes its $i$th update, the supplier learns that the order’s expected remaining demand leadtime has decreased from $\nu_i^{-1} + \cdots + \nu_k^{-1}$ to $\nu_{i+1}^{-1} + \cdots + \nu_k^{-1}$. Equivalently, we may think of demand leadtime as having a phase-type distribution with $k$ phases in series. Information is provided each time the demand leadtime completes a phase. In the case where $\nu_i = \nu$ for $i = 1, \dots, k$, demand leadtimes have an Erlang distribution. Note that the process by which orders are announced, updated, and become due can be viewed as an M/G/∞ queue.
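The stage structure described above can be illustrated with a minimal Python sketch. The function names and the example rates are assumptions chosen for illustration; only the model itself (k exponential stages in series with rates ν_1, ..., ν_k) comes from the text.

```python
import random


def expected_remaining_leadtime(nu, stage):
    """Expected remaining demand leadtime for an order in `stage` (1-based).

    `nu` is the list of stage rates [nu_1, ..., nu_k]; the result is
    1/nu_stage + ... + 1/nu_k, as in the text.
    """
    return sum(1.0 / rate for rate in nu[stage - 1:])


def sample_demand_leadtime(nu, rng):
    """Draw one demand leadtime as a sum of k independent exponential stages."""
    return sum(rng.expovariate(rate) for rate in nu)


# Hypothetical rates for a three-stage demand leadtime (k = 3).
nu = [2.0, 4.0, 4.0]
remaining = [expected_remaining_leadtime(nu, i) for i in (1, 2, 3)]
draw = sample_demand_leadtime(nu, random.Random(0))
```

Each update moves the order one stage forward, so the supplier's estimate of the remaining leadtime drops by the mean of the stage just completed.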

After an order becomes due (i.e., after it leaves the demand leadtime system), the supplier fills the order if it has inventory on hand. If the supplier does not have inventory on hand, then the order is backordered and incurs a backorder cost of $b$ per unit time. Orders do not incur backorder costs before they are due; i.e., orders in the demand leadtime system do not incur backorder costs.

As mentioned above, the supplier can produce one item at a time. We assume that production times are exponentially distributed with mean $\mu^{-1}$. Hence, the production process can itself be viewed as a queue, whose input is provided by the output of the demand leadtime system.

The assumptions of Poisson arrivals of announcements and exponential production and update times are made in part for mathematical tractability as they allow the problem to be cast as a CTMDP. They are also appropriate for approximating systems with high variability. Such Markovian assumptions are consistent with previous studies of production-inventory systems; see, e.g., Buzacott and Shanthikumar (1993), Ha (1997), Zipkin (2000), de Vericourt et al. (2002), and others. Later, we partially relax these assumptions.

In the remainder of this section, we develop the formulation and describe the structure of the optimal policy. This is done by first analyzing in Section 3.1 a simplified version (with a truncated state space) of the problem. We then extend the analysis in Section 3.2 to systems without the truncation and state our main result for this section in Theorem 2.

3.1 Bounded Jump Rates

In this section, we assume that the total number of announced orders (i.e., the number of orders in the demand leadtime system) at any instant remains bounded by a finite integer $m < \infty$, so that $\sum_{i=1}^{k} y_i \le m$, where $y_i$ is the number of orders in stage $i$. Order announcements are rejected and leave without entering the demand leadtime system (and hence never become due) if $\sum_{i=1}^{k} y_i = m$. This assumption allows us to formulate the problem as a Markov decision process with bounded jump rates. From a queueing perspective, the introduction of the finite $m$ means that we approximate the M/G/∞ queue mentioned above by an M/G/m/m queue (an Erlang loss system). When $m$ is chosen to be large, very few order announcements are rejected and hence the arrival rate of due orders to the production facility will be roughly the same as the arrival rate of order announcements to the demand leadtime system. In fact, the precise arrival rate of due orders to the production facility will be $\lambda(1 - B(m))$, where $B(m)$ is the probability that an M/G/m/m queue is full. The probability $B(m)$ approaches 0 as $m \to \infty$. The exact value of $B(m)$ is given by the well-known Erlang loss formula. Using the results of this section as a building block, in Section 3.2 we extend our results to the case with no bound on the total announced orders.
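As a rough illustration of how quickly $B(m)$ vanishes, the Erlang loss formula can be evaluated with its standard recursion. This is a sketch under stated assumptions: the function names are hypothetical, and the offered load is taken to be $\lambda(\nu_1^{-1} + \cdots + \nu_k^{-1})$, i.e., the announcement rate times the mean demand leadtime, relying on the fact that the Erlang loss formula depends on the service distribution only through its mean.

```python
def erlang_b(offered_load, m):
    """Blocking probability B(m) of an M/G/m/m queue.

    Uses the recursion B(0) = 1, B(j) = a*B(j-1) / (j + a*B(j-1)),
    where a is the offered load.
    """
    b = 1.0
    for j in range(1, m + 1):
        b = offered_load * b / (j + offered_load * b)
    return b


def due_order_rate(lam, nu, m):
    """Arrival rate of due orders to the production facility, lam * (1 - B(m))."""
    offered_load = lam * sum(1.0 / rate for rate in nu)
    return lam * (1.0 - erlang_b(offered_load, m))


# Hypothetical parameters: lam = 0.8 and two update stages with rate 0.5 each.
rates = [due_order_rate(0.8, [0.5, 0.5], m) for m in (5, 10, 20, 40)]
```

For a fixed offered load the values in `rates` increase toward `lam` as `m` grows, which is the sense in which the truncated system approximates the untruncated one.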

Let $\mathbb{Z}$ and $\mathbb{Z}_+$ be respectively the sets of integers and non-negative integers, and let $\mathbb{Z}^k$ and $\mathbb{Z}_+^k$ be their $k$-dimensional cross products. Let $\mathbb{R}$ be the real numbers. Throughout, $\mathbf{y} = (y_1, \dots, y_k)$. The MDP has state space $S_m := \mathbb{Z} \times \mathbb{Z}_+^k(m)$, where $\mathbb{Z}_+^k(m) := \{\mathbf{y} \in \mathbb{Z}_+^k : \sum_{i=1}^{k} y_i \le m\}$. To keep notation clean, we will indicate the dependence on $m$ only in notation that is used later for extending to the case without $m$. It is, however, important to keep in mind that most of the quantities in this section do depend upon $m$, even if this is not reflected in the notation.

The state of the system is determined by $X(t)$, which represents the net inventory at time $t$, and $\mathbf{Y}(t) = (Y_1(t), \dots, Y_k(t))$, where $Y_i(t)$ is the number of announced orders in stage $i$ at time $t$.

In each state, two actions are possible: produce or idle (do not produce). The objective is to find a production policy that minimizes the long-run expected discounted cost. Let the set of such production policies be denoted by $\Pi$. A deterministic stationary policy $\pi := \{\pi(x,\mathbf{y}) : (x,\mathbf{y}) \in S_m\}$ specifies the action taken at any time as a function only of the state of the system, where $\pi(x,\mathbf{y}) = 1$ means produce in state $(x,\mathbf{y})$, and $\pi(x,\mathbf{y}) = 0$ means idle in state $(x,\mathbf{y})$.

We will work with a uniformized version (see, e.g., Lippman, 1975) of the problem in which the transition rate in each state under any action is $\Lambda := \lambda + \mu + m \sum_{i=1}^{k} \nu_i$ so that the transition times $0 = \tau_0 \le \tau_1 \le \tau_2 \le \dots$ are such that $\{\tau_{n+1} - \tau_n : n \ge 0\}$ is a sequence of i.i.d. exponential random variables, each with mean $\Lambda^{-1}$. Let $\{(X_n, \mathbf{Y}_n) : n \ge 0\}$ denote the embedded Markov chain of states; that is, $(X_n, \mathbf{Y}_n) := (X(\tau_n), \mathbf{Y}(\tau_n))$ is the state immediately after the $n$-th transition. For $i = 1, \dots, k$, let $\mathbf{e}_i$ be the $k$-dimensional vector with 1 in position $i$ and zeros elsewhere. Let $\mathbf{e}_0$ be the $k$-dimensional vector of zeros. If action $a \in \{0, 1\}$ is selected in state $(x,\mathbf{y})$, then the next state of the embedded Markov chain is $(x',\mathbf{y}')$ with probability

$$
p_{(x,\mathbf{y}),(x',\mathbf{y}')}(a) :=
\begin{cases}
\Lambda^{-1} \mu I\{a = 1\} & \text{if } (x',\mathbf{y}') = (x+1, \mathbf{y}) \\
\Lambda^{-1} \lambda I\{y < m\} & \text{if } (x',\mathbf{y}') = (x, \mathbf{y} + \mathbf{e}_1) \\
\Lambda^{-1} \nu_i y_i I\{y_i \ge 1\} & \text{if } (x',\mathbf{y}') = (x, \mathbf{y} + \mathbf{e}_{i+1} - \mathbf{e}_i),\ i = 1, \dots, k-1 \\
\Lambda^{-1} \nu_k y_k I\{y_k \ge 1\} & \text{if } (x',\mathbf{y}') = (x-1, \mathbf{y} - \mathbf{e}_k) \\
\Lambda^{-1} \left[ \Lambda - \mu I\{a = 1\} - \lambda I\{y < m\} - \sum_{i=1}^{k} \nu_i y_i I\{y_i \ge 1\} \right] & \text{if } (x',\mathbf{y}') = (x, \mathbf{y}) \\
0 & \text{otherwise,}
\end{cases}
$$

where $y := \sum_{i=1}^{k} y_i$ and $I\{\cdot\}$ is the indicator function. The cost rate when the state is $(x,\mathbf{y})$ is $c(x,\mathbf{y}) = c(x) := h x^+ + b x^-$, where $h > 0$ and $b > 0$ are the per-unit holding and backorder cost rates, and $x^+ = \max\{x, 0\}$ and $x^- = -\min\{x, 0\}$. Here, we again emphasize that backorder costs are incurred only when an order becomes due and is not immediately satisfied. Jobs inside the demand leadtime system (which have been announced, but which are not yet due) do not incur backorder costs.
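The transition probabilities of the uniformized chain can be tabulated directly from the case-by-case definition. The sketch below is illustrative only: the function name, the tuple representation of the order vector, and the parameter values are assumptions, and stage indices in the code are 0-based.

```python
def transition_probs(x, y, a, lam, mu, nu, m):
    """Return {next_state: probability} for state (x, y) under action a.

    y is a tuple (y_1, ..., y_k) of order counts per stage; a = 1 means
    produce and a = 0 means idle. Rates are divided by the uniformization
    rate Lam = lam + mu + m * sum(nu).
    """
    k = len(nu)
    Lam = lam + mu + m * sum(nu)
    probs = {}

    def add(state, rate):
        if rate > 0:
            probs[state] = probs.get(state, 0.0) + rate / Lam

    add((x + 1, y), mu if a == 1 else 0.0)        # production completion
    if sum(y) < m:                                # announcement accepted
        add((x, (y[0] + 1,) + y[1:]), lam)
    for i in range(k):
        if y[i] >= 1:
            z = list(y)
            z[i] -= 1
            if i < k - 1:                         # update: stage i -> i + 1
                z[i + 1] += 1
                add((x, tuple(z)), nu[i] * y[i])
            else:                                 # order becomes due
                add((x - 1, tuple(z)), nu[i] * y[i])
    stay = max(1.0 - sum(probs.values()), 0.0)    # null transition
    probs[(x, y)] = probs.get((x, y), 0.0) + stay
    return probs


# Hypothetical parameters with k = 2 stages and truncation level m = 3.
p = transition_probs(0, (1, 0), 1, lam=1.0, mu=2.0, nu=[0.5, 0.5], m=3)
```

The self-loop probability absorbs whatever part of the uniformization rate is not used by real events, so each row sums to one.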

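The value function, which specifies the optimal expected total discounted cost, is given by

$$
v^*_m(x,\mathbf{y}) := \inf_{\pi \in \Pi} E^{\pi}_{(x,\mathbf{y})} \left[ \int_{t=0}^{\infty} e^{-\beta t} c(X(t))\, dt \right] = \inf_{\pi \in \Pi} E^{\pi}_{(x,\mathbf{y})} \left[ \sum_{n=0}^{\infty} \left( \frac{\Lambda}{\gamma} \right)^{n} \frac{c(X_n)}{\gamma} \right], \qquad (1)
$$

where $\beta > 0$ is the discount rate, $\gamma := \beta + \Lambda$, and $E^{\pi}_{(x,\mathbf{y})}$ denotes expectation with respect to the probability measure determined by policy $\pi$ and $(X(0), \mathbf{Y}(0)) = (x,\mathbf{y})$.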
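The second equality in (1) can be checked with a short calculation. The sketch below assumes, as is standard for uniformized chains, that the transition epochs $\tau_n$ are independent of the embedded state sequence; it is offered as a plausibility check rather than as the argument used in the paper.

```latex
% Net inventory is constant between transitions, so
\int_{0}^{\infty} e^{-\beta t} c(X(t))\,dt
  = \sum_{n=0}^{\infty} c(X_n) \int_{\tau_n}^{\tau_{n+1}} e^{-\beta t}\,dt .

% \tau_n is a sum of n i.i.d. exponentials with rate \Lambda, so
% E[e^{-\beta \tau_n}] = (\Lambda / (\Lambda + \beta))^n = (\Lambda/\gamma)^n, and
E\left[ \int_{\tau_n}^{\tau_{n+1}} e^{-\beta t}\,dt \right]
  = E\left[ e^{-\beta \tau_n} \right]
    \cdot \frac{1 - E\left[ e^{-\beta(\tau_{n+1} - \tau_n)} \right]}{\beta}
  = \left( \frac{\Lambda}{\gamma} \right)^{n}
    \cdot \frac{1}{\beta} \left( 1 - \frac{\Lambda}{\gamma} \right)
  = \left( \frac{\Lambda}{\gamma} \right)^{n} \frac{1}{\gamma} .
```

The last step uses $1 - \Lambda/\gamma = \beta/\gamma$, which follows from $\gamma = \beta + \Lambda$.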

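Let $V$ be the set of real-valued functions on $S_m$ and let $v$ be an arbitrary element of $V$. Define $T_\lambda, T'_i, T_\mu : V \to V$ as follows:

$$
\begin{aligned}
T_\lambda v(x,\mathbf{y}) &:= v(x, \mathbf{y} + \mathbf{e}_1 I\{y < m\}) \\
T'_i v(x,\mathbf{y}) &:= v(x, \mathbf{y} + [\mathbf{e}_{i+1} - \mathbf{e}_i] I\{y_i \ge 1\}), \quad i = 1, \dots, k-1 && (2) \\
T'_k v(x,\mathbf{y}) &:= v(x - I\{y_k \ge 1\}, \mathbf{y} - \mathbf{e}_k I\{y_k \ge 1\}) && (3) \\
T_\mu v(x,\mathbf{y}) &:= \min\{v(x,\mathbf{y}), v(x+1,\mathbf{y})\}.
\end{aligned}
$$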

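Consider also the operator $T : V \to V$ defined by

$$
\begin{aligned}
Tv(x,\mathbf{y}) &:= \min_{a \in \{0,1\}} \left\{ \frac{c(x)}{\gamma} + \frac{\Lambda}{\gamma} \sum_{(x',\mathbf{y}') \in S_m} p_{(x,\mathbf{y}),(x',\mathbf{y}')}(a)\, v(x',\mathbf{y}') \right\} && (4) \\
&= \gamma^{-1} \left[ c(x) + \lambda T_\lambda v(x,\mathbf{y}) + \sum_{i=1}^{k} \nu_i y_i T'_i v(x,\mathbf{y}) + \sum_{i=1}^{k} \nu_i (m - y_i) v(x,\mathbf{y}) + \mu T_\mu v(x,\mathbf{y}) \right]. && (5)
\end{aligned}
$$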

The function $v^*_m$ defined in (1) is the minimum non-negative solution of the optimality equation

$$
v = Tv, \qquad (6)
$$

and moreover a stationary policy that specifies for each $(x,\mathbf{y})$ an action that attains the minimum on the right-hand side of (6) is optimal. See, e.g., Section 11.5 of Puterman (1994).

In the optimality equation (6), operator Tλ corresponds to the arrival of a customer. More

precisely, if v(x,y) represents the “value” of being in state (x,y), then Tλv(x,y) is the value just

after an arrival occurs when the state is (x,y). Similarly, operator T ′i ; i = 1, . . . , k − 1 corresponds

to an update of an order from stage i to stage i + 1 and operator T ′k corresponds to an order

becoming due. Operator Tµ corresponds to the production decision. When v(x + 1,y) < v(x,y),

it is better to produce a unit of inventory than it is to idle. In this case Tµv(x,y) = v(x + 1,y)

represents the value just after the completion of the unit of inventory when the state is (x,y). When

v(x + 1,y) ≥ v(x,y), it is instead better to idle, in which case Tµv(x,y) = v(x,y). The other term,

∑_{i=1}^{k} νi(m − yi)v(x,y), in the optimality equation corresponds to null transitions introduced

through uniformization of the jump rate. To understand the term λ that multiplies Tλv(x,y), note

that in state (x,y), the next event will be an arrival with probability λ/Λ. Similar interpretations

are possible for the other multipliers. The term λ also represents the rate of order announcements.

Likewise, µ is the rate of potential production completions, νiyi is the rate of updates at stage i

when yi orders are in stage i, and ∑_{i=1}^{k} νi(m − yi) is the rate of null transitions when there are y jobs

in the demand leadtime system. Hence Λ is the overall rate of (real and null) transitions.
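The operator T in (5) can be sketched numerically for a single-stage system (k = 1). The sketch below is an illustration under stated assumptions, not the authors' implementation: net inventory is truncated to a finite grid xs of consecutive integers, the cost rate is assumed to be c(x) = h max(x, 0) + b max(−x, 0), the uniformization rate is taken to be Λ = λ + νm + µ (the sum of the multipliers in (5)), and all function and parameter names are hypothetical.

```python
import numpy as np

def apply_T(v, xs, m, lam, nu, mu, beta, h, b):
    """Apply the uniformized operator T of (5) with k = 1 to v[i, y] ~ v(xs[i], y)."""
    Lam = lam + nu * m + mu                      # assumed uniformization rate
    gamma = beta + Lam
    # Assumed cost rate: h per unit on hand, b per unit backordered.
    cost = (h * np.maximum(xs, 0) + b * np.maximum(-xs, 0)).astype(float)[:, None]
    y = np.arange(m + 1)[None, :]

    # T_lambda: an announcement adds one order unless y is already m.
    v_arr = np.concatenate([v[:, 1:], v[:, -1:]], axis=1)
    # T'_k: if y >= 1, an order becomes due and (x, y) moves to (x - 1, y - 1).
    v_due = v.copy()
    v_due[1:, 1:] = v[:-1, :-1]
    v_due[0, 1:] = v[0, :-1]                     # grid truncation at the lowest x
    # T_mu: idle (stay at x) or produce (move to x + 1); idle at the top of the grid.
    v_prod = np.minimum(v, np.concatenate([v[1:, :], v[-1:, :]], axis=0))

    return (cost + lam * v_arr + nu * y * v_due
            + nu * (m - y) * v + mu * v_prod) / gamma

def solve_discounted(xs, m, lam, nu, mu, beta, h, b, tol=1e-8, max_iter=100_000):
    """Iterate v <- Tv from v = 0 until successive iterates differ by less than tol."""
    v = np.zeros((len(xs), m + 1))
    for _ in range(max_iter):
        v_next = apply_T(v, xs, m, lam, nu, mu, beta, h, b)
        if np.max(np.abs(v_next - v)) < tol:
            return v_next
        v = v_next
    return v

# Illustrative parameter values only.
xs = np.arange(-20, 21)
v = solve_discounted(xs, m=5, lam=0.6, nu=0.1, mu=1.0, beta=0.1, h=10.0, b=100.0)
```

Because the discount factor Λ/γ is strictly less than one, the iteration is a contraction and converges geometrically; the truncation of x affects only states far from the region where the production decision changes.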

In preparation for Theorem 1, let ∆v(x,y) := v(x + 1,y)− v(x,y) for v ∈ V and let U := {v ∈

V : v satisfies conditions (C1)–(C4)}, where conditions (C1)–(C4) are defined as follows:

(C1) ∆v(x,y) ≤ ∆v(x + 1,y) for all x ∈ Z, y ∈ Z^k_+(m).

(C2) ∆v(x,y + e_j) ≤ ∆v(x + 1,y + e_l) for all x ∈ Z, y ∈ Z^k_+(m − 1), j = 0, . . . , k − 1, and l = j + 1, . . . , k.

(C3) ∆v(x,y + e_{j+1}) ≤ ∆v(x,y + e_j) for all x ∈ Z, y ∈ Z^k_+(m − 1), and j = 0, . . . , k − 1.

(C4) ∆v(x,y) ≤ 0 for all x ∈ Z with x < 0, y ∈ Z^k_+(m).

As we can see from Proposition 1 below, the value function satisfies these conditions. The fact that

the value function satisfies these conditions implies certain structural properties for the optimal

policy. In particular, Condition (C1) is a convexity property that can be used to show the existence

of a state-dependent base-stock optimal policy. Conditions (C2) and (C3) can be used to show that

the announcement of a new order or the update of an existing order will cause the base-stock level

either to increase by one or to remain unchanged. Condition (C4) can be used to show that it is

optimal to produce whenever there are backorders.

Proposition 1 The value function is an element of U ; that is, v∗m ∈ U .

The proof of Proposition 1 is in Section S-1 of the Online Supplement. We are now ready for

the main result of the section. Theorem 1 describes the structure of an optimal policy; a proof is

in the appendix.

Theorem 1 The stationary state-dependent base-stock policy π∗ = {π∗(x,y)} given by

$$ \pi^*(x,y) := \begin{cases} 0 & \text{if } x \ge s_y \\ 1 & \text{if } x < s_y \end{cases} \tag{7} $$

where s_y := min{x : v∗m(x + 1,y) − v∗m(x,y) ≥ 0}, is optimal. In addition, (a) the base-stock levels

satisfy s_{y+e_l} ∈ {s_{y+e_j}, s_{y+e_j} + 1} for j = 0, . . . , k − 1; l = j + 1, . . . , k and (b) π∗(x,y) = 1 if x < 0.

Theorem 1 states that for each vector y of announced orders there exists a threshold sy such that

it is optimal to produce if net inventory is less than sy, and it is optimal to idle if net inventory is

at least sy. We refer to the parameters {sy} as the y-dependent base-stock levels. Part (a) with

l = j + 1 indicates that the y-dependent base-stock level increases by at most one if an order is

updated or if a new order is announced. It also follows from part (a) that sy is nondecreasing in each

component of y; i.e., s_y ≤ s_{y+ie_j} for i ∈ Z+ and j = 1, . . . , k. Part (b) states that it is optimal to

produce if there are any backorders. These results are consistent with those obtained by Ozer and

Wei (2004), who show a similar structure to the optimal policy in periodic-review systems where

ADI consists of confirmed demand for future periods (e.g., see Theorem 2 in Ozer and Wei 2004).
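As a small illustration of how the thresholds in Theorem 1 could be read off a tabulated value function, the sketch below computes s_y = min{x : v(x + 1,y) − v(x,y) ≥ 0} column by column for k = 1. The function names are hypothetical, the grid xs is assumed to consist of consecutive integers, and the function v(x,y) = (x − y)^2 is a toy stand-in chosen only because it is convex in x; it is not a value function from the paper.

```python
import numpy as np

def base_stock_levels(v, xs):
    """Return s_y = min{x : v(x + 1, y) - v(x, y) >= 0} for each column y of v."""
    dv = v[1:, :] - v[:-1, :]            # dv[i, y] = v(xs[i] + 1, y) - v(xs[i], y)
    levels = []
    for y in range(v.shape[1]):
        hits = np.flatnonzero(dv[:, y] >= 0)
        # If no sign change occurs on the grid, fall back to the largest grid point.
        levels.append(int(xs[hits[0]]) if hits.size else int(xs[-1]))
    return levels

def production_action(x, s_y):
    """Policy (7): produce (1) when x < s_y, idle (0) otherwise."""
    return 1 if x < s_y else 0

def increments_at_most_one(levels):
    """Check the k = 1 form of part (a): s_{y+1} is either s_y or s_y + 1."""
    return all(0 <= nxt - cur <= 1 for cur, nxt in zip(levels, levels[1:]))

# Toy convex function (purely illustrative): v(x, y) = (x - y)^2.
xs = np.arange(-5, 11)
ys = np.arange(4)
v_toy = (xs[:, None] - ys[None, :]) ** 2.0
levels = base_stock_levels(v_toy, xs)
```

For the toy function the thresholds are s_y = y, so each additional announced order raises the base-stock level by exactly one, which is the largest increase that part (a) permits.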

Figure 1 illustrates the structure described in Theorem 1 for two examples, each with k = 2

stages. In part (a) of the figure, the mean time 1/νi spent in each of the stages is relatively long,

and hence the production policy is much less sensitive to orders in stage 1 than it is to orders in

stage 2. In part (b) the mean time 1/νi is shorter, and hence the production policy treats orders

in stage 1 almost the same as orders in stage 2.

[Figure 1: two surface plots of net inventory x against announced orders in stage 1, y1, and announced orders in stage 2, y2. Panels: (a) ν1 = ν2 = 0.01; (b) ν1 = ν2 = 0.10.]

Figure 1: Optimal policies for two different systems with IDD and k = 2: The surfaces

depict the state-dependent base-stock levels. For a given y = (y1, y2), if the net inventory

on hand x is below the surface, it is optimal to produce; if the net inventory on hand x is

on or above the surface, it is optimal to idle. (m = 200, µ = 1, λ = 0.6, h = 10, b = 100)

Above, we focused on the discounted-cost optimality criterion. A treatment of average cost can

be found in Section S-2 of the Online Supplement, where Theorem S-1 shows that a direct analog

of Theorem 1 holds for the average-cost optimality criterion under the additional assumption that

λ < µ (which ensures that production can keep up with demand and prevent backorders from

growing “infinitely large”). In the discounted-cost case, we do not need this assumption.

We close this section by illustrating the flexibility of our modeling framework in accommodating

additional features. In particular, we consider four extensions to our basic model: (1) systems with


random numbers of updates, (2) systems with order cancelations, (3) systems with two demand

classes, one providing ADI and the other one not, and (4) systems with lost sales.

Systems with Random Numbers of Updates. Suppose that customers update their orders

a random number of times. In particular, suppose that given an announced order is at stage i, it

will, independent of everything else, become due after the end of stage i with probability qi and

progress to the next stage with probability 1 − qi, for i = 1, . . . , k − 1. To extend the model to such a

setting with a random number of updates, we need to replace T′i in (5) by $\tilde{T}'_i$ defined by

$$ \tilde{T}'_i v(x,y) := (1 - q_i)\, T'_i v(x,y) + q_i\, v(x - I_{\{y_i \ge 1\}},\, y - e_i I_{\{y_i \ge 1\}}), \quad i = 1,\dots,k. \tag{8} $$

In this case, there is again an optimal state-dependent base stock policy and it is optimal

to produce whenever backorders are present. However, to prove properties (a) and (b) of the

base-stock levels in Theorem 1 we impose the condition that ν1q1 ≤ ν2q2 ≤ · · · ≤ νkqk; that is,

νiqi is non-decreasing in i. A proof is in Section S-3 of the Online Supplement. To understand

the importance of this condition, suppose temporarily that the condition does not hold. More

specifically, suppose, e.g., that ν1q1 is much larger than ν2q2. Then an order in stage 1 is, in a

sense, “closer” to becoming due than is an order in stage 2. To see this, observe that an order in

stage 1 tends to quickly become due after just one stage of update whereas, in the event that the

order progresses to stage 2, it tends to remain there a (relatively) long time. Hence, there may

be (x,y) such that it is best to produce when in state (x,y + e1), but best to idle when in state

(x,y + e2).

Systems with Order Cancelations. In some settings, customers may cancel their orders after

they have been announced. For example, consider a situation where, with each update, an order is

either canceled or its due date is updated. In this case, ADI is imperfect with regard to both timing

and realization of future orders. For example, in the context of a building construction project,

changes to building specifications at some stage of the project may lead the builder to cancel orders

for certain material. To incorporate this into the model, let pi now denote the probability that an

order is canceled at the end of its ith stage. The case where pi = 0 corresponds to a system with

no cancelations. The state space, action space, and cost rates are as in a system without order

cancelations. To handle cancelations, we need only replace T′i in (5) by $\tilde{T}'_i$ defined by

$$ \tilde{T}'_i v(x,y) := (1 - p_i)\, T'_i v(x,y) + p_i\, v(x,\, y - e_i I_{\{y_i \ge 1\}}), \quad i = 1,\dots,k. $$

See Section S-4 of the Online Supplement for a proof that Theorem 1 holds for systems with order

cancelations under the additional assumption that νipi is non-increasing in i.


Systems with Two Customer Classes. In some situations, it may be the case that not all

customers provide ADI. In other words, there may be a fraction of customers that does not announce

orders ahead of demand. This is plausible in settings where the supplier has a mix of long-term and

short-term (or non-recurring) customers. Long-term customers are more likely to share information

and to invest in the necessary infrastructure. Suppose that a fraction η of orders provides ADI,

and that a fraction 1 − η does not. Equivalently, we may view customers as belonging to two

separate classes. Class 1, with arrival rate λ1 = ηλ, provides ADI, and Class 2, with arrival rate

λ2 = (1 − η)λ, does not. To incorporate this into the model, we replace the operator Tλ by $\tilde{T}_\lambda$ defined by

$$ \tilde{T}_\lambda v(x,y) := \eta\, T_\lambda v(x,y) + (1 - \eta)\, v(x - 1,\, y). $$

The operator $\tilde{T}_\lambda$ preserves conditions (C1)–(C4) because Tλ does; see the proof of Proposition 1.

Hence, Theorem 1 holds in this setting.
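The mixed arrival operator can be sketched on a tabulated function for k = 1 as follows. This is an illustration only: the array v[i, y] is assumed to hold values on a grid of consecutive net-inventory levels (rows) and announced-order counts 0, . . . , m (columns), the lowest row is clipped where x − 1 would leave the grid, and the function names are hypothetical.

```python
import numpy as np

def arrival_operator_adi(v):
    """T_lambda for k = 1: move from (x, y) to (x, y + 1) unless y is already m."""
    return np.concatenate([v[:, 1:], v[:, -1:]], axis=1)

def arrival_operator_two_class(v, eta):
    """Mixture of the two classes: with probability eta the arrival announces an
    order; otherwise it is due immediately and net inventory drops to x - 1."""
    v_no_adi = np.concatenate([v[:1, :], v[:-1, :]], axis=0)   # x - 1, clipped at the grid edge
    return eta * arrival_operator_adi(v) + (1.0 - eta) * v_no_adi

# Toy values v[i, y] = 3 i + y, used only to exercise the indexing.
v = np.arange(12, dtype=float).reshape(4, 3)
mixed = arrival_operator_two_class(v, eta=0.5)
```

Setting eta = 1 recovers the original operator Tλ, and eta = 0 gives a system in which no customer provides ADI.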

Systems with Lost Sales. So far we have assumed that when orders become due and there is

no on-hand inventory, orders can wait. In many applications, orders cannot wait and instead are

lost if they become due and cannot be filled immediately. With each lost order, a lost sales cost

is incurred. This cost may be a negotiated penalty with the customer or may reflect the cost of

expediting the order or fulfilling it from an outside supplier (it may also reflect the loss of good

will). To incorporate lost sales, we need only take the state space to be Z+ × Z^k_+(m), re-define the

cost function to be c(x) = hx, and replace the operator T′k by $\tilde{T}'_k$ defined as follows:

$$ \tilde{T}'_k v(x,y) := v([x - I_{\{y_k \ge 1\}}]^+,\, y - e_k I_{\{y_k \ge 1\}}) + c_{LS}\, I_{\{y_k \ge 1 \text{ and } x = 0\}}, \tag{9} $$

where cLS corresponds to the lost sale cost per unit. Section S-5 of the Online Supplement contains

a proof that Theorem 1 [except property (b), which pertains to backorders] holds in this setting.
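For k = 1, the lost-sales version of the due-date operator in (9) can be sketched on a tabulated function as below. The layout is an assumption made for illustration: rows of v correspond to on-hand inventory x = 0, 1, . . . , columns to y = 0, . . . , m, and the function name and the argument c_ls (standing for cLS) are hypothetical.

```python
import numpy as np

def due_operator_lost_sales(v, c_ls):
    """Operator (9) for k = 1 on v[x, y], with x = 0, 1, ... (no backorders)."""
    out = v.copy()                      # y = 0: no order can become due
    out[1:, 1:] = v[:-1, :-1]           # x >= 1: the due order is filled from stock
    out[0, 1:] = v[0, :-1] + c_ls       # x = 0: the due order is lost at cost c_LS
    return out

# Toy values v[x, y] = 3 x + y, used only to exercise the indexing.
v = np.arange(12, dtype=float).reshape(4, 3)
out = due_operator_lost_sales(v, c_ls=7.0)
```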

3.2 Unbounded Jump Rates

In this section we again consider the basic IDD system under the discounted-cost criterion. The

model is identical to that considered in the previous section, except that here we do not place the

bound m on the number of announced orders in the system. There are no rejected orders, and all

arrivals enter the demand leadtime system. The state space is now S := Z × Zk+. Without the

bound m, we have a continuous-time Markov decision process with unbounded transition rates.

In particular, the conditional rate of transitions out of state (x,y) ∈ S under action a ∈ {0, 1} is

λ + ∑_{i=1}^{k} νiyi + µI{a=1}. With no bound m on ∑_{i=1}^{k} yi, this conditional rate is not bounded.

Theorem 2 below shows that the results in Theorem 1 also hold in the setting with unbounded

jump rates (and discounted costs). We conjecture that similar results for unbounded jump rates

hold under the average-cost criterion, but we do not have a proof at this time. Although the conclusion of Theorem 2

may not be surprising because Theorem 1 holds for any finite m, it is important to highlight

that the presence of unbounded transition rates poses a technical challenge. In particular, it is

not possible to apply uniformization to a problem with unbounded jump rates, and hence the

problem cannot be transformed into an “equivalent” discrete-time problem as in Section 3.1. Such

a transformation is typically a crucial step for proving structural properties of optimal policies

using inductive approaches (as in the proof of Proposition 1). An additional difficulty is that a

theory for problems with both unbounded jump rates and unbounded cost rates, one that characterizes

the value function as a particular solution of the optimality equation and ensures the existence of

stationary optimal policies, has been developed only recently. See Guo and Hernandez-Lerma (2003),

hereafter called GH, for results and references.

Our proof of Theorem 2 establishes the structure of an optimal policy and of the value function

for the problem with unbounded jump rates by letting m grow to infinity through a particular

sequence of problems such as those considered in Section 3.1. In doing so, there are a number

of technical points, such as the existence of various limits, that must be treated with care. The

approach may provide a template that could be used for analyzing other CTMDPs with unbounded

jump rates and cost rates.

Let v∗ denote the value function of the problem with unbounded jump rates. The optimality

equation for the problem with unbounded jump rates is v = Lv, where L is given by

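$$ Lv(x,y) := \frac{1}{Q(y)} \Big[ c(x) + \lambda v(x,\, y + e_1) + \sum_{i=2}^{k} \nu_{i-1} y_{i-1}\, v(x,\, y + e_i - e_{i-1}) + \nu_k y_k\, v(x - 1,\, y - e_k) + \mu \min\{v(x,y),\, v(x+1,y)\} \Big] $$

and $Q(y) := \beta + \lambda + \sum_{i=1}^{k} \nu_i y_i + \mu$.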

In preparation for the main theorem of the section, define function R(·) by

$$ R(x,y) := |x| + \sum_{i=1}^{k} y_i \tag{10} $$

and consider the set of functions BR(S) := {v : there exist constants c1, c2 ≥ 0 so that |v(x,y)| ≤

c1 + c2R(x,y) for all (x,y) ∈ S} where R(x,y) is given in (10). To employ the theory of GH, we

must identify a non-negative function R suitable for defining BR(S). GH do not specify an R for

the use of their theory. “Suitable” means that R must satisfy some conditions that relate to the

cost and transition rates of the CTMDP. In Lemma S-2 in Section S-6 of the Online Supplement we

verify that our choice of R in (10) is indeed suitable. For convenience, Section S-6 also summarizes

results we use from GH.


The following is the main result of the section. A proof is in the appendix.

Theorem 2 For the system with IDD and unbounded jump rates, the value function v∗ is the unique

function in BR(S) that solves the optimality equation v = Lv. Moreover, v∗ satisfies conditions

(C1)–(C4), and there exists a stationary state-dependent base-stock policy that is optimal. The

base-stock levels satisfy the conditions in (a) and (b) in Theorem 1.

4 Numerical Results

In this section, we present results from a numerical study. The goal is to examine the benefits from

using ADI, to assess the value of updating, and to compare the impact of having full versus partial

ADI. The insights we obtain for production-inventory systems with continuous review complement

results in the literature for systems with periodic review and deterministic leadtimes.

We use average cost instead of discounted cost, because average cost is independent of the initial

state and the discount factor. In all cases we set ρ = λ/µ < 1, so that Theorem S-1 applies (see

Section S-2 of the Online Supplement). The holding-cost rate is h = 10 and the production rate

is µ = 1, unless stated otherwise. In the numerical study, we used values of m large enough that

further increases in m would not alter the average costs at the level of accuracy shown in our tables.

For each problem instance, we obtained the long-run average cost by solving the MDP using value

iteration.

4.1 Benefits of ADI

To assess the benefit of ADI, we compare the optimal average cost, JA, for a system with ADI to

the optimal average cost, JN , for a system with no ADI and obtain the percentage cost reduction

PCR := 100 × (JN − JA)/JN . The two systems are identical in all respects, except that in the

system with no ADI, orders are not announced ahead of their due dates; rather, information about

when orders enter the demand leadtime system and when they move from one stage to the next

is withheld. Only departures from the last stage of the demand leadtime system are observed. In

general, the distribution of the departure process from the demand leadtime system is different

from the distribution of its arrival process. However, the arrival and departure processes in steady

state have identical (Poisson with rate λ) distributions for systems with no bound m. This follows

from the fact that the departure process from an M/G/∞ queue in steady state is a homogeneous

Poisson process with the same rate as the exogenous input Poisson process. For systems with finite

m, the departure process in steady state is closely approximated by a Poisson process when m is


large.

Long-run average cost is unaffected by the transient behavior of the demand leadtime system. As

the demand leadtime system approaches steady state, its departure process converges in distribution

to a Poisson process with rate λ. Hence, we may, for the purpose of computing long-run average

cost for a system without ADI, assume arrivals to the system without ADI form a Poisson process.

Finally, we note that a (state-independent) base-stock policy is optimal for a system with no ADI

with Poisson arrivals and exponential production times; see, e.g., Veatch and Wein (1996).
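For the no-ADI benchmark with Poisson arrivals and exponential production times, the average cost of a base-stock policy has a standard closed form from the analysis of the M/M/1 make-to-stock queue. That formula is not stated in the text above and is brought in here as an assumption, together with a cost rate of h per unit of on-hand inventory and b per unit backordered. The sketch below evaluates it, finds the best base-stock level by brute force, and computes PCR; all function names are hypothetical.

```python
def no_adi_average_cost(s, lam, mu, h, b):
    """Average cost of base-stock level s when the number of outstanding
    production orders is geometric with parameter rho = lam / mu (M/M/1)."""
    rho = lam / mu
    expected_inventory = s - rho * (1 - rho**s) / (1 - rho)    # E[(s - N)^+]
    expected_backorders = rho**(s + 1) / (1 - rho)             # E[(N - s)^+]
    return h * expected_inventory + b * expected_backorders

def optimal_no_adi_cost(lam, mu, h, b, s_max=500):
    """Brute-force search over s = 0, ..., s_max; returns (cost, s)."""
    return min((no_adi_average_cost(s, lam, mu, h, b), s) for s in range(s_max + 1))

def pcr(j_no_adi, j_adi):
    """Percentage cost reduction, PCR = 100 (J_N - J_A) / J_N."""
    return 100.0 * (j_no_adi - j_adi) / j_no_adi

j_n, s_star = optimal_no_adi_cost(lam=0.4, mu=1.0, h=10.0, b=100.0)
```

With h = 10, b = 100, and µ = 1, this calculation gives approximately 25.07, 46.38, and 107.24 for λ = 0.4, 0.6, and 0.8, which agrees with the k = 0 columns of Table 2.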

Representative numerical results comparing systems with and without ADI can be found in

Table 1, where PCR is shown for varying values of parameters ν, λ, and b. The results are shown for

a system with a single stage (k = 1). The effect of multiple update stages is discussed in Section 4.2.

Rows: backorder cost b and arrival rate λ. Columns: update rate ν.

b     λ        0.01   0.02   0.05    0.1    0.2    0.5      1      2      5     10
10    0.1      0.00   0.00   0.00   0.02   0.37   3.94   8.07  17.84  12.60   7.53
10    0.2      0.00   0.00   0.06   0.64   2.36   8.39  14.11  18.41  11.74   6.84
10    0.4      0.51   1.66   4.84   8.52  12.73  16.21  20.97  17.74   9.60   5.35
10    0.6      0.84   2.45   6.13   9.46  12.03  12.82   9.81   1.38   1.08   0.81
10    0.8      6.04   8.47  10.54  10.61   9.06   5.87   3.26   2.02   0.54   0.04
10    0.9      9.02   9.25   7.58   5.37   3.03   1.21   1.17   1.03   0.99   0.92
50    0.1      0.36   1.96   8.01  16.33  25.47  44.92  40.43  28.71  14.76   8.12
50    0.2      3.06   6.18  12.88  20.44  28.61  38.63  28.61  13.15   0.38   0.29
50    0.4      5.15   7.57  12.41  17.58  22.52  17.64  15.29  10.67   5.17   2.74
50    0.6      4.24   7.58  12.88  15.46  14.77   7.77   4.75   1.51   0.86   0.47
50    0.8     11.19  13.77  13.29   9.87   5.95   2.59   1.31   0.89   0.38   0.16
50    0.9     11.73  12.78   5.30   2.74   1.14   0.44   0.23   0.20   0.13   0.02
100   0.1      8.93  14.28  23.84  32.73  49.49  51.82  39.08  23.15   6.63   0.05
100   0.2      0.13   1.08   6.14  13.96  25.43  18.72   6.63   5.90   3.11   1.55
100   0.4      2.08   4.78  10.96  16.58  19.37  15.63   5.06   4.07   2.06   1.04
100   0.6      5.85   9.72  15.17  16.82  13.96   6.95   3.14   1.89   0.86   0.52
100   0.8     12.93  14.98  12.63   8.04   4.42   1.64   0.83   0.46   0.24   0.13
100   0.9      9.85  10.09   6.98   4.01   2.03   0.41   0.23   0.20   0.12   0.03

Table 1: The percentage cost reduction (PCR) for a system with k = 1.

The effect of ν on PCR, when all other parameter values are fixed, is not monotonic, with

PCR initially increasing in the mean demand leadtime 1/ν and then decreasing. ADI offers the

greatest benefit in terms of PCR when the size of 1/ν is moderate. The percentage cost reduction

is relatively small when either 1/ν is very large or very small. This can be explained as follows.


When 1/ν is small, the mean time between when an order is announced and when it becomes due

is small. Hence, the information is of little use. When 1/ν is large, the mean of the time between

an order’s announcement and due date is large, but so is the variance. This makes the information

about future demand relatively less useful. The meanings of “large,” “small,” and “moderate” 1/ν

depend upon the value of λ. Although the joint effect of λ, µ, and ν on PCR is complicated, it

appears that the value of 1/ν that maximizes PCR for a given λ is increasing in λ. The largest

value of 1/ν shown in Table 1 is 1/ν = 100; however, computations for larger values support the

claim that the relative benefit of ADI is small for large 1/ν. For instance, with b = 100, λ = 0.8,

and 1/ν = 500, we find that PCR is 3.22. These results highlight an important insight: having

earlier notice of future orders may not always be desirable since the quality of this information tends

also to deteriorate (i.e., the variance in the demand leadtime increases when the average demand

leadtime increases). In our model, this is due to the fact that demand leadtime is assumed to have

the exponential distribution. However, this also captures the fact that in practice the earlier an

order is announced, the less reliable will be the estimate of its due date (see Section 4.2 for further

results and discussion for systems with multiple stages of updating).

For each fixed ν, the effect of λ on PCR is also not monotonic. For fixed ν, the relative benefit

of ADI is small when λ is large (close to µ = 1). When λ is large, the optimal policy with or

without ADI is for the production facility to produce most of the time. Hence, the availability of

ADI makes little difference for the decisions taken. When λ is small, the absolute cost reduction

from ADI is small, because costs in the systems with and without ADI both approach zero as λ ↓ 0,

but the value of PCR depends on the value of ν; see Section S-7 of the Online Supplement for

further discussion on this.

The effect of the ratio b/h is also not monotonic, with the value of PCR relatively small when

b/h is either small or large. When b/h is small, ignoring ADI and producing to order (i.e., holding

little or no inventory in anticipation of future demand) carries a relatively small penalty. When b/h

is large, the base-stock levels are high for systems both with and without ADI, and the probability

of backorders is relatively small in both systems. Hence, ADI becomes relatively less useful.

We conclude this section by comparing the preceding observations with those obtained by

Gavirneni et al. (1999), who also evaluated the benefit of ADI with respect to similar parameters,

but in a different context. They study a periodic review system with zero leadtimes and limited

replenishment capacity per period, where ADI is obtained by having the supplier observe the

demand of a retailer that uses an (s, S) ordering policy. Some of their qualitative insights (for

example, regarding the effect of the ratio b/h) are similar to those above. However, there are


some notable differences. For example, they found that the percentage cost reduction due to ADI

is increasing in capacity. Interestingly, Ozer and Wei (2004), who consider a different model of

capacitated periodic-review inventory systems with ADI, concluded the opposite. That is, in their

modeling framework, they found ADI to be most beneficial when capacity is tight. Both of these

findings can be contrasted to the effect of varying λ (which varies production system loading) that

we describe above.

Some effects observed in both Gavirneni et al. (1999) and in our study appear to have different

causes in the different settings. For instance, they find that long demand leadtimes (measured in

their case by the difference S − s) diminish the value of ADI, and they attribute this to the fact

that long leadtimes result in large orders from the retailer, forcing the supplier to build inventory

over time because of capacity limits. In our case, long demand leadtimes also diminish the value of

ADI, but for a different reason: the information regarding due dates becomes less reliable because

both the mean and the variance of demand leadtime increase simultaneously.

4.2 Benefits of Updating

In settings where the demand leadtime system consists of multiple stages, the supplier and customer

may have a choice of how much information is shared. For example, should the customer inform

the supplier as soon as an order enters the first stage or wait until an order has progressed further

before forwarding the information to the supplier? Similarly, should the customer update the

supplier each time an order enters a new stage or should it wait until the order has passed a

specified number of stages? These questions are relevant when there is a cost associated with

collecting the information, transmitting it from one party to another, and then making decisions

based on it. To explore the benefit of full versus partial information sharing, we consider a system

where the demand leadtime has two stages (k = 2) and compare the performance of this system

when there is full ADI (information is shared as soon as orders enter the first stage and as they

leave one stage and enter the next) to its performance when there is no ADI and when there is

only partial ADI (information is shared only when orders enter the second stage). The systems

can be viewed as identical in all respects except for the number of update stages, with full ADI

corresponding to k = 2, partial ADI to k = 1, and no ADI to k = 0. In the system with full

ADI, the order is announced and then progresses through two stages of update, each exponentially

distributed with mean 1/ν, before becoming due. In the system with partial ADI, the order is

announced and then progresses through a single stage of update, exponentially distributed with

mean 1/ν, before becoming due. In the system with no ADI, the order is announced and becomes


due immediately.

Representative numerical results are displayed in Table 2, which not surprisingly shows that full

ADI is superior. (A proof of this observation follows by noting that any policy for the system with

partial ADI can be reproduced for the system with full ADI by basing decisions in the latter only

on the net inventory and the number of orders in the second stage.) Additional numerical results

for b = 10 and b = 50 can be found in Section S-9 of the Online Supplement. The value of full ADI

is most significant when both ν and λ are in the mid-range, and least significant when both ν and

λ are either very small or very large, as in the upper left and lower right corners of Table 2. This

is consistent with results from Section 4.1. In most of the examples considered, the incremental

benefit from full ADI (k = 2) over partial ADI (k = 1) is small. This suggests that partial ADI

may be sufficient if updating is expensive to implement.

We close this section by noting a subtle difference between the effect of increasing ADI by

increasing the number of stages observable to the supplier and increasing ADI by increasing the

length of a particular stage. Compare a system with k stages in which each stage has mean 1/ν to

a system with a single stage with mean k/ν. Both systems have the same overall mean, k/ν, but

the variance of the system with k stages is k/ν^2 while the one for the system with a single stage is

k^2/ν^2 (i.e., k times larger). This helps explain why observing more stages of the demand process

is always beneficial, but increasing the average length of a particular stage may not be.
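The variance comparison above can be checked with a short calculation and a simulation. The sketch below is illustrative only: the function names are hypothetical, the values k = 2 and ν = 0.5 are arbitrary, and the simulation uses a fixed seed so that the run is repeatable.

```python
import random

def staged_moments(k, nu):
    """Mean and variance of a sum of k independent exponential stages, each with rate nu."""
    return k / nu, k / nu**2

def single_stage_moments(k, nu):
    """Mean and variance of one exponential stage with mean k / nu."""
    return k / nu, (k / nu) ** 2

def sample_variance(draw, n, rng):
    """Unbiased sample variance of n draws produced by draw(rng)."""
    samples = [draw(rng) for _ in range(n)]
    mean = sum(samples) / n
    return sum((s - mean) ** 2 for s in samples) / (n - 1)

k, nu = 2, 0.5
rng = random.Random(0)
# k stages, each exponential with mean 1/nu.
var_staged = sample_variance(lambda r: sum(r.expovariate(nu) for _ in range(k)), 100_000, rng)
# One stage, exponential with mean k/nu (rate nu/k).
var_single = sample_variance(lambda r: r.expovariate(nu / k), 100_000, rng)
```

Both leadtimes have mean k/ν = 4 here, but the staged leadtime has variance 8 while the single-stage leadtime has variance 16, so the staged system gives the supplier a more reliable signal of when an order will become due.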

4.3 Benefits of Full versus Partial ADI

As discussed at the end of Section 3.1, there are settings where ADI is not available from all

customers. An important question that arises in these settings is how beneficial is it to increase the

fraction of customers that provide ADI. In particular, is there a diminishing value to increasing this

fraction or is the marginal benefit from ADI insensitive to how many customers already provide

ADI? To address this question, we consider the version of our problem where there are two customer

classes described at the end of Section 3.1. Class 1, with demand rate λ1 = ηλ, provides ADI and

class 2, with demand rate λ2 = (1− η)λ, does not. We examine the effect of increasing the fraction

of customers with ADI by varying the fraction η while maintaining λ = λ1 + λ2 constant, so that

higher values of η correspond to more customers providing ADI.

Representative results from numerical experiments are shown in Figure 2 (with k = 1, λ = 0.8,

and b = 100). As we can see in this example, the relative benefit of ADI does not exhibit diminishing

returns with increases in the fraction η of customers with ADI. (The “nearly linear” pattern in the

figure is present for other parameter settings as well. In some cases it is less pronounced.) This

is in contrast to the typical effect of updating. These differences might, in part, be due to the

fact that with additional updating we provide more information for the same customers while

with expanding ADI to additional customers we provide new information for different customers.

This is significant since we assume that customers announce orders independently of each other,

so having information on some customers does not provide information on when other customers

might announce their own orders. A managerial implication from these observations is that, all else

being equal, and given our independence assumptions, a supplier may be better off expanding ADI

(with limited or no updating) to more of its customers than obtaining more updates from those

customers that already provide ADI.

λ                      0.4                        0.6                        0.8
ν1 = ν2         k=0      k=1      k=2      k=0      k=1      k=2      k=0      k=1      k=2
0.01          25.07    24.55    24.53    46.38    43.67    43.65   107.24    93.38    92.10
  PCR                  2.08%    2.14%             5.85%    5.90%            12.93%   14.12%
0.02          25.07    23.87    23.85    46.38    41.87    41.74   107.24    91.18    87.81
  PCR                  4.78%    4.86%             9.72%   10.00%            14.98%   18.12%
0.05          25.07    22.32    22.26    46.38    39.35    38.47   107.24    93.70    86.93
  PCR                 10.96%   11.21%            15.17%   17.06%            12.63%   18.94%
0.10          25.07    20.91    20.40    46.38    38.58    36.60   107.24    98.62    91.50
  PCR                 16.58%   18.62%            16.82%   21.10%             8.04%   14.69%
0.20          25.07    20.21    19.00    46.38    39.91    36.33   107.24   102.51    98.04
  PCR                 19.37%   24.20%            13.96%   21.68%             4.42%    8.58%
0.50          25.07    21.15    18.60    46.38    43.16    39.82   107.24   105.49   103.47
  PCR                 15.63%   25.80%             6.95%   14.14%             1.64%    3.52%
1.00          25.07    23.80    20.62    46.38    44.93    43.05   107.24   106.35   105.48
  PCR                  5.06%   17.75%             3.14%    7.18%             0.83%    1.65%
1.50          25.07    23.90    22.65    46.38    45.38    43.93   107.24   106.68   105.90
  PCR                  4.66%    9.64%             2.16%    5.30%             0.53%    1.26%
2.00          25.07    24.05    23.63    46.38    45.51    44.89   107.24   106.75   106.33
  PCR                  4.07%    5.74%             1.89%    3.22%             0.46%    0.85%

Table 2: Average cost and percentage cost reduction (PCR) for systems with b = 100. The

columns labeled “k = 0” show the average cost for systems without ADI.

These results appear to be different from those reported in the literature for systems with

periodic review. For example, Ozer and Wei (2004) carried out a set of experiments where they

varied the number of periods ahead of due dates that demand is announced (they refer to this as

the information horizon), as well as the effective fraction of customers that announce their demand

ahead of due dates. Their results show a diminishing marginal cost reduction from increasing the

fraction of customers whose demand is announced ahead of its due date. These differences might be

due to the fact that for a continuous review system, production decisions are made for one unit at a

time. Consequently, the system manager is able to use information about the anticipated due date

of each order to make a decision about whether or not to initiate the production of a replenishment

unit.

[Figure 2: plot of percentage cost reduction (PCR) against η, with curves for ν = 0.05, 0.1, 0.2, 0.5, and 1.]

Figure 2: PCR versus fraction of customers that provide ADI.

[Figure 3: plot of average cost against 1/ν (from 100 down to 0.1), with curves for holding cost (no ADI), holding cost (with ADI), backorder cost (with ADI), and backorder cost (no ADI).]

Figure 3: Average holding and backorder costs, with and without ADI.

4.4 Benefits of ADI to the Customer

In evaluating the benefit of ADI, we have so far taken the perspective of the supplier who manages

the production process. In this section, we consider the impact of ADI on the customer who provides

it. In particular, we address the question of whether or not both supplier and customer benefit

from sharing demand information. It is often argued that ADI can reduce costs of the supplier

(which we observed to be true) and improve the quality of service received by the customer. In our

setting, the latter assertion would mean that customers experience fewer backorders and shorter

fulfillment delays. In Figure 3, we show the breakdown of supplier cost in terms of inventory

holding and backorder cost for examples with b = 50. (By Little’s Law, the average waiting time of

customers is proportional to the average backorder cost, so average waiting times can be deduced

from Figure 3. We emphasize that “waiting time” here refers to the span of time between when an

order becomes due and when the order is satisfied. With ADI, an order becomes due when it exits

the demand leadtime system.) As we can see, ADI does not always reduce the backorder costs. In

some cases, the supplier uses ADI to reduce inventory holding cost at the expense of backorder cost.

Generally, whether backorder cost, holding cost, or both decrease (both cannot increase) depends

upon problem parameters. Hence, there is no guarantee that sharing ADI will lead to improved

service levels to the customers.

This, of course, raises the question of why a customer would be willing to provide ADI only to

see service levels suffer. One possible answer is that in practice customers who provide ADI also

require a contractual agreement that service levels be improved or, alternatively, that the penalties

for poor service be increased. For example, the customer could offer ADI, but simultaneously

increase the penalty for backorders. In Figure 4, for a system with a single stage and λ = 0.8, we

illustrate the impact on the cost of the supplier of having customers simultaneously provide ADI

and increase unit backorder costs. The figure shows the average cost for the system with no ADI

and b = 50, as well as the average cost for the system with ADI for different values of b. Figure 5

shows the average number of backorders as a function of b for systems with ADI. Of course, average

costs increase and backorder levels decrease as the backorder penalty increases.

For given b and ν there is a value bmax = bmax(b, ν) such that the supplier is indifferent to not

receiving ADI while using backorder cost b and receiving ADI with mean demand leadtime ν−1

while using backorder cost bmax. From Figure 4, it can be seen that, for instance, if ν = 0.05

then the supplier is indifferent between operating without ADI at b = 50 and operating with ADI

at bmax(50, 0.05) ≈ 69. [The figure also shows that bmax(50, 0.2) ≈ 57 and bmax(50, 0.5) ≈ 53.] In

other words, in exchange for receiving ADI with ν = 0.05, the supplier is willing to accept an

increase of nearly 40% (from 50 to about 69) in the backorder penalty rate. Figure 5 shows that providing ADI with ν = 0.05 in

combination with an increase in the backorder cost rate to bmax ≈ 69 causes the average number

of backorders to decrease to roughly 0.50 from the value of 0.67 obtained in the absence of ADI.
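The indifference point bmax(b, ν) can be located numerically once the two cost curves are available. The following is a minimal sketch under stated assumptions: `toy_cost_with_adi` is a hypothetical, made-up increasing curve standing in for the optimal average cost with ADI (it is not the model of this paper), and the no-ADI cost level of 80.0 is likewise an illustrative number.

```python
from typing import Callable


def find_b_max(cost_with_adi: Callable[[float], float],
               cost_no_adi: float,
               b_lo: float,
               b_hi: float,
               tol: float = 1e-8) -> float:
    """Bisection for the b at which cost_with_adi(b) equals cost_no_adi.

    Assumes cost_with_adi is continuous and increasing on [b_lo, b_hi]
    and that the target level is bracketed by the two endpoints.
    """
    if not (cost_with_adi(b_lo) <= cost_no_adi <= cost_with_adi(b_hi)):
        raise ValueError("target cost is not bracketed by [b_lo, b_hi]")
    while b_hi - b_lo > tol:
        mid = 0.5 * (b_lo + b_hi)
        if cost_with_adi(mid) < cost_no_adi:
            b_lo = mid
        else:
            b_hi = mid
    return 0.5 * (b_lo + b_hi)


def toy_cost_with_adi(b: float) -> float:
    # Hypothetical stand-in curve, chosen only so the example is solvable.
    return 40.0 + 5.0 * b ** 0.5


b_max = find_b_max(toy_cost_with_adi, cost_no_adi=80.0, b_lo=50.0, b_hi=110.0)
relative_increase = (b_max - 50.0) / 50.0  # acceptable penalty increase
```

In an actual study, the stand-in curve would be replaced by the average cost obtained from solving the MDP with ADI at each value of b.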

However, given a value of ν, even the maximum increase (from b to bmax) in the backorder cost

rate that the supplier will accept may not be sufficient to reduce backorders to the level found

without ADI. For example, if b = 50 and ν = 0.5, then the supplier is willing to increase the

backorder cost to at most bmax ≈ 53 in exchange for ADI; see Figure 4. However, Figure 5 shows

that even with a backorder cost of 53, the average number of backorders with ADI and ν = 0.5 is

about 0.70, which exceeds 0.67 — the average number of backorders for the system without ADI

and b = 50. Obviously, with backorder penalties there are additional financial transfers from the

supplier to the customer, which may compensate for the lower service levels.

In practice, there may be other strategies available to customers to mitigate the negative impact

of ADI. For example, customers could charge their suppliers a fee for the demand information they

provide. Customers could also use the possibility of providing demand information to strengthen

their bargaining position during price negotiation with their suppliers.

Figure 4: Average cost with ADI versus unit backorder cost rate. (Horizontal axis: b; vertical axis: average cost; curves for ν = 0.05, 0.2, 0.5; reference line: average cost for a system without ADI and b = 50, axis tick at 80.26.)

Figure 5: Average number of backorders versus unit backorder cost rate. (Horizontal axis: b; vertical axis: average number of backorders; curves for ν = 0.05, 0.2, 0.5; reference line: average number of backorders without ADI and b = 50, axis tick at 0.67.)

5 Extension to Systems with Sequential Due Dates

In this paper, we have focused on a particular form of ADI updating, in which orders that have

been announced are updated independently of each other. In this section we briefly discuss how

our analysis can be extended to systems where the independent updating assumption does not

hold. In particular, we consider a system in which ADI is revealed through a process where

orders are updated and become due in the same order they are announced. We call this a system

with sequential due dates (SDD). Systems with SDD are similar in all aspects to systems we

have considered so far, except for how orders progress through the leadtime system. For example,

consider a supplier who produces a component for a manufacturer. The manufacturer’s production

process is a serial production line comprised of a series of workstations that process jobs on a

first-come, first-served (FCFS) basis. If a workstation is busy, incoming jobs wait in its queue. The

manufacturer informs the supplier each time it releases a job to the line (this may correspond to

the manufacturer receiving an order from its own customers) and updates the supplier each time a

job completes processing at one of the workstations. The supplier uses this information to estimate

when it will receive a delivery request from the manufacturer. Such a request coincides with a job

arriving at the workstation where the component provided by the supplier is needed.

SDD may arise in settings other than manufacturing. Consider, for example, a supplier that

produces a product sold through a single retailer, which continuously reviews its inventory and

follows a (Q, r) ordering policy. This means that the retailer places an order for Q units each time

its inventory position drops to r. The supplier has real-time access to the retailer’s point-of-sale data

and is aware of the retailer’s ordering policy. Each order placed with the retailer can be used by

the supplier to update its estimate of the time until the next replenishment order. This updating

process progresses through Q stages, culminating in the placement of an order for a single batch of

Q units. From the perspective of the supplier, there is always exactly one announced order, whose

due date is updated each time the retailer’s inventory position changes. Once an order becomes

due, another order is simultaneously announced and starts the same process.

In a system with SDD, the leadtime system can be viewed as a serial queueing system consisting

of n servers (we refer to these as nodes). As we shall illustrate with several examples shortly, the

demand leadtime system could describe the internal processes of the customers that are observable

to the supplier. External arrivals to the demand leadtime system occur at node 1 and progress

sequentially through nodes 1, . . . , n. Completion of service at the n-th node corresponds to an

order becoming due. Service time at node i consists of ki stages, with the duration of each stage

being exponentially distributed with mean 1/νij for the j-th stage at node i. That is, service times

at node i have a phase-type distribution with ki phases in series. Special cases include the Erlang

distribution where all the phases have the same mean and the exponential distribution where the

number of phases is equal to one. Orders at each server are processed one at a time on a FCFS

basis.
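The service-time model above (stages in series, each exponential) can be sampled directly. The sketch below is illustrative; the function names and the rate values in the usage lines are assumptions, not part of the paper.

```python
import random


def sample_service_time(stage_rates, rng):
    """Draw one service time as a sum of independent exponential stages.

    stage_rates[j] is the rate of stage j + 1, i.e. the stage has mean
    1 / stage_rates[j]. Equal rates give an Erlang distribution; a single
    rate gives the exponential distribution.
    """
    return sum(rng.expovariate(rate) for rate in stage_rates)


def mean_service_time(stage_rates):
    """Mean of the series phase-type distribution: the sum of stage means."""
    return sum(1.0 / rate for rate in stage_rates)


rng = random.Random(12345)
rates = [2.0, 4.0]  # illustrative stage rates for one node
draws = [sample_service_time(rates, rng) for _ in range(20000)]
empirical_mean = sum(draws) / len(draws)
```

The empirical mean of the draws should be close to `mean_service_time(rates)`, which is 0.75 for the illustrative rates.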

Inter-arrival times to the demand leadtime system have a phase-type distribution, with one or

more phases in series. To describe this arrival process, we introduce an additional node (node 0)

and let k0 denote the number of phases (or stages) associated with this node. The j-th stage in

node 0 has the exponential distribution with mean 1/ν0j for j = 1, ..., k0. Special cases are a Poisson

arrival process or an arrival process with Erlang inter-arrival times.

The state of the system is described by the pair $(x,\mathbf{y})$, where the scalar $x$ represents the net inventory level and $\mathbf{y} = (y_{ij} : i = 0, \ldots, n,\ j = 1, \ldots, k_i)$ represents the state of the demand leadtime system. For $i \neq 0$, $y_{ij}$ represents the number of orders in node $i$ that are in stage $j$. For stage 1, the state variable $y_{i1}$ indicates the number of orders that are either waiting for service at node $i$ or have initiated the first stage of service at node $i$. Therefore, $y_{i1}$ is a non-negative integer that can be arbitrarily large. For $j \neq 1$, $y_{ij}$ indicates whether or not there is an order that has initiated the $j$-th stage of service. Since servers process units one at a time, there can be at most one order at a time in stage $j \neq 1$. Hence, $y_{ij}$ is either 0 or 1 for $j = 2, \ldots, k_i$ and $\sum_{j=2}^{k_i} y_{ij} \in \{0, 1\}$. For node $i = 0$, $y_{0j}$ is either 0 or 1 and $\sum_{j=1}^{k_0} y_{0j} = 1$, since there is exactly one order at node 0 at all times. In summary, the state space is $S := \mathbb{Z} \times Y$, where $Y := Y_0 \cap Y_1$,

$$Y_0 := \Big\{\mathbf{y} : y_{0j} \in \{0,1\} \text{ for } j = 1, \ldots, k_0 \text{ and } \sum_{j=1}^{k_0} y_{0j} = 1\Big\},$$

$$Y_1 := \Big\{\mathbf{y} : y_{ij} \in \mathbb{Z}_+ \text{ for } j = 1, \ldots, k_i,\ i = 1, \ldots, n \text{ and } \sum_{j=2}^{k_i} y_{ij} \in \{0,1\} \text{ for } i = 1, \ldots, n\Big\}.$$
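The constraints defining Y0 and Y1 translate into a short membership check. This is a sketch only; the dictionary representation of y, keyed by (node, stage), and the function name are assumptions made for illustration.

```python
def in_state_space(y, n, k):
    """Return True if y satisfies the constraints defining Y0 and Y1.

    y : dict mapping (i, j) -> integer count, for i = 0..n and j = 1..k[i]
    n : number of service nodes
    k : list where k[i] is the number of stages at node i (index 0 is the
        arrival node)
    """
    # Y0: exactly one arrival phase of node 0 is active.
    node0 = [y[(0, j)] for j in range(1, k[0] + 1)]
    if any(v not in (0, 1) for v in node0) or sum(node0) != 1:
        return False
    # Y1: non-negative integer counts, with at most one order in stages 2..k_i.
    for i in range(1, n + 1):
        phases = [y[(i, j)] for j in range(1, k[i] + 1)]
        if any((not isinstance(v, int)) or v < 0 for v in phases):
            return False
        if sum(phases[1:]) not in (0, 1):
            return False
    return True
```

For instance, with n = 1 and k = [2, 3], a vector with five orders in stage 1 of node 1 and one order in stage 3 is admissible, whereas a vector with orders in both stage 2 and stage 3 is not.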

It is possible to model a wide variety of settings through different combinations of the parameters

ki and n. We describe three examples below.

Example 1. Consider the example described earlier where a supplier provides a component to a

manufacturer whose production system consists of a series of workstations. Let production times at

the supplier be exponentially distributed. Let also the manufacturer produce on a make-to-order

basis while facing a Poisson external demand process. The component provided by the supplier

is used in the (n + 1)th workstation and is expected to be delivered by the supplier as soon as

an order goes through the first n workstations. The manufacturer shares information about when

orders are released into the production system and when they complete an operation at any of the

workstations. In our general framework, this system corresponds to ki = 1 for i = 0, . . . , n.

Example 2. Consider a system similar to the one described in Example 1, except that items at

the manufacturer are now processed one unit at a time with all the operations carried out on a single

workstation (instead of a series of workstations). At any given time, there may be multiple orders

waiting to be processed in the queue of the workstation, but at most one order undergoing processing.

Each time the workstation completes an operation, the manufacturer informs the supplier.

The manufacturer also informs the supplier each time a new order arrives to the workstation. This

system corresponds to k0 = 1 and n = 1.

Example 3. Consider the example mentioned earlier of a supplier who has a single customer in the

form of a retailer. Let the retailer face a Poisson demand process with rate ν. The retailer uses a

(Q, r) ordering policy so that it places an order of size Q whenever its own inventory position (sum

of inventory on order and inventory on hand less backorders) reaches r. The retailer’s inventory

position takes on values Q + r, Q + r − 1, . . . , r + 1 with transition times between consecutive values

being exponentially distributed with rate ν. If the supplier has access to the retailer’s inventory

position, then the supplier can use the information to update the expected time at which the retailer

will place an order. From the perspective of the supplier there is exactly one announced order at

a time and the time between updates (there is a total of Q updates) is exponentially distributed

with rate ν. In this setting, one “unit” for the supplier is an order of size Q from the retailer and

the production time for this unit is exponential with rate µ. This case corresponds to k0 = Q and

n = 0 with ν0j = ν for j = 1, . . . , k0.
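In Example 3, the supplier's estimate follows from the retailer's inventory position: when the position is p, the time until it reaches r is a sum of p − r independent exponential times with rate ν, so its mean is (p − r)/ν. A minimal sketch (function names and the numbers in the comments are illustrative assumptions):

```python
def remaining_updates(inventory_position, r, Q):
    """Number of retailer demands still needed to trigger the next order."""
    if not (r + 1 <= inventory_position <= Q + r):
        raise ValueError("inventory position must lie in {r + 1, ..., Q + r}")
    return inventory_position - r


def expected_time_to_order(inventory_position, r, Q, nu):
    """Mean time until the retailer orders: remaining stages times 1 / nu."""
    return remaining_updates(inventory_position, r, Q) / nu


# Right after an order (position Q + r) all Q stages remain.
full_cycle = expected_time_to_order(7, r=2, Q=5, nu=0.5)
# Two demands before the reorder point is reached.
near_order = expected_time_to_order(4, r=2, Q=5, nu=0.5)
```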

Define eij to be the vector with 1 in the (i, j)-position and zeros elsewhere, and let e00 be the

vector of zeros. Let V be the set of real-valued functions on S. We denote by v∗ ∈ V the value

function of the MDP. That is, $v^*(x,\mathbf{y})$ is the minimum expected total discounted cost, given the system starts in state $(x,\mathbf{y})$. Let $\gamma := \mu + \sum_{i=0}^{n} \sum_{j=1}^{k_i} \nu_{ij}$. The optimality equation is $v = Tv$, where $T : V \to V$ is defined by

$$Tv(x,\mathbf{y}) := \gamma^{-1}\Big[c(x) + \sum_{i=0}^{n} \sum_{j=1}^{k_i} \nu_{ij} T_{ij} v(x,\mathbf{y}) + \mu T_\mu v(x,\mathbf{y})\Big]. \tag{11}$$

The operator $T_\mu$ corresponds to the production decision and is defined in Section 3.1. The operators $\{T_{ij}\}$ in (11) are defined in (12)–(17) below. The operators $\{T_{0j} : j < k_0\}$ correspond to transitions in the phase of the external arrival process; $T_{0k_0}$ corresponds to an external arrival to the demand leadtime system, or to an order coming due in case $n = 0$; $\{T_{ij} : i > 0,\ j < k_i\}$ correspond to a transition in the phase of a service time; and $\{T_{ik_i} : i > 0\}$ correspond to a transition of an order between nodes (for $i < n$) and to an order coming due (for $i = n$). We have

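$$T_{0j} v(x,\mathbf{y}) := v\big(x,\ \mathbf{y} + [e_{0,j+1} - e_{0j}]\, I\{y_{0j} = 1\}\big), \qquad j = 1, \ldots, k_0 - 1 \tag{12}$$

$$T_{0k_0} v(x,\mathbf{y}) := \begin{cases} v\big(x,\ \mathbf{y} + [e_{01} - e_{0k_0} + e_{11}]\, I\{y_{0k_0} = 1\}\big) & \text{when } n > 0 \\ v\big(x - I\{y_{0k_0} = 1\},\ \mathbf{y} + [e_{01} - e_{0k_0}]\, I\{y_{0k_0} = 1\}\big) & \text{when } n = 0 \end{cases} \tag{13}$$

$$T_{i1} v(x,\mathbf{y}) := v\big(x,\ \mathbf{y} + [e_{i2} - e_{i1}]\, I\{y_{i1} \ge 1 \text{ and } y_{i\ell} = 0 \text{ for } \ell = 2, \ldots, k_i\}\big), \qquad i = 1, \ldots, n \text{ when } k_i \ge 2 \tag{14}$$

$$T_{ij} v(x,\mathbf{y}) := v\big(x,\ \mathbf{y} + [e_{i,j+1} - e_{ij}]\, I\{y_{ij} \ge 1\}\big), \qquad i = 1, \ldots, n;\ j = 2, \ldots, k_i - 1 \tag{15}$$

$$T_{ik_i} v(x,\mathbf{y}) := v\big(x,\ \mathbf{y} + [e_{i+1,1} - e_{ik_i}]\, I\{y_{ik_i} \ge 1\}\big), \qquad i = 1, \ldots, n - 1 \tag{16}$$

$$T_{nk_n} v(x,\mathbf{y}) := v\big(x - I\{y_{nk_n} \ge 1\},\ \mathbf{y} - e_{nk_n}\, I\{y_{nk_n} \ge 1\}\big). \tag{17}$$

Note that $T_{i1}$ for $i = 1, \ldots, n$ is defined by (16)–(17) when $k_i = 1$.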
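The state changes encoded by the operators in (12)–(17) can be written as a single transition function. The sketch below is an illustration under assumptions: the dictionary representation of y, the function name `apply_event`, and the argument layout are not from the paper, and the caller is expected to pass a valid (node, stage) index.

```python
def apply_event(x, y, i, j, n, k):
    """Apply the (i, j) transition of (12)-(17) to state (x, y).

    x : net inventory level (int)
    y : dict mapping (node, stage) -> count
    n : number of service nodes; k : list with k[i] stages at node i
    Returns the new (x, y); the input dict is not modified.
    """
    y = dict(y)
    if y[(i, j)] < 1:
        return x, y                      # indicator is 0: nothing happens
    if i == 0:
        if j < k[0]:                     # (12): next arrival phase
            y[(0, j)] -= 1
            y[(0, j + 1)] += 1
        else:                            # (13): arrival phase restarts
            y[(0, j)] -= 1
            y[(0, 1)] += 1
            if n > 0:
                y[(1, 1)] += 1           # new order enters node 1
            else:
                x -= 1                   # order comes due immediately
        return x, y
    if j < k[i]:
        # (14): stage 1 advances only if no later stage is occupied
        if j == 1 and any(y[(i, l)] for l in range(2, k[i] + 1)):
            return x, y
        y[(i, j)] -= 1                   # (14)/(15): advance one stage
        y[(i, j + 1)] += 1
    elif i < n:                          # (16): move to the next node
        y[(i, j)] -= 1
        y[(i + 1, 1)] += 1
    else:                                # (17): order becomes due
        y[(i, j)] -= 1
        x -= 1
    return x, y
```

With this function, $T_{ij} v(x,\mathbf{y})$ is simply v evaluated at `apply_event(x, y, i, j, n, k)`.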

In preparation for our main result of this section, we introduce an ordering on the space of

(node, phase)-indices. For (p, q), (r, w) ∈ {(i, j) : i = 0, . . . , n; j = 1, . . . , ki} ∪ {(0, 0)}, we define

(p, q) ≺ (r, w) to mean that one of the following two conditions holds: (i) p < r, or (ii) p = r and

q < w. Intuitively, (p, q) ≺ (r, w) means that an order at stage w of node r is closer to being due

than an order at stage q of node p. We will use the following analogs of Conditions (C1)–(C4).

(C1) ∆v(x,y) ≤ ∆v(x + 1,y) for all (x,y) ∈ S.

(C2) ∆v(x,y+epq) ≤ ∆v(x+1,y+erw) for all (x,y) and (p, q) ≺ (r, w) such that y+epq, y + erw ∈ Y.

(C3) ∆v(x,y+erw) ≤ ∆v(x,y+epq) for all (x,y) and (p, q) ≺ (r, w) such that y+epq,y+erw ∈ Y.

(C4) ∆v(x,y) ≤ 0 for all (x,y) ∈ S with x < 0.

When n ≥ 1 and k0 ≥ 2 we also use Condition (C5) below, which is related to the “arrival”

node, 0. Condition (C5) is needed to ensure that Condition (C3) is preserved by T ; see the proof

of Proposition S-1 in the Online Supplement. If n = 0 or k0 = 1, then Condition (C5) is vacuous.

(C5) ∆v(x,y+e01+e11) ≤ ∆v(x,y+e0q) for all (x,y) and q ≥ 2 such that y+e0q,y+e01+e11 ∈ Y.

The next theorem, which describes the structure of optimal policies for SDD systems, is the

main result of this section. We give a proof in Section S-8 of the Online Supplement. The argument

parallels the proof of Theorem 1 that is detailed in Section 3.1 of the text and in Section S-1 of the

Online Supplement. Specifically, we show that T preserves Conditions (C1)–(C5), which allows us

to conclude that v∗ satisfies Conditions (C1)–(C5), from which the theorem then follows.

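Theorem 3 The state-dependent base-stock policy $\pi^* = \{\pi^*(x,\mathbf{y})\}$ given by

$$\pi^*(x,\mathbf{y}) := \begin{cases} 0 & \text{if } x \ge s_{\mathbf{y}} \\ 1 & \text{if } x < s_{\mathbf{y}} \end{cases} \tag{18}$$

where $s_{\mathbf{y}} := \min\{x : v^*(x+1,\mathbf{y}) - v^*(x,\mathbf{y}) \ge 0\}$, is optimal. In addition, (a) the base-stock levels satisfy $s_{\mathbf{y}+e_{rw}} \in \{s_{\mathbf{y}+e_{pq}},\ s_{\mathbf{y}+e_{pq}} + 1\}$ for $(p,q) \prec (r,w)$ such that $\mathbf{y}+e_{pq},\ \mathbf{y}+e_{rw} \in Y$; and (b) $\pi^*(x,\mathbf{y}) = 1$ if $x < 0$.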
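Given any numerically computed value function, the base-stock level in (18) can be read off by scanning for the first non-negative increment. The sketch below assumes a callable `v(x, y)` and a search window whose lower end lies below the base-stock level; `toy_v` is a hypothetical convex stand-in, not the value function of the model.

```python
def base_stock_level(v, y, x_min, x_max):
    """Smallest x in [x_min, x_max) with v(x + 1, y) - v(x, y) >= 0.

    Assumes the increment is negative at x_min, so that the scan returns
    the minimizer in the definition of s_y.
    """
    for x in range(x_min, x_max):
        if v(x + 1, y) - v(x, y) >= 0:
            return x
    raise ValueError("no base-stock level found in the search window")


def policy(x, s_y):
    """Policy (18): produce (1) below the base-stock level, else idle (0)."""
    return 1 if x < s_y else 0


def toy_v(x, y):
    # Hypothetical stand-in: convex in x, minimized at 2 + total orders.
    return (x - 2 - sum(y)) ** 2


s_empty = base_stock_level(toy_v, (0, 0), -50, 50)
s_one_order = base_stock_level(toy_v, (1, 0), -50, 50)
```

For the stand-in function the level rises by exactly one when an order is added, which is the largest jump that part (a) of the theorem permits.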

As we did for systems with independent updating, it is possible to evaluate the benefit of ADI

with SDD features. Table S-3 in the Online Supplement contains numerical results for systems

with SDD, and shows the effects of parameters ν, k, and ρ.

6 Concluding Comments

In this paper, we considered production-inventory systems where the production facility has access

to ADI in the form of advance order announcements and subsequent updates. The ADI is not

perfect because orders may become due before or after announced expected due dates, the time

between updates is random, and announced orders may be canceled. In addition, only a fraction of

the customers may provide ADI. We considered two schemes through which demand information is

revealed, one in which due dates are independent and the other in which they are sequential. For

each scheme, we formulated the production control problem as a continuous-time Markov decision

process and showed that there is an optimal state-dependent base-stock policy, with base-stock

levels that are non-decreasing in the number of announced orders at each stage of update. We also

showed that the base-stock level increases by at most one unit with a unit increase in the number

of orders at any stage.

In numerical experiments, we observed that the cost reduction to the supplier from the introduction

of ADI is sensitive to the number of update stages and the length of each stage. Although

adding more update stages is always beneficial, increasing the average length of stages may increase

or decrease cost, with ADI being most valuable when the average stage length is moderate. We

also observed that in many cases, much of the benefit of updating can be achieved with one update.

Although ADI is always beneficial to the supplier, we observed that this may not be the case for the

customers who provide the ADI. In some cases, the supplier uses ADI to reduce inventory at the

expense of higher backorders. We showed that a possible remedy is for the customers to negotiate

higher backorder penalties in exchange for ADI.

There are several avenues for future research. It would be of interest to consider systems

where order sizes are variable and where the actual size of an order is not known exactly until the

order becomes due. This would generalize the model with cancelations by assigning a probability

distribution to orders that allows sizes other than zero or one. It would also be of interest to

consider multiple customer classes with differing backorder costs. Although the problem would be

made difficult by the need for a higher-dimensional state space, we expect there would again be an

optimal state-dependent base-stock policy, where the state would include backorder levels of orders

from each customer class. In addition to production, the policy would specify whether an order

that becomes due should be satisfied from available inventory, if there is any, or backordered. This

decision would of course depend on the backorder cost associated with the order’s class.

Appendix

Proof of Theorem 1. Any stationary policy that uses for each $(x,\mathbf{y}) \in S_m$ an action that attains the minimum in $Tv^*_m(x,\mathbf{y})$ is optimal. Hence, the policy that prescribes action $a = 1$ in states $S^1_m := \{(x,\mathbf{y}) \in S_m : v^*_m(x+1,\mathbf{y}) < v^*_m(x,\mathbf{y})\}$ and action $a = 0$ in states $S^0_m := \{(x,\mathbf{y}) \in S_m : v^*_m(x+1,\mathbf{y}) \ge v^*_m(x,\mathbf{y})\}$ is optimal. By Proposition 1, $v^*_m$ satisfies Condition (C1), so $\pi^*$ defined in (7) satisfies $\pi^*(x,\mathbf{y}) = 1$ for $(x,\mathbf{y}) \in S^1_m$ and $\pi^*(x,\mathbf{y}) = 0$ for $(x,\mathbf{y}) \in S^0_m$. Hence $\pi^*$ is optimal.

To prove (a), note that by Proposition 1, the value function $v^*_m$ satisfies conditions (C2) and (C3). Applying condition (C3) $l - j$ times and using the definition of $s_{\mathbf{y}+e_l}$, we have $\Delta v^*_m(s_{\mathbf{y}+e_l}, \mathbf{y}+e_j) \ge \Delta v^*_m(s_{\mathbf{y}+e_l}, \mathbf{y}+e_l) \ge 0$, which implies $s_{\mathbf{y}+e_l} \ge s_{\mathbf{y}+e_j}$. Also, by condition (C2) and the definition of $s_{\mathbf{y}+e_j}$, we have $\Delta v^*_m(s_{\mathbf{y}+e_j} + 1, \mathbf{y}+e_l) \ge \Delta v^*_m(s_{\mathbf{y}+e_j}, \mathbf{y}+e_j) \ge 0$, which implies $s_{\mathbf{y}+e_j} + 1 \ge s_{\mathbf{y}+e_l}$. Therefore, $s_{\mathbf{y}+e_j} \le s_{\mathbf{y}+e_l} \le s_{\mathbf{y}+e_j} + 1$ and hence $s_{\mathbf{y}+e_l}$ is equal to either $s_{\mathbf{y}+e_j}$ or $s_{\mathbf{y}+e_j} + 1$. Finally, part (b) is a consequence of the fact that $v^*_m$ satisfies condition (C4).

An alternative proof of the optimality of a state-dependent base-stock policy and for part (a) can

be obtained by casting the problem in terms of service rate control and using results on monotone

optimal policies for continuous-time MDPs in Veatch and Wein (1992).

Proof of Theorem 2. The first statement follows from Theorem 3.2 of GH. Lemma S-2 in the

Online Supplement shows that the conditions needed to apply their theorem hold for our problem.

For each m, extend the domain of v∗m from Sm to S by defining v∗m(x,y) := 0 for (x,y) ∈ S\Sm.

To prove the remaining statements, we begin by showing that there exists a pointwise convergent

subsequence of {v∗m}. Lemma 1 below implies for each (x,y) ∈ S that {v∗m(x,y)} is a bounded

sequence of real numbers (note that the bound does not depend upon m). Hence, for each (x,y),

the sequence {v∗m(x,y)} has a convergent subsequence in R.

Let $\{z_1, z_2, z_3, \ldots\}$ be an enumeration of the countable space $S$ [each $z_i$ is some element $(x,\mathbf{y})$ of $S$]. We now proceed with a diagonalization argument to construct the pointwise convergent subsequence of $\{v^*_m\}$. Let $\{m_{1,j} : j = 1, 2, \ldots\}$ be such that $\lim_{j\to\infty} v^*_{m_{1,j}}(z_1)$ exists. Next, let $\{m_{2,j} : j = 1, 2, \ldots\}$ be a subsequence of $\{m_{1,j} : j = 2, 3, \ldots\}$ such that $\lim_{j\to\infty} v^*_{m_{2,j}}(z_2)$ exists. Note also that $\lim_{j\to\infty} v^*_{m_{2,j}}(z_1)$ exists, because $\{m_{2,j} : j = 1, 2, \ldots\} \subseteq \{m_{1,j} : j = 2, 3, \ldots\}$. Continuing in this fashion, we proceed sequentially to extract subsequences of subsequences so that for each $n$ we have $\{m_{n,j} : j = 1, 2, \ldots\} \subseteq \{m_{n-1,j} : j = n, n+1, \ldots\}$ and $\lim_{j\to\infty} v^*_{m_{n,j}}(z_i)$ exists for $i = 1, \ldots, n$. Let $m_j := m_{j,j}$. It can now be seen that $\lim_{j\to\infty} v^*_{m_j}$ exists pointwise. (Alternatively, we may appeal to Tychonoff’s Theorem to reach this conclusion; see, e.g., Bremaud 1999.) Let $v^{**}$ denote the limit; that is, $v^{**} : S \to \mathbb{R}$ is defined to be the function for which $\lim_{j\to\infty} v^*_{m_j}(x,\mathbf{y}) = v^{**}(x,\mathbf{y})$ for all $(x,\mathbf{y}) \in S$.

Next, we show that the limit $v^{**}$ is in fact the value function $v^*$ for the problem with unbounded jump rates. To do so, it will be helpful to re-write the optimality equation (6) from Section 3.1 as $v = L_m v$, where operator $L_m$ is given by

$$L_m v(x,\mathbf{y}) := \frac{1}{Q_m(\mathbf{y})}\Big[c(x) + \lambda I\{y < m\}\, v(x,\mathbf{y}+e_1) + \sum_{i=2}^{k} \nu_{i-1} y_{i-1}\, v(x,\mathbf{y}+e_i-e_{i-1}) + \nu_k y_k\, v(x-1,\mathbf{y}-e_k) + \mu \min\{v(x,\mathbf{y}),\ v(x+1,\mathbf{y})\}\Big]$$

and $Q_m(\mathbf{y}) := \beta + \lambda I\{y < m\} + \sum_{i=1}^{k} \nu_i y_i + \mu$. By rearranging terms, it can be checked that the equation $v = L_m v$ is equivalent to (6). Keep in mind that $T$ in (6) depends upon $m$.
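For a single update stage (k = 1), one application of the operator $L_m$ can be coded in a few lines. This is a sketch under assumptions: the cost rate is taken to be c(x) = h·max(x, 0) + b·max(−x, 0), which is consistent with the bound c(x) ≤ (h + b)|x| used later in the Appendix but is not restated in this section, and the function names are hypothetical.

```python
def holding_backorder_cost(x, h, b):
    """Assumed cost rate: h per unit held, b per unit backordered."""
    return h * max(x, 0) + b * max(-x, 0)


def apply_Lm(v, x, y, m, lam, mu, nu, beta, h, b):
    """Evaluate (L_m v)(x, y) for k = 1, where y is the scalar order count.

    v is any callable v(x, y); arrivals are admitted only while y < m.
    """
    admit = 1 if y < m else 0
    q = beta + lam * admit + nu * y + mu       # normalizing rate Q_m(y)
    total = holding_backorder_cost(x, h, b)
    if admit:
        total += lam * v(x, y + 1)             # new order announced
    if y > 0:
        total += nu * y * v(x - 1, y - 1)      # an order becomes due
    total += mu * min(v(x, y), v(x + 1, y))    # idle versus produce
    return total / q


params = dict(m=5, lam=0.8, mu=1.0, nu=0.5, beta=0.1, h=1.0, b=50.0)
at_zero = apply_Lm(lambda x, y: 0.0, 2, 1, **params)
at_const = apply_Lm(lambda x, y: 10.0, 2, 1, **params)
```

Iterating this map over a truncated grid of states is one way to approximate the fixed point, though the truncation in x would be a further approximation not discussed here.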

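For any function $v$ on $S$ and any $(x,\mathbf{y}) \in S$, observe that $L_m v(x,\mathbf{y}) = Lv(x,\mathbf{y})$ if $m > y$. Hence, for any $(x,\mathbf{y}) \in S$ it follows that

$$v^{**}(x,\mathbf{y}) = \lim_{j\to\infty} v^*_{m_j}(x,\mathbf{y}) = \lim_{j\to\infty} L_{m_j} v^*_{m_j}(x,\mathbf{y}) = \lim_{j\to\infty} L v^*_{m_j}(x,\mathbf{y}). \tag{19}$$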

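For any function $v$ on $S$ and any $(x,\mathbf{y}) \in S$ we next re-express $Lv(x,\mathbf{y})$. To this end, for given $(x,\mathbf{y})$ consider the continuous function $L_{(x,\mathbf{y})} : \mathbb{R}^{k+3} \to \mathbb{R}$ defined by

$$L_{(x,\mathbf{y})}(\varphi_1, \ldots, \varphi_{k+3}) := \frac{1}{Q(\mathbf{y})}\Big[c(x) + \lambda \varphi_1 + \sum_{i=2}^{k+1} \nu_{i-1} y_{i-1} \varphi_i + \mu \min\{\varphi_{k+2}, \varphi_{k+3}\}\Big].$$

It can now be seen that

$$Lv(x,\mathbf{y}) = L_{(x,\mathbf{y})}\big(v(x,\mathbf{y}+e_1),\ v(x,\mathbf{y}+e_2-e_1),\ \ldots,\ v(x,\mathbf{y}+e_k-e_{k-1}),\ v(x-1,\mathbf{y}-e_k),\ v(x,\mathbf{y}),\ v(x+1,\mathbf{y})\big).$$

For any sequence of functions $\{u_j\}$ with $u_j \to u$ pointwise, we have

$$\begin{aligned}
\lim_{j\to\infty} L u_j(x,\mathbf{y}) &= \lim_{j\to\infty} L_{(x,\mathbf{y})}\big(u_j(x,\mathbf{y}+e_1),\ u_j(x,\mathbf{y}+e_2-e_1),\ \ldots,\ u_j(x,\mathbf{y}+e_k-e_{k-1}), \\
&\qquad\qquad u_j(x-1,\mathbf{y}-e_k),\ u_j(x,\mathbf{y}),\ u_j(x+1,\mathbf{y})\big) \\
&= L_{(x,\mathbf{y})}\big(u(x,\mathbf{y}+e_1),\ u(x,\mathbf{y}+e_2-e_1),\ \ldots,\ u(x,\mathbf{y}+e_k-e_{k-1}), \\
&\qquad\qquad u(x-1,\mathbf{y}-e_k),\ u(x,\mathbf{y}),\ u(x+1,\mathbf{y})\big) \\
&= L u(x,\mathbf{y}).
\end{aligned} \tag{20}$$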

Note that in (20), we may pass the limit inside $L_{(x,\mathbf{y})}$ because $L_{(x,\mathbf{y})}$ is continuous. Applying the preceding observation with $\{u_j\} = \{v^*_{m_j}\}$ and $u = v^{**}$ and using (19), it follows that $v^{**}(x,\mathbf{y}) = Lv^{**}(x,\mathbf{y})$. Now, because $(x,\mathbf{y})$ was arbitrary, we see that $v^{**} = Lv^{**}$. That is, $v^{**}$ solves the optimality equation for the problem with unbounded jump rates. Moreover, it can be seen from Lemma 1 that $v^{**}$ is in $B_R(S)$ with $c_1 = \beta^{-2}(h+b)(\lambda+\mu)$ and $c_2 = \beta^{-1}(h+b)$. Therefore, it follows from the first part of the theorem that $v^{**} = v^*$; that is, $v^{**}$ is the value function for the problem with unbounded jump rates. It can now readily be verified that $v^* = \lim_{j\to\infty} v^*_{m_j}$ satisfies conditions (C1) through (C4) with $\mathbb{Z}^k_+(m-1)$ and $\mathbb{Z}^k_+(m)$ replaced by $\mathbb{Z}^k_+$.

Theorem 3.3 of GH implies that the stationary policy that produces in states S1 := {(x,y) ∈

S : v∗(x + 1,y) < v∗(x,y)} and that idles in states S0 := {(x,y) ∈ S : v∗(x + 1,y) ≥ v∗(x,y)}

is optimal for the problem with unbounded jump rates. By an argument identical to the proof of

Theorem 1, such a policy is a state-dependent base-stock policy that satisfies (a) and (b).

Lemma 1 $0 \le v^*_m(x,\mathbf{y}) \le \beta^{-1}(h+b)R(x,\mathbf{y}) + \beta^{-2}(h+b)(\lambda+\mu) < \infty$, where $v^*_m$ is the value function for the problem with bounded jump rates in Section 3.1 and the function $R$ is defined in (10).

Proof. Fix $m < \infty$ and $(x,\mathbf{y}) \in S_m$. To bound $v^*_m(x,\mathbf{y})$ from above, it suffices to obtain an upper bound on the expected discounted cost of using the policy $\pi^+$ that “always produces” [$\pi^+(x',\mathbf{y}') = 1$ for all $(x',\mathbf{y}') \in S_m$]. To this end, we begin by developing an explicit construction of a version of the continuous-time Markov chain (CTMC) induced by $\pi^+$. Suppose that $\{A_i : i = 1, 2, \ldots\}$ is an i.i.d. sequence of uniform $[0,1]$ random variables and that $\{E_i : i = 1, 2, \ldots\}$ is an i.i.d. sequence of exponential random variables each with mean $\Lambda^{-1}$, independent of $\{A_i : i = 1, 2, \ldots\}$. Let $\mathcal{E}_0 := 0$, $\mathcal{E}_n := \sum_{j=1}^{n} E_j$ for $n = 1, 2, \ldots$, and $N(t) := \max\{n \ge 0 : \mathcal{E}_n \le t\}$.

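Recall the notation $y' = \sum_{i=1}^{k} y'_i$. Consider the function $f : S_m \times [0,1] \to S_m$ given by

$$f((x',\mathbf{y}'), a) := \begin{cases}
(x'+1,\ \mathbf{y}') & \text{if } a \in \big[0,\ \mu/\Lambda\big] \\
(x',\ \mathbf{y}' + e_1 I\{y' < m\}) & \text{if } a \in \big(\mu/\Lambda,\ (\lambda+\mu)/\Lambda\big] \\
(x',\ \mathbf{y}' + e_{i+1} - e_i) & \text{if } a \in \Big(\big(\lambda+\mu+\sum_{j=1}^{i-1}\nu_j y'_j\big)/\Lambda,\ \big(\lambda+\mu+\sum_{j=1}^{i}\nu_j y'_j\big)/\Lambda\Big] \text{ for } i = 1, \ldots, k-1 \\
(x'-1,\ \mathbf{y}' - e_k) & \text{if } a \in \Big(\big(\lambda+\mu+\sum_{j=1}^{k-1}\nu_j y'_j\big)/\Lambda,\ \big(\lambda+\mu+\sum_{j=1}^{k}\nu_j y'_j\big)/\Lambda\Big] \\
(x',\ \mathbf{y}') & \text{otherwise.}
\end{cases}$$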

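For the fixed value $(x,\mathbf{y}) \in S_m$, define

$$(X_0,\mathbf{Y}_0) := (x,\mathbf{y}), \qquad (X_n,\mathbf{Y}_n) := f((X_{n-1},\mathbf{Y}_{n-1}), A_n), \quad n = 1, 2, \ldots \tag{21}$$

and $(X(t),\mathbf{Y}(t)) := (X_{N(t)},\mathbf{Y}_{N(t)})$. Note that $\{(X(t),\mathbf{Y}(t)) : t \ge 0\}$ has the distribution of the CTMC induced by $\pi^+$, as desired.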

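For $n = 1, 2, \ldots$ define

$$R_n := \big|\{j \in \{1, \ldots, n\} : (X_j,\mathbf{Y}_j) = (X_{j-1}+1,\ \mathbf{Y}_{j-1})\}\big|$$

$$U_n := \big|\{j \in \{1, \ldots, n\} : (X_j,\mathbf{Y}_j) = (X_{j-1},\ \mathbf{Y}_{j-1}+e_1)\}\big|$$

$$D_n := \big|\{j \in \{1, \ldots, n\} : (X_j,\mathbf{Y}_j) = (X_{j-1}-1,\ \mathbf{Y}_{j-1}-e_k)\}\big|$$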

where here | · | is set cardinality. When k = 1, it is possible to visualize Rn, Un, and Dn by graphing

the transitions of {(X(t),Y(t))} on a two-dimensional grid. Then, Rn is how many of the first n

transitions are “to the right”, Un is how many of the first n transitions are “up”, and Dn is how

many of the first n transitions are “diagonal”. From the above definitions, note that

Xn = X0 + Rn − Dn and Yn = Y0 + Un − Dn . (22)

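(Again, $Y_n = \sum_{i=1}^{k} Y_{n,i}$.) For $n = 1, 2, \ldots$ also define

$$\bar{U}_n := \big|\{j \in \{1, \ldots, n\} : A_j \in \big(\mu/\Lambda,\ (\lambda+\mu)/\Lambda\big]\}\big|.$$

By construction, we have

$$U_n \le \bar{U}_n \quad \text{and} \quad D_n \le Y_0 + U_n. \tag{23}$$

Next we construct another process that is coupled with $\{(X(t),\mathbf{Y}(t))\}$. Consider the function $g : \mathbb{Z}_+ \times [0,1] \to \mathbb{Z}_+$ given by

$$g(z,a) := \begin{cases} z+1 & \text{if } a \in \big[0,\ (\lambda+\mu)/\Lambda\big] \\ z & \text{otherwise.} \end{cases}$$

Define

$$Z_0 := |x| + y, \qquad Z_n := g(Z_{n-1}, A_n), \quad n = 1, 2, \ldots,$$

where $(x,\mathbf{y})$ is as in (21) and $Z(t) := Z_{N(t)}$. Observe that

$$Z_n = Z_0 + R_n + \bar{U}_n = |x| + y + R_n + \bar{U}_n. \tag{24}$$

Combining (21)–(24), we get

$$\begin{aligned}
|X_n| = |X_0 + R_n - D_n| &\le |X_0| + R_n + D_n \\
&\le |X_0| + R_n + Y_0 + U_n \\
&\le |X_0| + R_n + Y_0 + \bar{U}_n \\
&= |x| + R_n + y + \bar{U}_n \\
&= Z_n.
\end{aligned}$$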

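Hence, $|X(t)| \le Z(t)$. Next, define $c^\dagger(x) := (h+b)x$. Note that $c(x) \le c^\dagger(|x|)$ and that $c^\dagger(\cdot)$ is increasing on $\mathbb{Z}_+$. Therefore,

$$E^{\pi^+}_{(x,\mathbf{y})} \int_{t=0}^{\infty} e^{-\beta t} c(X(t))\,dt = E\int_{t=0}^{\infty} e^{-\beta t} c(X(t))\,dt \le E\int_{t=0}^{\infty} e^{-\beta t} c^\dagger(|X(t)|)\,dt \le E\int_{t=0}^{\infty} e^{-\beta t} c^\dagger(Z(t))\,dt,$$

where $E$ is expectation on the probability space upon which $\{A_i\}$ and $\{E_i\}$ are defined and where the initial state is $(x,\mathbf{y})$. Hence, $E\int_{t=0}^{\infty} e^{-\beta t} c^\dagger(Z(t))\,dt$ is an upper bound on $v^*_m(x,\mathbf{y})$.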
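The pathwise domination $|X_n| \le Z_n$ behind this bound can be checked by driving both embedded chains with the same uniform draws. The sketch below is illustrative: the uniformization constant is taken as Λ = λ + μ + m·max ν, which is an assumption (the paper defines Λ in Section 3.1, outside this passage), and all function names and parameter values are hypothetical.

```python
import random


def step_f(x, y, a, lam, mu, nu, m, Lam):
    """One transition of the always-produce chain driven by uniform a."""
    y = list(y)
    k = len(y)
    if a <= mu / Lam:                      # production completes
        return x + 1, tuple(y)
    if a <= (lam + mu) / Lam:              # order announced if room remains
        if sum(y) < m:
            y[0] += 1
        return x, tuple(y)
    lower = lam + mu
    for i in range(k):                     # update of an order in stage i + 1
        upper = lower + nu[i] * y[i]
        if lower / Lam < a <= upper / Lam:
            y[i] -= 1
            if i < k - 1:
                y[i + 1] += 1
                return x, tuple(y)
            return x - 1, tuple(y)         # last stage: order becomes due
        lower = upper
    return x, tuple(y)                     # fictitious self-transition


def step_g(z, a, lam, mu, Lam):
    """Dominating chain: moves up on every production or arrival draw."""
    return z + 1 if a <= (lam + mu) / Lam else z


def coupling_holds(x0, y0, steps, seed, lam, mu, nu, m):
    Lam = lam + mu + m * max(nu)           # assumed uniformization constant
    rng = random.Random(seed)
    x, y, z = x0, tuple(y0), abs(x0) + sum(y0)
    for _ in range(steps):
        a = rng.random()
        x, y = step_f(x, y, a, lam, mu, nu, m, Lam)
        z = step_g(z, a, lam, mu, Lam)
        if abs(x) > z:
            return False
    return True
```

Calling `coupling_holds(-3, (2, 1), 5000, 1, lam=0.8, mu=1.0, nu=(0.5, 0.2), m=10)` should return True for any seed, since the inequality holds on every sample path.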

Regardless of m, the process {Z(t)} has the following “dynamics”: Z(0) = |x| + y and Z(·)

remains in state (say) z ∈ Z+ an exponential amount of time with mean 1/(λ + µ) before moving

to state z + 1 ∈ Z+. The latter fact can be verified by conditioning on the geometric number of

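transitions made from $z$ back to $z$ by the embedded process $\{Z_n\}$. Direct calculations using value iteration [to compute the expected discounted cost accrued by $\{Z(t)\}$ through the time of its (say) $j$-th jump to the right] and induction show that

$$\begin{aligned}
E\int_{t=0}^{\infty} e^{-\beta t} c^\dagger(Z(t))\,dt &= \frac{(h+b)(|x|+y)}{\lambda+\mu+\beta} \sum_{i \ge 0} \left(\frac{\lambda+\mu}{\lambda+\mu+\beta}\right)^{i} + \frac{h+b}{\lambda+\mu+\beta} \sum_{i \ge 1} i \left(\frac{\lambda+\mu}{\lambda+\mu+\beta}\right)^{i} \\
&= \frac{(h+b)(|x|+y)}{\beta} + \frac{(h+b)(\lambda+\mu)}{\beta^2} < \infty,
\end{aligned}$$

regardless of $m$. This completes the proof.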
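The closed form follows from the geometric-series identities $\sum_{i\ge 0} r^i = 1/(1-r)$ and $\sum_{i\ge 1} i r^i = r/(1-r)^2$ with $r = (\lambda+\mu)/(\lambda+\mu+\beta)$. As a sanity check, the sketch below compares truncated sums with the closed form; the function names and parameter values are illustrative assumptions.

```python
def bound_via_series(z0, h, b, lam, mu, beta, terms=5000):
    """Truncated evaluation of the two series in the bound (z0 = |x| + y)."""
    r = (lam + mu) / (lam + mu + beta)
    prefactor = (h + b) / (lam + mu + beta)
    geometric = sum(r ** i for i in range(terms))
    weighted = sum(i * r ** i for i in range(1, terms))
    return prefactor * z0 * geometric + prefactor * weighted


def bound_closed_form(z0, h, b, lam, mu, beta):
    """Closed form: (h + b) z0 / beta + (h + b)(lam + mu) / beta^2."""
    return (h + b) * z0 / beta + (h + b) * (lam + mu) / beta ** 2


series_value = bound_via_series(3, h=1.0, b=50.0, lam=0.8, mu=1.0, beta=0.1)
closed_value = bound_closed_form(3, h=1.0, b=50.0, lam=0.8, mu=1.0, beta=0.1)
```

With these illustrative values the two quantities agree up to floating-point error, and the bound grows linearly in |x| + y as the lemma states.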

References

Bremaud, P., 1999. Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Springer-

Verlag, New York.

Buzacott, J. A., Shanthikumar, J. G., 1993. Stochastic Models of Manufacturing Systems. Prentice-

Hall, Upper Saddle River, NJ.

Buzacott, J. A., Shanthikumar, J. G., 1994. Safety stock versus safety time in MRP controlled

production systems. Management Science 40, 1678–1689.

Chen, F., Song, J.-S., 2001. Optimal policies for multiechelon inventory problems with Markov-

modulated demand. Operations Research 49, 226–234.

de Vericourt, F., Karaesmen, F., Dallery, Y., 2002. Optimal stock allocation for a capacitated

supply system. Management Science 48, 1486–1501.

DeCroix, G. A., Mookerjee, V. S., 1997. Purchasing demand information in a stochastic-demand

inventory system. European Journal of Operational Research 102, 36–57.

Duenyas, I., Hopp, W. J., 1995. Quoting customer lead times. Management Science 41, 43–57.

Gallego, G., Ozer, O., 2001. Integrating replenishment decisions with advance order information.

Management Science 47, 1344–1360.

Gallego, G., Ozer, O., 2002. Optimal use of demand information in supply chain management. In:

Song, J., Yao, D. (Eds.), Supply Chain Structures: Coordination, Information and Optimization.

Kluwer Academic Publishers, pp. 119–160.

Gallego, G., Ozer, O., 2003. Optimal replenishment policies for multi-echelon inventory problems

under advance order information. Manufacturing & Service Operations Management 5, 157–175.

Gavirneni, S., Kapuscinski, R., Tayur, S., 1999. Value of information in capacitated supply chains.

Management Science 45, 16–24.

Gayon, J.-P., Benjaafar, S., de Vericourt, F., 2009. Using imperfect advance demand information

in production-inventory systems with multiple customer classes. Manufacturing & Service

Operations Management 11, 128–143.

Graves, S. C., Meal, H. C., Dasu, S., Qiu, Y., 1986. Two-stage production planning in a dynamic

environment. In: Axsater, S., Schneeweiss, C., Silver, E. (Eds.), Multi-stage Production Planning

and Control. Springer-Verlag, Berlin.

Gullu, R., 1996. On the value of information in dynamic production/inventory problems under

forecast evolution. Naval Research Logistics 43, 289–303.

Guo, X., Hernandez-Lerma, O., 2003. Continuous-time controlled Markov chains with discounted

rewards. Acta Applicandae Mathematicae 79, 195–216.

Ha, A. Y., 1997. Inventory rationing in a make-to-stock production system with several demand

classes and lost sales. Management Science 43, 1093–1103.

Hariharan, R., Zipkin, P., 1995. Customer-order information, leadtimes, and inventories.

Management Science 41, 1599–1607.

Heath, D. C., Jackson, P. L., 1994. Modeling the evolution of demand forecasts with application to

safety-stock analysis in production/distribution systems. IIE Transactions 26, 17–30.

Hopp, W. J., Sturgis, M. R., 2001. A simple, robust leadtime-quoting policy. Manufacturing &

Service Operations Management 3, 321–336.

Karaesmen, F., Buzacott, J. A., Dallery, Y., 2002. Integrating advance order information in make-

to-stock production systems. IIE Transactions 34, 649–662.

Karaesmen, F., Liberopoulos, G., Dallery, Y., 2004. The value of advance demand information in

production/inventory systems. Annals of Operations Research 126, 135–158.

Liberopoulos, G., Chronis, A., Koukoumialos, S., 2003. Base stock policies with some unreliable

advance demand information. In: Proceeding of the 4th Aegean International Conference on

Analysis of Manufacturing Systems. pp. 77–86.

Lippman, S., 1975. Applying a new device in the optimization of exponential queueing systems.

Operations Research 23, 687–710.

Ozer, O., 2003. Replenishment strategies for distribution systems under advance demand

information. Management Science 49, 255–272.

Ozer, O., Wei, W., 2004. Inventory control with limited capacity and advance demand information.

Operations Research 52, 988–1000.

Puterman, M. L., 1994. Markov Decision Processes: Discrete Stochastic Dynamic Programming.

John Wiley & Sons, New York.

Schwarz, L. B., Petruzzi, N. C., Wee, K., 1997. The value of advance-order information and the

implications for managing the supply chain: an information/control/buffer portfolio perspective,

working paper, Purdue University.

Sethi, S. P., Yan, H., Zhang, H., 2001. Peeling layers of an onion: Inventory model with multiple

delivery modes and forecast updates. Journal of Optimization Theory and Applications 108,

253–281.

Thonemann, U. W., 2002. Improving supply-chain performance by sharing advance demand infor-

mation. European Journal of Operational Research 142, 81–107.

van Donselaar, K., Kopczak, L. R., Wouters, M., 2001. The use of advance demand information in

a project-based supply chain. European Journal of Operational Research 130, 519–528.

Veatch, M. H., Wein, L. M., 1992. Monotone control of queueing networks. Queueing Systems 12,

391–408.

Veatch, M. H., Wein, L. M., 1996. Scheduling a make-to-stock queue: Index policies and hedging

points. Operations Research 44, 634–647.

Wang, T., Toktay, B. L., 2008. Inventory management with advance demand information and

flexible delivery. Management Science 54, 716–732.

Zhu, K., Thonemann, U. W., 2004. Modeling the benefits of sharing future demand information.

Operations Research 52, 136–147.

Zipkin, P. H., 2000. Foundations of Inventory Management. McGraw-Hill, New York.


Online Supplement

S-1 Proof of Proposition 1

Proof. The proof of the proposition has two main components. We first establish that for any

v ∈ U , we have that Tv ∈ U . Then we use this fact to show that v∗m ∈ U .

Define Ti by

Tiv(x,y) := yiT′iv(x,y) + (m − yi)v(x,y) i = 1, . . . , k . (S-1)

Operator T defined in (4) can now be expressed as

$$Tv(x,\mathbf{y}) = \gamma^{-1}\Big[c(x) + \lambda T_\lambda v(x,\mathbf{y}) + \sum_{i=1}^{k} \nu_i T_i v(x,\mathbf{y}) + \mu T_\mu v(x,\mathbf{y})\Big].$$

It is easy to verify that c(·) ∈ U . Therefore, to prove that if v ∈ U then Tv ∈ U , it suffices to show

that if v ∈ U , then Tλv, Tµv, Tiv ∈ U . Clearly, Tλv, Tµv, Tiv ∈ V when v ∈ V . In the remainder

of the proof, we show that v ∈ U implies that Tλv, Tµv, and Tiv satisfy (C1)–(C4). To this end,

suppose v ∈ U .

Condition (C1):

It is straightforward to verify that Tλ and Ti; i = 1, . . . , k preserve condition (C1). For Tµ, we need

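to show that ∆Tµv(x,y) is non-decreasing in x. Let $x^*_{\mathbf{y}} := \min\{x : \Delta v(x,\mathbf{y}) \ge 0\}$. Then,

$$\Delta T_\mu v(x,\mathbf{y}) = \begin{cases} \Delta v(x+1,\mathbf{y}) & \text{if } x < x^*_{\mathbf{y}} - 1,\\ 0 & \text{if } x = x^*_{\mathbf{y}} - 1,\\ \Delta v(x,\mathbf{y}) & \text{if } x > x^*_{\mathbf{y}} - 1. \end{cases}$$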

Hence, ∆Tµv(x,y) is non-decreasing in x because v is (by assumption).
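The three-case expression for ∆Tµv can be checked numerically. The following is a minimal sketch and not part of the original proof. It assumes Tµv(x,y) = min{v(x + 1,y), v(x,y)} (the form used in the argument for condition (C4)), holds y fixed and suppresses it, and uses a convex test function `v_test` that is purely hypothetical.

```python
# Illustrative check of the case formula for the difference of T_mu v.
# Assumption: T_mu v(x) = min{v(x + 1), v(x)}; y is held fixed and suppressed.

def delta(v, x):
    """Forward difference v(x + 1) - v(x)."""
    return v(x + 1) - v(x)

def t_mu(v):
    """Return the function x -> min{v(x + 1), v(x)}."""
    return lambda x: min(v(x + 1), v(x))

def x_star(v, lo, hi):
    """Smallest x in [lo, hi] with a non-negative forward difference."""
    for x in range(lo, hi + 1):
        if delta(v, x) >= 0:
            return x
    raise ValueError("no x in range with a non-negative difference")

def case_formula(v, x, xs):
    """Right-hand side of the three-case expression, with xs playing the role of x*."""
    if x < xs - 1:
        return delta(v, x + 1)
    if x == xs - 1:
        return 0
    return delta(v, x)

def v_test(x):
    """Hypothetical convex test function (integer-valued on the integers)."""
    return (x - 3) ** 2 + 2 * abs(x)

xs = x_star(v_test, -20, 20)
tv = t_mu(v_test)
diffs = [delta(tv, x) for x in range(-10, 11)]
formula_ok = all(delta(tv, x) == case_formula(v_test, x, xs) for x in range(-10, 11))
monotone_ok = all(a <= b for a, b in zip(diffs, diffs[1:]))
print(xs, formula_ok, monotone_ok)
```

For this test function the differences of Tµv agree with the case formula on the checked range and are non-decreasing, as condition (C1) requires.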

Condition (C2):

(i) For operator Tλ we need to show that ∆Tλv(x,y+ej) ≤ ∆Tλv(x+1,y+el) for j = 0, . . . , k−1, and l = j + 1, . . . , k. If $\sum_{i=1}^{k} y_i < m - 1$, then

∆Tλv(x,y + ej) = ∆v(x,y + ej + e1) ≤ ∆v(x + 1,y + el + e1) = ∆Tλv(x + 1,y + el).

If $\sum_{i=1}^{k} y_i = m - 1$, then

∆Tλv(x,y + ej) = ∆v(x,y + ej) ≤ ∆v(x + 1,y + el) = ∆Tλv(x + 1,y + el).

The inequalities follow from the fact that v satisfies condition (C2).


(ii) For operator Ti, we need to show that ∆Tiv(x,y+ej) ≤ ∆Tiv(x+1,y+el) for j = 0, . . . , k−1

and l = j + 1, . . . , k. For i = 1, . . . , k − 1, let J = I{i=j} and L = I{i=l}. When (J,L) ∈

{(0, 0), (0, 1)} the inequalities (S-6) and (S-7) below hold because v satisfies condition (C2).

For (J,L) = (1, 0), the inequalities hold because v satisfies conditions (C2) and (C3).

If yi ≥ 1, we have

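$$\begin{aligned}
&\Delta T_i v(x,\mathbf{y}+e_j) - \Delta T_i v(x+1,\mathbf{y}+e_l)\\
&\quad= (y_i+J)\,\Delta v(x,\mathbf{y}+e_j+e_{i+1}-e_i) + (m-y_i-J)\,\Delta v(x,\mathbf{y}+e_j)\\
&\qquad - (y_i+L)\,\Delta v(x+1,\mathbf{y}+e_l+e_{i+1}-e_i) - (m-y_i-L)\,\Delta v(x+1,\mathbf{y}+e_l)\\
&\quad= y_i\big[\Delta v(x,\mathbf{y}+e_j+e_{i+1}-e_i) - \Delta v(x+1,\mathbf{y}+e_l+e_{i+1}-e_i)\big] && \text{(S-2)}\\
&\qquad + (m-y_i-L)\big[\Delta v(x,\mathbf{y}+e_j) - \Delta v(x+1,\mathbf{y}+e_l)\big] && \text{(S-3)}\\
&\qquad + L\big[\Delta v(x,\mathbf{y}+e_j) - \Delta v(x+1,\mathbf{y}+e_l+e_{i+1}-e_i)\big] && \text{(S-4)}\\
&\qquad + J\big[\Delta v(x,\mathbf{y}+e_j+e_{i+1}-e_i) - \Delta v(x,\mathbf{y}+e_j)\big] && \text{(S-5)}\\
&\quad\le 0. && \text{(S-6)}
\end{aligned}$$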

For additional clarification, observe that if (J,L) = (1, 0), then the term in (S-4) is zero and

we also have i = j and i + 1 = j + 1. So, in (S-5) we have

J[∆v(x,y + ej + ei+1 − ei) − ∆v(x,y + ej)] = ∆v(x,y + ej+1) − ∆v(x,y + ej).

This expression is non-positive because v is assumed to satisfy condition (C3). The terms in

(S-2) and (S-3) are non-positive because v satisfies condition (C2).

If yi = 0, we have

∆Tiv(x,y + ej) − ∆Tiv(x + 1,y + el)

= J∆v(x,y + ej + J(ei+1 − ei)) + (m − J)∆v(x,y + ej)

−L∆v(x + 1,y + el + L(ei+1 − ei)) − (m − L)∆v(x + 1,y + el)

= (m − L)[∆v(x,y + ej) − ∆v(x + 1,y + el)

]

+L[∆v(x,y + ej) − ∆v(x + 1,y + el + L(ei+1 − ei))

]

+J[∆v(x,y + ej + J(ei+1 − ei)) − ∆v(x,y + ej)

]

≤ 0. (S-7)

Now we consider operator Tk. Let I = I{l=k}. In both cases below (yk ≥ 1 and yk = 0), the

inequalities follow from the fact that v satisfies conditions (C2) and (C3).


If yk ≥ 1, we have

∆Tkv(x,y + ej) − ∆Tkv(x + 1,y + el)

= yk∆v(x − 1,y + ej − ek) + (m − yk)∆v(x,y + ej)

−(yk + I)∆v(x,y + el − ek) − (m − yk − I)∆v(x + 1,y + el)

= yk

[∆v(x − 1,y + ej − ek) − ∆v(x,y + el − ek)

]

+(m − yk − I)[∆v(x,y + ej) − ∆v(x + 1,y + el)

]

+I[∆v(x,y + ej) − ∆v(x,y + el − ek)

]

≤ 0.

For yk = 0, we have

∆Tkv(x,y + ej) − ∆Tkv(x + 1,y + el)

= m∆v(x,y + ej) − I∆v(x + 1 − I,y + el − Iek) − (m − I)∆v(x + 1,y + el)

= (m − I)[∆v(x,y + ej) − ∆v(x + 1,y + el)

]

+I[∆v(x,y + ej) − ∆v(x + 1 − I,y + el − Iek)

]

≤ 0.

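(iii) To verify Tµv satisfies condition (C2), let $x^*_{\mathbf{y}+e_j} := \min\{x : \Delta v(x,\mathbf{y}+e_j) \ge 0\}$ and $x^*_{\mathbf{y}+e_l} := \min\{x : \Delta v(x,\mathbf{y}+e_l) \ge 0\}$. By condition (C3) and the definition of $x^*_{\mathbf{y}+e_l}$, we have $\Delta v(x^*_{\mathbf{y}+e_l},\mathbf{y}+e_j) \ge \Delta v(x^*_{\mathbf{y}+e_l},\mathbf{y}+e_l) \ge 0$. This implies $x^*_{\mathbf{y}+e_l} \ge x^*_{\mathbf{y}+e_j}$. Also, by condition (C2) and the definition of $x^*_{\mathbf{y}+e_j}$, we have $\Delta v(x^*_{\mathbf{y}+e_j}+1,\mathbf{y}+e_l) \ge \Delta v(x^*_{\mathbf{y}+e_j},\mathbf{y}+e_j) \ge 0$. This implies $x^*_{\mathbf{y}+e_j}+1 \ge x^*_{\mathbf{y}+e_l}$. Therefore, $x^*_{\mathbf{y}+e_j} \le x^*_{\mathbf{y}+e_l} \le x^*_{\mathbf{y}+e_j}+1$ and consequently $x^*_{\mathbf{y}+e_l}$ is equal to either $x^*_{\mathbf{y}+e_j}$ or $x^*_{\mathbf{y}+e_j}+1$.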

If $x^*_{\mathbf{y}+e_l} = x^*_{\mathbf{y}+e_j}$, we distinguish four subcases:

(a) $x < x^*_{\mathbf{y}+e_l} - 2 = x^*_{\mathbf{y}+e_j} - 2$:

∆Tµv(x + 1,y + el) = ∆v(x + 2,y + el) ≥ ∆v(x + 1,y + ej) = ∆Tµv(x,y + ej).

(b) $x = x^*_{\mathbf{y}+e_l} - 2 = x^*_{\mathbf{y}+e_j} - 2$:

∆Tµv(x + 1,y + el) = 0 ≥ ∆v(x + 1,y + ej) = ∆Tµv(x,y + ej).

(c) $x = x^*_{\mathbf{y}+e_l} - 1 = x^*_{\mathbf{y}+e_j} - 1$:

∆Tµv(x + 1,y + el) = ∆v(x + 1,y + el) ≥ 0 = ∆Tµv(x,y + ej).

(d) $x > x^*_{\mathbf{y}+e_l} - 1 = x^*_{\mathbf{y}+e_j} - 1$:

∆Tµv(x + 1,y + el) = ∆v(x + 1,y + el) ≥ ∆v(x,y + ej) = ∆Tµv(x,y + ej).

If $x^*_{\mathbf{y}+e_l} = x^*_{\mathbf{y}+e_j} + 1$, we distinguish three subcases:

(a) $x < x^*_{\mathbf{y}+e_j} - 1$:

∆Tµv(x + 1,y + el) = ∆v(x + 2,y + el) ≥ ∆v(x + 1,y + ej) = ∆Tµv(x,y + ej).

(b) $x = x^*_{\mathbf{y}+e_l} - 2 = x^*_{\mathbf{y}+e_j} - 1$:

∆Tµv(x + 1,y + el) = ∆Tµv(x,y + ej) = 0.

(c) $x > x^*_{\mathbf{y}+e_j} - 1$:

∆Tµv(x + 1,y + el) = ∆v(x + 1,y + el) ≥ ∆v(x,y + ej) = ∆Tµv(x,y + ej).

Condition (C3):

(i) For operator Tλ it is straightforward to check that Tλv satisfies condition (C3) when v ∈ U .

(ii) For operator Ti, we need to show that ∆Tiv(x,y+ej+1) ≤ ∆Tiv(x,y+ej) for j = 0, . . . , k−1.

Consider first i = 1, . . . , k − 1, and let J = I{i=j} and K = I{i=j+1}. The three possible

combinations of J and K are (J,K) ∈ {(0, 0), (0, 1), (1, 0)}. The inequalities (S-8) and (S-9)

below follow from the fact that v satisfies condition (C3).

If yi ≥ 1, we have

∆Tiv(x,y + ej+1) − ∆Tiv(x,y + ej)

= (yi + K)∆v(x,y + ej+1 + ei+1 − ei) + (m − yi − K)∆v(x,y + ej+1)

−(yi + J)∆v(x,y + ej + ei+1 − ei) − (m − yi − J)∆v(x,y + ej)

= yi

[∆v(x,y + ej+1 + ei+1 − ei) − ∆v(x,y + ej + ei+1 − ei)

]

+(m − yi − K − J)[∆v(x,y + ej+1) − ∆v(x,y + ej)

]

+K[∆v(x,y + ej+1 + ei+1 − ei) − ∆v(x,y + ej)

]

+J[∆v(x,y + ej+1) − ∆v(x,y + ej + ei+1 − ei)

]

≤ 0. (S-8)


If yi = 0, we have

∆Tiv(x,y + ej+1) − ∆Tiv(x,y + ej)

= K∆v(x,y + ej+1 + K(ei+1 − ei)) + (m − K)∆v(x,y + ej+1)

−J∆v(x,y + ej + J(ei+1 − ei)) − (m − J)∆v(x,y + ej)

= (m − K − J)[∆v(x,y + ej+1) − ∆v(x,y + ej)

]

+K[∆v(x,y + ej+1 + K(ei+1 − ei)) − ∆v(x,y + ej)

]

+J[∆v(x,y + ej+1) − ∆v(x,y + ej + J(ei+1 − ei))

]

≤ 0. (S-9)

Now we consider operator Tk. Let I = I{j=k−1}. The inequalities (S-10) and (S-11) below

follow from the fact that v satisfies conditions (C2) and (C3).

If yk ≥ 1, we have

∆Tkv(x,y + ej+1) − ∆Tkv(x,y + ej)

= (yk + I)∆v(x − 1,y + ej+1 − ek) + (m − yk − I)∆v(x,y + ej+1)

−yk∆v(x − 1,y + ej − ek) − (m − yk)∆v(x,y + ej)

= yk

[∆v(x − 1,y + ej+1 − ek) − ∆v(x − 1,y + ej − ek)

]

+(m − yk − I)[∆v(x,y + ej+1) − ∆v(x,y + ej)

]

+I[∆v(x − 1,y + ej+1 − ek) − ∆v(x,y + ej)

]

≤ 0. (S-10)

If yk = 0, we have

∆Tkv(x,y + ej+1) − ∆Tkv(x,y + ej)

= I∆v(x − I,y + ej+1 − Iek) + (m − I)∆v(x,y + ej+1) − m∆v(x,y + ej)

= (m − I)[∆v(x,y + ej+1) − ∆v(x,y + ej)

]

+I[∆v(x − I,y + ej+1 − Iek) − ∆v(x,y + ej)

]

≤ 0. (S-11)

(iii) For Tµ, we need to show that ∆Tµv(x,y + ej+1) ≤ ∆Tµv(x,y + ej) for j = 0, . . . , k − 1. In part (iii) of the argument for condition (C2), we showed that $x^*_{\mathbf{y}+e_{j+1}}$ is either $x^*_{\mathbf{y}+e_j}$ or $x^*_{\mathbf{y}+e_j} + 1$.

If $x^*_{\mathbf{y}+e_{j+1}} = x^*_{\mathbf{y}+e_j}$, we distinguish three subcases:

(a) $x < x^*_{\mathbf{y}+e_j} - 1$:

∆Tµv(x,y + ej+1) = ∆v(x + 1,y + ej+1) ≤ ∆v(x + 1,y + ej) = ∆Tµv(x,y + ej).

(b) $x = x^*_{\mathbf{y}+e_j} - 1$:

∆Tµv(x,y + ej+1) = ∆Tµv(x,y + ej) = 0.

(c) $x > x^*_{\mathbf{y}+e_j} - 1$:

∆Tµv(x,y + ej+1) = ∆v(x,y + ej+1) ≤ ∆v(x,y + ej) = ∆Tµv(x,y + ej).

If $x^*_{\mathbf{y}+e_{j+1}} = x^*_{\mathbf{y}+e_j} + 1$, we distinguish four subcases:

(a) $x < x^*_{\mathbf{y}+e_j} - 1$:

∆Tµv(x,y + ej+1) = ∆v(x + 1,y + ej+1) ≤ ∆v(x + 1,y + ej) = ∆Tµv(x,y + ej).

(b) $x = x^*_{\mathbf{y}+e_j} - 1$:

∆Tµv(x,y + ej+1) = ∆v(x + 1,y + ej+1) ≤ 0 = ∆Tµv(x,y + ej).

(c) $x = x^*_{\mathbf{y}+e_j}$:

∆Tµv(x,y + ej+1) = 0 ≤ ∆v(x,y + ej) = ∆Tµv(x,y + ej).

(d) $x > x^*_{\mathbf{y}+e_j}$:

∆Tµv(x,y + ej+1) = ∆v(x,y + ej+1) ≤ ∆v(x,y + ej) = ∆Tµv(x,y + ej).

Condition (C4):

It is easy to verify that if v ∈ U , then Tλv and Tiv; i = 1, . . . , k satisfy condition (C4). For Tµ,

when x < 0, we have:

Tµv(x + 1,y) = min{v(x + 2,y), v(x + 1,y)} ≤ v(x + 1,y) ≤ v(x,y).

Therefore,

Tµv(x + 1,y) ≤ min{v(x + 1,y), v(x,y)} = Tµv(x,y).

This completes the first main component of the proof.

To complete the proof of the proposition, let T 1 = T and define T n = T ◦ T n−1;n > 1. By

Propositions 3.1.5 and 3.1.6 of Bertsekas (2001), v∗m = limn→∞ T nv for any bounded function


v ∈ V . Take v = v0, where v0 is the function that is identically zero on Sm. It is simple to show

by induction that

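$$0 \le T^n v_0(x,\mathbf{y}) \le \frac{b+h}{\gamma}\sum_{j=0}^{n-1}(|x|+j)\,\alpha^j,$$

where $\alpha := \gamma^{-1}\big(\lambda+\mu+m\sum_{i=1}^{k}\nu_i\big) \in [0,1)$. Hence, we have $0 \le v^*_m(x,\mathbf{y}) = \lim_{n\to\infty} T^n v_0(x,\mathbf{y}) < \infty$.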

Therefore, v∗m is a real-valued function on Sm (i.e., v∗m ∈ V ).

Note that v0 ∈ U , and so it follows from the argument above that Tv0 ∈ U , and consequently

T nv0 ∈ U for each n. Moreover, it can readily be seen that if functions {vn} and u are such that

vn ∈ U for all n and vn → u ∈ V pointwise, then u ∈ U . We have established that T nv0 ∈ U and

that T nv0 → v∗m ∈ V , and therefore it follows that v∗m ∈ U .
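The induction bound on T^n v0 can be illustrated on a small instance. The sketch below is an assumption-laden illustration, not the authors' numerical procedure: it takes k = 1, reads the forms of Tλ, T′k and Tµ off the displays in the proof above, truncates the inventory axis to a finite grid with clamping at the edges, and uses arbitrary parameter values. The names `make_operator`, `bound` and `check_bound` are hypothetical.

```python
# Sketch: iterate T on a small truncated instance with k = 1 and check the
# geometric bound on T^n v0. All parameter values are illustrative assumptions.

def make_operator(lam, mu, nu, beta, m, h, b, x_min, x_max):
    gamma = beta + lam + mu + m * nu           # normalizing constant gamma
    alpha = (lam + mu + m * nu) / gamma        # factor alpha in [0, 1)
    xs = list(range(x_min, x_max + 1))
    cost = [h * max(x, 0) + b * max(-x, 0) for x in xs]   # c(x) = h x^+ + b x^-
    n = len(xs)

    def T(v):
        new = [[0.0] * (m + 1) for _ in range(n)]
        for ix in range(n):
            up = min(ix + 1, n - 1)            # x + 1, clamped (truncation assumption)
            down = max(ix - 1, 0)              # x - 1, clamped (truncation assumption)
            for y in range(m + 1):
                arrive = v[ix][y + 1] if y < m else v[ix][y]     # T_lambda v
                due = v[down][y - 1] if y >= 1 else v[ix][y]      # T'_k v
                update = y * due + (m - y) * v[ix][y]             # T_k v, as in (S-1)
                produce = min(v[up][y], v[ix][y])                 # T_mu v
                new[ix][y] = (cost[ix] + lam * arrive + nu * update
                              + mu * produce) / gamma
        return new

    return T, gamma, alpha, xs

def bound(x, n, h, b, gamma, alpha):
    """Right-hand side of the induction bound after n applications of T."""
    return (b + h) / gamma * sum((abs(x) + j) * alpha ** j for j in range(n))

def check_bound(n_iter=40, lam=0.8, mu=1.0, nu=0.5, beta=0.05, m=3, h=1.0, b=5.0):
    T, gamma, alpha, xs = make_operator(lam, mu, nu, beta, m, h, b, -15, 15)
    v = [[0.0] * (m + 1) for _ in xs]          # v0 is identically zero
    for n in range(1, n_iter + 1):
        v = T(v)
        for ix, x in enumerate(xs):
            limit = bound(x, n, h, b, gamma, alpha) + 1e-9
            if not all(0.0 <= val <= limit for val in v[ix]):
                return False
    return True

print(check_bound())
```

Clamping at the grid edges only moves transitions to states with smaller |x|, so the same induction argument applies to the truncated instance.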

S-2 The Average-Cost Optimality Criterion

Consider the IDD framework of Section 3.1 with m < ∞. A direct analog of Theorem 1 holds for

the average-cost criterion. To set the stage, for any policy π ∈ Π, its average cost is given by

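$$J^{\pi}(x,\mathbf{y}) := \limsup_{n\to\infty} \frac{E^{\pi}_{(x,\mathbf{y})}\sum_{l=0}^{n-1} c(X_l)[\tau_{l+1}-\tau_l]}{E^{\pi}_{(x,\mathbf{y})}[\tau_n]} = \limsup_{n\to\infty} \frac{E^{\pi}_{(x,\mathbf{y})}\sum_{l=0}^{n-1} c(X_l)}{n}.$$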

Let J(x,y) := infπ∈Π Jπ(x,y). A policy that gives average cost J(x,y) for all (x,y) ∈ Sm is said

to be optimal for the average-cost problem.

Theorem S-1 Suppose λ < µ. Then there exists a stationary state-dependent base-stock policy

πA = {πA(x,y)} that is optimal for the average-cost problem. Its base-stock levels {sAy} satisfy the

conditions in (a) and (b) in Theorem 1. In addition, the optimal average cost is finite and independent

of the initial state; i.e., there is a finite constant J such that J(x,y) = J for all (x,y) ∈ Sm.

Proof. The main idea of the proof is to obtain the desired results for the average-cost problem

by using Proposition 1 for the discounted-cost problem and letting β ↓ 0.

Given a discount rate, let v(x,y) := γv∗m(x,y). The optimality equation (6) can be rewritten

as:

$$v(x,\mathbf{y}) = \min_{a\in\{0,1\}}\Big\{c(x) + \frac{\Lambda}{\gamma}\sum_{(x',\mathbf{y}')\in S} p_{(x,\mathbf{y}),(x',\mathbf{y}')}(a)\,v(x',\mathbf{y}')\Big\}.$$

Let α = Λ/γ = Λ/(β + Λ), and define hα(x,y) := vα(x,y) − vα(0, e0), where we have appended a

subscript α to indicate dependence on α (and hence on β).

Parts (i) and (ii) of Theorem 7.2.3 in Sennott (1999) state that under conditions I, II, and III

given below there exists a sequence {αn := Λ/(βn + Λ)} and a real-valued function h(·) such that


αn ↑ 1 and limn→∞ hαn(x,y) = h(x,y). (Note that αn ↑ 1 means βn ↓ 0.) Moreover, the function

h(·) satisfies

$$J + h(x,\mathbf{y}) \ge \min_{a\in\{0,1\}}\Big\{c(x) + \sum_{(x',\mathbf{y}')\in S} p_{(x,\mathbf{y}),(x',\mathbf{y}')}(a)\,h(x',\mathbf{y}')\Big\}, \tag{S-12}$$

where J := limα↑1(1 − α)vα(x,y) = limα↑1(1 − α)Λαvα(x,y) is a finite constant. By Theorem

7.2.3(ii), any stationary policy that for each (x,y) selects an action that minimizes the right-hand

side of (S-12) is optimal and yields constant average cost J . Hence, properties of the average-cost

optimal policy are determined through function h(·) in much the same way as were properties of

the discounted-cost optimal policy determined through v∗m(·).

In the following we show that h(·) satisfies conditions (C1)–(C4). For (C1), we need to show

that ∆h(x,y) ≤ ∆h(x + 1,y) for (x,y) ∈ Sm. We have

∆hα(x,y) = ∆vα(x,y) ≤ ∆vα(x + 1,y) = ∆hα(x + 1,y),

so ∆h(x,y) = ∆ limn→∞ hαn(x,y) = limn→∞ ∆hαn(x,y) ≤ limn→∞ ∆hαn(x + 1,y) = ∆ limn→∞

hαn(x+1,y) = ∆h(x+1,y). Similar arguments show that h(·) also satisfies conditions (C2)–(C4).

Hence, as in the proof of Theorem 1, it follows that the policy

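$$\pi^A(x,\mathbf{y}) := \begin{cases} 0 & \text{if } x \ge s^A_{\mathbf{y}},\\ 1 & \text{if } x < s^A_{\mathbf{y}}, \end{cases}$$

with $s^A_{\mathbf{y}} := \min\{x : h(x+1,\mathbf{y}) - h(x,\mathbf{y}) \ge 0\}$ is optimal for the average-cost problem, and that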

properties (a) and (b) described in Theorem 1 hold for {sAy} and πA. For additional background

on the above approach, see Section 8.11 of Puterman (1994).
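The way a base-stock level is read off from a relative value function can be sketched in a few lines. This is an illustration only: the function `h_test` below is a hypothetical stand-in for h(·), with y reduced to a single integer, and the search range is an assumption.

```python
# Sketch: recover a state-dependent base-stock level from a given function h.
# h_test is a hypothetical convex example, not a solution of (S-12).

def base_stock_level(h, y, lo, hi):
    """Smallest x in [lo, hi] with h(x + 1, y) - h(x, y) >= 0."""
    for x in range(lo, hi + 1):
        if h(x + 1, y) - h(x, y) >= 0:
            return x
    raise ValueError("no base-stock level found in the search range")

def policy(x, s):
    """Action 1 (produce) when inventory is below the base-stock level, else 0 (idle)."""
    return 0 if x >= s else 1

def h_test(x, y):
    """Hypothetical function that is convex in x, with a minimizer shifting with y."""
    return (x - 2.5 - y) ** 2

levels = [base_stock_level(h_test, y, -10, 20) for y in range(4)]
print(levels)
```

For this stand-in the level increases by one with each unit of y; that pattern is a property of `h_test`, not a claim about the model.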

It remains to show that our problem satisfies conditions that allow application of Theorem 7.2.3.

By Theorem 7.5.6 and Corollary 7.5.9 of Sennott (1999), the following conditions are sufficient to

apply Theorem 7.2.3.

(I) There exists a stationary policy and a state z ∈ Sm such that the induced Markov chain

has a positive recurrent class R ⊆ Sm and the expected first passage time and expected first

passage cost (for definitions see Lemma S-1 below) from any state (x,y) ∈ Sm to z are finite.

(II) For each u > 0, the set {(x,y) ∈ Sm : c(x) ≤ u} is finite.

(III) For each state (x,y) ∈ Sm\R, there exists a policy that induces a Markov chain for which the

expected first passage time and expected first passage cost from state z to (x,y) are finite.


Let z = (0, e0). Define := {(x,y) : (x,y) ∈ Sm} to be the stationary policy that produces if

the net inventory is less than zero and idles if the net inventory is at least zero; i.e. (x,y) = I{x<0}.

In Lemma S-1 below we show that under policy , the class R := {(x,y) ∈ Sm : x ≤ 0} is positive

recurrent and the expected first passage time and cost from any state (x,y) ∈ Sm to (0, e0) are

finite. This verifies that condition I holds. Condition II is clearly satisfied for our case, since

c(x) = hx+ + bx− is convex in x with a minimum at c(0) = 0, and c(x) ↑ ∞ as x ↑ ∞ or as

x ↓ −∞. To prove that condition III holds one may use an argument similar to that in the proof

of the condition I. The details are omitted for brevity.

Lemma S-1 Under policy , the class R := {(x,y) ∈ Sm : x ≤ 0} is positive recurrent. In

addition, E(x,y)T < ∞ and E(x,y)C < ∞ for all (x,y) ∈ Sm, where T := min{n > 0 : (Xn,Yn) = (0, e0)} and $C := \sum_{t=0}^{T-1} c(X_t)$; that is, the expected first passage time and expected first passage

cost of going from any state (x,y) ∈ Sm to state (0, e0) are finite.

Proof. Define set G := {(x,y) ∈ Sm : x = 0}. It is straightforward to show that under policy ,

starting from a state (x,y) ∈ Sm \ R, the expected first passage time and cost to enter the set G

are both finite. Observe also that under , a Markov chain that starts in R, remains in R; that is

if (x,y) ∈ R, then p(x,y),(x′,y′)((x,y)) = 0 for (x′,y′) /∈ R. Since G ⊂ R and |G| < ∞, to prove

the lemma we need only to show that R is positive recurrent and the expected first passage cost of

going from any state (x,y) ∈ R to state (0, e0) is finite (because by Proposition C.1.4 of Sennott,

positive recurrence of R implies E(x,y)T < ∞ for all (x,y) ∈ R).

Let VR be the set of real-valued functions on R. To show R is positive recurrent it suffices by

Foster’s Criterion (see, e.g., Bremaud, 1999, page 167) to identify a nonnegative function f ∈ VR,

a finite set H ⊂ R, and ε > 0 such that

Pf(x,y) ≤ f(x,y) − ε for all (x,y) ∈ R \ H, (S-13)

where operator P : VR → VR is defined by Pf(x,y) :=∑

(x′,y′)∈R p(x,y),(x′,y′)((x,y))f(x′,y′).

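Define $f(x,\mathbf{y}) := -x + \sum_{i=1}^{k} y_i$, H := G, and ε := (µ−λ)/Λ. Note that the condition λ < µ ensures that ε > 0. Let $y := \sum_{i=1}^{k} y_i$, $L = I_{\{y<m\}}$, and $I_i = I_{\{y_i\ge 1\}}$ for i = 1, . . . , k. For (x,y) ∈ R \ H, we have

$$\begin{aligned}
Pf(x,\mathbf{y}) &= \frac{\mu}{\Lambda} f(x+1,\mathbf{y}) + \frac{\lambda}{\Lambda} f(x,\mathbf{y}+Le_1) + \sum_{i=1}^{k-1}\frac{\nu_i y_i}{\Lambda} f(x,\mathbf{y}+I_i(e_{i+1}-e_i))\\
&\quad + \frac{\nu_k y_k}{\Lambda} f(x-I_k,\mathbf{y}-I_k e_k) + \sum_{i=1}^{k}\frac{\nu_i(m-y_i)}{\Lambda} f(x,\mathbf{y})\\
&\le \frac{\mu}{\Lambda}(-x-1+y) + \frac{\lambda}{\Lambda}(-x+y+1) + \sum_{i=1}^{k-1}\Big[\frac{\nu_i y_i}{\Lambda}(-x+y)\Big]\\
&\quad + \frac{\nu_k y_k}{\Lambda}(-x+y) + \sum_{i=1}^{k}\Big[\frac{(m-y_i)\nu_i}{\Lambda}(-x+y)\Big]\\
&= -x + y - \frac{\mu}{\Lambda} + \frac{\lambda}{\Lambda}\\
&= f(x,\mathbf{y}) - \varepsilon.
\end{aligned}$$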

Therefore (S-13) holds, and R is positive recurrent. As mentioned above, this also yields E(x,y)T <

∞ for any state (x,y) ∈ R.
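The drift inequality (S-13) can be checked by direct enumeration on a small instance. The sketch below is illustrative only: it assumes the transition probabilities that appear in the display for Pf above (production since x < 0, arrivals admitted only when fewer than m orders are present, and fictitious self-transitions making the total rate Λ), and the parameter values and the names `drift` and `max_drift` are hypothetical.

```python
# Sketch: enumerate states with x < 0 and verify Pf - f <= -epsilon.
from itertools import product

def f(x, y):
    """Lyapunov function f(x, y) = -x + sum_i y_i."""
    return -x + sum(y)

def drift(x, y, lam, mu, nus, m):
    """Pf(x, y) - f(x, y) for a state with x < 0 (the policy produces there)."""
    k = len(y)
    Lam = lam + mu + m * sum(nus)              # total rate Lambda
    pf = mu / Lam * f(x + 1, y)                # production completion
    L = 1 if sum(y) < m else 0                 # arrival admitted only below m orders
    pf += lam / Lam * f(x, (y[0] + L,) + tuple(y[1:]))
    for i in range(k - 1):                     # an order moves one update stage forward
        if y[i] >= 1:
            moved = list(y)
            moved[i] -= 1
            moved[i + 1] += 1
            pf += nus[i] * y[i] / Lam * f(x, moved)
    if y[k - 1] >= 1:                          # a final-stage order becomes due
        due = list(y)
        due[k - 1] -= 1
        pf += nus[k - 1] * y[k - 1] / Lam * f(x - 1, due)
    stay = sum(nu * (m - yi) for nu, yi in zip(nus, y)) / Lam   # self-transitions
    pf += stay * f(x, y)
    return pf - f(x, y)

def max_drift(lam, mu, nus, m, x_range):
    """Largest drift over all y with at most m orders and all x in x_range."""
    k = len(nus)
    states = [y for y in product(range(m + 1), repeat=k) if sum(y) <= m]
    return max(drift(x, y, lam, mu, nus, m) for x in x_range for y in states)

lam, mu, nus, m = 0.6, 1.0, (0.4, 0.7), 3
eps = (mu - lam) / (lam + mu + m * sum(nus))
worst = max_drift(lam, mu, nus, m, range(-5, 0))
print(worst <= -eps + 1e-9)
```

In this instance the drift equals −ε whenever an arrival can be admitted and is more negative when the system already holds m orders.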

Now we show that the expected first passage cost of going from any state (x,y) ∈ R to state

(0, e0) is finite; that is E(x,y)C < ∞. The first step is to show that there exists a nonnegative

function g ∈ VR and a finite set H ⊂ R with (0, e0) ∈ H such that

Pg(x,y) ≤ g(x,y) − c(x) for all (x,y) ∈ R \ H.

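Define $g(x,\mathbf{y}) := \theta\kappa^{-x+y}$ and H := G. For κ > 1, θ > 0, a calculation as above shows that for (x,y) ∈ R \ H, we have

$$g(x,\mathbf{y}) - Pg(x,\mathbf{y}) \ge \frac{\theta}{\Lambda}\,\kappa^{-x+y-1}\big[(\lambda+\mu)\kappa - \mu - \lambda\kappa^2\big]. \tag{S-14}$$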

For κ ∈ (1, µ/λ) the term in square brackets in (S-14) is strictly positive. Hence, for such κ and

with θ large enough, the right-hand side of (S-14) exceeds c(x) = hx+ + bx− for all (x,y) ∈ R \H.

The assumption λ < µ ensures that (1, µ/λ) is non-empty. By Corollary C.2.4 of Sennott, the above

proves that E(0,e0)C < ∞. Finally, by Proposition C.2.2(iv) of Sennott it follows that E(x,y)C < ∞

for any state (x,y) ∈ R. This completes the proof.
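The sign of the bracketed term in (S-14) can be read off from a factorization. The following short derivation is added for illustration and uses only symbols from the proof above:

$$(\lambda+\mu)\kappa - \mu - \lambda\kappa^2 = \kappa\mu - \lambda\kappa^2 - \mu + \lambda\kappa = (\kappa-1)(\mu-\lambda\kappa).$$

For κ ∈ (1, µ/λ) both factors are strictly positive, so the bracket is strictly positive on the interval used in the proof.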

S-3 Extension of Theorem 1 to a Random Number of Updates

To extend the model to a setting with a random number of updates, recall that we need to replace

T ′i in (5) by T ′

i given by

T ′iv(x,y) := (1 − qi)T

′iv(x,y) + qiv(x − I{yi≥1},y − eiI{yi≥1}) i = 1, . . . , k


as defined in (8).

Define operator T by

T v(x,y) :=

k∑

i=1

νi

[yiT

′iv(x,y) + (m − yi)v(x,y)

]

=k∑

i=1

νi

[(1 − qi)Tiv(x,y) + qiT iv(x,y)

](S-15)

where operators {Ti} are defined in (S-1) and operators {T i} are defined by

T iv(x,y) := yiv(x − I{yi≥1},y − eiI{yi≥1}) + (m − yi)v(x,y) i = 1, . . . , k .

The optimality equation for a random number of updates can now be expressed as v = T v where

T is given by

T v(x,y) := γ−1[c(x) + λTλv(x,y) + T v(x,y) + µTµv(x,y)

]. (S-16)

To prove that Theorem 1 holds in the setting with a random number of updates, we need only

prove that T preserves conditions (C1)–(C4). Other steps in the proof carry over without change.

We have seen already (in the proof of Proposition 1) that Tλ, Tµ, and Ti, i = 1, . . . , k preserve

conditions (C1)–(C4). To prove that T preserves conditions (C1)–(C4), it is sufficient [in view of

the preceding fact and (S-15) and (S-16)] to show that T preserves conditions (C1)–(C4) where

Tv(x,y) :=∑k

i=1 νiqiT iv(x,y). Below, we prove this.

It is easy to check that T i preserves conditions (C1) and (C4) for each i = 1, . . . , k. It then

follows that T preserves conditions (C1) and (C4). We next turn to conditions (C2) and (C3).

Condition (C2):

Suppose that v ∈ U . We will prove for each i = 1, . . . , k − 1 that T iv satisfies condition (C2), from

which it follows that Tv satisfies condition (C2). [Note that T k = Tk, where Tk is defined in (S-1).

We already verified that Tk preserves condition (C2) in the proof of Proposition 1.]

Fix i ∈ {1, . . . , k − 1}, j ∈ {0, . . . , k − 1}, and l ∈ {j + 1, . . . , k}. Let J = I{i=j} and L = I{i=l}.

We will prove that ∆T iv(x,y + ej) ≤ ∆T iv(x + 1,y + el).


Suppose first that yi ≥ 1. Then we have

∆T iv(x,y + ej) − ∆T iv(x + 1,y + el)

= (yi + J)∆v(x − 1,y + ej − ei) + (m − yi − J)∆v(x,y + ej)

−(yi + L)∆v(x,y + el − ei) − (m − yi − L)∆v(x + 1,y + el)

= yi

[∆v(x − 1,y + ej − ei) − ∆v(x,y + el − ei)

]

+(m − yi − L)[∆v(x,y + ej) − ∆v(x + 1,y + el)

]

+L[∆v(x,y + ej) − ∆v(x,y + el − ei)

]

+J[∆v(x − 1,y + ej − ei) − ∆v(x,y + ej)

]

≤ 0,

where the inequality follows by considering each of the three possibilities (J,L) = (0, 0), (0, 1), (1, 0)

separately and using the fact that v is assumed to satisfy conditions (C2) and (C3).

Suppose next that yi = 0. Then

∆T iv(x,y + ej) − ∆T iv(x + 1,y + el)

= J∆v(x − J,y + ej − Jej) + (m − J)∆v(x,y + ej)

−L∆v(x + 1 − L,y + el − Lel) − (m − L)∆v(x + 1,y + el)

= (m − L)[∆v(x,y + ej) − ∆v(x + 1,y + el)

]

+L[∆v(x,y + ej) − ∆v(x + 1 − L,y + el − Lel)

]

+J[∆v(x − J,y + ej − Jej) − ∆v(x,y + ej)

]

≤ 0.

where again the inequality follows by considering each of the three possibilities for (J,L) separately

and using the fact that v satisfies conditions (C2) and (C3).

Condition (C3)

Suppose again that v ∈ U . It turns out that it may be that T iv does not satisfy condition (C3).

So, we will show directly that Tv satisfies condition (C3), rather than trying to work with the

individual {T i}. This will complete the proof that if v ∈ U , then Tv ∈ U .

Fix j ∈ {0, . . . , k − 1}. We want to show that ∆Tv(x,y + ej+1) ≤ ∆Tv(x,y + ej). For each

i = 1, . . . , k let Ji = I{i=j}, and Ki = I{i=j+1}.


For i = 1, . . . , k, if yi ≥ 1 then

∆T iv(x,y + ej+1) − ∆T iv(x,y + ej)

= (yi + Ki)∆v(x − 1,y + ej+1 − ei) + (m − yi − Ki)∆v(x,y + ej+1)

− (yi + Ji)∆v(x − 1,y + ej − ei) − (m − yi − Ji)∆v(x,y + ej)

= yi

[∆v(x − 1,y + ej+1 − ei) − ∆v(x − 1,y + ej − ei)

]

+ (m − yi − Ki)[∆v(x,y + ej+1) − ∆v(x,y + ej)

]

+ Ki

[∆v(x − 1,y + ej+1 − ei) − ∆v(x,y + ej)

]

+ Ji

[∆v(x,y + ej) − ∆v(x − 1,y + ej − ei)

]

and if yi = 0 then

∆T iv(x,y + ej+1) − ∆T iv(x,y + ej)

= (yi + Ki)∆v(x − Ki,y + ej+1 − Kiei) + (m − yi − Ki)∆v(x,y + ej+1)

− (yi + Ji)∆v(x − Ji,y + ej − Jiei) − (m − yi − Ji)∆v(x,y + ej)

= yi

[∆v(x − 1,y + ej+1 − Kiei) − ∆v(x − 1,y + ej − Jiei)

]

+ (m − yi − Ki)[∆v(x,y + ej+1) − ∆v(x,y + ej)

]

+ Ki

[∆v(x − 1,y + ej+1 − Kiei) − ∆v(x,y + ej)

]

+ Ji

[∆v(x,y + ej) − ∆v(x − 1,y + ej − Jiei)

].

Therefore,

∆Tv(x,y + ej+1) − ∆Tv(x,y + ej) =

k∑

i=1

νiqi

[∆T iv(x,y + ej+1) − ∆T iv(x,y + ej)

]

=

k∑

i=1

Ai +

k∑

i=1

Bi +

k∑

i=1

Ci +

k∑

i=1

Di (S-17)


where

Ai =

νiqiyi

[∆v(x − 1,y + ej+1 − ei) − ∆v(x − 1,y + ej − ei)

]if yi ≥ 1

0 if yi = 0

Bi = νiqi(m − yi − Ki)[∆v(x,y + ej+1) − ∆v(x,y + ej)

]

Ci =

νiqiKi

[∆v(x − 1,y + ej+1 − ei) − ∆v(x,y + ej)

]if yi ≥ 1

νiqiKi

[∆v(x − Ki,y + ej+1 − Kiei) − ∆v(x,y + ej)

]if yi = 0

Di =

νiqiJi

[∆v(x,y + ej) − ∆v(x − 1,y + ej − ei)

]if yi ≥ 1

νiqiJi

[∆v(x,y + ej) − ∆v(x − Ji,y + ej − Jiei)

]if yi = 0 .

We have that Ai ≤ 0 and Bi ≤ 0 for all i = 1, . . . , k because v satisfies condition (C3). If

j ≠ 0, then the only non-zero terms in the third and fourth summations in (S-17) are Cj+1 and

Dj . Therefore,

k∑

i=1

Ci +k∑

i=1

Di = νj+1qj+1

[∆v(x − 1,y) − ∆v(x,y + ej)

]+ νjqj

[∆v(x,y + ej) − ∆v(x − 1,y)

]

= (νj+1qj+1 − νjqj)[∆v(x − 1,y) − ∆v(x,y + ej)

]

Recall that v satisfies condition (C2), and therefore ∆v(x − 1,y) − ∆v(x,y + ej) ≤ 0. When νj+1qj+1 ≥ νjqj, we have that $\sum_{i=1}^{k} C_i + \sum_{i=1}^{k} D_i \le 0$. Note that this is the only place where we have used the condition that νiqi is non-decreasing in i. We have now shown that ∆Tv(x,y + ej+1) − ∆Tv(x,y + ej) = $\sum_{i=1}^{k} A_i + \sum_{i=1}^{k} B_i + \sum_{i=1}^{k} C_i + \sum_{i=1}^{k} D_i \le 0$, as desired.

If j = 0, then $\sum_{i=1}^{k} C_i + \sum_{i=1}^{k} D_i = \nu_1 q_1[\Delta v(x-1,\mathbf{y}) - \Delta v(x,\mathbf{y})] \le 0$, where the inequality

holds because v satisfies condition (C1). Again, we have ∆Tv(x,y + ej+1) − ∆Tv(x,y + ej) ≤ 0,

which completes the argument.

S-4 Extension of Theorem 1 to Systems with Cancelations

In this section we re-define some of the notation used in the preceding section.

Let {T i} be defined by

T iv(x,y) := yiv(x,y − eiI{yi≥1}) + (m − yi)v(x,y) i = 1, . . . , k

and let T be defined by

Tv(x,y) :=

k∑

i=1

νipiT iv(x,y) .


By an argument identical to that at the beginning of Section S-3, to extend Theorem 1 to the

setting with cancelations, we need only prove that T preserves conditions (C1)–(C4). That is, we

just need to show that if v ∈ U , then Tv ∈ U .

Suppose that v satisfies conditions (C1)–(C4); that is, suppose v ∈ U . It can be easily seen that

T iv satisfies conditions (C1) and (C4) from which it follows that Tv satisfies conditions (C1) and

(C4). We next check that Tv satisfies conditions (C2) and (C3).

Condition (C2): Suppose i ∈ {1, . . . , k}, j ∈ {0, . . . , k − 1}, and l ∈ {j + 1, . . . , k}. Let J = I{i=j}

and L = I{i=l}. Note that if i = k, then J = 0. If yi ≥ 1 we have

∆T iv(x,y + ej) − ∆T iv(x + 1,y + el)

= (yi + J)∆v(x,y + ej − ei) + (m − yi − J)∆v(x,y + ej)

−(yi + L)∆v(x + 1,y + el − ei) − (m − yi − L)∆v(x + 1,y + el)

= yi

[∆v(x,y + ej − ei) − ∆v(x + 1,y + el − ei)

]

+(m − yi − J)[∆v(x,y + ej) − ∆v(x + 1,y + el)

]

+L[∆v(x + 1,y + el) − ∆v(x + 1,y + el − ei)

]

+J[∆v(x,y + ej − ei) − ∆v(x + 1,y + el)

]

≤ 0.

If yi = 0, we have

∆T iv(x,y + ej) − ∆T iv(x + 1,y + el)

= J∆v(x,y + ej − Jei) + (m − J)∆v(x,y + ej)

−L∆v(x + 1,y + el − Lei) − (m − L)∆v(x + 1,y + el)

= (m − J)[∆v(x,y + ej) − ∆v(x + 1,y + el)

]

+L[∆v(x + 1,y + el) − ∆v(x + 1,y + el − Lei)

]

+J[∆v(x,y + ej − Jei) − ∆v(x + 1,y + el)

]

≤ 0.

We have proved that each T iv satisfies condition (C2), and therefore Tv satisfies condition (C2).

Condition (C3): Suppose j ∈ {0, . . . , k − 1}. For i = 1, . . . , k, let Ji = I{i=j} and Ki = I{i=j+1}.


For i = 1, . . . , k, if yi ≥ 1 we have

∆T iv(x,y + ej+1) − ∆T iv(x,y + ej)

= (yi + Ki)∆v(x,y + ej+1 − ei) + (m − yi − Ki)∆v(x,y + ej+1)

−(yi + Ji)∆v(x,y + ej − ei) − (m − yi − Ji)∆v(x,y + ej)

= yi

[∆v(x,y + ej+1 − ei) − ∆v(x,y + ej − ei)

]

+(m − yi − Ki)[∆v(x,y + ej+1) − ∆v(x,y + ej)

]

+Ki

[∆v(x,y + ej+1 − ei) − ∆v(x,y + ej)

]

+Ji

[∆v(x,y + ej) − ∆v(x,y + ej − ei)

]

and if yi = 0 we have

∆T iv(x,y + ej+1) − ∆T iv(x,y + ej)

= (yi + Ki)∆v(x,y + ej+1 − Kiei) + (m − yi − Ki)∆v(x,y + ej+1)

−(yi + Ji)∆v(x,y + ej − Jiei) − (m − yi − Ji)∆v(x,y + ej)

= yi

[∆v(x,y + ej+1 − Kiei) − ∆v(x,y + ej − Jiei)

]

+(m − yi − Ki)[∆v(x,y + ej+1) − ∆v(x,y + ej)

]

+Ki

[∆v(x,y + ej+1 − Kiei) − ∆v(x,y + ej)

]

+Ji

[∆v(x,y + ej) − ∆v(x,y + ej − Jiei)

].

Therefore,

∆Tv(x,y + ej+1) − ∆Tv(x,y + ej) =

k∑

i=1

νipi

[∆T iv(x,y + ej+1) − ∆T iv(x,y + ej)

]

=k∑

i=1

Ai +k∑

i=1

Bi +k∑

i=1

Ci +k∑

i=1

Di (S-18)

where

Ai = νipiyi

[∆v(x,y + ej+1 − ei) − ∆v(x,y + ej − ei)

]

Bi = νipi(m − yi − Ki)[∆v(x,y + ej+1) − ∆v(x,y + ej)

]

Ci = νipiKi

[∆v(x,y + ej+1 − ei) − ∆v(x,y + ej)

]

Di = νipiJi

[∆v(x,y + ej) − ∆v(x,y + ej − ei)

]

Note that the preceding expressions combine the yi ≥ 1 and yi = 0 cases.


We have that Ai ≤ 0 and Bi ≤ 0 for all i = 1, . . . , k because v satisfies condition (C3). If

j ≠ 0, then the only non-zero terms in the third and fourth summations in (S-18) are Cj+1 and

Dj . Therefore,

k∑

i=1

Ci +k∑

i=1

Di = νj+1pj+1

[∆v(x,y) − ∆v(x,y + ej)

]+ νjpj

[∆v(x,y + ej) − ∆v(x,y)

]

= (νj+1pj+1 − νjpj)[∆v(x,y) − ∆v(x,y + ej)

].

We have assumed that v satisfies condition (C3), from which we have ∆v(x,y)−∆v(x,y+ej) ≥ 0. Hence, if νj+1pj+1 ≤ νjpj , then $\sum_{i=1}^{k} C_i + \sum_{i=1}^{k} D_i \le 0$. If j = 0, then $\sum_{i=1}^{k} C_i + \sum_{i=1}^{k} D_i = 0$.

Thus, we have proved that Tv satisfies condition (C3), which completes the argument.

S-5 Extension of Theorem 1 to Systems with Lost Sales

To model lost sales, in (5) we replace operator T ′k by T ′

k defined in (9), and we re-define the state

space to be Z+×Zk+(m) and the cost function to be c(x) = hx. Similar to the proof of Proposition 1,

the operator T in the optimality equation can be written as

Tv(x,y) = γ−1[c(x) + λTλv(x,y) +

k−1∑

i=1

νiTiv(x,y) + νkTkv(x,y) + µTµv(x,y)]

(S-19)

where {Ti : i = 1, . . . , k − 1} are defined in (S-1) and Tk is defined by

Tkv(x,y) := ykT′kv(x,y) + (m − yk)v(x,y)

= yk[v([x − I{yk≥1}]+, y − ekI{yk≥1}) + cLS I{yk≥1 and x=0}] + (m − yk)v(x,y) .

To prove that Theorem 1 [except property (b), which deals with backorders] holds in this

setting, we shall just need to show that the value function satisfies conditions (C1)–(C3). Note

that condition (C4) is not relevant in the setting with lost sales. To prove the desired result, we

introduce the following additional condition:

(C4)′ ∆v(x,y) ≥ −cLS for all x ≥ 0, y ∈ Zk+(m).

We will next show that if function v satisfies conditions (C1)–(C3) and (C4)′, then Tv satisfies

conditions (C1)–(C3) and (C4)′. That is, we will show that operator T in (S-19) preserves conditions

(C1)–(C3) and (C4)′. As in the proof of Proposition 1, it then follows that the value function

satisfies the conditions (C1)–(C3). It also follows that the value function satisfies condition (C4)′,

but this fact is not needed for the theorem.

Suppose that v satisfies conditions (C1)–(C3) and (C4)′.


From the proof of Proposition 1, we have that Tλv, {Tiv : i = 1, . . . , k − 1}, and Tµv satisfy

conditions (C1)–(C3). We next show that Tkv satisfies conditions (C1)–(C3). Note that Tkv(x,y) =

Tkv(x,y) when x > 0 or yk = 0, where Tk is defined in (S-1). Hence, the inequalities in conditions

(C1)–(C3) hold when x > 0 or yk = 0; this was shown in the proof of Proposition 1. Therefore, to

show that Tkv satisfies conditions (C1)–(C3), we need only show that the inequalities hold when

x = 0 and yk ≥ 1. That is, we just need to show that

∆Tkv(0,y) ≤ ∆Tkv(1,y) for all y ∈ Zk+(m) with yk ≥ 1 (S-20)

∆Tkv(0,y + ej) ≤ ∆Tkv(1,y + el) for all y ∈ Zk+(m − 1) with yk ≥ 1, j = 0, . . . , k − 1, and l = j + 1, . . . , k

(S-21)

∆Tkv(0,y + ej+1) ≤ ∆Tkv(0,y + ej) for all y ∈ Zk+(m − 1) with yk ≥ 1, and j = 0, . . . , k − 1.

(S-22)

To check that (S-20) holds, suppose that y is such that yk ≥ 1. We have

∆Tkv(0,y) − ∆Tkv(1,y) = (m − yk)∆v(0,y) − ykcLS − yk∆v(0,y − ek) − (m − yk)∆v(1,y)

= (m − yk)[∆v(0,y) − ∆v(1,y)

]− yk

[cLS + ∆v(0,y − ek)

]

≤ 0

where the inequality holds because v satisfies conditions (C1) and (C4)′. Hence, (S-20) holds. Next

we check that (S-21) holds. Fix j ∈ {0, . . . , k − 1} and l ∈ {j + 1, . . . , k} and let I = I{l=k}. Then

∆Tkv(0,y + ej) − ∆Tkv(1,y + el) = (m − yk)∆v(0,y + ej) − ykcLS

− (yk + I)∆v(0,y + el − ek) − (m − yk − I)∆v(1,y + el)

= (m − yk − I)[∆v(0,y + ej) − ∆v(1,y + el)

]

− yk

[cLS + ∆v(0,y + el − ek)

]

+ I[∆v(0,y + ej) − ∆v(0,y + el − ek)

]

≤ 0,

where the inequality holds because v satisfies conditions (C2), (C3), and (C4)′. Finally, we verify that (S-22) holds, where I now denotes I{j=k−1}:

∆Tkv(0,y + ej+1) − ∆Tkv(0,y + ej) = (m − yk − I)∆v(0,y + ej+1) − (yk + I)cLS

− (m − yk)∆v(0,y + ej) + ykcLS

= (m − yk − I)[∆v(0,y + ej+1) − ∆v(0,y + ej)

]

− I[cLS + ∆v(0,y + ej)

]

≤ 0,

because v satisfies conditions (C3) and (C4)′. We have now shown that Tkv satisfies conditions (C1)–

(C3) and therefore Tv satisfies (C1)–(C3).

To complete the proof of the extension of Theorem 1, we next show that Tv satisfies condition (C4)′. This follows easily from (S-19) and the fact that $\gamma = \beta + \lambda + \mu + m\sum_{i=1}^{k}\nu_i$.

S-6 Additional Material for Section 3.2

In this section we describe results of GH (Guo and Hernandez-Lerma 2003), and show how they

may be applied in our setting. GH consider continuous time MDPs with countable state space S

and discount rate β. The set of allowable actions in state i ∈ S is A(i). The conditional transition

rate from state i ∈ S to state j ≠ i under action a ∈ A(i) is given by q(j|i, a) where q(j|i, a) ≥ 0 for i ≠ j. Let $q(i) := \sup_{a\in A(i)} q_i(a) < \infty$ where $q_i(a) := -q(i|i,a) := \sum_{j\ne i} q(j|i,a)$. The reward

rate is r(i, a) in state i ∈ S under action a ∈ A(i).

To put our model into their framework, take S = S, β = β, A(i) = {0, 1}, r((x,y), a) = −c(x)

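and for (x′,y′) ≠ (x,y) take

$$q\big((x',\mathbf{y}') \mid (x,\mathbf{y}), a\big) =
\begin{cases}
\mu I_{\{a=1\}} & \text{if } (x',\mathbf{y}') = (x+1,\mathbf{y}) \\
\lambda & \text{if } (x',\mathbf{y}') = (x,\mathbf{y}+\mathbf{e}_1) \\
\nu_i y_i & \text{if } (x',\mathbf{y}') = (x,\mathbf{y}+\mathbf{e}_{i+1}-\mathbf{e}_i),\ i = 1,\dots,k-1 \\
\nu_k y_k & \text{if } (x',\mathbf{y}') = (x-1,\mathbf{y}-\mathbf{e}_k) \\
0 & \text{otherwise.}
\end{cases}$$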
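The transition rates above can be tabulated mechanically. The following is a minimal sketch, not code from the paper: the function names `transition_rates` and `exit_rate` and the numerical parameter values are illustrative assumptions. It lists the positive-rate transitions out of a state (x, y) under action a and sums them to obtain the total rate of leaving that state.

```python
from typing import Dict, Sequence, Tuple

State = Tuple[int, Tuple[int, ...]]


def transition_rates(x: int, y: Sequence[int], a: int,
                     lam: float, mu: float, nu: Sequence[float]) -> Dict[State, float]:
    """Rates q((x', y') | (x, y), a) for the IDD model, for (x', y') != (x, y).

    y[i-1] is the number of orders in update stage i (i = 1..k); nu[i-1] is the
    per-order update rate of stage i. Names are hypothetical, for illustration.
    """
    k = len(nu)
    y = tuple(y)

    def moved(src, dst):
        # Copy of y with one order removed from stage src and/or added to stage dst.
        z = list(y)
        if src is not None:
            z[src] -= 1
        if dst is not None:
            z[dst] += 1
        return tuple(z)

    rates: Dict[State, float] = {}
    if a == 1:                                   # production completes one unit
        rates[(x + 1, y)] = mu
    rates[(x, moved(None, 0))] = lam             # a new order is announced
    for i in range(k - 1):                       # an order moves from stage i+1 to i+2
        if y[i] > 0:
            rates[(x, moved(i, i + 1))] = nu[i] * y[i]
    if y[k - 1] > 0:                             # a stage-k order becomes due
        rates[(x - 1, moved(k - 1, None))] = nu[k - 1] * y[k - 1]
    return rates


def exit_rate(x, y, a, lam, mu, nu) -> float:
    """Total rate out of (x, y) under action a: lam + sum_i nu_i*y_i + mu*1{a=1}."""
    return sum(transition_rates(x, y, a, lam, mu, nu).values())
```

For example, with λ = 0.5, µ = 1, ν = (0.2, 0.3) and y = (2, 1), the total exit rate under a = 1 is 0.5 + 0.2·2 + 0.3·1 + 1 = 2.2.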

With the above, we have $q_{(x,\mathbf{y})}(a) = -q((x,\mathbf{y}) \mid (x,\mathbf{y}), a) = \lambda + \sum_{i=1}^{k} \nu_i y_i + \mu I_{\{a=1\}}$, and $q(x,\mathbf{y}) = \lambda + \sum_{i=1}^{k} \nu_i y_i + \mu$. GH introduce the following assumptions.

Assumption A. There exist a sequence {Sm : m ≥ 1} of subsets of S, a non-negative function R

on S and constants b ≥ 0 and c0 ∈ (−∞,∞) such that


(1) Sm ↑ S and supi∈Smq(i) < ∞ for each m ≥ 1;

(2) limm→∞[infj /∈SmR(j)] = +∞; and

(3) $\sum_{j \in S} q(j \mid i,a) R(j) \le c_0 R(i) + b$ for all i ∈ S and a ∈ A(i).

Assumption B. With c0 and R as in Assumption A:

(1) either c0 ≤ 0 or c0 − β < 0 when c0 > 0; and

(2) there exist non-negative constants M1 and M2 such that |r(i, a)| ≤ M1 + M2R(i) for all i ∈

S and a ∈ A(i) .

Assumption C.

(1) For each i ∈ S, A(i) is compact;

(2) the functions r(i, a), q(i|j, a) and $\sum_{j \in S} q(j \mid i,a) R(j)$ are continuous in a ∈ A(i) for each fixed i, j ∈ S; and

(3) there exist a non-negative function w′ on S and constants c′ > 0, b′ > 0, and M′ > 0 such that $q(i)R(i) \le M' w'(i)$ and $\sum_{j \in S} q(j \mid i,a) w'(j) \le c' w'(i) + b'$ for all (i, a).

Parts (b) and (c) of Theorem 3.2 of GH state that if Assumptions A and B hold, then the value

function v∗ is the unique solution in BR(S) := {v : there exist constants c1, c2 ≥ 0 so that |v(i)| ≤

c1 + c2R(i) for all i ∈ S} of the optimality equation,

$$v(i) = \frac{1}{\beta + q(i)} \sup_{a \in A(i)} \Big\{ r(i,a) + \sum_{j \ne i} q(j \mid i,a)\, v(j) + [q(i) - q_i(a)]\, v(i) \Big\}, \quad i \in S. \tag{S-23}$$

Part (e) of Theorem 3.2 states that if, in addition, Assumption C holds, then there exists an optimal

stationary policy. When Assumptions A, B, and C(3) hold, then Theorem 3.3 of GH states that if

{a∗(i)} satisfy

$$v^*(i) = \frac{1}{\beta + q(i)} \Big\{ r(i,a^*(i)) + \sum_{j \ne i} q(j \mid i,a^*(i))\, v^*(j) + [q(i) - q_i(a^*(i))]\, v^*(i) \Big\}, \quad i \in S, \tag{S-24}$$

then {a∗(i)} is an optimal (stationary) policy.

Before we proceed, observe that the equation v = Lv in Section 3.2 is simply (S-23) specialized

to our context, and multiplied by −1 to express our cost minimization as a (negative) revenue

maximization.
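Equation (S-23) can be read as a fixed-point equation v = Lv. The following is a minimal sketch of that reading on a finite state set; the two-state example, the function names, and the choice of repeated substitution as a solution method are assumptions made for illustration and are not the paper's model or the method of GH. On a finite state set the right side of (S-23) is a sup-norm contraction with modulus max_i q(i)/(β + q(i)) < 1, so repeated substitution converges.

```python
def bellman_rhs(v, i, actions, rates, reward, beta, q_bar):
    """Right side of (S-23) at state i for a finite-state continuous-time MDP.

    rates(i, a) returns {j: q(j|i,a)} for j != i; q_bar[i] plays the role of q(i).
    """
    best = float("-inf")
    for a in actions(i):
        out = rates(i, a)
        q_i_a = sum(out.values())                       # q_i(a)
        value = (reward(i, a)
                 + sum(rate * v[j] for j, rate in out.items())
                 + (q_bar[i] - q_i_a) * v[i])
        best = max(best, value)
    return best / (beta + q_bar[i])


def solve_by_iteration(states, actions, rates, reward, beta, sweeps=2000):
    """Repeatedly apply the right side of (S-23), starting from v = 0."""
    q_bar = {i: max(sum(rates(i, a).values()) for a in actions(i)) for i in states}
    v = {i: 0.0 for i in states}
    for _ in range(sweeps):
        v = {i: bellman_rhs(v, i, actions, rates, reward, beta, q_bar) for i in states}
    return v, q_bar


# Hypothetical two-state example (not the production-inventory model):
# state 1 is costly and drains to state 0 at rate 2; in state 0, action 1
# pays 0.25 per unit time to jump to state 1 at rate 1.
def toy_actions(i):
    return [0, 1]


def toy_rates(i, a):
    if i == 1:
        return {0: 2.0}
    return {1: 1.0} if a == 1 else {}


def toy_reward(i, a):
    return -(i + 0.25 * a)


toy_v, toy_q_bar = solve_by_iteration([0, 1], toy_actions, toy_rates, toy_reward, beta=0.1)
```

In this toy example the best action in state 0 is to stay put, so the iteration returns a value of 0 in state 0 and −1/(β + 2) in state 1.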

Lemma S-2 The system with IDD and unbounded jump rates satisfies Assumptions A, B, and C

of GH.


Proof. Let $R(x,\mathbf{y}) := |x| + \sum_{i=1}^{k} y_i = |x| + y$ be as defined in (10) and let Sm = Sm. Then Assumptions A(1) and A(2) hold trivially. For A(3) and B(1) we need to identify constants c0 ≤ 0 and b ≥ 0 so that

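$$\begin{aligned}
&\lambda R(x,\mathbf{y}+\mathbf{e}_1) + \sum_{i=1}^{k-1} \nu_i y_i R(x,\mathbf{y}+\mathbf{e}_{i+1}-\mathbf{e}_i) + \nu_k y_k R(x-1,\mathbf{y}-\mathbf{e}_k) + \mu I_{\{a=1\}} R(x+1,\mathbf{y}) \\
&\qquad - \Big[\lambda + \sum_{i=1}^{k} \nu_i y_i + \mu I_{\{a=1\}}\Big] R(x,\mathbf{y}) \le c_0 R(x,\mathbf{y}) + b
\end{aligned} \tag{S-25}$$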

for all (x,y) ∈ S and a ∈ {0, 1}. The expression on the left side above simplifies to λ[R(x,y + e1) − R(x,y)] + νkyk[R(x − 1,y − ek) − R(x,y)] + µI{a=1}[R(x + 1,y) − R(x,y)], which is bounded above by λ + µ. Hence (S-25) holds with c0 = 0 and b = λ + µ. Thus, Conditions A(3) and B(1)

hold. For B(2) we need constants M1,M2 ≥ 0 so that c(x) ≤ M1 + M2R(x,y), for all (x,y) ∈ S.

It is easy to see that B(2) holds with M1 = 0 and M2 = b + h.

The compactness and continuity Assumptions C(1) and C(2) hold trivially because our action

space is finite. For C(3), we need to identify a non-negative function w′ on S and constants

b′ ≥ 0, c′ > 0, and M ′ > 0 such that

$$\Big(\lambda + \sum_{i=1}^{k} \nu_i y_i + \mu\Big) R(x,\mathbf{y}) \le M' w'(x,\mathbf{y}) \quad \text{for all } (x,\mathbf{y}) \in S \tag{S-26}$$

and

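$$\begin{aligned}
&\lambda w'(x,\mathbf{y}+\mathbf{e}_1) + \sum_{i=1}^{k-1} \nu_i y_i w'(x,\mathbf{y}+\mathbf{e}_{i+1}-\mathbf{e}_i) + \nu_k y_k w'(x-1,\mathbf{y}-\mathbf{e}_k) + \mu I_{\{a=1\}} w'(x+1,\mathbf{y}) \\
&\qquad - \Big[\lambda + \sum_{i=1}^{k} \nu_i y_i + \mu I_{\{a=1\}}\Big] w'(x,\mathbf{y}) \le c' w'(x,\mathbf{y}) + b'.
\end{aligned} \tag{S-27}$$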

Let $w'(x,\mathbf{y}) := (|x| + y)^2$ and $M' := \lambda + \mu + \max\{\nu_1,\dots,\nu_k\}$. Note that (S-26) holds when x = y = 0. Otherwise, we have either |x| ≥ 1 or y ≥ 1 (or both), and hence

$$\lambda + \sum_{i=1}^{k} \nu_i y_i + \mu \le M'(|x| + y).$$

Multiplying the above by R(x,y) yields (S-26). The left-hand side of (S-27) simplifies to

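$$\begin{aligned}
&\lambda[w'(x,\mathbf{y}+\mathbf{e}_1) - w'(x,\mathbf{y})] + \nu_k y_k[w'(x-1,\mathbf{y}-\mathbf{e}_k) - w'(x,\mathbf{y})] + \mu I_{\{a=1\}}[w'(x+1,\mathbf{y}) - w'(x,\mathbf{y})] \\
&\qquad \le \lambda[w'(x,\mathbf{y}+\mathbf{e}_1) - w'(x,\mathbf{y})] + \mu[w'(x+1,\mathbf{y}) - w'(x,\mathbf{y})] \\
&\qquad \le 2\lambda(|x| + y) + \lambda + 2\mu(|x| + y) + \mu \\
&\qquad = 2(\lambda + \mu)(|x| + y) + (\lambda + \mu) \\
&\qquad \le 2(\lambda + \mu)\, w'(x,\mathbf{y}) + (\lambda + \mu).
\end{aligned}$$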

Therefore, C(3) holds with c′ := 2(λ + µ) and b′ := λ + µ. This completes the proof.
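The drift bounds used in the proof can be spot-checked numerically. The sketch below is an illustration only: the function names, the finite grid, and the parameter values are assumptions, and a check on a grid is not a substitute for the proof. It evaluates the left sides of (S-25) and (S-27) with R(x, y) = |x| + y and w′(x, y) = (|x| + y)², and compares them with the constants c0 = 0, b = λ + µ, c′ = 2(λ + µ), b′ = λ + µ and M′ = λ + µ + max νi.

```python
from itertools import product


def R(x, y):
    return abs(x) + sum(y)


def w_prime(x, y):
    return R(x, y) ** 2


def moved(y, src=None, dst=None):
    # Copy of y with one order removed from stage src and/or added to stage dst.
    z = list(y)
    if src is not None:
        z[src] -= 1
    if dst is not None:
        z[dst] += 1
    return tuple(z)


def drift(f, x, y, a, lam, mu, nu):
    """Left side of (S-25)/(S-27): sum over next states of rate * (f(next) - f(current))."""
    k = len(nu)
    base = f(x, y)
    total = lam * (f(x, moved(y, dst=0)) - base)
    for i in range(k - 1):
        if y[i] > 0:
            total += nu[i] * y[i] * (f(x, moved(y, src=i, dst=i + 1)) - base)
    if y[k - 1] > 0:
        total += nu[k - 1] * y[k - 1] * (f(x - 1, moved(y, src=k - 1)) - base)
    if a == 1:
        total += mu * (f(x + 1, y) - base)
    return total


def check_assumptions(lam, mu, nu, x_values, y_max, tol=1e-9):
    """Return True if A(3) and both parts of C(3) hold at every grid point."""
    k = len(nu)
    b = lam + mu                      # constant in (S-25), with c0 = 0
    c_p, b_p = 2 * (lam + mu), lam + mu
    m_p = lam + mu + max(nu)
    for x in x_values:
        for y in product(range(y_max + 1), repeat=k):
            q_bar = lam + sum(n * yi for n, yi in zip(nu, y)) + mu
            if q_bar * R(x, y) > m_p * w_prime(x, y) + tol:          # (S-26)
                return False
            for a in (0, 1):
                if drift(R, x, y, a, lam, mu, nu) > b + tol:         # (S-25)
                    return False
                if drift(w_prime, x, y, a, lam, mu, nu) > c_p * w_prime(x, y) + b_p + tol:  # (S-27)
                    return False
    return True
```

Calling `check_assumptions(0.8, 1.0, (0.2, 0.5), range(-4, 5), 3)` walks a small grid with k = 2 update stages.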

S-7 Systems with IDD and Low Arrival Rate

For small enough λ, a system with no ADI will hold no inventory and produce to order, giving an

average cost of approximately C1 := λb/µ; this expression comes from ignoring the possibility of

multiple orders being present at once (which is not unreasonable when λ is very small) and noting

that jobs arrive at rate λ and incur cost at rate b during the time it takes to produce one unit,

which has mean 1/µ. For a system with ADI, it will again be best not to hold inventory when no

jobs are in the demand leadtime system. When an order is announced, then the decision is whether

or not to produce one unit in advance of the order becoming due. Again ignoring the possibility of

multiple orders being present simultaneously, if the decision is not to produce upon announcement

of an order, then the long-run average cost will again be roughly C1 = λb/µ. If the decision is to

commence production upon announcement of an order, then we can derive an approximation for the

average cost by conditioning on whether the production is completed prior to the order becoming

due [which occurs with probability µ/(µ + ν)] or not [which occurs with probability ν/(µ + ν)].

By standard properties of exponential random variables, the order will generate an average holding

cost of h/ν conditional upon the unit completing production prior to the order becoming due.

Similarly, the order will generate an average backorder cost of b/µ conditional upon the unit not

completing production prior to the order coming due. Putting it together, the long-run average

cost is approximately $C_2 := \lambda\left[\dfrac{\mu}{\mu+\nu}\,\dfrac{h}{\nu} + \dfrac{\nu}{\mu+\nu}\,\dfrac{b}{\mu}\right]$.

Hence, for the system with ADI, it will be better to produce upon announcement of an order

provided C2 < C1. Rearranging terms, it follows that it is better to produce if ν > ν∗(µ) := hµ/b

and it is better to wait for the order to become due otherwise. It follows that if ν > ν∗(µ) then

PCR = 100 × (JN − JA)/JN ≈ 100 × (C1 − C2)/C1 = 100 × (bµν − hµ²)/(bµν + bν²) for λ small. Likewise, if ν ≤ ν∗(µ) then PCR ≈ 0 for λ small. A similar analysis is possible for general k.
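The low-arrival-rate approximation above is straightforward to evaluate. The following is a minimal sketch of the formulas for C1, C2, ν∗(µ) and the approximate PCR for k = 1; the function name and the parameter values in the usage note are illustrative assumptions rather than values taken from the numerical study.

```python
def low_rate_approximation(lam, mu, nu, h, b):
    """Small-lambda approximations for the IDD system with k = 1.

    Returns (C1, C2, nu_star, PCR) where
      C1      = lam * b / mu                  (never produce ahead of the due date)
      C2      = lam * [mu/(mu+nu) * h/nu + nu/(mu+nu) * b/mu]
                                              (produce upon announcement)
      nu_star = h * mu / b                    (produce ahead only if nu > nu_star)
      PCR     = 100 * (C1 - C2) / C1 if nu > nu_star, else 0.
    """
    c1 = lam * b / mu
    p_done_first = mu / (mu + nu)        # production finishes before the order is due
    c2 = lam * (p_done_first * h / nu + (1.0 - p_done_first) * b / mu)
    nu_star = h * mu / b
    pcr = 100.0 * (c1 - c2) / c1 if nu > nu_star else 0.0
    return c1, c2, nu_star, pcr
```

For instance, with the assumed values λ = 0.01, µ = 1, h = 1, b = 10 and ν = 0.5, the threshold is ν∗ = 0.1, so producing upon announcement is preferred, and the PCR agrees with the closed form 100 × (bµν − hµ²)/(bµν + bν²) = 100 × 4/7.5.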

S-8 Extra Material for Section 5

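Proof of Theorem 3. To prove (a), note that by Proposition S-1, the value function v∗ satisfies conditions (C2) and (C3). Applying Condition (C3) and using the definition of $s_{\mathbf{y}+\mathbf{e}_{rw}}$, we have $\Delta v^*(s_{\mathbf{y}+\mathbf{e}_{rw}}, \mathbf{y}+\mathbf{e}_{pq}) \ge \Delta v^*(s_{\mathbf{y}+\mathbf{e}_{rw}}, \mathbf{y}+\mathbf{e}_{rw}) \ge 0$, which implies $s_{\mathbf{y}+\mathbf{e}_{rw}} \ge s_{\mathbf{y}+\mathbf{e}_{pq}}$. Also, by Condition (C2) and the definition of $s_{\mathbf{y}+\mathbf{e}_{pq}}$, we have $\Delta v^*(s_{\mathbf{y}+\mathbf{e}_{pq}} + 1, \mathbf{y}+\mathbf{e}_{rw}) \ge \Delta v^*(s_{\mathbf{y}+\mathbf{e}_{pq}}, \mathbf{y}+\mathbf{e}_{pq}) \ge 0$, which implies $s_{\mathbf{y}+\mathbf{e}_{pq}} + 1 \ge s_{\mathbf{y}+\mathbf{e}_{rw}}$. Therefore, $s_{\mathbf{y}+\mathbf{e}_{pq}} \le s_{\mathbf{y}+\mathbf{e}_{rw}} \le s_{\mathbf{y}+\mathbf{e}_{pq}} + 1$ and hence $s_{\mathbf{y}+\mathbf{e}_{rw}}$ is equal to either $s_{\mathbf{y}+\mathbf{e}_{pq}}$ or $s_{\mathbf{y}+\mathbf{e}_{pq}} + 1$. Finally, part (b) is a consequence of the fact that v∗ satisfies Condition (C4).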

Proposition S-1 The value function v∗ satisfies Conditions (C1)–(C5).

Proof. If T preserves Conditions (C1)–(C5), then it will follow by an argument identical to that

at the end of the proof of Proposition 1 that v∗ satisfies Conditions (C1)–(C5). Hence, we need

only prove that if v ∈ V satisfies Conditions (C1)–(C5), then so does T v.

To this end, suppose hereafter that v satisfies Conditions (C1)–(C5).

We will check that T v satisfies Conditions (C1)–(C5). As before, it is sufficient to verify that

each of {Tijv} and Tµv individually satisfies the five conditions. For Conditions (C1) and (C4)

the verification is essentially identical to the approach used for Conditions C1 and C4 in the proof

of Proposition 1; hence we omit the details. Likewise, the verification that Tµv satisfies the five

conditions is virtually identical to the corresponding arguments in Proposition 1. Hence, we present

only a few comments where there are minor differences in the verification for Tµv.

We begin by checking Conditions (C2) and (C3). For this, suppose (p, q) ≺ (r, w), and y +

epq,y + erw ∈ Y, and let Jij = I{(p,q)=(i,j)} and Lij = I{(r,w)=(i,j)}.

To verify that Tµv satisfies Conditions (C2) and (C3), let x∗(y) := min{x : ∆v(x,y + epq) ≥ 0}

for y ∈ Y. Similar to IDD systems, we have that $x^*_{\mathbf{y}+\mathbf{e}_{rw}}$ is either $x^*_{\mathbf{y}+\mathbf{e}_{pq}}$ or $x^*_{\mathbf{y}+\mathbf{e}_{pq}} + 1$, because v satisfies Conditions (C2) and (C3). Proof that Tµv does indeed satisfy Conditions (C2) and (C3)

follows by considering cases directly analogous to those used for Tµ in the proof of Proposition 1.

Condition (C2): For operators {Tij}, we need to show that ∆Tijv(x,y+epq) ≤ ∆Tijv(x+1,y+erw).

(a) For i = 0, j = 1, . . . , k0 − 1, the operator Tij = T0j is given by (12).


If y0j = 1 then

∆T0jv(x,y + epq) = ∆v(x,y + epq + e0,j+1 − e0j)

≤ ∆v(x + 1,y + erw + e0,j+1 − e0j)

= ∆T0jv(x + 1,y + erw) ,

where the inequality holds because v is assumed to satisfy Condition (C2).

If y0j ≠ 1 then

∆T0jv(x,y + epq) = ∆v(x,y + epq + J0j(e0,j+1 − e0j))

≤ ∆v(x + 1,y + erw + L0j(e0,j+1 − e0j))

= ∆T0jv(x + 1,y + erw) .

The inequality above can be checked by looking at the three possible values of (J0j , L0j). If

(J0j , L0j) ∈ {(0, 0), (1, 0), (0, 1)}, the inequality holds because v satisfies Condition (C2).

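(b) For $i = 0$, $j = k_0$, the operator $T_{ij} = T_{0k_0}$ is given by (13). Let $I_1 = I_{\{n=0\}}$.

If $y_{0k_0} = 1$ then

$$\begin{aligned}
\Delta T_{0k_0} v(x,\mathbf{y}+\mathbf{e}_{pq}) &= \Delta v(x - I_1, \mathbf{y}+\mathbf{e}_{pq}+\mathbf{e}_{01}-\mathbf{e}_{0k_0}+(1-I_1)\mathbf{e}_{11}) \\
&\le \Delta v(x + 1 - I_1, \mathbf{y}+\mathbf{e}_{rw}+\mathbf{e}_{01}-\mathbf{e}_{0k_0}+(1-I_1)\mathbf{e}_{11}) \\
&= \Delta T_{0k_0} v(x+1,\mathbf{y}+\mathbf{e}_{rw}).
\end{aligned}$$

If $y_{0k_0} \ne 1$ then

$$\begin{aligned}
\Delta T_{0k_0} v(x,\mathbf{y}+\mathbf{e}_{pq}) &= \Delta v(x - J_{0k_0} I_1, \mathbf{y}+\mathbf{e}_{pq}+J_{0k_0}(\mathbf{e}_{01}-\mathbf{e}_{0k_0}+(1-I_1)\mathbf{e}_{11})) \\
&\le \Delta v(x + 1 - L_{0k_0} I_1, \mathbf{y}+\mathbf{e}_{rw}+L_{0k_0}(\mathbf{e}_{01}-\mathbf{e}_{0k_0}+(1-I_1)\mathbf{e}_{11})) \\
&= \Delta T_{0k_0} v(x+1,\mathbf{y}+\mathbf{e}_{rw}).
\end{aligned}$$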

The inequalities above follow from the fact that v satisfies Conditions (C2)–(C3).

(c) For i = 1, . . . , n, j = 1, the operator Tij = Ti1 is given by (14) when ki ≥ 2. Let I2 = I{(p,q)≠(i,l) for l=2,...,ki} and I3 = I{(r,w)≠(i,l) for l=2,...,ki}.

If yi1 ≥ 1 and yil = 0 for l = 2, . . . , ki then

∆Ti1v(x,y + epq) = ∆v(x,y + epq + I2(ei2 − ei1))

≤ ∆v(x + 1,y + erw + I3(ei2 − ei1))

= ∆Ti1v(x + 1,y + erw).


If yi1 = 0 and yil = 0 for l = 2, . . . , ki

∆Ti1v(x,y + epq) = ∆v(x,y + epq + Ji1(ei2 − ei1))

≤ ∆v(x + 1,y + erw + Li1(ei2 − ei1))

= ∆Ti1v(x + 1,y + erw).

If yil ≥ 1 for some l = 2, . . . , ki then

∆Ti1v(x,y + epq) = ∆v(x,y + epq)

≤ ∆v(x + 1,y + erw)

= ∆Ti1v(x + 1,y + erw).

The inequalities above follow from the fact that v satisfies Conditions (C1)–(C3).

(d) For i = 1, . . . , n, j = 2, . . . , ki − 1, the operator Tij is given by (15).

If yij = 1

∆Tijv(x,y + epq) = ∆v(x,y + epq + ei,j+1 − eij)

≤ ∆v(x + 1,y + erw + ei,j+1 − eij)

= ∆Tijv(x + 1,y + erw).

If yij = 0

∆Tijv(x,y + epq) = ∆v(x,y + epq + Jij(ei,j+1 − eij))

≤ ∆v(x + 1,y + erw + Lij(ei,j+1 − eij))

= ∆Tijv(x + 1,y + erw).

The inequalities above follow from the fact that v satisfies Conditions (C1) and (C2).

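(e) For $i = 1,\dots,n$ and $j = k_i$, the operator $T_{ik_i}$ is given by (16) and (17). Let $I_4 = I_{\{i=n\}}$. Using the fact that v satisfies Conditions (C1)–(C3) we have the following.

If $y_{ik_i} \ge 1$ then

$$\begin{aligned}
\Delta T_{ik_i} v(x,\mathbf{y}+\mathbf{e}_{pq}) &= \Delta v(x - I_4, \mathbf{y}+\mathbf{e}_{pq}-\mathbf{e}_{ik_i}+(1-I_4)\mathbf{e}_{i+1,1}) \\
&\le \Delta v(x + 1 - I_4, \mathbf{y}+\mathbf{e}_{rw}-\mathbf{e}_{ik_i}+(1-I_4)\mathbf{e}_{i+1,1}) \\
&= \Delta T_{ik_i} v(x+1,\mathbf{y}+\mathbf{e}_{rw}).
\end{aligned}$$

If $y_{ik_i} = 0$ then

$$\begin{aligned}
\Delta T_{ik_i} v(x,\mathbf{y}+\mathbf{e}_{pq}) &= \Delta v(x - J_{ik_i} I_4, \mathbf{y}+\mathbf{e}_{pq}+J_{ik_i}(-\mathbf{e}_{ik_i}+(1-I_4)\mathbf{e}_{i+1,1})) \\
&\le \Delta v(x + 1 - L_{ik_i} I_4, \mathbf{y}+\mathbf{e}_{rw}+L_{ik_i}(-\mathbf{e}_{ik_i}+(1-I_4)\mathbf{e}_{i+1,1})) \\
&= \Delta T_{ik_i} v(x+1,\mathbf{y}+\mathbf{e}_{rw}).
\end{aligned}$$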

This completes the verification that each Tijv satisfies Condition (C2).

Condition (C3): For operators {Tij}, we need to show that ∆Tijv(x,y + erw) ≤ ∆Tijv(x,y + epq).

(a) For i = 0, j = 1, . . . , k0 − 1, the operator Tij = T0j is given by (12).

If y0j = 1 then

∆T0jv(x,y + erw) = ∆v(x,y + erw + e0,j+1 − e0j)

≤ ∆v(x,y + epq + e0,j+1 − e0j)

= ∆T0jv(x,y + epq).

If y0j ≠ 1 then

∆T0jv(x,y + erw) = ∆v(x,y + erw + L0j(e0,j+1 − e0j))

≤ ∆v(x,y + epq + J0j(e0,j+1 − e0j))

= ∆T0jv(x,y + epq).

The inequalities above follow from the fact that v satisfies condition (C3).

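(b) For $i = 0$, $j = k_0$, the operator $T_{ij} = T_{0k_0}$ is given by (13). Let $I_1 = I_{\{n=0\}}$.

If $y_{0k_0} = 1$ then

$$\begin{aligned}
\Delta T_{0k_0} v(x,\mathbf{y}+\mathbf{e}_{rw}) &= \Delta v(x - I_1, \mathbf{y}+\mathbf{e}_{rw}+\mathbf{e}_{01}-\mathbf{e}_{0k_0}+(1-I_1)\mathbf{e}_{11}) \\
&\le \Delta v(x - I_1, \mathbf{y}+\mathbf{e}_{pq}+\mathbf{e}_{01}-\mathbf{e}_{0k_0}+(1-I_1)\mathbf{e}_{11}) \\
&= \Delta T_{0k_0} v(x,\mathbf{y}+\mathbf{e}_{pq}).
\end{aligned}$$

If $y_{0k_0} \ne 1$ then

$$\begin{aligned}
\Delta T_{0k_0} v(x,\mathbf{y}+\mathbf{e}_{rw}) &= \Delta v(x - L_{0k_0} I_1, \mathbf{y}+\mathbf{e}_{rw}+L_{0k_0}(\mathbf{e}_{01}-\mathbf{e}_{0k_0}+(1-I_1)\mathbf{e}_{11})) \\
&\le \Delta v(x - J_{0k_0} I_1, \mathbf{y}+\mathbf{e}_{pq}+J_{0k_0}(\mathbf{e}_{01}-\mathbf{e}_{0k_0}+(1-I_1)\mathbf{e}_{11})) \\
&= \Delta T_{0k_0} v(x,\mathbf{y}+\mathbf{e}_{pq}).
\end{aligned}$$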

The inequalities above hold because v satisfies Conditions (C1)–(C3) and (C5).


(c) For i = 1, . . . , n, j = 1, the operator Tij = Ti1 is given by (14) when ki ≥ 2. Let I2 = I{(p,q)≠(i,l) for l=2,...,ki} and I3 = I{(r,w)≠(i,l) for l=2,...,ki}.

If yi1 ≥ 1 and yil = 0 for l = 2, . . . , ki then

∆Ti1v(x,y + erw) = ∆v(x,y + erw + I3(ei2 − ei1))

≤ ∆v(x,y + epq + I2(ei2 − ei1))

= ∆Ti1v(x,y + epq).

If yi1 = 0 and yil = 0 for l = 2, . . . , ki then

∆Ti1v(x,y + erw) = ∆v(x,y + erw + Li1(ei2 − ei1))

≤ ∆v(x,y + epq + Ji1(ei2 − ei1))

= ∆Ti1v(x,y + epq).

If yil ≥ 1 for some l = 2, . . . , ki then

∆Ti1v(x,y + erw) = ∆v(x,y + erw)

≤ ∆v(x,y + epq)

= ∆Ti1v(x,y + epq) .

The inequalities hold because v satisfies condition (C3).

(d) For i = 1, . . . , n, j = 2, . . . , ki − 1, the operator Tij is given by (15).

If yij = 1 then

∆Tijv(x,y + erw) = ∆v(x,y + erw + ei,j+1 − eij)

≤ ∆v(x,y + epq + ei,j+1 − eij)

= ∆Tijv(x,y + epq) .

If yij = 0 then

∆Tijv(x,y + erw) = ∆v(x,y + erw + Lij(ei,j+1 − eij))

≤ ∆v(x,y + epq + Jij(ei,j+1 − eij))

= ∆Tijv(x,y + epq) .

The inequalities above follow from the fact that v satisfies condition (C3).


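(e) For $i = 1,\dots,n$, $j = k_i$, the operator $T_{ij} = T_{ik_i}$ is given by (16) and (17). Let $I_4 = I_{\{i=n\}}$. Using the fact that v satisfies Conditions (C1)–(C3) we have the following.

If $y_{ik_i} \ge 1$ then

$$\begin{aligned}
\Delta T_{ik_i} v(x,\mathbf{y}+\mathbf{e}_{rw}) &= \Delta v(x - I_4, \mathbf{y}+\mathbf{e}_{rw}-\mathbf{e}_{ik_i}+(1-I_4)\mathbf{e}_{i+1,1}) \\
&\le \Delta v(x - I_4, \mathbf{y}+\mathbf{e}_{pq}-\mathbf{e}_{ik_i}+(1-I_4)\mathbf{e}_{i+1,1}) \\
&= \Delta T_{ik_i} v(x,\mathbf{y}+\mathbf{e}_{pq}).
\end{aligned}$$

If $y_{ik_i} = 0$ then

$$\begin{aligned}
\Delta T_{ik_i} v(x,\mathbf{y}+\mathbf{e}_{rw}) &= \Delta v(x - L_{ik_i} I_4, \mathbf{y}+\mathbf{e}_{rw}+L_{ik_i}(-\mathbf{e}_{ik_i}+(1-I_4)\mathbf{e}_{i+1,1})) \\
&\le \Delta v(x - J_{ik_i} I_4, \mathbf{y}+\mathbf{e}_{pq}+J_{ik_i}(-\mathbf{e}_{ik_i}+(1-I_4)\mathbf{e}_{i+1,1})) \\
&= \Delta T_{ik_i} v(x,\mathbf{y}+\mathbf{e}_{pq}).
\end{aligned}$$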

This completes the verification that Tijv satisfies Conditions (C1) and (C3) for all (i, j).

Condition (C5): Turning to Condition (C5), consider y and q ≥ 2 such that y+e0q,y+e01+e11 ∈ Y.

By the comments that precede Condition (C5) in Section 5, it suffices for us to hereafter consider

only cases with k0 ≥ 2 and n ≥ 1.

To verify that Tµv satisfies Condition (C5), observe first that

∆v(x,y + e0q) ≤ ∆v(x + 1,y + e01 + e11). (S-28)

To see this note that ∆v(x,y + e0q) ≤ ∆v(x,y + e01) = ∆v(x,y + e01 + e00) ≤ ∆v(x + 1,y + e01 +

e11), where the first and second inequalities follow because v satisfies Conditions (C3) and (C2),

respectively. From (S-28) and the fact that v satisfies Condition (C5), it follows that $x^*_{\mathbf{y}+\mathbf{e}_{01}+\mathbf{e}_{11}}$ is either $x^*_{\mathbf{y}+\mathbf{e}_{0q}}$ or $x^*_{\mathbf{y}+\mathbf{e}_{0q}} + 1$. Proof that Tµv satisfies Condition (C5) follows by considering

cases directly analogous to those used to show that Tµv satisfies Condition (C3) in the proof of

Proposition 1.

For {Tij} we need to show that ∆Tijv(x,y + e01 + e11) ≤ ∆Tijv(x,y + e0q) for all (x,y) and

q ≥ 2 such that y + e0q,y + e01 + e11 ∈ Y.

(a) For i = 0, j = 1, . . . , k0 − 1, the operator Tij = T0j is given by (12). Since v satisfies

Conditions (C3) and (C5), we have that

∆T0jv(x,y + e01 + e11) = ∆v(x,y + e01I{j≠1} + e02I{j=1} + e11)

≤ ∆v(x,y + e0qI{j≠q} + e0,j+1I{j=q})

= ∆T0jv(x,y + e0q).


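(b) For $i = 0$, $j = k_0$, the operator $T_{ij} = T_{0k_0}$ is given by (13). Recall that we need only consider the case with $k_0 \ge 2$ and $n \ge 1$. Since v satisfies Condition (C5), we have

$$\begin{aligned}
\Delta T_{0k_0} v(x,\mathbf{y}+\mathbf{e}_{01}+\mathbf{e}_{11}) &= \Delta v(x,\mathbf{y}+\mathbf{e}_{01}+\mathbf{e}_{11}) \\
&\le \Delta v(x,\mathbf{y}+[\mathbf{e}_{01}+\mathbf{e}_{11}] I_{\{q=k_0\}} + \mathbf{e}_{0q} I_{\{q \ne k_0\}}) \\
&= \Delta T_{0k_0} v(x,\mathbf{y}+\mathbf{e}_{0q}).
\end{aligned}$$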

(c) For i = 1, . . . , n, j = 1, the operator Tij = Ti1 is given by (14) when ki ≥ 2.

For i ≠ 1 we have

∆Ti1v(x,y + e01 + e11) = ∆v(x,y + e01 + e11 + [ei2 − ei1]I{yi1≥1 and yiℓ=0 for ℓ=2,...,ki})

≤ ∆v(x,y + e0q + [ei2 − ei1]I{yi1≥1 and yiℓ=0 for ℓ=2,...,ki})

= ∆Ti1v(x,y + e0q),

because v satisfies Condition (C5).

For i = 1, let I ′ = I{y1ℓ=0 for ℓ=2,...,k1}. Then

∆T11v(x,y + e01 + e11) = ∆v(x,y + e01 + e12I′ + e11[1 − I ′])

≤ ∆v(x,y + e0q + [e12 − e11]I{y11≥1 and y1ℓ=0 for ℓ=2,...,k1})

= ∆T11v(x,y + e0q),

because v satisfies Conditions (C3) and (C5).

(d) For i = 1, . . . , n, j = 2, . . . , ki − 1, operator Tij is given by (15). Since v satisfies Condi-

tion (C5), we have

∆Tijv(x,y + e01 + e11) = ∆v(x,y + e01 + e11 + [ei,j+1 − eij ]I{yij≥1})

≤ ∆v(x,y + e0q + [ei,j+1 − eij]I{yij≥1})

= ∆Tijv(x,y + e0q).

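(e) For $i = 1,\dots,n-1$, $j = k_i$, the operator $T_{ij} = T_{ik_i}$ is given by (16).

If $i > 1$ or if both $i = 1$ and $k_1 > 1$, then we have

$$\begin{aligned}
\Delta T_{ik_i} v(x,\mathbf{y}+\mathbf{e}_{01}+\mathbf{e}_{11}) &= \Delta v(x,\mathbf{y}+\mathbf{e}_{01}+\mathbf{e}_{11}+[\mathbf{e}_{i+1,1}-\mathbf{e}_{ik_i}] I_{\{y_{ik_i} \ge 1\}}) \\
&\le \Delta v(x,\mathbf{y}+\mathbf{e}_{0q}+[\mathbf{e}_{i+1,1}-\mathbf{e}_{ik_i}] I_{\{y_{ik_i} \ge 1\}}) \\
&= \Delta T_{ik_i} v(x,\mathbf{y}+\mathbf{e}_{0q}),
\end{aligned}$$

because v satisfies Condition (C5).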

If i = 1 and k1 = 1 we have

∆T11v(x,y + e01 + e11) = ∆v(x,y + e01 + e21)

≤ ∆v(x,y + e0q + [e21 − e11]I{y11≥1})

= ∆T11v(x,y + e0q),

because v satisfies Conditions (C3) and (C5).

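(f) For $i = n$, $j = k_n$, the operator $T_{ij} = T_{nk_n}$ is given by (17).

If $n > 1$ or if both $n = 1$ and $k_1 > 1$, then we have

$$\begin{aligned}
\Delta T_{nk_n} v(x,\mathbf{y}+\mathbf{e}_{01}+\mathbf{e}_{11}) &= \Delta v(x - I_{\{y_{nk_n} \ge 1\}}, \mathbf{y}+\mathbf{e}_{01}+\mathbf{e}_{11}-\mathbf{e}_{nk_n} I_{\{y_{nk_n} \ge 1\}}) \\
&\le \Delta v(x - I_{\{y_{nk_n} \ge 1\}}, \mathbf{y}+\mathbf{e}_{0q}-\mathbf{e}_{nk_n} I_{\{y_{nk_n} \ge 1\}}) \\
&= \Delta T_{nk_n} v(x,\mathbf{y}+\mathbf{e}_{0q}),
\end{aligned}$$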

because v satisfies Condition (C5).

If n = 1 and k1 = 1 we have

∆T11v(x,y + e01 + e11) = ∆v(x − 1,y + e01)

≤ ∆v(x − I{y11≥1},y + e0q − e11I{y11≥1})

= ∆T11v(x,y + e0q),

because v satisfies Conditions (C2) and (C5).

This completes the verification that Tijv satisfies Condition (C5).


S-9 Additional Numerical Results

          λ = 0.4                    λ = 0.6                    λ = 0.8
ν1 = ν2   k=0      k=1      k=2      k=0      k=1      k=2      k=0      k=1      k=2
0.01      6.67     6.64     6.62     13.00    12.89    12.88    30.96    29.09    28.42
                   0.51%    0.78%             0.84%    0.87%             6.04%    8.20%
0.02      6.67     6.56     6.56     13.00    12.68    12.67    30.96    28.34    27.55
                   1.66%    1.67%             2.45%    2.50%             8.47%    11.01%
0.05      6.67     6.35     6.32     13.00    12.20    12.10    30.96    27.70    27.04
                   4.84%    5.25%             6.13%    6.92%             10.54%   12.68%
0.1       6.67     6.10     6.10     13.00    11.77    11.60    30.96    27.86    27.00
                   8.52%    8.55%             9.46%    10.77%            10.61%   12.79%
0.2       6.67     5.82     5.80     13.00    11.44    11.00    30.96    28.15    27.10
                   12.73%   13.04%            12.03%   15.38%            9.06%    12.47%
0.5       6.67     5.59     5.36     13.00    11.33    10.84    30.96    29.14    27.96
                   16.21%   19.57%            12.82%   16.62%            5.87%    9.68%
1.0       6.67     5.27     5.01     13.00    11.72    11.01    30.96    29.95    29.70
                   20.97%   24.89%            9.81%    15.31%            3.26%    4.07%
1.5       6.67     5.34     5.08     13.00    12.37    11.06    30.96    30.15    29.76
                   19.98%   23.78%            4.85%    14.89%            2.60%    3.89%
2.0       6.67     5.49     5.08     13.00    12.82    12.68    30.96    30.33    29.93
                   17.74%   23.81%            1.38%    2.46%             2.02%    3.34%

Table S-1: Average cost and percentage cost reduction (PCR) for systems with IDD (b = 10). The columns labeled “k = 0” show the average cost for systems without ADI.


          λ = 0.4                    λ = 0.6                    λ = 0.8
ν1 = ν2   k=0      k=1      k=2      k=0      k=1      k=2      k=0      k=1      k=2
0.01      19.33    18.34    18.22    34.44    32.98    32.84    80.26    71.28    70.01
                   5.15%    5.72%             4.24%    4.65%             11.19%   12.77%
0.02      19.33    17.87    17.86    34.44    31.83    31.77    80.26    69.21    67.40
                   7.57%    7.60%             7.58%    7.74%             13.77%   16.03%
0.05      19.33    16.93    16.89    34.44    30.00    29.63    80.26    69.60    65.61
                   12.41%   12.61%            12.88%   13.98%            13.29%   18.26%
0.10      19.33    15.93    15.80    34.44    29.12    28.10    80.26    72.34    67.00
                   17.58%   18.26%            15.46%   18.41%            9.87%    16.52%
0.20      19.33    14.98    14.60    34.44    29.35    27.00    80.26    75.48    71.00
                   22.52%   24.47%            14.77%   21.60%            5.95%    11.54%
0.50      19.33    15.92    13.65    34.44    31.76    28.81    80.26    78.18    76.20
                   17.64%   29.39%            7.77%    16.35%            2.59%    5.05%
1.0       19.33    16.37    15.01    34.44    32.80    32.00    80.26    79.21    78.50
                   15.29%   22.35%            4.75%    7.08%             1.31%    2.19%
1.50      19.33    16.86    16.02    34.44    33.85    32.06    80.26    79.38    78.86
                   12.77%   17.11%            1.71%    6.92%             1.09%    1.75%
2.00      19.33    17.27    16.23    34.44    33.92    33.75    80.26    79.54    79.50
                   10.67%   16.06%            1.51%    2.00%             0.89%    0.95%

Table S-2: Average cost and percentage cost reduction (PCR) for systems with IDD (b = 50). The columns labeled “k = 0” show the average cost for systems without ADI.


            ρ = 0.4                    ρ = 0.6                    ρ = 0.8
ν11 = ν21   n=0      n=1      n=2      n=0      n=1      n=2      n=0      n=1      n=2
0.41        25.07    24.19    23.97    -        -        -        -        -        -
                     3.49%    4.39%
0.61        25.07    21.10    20.21    46.38    43.51    42.69    -        -        -
                     15.84%   19.39%            6.21%    7.96%
0.81        25.07    20.35    19.06    46.38    39.29    37.69    107.24   98.73    96.74
                     18.80%   23.96%            15.30%   18.74%            7.94%    9.79%
1.00        25.07    20.95    19.10    46.38    40.12    36.40    107.24   93.82    87.33
                     16.41%   23.79%            13.51%   21.53%            12.52%   18.57%
2.00        25.07    23.55    21.68    46.38    44.74    42.39    107.24   105.91   104.40
                     6.05%    13.51%            3.53%    8.62%             1.25%    2.66%
5.00        25.07    24.42    23.98    46.38    45.81    45.40    107.24   106.90   106.66
                     2.57%    4.35%             1.25%    2.11%             0.32%    0.54%
10.00       25.07    24.75    24.48    46.38    46.11    45.88    107.24   107.09   106.95
                     1.26%    2.33%             0.59%    1.09%             0.15%    0.27%

Table S-3: The case of SDD: Average cost and percentage cost reduction between systems with ADI and systems with no ADI (b = 100) with ki = 1. When ρ > νi1 the leadtime system is not stable.


References

Bertsekas, D. P., 2001. Dynamic Programming and Optimal Control, Volume 2, 2nd Edition. Athena Scientific, Belmont, MA.

Bremaud, P., 1999. Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Springer-Verlag, New York.

Guo, X., Hernandez-Lerma, O., 2003. Continuous-time controlled Markov chains with discounted rewards. Acta Applicandae Mathematicae 79, 195–216.

Puterman, M. L., 1994. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, New York.

Sennott, L. I., 1999. Stochastic Dynamic Programming and the Control of Queueing Systems. John Wiley & Sons, New York.
