An options-based approach to coordinating distributed decision systems



European Journal of Operational Research 240 (2015) 706–717


journal homepage: www.elsevier.com/locate/ejor

Decision Support

An options-based approach to coordinating distributed decision systems

http://dx.doi.org/10.1016/j.ejor.2014.05.037
0377-2217/© 2014 Elsevier B.V. All rights reserved.

⇑ Corresponding author. Tel.: +1 (732)923 4642; fax: +1 (732)263 5518.
E-mail addresses: [email protected] (D.R. Ball), [email protected] (A. Deshmukh), [email protected] (N. Kapadia).

Daniel R. Ball a,⇑, Abhijit Deshmukh b, Nikunj Kapadia c

a Department of Management and Decision Sciences, Leon Hess Business School, Monmouth University, 400 Cedar Avenue, West Long Branch, NJ 07764-1898, USA
b School of Industrial Engineering, Purdue University, West Lafayette, IN 47907, USA
c Department of Finance & Operations Management, Isenberg School of Management, University of Massachusetts, Amherst, MA 01003, USA

Article info

Article history:
Received 5 June 2012
Accepted 22 May 2014
Available online 17 July 2014

Keywords:
Distributed decision making
Multi-agent systems
Uncertainty modeling
Real options
Risk management

Abstract

Engineering and operations management decisions have become increasingly complex as a result of recent advances in information technology. The increased ability to access and communicate information has resulted in expanded system domains consisting of multiple agents, each exhibiting autonomous decision-making capabilities, with potentially complex logistics. Challenges regarding the management of these systems include heterogeneous utility drivers and risk preferences among the agents, and various sources of system uncertainty. This paper presents a distributed options-based model that manages the impact of multiple forms of uncertainty from a multi-agent perspective, while adapting as both the stream of information and the capabilities of the agents are better known. Because the actions of decision makers may have an impact on the evolution of underlying sources of uncertainty, this endogenous relationship is modeled and a solution approach developed that converges to an equilibrium system state and improves the performance of agents and the system. The final result is a distributed options-based decision-making policy that both responds to and controls the evolution of uncertainty in large-scale engineering and operations management domains.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Technology and operations management decisions have become increasingly complex in recent years. The increased ability to access and communicate information has resulted in expanded system domains consisting of multiple agents (decision-makers), with each exhibiting autonomous decision-making capabilities. Although these systems may be able to perform and achieve benefits beyond the capabilities of smaller networks (Weiss, 1999), they present the following challenges. First, the agents may possess individual utility drivers, thus creating an environment that must balance the goals of multiple agents and the overall system. Because an agent may exhibit a finite performance capacity, choices must be made regarding how to most effectively utilize each agent. These decisions are further complicated by both continuous forms of uncertainty that provide a constant source of variability, and discrete, sometimes rare, events that may dramatically disrupt the system processes. Furthermore, the actions of agents may have an endogenous impact on system characteristics, performance capacity, and underlying sources of uncertainty. Because these systems possess dynamic properties, continuous information updating creates a decision timing issue for the agents where the benefits and costs of waiting for additional information must be considered. This paper develops a distributed decision-making approach that incorporates decision flexibility to manage the impact of multiple uncertainties from a multi-agent perspective, and improve both agent and system performance.

Consider the increase in technological innovation and the resulting impact on operations management. As information and communication technologies have advanced, the competitive landscape has also expanded, resulting in a global network of suppliers, producers, and distributors. Many of these supply chain entities are contract manufacturers and compete with other firms for their place in the chain. Because it is imperative to operate in a lean manner, costs must be kept to a minimum and resources allocated to yield a maximum utility. This utility can be viewed in two ways. From a systems point of view, production of the final good or service needs to be performed and delivered to the end customer in the most cost-effective manner. However, each entity in the supply chain may be considered an agent (e.g., an individual supplier, producer, distributor) that makes its decisions based on satisfying its own economic utility drivers.

The resulting decisions that each individual agent makes may be further complicated by various sources of uncertainty. For example, production facilities typically exhibit a finite resource capacity and challenging decisions must be made when devising operating schedules and determining which customer orders to satisfy. These decisions are complicated by both the uncertain nature of the resource performance and customer order rates. A producer may engage in a contract with an initial customer that utilizes its full capacity, but is then unable to satisfy an additional customer order that may yield even greater profits, thus jeopardizing its competitive position in the supply chain. From the customer's perspective, resource selection and contracting decisions may be based on such factors as a desired delivery date or price. These contracted conditions, however, may ultimately be subject to the performance uncertainty of the production resource. If the resource can produce with lower levels of process variability, then it may be able to quote better delivery dates and lower prices. Therefore, both the customer and producer benefit by more stability in the resource operations.

When devising a decision-making policy for this type of system, it is important to not only consider exogenous factors, but also recognize any endogenous parameter relationships. In this example, a possible endogenous relationship may exist between production stability and the types of customer orders that the resource processes. By recognizing this relationship, the resource may be used to process customer orders that encourage performance stability. These particular orders may be ones that are easier to process or provide smoother scheduling with fewer disruptions and reduced setup costs. The resulting improvements to operational stability may therefore yield reductions in future production uncertainty.

In order to effectively manage these types of systems, a decision-making approach must be employed that retains flexibility as the stream of new information arrives, while hedging the impact of uncertainty for multiple agents and incorporating the endogenous relationship between agent decisions, system responses, and future performance capacity. Without flexibility, each agent must act immediately based on the information currently available. Because there is now an increasing amount of real-time system information available, it may be very beneficial to postpone the timing of any decision and re-evaluate based on updated system states. The approach developed in this paper is based on the concept of dynamic flexibility using options-based decision policies. It should be noted that this model does not utilize the Black-Scholes options pricing model (Black & Scholes, 1973) or any other closed form solution method that is commonly used in the literature. In an effort to provide insights into systems that do not meet the strict assumptions of these closed form solution approaches, a depiction of a multi-agent resource allocation system was designed and a numerical solution is presented in this paper. This model is tested to evaluate the impact of managing uncertainty from a distributed decision-making perspective with respect to improvements in both agent utilities and system properties while adhering to limited and finite capacity resource constraints.

This paper is organized as follows. Section 2 provides an overview of literature encompassing the three primary areas used as the basis of this paper: real options, options exercise games, and risk management in engineering and operational systems. In Section 3, a specific multi-agent system is defined and a distributed options-based model is developed that includes both agents' perspectives while accounting for the endogenous relationship between agent decisions, system performance, and the impact on the underlying source of system uncertainty. This distributed options-based policy is tested numerically in Section 4 and concluding remarks are presented in Section 5. A table giving descriptions of the variables used in the mathematical formulations is presented in Appendix A (online only). Specific model details pertaining to the multi-agent case study consisting of a task and resource agent are included in Appendix B and C (online only), respectively.

2. Background information

The model developed in this paper encompasses three primary research areas: (1) the extension of options-based decision theory into an engineering and operations management domain; (2) the impact of competitive agents on the ultimate decision-making process in an options framework; and (3) current approaches toward managing the impact of uncertainty in large-scale engineering and operational systems. The first area demonstrates the flexibility of utilizing options pricing concepts in non-financial domains; the second area introduces a relatively new area of research referred to as "options exercise games"; and the final area provides a domain to which the findings of this paper may be applied. This section reviews the research in these areas that is most relevant to the scope of this paper.

Since the seminal work of Black and Scholes (1973) and of Merton (1973), options pricing concepts have been used extensively to value financial assets, with the theory then extended to real assets (Myers, 1977) and commonly referred to as "real options". Some of the common types of real options used in operations and project management decisions include the following: the option to defer investment decisions; a time-to-build option for staged investments; the option to alter the scale of operations through expansion, contraction, shutdowns, or restarts; the option to abandon operations or a project; the option to switch outputs, inputs, or operating modes; growth options; and complex multiple interacting options (Trigeorgis, 1996). Because managerial and operational decisions may be considered at least partially irreversible, there is a value to waiting for more information about the system prior to making a decision. Detailed accounts of both the theory and applications of real options are provided in Dixit and Pindyck (1994) and Trigeorgis (1996). Similar to these applications, the focus area of this paper is with decisions pertaining to real assets (e.g., production capacity). However, this paper further extends these existing options-based concepts into a domain consisting of multiple decision makers.

In the supply chain situation described in Section 1, each agent may interact in a competitive manner to maximize its own utility. Consequently, the concept of options exercise games has recently evolved and provides an intersection of traditional real options analysis with game theory. In many situations, the ultimate utility gained from an investment in a real asset is affected by the investment strategies and actions of other agents in the system. This characteristic of real options differs from financial options in many application domains. For example, real assets with respect to real estate development often have a finite elasticity of demand; developers may have finite capacities; there is a limited supply of options available; and there is typically less than perfect competition among developers (Williams, 1993). Financial assets, however, do not exhibit these restrictions and the exercise of a financial option by any agent does not change the characteristics of the security or the option (Grenadier, 2002). In order to account for the impact of other agents' exercise strategies on the underlying value of a real option and the subsequent optimal exercise policy for a given agent, game theoretic principles must be included in the analysis. The model presented in this paper incorporates game theoretic options to account for the changes in system characteristics and agent payoffs that occur due to the presence of other agents. Because resource capacities are finite, a task agent's allocation option exercise policy must account for the expected actions of other task agents competing for the resource's services. This scenario provides an extended domain for the application of options exercise games concepts when allocating resources in large-scale engineering and operational systems.

Traditional work in large-scale engineering and operational systems, such as production systems and supply chain networks, has focused primarily on improving the overall cost-efficiency of the system. Whereas these models provide an efficient and low cost approach to managing the system, there is little flexibility to respond in case any portion of the system does not perform as originally planned. Due to the cascading nature of these systems, the impact of even one portion of a network that does not operate as planned may propagate throughout the entire system, and thus have a dramatic effect on the overall performance of the system. Therefore, a disruption to an inflexible system may result in significant and widespread financial and performance costs.

Fig. 1. Overview of dual policy dynamics with endogenous system characteristics in a distributed decision-making domain.

The ability for a network to create flexibility and respond to uncertainties may allow businesses to realize efficient current performance and potentially be able to quickly reconfigure and benefit from new unforeseen market opportunities. As evident from the terrorist events of September 11, 2001, the United States northeast power outage of August 2003, the aftermath of Hurricane Katrina in August 2005, the Deepwater Horizon oil spill of April 2010, the Tohoku earthquake and subsequent tsunami of March 2011, and Hurricane Sandy in October 2012, this flexibility will also help to build more secure and resilient supply chain networks from a national security and emergency response standpoint (Rice & Caniato, 2003a; Rice & Caniato, 2003b). Snyder and Daskin (2005) manage system risk through strategic facility location planning. This type of proactive approach is used to incorporate risk management in the planning stage as opposed to being unprepared and reacting to failures during system operations. A review of some more common day-to-day sources of operational risk is presented in Chopra and Sodhi (2004) and includes major disruptions, production delays, information system risks, forecasting risks, intellectual property risks, procurement risks, inventory risks, and capacity risks. The attention that has been drawn to managing the impact of uncertainty in large-scale systems may be applied to both continuous operational uncertainties and the discrete disruptions that may be rare, but result in major impact.

Competition and game theory have been incorporated into the traditional supply chain models for systems with multiple suppliers. Elmaghraby (2000) provides a literature review regarding the competition for supply contracts and sourcing policies. Minner (2003) provides a good review of multiple supplier inventory models to mitigate the effects of shortage situations, as well as related inventory problems that arise in the areas of reverse logistics and multi-echelon systems. Babich, Burnetas, and Ritchken (2007) and Babich (2006) utilized both game theory and real options to analyze contracting for a single retailer and multiple suppliers when subject to supplier default risk during their production lead times. Ball and Deshmukh (2013) developed a cooperative real options approach utilizing an incentive (or disincentive) pricing scheme to hedge systemic risks when coordinating multi-agent decisions in supply chain and resource allocation systems.

This paper builds upon these concepts to provide two key contributions to this area: (1) flexibility in distributed decision-making from multi-agent perspectives and (2) endogeneity between agent actions and an underlying source of uncertainty. Therefore, this paper extends the current research to develop a general distributed decision-making model that manages the impact of uncertainty from multiple agent perspectives in such a way that both agents benefit (e.g., both the retailer and supplier in a supply chain network), while accounting for endogenous parameter modeling that yields overall system improvements. This general model can be used to incorporate distributed decision policies and risk management in a wide range of resource-constrained domains, including engineering and system design processes, new technology development, enterprise systems, healthcare systems, homeland security, emergency preparedness and response, global operations, and supply chain management.

3. Modeling distributed decision-making under uncertainty

3.1. System overview

For the purpose of this paper, the following model is constructed to manage the impact of uncertainty from the perspectives of two agents: a task and resource agent. The dynamics of this general system are explained throughout Section 3 and illustrated in Fig. 1. Appendix A (online only) has a summary table of descriptions of the notations used in the paper. In this system, the resource performs operations on a task to produce an output (e.g., product for resale, consulting service). Tasks arrive at the system with the agent's goal of allocating them to a resource for processing and an ultimate utility. The resource may initiate processing a task at set discrete time slots. At any given time slot, the resource presents its current production capacity in the form of a conditionally guaranteed process rate, which would consequently assure the task agent of a guaranteed delivery time and cost for the finished product. Because production systems commonly exhibit continuous random fluctuations in process times for various task types and sizes, it is assumed that this associated resource process rate evolves stochastically according to a known distribution (e.g., the square-root mean reversion process). The decision for the task agent to request processing based on the terms of this current guaranteed process rate and give up the option to wait for a more beneficial rate (and associated completion time and cost) in the future is framed as an irreversible investment decision by the task agent and evaluated using options-based solution techniques.

Additional tasks randomly arrive to the system, thus creating a dynamic environment that may limit the choices available to the task agent. Task types are heterogeneous and provide varying levels of utility benefit to the resource production system. Because the resource is of finite capacity and can only process one task at a time, it will consequently maintain a preference level for each task type based on the task's respective utility contribution to the system. The current task agent recognizes this inherent competition for the resource's capacity, and therefore factors in the risk associated with postponing the allocation decision and subsequently experiencing preemption from a preferred task that may arrive to the system prior to making this decision. If preemption occurs, the task may not be eligible to begin processing until the next available time slot. Therefore, flexibility in decision timing from the perspective of the task agent may be beneficial with respect to ensuring a more economic and timely delivery of the final product; however, waiting too long may reduce the availability of the resource due to the potential arrival of competing, and preferred, tasks.
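The preemption risk described above can be made concrete with a minimal sketch. It assumes, as Section 3.2 does, that tasks arrive according to a Poisson process, and it adds one illustrative assumption that is not in the paper: a fixed fraction of arriving tasks (`preferred_fraction`, a hypothetical parameter) is preferred by the resource over the current task. Under that assumption the preferred arrivals form a thinned Poisson process, so the probability of at least one preempting arrival during a wait follows directly.

```python
import math


def preemption_probability(arrival_rate, preferred_fraction, wait_time):
    """Probability that at least one preferred task arrives during the wait.

    arrival_rate       -- Poisson arrival rate of all new tasks
    preferred_fraction -- hypothetical share of arrivals the resource prefers
    wait_time          -- how long the task agent postpones its request
    """
    # Thinning a Poisson process keeps it Poisson with a scaled rate.
    preferred_rate = arrival_rate * preferred_fraction
    # P(no preferred arrival in the wait) = exp(-rate * time).
    return 1.0 - math.exp(-preferred_rate * wait_time)


if __name__ == "__main__":
    for wait in (0.0, 0.5, 1.0, 2.0):
        print(wait, round(preemption_probability(2.0, 0.25, wait), 4))
```

The probability rises monotonically with the length of the wait, which is the cost side of the timing trade-off: any gain from a better future process rate has to outweigh this growing chance of losing the time slot.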

Once the task agent makes its decision to accept the conditions of a given process rate and request processing from the resource, the resource must then decide whether to allow this current task to exercise its allocation option or deny processing in case a more beneficial task arrives in the future. The resource agent recognizes an endogenous relationship between its actual realized system performance costs (e.g., due to idleness and overtime factors) and the ultimate stability of the production operations, and the respective decision policy is constructed accordingly. Because of this relationship, the resource agent aligns its utility driver to one that prefers to process tasks that will have a positive effect on its future stability and performance capabilities. This type of situation may occur when the starting and stopping of production is minimized, and setups are therefore held to a minimum. Therefore, more stable production operations may yield a reduction in the overall uncertainty of the system. In this particular system, reduced performance uncertainty will lower the risk posed by the stochastic resource process rate and experienced by both the task and resource agents. So, from the resource agent's viewpoint, flexibility in its decision timing will allow it to better control both the immediate utility gained from processing various task types as well as future production stability.

A major contribution of this paper is that this particular system and model utilize a dual options-based policy where both the task and resource agents value a similar option, but from two distinct viewpoints. The task agent considers the option to request processing at the current guaranteed rate (and corresponding time and cost conditions) or postpone this decision until a future time period. The resource agent, however, considers the option of either accepting a task's request for processing, or instead postponing this processing decision to evaluate future opportunities that may be more beneficial to its specific utility function. Thus, each agent essentially possesses an option of either enacting a processing decision or waiting for more information regarding the system, but from a different perspective.

Furthermore, the underlying sources of uncertainty are viewed differently by the task and resource agents. The task agent considers the sources of uncertainty to be exogenous and, therefore, makes its decisions in an effort to respond to the potential impacts posed by these uncertainties (i.e., the stochastic process rate and the discrete arrivals of potentially preempting tasks). However, the resource agent considers the future task arrivals to be exogenous, but realizes that there is an endogenous relationship between its decisions and the future evolution of the system's uncertainty and, therefore, makes its decisions in a way that controls this evolution by carefully selecting tasks that encourage system stability. Therefore, the dual policy extracts value from the system by making decisions in a manner that yields a more beneficial evolution of system uncertainty. The reader may refer again to the dynamic process illustrated in Fig. 1 to assist with this understanding. This model now both responds to the impact of multiple forms of uncertainty in the system and from the perspectives of multiple agents, and also manages the future propagation of an underlying source of process uncertainty. The aim of this paper is to demonstrate both the model development and benefits using a general theoretical multi-agent system; however, the theory may be applied to other domains exhibiting this type of agent and system behavior.

3.2. System uncertainty, entities, and state dynamics

The model system environment has been designed to allow for task type heterogeneity and support the decision-making process for both the task and resource agents with multiple forms of uncertainty and dynamic states. Multiple state parameters are required to characterize the system environment at any given time and support the complex decision-making processes.

Assume that $R(t)$ is the known instantaneous process rate of the resource at time $t$. Processing that occurs at the resource during the time interval $(t, t + dt)$ is then modeled using this known process rate (i.e., $R(t)$) and the changes in this process rate that are expected to occur over this interval (i.e., $dR$). It is thus assumed that the process rate of a resource evolves stochastically over a time interval of length $dt$ according to the following square-root, mean-reversion process:

$$dR = \eta(\bar{R} - R)\,dt + \sigma\sqrt{R}\,dz \qquad (1)$$

where $\eta$ is the speed of reversion towards the mean, $\bar{R}$ is the mean level of the process rate, $\sigma$ is the variance parameter (i.e., the standard deviation of the change in process rate for the resource), and $dz$ is the increment of a Wiener process. Therefore, the resource processing rate may fluctuate up or down, but tends to revert toward the mean value $\bar{R}$ either through natural means or with some level of management control (e.g., rescheduling of workers). It is assumed that $\eta \geq 0$, $\bar{R} \geq 0$, and the initial process rate value is a non-negative constant. For general notational purposes, let $\tilde{f}(R)$ denote any specific parameters that define the process type (i.e., $\tilde{f}(R) = (\mu, \eta, \sigma, \bar{R})$ for the square-root mean reversion process of Eq. (1)). This autoregressive square-root process has been used in the financial literature to model the movement of short-term interest rates (Cox, Ingersoll, & Ross, 1979; Cox, Ingersoll, & Ross, 1985).
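To illustrate how a process rate following Eq. (1) behaves, the following is a minimal simulation sketch. It uses an Euler discretization with truncation at zero, which is a common generic scheme chosen here for illustration and is not the numerical method of the paper. The function name, argument names, and parameter values are all hypothetical.

```python
import math
import random


def simulate_process_rate(r0, speed, mean_rate, vol, dt, n_steps, seed=None):
    """Simulate one path of the square-root mean-reversion process of Eq. (1).

    r0        -- initial process rate (non-negative)
    speed     -- speed of reversion toward the mean
    mean_rate -- long-run mean level of the process rate
    vol       -- variance parameter multiplying sqrt(R) dz
    dt        -- length of one time step
    n_steps   -- number of steps to simulate
    """
    rng = random.Random(seed)
    rate = r0
    path = [rate]
    for _ in range(n_steps):
        # Truncate at zero so the square root is always defined.
        positive_part = max(rate, 0.0)
        # Wiener increment over dt: normal with mean 0 and std sqrt(dt).
        dz = rng.gauss(0.0, math.sqrt(dt))
        drift = speed * (mean_rate - positive_part) * dt
        diffusion = vol * math.sqrt(positive_part) * dz
        # Illustrative choice: report a non-negative rate at every step.
        rate = max(rate + drift + diffusion, 0.0)
        path.append(rate)
    return path


if __name__ == "__main__":
    path = simulate_process_rate(2.0, 0.5, 1.0, 0.3, 0.1, 50, seed=7)
    print([round(value, 3) for value in path[:5]])
```

With the variance parameter set to zero the path decays deterministically toward the mean level, which makes the mean-reverting drift easy to check in isolation before noise is added.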

New tasks arrive to the system according to a Poisson process of rate $\lambda$. Let $X \in \mathbb{R}$ denote the task type and, more specifically, $X_i$ is the task type for task $i$ and $\tilde{f}(X)$ is the appropriate distribution defining the system's heterogeneous task types. Both the revenue utility and size of each task are distinguished for each specific task type $X$ and determined from their respective conversion functions, $\hat{V}(X)$ and $\hat{x}(X)$. Therefore, $\hat{V}_i = \tilde{f}_{\hat{V}_i}(X_i)$ and $\hat{x}_i = x_i(t_{i0}) = \tilde{f}_{\hat{x}_i}(X_i)$ are the respective general conversion functions for the revenue utility and initial size for each task $i \in A$, where $A$ is the set of all tasks that arrive to the system.

Once a new task has arrived, the agent-specific information for task \(i\) at any time \(t\) is characterized by \(!_i(t)\) and may include the task type \(X_i\), the number of units remaining to be processed \(x_i(t)\), and the expiration time \(T_i\) (i.e., the final point in time that the task may be allocated to the resource for processing). To accommodate more detailed systems, let \(N_c = N_c(t)\) represent the state of each entity \(c\) in the system at time \(t\). These entities may include the resource or various queue positions and \(c \in C\), where \(C\) is the set of all entities. Given the system dynamics, let \(\dot{N}\) represent the rate of expected change in entity states over the time period \((t, t + dt)\). Thus, \(\dot{N}\) is a function of both the new task arrival rate \(k\) and the resource process rate standard deviation \(r\); and the overall expected change in \(N\) over \((t, t + dt)\) is

\[ dN = \dot{N} \cdot dt \tag{2} \]

Because an agent's decision policy may be dependent on the state of multiple variables, let \(Y(t)\) be defined as the collection of any pertinent state variables at time \(t\). For this system, let \(Y(t) = (R(t), !(t), N(t))\). This system state structure allows for the model to incorporate all of the pertinent real-time information for each agent and system entity.

3.3. Task agent policy

Task agents develop their options-based task allocation exercise policy to determine the optimal decision based on the current and expected future states of the system. This policy is constructed to hedge the impact of future state uncertainty due to both (1) the stochastic resource process rate \(R(t)\), and (2) the threat of preemption from a higher valued task that may arrive during the time period \((t, t + dt)\). This flexible decision policy is based on the appropriate termination value \(X[Y(t), t]\), continuation value \(f[Y(t), t]\), and resulting option value when using the Bellman equation (see Eq. (3)).

Now that the sources of system uncertainty have been identified, it is important to demonstrate how these fluctuations impact the behavior of the task agent. Each task \(i\) enters the system at time \(t_0^i\) with the task agent's goal of having it processed by the resource for an ultimate terminal payoff utility. This task allocation opportunity value is now written as \(F[R(t), N(t), t]\) and may be evaluated using the Bellman equation

\[ F[R(t), N(t), t] = \max\left\{ X[R(t), N(t), t],\; \frac{1}{1 + q\,dt}\, E[F(R + dR, N + dN, t + dt) \mid R, N] \right\} \tag{3} \]

where \(R\) and \(N\) change to \((R + dR)\) and \((N + dN)\), respectively, over the small time interval of length \(dt\) as given by Eqs. (1) and (2), respectively. The first term on the right-hand side of Eq. (3) represents the termination value and the second term represents the continuation value when discounted at the agent-specified discount rate \(q\).

Because the net utility value earned by the task agent is a function of the time that it takes the resource to process the task, the utility function has been generally specified as \(X[X_i, R(s), s]\). Upon exercise of the option, the task agent is guaranteed the following net utility value associated with the process rate at time \(s\)

\[ X[X_i, R(s), s] = V[X_i, R(s)] - \int_{t_0^i}^{s} w(s)\,ds \tag{4} \]

where \(X[X_i, R(s), s]\) is the net payoff if the option is exercised at time \(s\), for a guaranteed rate \(R(s)\), and task type \(X_i\); \(V[X_i, R(s)]\) is the net known utility revenue gained by this agent; and \(\int_{t_0^i}^{s} w(s)\,ds\) are any applicable penalty waiting costs incurred prior to the exercise of the option. Note that Eq. (4) can be written in a more general form with \([\cdot] = [Y(t), t]\) or \(X = X_i\).

Because the ultimate payoff utility is a function of the guaranteed process rate, there is an incentive for the task agent to wait for a more beneficial process rate that would lead to lower processing costs and more timely delivery (i.e., thus yielding a higher value of \(X[X_i, R(s), s]\)), but at the penalty of forgoing other production opportunities or potential risk of being preempted by the future arrival of tasks deemed by the resource agent to be of a higher priority. The goal for the task agent is to determine the critical process rate (designated as \(R^*(t)\)), for any given amount of time in the system, where it is optimal to exercise the allocation option and request processing as opposed to holding it and waiting for a better guaranteed process rate.

The governing partial differential equation for the task allocation opportunity value is now constructed using the approach presented in Dixit and Pindyck (1994). For notational simplicity, the derivation may represent \(F[R(t), N(t), t]\), \(R(t)\), and \(N(t)\) as \(F\), \(R\), and \(N\), respectively. If the option is not exercised at any given point in time \(t\), then the change in the value of the task allocation opportunity over the time interval of length \(dt\) is

\[ dF = \frac{\partial F}{\partial R}\,dR + \frac{\partial F}{\partial N}\,dN + \frac{\partial F}{\partial t}\,dt \tag{5} \]

Because \(R(t)\) is an Ito process, Ito's Lemma may be used to determine the differential in the option value (i.e., \(dF\)) over \(dt\). Using Ito's Lemma, substituting Eqs. (1) and (2), recognizing that \(E[dF]\) over \(dt\) must earn the agent-specified discount rate \(q\), and assuming that \(R\) lies within the continuation region, Eq. (5) may be modified to yield the following partial differential equation that is satisfied by \(F[R(t), N(t), t]\):

\[ \tfrac{1}{2} r^2 R\,F_{RR} + g(\bar{R} - R)\,F_R + F_t + \dot{N}\,F_N - qF = 0 \tag{6} \]

subject to the following boundary conditions:

\[ F[R(T), N(T), t = T] = f \tag{7} \]
\[ F[R^*(t), N(t), t] = X[R^*(t), N(t), t] \tag{8} \]
\[ F_R[R^*(t), N(t), t] = X_R[R^*(t), N(t), t] \tag{9} \]

The terminal boundary condition is presented in Eq. (7) and indicates that the task agent may be subject to a penalty cost \(f\) if it allows the option to expire at the maturity date \(T\) (i.e., the task never gets submitted for processing). The boundary conditions of Eqs. (8) and (9) represent the "value-matching" and "smooth-pasting" conditions as described in Dixit and Pindyck (1994). For this paper, the task agent's allocation option value and decision policy were solved using a numerical approximation method presented in Hull and White (1990).

3.3.1. Continuation value function

Whereas the termination value is given by Eq. (4), determining the continuation value is more complicated and must account for each possible system scenario. If a current task \(i\) decides to postpone its allocation decision from time \(t\) until \(t + dt\), the following three possible scenarios may exist:

1. No new task arrival (Case 1). If no new task arrives during \((t, t + dt)\), then the current task \(i\) may either exercise its option at time \(t + dt\) or continue until time \(t + 2dt\). Because tasks arrive according to a Poisson process of rate \(k\), this scenario occurs with a probability of \((1 - k\,dt)\) over the time interval of length \(dt\). The discounted continuation value for this case is

\[ f_1[Y(t), t] = \frac{1}{1 + q\,dt}\, E[F(Y(t + dt), t + dt) \mid Y(t), t] \tag{10} \]

where \(E[F(Y(t + dt), t + dt) \mid Y(t), t]\) is the expected value of the option at time \(t + dt\) given the current system state conditions.

2. Arrival of a lesser (or equal) valued task (Case 2). There may be an arrival of a task \(j\) where \(X_j \le X_i\); the value of this new task does not exceed that of the current one. Assuming that only a higher valued task will cause preemption, the current task \(i\) will then retain the right to either exercise its option at time \(t + dt\) or continue until time \(t + 2dt\). This result is equivalent to the situation where no new task arrives (i.e., Case 1) and occurs with a probability of \(k\,dt \cdot P(X_j \le X_i)\). The discounted continuation value for this case is equivalent to \(f_1[Y(t), t]\) and is rewritten here as

\[ f_2[Y(t), t] = \frac{1}{1 + q\,dt}\, E[F(Y(t + dt), t + dt) \mid Y(t), t] \tag{11} \]

3. Arrival of a higher valued task (Case 3). A task \(j\) may arrive to the system such that \(X_j > X_i\), with an occurrence probability of \(k\,dt \cdot P(X_j > X_i)\). Consequently, this task \(j\) is of higher value and will preempt the current task \(i\) for a time period dependent on \(X_j\) and \(E[R(t + dt) \mid R(t)]\). For this case, the expected continuation value at time \(t + dt\) must be calculated. This expected continuation value is necessary because of the heterogeneity in arriving task types and the duration of any resulting resource blockage is dependent on both the arriving task type and the expected process rate quoted upon arrival. So, the discounted continuation value for this case is written as

\[ f_3[Y(t), t] = \frac{1}{1 + q\,dt}\, E[f_b(t + dt) \mid Y(t), t] \tag{12} \]

where \(E[f_b(t + dt) \mid Y(t), t]\) is the expected overall continuation value due to blocking at time \(t + dt\).

The total expected continuation value is then calculated using

\[ f[Y(t), t] = (1 - k\,dt) \cdot f_1[Y(t), t] + k\,dt \cdot P(X_j \le X_i) \cdot f_2[Y(t), t] + k\,dt \cdot P(X_j > X_i) \cdot f_3[Y(t), t] \tag{13} \]

Therefore, this overall continuation value is based on \(f_1[Y(t), t]\), \(f_2[Y(t), t]\), \(f_3[Y(t), t]\), and the probabilities that each of these state cases occur. (Note that discounting already is accounted for during the calculation of \(f_1\), \(f_2\), and \(f_3\)).
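The weighting in Eq. (13) and the comparison in Eq. (3) can be sketched in a few lines. This is only an illustrative sketch: the function names are hypothetical, the two next-period expectations are assumed to be supplied by the caller (for example, from a backward recursion), and the numerical values are placeholders.

```python
def continuation_value(k, dt, q, p_lower, exp_f_next, exp_fb_next):
    """Total expected continuation value in the form of Eq. (13).

    p_lower     -- P(X_j <= X_i), probability an arrival does not preempt
    exp_f_next  -- E[F(Y(t+dt), t+dt) | Y(t), t], supplied by the caller
    exp_fb_next -- E[f_b(t+dt) | Y(t), t], supplied by the caller
    """
    discount = 1.0 / (1.0 + q * dt)
    f1 = discount * exp_f_next    # Eq. (10): no arrival
    f2 = discount * exp_f_next    # Eq. (11): lesser or equal valued arrival
    f3 = discount * exp_fb_next   # Eq. (12): higher valued (blocking) arrival
    p_arrival = k * dt
    return ((1.0 - p_arrival) * f1
            + p_arrival * p_lower * f2
            + p_arrival * (1.0 - p_lower) * f3)


def option_value(termination, continuation):
    """Bellman comparison of Eq. (3): exercise now or continue."""
    return max(termination, continuation)


# Placeholder inputs (assumed for illustration only).
f = continuation_value(k=25.0, dt=1.0 / 365.0, q=0.05, p_lower=0.5,
                       exp_f_next=3.0, exp_fb_next=2.0)
F = option_value(termination=2.5, continuation=f)
```

When the blocking value equals the non-blocking value, the three cases collapse and the result reduces to the single discounted expectation, which gives a simple consistency check.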

3.3.2. Resource blocking and utilization information

Because the resource only accepts new tasks for processing at discretized time periods, it is important to evaluate any potential resource blocking scenarios when determining this continuation value. Given the assumption that tasks arrive randomly according to a Poisson process of rate \(k\) and, once arrived, the task type is drawn from the distribution \(\tilde{f}(X)\), the expected blocking time for each task type and associated process rate must be determined. This model accommodates heterogeneous task types and incorporates their respective preemption impact by evaluating the blocking time (and discretized time periods) as a function of the process rate \(R(t)\), task type \(X\), and corresponding initial task size \(\hat{x}\).

First, let \(M_{b_{max}} = M_{b_{max}}[\tilde{f}(X), R(s)]\) be the maximum number of blocking periods given that a task of type \(X_{max}(\tilde{f}(X))\) is guaranteed a rate \(R(s)\), and the allowable allocation times are discretized in intervals of length \(dt\). (Note that, for simplicity, the convention used in this paper is to use \(dt = \Delta t\) to represent both continuous and discretized time interval lengths). Then,

\[ M_{b_{max}}[\tilde{f}(X), R(s)] = \left\lceil \frac{\hat{x}_{max}/R(s)}{dt} \right\rceil \tag{14} \]

where \(M_{b_{max}}\) is \(\frac{\hat{x}_{max}/R(s)}{dt}\) rounded up to the nearest integer; \(X_{max} = X_{max}(\tilde{f}(X))\) is the maximum task type given the task type distribution function \(\tilde{f}(X)\), and \(\hat{x}_{max} = \hat{x}_{max}(X_{max})\) is the respective maximum task size. Let \(\bar{X}(t, t + dt)\) indicate the task arrivals during the time period \((t, t + dt)\) (i.e., \(\bar{X}(t, t + dt) = 0\) if no arrival; \(\bar{X}(t, t + dt) = X_j\) if a task \(j\) arrives). Then, let \(P_b(X_i, \tilde{f}(X)) = P(X_j(t + dt) > X_i(t) \mid \bar{X}(t, t + dt) > 0)\) be the probability that task \(i\) is blocked by an arriving task \(j\) at time \(t + dt\), given that task \(j\) arrives during \((t, t + dt)\).

It is now necessary to determine the task type cutoff points that distinguish each blocking period value. If \(M_{b_{max}} = 1\), then blocking occurs for only one period and no additional cutoff information is needed. However, if \(M_{b_{max}} > 1\), then the blocking cutoff task type values for each possible number of blocking time periods \(M_b = M_b[X, R(s)]\) must be calculated, where \(M_b \in (1, M_{b_{max}})\). Let \(X_j(M_b)\) be this cutoff point for an arriving task \(j\) and \(M_b\) blocking periods. Therefore,

\[ M_b[X_j, R(s)] = \left\lceil \frac{\hat{x}_j(X_j)/R(s)}{dt} \right\rceil \tag{15} \]

is the number of blocking time periods rounded up to the nearest integer, and at the exact cutoff point

\[ M_b[X_j, R(s)] = \frac{x_j(X_j)/R(s)}{dt} \tag{16} \]

Then, rearranging Eq. (16) and solving for \(x_j(X_j)\) yields

\[ x_j(X_j) = M_b \cdot dt \cdot R(s) \tag{17} \]

which represents the size of an arriving task \(j\) at the exact cutoff point for a blocking time period of \(M_b\). The corresponding task type at this cutoff point is then evaluated using the inverse function \(X = \tilde{f}^{-1}(x)\) and, more specifically, \(X_j = \tilde{f}^{-1}(x_j)\) for task \(j\).

Finally, the probabilities associated with each possible blocking scenario are determined based on the distribution of arriving task types \(\tilde{f}(X)\) and the system's task preference criteria. Note that this process is illustrated in Ball (2007) and Appendix B (online only) for a specific example system where \(\tilde{f}(X)\) follows a standard uniform distribution (i.e., minimum 0; maximum 1) and the utility relationship is \(U(X_j) > U(X_i)\) if \(X_j > X_i\).
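The blocking-period calculations of Eqs. (14)–(17) can be illustrated with a short sketch. It assumes a standard uniform task type distribution and a linear size conversion (task size equal to task type times a constant `x0`), so that the inverse function is simply the size divided by `x0`. The function names and numerical values are hypothetical, and the probabilities returned are conditional on an arrival having occurred.

```python
import math


def blocking_periods(x_type, x0, rate, dt):
    """Eq. (15): periods blocked by a task of type x_type, rounded up."""
    return math.ceil((x_type * x0 / rate) / dt)


def cutoff_type(m_b, x0, rate, dt):
    """Eq. (17) plus the inverse size function: task type whose processing
    time is exactly m_b periods (assumes size = type * x0)."""
    return m_b * dt * rate / x0


def blocking_probabilities(x_i, x0, rate, dt):
    """P(arriving type > x_i and blocks for exactly m periods), per m,
    for a standard uniform task type (assumption)."""
    m_max = blocking_periods(1.0, x0, rate, dt)  # Eq. (14) with X_max = 1
    probs = {}
    for m in range(1, m_max + 1):
        lo = max(x_i, cutoff_type(m - 1, x0, rate, dt))
        hi = min(1.0, cutoff_type(m, x0, rate, dt))
        if hi > lo:
            probs[m] = hi - lo  # uniform density: probability = interval length
    return probs


# Assumed values chosen so the arithmetic is exact in floating point.
probs = blocking_probabilities(x_i=0.5, x0=2.0, rate=4.0, dt=0.125)
```

For these inputs the largest task blocks for four periods, and only arrivals with type above 0.5 contribute, so the entries sum to the preemption probability of 0.5.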

3.3.3. General solution algorithm

A general solution algorithm has been developed based on this task agent policy using a recursive stochastic dynamic programming approach and is presented in this section. The allocation option for any eligible task \(i\), where \(i \in A\), may be requested for exercise by the task agent at any eligible time during the period \((t_0^i, T_i)\). (Note that an eligible task is defined as the highest valued task in the queueing system based on the resource's task preference; and, the resource is empty and available to accept a task for processing). So, if \(t = T_i\), then the allocation option for task \(i\) has expired and the resulting value is \(F[T_i] = f[T_i] = f_i\), where \(f[T_i]\) is the discounted continuation value at time \(T_i\), and \(f_i\) represents a terminal boundary condition value for task \(i\) (i.e., any penalty cost incurred for not allocating the task for processing). In addition, \(f_{t \to T_i}(t)\) is defined as the discounted continuation value if resource blocking is expected to cause continuation from time \(t\) until at least the expiration date. Because the option expires at time \(T_i\), this situation results in \(f_{t \to T_i}(t) = f[T_i]\) and, as previously noted, \(f[T_i] = f_i\).

At times \(t_0^i \le t < T_i\), an eligible task agent \(i\) may either request to exercise its allocation option or postpone this decision until time \(t + dt\) (unless \(t = T_i - dt\) when continuation will lead to the expiration time \(T_i\)). Exercising the option will yield the termination value \(X[R(s), s, X_i]\) as calculated using Eq. (4).

Continuation will lead to one of the three cases presented in Section 3.3.1, with respective discounted expected continuation values of \(f_1\), \(f_2\), and \(f_3\) evaluated using Eqs. (10)–(12), respectively. The discounted continuation value without any new task arrival (i.e., Case 1, \(f_1\)) and with the arrival of a lower valued, non-blocking task (i.e., Case 2, \(f_2\)) are equivalent and evaluated using Eqs. (10) and (11), respectively (i.e., \(f_1 = f_2 = \frac{1}{1 + q\,dt} E[F(t + dt) \mid Y(t), t]\)).

The continuation value with the arrival of a blocking task (i.e., Case 3, \(f_3\)) requires the calculation of the expected continuation value for all possible blocking scenarios. For this case, it may be possible to be blocked either until the task's expiration date, or for a lesser time period that still allows for the current task \(i\) to be processed prior to its expiration date. The discounted value of continuing until expiration (i.e., expected blocking at least until the task's expiration date) is evaluated from

\[ f_{t \to T_i}(t) = \frac{1}{1 + q\,dt}\, E[f_{t \to T_i}(t + dt)] \tag{18} \]

where \(E[f_{t \to T_i}(t + dt)]\) is the expected continuation-to-expiration value next period (i.e., \(t + dt\)).

It is also possible that blocking will occur for a time period that ends prior to the expiration date and the task agent may retain the opportunity to allocate the task at some time period in the future. In this case, it is necessary to evaluate the continuation value for each possible number of blocking periods. Define \(f_{b,M_b}(t)\) as the continuation value at time \(t\) if the resource is blocked for \(M_b\) periods. Now, the expected overall continuation value with blocking is necessary for use with Eq. (12) because it is not known in advance what type of task will actually arrive and preempt the current task. Because the blocking time is a function of the task type (see Eqs. (15) and (16)), the expected blocking time is required and already factored into the next-period expected continuation value due to blocking (i.e., \(f_{b,M_b}(t + dt)\)).

The continuation value for a given number of blockage periods is dependent on both the number of blocking periods \(M_b\) and the current process rate \(R(t)\). If \(M_b = 1\), then blocking occurs for only one period and the associated continuation value is equivalent to the general continuation value for this period, \(f_{b,M_b=1}(t) = f[Y(t), t]\). However, if \(M_b > 1\), then blocking will occur for more than one period and there are two possible scenarios. First, if \(T_b[X_i, R(s)] \ge T_i - t\), then blocking will occur at least until expiration (where \(T_b[X_i, R(s)]\) is the total blocking time for task \(i\)'s type \(X_i\) and a guaranteed process rate \(R(s)\); and \(T_i - t\) is the time remaining until task \(i\) expires), and \(f_{b,M_b}(t) = \frac{1}{1 + q\,dt} E[f_{t \to T_i}(t + dt)]\). However, if \(T_b[X_i, R(s)] < T_i - t\), then the respective preemption duration will end prior to the task's expiration date and \(f_{b,M_b}(t) = \frac{1}{1 + q\,dt} E[f_{b,M_b - 1}(t + dt)]\), where \(E[f_{b,M_b - 1}(t + dt)]\) is the expected continuation value next period (i.e., \(t + dt\)) if blocking occurs for the remaining \(M_b - 1\) time periods. The overall expected continuation value due to blocking at time \(t\) is then calculated from

\[ E[f_b(t)] = \sum_{M_b = 1}^{M_{b_{max}}} P_b(M_b \mid Y(t)) \cdot f_{b,M_b}(t) \tag{19} \]

where \(P_b(M_b \mid Y(t))\) is the probability of the resource being blocked for \(M_b\) time periods given the system state and dynamic parameters, and \(f_{b,M_b}(t)\) is the respective impact on the continuation value. Using this information and given the current task type \(X_i\) and system dynamics (i.e., \(\tilde{f}(R)\), \(\tilde{f}(X)\), \(k\)), the recursive dynamic programming algorithm may be used to numerically solve for the current period continuation value using Eq. (13).
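The blocking recursion and the weighted sum of Eq. (19) can be sketched as follows. This is a deliberately simplified illustration: expectations over the process rate are collapsed to a single deterministic path, time is indexed by period number, and the general continuation values `f_general` (with the terminal value in the last position) are assumed to be given. All names and numbers are hypothetical.

```python
def value_to_expiry(n, n_expiry, terminal, q, dt):
    """Eq. (18) unrolled: discount the terminal value back to period n."""
    return terminal / (1.0 + q * dt) ** (n_expiry - n)


def blocked_value(m_b, n, f_general, q, dt):
    """Continuation value at period n if the resource is blocked m_b periods.

    f_general[n] is the general continuation value at period n and
    f_general[-1] is the terminal (expiration) value.
    """
    n_expiry = len(f_general) - 1
    if m_b == 1:
        return f_general[n]
    if m_b >= n_expiry - n:
        # Blocking lasts at least until expiration.
        return value_to_expiry(n, n_expiry, f_general[n_expiry], q, dt)
    # Otherwise one period passes and m_b - 1 blocked periods remain.
    return blocked_value(m_b - 1, n + 1, f_general, q, dt) / (1.0 + q * dt)


def expected_blocked_value(probs, n, f_general, q, dt):
    """Eq. (19): probability-weighted continuation value due to blocking."""
    return sum(p * blocked_value(m, n, f_general, q, dt)
               for m, p in probs.items())


# Assumed toy inputs: three periods to expiry, zero terminal value.
f_general = [4.0, 3.0, 2.0, 0.0]
e_fb = expected_blocked_value({1: 0.5, 2: 0.5}, 0, f_general, q=0.0, dt=1.0)
```

With a zero discount rate the toy example is easy to check by hand: one blocked period returns the current value, two blocked periods return the next period's value, and three reach expiration.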

The overall option value at any time may then be evaluated using the Bellman equation presented as Eq. (3), which is restated here as

\[ F[Y(t), t] = \max\{ X[Y(t), t],\; f[Y(t), t] \} \]

where \(X[Y(t), t]\) is the termination value calculated from Eq. (4) and \(f[Y(t), t]\) is the expected continuation value determined from Eq. (13). The optimal options-based task allocation exercise policy for any task \(i \in A\) may then be determined by solving the system of equations and the Bellman equation for \(R^*(t)\) at all times \(t_0^i \le t \le T_i\). The reader is referred to Ball (2007) and Appendix B (online only) for specific details pertaining to the task agent's policy for the example system tested in this paper.

3.4. Resource agent policy

The resource agent constructs its decision policy in such a manner that extracts value from the endogenous nature of its source of performance uncertainty and improves its ability to maintain a stable operational system. For any given time \(t\) that the resource is unoccupied and an eligible task (i.e., the highest valued task currently in the queue) agent \(i\) requests to exercise its allocation option, the resource agent must evaluate its option to either (1) accept the current task for processing, or (2) postpone processing and wait until a future time \((t + dt)\) and observe both the changes in the system state \(Y(t)\) and the associated expected impact any changes have on its expected utilization \(\Delta E[\tilde{U}]\). Note that, although there may be an endogenous relationship between many types of performance metrics, the act of planning for increased utilization is most appropriate for the system explored in this paper. For this reason, only the resource utilization \(\tilde{U}\) (and associated idleness costs) is included in the resource agent's policy. The same theoretical approach, however, may be used to include planning for other performance cost metrics, such as overtime costs.

This resource decision is based on the relationship between the task type \(X\) and the resource's utility preference (i.e., \(U(X_j) > U(X_i)\) if \(X_j > X_i\)) and performance stability (i.e., \(\frac{dr}{dX} < 0\)). It is further assumed that the resource agent bases its decision on the fact that it will either begin processing a task at time \(t\) or time \(t + dt\), but not at both times. This assumption provides a conservative measure of task processing that may be representative of various resource limitations, such as the supply of raw materials, manpower, or excess start-up costs.

Let \(w(t)\) be a resource decision indicator, where \(w(t) = 0\) if the resource postpones processing at time \(t\) and \(w(t) = 1\) if the resource accepts the current task agent's request for processing at time \(t\). The value of \(w(t)\) is chosen by the resource based on a comparison of the expected impact on its utilization if it either accepts or rejects the current task's (i.e., task \(i\)) request for processing. If the resource accepts task \(i\)'s request for processing at time \(t = s\), then \(w(t) = 1\) and

\[ E[\tilde{U}_i(t) \mid w(t) = 1, Y(t)] = \frac{\hat{x}_i(X_i)/R(s)}{T_{b,i}} \tag{20} \]

where \(E[\tilde{U}_i(t) \mid w(t) = 1, Y(t)]\) is the expected utilization impact on the resource over the time period \((t, t + T_{b,i})\) if it accepts the current task \(i\) at time \(t\); \(\hat{x}_i(X_i)\) is the initial size of task \(i\); \(R(s)\) is the guaranteed process rate at time \(s\), and \(T_{b,i} = T_{b,i}[X_i, R(s)]\) is the expected resource utilization (blockage) time when processing task \(i\).

If the resource decides to postpone the processing decision from time \(t\) until \(t + dt\), then \(w(t) = 0\) and the expected utilization impact is designated as \(E[\tilde{U}(t + dt) \mid w(t) = 0, Y(t)]\). This impact is dependent, however, on whether or not a new task arrives during \((t, t + dt)\) as well as the expected system state at time \(t + dt\) (i.e., \(E[Y(t + dt) \mid Y(t)]\)). Let \(\bar{X}(t, t + dt) = 0\) if no new task arrives during \((t, t + dt)\); and \(\bar{X}(t, t + dt) > 0\) if a new task does arrive during \((t, t + dt)\). Given that new tasks arrive to the system according to a Poisson process of rate \(k\), the probability that an arrival occurs during the time interval \((t, t + dt)\) is \(k\,dt\); likewise, the probability that no arrival occurs during \((t, t + dt)\) is \((1 - k\,dt)\). Therefore, \(E[\tilde{U}(t + dt) \mid w(t) = 0, Y(t)]\) is calculated using the expected impact of both scenarios and is expressed as

\[ E[\tilde{U}(t + dt) \mid w(t) = 0, Y(t)] = (1 - k\,dt) \cdot E[\tilde{U}(t + dt) \mid w(t) = 0, Y(t), \bar{X}(t, t + dt) = 0] + k\,dt \cdot E[\tilde{U}(t + dt) \mid w(t) = 0, Y(t), \bar{X}(t, t + dt) > 0] \tag{21} \]

where \(E[\tilde{U}(t + dt) \mid w(t) = 0, Y(t), \bar{X}(t, t + dt) = 0]\) and \(E[\tilde{U}(t + dt) \mid w(t) = 0, Y(t), \bar{X}(t, t + dt) > 0]\) represent the impact on the expected resource utilization given that the resource postpones processing at time \(t\) and there are either no new task arrivals or a task arrival, respectively, during the time period \((t, t + dt)\).

The resource agent evaluates both the system impacts if it were to accept or postpone the processing decision (\(E[\tilde{U}_i(t) \mid w(t) = 1, Y(t)]\) and \(E[\tilde{U}(t + dt) \mid w(t) = 0, Y(t)]\), respectively) and selects the decision indicator such that

\[ w(t) = \begin{cases} 0, & \text{if } E[\tilde{U}_i(t) \mid w(t) = 1, Y(t)] < E[\tilde{U}(t + dt) \mid w(t) = 0, Y(t)], \text{ or} \\ 1, & \text{if } E[\tilde{U}_i(t) \mid w(t) = 1, Y(t)] \ge E[\tilde{U}(t + dt) \mid w(t) = 0, Y(t)] \end{cases} \tag{22} \]

So, the resource agent will accept the task agent's request to exercise its allocation option provided that the expected impact on its utilization is non-negative. The reader is referred to Ball (2007) and Appendix C (online only) for specific details pertaining to the resource agent's policy for the example system tested in this paper.
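The accept-or-postpone comparison of Eqs. (20)–(22) can be sketched as three small functions. The function names and input values are hypothetical, and the two conditional utilization expectations under postponement are assumed to be provided by the caller rather than derived here.

```python
def utilization_if_accept(size, rate, blocking_time):
    """Eq. (20): processing time divided by the blockage time."""
    return (size / rate) / blocking_time


def utilization_if_postpone(k, dt, u_no_arrival, u_arrival):
    """Eq. (21): weight the two arrival scenarios over (t, t + dt)."""
    return (1.0 - k * dt) * u_no_arrival + k * dt * u_arrival


def resource_decision(u_accept, u_postpone):
    """Eq. (22): return 1 to accept the request, 0 to postpone."""
    return 1 if u_accept >= u_postpone else 0


# Assumed values for illustration only.
u_accept = utilization_if_accept(size=1.0, rate=4.0, blocking_time=0.25)
u_postpone = utilization_if_postpone(k=25.0, dt=0.01,
                                     u_no_arrival=0.8, u_arrival=1.0)
w = resource_decision(u_accept, u_postpone)
```

Ties are resolved in favor of acceptance, which matches the weak inequality in the second branch of Eq. (22).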


3.5. System performance, endogeneity, and equilibrium

Although both the task and resource agents make their decisions to hedge the impact of system uncertainty, this model also accounts for the actual costs that occur once these decisions have been made and the system evolves. Thus, although a task agent exercises its allocation option and is assured a guaranteed process rate \(R(s)\) and associated delivery time, the actual time that the task will be completed is still unknown due to processing uncertainty. Consequently, various system costs will also be unknown (i.e., the resource idleness costs or overtime costs incurred to assure that the quoted delivery date is satisfied). As the system process capacity evolves according to the distribution of \(R(t)\), the task request will be processed and these system performance costs will be realized.

Let the total actual performance cost (designated as \(H\)) experienced by the system for a time horizon \((t_0, t)\) be represented by

\[ H((t_0, t) \mid Y(t_0, t)) = \sum_{z = 1}^{Z} h_z(t_0, t) \tag{23} \]

where \(h_z(t_0, t)\) is the performance cost metric \(z\) (e.g., idleness, overtime) accumulated over this time range \((t_0, t)\); \(z = \{1, \ldots, Z\}\), and \(Z\) is the total number of performance cost measures.

It should now be recognized that there may be an indirect relationship between these system performance costs and the future stability of the production system. A stable system in this case may be one that consists of lowered and/or more constant levels of production rate uncertainty. It may be noted that each of these costs does not merely represent an exogenous effect of the process uncertainty, but also possesses a direct impact on the evolution of this uncertainty.

Therefore, it may be deduced that some systems will yield more stable production systems if these performance costs are reduced (i.e., \(\frac{dr}{dH} > 0\)). The recognition of this relationship provides the resource with a means of basing its processing decisions on their expected impact to future production stability, and allowing for a policy that both responds to, and controls, the underlying source of system risk. Given the relationship between task size and task type (i.e., \(x(X)\)), and the nature of the system, it may be observed that higher valued tasks will reduce the system performance costs (i.e., \(\frac{dH}{dX} < 0\)) and result in more stable production systems (i.e., \(\frac{dr}{dX} < 0\)). Therefore, it may be assumed for this system that \(r\) is monotonically related to both \(H\) and \(X\) (i.e., \(r\) is monotonically increasing with respect to \(H\) and decreasing with respect to \(X\)). This relationship supports the notion that the resource agent will prefer higher valued tasks; thus, \(U(X_j) > U(X_i)\) if \(X_j > X_i\).

In order to model this endogenous impact between system performance costs and \(r\), let \(n\) denote a time interval index such that \(H_n = H((t_{n,0}, T_n) \mid Y(t - dt, t))\) and \(t_{n,0} = T_{n-1}\). Based on the understanding that the system's current performance uncertainty may be a function of its recent stability performance (i.e., \(r_n = f_n(H_{n-2}, H_{n-1}, r_{n-1})\)), define the current level of uncertainty to be

\[ r_n = c(H_{n-2}, H_{n-1}) \cdot r_{n-1} \tag{24} \]

where \(c(H_{n-2}, H_{n-1})\) is a specified multiplicative factor. Thus, the system's current and future levels of operational stability are based on previous periods' performance metrics. This linkage provides for the recognition that the agent decisions may be made to both respond to, and control the future propagation of, system uncertainty and corresponding risk.
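The update of Eq. (24) can be illustrated with a short sketch. The paper leaves the multiplicative factor as a specified function, so the power-ratio form used below is purely a hypothetical choice that respects the assumed monotonic relationship (uncertainty rises when performance costs rise); the function names, the exponent `alpha`, and the numbers are all assumptions.

```python
def multiplicative_factor(h_prev2, h_prev1, alpha=0.5):
    """Hypothetical form of c(H_{n-2}, H_{n-1}): ratio of recent costs
    raised to a damping exponent. Not the paper's specification."""
    if h_prev2 <= 0.0:
        return 1.0  # no usable cost history: leave uncertainty unchanged
    return (h_prev1 / h_prev2) ** alpha


def update_uncertainty(r_prev, h_prev2, h_prev1, alpha=0.5):
    """Eq. (24): r_n = c(H_{n-2}, H_{n-1}) * r_{n-1}."""
    return multiplicative_factor(h_prev2, h_prev1, alpha) * r_prev


# Assumed cost history: performance cost fell from 4.0 to 1.0.
r_next = update_uncertainty(r_prev=0.2, h_prev2=4.0, h_prev1=1.0)
```

Under this assumed form, falling costs shrink the variance parameter, rising costs inflate it, and unchanged costs leave it as is, so repeated application with stabilizing costs settles toward a fixed level.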

As the model is applied to multiple time periods, the level of risk in the system should approach an equilibrium value. Now, if both agents act consistently with their respective optimal decision policies, this equilibrium state may be representative of a Nash equilibrium.

3.6. Summary of the solution algorithm

This section presented both task and resource agent policies to manage their respective risks, and an overall process to reach system equilibrium. The solution algorithm developed in this section is summarized in Algorithm 1 to highlight the overall flow of logic and the ordering of different decision processes.

Algorithm 1. Distributed options algorithm for task/resource allocation system

4. Case study: Project allocation in construction industry

4.1. Overview

This section presents a case study of the major project selection process in the construction industry. The firm in this case study has resources to process a single major project, or a task, at any given time. Projects, with different payoffs to the firm, arrive randomly into the system, and the firm receives revenue only after a project is completed. The processing rate of the firm varies over time given the personnel and equipment availability. The firm incurs losses if the resources are idle, and the projects incur waiting costs if they are not processed within a certain time horizon. For expositional clarity, this discussion considers a simplified case where the firm has a choice between at most two projects at any given time.

The example system consists of two queue slots (i.e., the "queueing system") and one resource. Therefore, for this system, let \(N_c = N_c(t)\) represent the state of each of these entities at time \(t\), where \(c\) is equal to either \(Q_1\) = Queue 1, \(Q_2\) = Queue 2, or \(R\) = Resource. For this example, let the entity state indicate the number of tasks at that location at any given time \(t\); therefore, \(N_c = 0\) if entity \(c\) is empty or \(N_c > 0\) if it is occupied at time \(t\). It is assumed for this system that each entity may not hold more than one task at any given time (i.e., \(\max\{N_c\} = 1, \forall c \in C\), where \(C\) is the set of all entities), and the entire system has a maximum capacity of two tasks at any given time (i.e., \(\sum_{c \in C} N_c \le 2\) at any time \(t\)). Furthermore, let \(X(N_c)\) and \(i(N_c)\) represent any task type and index, respectively, that is located at entity \(N_c\); where \(X[N_c(t)]\) and \(i[N_c(t)]\) more specifically represent these respective task characteristics at time \(t\). (Note that each task type is more properly written as \(X[i(N_c)]\) but, because each entity may hold a maximum of only one task, this notation is simplified to \(X(N_c) = X[i(N_c)]\)).

Each system entity and the logic behind any task dynamics for this example system are now explained further. The Resource is the system's resource group that processes the tasks. Queue 1 is the queue slot that, at any given time, will contain the highest valued task not currently being processed by the resource. Consequently, Queue 2 is the queue slot that contains the lower valued task (if more than one task is currently in the queueing system). Therefore, if only one task is in the queueing system (i.e., \(N_{Q_1} + N_{Q_2} = 1\)), then this task will be located in Queue 1. However, if there are two tasks in the queueing system (i.e., \(N_{Q_1} + N_{Q_2} = 2\), \(N_{Q_1} > 0\), and \(N_{Q_2} > 0\)), then task \(i(N_{Q_1})\) will be such that \(X(N_{Q_1}) > X(N_{Q_2})\).

If a task \(j\) arrives to the system at time \(t\), then the following situations may arise. If \(N_{Q_1} + N_{Q_2} + N_R = 2\), then the system capacity has been reached and task \(j\) balks from the system (and may go to another resource agent's queueing system). Otherwise, if \(N_{Q_1} + N_{Q_2} + N_R \le 1\), then there is a maximum of one task in the system, Queue 2 is initially empty, and the entity that task \(j\) enters is dependent on the current state of the system \(Y(t)\). If task \(i = i[N_{Q_1}(t)] > 0\) and \(X_j > X[N_{Q_1}(t)]\), then task \(i\) moves to Queue 2 and task \(j\) enters the system at Queue 1. However, if task \(i = i[N_{Q_1}(t)] > 0\) and \(X_j \le X[N_{Q_1}(t)]\), then task \(i\) remains in Queue 1 and task \(j\) enters the system at Queue 2. Furthermore, if \(i[N_{Q_1}(t)] = 0\) (i.e., Queue 1 is empty), then any arriving task will automatically enter Queue 1 regardless of the state of the Resource.

Once a task, either the current task \(i\) or any arriving task \(j\), is in Queue 1 and the resource is empty (i.e., \(N_R = 0\)), then the task is eligible to request processing by the resource according to its options-based exercise policy (i.e., if \(R(t) \ge R^*(t)\) for the task agent). If this task \(i(N_{Q_1})\) successfully exercises its allocation option, then task \(i(N_{Q_1})\) begins processing at the Resource and any task in Queue 2 moves to Queue 1. Once a task has completed processing, then any task in Queue 1 becomes eligible to request processing based on the state of the system and its exercise policy. Specific details pertaining to the task and resource options-based decision policies for this example system are included in Ball (2007) and Appendix B and C (online only), respectively.
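The arrival and allocation rules described for this two-slot example system can be sketched as a small state model. The class and function names are hypothetical, each slot simply stores the task type it holds (or `None` when empty), and the resource agent's own accept-or-postpone decision is left out, so a request succeeds whenever the task agent's threshold condition is met.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemState:
    queue1: Optional[float] = None    # task type held in Queue 1
    queue2: Optional[float] = None    # task type held in Queue 2
    resource: Optional[float] = None  # task type being processed

    def count(self) -> int:
        slots = (self.queue1, self.queue2, self.resource)
        return sum(s is not None for s in slots)


def arrive(state: SystemState, x_j: float) -> bool:
    """Place an arriving task of type x_j; return False if it balks."""
    if state.count() >= 2:           # system capacity of two tasks reached
        return False
    if state.queue1 is None:         # empty Queue 1 always takes the arrival
        state.queue1 = x_j
    elif x_j > state.queue1:         # higher valued arrival displaces task i
        state.queue2, state.queue1 = state.queue1, x_j
    else:                            # lesser or equal valued arrival
        state.queue2 = x_j
    return True


def try_allocate(state: SystemState, rate: float, critical_rate: float) -> bool:
    """Move the Queue 1 task to the Resource if R(t) >= R*(t)."""
    if (state.resource is None and state.queue1 is not None
            and rate >= critical_rate):
        state.resource = state.queue1
        state.queue1, state.queue2 = state.queue2, None
        return True
    return False


s = SystemState()
arrive(s, 0.5)
arrive(s, 0.7)                    # 0.7 takes Queue 1, 0.5 moves to Queue 2
balked = not arrive(s, 0.9)       # third task balks
allocated = try_allocate(s, rate=4.0, critical_rate=3.5)
```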

Tasks arrive randomly to this system according to a Poisson process of rate \(k\) and the type of each arriving task is defined where \(\tilde{f}(X)\) follows the standard uniform distribution. In this example system, \(X\) may be representative of the priority, complexity, or potential resource (or system) utility impact of a given task type. For any task \(i \in A\), both the revenue utility and initial task size are functions of the task type \(X_i\) as determined by their respective conversion functions, \(\hat{V}(X)\) and \(\hat{x}(X)\). For this example system, these functions are both assumed to be linear with respect to task type \(X\) and may be expressed for task \(i\) as \(\tilde{f}_{\hat{V}_i}(X_i) = X_i \cdot \tilde{V}\) and \(\tilde{f}_{\hat{x}_i}(X_i) = X_i \cdot \tilde{x}_0\), where \(\tilde{V}\) and \(\tilde{x}_0\) are the revenue utility and initial task size conversion factor constants, respectively. Therefore, the revenue utility and initial task size functions for task \(i\) are given by \(\hat{V}_i = X_i \cdot \tilde{V}\) and \(\hat{x}_i = X_i \cdot \tilde{x}_0\), respectively. Furthermore, the task agent's net utility revenue function is given by

\[ V[X_i, R(s)] = \hat{V}_i - \frac{\hat{x}_i(X_i)}{R(s)}\, c_p \tag{25} \]

where \(\hat{V}_i\) is the utility revenue gained by the task agent for realizing the completed job, \(\hat{x}_i(X_i)\) is a measure of the task size or complexity, \(R(s)\) is the guaranteed process rate, and \(c_p\) is the resource processing cost rate.

The revenue utility and initial task size conversion factors are given by \(\tilde{V} = \$10\) million and \(\tilde{x}_0 = 2\) units, respectively. Because the task types follow the standard uniform distribution, the baseline task (i.e., task \(i\)) type is assumed to be representative of the mean range value (i.e., \(X_i = 0.50\)). Therefore, the task agent has one request (of size \(\hat{x}_i = X_i \cdot \tilde{x}_0 = 1\) unit) to be allocated to the resource for processing and, once processed, the finished task is worth $5 million (i.e., \(\hat{V}_i = X_i \cdot \tilde{V} = \$5\) million). The time scale is set so that the task request enters the system at time \(t_0^i = 0\) and has three months to commit to the resource for processing (i.e., \(T_i = 0.25\) years). Initially, the task agent does not incur any direct penalty waiting costs, but does incur a 5% time-based opportunity cost reflective of other production opportunities (i.e., \(q = 0.05\)). New tasks arrive to the system according to a Poisson distribution of rate \(\lambda = 25\) tasks per year (i.e., approximately one new eligible task for this particular system every 14.6 days). For this example system, there is no expiration penalty cost if the allocation option expires at the maturity date (i.e., \(f = 0\)).
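As a quick arithmetic check of the baseline values, the linear conversion functions and Eq. (25) can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, monetary amounts are in $ million, and the process rate (100 units per year) and processing cost rate ($400 million per year) are the values quoted for the resource in this example system.

```python
V_FACTOR = 10.0      # revenue utility conversion factor, $ million
SIZE_FACTOR = 2.0    # initial task size conversion factor, units


def revenue(x_type: float) -> float:
    """Revenue utility of a task of type X (linear conversion), in $ million."""
    return x_type * V_FACTOR


def task_size(x_type: float) -> float:
    """Initial task size of a task of type X (linear conversion), in units."""
    return x_type * SIZE_FACTOR


def net_utility(x_type: float, rate: float, c_p: float = 400.0) -> float:
    """Eq. (25): revenue minus processing cost at guaranteed rate `rate` (units/year)."""
    return revenue(x_type) - task_size(x_type) * c_p / rate


def mean_interarrival_days(arrival_rate_per_year: float) -> float:
    """Mean time between Poisson arrivals, in days (365-day year assumed)."""
    return 365.0 / arrival_rate_per_year


baseline_net = net_utility(0.5, rate=100.0)   # baseline task type X = 0.50
gap_days = mean_interarrival_days(25.0)
```

With these inputs the baseline task is worth $5 million, costs $4 million to process at the mean rate, and so yields a net utility of $1 million if exercised at that rate; a guaranteed rate above the mean lowers the processing cost and raises the net utility.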

The resource charges a processing fee for this particular task type at a rate of approximately $1.096 million per day (\(c_p = \$400\) million per year). This production fee includes all time, materials, and subcontractual work necessary to complete the job and has been combined into a single processing rate term. The actual system performance costs are defined by the idle and overtime costs. For these performance costs, \(c_{idle} = \$1\) per year of system idleness and \(c_{ot} = \$1.5\) per task unit subject to overtime costs (thus yielding a relative cost ratio of \(c_{ot}/c_{idle} = 1.5\)). The mean processing time for this baseline task type is 3.65 days (\(R = 100\) units per year) with a standard deviation of \(\sigma = 15\). These resource values yield a coefficient of variation value of approximately 0.15, which would represent a production system exhibiting a low level of continuous variability (Hopp & Spearman, 2001). Management control over the resource allows the production rate to revert to this mean value at a rate of \(g = 15\).
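To give a feel for what a mean-reverting process rate with these parameters looks like, the following sketch simulates one path over the three-month option horizon. It rests on assumptions that are not taken from this section: the dynamics are written as a simple Ornstein-Uhlenbeck form, dR = g (mean - R) dt + sigma dW, with sigma treated as the diffusion coefficient, and an Euler-Maruyama discretization is used. The paper’s own process specification may differ, and the function name `simulate_rate` is hypothetical.

```python
import math
import random
from typing import List


def simulate_rate(r0: float, mean: float = 100.0, speed: float = 15.0,
                  sigma: float = 15.0, horizon: float = 0.25,
                  steps: int = 250, seed: int = 0) -> List[float]:
    """Euler-Maruyama path of dR = speed*(mean - R) dt + sigma dW (assumed form)."""
    rng = random.Random(seed)          # local generator so runs are reproducible
    dt = horizon / steps
    path = [r0]
    for _ in range(steps):
        r = path[-1]
        shock = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(r + speed * (mean - r) * dt + shock)
    return path


# Quantities quoted in the text, recomputed from the stated parameters.
coefficient_of_variation = 15.0 / 100.0      # sigma / mean rate
processing_days = 365.0 / 100.0              # one task unit at 100 units/year

# Noise-free path starting below the mean: shows the pull toward 100.
drift_only = simulate_rate(80.0, sigma=0.0)
```

Setting `sigma=0.0` isolates the reversion term, which is a convenient way to see how quickly a reversion rate of 15 per year pulls the rate back toward its mean within the option’s lifetime.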

4.2. Experimental algorithm

Based on the task/resource agent dual options-based decision policies and the resulting system performance costs incurred over any given time period, the model was tested using the theoretical concepts presented in this paper to incorporate the endogenous impact that these decisions have on the underlying source of system uncertainty. Thus, the agents are able to both respond to and control systemic risk using their respective dual options-based policy. The model was tested using simulation experiments until the dual policies produced convergence in the system’s level of uncertainty and its associated impact on the performance cost metrics.

Simulations were conducted over a time horizon \((t_0, \tilde{T})\) and the system performance costs were evaluated for each iteration \(n\). For each iteration, the changes in system performance costs due to both resource idleness (i.e., \(\Delta\theta_{idle,n}\)) and overtime (i.e., \(\Delta\theta_{ot,n}\)) were evaluated and compared to a specific tolerance factor \(\epsilon\). The test case was terminated if these results were within the specific tolerance limit factor \(\epsilon\), or

\[ \Delta\theta_{idle,n} = \left| \frac{\theta_{idle,n} - \theta_{idle,n-1}}{\theta_{idle,n-1}} \right| < \epsilon \qquad (26) \]

and

\[ \Delta\theta_{ot,n} = \left| \frac{\theta_{ot,n} - \theta_{ot,n-1}}{\theta_{ot,n-1}} \right| < \epsilon \qquad (27) \]

where \(\Delta\theta_{idle,n}\) and \(\Delta\theta_{ot,n}\) are the changes in system performance costs from iterations \(n-1\) to \(n\) due to resource idleness and processing overtime, respectively. For the example system tested in this numerical study, \(\tilde{T} = 1\) year and \(\epsilon = 0.10\).

If either \(\Delta\theta_{idle,n} \ge \epsilon\) or \(\Delta\theta_{ot,n} \ge \epsilon\), then the iteration process was continued by incrementing \(n\) by one (i.e., \(n = n + 1\)) and re-running the simulation over the time horizon \((t_0, \tilde{T})\) using an updated value for the system uncertainty level. This updated \(\sigma_n\) value was obtained using Eq. (24) and a multiplier factor \(c(\Theta_{n-2}, \Theta_{n-1})\) of

\[ c(\Theta_{n-2}, \Theta_{n-1}) = \frac{\Theta_{n-1}}{\Theta_{n-2}} \qquad (28) \]

Therefore, Eq. (24) may be rewritten as

\[ \sigma_n = \frac{\Theta_{n-1}}{\Theta_{n-2}} \cdot \sigma_{n-1} \qquad (29) \]

Thus, the system’s future source of uncertainty \(\sigma_n\) (i.e., for iteration \(n\)) was adjusted based on \(\sigma\) from the previous simulation run (i.e., \(\sigma_{n-1}\)) and the ratio of the performance costs for iterations \(n-1\) and \(n-2\). This relationship serves to connect both the idleness and overtime conditions with the future levels of system stability. With each new iteration run \(n\) and updated level of process rate uncertainty (\(\sigma_n\)), the characteristic evolution of the resource’s performance may be affected. Iterations were continued by incrementing \(n\) to \(n+1\) for each simulation run until the tolerance conditions for the changes in both the overtime and idleness costs were achieved, as specified in Eqs. (26) and (27), respectively.

Fig. 2. Critical process rate (\(R^*(t)\)) versus time (\(t\)) for the task agent’s policy at different levels of process uncertainty (\(\sigma\)). [Plot: x-axis Time, t; y-axis Critical Process Rate, R*(t); curves for σ = 5, σ = 10, and σ = 15.]

Fig. 3. Percent expected gain (\(E[V^+]''\)) versus process uncertainty (\(\sigma\)) for the task agent when using the options-based policy. [Plot: x-axis Process Standard Deviation, σ; y-axis Percent Expected Gain, E[V+]''.]
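The iteration of Section 4.2 (the stopping tests of Eqs. (26) and (27) and the update of Eqs. (28) and (29)) can be sketched as a short loop. Several details below are assumptions made for illustration and are not stated in the text: the total cost \(\Theta\) is taken as the sum of the idle and overtime costs, iteration 0 is taken to be a baseline cost pair supplied by the caller, iteration 1 reuses the initial \(\sigma\), and `toy_simulate` is a hypothetical deterministic stand-in for the paper’s stochastic simulation.

```python
from typing import Callable, List, Tuple

Costs = Tuple[float, float]  # (theta_idle, theta_ot)


def relative_change(current: float, previous: float) -> float:
    """|(current - previous) / previous|, the form used in Eqs. (26) and (27)."""
    return abs((current - previous) / previous)


def iterate_to_equilibrium(simulate: Callable[[float], Costs],
                           baseline_costs: Costs, sigma0: float,
                           tol: float = 0.10, max_iter: int = 100
                           ) -> Tuple[List[float], List[Costs]]:
    """Repeat the simulation, rescaling sigma by the cost ratio until both tests pass."""
    sigmas = [sigma0, sigma0]                     # sigma_0 and sigma_1 (assumed equal)
    costs = [baseline_costs, simulate(sigma0)]    # iterations 0 and 1
    n = 1
    while n < max_iter:                           # gives up silently at max_iter
        d_idle = relative_change(costs[n][0], costs[n - 1][0])
        d_ot = relative_change(costs[n][1], costs[n - 1][1])
        if d_idle < tol and d_ot < tol:           # Eqs. (26) and (27) both satisfied
            break
        n += 1
        ratio = sum(costs[n - 1]) / sum(costs[n - 2])   # Eq. (28), Theta = idle + ot
        sigmas.append(ratio * sigmas[n - 1])            # Eq. (29)
        costs.append(simulate(sigmas[n]))
    return sigmas, costs


def toy_simulate(sigma: float) -> Costs:
    """Hypothetical cost model: overtime cost grows with sigma, idle cost only mildly."""
    return (2.0 + 0.02 * sigma, 0.4 * sigma)


sigmas, costs = iterate_to_equilibrium(toy_simulate, (2.0, 8.0), sigma0=15.0)
```

Under this indexing, Eq. (29) telescopes to \(\sigma_n = \sigma_1 \Theta_{n-1} / \Theta_0\), so the relative change in \(\sigma\) at convergence tracks the relative change in total performance cost.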

4.3. Results

The model exhibiting distributed, options-based decision policies with endogeneity presented in this paper was analyzed by considering the resulting impacts to the task agent, the resource agent, and the overall system. The task agent’s options-based policy, and the percentage expected gain in utility of using this policy (\(E[V^+]''\)), was evaluated with respect to changes in both the resource’s performance uncertainty (\(\sigma\)) and the rate of additional task arrivals to the system (\(\lambda\)). As shown in Fig. 2, the critical process rate \(R^*(t)\) is higher for increased levels of \(\sigma\), thus indicating that the task agent is more likely to postpone its allocation decision in systems exhibiting higher levels of production and processing uncertainty. Because the level of risk that the task agent is subjected to increases with respect to \(\sigma\), the value of using the flexible options-based task allocation policy increases at higher levels of system processing uncertainty as shown in Fig. 3. Furthermore, because there is a finite resource capacity and higher levels of new task arrival rates increase the threat of preemption, both the critical process rate \(R^*(t)\) and the overall task agent benefit when using this flexible approach decrease with respect to increasing \(\lambda\) as shown in Figs. 4 and 5, respectively. It should also be noted that the value to the task agent of retaining the allocation option decreases as the time to maturity approaches. This effect results in a decreasing critical process rate \(R^*(t)\) as the time in the system increases (see Figs. 2 and 4).

Given that the task agent’s initial decision is to request the exercise of its option and begin processing at the current guaranteed rate \(R(t)\), the resource agent focuses its decision policy to respond to this request in such a manner that extracts value from the endogenous nature of its source of performance uncertainty and improves its ability to maintain stable system operations while processing the highest valued tasks. The relative expected gain to the resource agent when using its policy (\(E[V^+]'\)) versus both the task type \(X\) and the guaranteed process rate \(R(t)\) are illustrated for a sample system state using the three-dimensional surface plot included as Fig. 6. These results indicate that the resource agent benefit increases as the current task type decreases. Therefore, the resource’s ability to postpone processing of both smaller and lower prioritized task requests improves the overall expected utility for \(X \le 0.15\).

Now that both the task and resource agents have developed and benefit from their respective options-based decision policies, the model was tested to determine the impact of these dual policies on the equilibrium state of the system. The model was simulated for various initial levels of \(\sigma\) using this dual policy until convergence was achieved according to Eqs. (26) and (27) and the results are shown in Figs. 7 and 8.

Fig. 4. Critical process rate (\(R^*(t)\)) versus time (\(t\)) for the task agent’s policy at different levels of new task arrival rate (\(\lambda\)). [Plot: x-axis Time, t; y-axis Critical Process Rate, R*(t); curves for λ = 0, λ = 20 tasks/yr, and λ = 40 tasks/yr.]

Fig. 5. Percent expected gain (\(E[V^+]''\)) versus new task arrival rate (\(\lambda\)) for the task agent when using the options-based policy. [Plot: x-axis Task Arrival Rate, λ; y-axis Percent Expected Gain, E[V+]''.]

Fig. 6. Relative expected gain (\(E[V^+]'\)) versus task type (\(X\)) and current process rate (\(R\)) for the resource agent when using the options-based policy. [Surface plot: axes Current Process Rate, R; Task Type, X; Relative Expected Gain, E[V+]'.]

Fig. 7. Resource process rate standard deviation (\(\sigma^*\)) and the percentage change in process rate standard deviation (\(\Delta\sigma^*\)) at equilibrium versus the initial process rate standard deviation (\(\sigma_0\)) when using the dual options-based decision policies. [Plot: x-axis Initial Process Rate Standard Deviation, σ0; y-axes Equilibrium Process Rate Standard Deviation and Change in Process Rate Standard Deviation at Equilibrium (%); series σ* and Δσ*.]

Fig. 8. Change in system performance costs (idle, overtime, and total) at equilibrium versus initial process rate standard deviation (\(\sigma_0\)) when using the dual options-based decision policies. [Plot: x-axis Initial Process Rate Standard Deviation, σ0; y-axis Change in System Performance Costs at Equilibrium (%); series Δθ*idle, Δθ*ot, and ΔΘ*.]

For each simulation experiment, the system was managed by the actions of both the task and resource agents and the following results were determined at system equilibrium: standard deviation of the process rate (\(\sigma^*\)); change in process rate uncertainty (\(\Delta\sigma^*\)); change in system idle costs (\(\Delta\theta^*_{idle}\)); change in system overtime costs (\(\Delta\theta^*_{ot}\)); and the change in total system performance costs (\(\Delta\Theta^*\)). The results presented in Fig. 7 show that, with the exception of very low levels of initial production uncertainty (i.e., \(\sigma = 5\)), the overall stability of the system improves at equilibrium, with the greatest changes occurring at higher levels of \(\sigma\) (i.e., approximate reductions in \(\sigma\) of 19% for the baseline case (\(\sigma = 15\)) and 24% when \(\sigma = 17.5\)). These equilibrium reductions in production uncertainty (and, therefore, increases in operational stability) are also manifested in the changes in system performance costs. Based on the results presented in Fig. 8, it should be noted that, although the system idleness costs increase due to the ability of the agents to postpone allocation and processing decisions, the ability to manage the impact of system uncertainties results in increasing reductions of the system overtime costs with respect to the initial level of processing uncertainty (i.e., approximately 42% for the baseline case (\(\sigma = 15\)) and 47% when \(\sigma = 17.5\)). The combination of changes in both the idleness and overtime costs results in an overall reduction of the total system performance costs of approximately 19% for the baseline case (i.e., \(\sigma = 15\)) and up to 24% when \(\sigma = 17.5\). These system improvements increase with respect to the level of production uncertainty, thus following from the previous experimental results presented in this paper that demonstrate increasing agent gains when using options-based decision policies in systems exhibiting higher levels of uncertainty.

5. Conclusions

The primary objective of this paper was to develop a distributed decision-making approach that manages risk from a multi-agent perspective in large-scale engineering and operational systems, and improves both agent and system performance utilities under resource capacity constraints. The general approach used to develop the theoretical models for this paper was based on the intersection of engineering and operational systems management, the concept of dynamic flexibility using options-based decision policies, and game theory. The effect of managing the impact of multiple sources of uncertainty from a distributed decision-making perspective was evaluated with respect to improvements in both agent utilities and system properties while adhering to any limited and finite capacity resource constraints. By properly identifying an agent’s decision options and the relation to any underlying exogenous or endogenous system uncertainties, a wide range of resource-constrained domains may be modeled and solved using the distributed, options-based framework developed in this paper. Possible application domains include the management of engineering and system design processes, new technology development, enterprise systems, homeland security, healthcare systems, emergency preparedness and response systems, global operations, and supply chain networks.

The theoretical model developed in Section 3 was tested using a case study in Section 4. A primary contribution of this paper is to present a distributed decision-making model that manages the impact of uncertainty from a multi-agent perspective. In an effort to avoid limiting the model to the rare situations where conditions suitable to utilizing Black–Scholes options pricing techniques may be appropriate, this new model has been developed to be flexible so that it may be modified and applied to a much wider scope of real system domains. Because each of these domains may require unique modifications to the model, the general model developed in this paper was tested numerically in Section 4 and the ultimate findings should be translatable to a much wider range of technology and operations management systems.

Appendix. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ejor.2014.05.037.

References

Babich, V. (2006). Vulnerable options in supply chains: Effects of supplier competition. SSRN Working Paper Series.

Babich, V., Burnetas, A. N., & Ritchken, P. H. (2007). Competition and diversification effects in supply chains with supplier default risk. Manufacturing & Service Operations Management, 9, 123–146.

Ball, D. R. (2007). Managing multi-agent risk and system uncertainty using options-based decision policies. Ph.D. thesis. University of Massachusetts Amherst.

Ball, D. R., & Deshmukh, A. (2013). A cooperative options-based strategy for coordinating supply chain and resource allocation decisions. International Journal of Management and Decision Making, 12, 259–285.

Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. The Journal of Political Economy, 81, 637–654.

Chopra, S., & Sodhi, M. S. (2004). Managing risk to avoid supply-chain breakdown. MIT Sloan Management Review, 46, 53–61.

Cox, J. C., Ingersoll, J. E., & Ross, S. A. (1979). Duration and the measurement of basis risk. The Journal of Business, 52, 51–61.

Cox, J. C., Ingersoll, J. E., & Ross, S. A. (1985). A theory of the term structure of interest rates. Econometrica, 53, 385–408.

Dixit, A. K., & Pindyck, R. S. (1994). Investment under uncertainty. Princeton University Press.

Elmaghraby, W. J. (2000). Supply contract competition and sourcing policies. Manufacturing & Service Operations Management, 2, 350–371.

Grenadier, S. R. (2002). Option exercise games: An application to the equilibrium investment strategies of firms. The Review of Financial Studies, 15, 691–721.

Hopp, W. J., & Spearman, M. L. (2001). Factory physics: Foundations of manufacturing management (2nd ed.). Irwin McGraw-Hill.

Hull, J., & White, A. (1990). Valuing derivative securities using the explicit finite difference method. The Journal of Financial and Quantitative Analysis, 25, 87–100.

Merton, R. C. (1973). Theory of rational option pricing. The Bell Journal of Economics and Management Science, 4, 141–183.

Minner, S. (2003). Multiple-supplier inventory models in supply chain management: A review. International Journal of Production Economics, 265–279.

Myers, S. C. (1977). Determinants of corporate borrowing. Journal of Financial Economics, 5, 147–175.

Rice, J. B., & Caniato, F. (2003a). Building a secure and resilient supply network. Supply Chain Management Review, 7, 22–30.

Rice, J. B., & Caniato, F. (2003b). Supply chain response to terrorism: Creating resilient and secure supply chains. Technical Report. MIT Center for Transportation and Logistics.

Snyder, L. V., & Daskin, M. S. (2005). Reliability models for facility location: The expected failure cost case. Transportation Science, 39, 400–416.

Trigeorgis, L. (1996). Real options: Managerial flexibility and strategy in resource allocation. The MIT Press.

Weiss, G. (Ed.). (1999). Multiagent systems: A modern approach to distributed artificial intelligence. The MIT Press.

Williams, J. T. (1993). Equilibrium and options on real assets. The Review of Financial Studies, 6, 825–850.