OPTIMIZATION METHODS IN INTENSITY MODULATED RADIATION THERAPY
TREATMENT PLANNING
By
DIONNE M. ALEMAN
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2007
© 2007 Dionne M. Aleman
To my ever-patient wife Nancy, and to my father Roberto, who, if not for the
shortcomings of current cancer treatments, might still be with us today
ACKNOWLEDGMENTS
Many thanks to Nancy Huang, Christopher Fox and Bart Lynch for so helpfully and
happily explaining the physics of medical physics to me on a wide range of topics, even
when those topics are not relevant to my own research.
This work was supported in part by the NSF Alliances for Graduate Education and
the Professoriate, the NSF Graduate Research Fellowship and NSF grant DMI-0457394.
TABLE OF CONTENTS
page
ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
CHAPTER
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.1 Intensity Modulated Radiation Therapy Treatment Planning . . . . . . . . 12
1.2 Dissertation Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
    1.2.1 Fluence Map Optimization . . . . . . . . . . . . . . . . . . . . . . 13
    1.2.2 Beam Orientation Optimization . . . . . . . . . . . . . . . . . . . 14
    1.2.3 Fractionation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
    1.2.4 Modeling the Dose Deposition of a Beam . . . . . . . . . . . . . . 15
1.3 Contribution Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
    1.3.1 Fluence map optimization . . . . . . . . . . . . . . . . . . . . . . 16
    1.3.2 Beam Orientation Optimization . . . . . . . . . . . . . . . . . . . 17
    1.3.3 Fractionation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
    1.3.4 Modeling the Dose Deposition of a Beam . . . . . . . . . . . . . . 19
2 FLUENCE MAP OPTIMIZATION . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.3 Model Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4 Spatial Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.5 A Primal-Dual Interior Point Algorithm for FMO . . . . . . . . . . . . . 25
    2.5.1 Primal-Dual Interior Point Algorithm . . . . . . . . . . . . . . . . 28
    2.5.2 Hessian Approximations . . . . . . . . . . . . . . . . . . . . . . . 29
        2.5.2.1 Single Hessian Approximation . . . . . . . . . . . . . . . . . 29
        2.5.2.2 BFGS Hessian Update . . . . . . . . . . . . . . . . . . . . . 30
    2.5.3 Insignificant Beamlets . . . . . . . . . . . . . . . . . . . . . . . . 30
    2.5.4 Warm Start . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
    2.6.1 How Small of a Duality Gap is Necessary? . . . . . . . . . . . . . 33
    2.6.2 Computational Results . . . . . . . . . . . . . . . . . . . . . . . . 34
    2.6.3 Clinical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
    2.6.4 Spatial Coefficient Results . . . . . . . . . . . . . . . . . . . . . . 37
    2.6.5 Warm Start Results . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3 BEAM ORIENTATION OPTIMIZATION . . . . . . . . . . . . . . . . . . . . . 46
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3 Model Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.4 Mixed-Integer Model Formulation . . . . . . . . . . . . . . . . . . . . . 50
3.5 Beam Data Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.6 A Response Surface Approach to BOO . . . . . . . . . . . . . . . . . . . 54
    3.6.1 Overview of Response Surfaces . . . . . . . . . . . . . . . . . . . . 55
    3.6.2 Determining the Next Observation . . . . . . . . . . . . . . . . . . 58
        3.6.2.1 Maximizing the expected improvement . . . . . . . . . . . . 59
        3.6.2.2 Obtaining an upper bound on the uncertainty . . . . . . . . 59
        3.6.2.3 Branch-and-Bound . . . . . . . . . . . . . . . . . . . . . . . 61
    3.6.3 Method of Obtaining the Next Observation . . . . . . . . . . . . . 69
3.7 Neighborhood Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
    3.7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
    3.7.2 Neighborhood Search Approaches . . . . . . . . . . . . . . . . . . 70
    3.7.3 A Deterministic Neighborhood Search Method for BOO . . . . . . 70
        3.7.3.1 Neighborhood Definition . . . . . . . . . . . . . . . . . . . . 71
        3.7.3.2 Neighbor Selection . . . . . . . . . . . . . . . . . . . . . . . 72
        3.7.3.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . 72
    3.7.4 Simulated Annealing . . . . . . . . . . . . . . . . . . . . . . . . . 73
        3.7.4.1 Neighborhood Definition . . . . . . . . . . . . . . . . . . . . 75
        3.7.4.2 Neighbor Selection . . . . . . . . . . . . . . . . . . . . . . . 75
        3.7.4.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . 75
        3.7.4.4 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . 76
    3.7.5 A New Neighborhood Structure . . . . . . . . . . . . . . . . . . . 77
3.8 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
    3.8.1 Evaluating Plan Quality . . . . . . . . . . . . . . . . . . . . . . . 79
        3.8.1.1 Target coverage . . . . . . . . . . . . . . . . . . . . . . . . . 79
        3.8.1.2 Critical structure sparing . . . . . . . . . . . . . . . . . . . . 80
    3.8.2 Response Surface Method Results . . . . . . . . . . . . . . . . . . 81
        3.8.2.1 Proof of concept . . . . . . . . . . . . . . . . . . . . . . . . 83
        3.8.2.2 Adding a non-coplanar beam to a coplanar solution . . . . . 84
        3.8.2.3 Clinical results . . . . . . . . . . . . . . . . . . . . . . . . . 85
    3.8.3 Neighborhood Search Method Results . . . . . . . . . . . . . . . . 88
        3.8.3.1 Add/Drop algorithm results . . . . . . . . . . . . . . . . . . 89
        3.8.3.2 Simulated Annealing results . . . . . . . . . . . . . . . . . . 89
        3.8.3.3 Clinical results . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.9 Conclusions and Future Directions . . . . . . . . . . . . . . . . . . . . . 92
    3.9.1 Response Surface Conclusions . . . . . . . . . . . . . . . . . . . . 92
    3.9.2 Neighborhood Search Conclusions . . . . . . . . . . . . . . . . . . 95
4 FRACTIONATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.2 Model Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
    4.3.1 Computational Results . . . . . . . . . . . . . . . . . . . . . . . . 101
    4.3.2 Clinical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
    4.3.3 Spatial Coefficient Results . . . . . . . . . . . . . . . . . . . . . . 103
4.4 Conclusions and Future Directions . . . . . . . . . . . . . . . . . . . . . . 111
5 A MONTE CARLO METHOD FOR MODELING DOSE DEPOSITION . . . . 120
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
5.2 Monte Carlo Engine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.3 Dose Distribution of a Beamlet . . . . . . . . . . . . . . . . . . . . . . . 121
    5.3.1 Depth-Dose Curve . . . . . . . . . . . . . . . . . . . . . . . . . . 122
    5.3.2 Lateral Penumbra . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
5.4 Methodology to Model a Beamlet . . . . . . . . . . . . . . . . . . . . . . 124
    5.4.1 Modeling the Depth-Dose Curve . . . . . . . . . . . . . . . . . . . 125
    5.4.2 Modeling the Lateral Penumbra . . . . . . . . . . . . . . . . . . . 128
5.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.6 Conclusions and Future Directions . . . . . . . . . . . . . . . . . . . . . 138
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
BIOGRAPHICAL SKETCH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
LIST OF TABLES
Table page
2-1 Average run times for 5-beam treatment plans . . . . . . . . . . . . . . . . . . . 36
2-2 FMO value obtained using ε = 0.001 . . . . . . . . . . . . . . . . . . . . . . . . 36
2-3 Comparison of duality gaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2-4 Performance measures of interior point method warm starts . . . . . . . . . . . 43
2-5 Performance measures of projected gradient method warm starts . . . . . . . . . 44
3-1 Sparing criteria vary for each critical structure . . . . . . . . . . . . . . . . . . 80
3-2 Sizes of test cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3-3 Minimum FMO value obtained and time required to obtain it . . . . . . . . . . 86
3-4 Target coverage achieved by the treatment plans . . . . . . . . . . . . . . . . . . 86
3-5 Percentage of plans in which an organ is spared . . . . . . . . . . . . . . . . . . 87
3-6 Definitions of implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4-1 Case sizes and run times using identical algorithm and weighting parameters . . 102
4-2 Sparing criteria vary for each critical structure . . . . . . . . . . . . . . . . . . 103
5-1 Computation times in minutes of Monte Carlo simulations . . . . . . . . . . . . 132
5-2 Computation times for dose distribution fits . . . . . . . . . . . . . . . . . . . . 134
5-3 Variation of fits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
LIST OF FIGURES
Figure page
2-1 Progression of duality gap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2-2 Dose received by targets as a function of the duality gap . . . . . . . . . . . . . 35
2-3 Dose received by saliva glands as a function of the duality gap . . . . . . . . . . 35
2-4 Quality of DVHs for various duality gaps . . . . . . . . . . . . . . . . . . . . . . 37
2-5 The spatial coefficients used for two cases . . . . . . . . . . . . . . . . . . . . . 38
2-6 Comparison of spatial and non-spatial treatment plans . . . . . . . . . . . . . . 39
2-7 Comparison of spatial and non-spatial treatment plans . . . . . . . . . . . . . . 40
3-1 A linear accelerator and the available movements . . . . . . . . . . . . . . . . . 46
3-2 FMO value as a function of two angles . . . . . . . . . . . . . . . . . . . . . . . 51
3-3 Initial regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3-4 Partitioning a region into subregions . . . . . . . . . . . . . . . . . . . . . . . . 67
3-5 Accounting for symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3-6 The flip neighborhood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3-7 Selection probabilities in N_h(θ) and N_h^F(θ) . . . . . . . . . . . . . . . . . . 78
3-8 Proof of concept results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3-9 Comparison of response surface, Add/Drop and equi-spaced targets . . . . . . . 87
3-10 Comparison of response surface, Add/Drop and equi-spaced targets . . . . . . . 88
3-11 Add/Drop and simulated annealing comparison of FMO convergence . . . . . . 90
3-12 Comparison of Add/Drop and 7-beam equi-spaced plans . . . . . . . . . . . . . 93
3-13 Comparison of simulated annealing and 7-beam equi-spaced plans . . . . . . . . 93
4-1 Target DVHs, saliva DVHs and axial slices in Fractions 1 and 2 . . . . . . . . . 104
4-2 Target DVHs, saliva DVHs and axial slices in Fractions 1 and 2 . . . . . . . . . 105
4-3 Target DVHs, saliva DVHs and axial slices in Fractions 1 and 2 . . . . . . . . . 106
4-4 Target DVHs, saliva DVHs and axial slices in Fractions 1 and 2 . . . . . . . . . 107
4-5 Target DVHs, saliva DVHs and axial slices in Fractions 1 and 2 . . . . . . . . . 108
4-6 Target DVHs, saliva DVHs and axial slices in Fractions 1 and 2 . . . . . . . . . 109
4-7 Target DVHs, saliva DVHs and axial slices in Fractions 1 and 2 . . . . . . . . . 110
4-8 DVHs and axial slices in Fractions 1 and 2 using spatial coefficients . . . . . . . 112
4-9 DVHs and axial slices in Fractions 1 and 2 using spatial coefficients . . . . . . . 113
4-10 DVHs and axial slices in Fractions 1 and 2 using spatial coefficients . . . . . . . 114
4-11 DVHs and axial slices in Fractions 1 and 2 using spatial coefficients . . . . . . . 115
4-12 DVHs and axial slices in Fractions 1 and 2 using spatial coefficients . . . . . . . 116
4-13 DVHs and axial slices in Fractions 1 and 2 using spatial coefficients . . . . . . . 117
4-14 DVHs and axial slices in Fractions 1 and 2 using spatial coefficients . . . . . . . 118
5-1 Dose distribution of a single beamlet in various tissues . . . . . . . . . . . . . . 122
5-2 Colorwash of the lateral penumbra of a finite sized pencil beam . . . . . . . . . 124
5-3 Plot of the lateral penumbra of a finite sized pencil beam . . . . . . . . . . . . . 125
5-4 Observed depth-dose curve in water for several histories . . . . . . . . . . . . . . 126
5-5 Polynomial fits of several histories . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5-6 Variation of polynomial fit as function of degree . . . . . . . . . . . . . . . . . . 128
5-7 An error function and an error function pair . . . . . . . . . . . . . . . . . . . . 129
5-8 Lateral penumbra for several numbers of Monte Carlo histories . . . . . . . . . . 130
5-9 Error function fits of several histories . . . . . . . . . . . . . . . . . . . . . . . . 131
5-10 Error function pairs summed to approximate a beamlet in water . . . . . . . . . 135
5-11 Depth-dose curves in muscle tissue. . . . . . . . . . . . . . . . . . . . . . . . . . 135
5-12 Lateral penumbra curves in muscle tissue. . . . . . . . . . . . . . . . . . . . . . 136
5-13 Depth-dose curves in lung tissue. . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5-14 Lateral penumbra curves in lung tissue. . . . . . . . . . . . . . . . . . . . . . . . 137
5-15 Depth-dose curves in heterogeneous muscle and lung tissue. . . . . . . . . . . . 138
5-16 Variation of fits as a function of number of histories . . . . . . . . . . . . . . . . 139
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

OPTIMIZATION METHODS IN INTENSITY MODULATED RADIATION THERAPY
TREATMENT PLANNING
By
Dionne M. Aleman
December 2007
Chair: H. Edwin Romeijn
Major: Industrial and Systems Engineering
The design of a treatment plan for intensity modulated radiation therapy is a
mathematical programming problem which is not yet satisfactorily solved. Current
techniques include dividing the problem into several subproblems, which are then solved
sequentially. My research addresses several of these subproblems, particularly, beam
orientation optimization (BOO), fluence map optimization (FMO) and fractionation.
The integration of the BOO and FMO subproblems is considered, as well as improved
techniques to model the dose deposition of a beamlet.
CHAPTER 1
INTRODUCTION
1.1 Intensity Modulated Radiation Therapy Treatment Planning
Every year, approximately 1.4 million people in the United States alone are newly
diagnosed with cancer (American Cancer Society, [1]). More than half of these patients
will receive some form of radiation therapy (Murphy et al. [2], Perez and Brady [3]), and
approximately half of these patients may significantly benefit from conformal radiation
therapy (Steel [4]). During this therapy, beams of radiation pass through a patient,
thereby killing both cancerous and normal cells. Some patients die of their disease
despite sophisticated treatment methods, and many patients may suffer unpleasant side
effects as a result of the radiation therapy, which may severely detract from the patient's
quality of life.
Thus, the radiation treatment must be carefully planned so that a clinically
prescribed dose is delivered to the targets containing cancerous cells, eradicating
the cancer. Simultaneously, a small enough dose must be delivered to the nearby
organs and tissues (called critical structures) so that they may survive the treatment. This
is achieved by irradiating the patient using several beams sent at different orientations
spaced around the patient so that the intersection of these beams includes the targets,
which thus receive the highest radiation dose, whereas the critical structures receive
radiation from some, but not all, beams and may thus be spared. Currently, a technique
called intensity modulated radiation therapy (IMRT) is considered to be the most effective
radiation therapy for many forms of cancer.
The problem of designing an IMRT treatment plan for an individual patient is a
large-scale mathematical programming problem that is not yet solved satisfactorily.
Current treatment planning systems decompose the planning problem into several stages,
and the corresponding subproblems are solved sequentially. These subproblems include
determining the number and orientation of the beams of radiation, the radiation dose
distribution of each beam and the decomposition of a single treatment plan into several
smaller fractions.
This work addresses the integration of the beam orientation optimization (BOO) and
fluence map optimization (FMO) subproblems based on a convex formulation of the latter
and associated efficient algorithms for solving it, an approach which has not received much
attention in previous studies. The fractionation problem, the problem of dividing a single
treatment plan into the roughly 35 treatments (fractions) the patient will actually receive, is also
addressed. Finally, the problem of modeling the dose deposition of a beam is considered.
1.2 Dissertation Summary
In IMRT, each beam is modeled as a collection of hundreds of small beamlets, the
fluences of which can be controlled individually. These fluence values are known as a
fluence map, and optimization of these fluences given a fixed set of beams is known as
fluence map optimization. The optimal solution value of the FMO problem quantifies the
quality of the treatment plan, where quality means the ability of the plan to deliver the
prescribed radiation dose to the specified target structures while sparing critical structures
by ensuring that they receive an acceptably low amount of radiation. Thus, the quality of
a set of beams can be measured by the optimal solution of the FMO problem performed
with those beams. The problem of selecting the best directions from which to
deliver radiation to the patient (the BOO problem) is therefore based on the treatment
plan quality indicated by the optimal solution value of the corresponding FMO problem.
1.2.1 Fluence Map Optimization
One of the most popular subproblems of the intensity modulated radiation therapy
(IMRT) treatment planning problem is the fluence map optimization (FMO) problem.
In IMRT, each beam of radiation can be discretized into hundreds of smaller beamlets, the
radiation intensities (fluences) of which can be modulated independently of the other
beamlets. For a given set of beams, the beamlet fluences can greatly influence the quality
of the treatment plan, that is, the ability of the treatment to deposit the prescribed amount
of dose to cancerous target structures while simultaneously delivering a small enough dose
to critical structures so that they may continue to function after the treatment. These
fluence values are known as a fluence map, and optimization of these fluences given a fixed
set of beams is known as fluence map optimization.
Because the fluences of the beamlets can drastically affect the quality of the
treatment plan, it is critical to obtain good fluence maps for radiation delivery. As the
FMO problem is one of the most popular subproblems in IMRT optimization, it has been
extensively studied in the literature. Several problem structures and algorithms to solve
various models are presented in many studies.
1.2.2 Beam Orientation Optimization
In a typical head-and-neck treatment plan, radiation beams are delivered from 5-9
nominally-spaced coplanar orientations around the patient. These coplanar orientations
are obtained from rotating the gantry only. Several components of a linear accelerator can
rotate and translate to achieve more orientations than those obtained from rotating the
gantry. The available orientations consist of the orientations obtained from rotation of the
gantry, collimator and couch, as well as the three translation directions of the couch.
Beam orientation optimization (BOO) is the problem of selecting from the available
beam orientations the best set to use in delivering a treatment plan. Given a fixed set
of beams, different fluence maps (radiation intensities of beamlets) yield treatment plans
with different qualities. Therefore, the quality of an optimized fluence map should be
considered when selecting a set of beam orientations to use in a treatment plan. Optimal
fluence maps may be difficult to obtain depending on the FMO model. Thus, it is common
in the literature for scoring approximations and other heuristics to be used to estimate the
quality of a beam solution.
Regardless of the objective function used in the BOO problem, the problem is
fundamentally nonlinear as the physics of dose deposition change with direction. Because
nonlinear programming problems are difficult to solve, most approaches to the BOO
problem rely on global search algorithms to obtain a solution, which may or may not be
optimal.
1.2.3 Fractionation
An important subproblem related to the FMO problem which has not yet received
much attention is the fractionation problem. Rather than deliver an entire treatment plan
in one session, a treatment plan is divided into several sessions, called fractions. This
is done to take advantage of the fact that normal, healthy cells recover faster from the
radiation than cancerous cells. To obtain the treatment plans for the fractions, in practice,
a single FMO treatment plan is developed and then divided into the desired number of
fractions, usually around 35. This division of a treatment plan is a non-trivial task, as
the target voxels, geometric cubes of tissue, must receive 1.8-2.0 Gy of radiation in each
fraction.
With a single IMRT treatment plan, it is practically impossible to devise a constant
dose-per-fraction delivery technique because only a single FMO problem is solved to
obtain the treatment plan, which is then simply divided into a number of daily fractions.
If a single plan is optimized to deliver doses to multiple target-dose levels, then the dose
per fraction delivered to each target must change in the ratio of a given dose level to the
maximum dose level. For example, say PTV1 has a prescription dose of 70 Gy, PTV2 has
a prescription dose of 50 Gy, and the number of fractions is 35. If a single treatment plan
is divided among the 35 fractions, then PTV1 will receive 70/35 = 2.0 Gy in each fraction,
but PTV2 will only receive 50/35 = 1.4 Gy, and thus any cancerous cells in PTV2 may
not be eradicated by the treatment. Similarly, if only 25 fractions are used in order to
ensure that PTV2 receives 2.0 Gy per fraction, then PTV1 receives 70/25 = 2.8 Gy per
fraction, well above the desired dose.
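The arithmetic above generalizes to any set of prescriptions; a minimal check (structure names and prescription doses taken from the example) makes the conflict explicit:

```python
# Per-fraction dose when a single plan is split evenly into n fractions:
# each target receives its prescription dose divided by n.
prescriptions = {"PTV1": 70.0, "PTV2": 50.0}  # Gy, from the example above
for n in (35, 25):
    for target, rx in prescriptions.items():
        print(f"{target}: {rx / n:.1f} Gy/fraction over {n} fractions")
```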
1.2.4 Modeling the Dose Deposition of a Beam
The FMO problem is arguably the most significant subproblem in determining the quality
of the treatment plan. The FMO problem depends heavily on the calculation of dose
received in each voxel of a patient. This dose is typically approximated by assuming a
linear relationship with the radiation intensities of the beamlets delivering the radiation.
Although this approximation is accepted as satisfactory, it is not truly accurate.
The dose in a voxel is determined by the paths the photons in the radiation beams
follow through the patient. Some photons may collide with particles inside the patient
and scatter in any direction, while others may collide with particles and be absorbed.
Still other photons may pass entirely through the patient with no collisions. Due to the
unpredictable nature of the radiation beam inside the patient, the dose received in a
voxel can only be accurately obtained through Monte Carlo simulations. In practice, a
simple linear relationship between total dose and beamlet fluences is nevertheless commonly
accepted as a satisfactory dose approximation in IMRT optimization, even though errors of
as much as 30% have been reported for photon beams near tissue inhomogeneities (Ma et al. [5]).
For IMRT optimization, particularly with the advent of image-guided IMRT (IGIMRT),
or 4D IMRT, the FMO problem must be solved extremely quickly to create real-time
treatment plans. Thus, the speed of the FMO problem is paramount. Lengthy Monte
Carlo simulation can provide an accurate measure of the dose deposited in a voxel,
but this technique is time intensive and impractical for clinical use and particularly for
treatment planning optimization.
1.3 Contribution Summary
1.3.1 Fluence map optimization
Nonlinear functions to approximate biological behavior and desired dose distributions
are common in the previously proposed FMO models in the literature, as are mixed-integer
programming models. These models can be difficult and computationally expensive to
solve. To make the FMO problem more tractable, we employ a model with a convex
objective function and linear constraints. This desirable structure allows our model to be
solved quickly and to optimality with the primal-dual interior point algorithm we have
developed specifically for this problem.
One of the greatest benefits of an interior point algorithm is that a globally
optimal solution can be found for many problem structures, and in particular, convex
problem structures. As our FMO model is convex, the interior point algorithm can
locate the globally optimal solution to within a specified duality gap. While there are
other algorithms that can theoretically return a globally optimal solution to a convex
problem (and many algorithms that cannot), interior point methods have the advantage of
providing a known duality gap and generally fast computation times. Because the duality
gap is known in each iteration, the user can make knowledgeable trade-offs between
computation time and solution optimality without having to guess how far from the
optimum the final solution may be. This allows for a scientific comparison of different
IMRT delivery techniques as we can solve the different problems to a specific duality gap.
Several alterations to the standard primal-dual interior point method were made
to improve its performance. Beamlets that are likely to have little or no contribution to
the treatment plan are removed a priori and different approximations to the objective
function Hessian are tested to save time in calculating the true Hessian in each iteration.
The use of warm starts to initialize the interior point method is also examined. The
solutions obtained provide quality treatment plans in a clinically feasible amount of time.
The incorporation of spatial information into the FMO model is also considered.
The probability of tumor metastasis increases with proximity to gross tumor mass. By
using the distances of voxels from target structures, the voxels can be weighted according
to their importance in the treatment plan. For example, it should be less important to
spare saliva gland voxels near a target structure than it should be to spare saliva gland
voxels far from a target. The use of spatial coefficients will help the model identify quality
treatment plans that will prevent future metastasis.
1.3.2 Beam Orientation Optimization
For head-and-neck cancers, typical IMRT treatment plans use 5-9 equi-spaced
coplanar beams. Coplanar beams are those beams obtained from the rotation of only
the gantry of the linear accelerator, the machine which delivers radiation beams to the
patient. If all other components of the linear accelerator are fixed, the rotation of the
gantry sweeps out a set of coplanar beams. The couch can rotate and translate in three
dimensions, and the head of the gantry can rotate independently, creating an even larger
set of beams. Beams obtained from the movement of more than one component of the
linear accelerator are known as non-coplanar beams.
Intuitively, one may expect that the number of beams required for a high-quality
treatment plan can be reduced, or the quality of the treatment plan for a given number
of beams can be improved, if the beam orientations are chosen optimally and/or from
a larger set. In particular, we investigate the effect of considering more coplanar or
non-coplanar beams. A treatment plan consisting of fewer beams is preferable because
the number of beams used in a plan directly affects the length of the actual treatment.
If fewer beams are used to treat a patient, then each treatment takes less time and more
patients can be treated in a day, which is beneficial from both a clinical and economic
perspective. Longer treatment times also allow for more errors due to possible patient
motion.
We view the BOO problem in IMRT treatment planning as a global optimization
problem with expensive objective function evaluations, each of which involves solving
a FMO problem. We propose a response surface method that, unlike other approaches,
allows for the generation of problem data only for promising beam orientations on-the-fly
as the algorithm progresses, enabling the consideration of far more candidate orientations
than is currently feasible. Our response surface approach to BOO allows us to develop
high quality plans using just four beams for head-and-neck cases, in contrast to the
current practice of using 5-9 beams. The response surface method also provides for
convergence to the globally optimal solution.
We have developed neighborhood search methods to solve our BOO model. One
method is simulated annealing, a proper global optimization algorithm, and the other
is a local search heuristic designed specifically for the BOO problem. The local search
heuristic, which we call the Add/Drop method, returns a locally optimal solution in a
small amount of time. The simulated annealing algorithm has the ability to escape local
minima, and is theoretically able to return a globally optimal solution given enough time.
For each of these algorithms, we have devised a new neighborhood structure based on
observations of known optimal BOO solutions compared to the simulated annealing and
Add/Drop BOO solutions. This new neighborhood structure provides faster objective
function value convergence in both algorithms.
1.3.3 Fractionation
In practice, a single FMO treatment plan is developed and then divided into the
number of desired fractions. Dividing a single FMO into multiple treatments is a
non-trivial task, owing to the need of maintaining a constant dose-per-fraction to each
the target structures, which may have different prescription doses. Therefore, any division
of a single FMO plan into multiple fractions can lead to suboptimal treatments. We
propose a new method of formulating the fractionation problem which yields optimal
fluence maps for each cancerous target structure. These fluence maps can then be easily
divided into optimal fractions.
The proposed fractionation model is solved using the same primal-dual interior point
method presented for the FMO problem. The solutions provide high quality fluence maps
for each target, and in a clinically acceptable amount of time.
1.3.4 Modeling the Dose Deposition of a Beam
We propose using a limited number of Monte Carlo histories to obtain a noisy
dose distribution which can then be transformed into a very accurate, smooth dose
distribution suitable for optimization techniques in a reasonable amount of time.
Because the particles in a beamlet scatter in three dimensional space, multiple
dose distributions must be considered to satisfactorily model the beamlet's effect on
the patient’s tissue. These distributions arise from the amount of radiation the beamlet
deposits as a function of depth (the depth-dose curve), and from the amount of radiation
radiating outward from the center of the beamlet (the lateral penumbra). The depth-dose
curve is modeled using a high-degree polynomial and the lateral penumbra is modeled as
the sum of error functions. The parameters of the error functions are determined using a
Levenberg-Marquardt quasi-Newton minimization method.
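As a concrete illustration of this fitting step, the sketch below fits a sum of two error-function pairs to a noisy lateral profile using SciPy's Levenberg-Marquardt driver. The pair form a[erf((x + c)/b) − erf((x − c)/b)], the synthetic data, and all parameter values are assumptions for illustration, not the dissertation's actual parameterization.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf

def erf_pair(x, a, b, c):
    # Flat-topped profile that falls off smoothly at the field edges.
    return a * (erf((x + c) / b) - erf((x - c) / b))

def penumbra(x, a1, b1, c1, a2, b2, c2):
    # Sum of two error function pairs modeling the lateral penumbra.
    return erf_pair(x, a1, b1, c1) + erf_pair(x, a2, b2, c2)

# Synthetic stand-in for a short, noisy Monte Carlo lateral profile.
rng = np.random.default_rng(0)
x = np.linspace(-20.0, 20.0, 200)  # mm off-axis
dose = penumbra(x, 0.5, 2.0, 5.0, 0.1, 8.0, 5.0) + 0.01 * rng.standard_normal(x.size)

# Levenberg-Marquardt fit, as named in the text.
params, _ = curve_fit(penumbra, x, dose, p0=[0.4, 1.5, 4.0, 0.2, 6.0, 4.0], method="lm")
```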
Using these techniques, dose distributions with satisfactory accuracy can be obtained
using at least a factor of 10 fewer Monte Carlo histories than would otherwise be required.
This can greatly decrease the amount of time required to obtain dose data for beamlets in
the FMO problem of IMRT treatment planning without sacrificing accuracy.
CHAPTER 2
FLUENCE MAP OPTIMIZATION
2.1 Introduction
IMRT is differentiated from conformal radiation therapy by the dose distributions
that can be delivered by each beam. Rather than delivering just a uniform field of
radiation, the dose distribution of a beam can be any desired distribution. This ability
allows for greater flexibility and accuracy in treating the target structures while avoiding
the critical structures.
The dose distribution of a beam is achieved as follows. In IMRT, each beam can
be thought of as consisting of several hundred smaller beamlets, each of which can have
its own radiation intensity (fluence) independent of its neighbors. By modulating the
intensities of these beamlets, any dose distribution can be achieved. Given a fixed set of
beams, the optimization of these intensities is called fluence map optimization.
2.2 Literature Review
Because the FMO problem is one of the most studied problems of IMRT, many
different approaches have been taken to formulate the FMO problem, based on both
“physical” (Bortfeld [6]) and “biological” (Alber and Nusslin [7], Jones and Hoban [8],
Kallman et al. [9], Mavroidis et al. [10], Niemierko et al. [11], Niemierko [12], Wu et al. [13,
14]) objective functions and constraints. Linear programming (LP)-based multi-criteria
optimization (Hamacher and Kufer [15]) and mixed-integer linear programming (MILP)
(Bednarz et al. [16], Ferris et al. [17], Langer et al. [18, 19], Lee et al. [20, 21], Shepard et
al. [22]) models have been proposed for FMO.
Constraints to enforce various measures of treatment quality are also taken into
account in different FMO models. Hamacher and Kufer [15] include the homogeneity
of the dose received by the targets as a constraint in their FMO model. Full-volume
constraints, which require that the dose in every voxel of a structure be within pre-determined
upper and lower bounds, are common for controlling the dose in each structure. Models
employing full-volume constraints are found in Bednarz et al. [16], Hamacher and Kufer
[15], Lee et al. [20, 21], Romeijn et al. [23] and many others. Models containing partial
volume constraints, constraints requiring that dose in only a subset of voxels be within
pre-determined upper and/or lower bounds, are also common. Formulations with partial
volume constraints are found in Lee et al. [20, 21], Romeijn et al. [23, 24] and Shepard et
al. [22].
In addition to varying constraints, there are many competing methods of formulating
the FMO objective function to reflect the quality of the treatment plan. Shepard et al. [22]
describe several different objective formulations. These formulations include minimizing
the sum of doses received at all voxels; minimizing a weighted combination of doses
received at each voxel, where the weights depend on the structure in which the voxel
resides; and minimizing the deviation of the dose in each voxel from the recommended
prescription.
Romeijn et al. [25] showed that most of the treatment plan evaluation criteria
proposed in the medical physics literature are equivalent to convex penalty function
criteria when viewed as a multicriteria optimization problem. For each set of treatment
plan evaluation criteria from a very large class, there exists a class of convex penalty
functions that produces an identical Pareto efficient frontier. Therefore, a convex penalty
function-based approach to evaluating treatment plans is used to investigate the BOO
problem. Although this approach could be used in a multicriteria setting, Romeijn
et al. [23, 26] suggest that it is possible to quantify a trade-off between the different
evaluation criteria that produces high-quality treatment plans for a population of patients,
eliminating the need to solve the FMO problem as a multicriteria optimization problem for
each individual patient.
2.3 Model Formulation
A convex penalty function-based approach to the FMO model as described in
Romeijn et al. [23] is employed to quantify the quality of the treatment plan by appropriately
making the trade-off between delivering the prescribed radiation dose to the target
structures while sparing the critical structures. Using this approach, the FMO problem
can be formulated as a quadratic programming problem with linear constraints as follows.
Denote the set of all potential beam orientations as B. The structures (both targets
and critical structures) are irradiated using a predetermined set of beam angles, denoted
θ, where each beam θh ∈ B, h = 1, . . . , k and k is the number of beams in θ. Each beam
is decomposed into a rectangular grid of beamlets with m rows and n columns, yielding
typically 100-400 beamlets per beam. The position and intensity of all beamlets in a beam
can be represented by a vector of beamlet intensity values, called bixels.
The set of all bixels in beam θh is denoted by Bθh. The core task in IMRT treatment
planning is finding radiation intensities for all beamlets.
Denote the total number of structures by S and the number of targets by T . Each
structure s is discretized into a finite number vs of volume cubes, known as voxels.
Typically, around 350,000 voxels are required to accurately represent the targets and
surrounding structures of a head-and-neck cancer site.
Because a beamlet must pass through a certain amount of tissue to reach a voxel, the
dose received in a voxel from a beamlet may not be the full delivered intensity. Denote
Dijs as the dose received by voxel j in structure s from beamlet i at unit intensity. The
Dijs values are known as dose deposition coefficients. Let xi denote the intensity of bixel i.
This brings us to the following expression for the dose zjs received by voxel j in structure
s:
z_{js} = \sum_{h=1}^{k} \sum_{i \in B_{\theta_h}} D_{ijs} x_i, \quad j = 1, \ldots, v_s,\ s = 1, \ldots, S
Although the goal of IMRT treatment planning is to control the dose received by
each structure, hard constraints cannot simply be imposed on the amount of dose received
by each structure because a solution satisfying such constraints may not exist. In some
cases, it may be necessary to sacrifice organs in order to treat targets, and if that
possibility is not allowed in the model, then a feasible or a satisfactory solution may not
exist. Thus, in our model, a penalty is
assigned to each voxel based on the dose it receives for a given set of beamlet intensities.
Let F_{js} denote a convex penalty function for voxel j in structure s of the following form:

F_{js}(z_{js}) = \frac{1}{v_s} \left( \underline{w}_s \left[ (T_s - z_{js})^+ \right]^{\underline{p}_s} + \overline{w}_s \left[ (z_{js} - T_s)^+ \right]^{\overline{p}_s} \right),

where T_s is the dose threshold value for structure s, \underline{w}_s and \underline{p}_s are weighting factors for
underdosing, and \overline{w}_s and \overline{p}_s are weighting factors for overdosing. The expression (\cdot)^+
denotes \max\{0, \cdot\}. The function is normalized over the number of voxels in the structure
using the coefficient 1/v_s. By setting \underline{w}_s, \overline{w}_s \geq 0 and \underline{p}_s, \overline{p}_s \geq 1, convexity is ensured.
A basic formulation of the FMO problem is then:

\begin{align*}
\text{minimize} \quad & \sum_{s=1}^{S} \sum_{j=1}^{v_s} F_{js}(z_{js}) \\
\text{subject to} \quad & z_{js} = \sum_{h=1}^{k} \sum_{i \in B_{\theta_h}} D_{ijs} x_i, \quad j = 1, \ldots, v_s,\ s = 1, \ldots, S \\
& x_i \geq 0, \quad i \in B_{\theta_h},\ h = 1, \ldots, k
\end{align*}
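To make the structure of this objective concrete, the following sketch evaluates the voxel-based penalty for a given fluence vector x. The dense dose matrix D and the per-voxel parameter arrays (copied from each voxel's structure) are illustrative assumptions, not the dissertation's implementation.

```python
import numpy as np

def fmo_objective(x, D, T, w_under, w_over, p_under, p_over, inv_vs):
    """Convex voxel-based FMO penalty: sum of F_js(z_js) over all voxels.

    D: (num_voxels x num_beamlets) dose deposition matrix (the D_ijs);
    T, w_*, p_*: per-voxel threshold, weight and power values broadcast
    from the structure each voxel belongs to; inv_vs: per-voxel 1/v_s.
    """
    z = D @ x                                  # z_js: dose in each voxel
    under = np.maximum(T - z, 0.0) ** p_under  # [(T_s - z_js)^+]^p underdose
    over = np.maximum(z - T, 0.0) ** p_over    # [(z_js - T_s)^+]^p overdose
    return float(np.sum(inv_vs * (w_under * under + w_over * over)))
```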
The FMO problem serves as the black-box function F(θ) in the BOO model to quantify the
quality of beam vector θ. In contrast with the methods presented by all of the previously
cited FMO studies except for Das and Marks [27], Haas et al. [28] and Schreibmann [29],
this measure of beam vector quality is an exact measure of the FMO problem, rather than
a heuristic or scoring approximation that cannot accurately optimize the beam
orientations.
2.4 Spatial Considerations
With IMRT optimization, it is possible to generate treatment plans with similar FMO
objective function values but very different levels of clinical treatment quality. Chao et
al. [30] illustrate this possibility with two treatment plans that have nearly identical
target coverage when plotted on a dose-volume histogram, but while one plan delivers
an acceptable homogeneous dose, the other plan results in significant underdosing of the
target structure.
Chao et al. [30] show that the probability of microscopic tumor extension
decreases linearly with distance from the gross tumor volume, implying that cold spots
located near the gross tumor volume are far more likely to allow for tumor metastasis after
treatment. Likewise, cold spots located far from the gross tumor volume are unlikely to
result in tumor metastasis.
To reduce the likelihood of obtaining an unsatisfactory plan with good dose-volume
histograms, spatial coefficients are introduced into the FMO model. For each voxel, we
consider its position relative to the primary target as a measure of how acceptable or unacceptable
overdosing or underdosing may be. Voxels further from the gross tumor volume are
penalized more heavily than voxels closer to the gross tumor because it is less acceptable
for a voxel far away from the actual tumor to receive an overdose, as the cancerous cells
are unlikely to spread very far from the tumor location (Chao et al. [30]). This additional
penalization is called the spatial coefficient, and is denoted cjs for voxel j in structure s.
For voxels inside the target structures, the probability of cancer spread is 1, as cancer
already exists in those voxels. Let S ′ denote the set of gross tumor structures. Let d`js be
the minimum distance from voxel j in structure s to structure `. The spatial coefficient cjs
for voxel j in structure s is
c_{js} =
\begin{cases}
1 & j = 1, \ldots, v_s,\ s \in S' \\
\min\left\{1,\ \max\left\{0.001,\ \sum_{\ell=1}^{|S'|} \left[ \exp(-\lambda_\ell d_{\ell js}) + \mu_\ell d_{\ell js} + \beta_\ell \right] \right\}\right\} & j = 1, \ldots, v_s,\ s \notin S',
\end{cases}
where \lambda_\ell, \mu_\ell and \beta_\ell are weighting coefficients. The objective function for the FMO
problem becomes

F_{\mathrm{spatial}}(x) = \sum_{s=1}^{S} \sum_{j=1}^{v_s} c_{js} F_{js}(z_{js})
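A sketch of how these coefficients might be evaluated for all voxels at once, assuming a precomputed distance array; the clip to [0.001, 1] mirrors the min/max in the definition, and all names are illustrative:

```python
import numpy as np

def spatial_coefficients(dist, lam, mu, beta, in_tumor):
    """Spatial coefficients c_js for all voxels.

    dist: (num_voxels x num_tumor_structures) minimum distances d_{l,js};
    lam, mu, beta: per-tumor-structure weighting coefficients;
    in_tumor: boolean mask of voxels in gross tumor structures (S').
    """
    raw = (np.exp(-lam * dist) + mu * dist + beta).sum(axis=1)
    c = np.clip(raw, 0.001, 1.0)   # min{1, max{0.001, .}}
    c[in_tumor] = 1.0              # cancer is already present in those voxels
    return c
```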
2.5 A Primal-Dual Interior Point Algorithm for FMO
To solve the FMO and fractionated FMO models, a primal-dual interior point method
is employed. For a convex problem such as the FMO model presented in the preceding
section, this method yields an optimal solution in a short amount of time.
The primal-dual interior point algorithm moves through the interior of the solution
space along a central path toward the
optimal solution. The central path is defined by perturbing the KKT conditions described
below. These conditions ensure primal feasibility, dual feasibility and complementary
slackness. If these conditions are satisfied for a convex programming problem with linearly
independent constraints, they yield the optimal solution. Thus, we only need to solve this
system to obtain an optimal solution to our FMO model (which has a convex objective
function and linear, linearly independent constraints). The KKT system can be difficult to
solve, so the conditions are perturbed in order to obtain a solution.
The general idea of the primal-dual interior point algorithm is to start from an initial
feasible solution, use the perturbed KKT conditions to obtain a step direction close to the
central path, and then move the current solution some step length along that direction.
The amount of perturbation in the KKT conditions is gradually decreased so that in each
step, the solution becomes closer to the optimum. The interior point method allows for
the duality gap, the gap between the objective functions of the primal and dual problems,
to be calculated, thus providing a measure of how close the current solution is to the
optimum. For a problem with continuous variables, when the objective functions of the
primal and dual problems are equal (duality gap of zero), the solution is optimal.
A mathematical description of the primal-dual interior point method can be found
in Nocedal and Wright [31]. Further explanation is provided only as needed to define
variables in the algorithm. In the FMO problem, G(x) = −Ix, so the KKT conditions for
the FMO formulation are
\sum_{s \in S} \frac{1}{v_s} \sum_{j \in V_s} D_{ij} F_j'\left( \sum_{\ell \in N} D_{\ell j} x_\ell \right) - s_i = 0 \quad i \in N \qquad (2–1)

s_i x_i = 0 \quad i \in N \qquad (2–2)

s_i \geq 0 \quad i \in N \qquad (2–3)

x_i \geq 0 \quad i \in N, \qquad (2–4)
where Equation (2–4) ensures that the solution is feasible, as the only constraints
in the FMO problem are nonnegativity constraints. The complementary slackness condition (2–2)
forces the solution to the above conditions to be on the boundary of the solution space.
Since a point in the interior of the solution space is desired, the complementary slackness
condition must be relaxed.

The complementary slackness condition (2–2) is relaxed by changing each s_i x_i = 0 to
s_i x_i = µ, where µ > 0. This, along with requiring that x > 0 and s > 0 for feasibility,
ensures that a solution to the perturbed KKT conditions is an interior point.
Let n be the size of decision variable vector x. A solution is “close enough” to the
central path if the duality measure µ in iteration k is
\mu^k = \frac{(x^k)^\top s^k}{n} \qquad (2–5)

and \|X^k S^k e - \mu^k e\| \leq \theta\mu^k, where X^k is a matrix with the x_i^k values on its diagonal and zeros
elsewhere, and S^k is a matrix with the s_i^k values on its diagonal and zeros elsewhere.
As the algorithm progresses, µ is driven toward zero until the solution is sufficiently close
to optimality. To reduce µ, in each iteration we set µ = σµ, where σ ∈ [0, 1] is called the
centering parameter. If the duality gap is very large, σ can be reduced so that µ is reduced
faster.
In each iteration, the current solution (x, s) is moved in a direction (∆x, ∆s) by some
step length α:

\begin{pmatrix} x^{k+1} \\ s^{k+1} \end{pmatrix} = \begin{pmatrix} x^k \\ s^k \end{pmatrix} + \alpha \begin{pmatrix} \Delta x^k \\ \Delta s^k \end{pmatrix}
Let X^k = diag(x^k), S^k = diag(s^k), and H(x^k) = ∇²φ(x^k). The directions ∆x^k and ∆s^k
can be determined by solving the following equations:

\left[ (X^k)^{-1} S^k + H(x^k) \right] \Delta x^k = -r_{DF} - (X^k)^{-1} r_{xs} \qquad (2–6)

\Delta s^k = -(X^k)^{-1} \left( r_{xs} + S^k \Delta x^k \right) \qquad (2–7)

Here r_{DF} denotes the dual feasibility residual ∇φ(x^k) − s^k and r_{xs} the perturbed
complementarity residual X^k S^k e − µe (Nocedal and Wright [31]).
In order to solve this system, we must obtain ∆x^k from Equation (2–6), which requires the
inverse of [(X^k)^{-1} S^k + H]. Because computing the inverse of such a large dense matrix is
very time consuming, a Cholesky factorization is used to solve this system quickly.
The primal-dual interior point method requires a feasible (x, s) solution in each step.
Thus, a maximum step length αmax must be imposed on each step direction to ensure that
x ≥ 0 and s ≥ 0:
\alpha_{\max} = \min\left\{ \min_{i=1,\ldots,n} \{-x_i/\Delta x_i : \Delta x_i < 0\},\ \min_{i=1,\ldots,n} \{-s_i/\Delta s_i : \Delta s_i < 0\} \right\}

Because the inverse of each x_i is required to determine the step directions, it is
undesirable to have any x_i = 0, which would result from using step length α_max. Instead,
only a fraction η < 1 of α_max is used:

\alpha = \min\{1, \eta\alpha_{\max}\} \qquad (2–8)
The benefit of this primal-dual method is that in each step, we can calculate the
objective of the dual problem (simply s^⊤x), thus providing a bound on how far the current
solution is from optimality.
2.5.1 Primal-Dual Interior Point Algorithm
The primal-dual interior point algorithm is as follows:
• Initialization
1. Select initial values for ε, σ and η (we use ε = 5, σ = 0.01, and η = 0.95).
2. Set x^0 = 0.05 (very close to 0) and calculate ∇φ(x^0) and H(x^0) = ∇²φ(x^0).
3. Set s^0 = µ(X^0)^{-1}.
4. Set µ^0 = (∑_{i=1}^{n} ∇φ(x^0)_i)/100.
5. Set k = 0.

• Algorithm

1. If the duality gap is very large ((x^{k+1})^⊤ s^{k+1} > 10^7 ε), set σ = 0.01σ.
2. Set µ^k = σµ^k.
3. Solve for the step direction (∆x^k, ∆s^k) as described in Equations (2–6) and (2–7). Note that this involves calculating the Hessian H(x^k).
4. Solve for the step length α as described in Equation (2–8).
5. Set x^{k+1} = x^k + α∆x and s^{k+1} = s^k + α∆s.
6. If the duality gap (x^{k+1})^⊤ s^{k+1} < ε, stop. Otherwise, set µ^{k+1} = (x^{k+1})^⊤ s^{k+1}/n, set k ← k + 1, and repeat.
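A compact sketch of this loop for the nonnegativity-constrained FMO problem is given below; grad_phi and hess_phi are assumed callables for ∇φ and ∇²φ, and the σ-reduction safeguard of step 1 is omitted for brevity.

```python
import numpy as np

def primal_dual_ipm(grad_phi, hess_phi, n, eps=5.0, sigma=0.01, eta=0.95):
    """Sketch of the primal-dual interior point loop described above."""
    x = np.full(n, 0.05)              # start near, but not at, zero
    mu = grad_phi(x).sum() / 100.0    # mu^0
    s = mu / x                        # s^0 = mu (X^0)^{-1}
    while x @ s >= eps:               # stop once the duality gap is small
        mu = sigma * (x @ s) / n      # steps 2 and 6: recenter on the gap
        r_df = grad_phi(x) - s        # dual feasibility residual
        r_xs = x * s - mu             # relaxed complementary slackness
        # Step direction from Eqs. (2-6) and (2-7); a Cholesky solve would
        # replace the generic solve in a tuned implementation.
        M = np.diag(s / x) + hess_phi(x)
        dx = np.linalg.solve(M, -r_df - r_xs / x)
        ds = -(r_xs + s * dx) / x
        # Step length keeping x > 0 and s > 0, per Eq. (2-8).
        ratio = lambda v, dv: (-v[dv < 0] / dv[dv < 0]).min(initial=np.inf)
        alpha = min(1.0, eta * min(ratio(x, dx), ratio(s, ds)))
        x, s = x + alpha * dx, s + alpha * ds
    return x
```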
2.5.2 Hessian Approximations
The most time-consuming step in the primal-dual interior point algorithm is
calculating the Hessian of the objective function in each iteration. For clarity, let \sum
denote \sum_{s \in S} \frac{1}{v_s} \sum_{j \in V_s} and F_j''(x) denote F_j''\left( \sum_{\ell \in N} D_{\ell j} x_\ell \right). The Hessian of the FMO
problem is then given by

H(x) = \begin{pmatrix}
\sum F_j''(x) D_{1j}^2 & \cdots & \sum F_j''(x) D_{1j} D_{nj} \\
\vdots & \ddots & \vdots \\
\sum F_j''(x) D_{nj} D_{1j} & \cdots & \sum F_j''(x) D_{nj}^2
\end{pmatrix}
Note that only the pairwise Dij products differ in each element of the Hessian. By
precomputing these cross products, only \sum_{s \in S} \frac{1}{v_s} \sum_{j \in V_s} F_j''\left( \sum_{\ell \in N} D_{\ell j} x_\ell \right) has to be
recomputed in each iteration. The matrix of the Dij products yields the sparsity (or
density) pattern of the Hessian, which stays constant throughout the algorithm. Because
the Hessian is symmetric, the matrix values only need to be computed for half of the
matrix, further improving efficiency.
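In matrix form, the Hessian is D^⊤WD with a diagonal weight matrix W = diag(F″_j / v_s) that is the only part changing between iterations; a sketch with assumed names:

```python
import numpy as np

def fmo_hessian(x, D, fpp, inv_vs):
    """H(x) = D^T diag(inv_vs * F''_j(Dx)) D for the FMO objective.

    D: (num_voxels x num_beamlets) dose matrix; fpp: callable returning the
    second derivatives F''_j at the current voxel doses; inv_vs: per-voxel
    1/v_s weights. The D_ij cross products inside D^T W D are fixed, so only
    the diagonal weights w are recomputed each iteration; the result is
    symmetric with a constant sparsity pattern.
    """
    w = inv_vs * fpp(D @ x)          # per-voxel curvature weights
    return D.T @ (w[:, None] * D)
```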
Despite these observations, computing the Hessian is still so expensive that it renders
the algorithm impractical. Methods of approximating the Hessian are implemented to
speed up the algorithm.
2.5.2.1 Single Hessian Approximation
One way of speeding up the algorithm is to compute the Hessian just once during
initialization to obtain H(x0), and then rather than re-compute the Hessian in each
iteration, use H(x0) as an approximation to H(xk). We call this the Single Hessian
approximation. Although the convergence of such an approximation has not yet been
mathematically proven, tests run on several head-and-neck cases for 5-beam and 7-beam
plans show that the Single Hessian does in fact converge to the known optimal solution.
2.5.2.2 BFGS Hessian Update
Another Hessian approximation is the Broyden-Fletcher-Goldfarb-Shanno (BFGS)
Hessian update. The approximation to the Hessian in iteration k is Bk, with B0 = H(x0).
The update to the approximated Hessian in each iteration is
B_{k+1} = B_k + \frac{q_k q_k^\top}{q_k^\top p_k} - \frac{B_k p_k p_k^\top B_k}{p_k^\top B_k p_k},

where

p_k = x_{k+1} - x_k
q_k = \nabla\phi(x_{k+1}) - \nabla\phi(x_k)
Note that this update ensures that Bk is always symmetric and positive definite,
so the Cholesky factorization can still be applied to obtain the step direction. This
approximation also empirically converges to the known optimal solution for 5- and 7-beam
head-and-neck cases.
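The update transcribes directly into code; a minimal sketch:

```python
import numpy as np

def bfgs_update(B, x_new, x_old, g_new, g_old):
    """One BFGS update of the Hessian approximation B_k -> B_{k+1};
    g_new and g_old are the objective gradients at x_new and x_old."""
    p = x_new - x_old                 # p_k
    q = g_new - g_old                 # q_k
    Bp = B @ p
    return B + np.outer(q, q) / (q @ p) - np.outer(Bp, Bp) / (p @ Bp)
```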
2.5.3 Insignificant Beamlets
Insignificant beamlets are those that contribute little to the quality of the
FMO plan. Letting d denote the vector of diagonal elements of the initial Hessian H(x^0), the set of
insignificant beamlets B_I is defined as

B_I = \left\{ i : \frac{|d_i|}{\max\{|d|\}} < 0.001 \right\}

These beamlets are removed by deleting the ith row and the ith column of H(x^0) for
every i ∈ BI , and then updating the number of bixels to the number of remaining bixels.
The insignificant beamlets must be re-inserted into the solution xk in order to calculate
the voxel doses, objective function, gradient and Hessian, but the inversion of the Hessian
is performed on the Hessian with the insignificant beamlets removed, providing significant time savings.
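A sketch of the screening step, assuming the initial Hessian is available as a dense array:

```python
import numpy as np

def partition_beamlets(H0, tol=0.001):
    """Split beamlet indices into significant and insignificant sets using
    the diagonal d of H(x0) and the threshold |d_i| / max|d| < tol."""
    d = np.abs(np.diag(H0))
    insignificant = np.flatnonzero(d / d.max() < tol)
    significant = np.setdiff1d(np.arange(H0.shape[0]), insignificant)
    return significant, insignificant

# The reduced Newton system then uses H0[np.ix_(significant, significant)];
# the removed beamlets are re-inserted when evaluating doses and gradients.
```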
2.5.4 Warm Start
For the sake of theoretical accuracy, a truly optimal solution cannot have the insignificant
beamlets described in Section 2.5.3 removed. Without removing these beamlets a priori,
the interior point method must be run for an impractical amount of time to obtain a
near-optimal solution, say, ε = 0.001. The interior point method is typically started with
a decision variable vector x equal to almost zero. If the algorithm were to be started at a
point closer to the final solution, denoted x^{warm}, time savings could be gained, allowing all
beamlets to be considered in the interior point algorithm in a reasonable amount of time.
Such an approach is called a “warm start”.
One difficulty in using a warm start with the interior point method is that a warm
start solution may have some x_i^{warm} = 0, which is not allowed because the inverse of each
x_i must be taken. To correct this problem, any x_i^{warm} = 0 is simply replaced with some
very small value γ. Because these zero-valued variables are less important to the problem
than nonzero variables, γ should be less than the minimum nonzero value of x^{warm}. Let
\bar{\gamma} = \min_{i=1,\ldots,n}\{x_i^{warm} : x_i^{warm} > 0\}. Then, γ = \min\{0.001, \bar{\gamma}\}.

x_i^0 = \begin{cases} x_i^{warm} & i \notin B_I \\ \gamma & i \in B_I \end{cases}
An additional problem with warm starts in the interior point method is that the KKT
variable vector s is unknown at the warm start point. Depending on the algorithm used
to obtain the warm start, some information about s^{warm} and µ^{warm} (s and µ at the warm
start point, respectively) may not be available. If no information is available about s from
the warm start, then s^0 = 0. If an interior point algorithm is used to obtain the warm
start, then s^{warm} is available. If the warm start did not include the insignificant beamlets,
some corrections must be made to account for the insignificant beamlets, which will be
optimized in the final solution. Let s^0 be the initial s used in the interior point method
after the warm start has been obtained. Then,
s_i^0 = \begin{cases} s_i^{warm} & i \notin B_I \\ \mu^{warm}/\gamma & i \in B_I, \end{cases}

where the value chosen for s_i^0 corresponding to insignificant beamlets arises from the
general initialization s = µ(X^0)^{-1}.
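Assembling the starting point from a warm-start solution might look like the following sketch (array names are illustrative):

```python
import numpy as np

def warm_start_point(x_warm, s_warm, mu_warm, insig):
    """Build (x0, s0) from a warm start that omitted the insignificant
    beamlets (index set insig); zero fluences are lifted to gamma > 0."""
    positive = x_warm[x_warm > 0]
    gamma = min(0.001, float(positive.min())) if positive.size else 0.001
    x0 = np.where(x_warm > 0, x_warm, gamma)   # replace zeros with gamma
    x0[insig] = gamma                          # insignificant beamlets
    s0 = s_warm.copy()
    s0[insig] = mu_warm / gamma                # from s = mu (X^0)^{-1}
    return x0, s0
```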
2.6 Results
The true Hessian, Single Hessian approximation, and BFGS update implementations
of the primal-dual interior point algorithm are tested on six head-and-neck cases to
obtain coplanar, equi-spaced 5-beam plans. The tests are run on a 2.33GHz Intel Core 2
Duo processor with 2GB of RAM. The method is tested both with and without removing
the insignificant beamlets, as well as with the proposed alternatives to computing the true Hessian.
The optimality of the interior point method solutions is verified by comparison to the
known optimal solutions obtained by Java with CPLEX (ILOG).
An acceptable duality gap must be determined in order to implement the interior
point method. While we consider a duality gap of ε = 0.001 to be acceptably close to
optimal, it may be unnecessary to achieve such a small duality gap to obtain a quality
solution. A duality gap of 0.001 may be sufficiently small to ensure optimal solutions given
objective function values using certain weighting parameters, depending on the parameters
used in the FMO objective function, the value of the objective function may vary widely.
Because of the potential range of values, a stopping criteria based on a relative duality gap
rather than an absolute duality gap is preferable. Say the objective function value in an
iteration is f . Define the relative duality gap in an iteration to be ε′ = ε/f .
An examination of the relative duality gap necessary is presented in Section 2.6.1.
Computational results are presented in Section 2.6.2 and clinical comparisons are provided
in Section 2.6.3.
2.6.1 How Small of a Duality Gap is Necessary?
Because the run time of the algorithm is dependent on the required duality gap, it
is desirable to require the algorithm to achieve only as small a duality gap as is necessary
to ensure a clinically good solution. The duality gap decreases quickly in the first few
iterations, and then subsequently decreases by only a small amount per iteration, as
shown in Figure 2-1A. If these iterations with only marginal improvements are found to
be unnecessary in terms of clinical quality, significant time can be saved by stopping the
algorithm once the duality gap is reasonably small, as opposed to waiting until the duality
gap is very small.
To check the importance of the duality gap, the FMO value and dose delivered to the
targets and the saliva glands were plotted against the duality gap in each iteration using
the true Hessian and without removing insignificant beamlets. For a representative case,
the FMO values per duality gap are shown in Figure 2-1B. It is clear that the duality gap
decreases rapidly in the first few iterations, but subsequent iterations yield increasingly
smaller drops in the duality gap.
Similarly, the amount of dose received by the targets and critical structures does not
change significantly toward the end of the algorithm. Figure 2-2 plots the dose received by
the two targets, PTV1 and PTV2, starting from a duality gap of 0.15%. The prescription
doses are 70 Gy for PTV1 and 50 Gy for PTV2, common dose values used in the cancer
clinic at Shands Hospital at the University of Florida. Neither the dose received by 95% of
the targets nor the size of the hotspots and coldspots changes significantly in this duality
gap range (Figure 2-2A). The hotspots are measured by the percent of the target receiving
110% and 120% of the prescription dose, while the coldspots are measured by the percent
of the target receiving at least 93% of the prescription dose (Figure 2-2B).
Figure 2-3 shows for two representative cases the amount of dose received by the
saliva glands starting from a relative duality gap of 0.15%. Both cases show that the
33
Figure 2-1. Objective function and relative duality gap v. iteration. The duality gap drops sharply in early iterations, but very slowly thereafter. The relative duality gap monotonically decreases after several iterations.
change in dose received by the saliva glands as the duality gap decreases is not clinically
relevant.
From these figures, it appears that a duality gap as large as 0.1% could provide
clinically acceptable plans. Since the algorithm may terminate with a duality gap less than
the one specified as the stopping criterion, a duality gap larger than 0.1% will also be tested
for acceptability.
2.6.2 Computational Results
Table 2-1 shows the average run times for each of the implementations of the
algorithm. Relative duality gaps of 0.15%, 0.10%, 0.05% and 0.01% are compared.
The value of θ used to define the central path is 0.5. As expected, using the Single
Approximation Hessian alternative with the insignificant beamlets removed is the fastest
method, while using the true Hessian is the slowest method, regardless of whether the
insignificant beamlets are removed. Interestingly, for large duality gaps, it is slightly faster
to leave the insignificant beamlets in the model when using the true Hessian. Otherwise, it
is faster to remove the insignificant beamlets.
The final FMO values are displayed for each of the tested methods using a duality
gap of 0.001, which is sufficiently small to ensure optimal solutions given typical objective
function values (Table 2-2). For each case, the final FMO value is nearly identical,
Figure 2-2. Dose received by targets as a function of the duality gap. A) The amount of dose received by at least 95% of each target is used to assess proper target coverage. B) The percent of each target receiving 110% and 120% of the prescription dose indicates hotspots, while 93% of the prescription dose indicates coldspots.
Figure 2-3. The amount of dose received by at least 50% of each saliva gland remains relatively constant even for large duality gaps. Two representative cases are shown.
Table 2-1. Average run times for 5-beam treatment plans.
                     Remove insig.    Average run time (s)
Hessian type         beamlets?        ε = 0.001   ε′ = 0.15   ε′ = 0.1   ε′ = 0.05   ε′ = 0.01
True                 no               113.8       55.48       55.48      58.58       71.75
True                 yes              105.6       55.25       56.29      59.09       70.56
BFGS                 no                43.9       13.59       14.17      14.66       16.67
BFGS                 yes               40.9       13.19       13.66      14.30       15.88
Single Approx.       no                18.1        8.83        8.98       9.29       10.13
Single Approx.       yes               16.8        8.69        8.84       9.14        9.90
Table 2-2. FMO value from using ε = 0.001.
                     Remove insig.
Hessian type         beamlets?        Case 1    Case 2    Case 3    Case 4    Case 5    Case 6
True Hessian         no               2546.22   2200.70   2289.95   2566.38   5024.97   2585.40
True Hessian         yes              2546.22   2200.70   2289.95   2566.38   5024.97   2585.40
BFGS update          no               2546.23   2200.70   2289.95   2566.39   5024.97   2585.40
BFGS update          yes              2546.24   2200.70   2289.95   2566.39   5024.97   2585.40
Single Approx.       no               2546.38   2201.11   2290.40   2566.56   5025.06   2585.82
Single Approx.       yes              2546.38   2201.15   2290.44   2566.62   5025.14   2585.82
indicating that the Hessian alternatives and the removal of the insignificant beamlets still
provide for convergence to the optimal solution.
The percentage increases in the FMO values using an absolute duality gap of 0.001
and relative duality gaps of 0.15%, 0.10%, 0.05% and 0.01% are shown in Table 2-3.
2.6.3 Clinical Results
For each of the duality gaps tested, the DVHs of the solutions obtained using the
Single Approximation Hessian with the insignificant beamlets removed are compared.
Since each of the interior point implementations obtains nearly identical solutions, it
does not matter which implementation is used to produce the DVHs.
As previously stated, the prescription doses used are 70 Gy for PTV1 and 50 Gy for
PTV2, marked by a vertical line in Figure 2-4A. As saliva glands are the most difficult
organs to spare in head-and-neck cases, the only critical structures shown are the saliva
glands (Figure 2-4B). All other glands are spared in every implementation. The sparing
criterion used for saliva glands is that no more than 50% of the saliva gland can
Table 2-3. Percent increase in objective function value from various relative duality gaps as opposed to an absolute duality gap of ε = 0.001.

                     Remove insig.    Avg. increase in obj. fn. (%)
Hessian type         beamlets?        ε′ = 0.15   ε′ = 0.1   ε′ = 0.05   ε′ = 0.01
True                 no               0.58        0.58       0.27        0.05
True                 yes              0.58        0.48       0.25        0.06
BFGS                 no               0.99        0.54       0.30        0.05
BFGS                 yes              0.94        0.57       0.26        0.07
Single Approx.       no               1.26        0.89       0.60        0.19
Single Approx.       yes              1.21        0.87       0.57        0.16
Figure 2-4. Quality of DVHs for duality gaps ε′ = 0.01%, 0.05%, 0.1% and 0.15%. A) The target coverage is nearly identical. B) The saliva gland sparing for the different duality gaps is similar, but the solution for ε′ = 0.15% sacrifices one saliva gland. The sparing criterion is marked by a star.
receive more than 30 Gy in order to be spared. This point is marked by a star in Figure
2-4B.
Each of the duality gaps achieves good target coverage. While they each provide
similar saliva gland dosage, the plan obtained using ε′ = 0.15% slightly exceeds the
sparing criterion used for saliva glands.
2.6.4 Spatial Coefficient Results
To assess the possible treatment plan improvement afforded by spatial coefficients,
spatial parameters were tuned and then compared to treatment plans obtained without
using spatial information. To demonstrate the spatial coefficients, Figure 2-5 displays the
Figure 2-5. The spatial coefficients used for two cases.
coefficients used for two cases. In addition to tuning λ, µ and β to values of 1.07, -0.32
and 0.77, respectively, a minimum spatial coefficient of 0.025 was also set for target voxels.
By definition, the maximum value of a spatial coefficient is 1.
These spatial parameters generally produce treatment plans of nearly identical
quality to the best plans obtained without using spatial information, though with the
added benefit of preventing misleading dose-volume histograms. In some cases, the spatial
coefficients were able to outperform the non-spatial plans. Figures 2-6 and 2-7 illustrate
two such cases.
In Figure 2-6, the spatial coefficients yield improved target coverage and spare all
saliva glands, as opposed to the non-spatial plan which only spares three of the four saliva
glands. There is less dose outside the desired target in the plan using spatial coefficients.
In Figure 2-7, the spatial coefficients reduce the amount of overdose in the primary
targets. In this patient, both the spatial and non-spatial plans spare all saliva glands.
2.6.5 Warm Start Results
Warm start solutions were obtained using the interior point method and the projected
gradient algorithm (Nocedal and Wright [31]). The interior point method warm starts
were tested with each Hessian possibility and a large duality gap of 200, both with and
without insignificant beamlets removed. The projected gradient algorithm was tested using
Figure 2-6. Comparison of spatial and non-spatial treatment plans. A) Non-spatial parameters result in slightly low target dosage and fail to spare one saliva gland. B) Spatial parameters allow for improved target coverage and spare all saliva glands.
Figure 2-7. A) Non-spatial parameters result in slightly low target dosage and fail to spare one saliva gland. B) Spatial parameters allow for improved target coverage and spare all saliva glands.
several stopping criteria and without insignificant beamlets removed. It was observed
that the projected gradient algorithm is fast enough that the time required to remove and
re-insert the insignificant beamlets as necessary caused the algorithm to slow down. To
be theoretically close to optimal, the interior point method used after the warm start has
a duality gap of 0.001 and no beamlets removed.
To determine how close the warm start solution is to the final solution, the
percent improvement in objective function value that the final solution obtains over the warm
start is measured. To assess how close to optimality the final solutions using a warm start
are, the percentage by which their objective function values are greater than the objective
function value of a near-optimal solution is measured. Lastly, the decrease in run time
relative to obtaining a near-optimal solution is provided. These results for the interior point
and projected gradient warm starts are displayed in Tables 2-4 and 2-5, respectively.
From Table 2-4, it is clear that using an interior point warm start can provide
significant time savings over the near-optimal solution times. There is also a significant
increase in the FMO objective function value. From the amount of increase in the
objective function value, the interior point warm start does not appear to converge to
the optimal solution, and is unlikely to provide acceptable solutions. It is interesting to
note that the improvement from the warm start solution to the final solution is very small.
This indicates that the KKT information obtained from the warm start and used in the final
algorithm was unhelpful in improving the solution.
For the projected gradient algorithm, once the decrease in objective function value from
one iteration to the next is less than δ percent, the algorithm terminates. Several δ values are tested. As with
the interior point warm starts, the projected gradient warm starts also provided significant
time savings, as shown in Table 2-5. The final solutions from the projected gradient warm
start methods are nearly identical to the near-optimal solutions. The final interior point
method also significantly improves the objective value of the warm start solution. This
implies that despite not having KKT information about the warm start, the interior point
algorithm is still able to converge to an optimal, or at least near-optimal, solution using
the KKT value approximations and adjustments to the warm start vector described in
Section 2.5.4.
Table 2-4. Performance measures of interior point method warm starts. Every warm start uses a relative duality gap of ε′ = 5 and every final interior point run uses ε′ = 0.01 with insignificant beamlets left in; "Remove?" indicates whether the warm start removed insignificant beamlets.

Warm start Hessian   Remove?   Final Hessian    Improvement over    Increase in final   Avg. time
                                                warm start (%)      obj. fn. (%)        savings (s)
True                 no        True             0.00                4.46                64.75
True                 yes       True             0.19                4.48                65.20
True                 no        BFGS             0.00                4.79                27.94
True                 yes       BFGS             0.20                4.84                28.47
True                 no        Single Approx.   0.00                5.06                 6.85
True                 yes       Single Approx.   0.76                4.49                 6.93
BFGS                 no        True             0.00                4.46                64.97
BFGS                 yes       True             0.19                4.48                65.09
BFGS                 no        BFGS             0.00                4.79                27.83
BFGS                 yes       BFGS             0.20                4.84                28.55
BFGS                 no        Single Approx.   0.00                5.06                 6.90
BFGS                 yes       Single Approx.   0.76                4.49                 6.87
Single Approx.       no        True             0.00                4.46                64.99
Single Approx.       yes       True             0.19                4.48                65.00
Single Approx.       no        BFGS             0.00                4.79                27.95
Single Approx.       yes       BFGS             0.20                4.84                28.54
Single Approx.       no        Single Approx.   0.00                5.06                 6.88
Single Approx.       yes       Single Approx.   0.76                4.49                 6.88
Table 2-5. Performance measures of projected gradient method warm starts. No warm start removed insignificant beamlets, and every final interior point run uses ε′ = 0.01 with insignificant beamlets left in.

δ     Final Hessian    Improvement over    Increase in final   Avg. time
                       warm start (%)      obj. fn. (%)        savings (s)
1     True             19.83               0.00                36.63
5     True             31.78               0.00                10.98
10    True             36.43               0.00                19.16
100   True             56.59               0.01                39.28
500   True             89.46               0.09                56.88
1     BFGS             19.83               0.00                 9.27
5     BFGS             31.78               0.00                12.53
10    BFGS             36.43               0.00                19.30
100   BFGS             56.59               0.03                27.79
500   BFGS             89.46               0.13                30.30
1     Single Approx.   19.82               0.00                 3.40
5     Single Approx.   31.77               0.01                 3.95
10    Single Approx.   36.42               0.01                 4.28
100   Single Approx.   56.56               0.08                 9.28
500   Single Approx.   89.44               0.27                10.04
2.7 Conclusions
The primal-dual interior point method is an effective algorithm for obtaining fluence
maps that deliver quality treatment plans. The proposed Hessian alternatives appear
to converge to the optimal solution, even when insignificant beamlets are removed. The
removal of the insignificant beamlets provides significant time savings in all instances. The
interior point method may also be run with a duality gap as large as 20 and still achieve
quality treatment plans, thus decreasing the amount of time required to run the algorithm.
Of the implementations tested, the fastest method that still provides quality solutions
without using a warm start is to use the Single Approximation Hessian alternative, remove
insignificant beamlets and employ a relative duality gap of 0.1%.
When the interior point method is started with one of the warm starts discussed, time
savings were again significant. Although the interior point warm starts generally provided
more improvement in computation time than the projected gradient warm starts, the final
solutions using the projected gradient warm starts were much closer to optimality. The
fastest and most effective warm start method is to use the projected gradient algorithm
with δ = 500, followed by the interior point method with ε′ = 0.1% and the Single
Approximation Hessian. This combination results in a near-optimal solution with an
average total computation time of 8.32 seconds.
CHAPTER 3
BEAM ORIENTATION OPTIMIZATION
3.1 Introduction
In a typical head-and-neck treatment plan, radiation beams are delivered from 5-9
nominally-spaced coplanar orientations around the patient. These coplanar orientations
are obtained from rotating the gantry only. As shown in Figure 3-1, several components
of a linear accelerator can rotate and translate to achieve more orientations than those
obtained from rotating the gantry. The available orientations consist of the orientations
obtained from rotation of the gantry, collimator and couch, as well as the three translation
directions of the couch.
Figure 3-1. A linear accelerator and the available movements; the gantry rotation is highlighted.
BOO is the problem of selecting from the available beam orientations the best set
to use in delivering a treatment plan. Given a fixed set of beams, different fluence maps
(radiation intensities of beamlets) yield treatment plans with different qualities. Thus, the
quality of an optimized fluence map should be considered when selecting a set of beam
orientations to use in a treatment plan.
3.2 Literature Review
Many approaches have been taken to solve the BOO problem. Evolutionary
algorithms (Schreibmann [29]) and variants of evolutionary algorithms, particularly
genetic algorithms (Ezzell [32], Haas et al. [28], Li et al. [33]) have been employed. Li
et al. [34] use a particle swarm optimization method, which is conceptually based on
evolutionary algorithms. Bortfeld and Schlegel [35], Djajaputra et al. [36], Lu et al. [37],
Pugachev and Xing [38], Rowbottom et al. [39] and Stein et al. [40] have all employed
variations of simulated annealing to determine a beam solution. Soderstrom and Brahme
[41] selected coplanar beam orientations using two measures, entropy and the integral
of the low frequency part of the Fourier transform of the optimal beam profiles, both of
which are based on the size and shape of the target structure. Soderstrom and Brahme
[42] also use an iterative technique to determine the optimal number of coplanar beams
required using BOO. Das and Marks [27] use a quasi-Newton method. Rowbottom et al.
[43] use artificial neural network algorithms to select beam orientations. Gokhale et al.
[44] use a measure of each beam’s “path of least resistance” from the patient surface to
the target location to determine the best beam directions. Meedt et al. [45] use a fast
exhaustive search to obtain a non-coplanar solution. The concept of beam’s-eye view
(BEV) has also been commonly used to approach the BOO problem (Chen et al. [46], Cho
et al. [47], Goitein et al. [48], Lu et al. [37], Pugachev and Xing [38, 49, 50]).
Despite the varying techniques to quantify the quality of a beam solution, it is widely
accepted that the optimal solution to the FMO problem presents the most relevant
measure (Bortfeld and Schlegel [35], Djajaputra et al. [36], Holder and Salter [51], Lee
et al. [20, 21], Li et al. [33, 34], Meedt et al. [45], Morrill et al. [52], Oldham et al. [53],
Rowbottom et al. [39, 43, 54], Schreibmann et al. [29], Soderstrom and Brahme [41],
Stein et al. [40], Wang et al. [55, 56], Woudstra and Heijman [57]). Given this accepted
measure of treatment quality, the shortcoming of the previous works is twofold. First,
they predominantly only consider coplanar angles, and not necessarily even the entire
coplanar solution space, while those that do consider non-coplanar beams only consider
a hand-selected subset of the available orientations. Second, the majority of the previous
studies do not select beam solutions using the FMO problem as a model for determining
quality; instead, the beam solutions are chosen based on scoring methods (e.g., BEV, path
of least resistance) or approximations to the FMO. By not optimizing the beam solution
with respect to the exact FMO problem, the BOO methods cannot guarantee convergence
to an optimal solution.
Of the previously cited works, only Das and Marks [27], Gokhale et al. [44], Meedt et
al. [45], Lu et al. [37], Rowbottom et al. [39] and Wang et al. [56] consider non-coplanar
orientations. This is likely due to the computational difficulties associated with the
inclusion of non-coplanar orientations as well as the widespread belief that non-coplanar
orientations do not improve the quality of a treatment plan.
Also, of those works that addressed non-coplanar beams, Das and Marks [27] require
that the beam distances be maximized, essentially requiring that beam solutions must
be equi-distant and thus restricting the size of the solution space; Meedt et al. [45] only
consider 3,500 beams (a minute subset of orientations available by rotation of the couch
and the gantry); and Wang et al. [56] use only nine pre-selected non-coplanar beams.
With the exception of Das and Marks [27], Haas et al. [28] and Schreibmann [29],
the previous studies have based their BOO approaches not on a beam solution’s optimal
solution to the FMO problem, but on locally optimal FMO solutions or on various scoring
techniques. Without basing BOO on the optimal FMO solutions, the resulting beam
solutions have no guarantee of optimality, or even of local optimality.
3.3 Model Formulation
The goal of radiation therapy treatment planning is to design a treatment plan that
delivers a prescribed level of radiation dose to the targets while simultaneously sparing
critical structures by ensuring that the level of radiation dose received by these structures
is less than a structure-specific radiation dose. These two goals are contradictory if the
targets are located near critical structures. This is especially problematic for certain
cancers, such as tumors in the head-and-neck area, which are often located very close
to, for instance, the spinal cord, brain stem and salivary glands. In order to model the
BOO problem, a quantitative measure that appropriately makes trade-offs between
these contradictory goals must be developed. Let F (θ) be a black-box function that
quantifies the quality of the treatment plan if radiation is delivered from beam vector
θ = (θ1, . . . , θk), where k is the user-specified number of orientations that may be used. F
is formulated in such a way that the optimal plan yields the minimum function value.
For k beam orientations to be optimized in the treatment plan, the vector of decision
variables representing the beam orientations is defined as θ = (θ1, . . . , θk)T . The decision
vector θ is used as input into the black-box function F (θ) to determine the ability of the
beam vector to deliver the prescribed treatment without unduly damaging normal tissue
and critical structures. The BOO problem is then formulated as
min F(θ)
subject to θ_h ∈ B,   h = 1, . . . , k,
where B is the set of candidate beams. The candidate set of beams can be selected
according to any user-specified criteria; for example, the beams can be coplanar or
non-coplanar, continuous or discrete, or only represent a subset of the available beams.
It is also possible to fix some beams and only optimize a subset of the total number of
beams to be used. Theoretically, the linear accelerator is able to capture a continuous set
of orientations, but due to machine tolerances, the actual beams delivered may not be
exactly the desired beams. Therefore, it is common to only consider a discretized set of
beam orientations.
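To make the formulation concrete, the sketch below (Python) enumerates a tiny discrete candidate set; toy_F is an illustrative stand-in for the expensive black-box FMO value, not the actual solver:

    import itertools

    def boo_by_enumeration(F, candidate_beams, k):
        # Evaluate the black-box FMO value F over every k-beam subset of a small
        # discrete candidate set B and keep the minimizer. Combinations (rather
        # than permutations) suffice because beam order does not affect F.
        best_theta, best_value = None, float("inf")
        for theta in itertools.combinations(candidate_beams, k):
            value = F(theta)
            if value < best_value:
                best_theta, best_value = theta, value
        return best_theta, best_value

    # Toy usage with a quadratic stand-in for F on a 30-degree coplanar grid:
    toy_F = lambda theta: sum((a - 180) ** 2 for a in theta)
    print(boo_by_enumeration(toy_F, candidate_beams=range(0, 360, 30), k=3))

Exhaustive enumeration is only practical for tiny candidate sets, since each evaluation of F solves a full FMO problem; the response surface method of Section 3.6 exists precisely to avoid this.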
In our BOO model, the black-box function F (θ) is the convex FMO problem
described in Section 2.3, thus ensuring an exact measure of the quality of each beam
vector. Even though F (θ) is convex, this formulation of the BOO problem is fundamentally
nonlinear because the physics of dose deposition change with each beam orientation; that
is, the effect of a beam on each patient can be drastically different than the effect of a
neighboring beam. To illustrate the nonlinearity of the problem, Figure 3-2 shows the
FMO problem as a function of just two coplanar beam angles. From this illustration, it is
evident that the FMO function, particularly in higher, more realistic dimensions, is likely
to also be multi-modal.
Although the FMO problem itself can be solved quickly using the convex model
presented in Section 2.3, in order to perform the FMO, lengthy calculations must be made
in order to determine each candidate beam’s effect on the patient. These calculations,
described in Section 3.5, require ≈ 13 minutes per beam to calculate, and thus make each
evaluation of the FMO problem expensive. Despite the time required for each function
evaluation, the limiting factor in beam orientation optimization is the hard drive space
required to store the beam data for each candidate beam. If the candidate set of beams is
small, this data can be pre-computed and stored, allowing the FMO problem to be solved
quickly in the BOO problem. But, if the candidate set of beams is large—for example,
consisting of non-coplanar orientations—then the data cannot be pre-computed due to
storage requirements.
Because of these difficulties with the BOO problem, previous studies have been largely
unable to consider the entire solution space of available beams. By using the response
surface method, which is specifically designed to model expensive nonlinear black-box functions,
we can iteratively identify promising beam vector solutions and generate beam data for
these solutions on-the-fly, thus circumventing the issue of storage space and allowing for
the consideration of all deliverable beam orientations.
3.4 Mixed-Integer Model Formulation
As an alternative to the BOO model given in Section 3.3, if the set of beam
orientations B is finite, the BOO and FMO problems can be formulated together and
solved simultaneously as a mixed-integer linear or nonlinear program (D’Souza et al. [58],
Figure 3-2. FMO value as a function of two angles.
Ehrgott and Johnston [59], Ferris et al. [17], Lee et al. [20, 21], Lim et al. [60], Shepard
et al. [22], Wang et al. [61]). The FMO formulation can be combined with BOO in the
following model. Let yθ be a binary variable indicating whether or not beam θ ∈ B is used.
If beam θ is used in the treatment plan, then all the beamlets in θ, Bθ, are “turned on”;
that is, they can have positive fluences up to some pre-determined maximum intensity M .
The simultaneous BOO+FMO MIP model is then
minimize F(z)
subject to z_js = ∑_{θ∈B} ∑_{i∈B_θ} D_ijs x_i,   j = 1, . . . , v_s,  s = 1, . . . , S,
          x_i ≤ M y_θ,   i ∈ B_θ, θ ∈ B,
          ∑_{θ∈B} y_θ ≤ k,
          x_i ≥ 0,   i ∈ B_θ, θ ∈ B,
          y_θ ∈ {0, 1},   θ ∈ B.
In order to solve such a problem, all beam data must be pre-computed for every beam
orientation. As described in Section 3.5, beam data requires a tremendous amount of
time and space to compute and store. Because of this requirement, only a small subset of
all possible beam orientations can be considered due to time and space constraints for a
BOO+FMO MIP formulation.
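For illustration only, a minimal sketch of this MIP in Python with PuLP follows. Everything in it is synthetic: the candidate beam set, dose matrix D, prescription and voxel split are toy stand-ins, and a linear surrogate objective replaces the convex nonlinear F(z) (which PuLP cannot express) solely to keep the sketch solvable:

    import random
    import pulp

    random.seed(0)
    beams = [0, 60, 120, 180, 240, 300]          # candidate set B (toy, coplanar)
    n_per_beam, n_vox, k, M = 4, 8, 2, 10.0
    beam_of = {100 * b + i: b for b in beams for i in range(n_per_beam)}  # beamlet -> beam
    D = {(i, j): random.random() for i in beam_of for j in range(n_vox)}  # toy dose matrix
    targets, Rx = [0, 1, 2, 3], 40.0             # illustrative target voxels / dose

    prob = pulp.LpProblem("BOO_FMO", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("x", beam_of.keys(), lowBound=0)   # beamlet fluences
    y = pulp.LpVariable.dicts("y", beams, cat="Binary")          # beam selection
    z = pulp.LpVariable.dicts("z", range(n_vox), lowBound=0)     # voxel doses
    u = pulp.LpVariable.dicts("u", targets, lowBound=0)          # target underdose slack

    for j in range(n_vox):                        # dose definition z_js
        prob += z[j] == pulp.lpSum(D[i, j] * x[i] for i in beam_of)
    for i, b in beam_of.items():                  # beamlets on only if their beam is used
        prob += x[i] <= M * y[b]
    prob += pulp.lpSum(y[b] for b in beams) <= k  # at most k beams
    for j in targets:                             # linearized underdose: u_j >= Rx - z_j
        prob += u[j] >= Rx - z[j]
    # Linear surrogate objective (a stand-in for the convex nonlinear F(z)):
    # target underdose plus a small penalty on dose to the remaining voxels.
    prob += (pulp.lpSum(u[j] for j in targets)
             + 0.01 * pulp.lpSum(z[j] for j in range(n_vox) if j not in targets))
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print("selected beams:", [b for b in beams if y[b].value() > 0.5])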
3.5 Beam Data Generation
For each beam orientation that is considered, lengthy calculations must be made to
determine the beam’s effect on the patient’s tissue and organs. This includes determining
in which structure each voxel lies, which voxels are hit by which beamlets and how much
of each beamlet's intensity is deposited in each voxel through which it passes.
Beamlet dose computation models used in IMRT rely heavily on ray-tracing
algorithms for voxel classification and determination of the radiological path (Fox et
al. [62]). Voxel classification (Siddon [63]) establishes whether voxels are inside or outside
the path of a radiation beam and classifies voxel centers as inside or outside of segmented
targets and critical structures. The radiological path is the effective distance traveled by
a beamlet when the effect of traveling through tissues of different densities is considered.
The exact radiological path of a beamlet through the patient is required to correct for
tissue heterogeneities in determining the dose deposition coefficients (Siddon [64]).
Siddon’s ray-tracing algorithms (Siddon [63, 64]) have been the standard methods
used for ray-tracing in radiotherapy since the 1980s. In Siddon’s polygon and voxel
ray-tracing algorithms for voxel classification (point-in-polygon testing), structures
are represented as 3D polygonal objects, known as Siddon Prisms, and the signs of
cross-products of rays passing through the polygons are used to determine whether a voxel
lies inside or outside a structure. Despite its overwhelming use, Siddon’s algorithm for
polygon ray-tracing becomes very costly due to the number of voxels in a patient. Fox et
al. [62] developed a novel approach to polygon ray-tracing that circumvents the need for
cross-products by translating the polygon structure onto a coordinate system, replacing
the need for a cross-product by the sign of the second coordinate of each voxel in the
coordinate system.
In Siddon’s algorithm for determining radiological paths (Siddon [64]), the radiological
path must be determined for each voxel for every beamlet. This involves computations
for millions of beamlet-voxel combinations. As reported by Jacobs et al. [65], a significant
amount of computational time is required for these repeated calculations. Fox et al. [62]
combine the incremental voxel ray-tracing algorithm presented by Jacobs et al. [65] with a
method of virtual stereographic projection to significantly reduce the computational cost
of obtaining radiological path lengths.
Using their polygon translation and incremental ray-tracing algorithms, Fox et al. [62]
achieve a 100-300 fold improvement in computation time over Siddon’s point-in-polygon
algorithm. Because of the significant reduction in computation time, these methods are
used to generate beam data.
Because these beam data calculations must be performed for each of millions of
beamlet-voxel combinations, beam data generation is a lengthy process, requiring ≈ 13
minutes per beam using the algorithms described by Fox et al. [62]. In a typical FMO
formulation, the beam vector is pre-determined and the beam data for the beam vector
is calculated once and stored a priori. For a 5-beam case, this requires ≈150 MB of space
to store. As with a typical FMO problem, in a simultaneous FMO+BOO mixed-integer
programming (MIP) formulation, beam data for each of the candidate beams in B must
be generated a priori. If candidate beams are considered only for coplanar angles on a 10◦
grid, that is, only every 10th angle, beam data would have to be computed for 36 beams,
which requires ≈5 hours to compute and ≈800 MB of space to store. If we also wanted to
consider the possibility of rotating the couch on a 10◦ grid in addition to the gantry, beam
data would then have to be computed for 36² beams, which would require ≈170 hours and
≈ 60 GB of space for just one plan.
Clearly, the storage space requirements for each beam restricts the number of beams
that can be considered in a simultaneous FMO+BOO MIP formulation. This issue is
typically addressed by simply restricting the number of candidate beams in B. Lee et al.
[20] restrict the set B to only contain 18 pre-selected beam orientations, which can be
coplanar or non-coplanar. If only gantry and couch rotations are allowed on a 10◦ grid, a
beam set of 18 beams comprises only a small percent of the available beam orientations.
As more ranges of motion are allowed, this percentage falls even further. The inclusion
of all possible beam orientations significantly increases the size of the solution space and
could possibly allow for improved treatment plans, but the beam data for all orientations
cannot be pre-computed. In order to consider these orientations, we use a method that
allows us to generate the beam data on-the-fly only as necessary.
3.6 A Response Surface Approach to BOO
Because beam data generation is costly, a method that iteratively identifies only
promising beam orientations is required. The response surface (RS) method is such an
algorithm. In contrast to the previous studies, our approach to the BOO problem allows
for the inclusion of all possible beam orientations which are measured according to the
exact FMO problem, thus ensuring convergence to optimality due to the properties of the
response surface method.
The RS method is designed to efficiently model expensive black-box functions. In this
application, the FMO solver is our black box and the set of beams to be used is the input.
As in Aleman et al. [66, 67], we employ the response surface method as detailed in Jones
[68] and Jones et al. [69].
3.6.1 Overview of Response Surfaces
The response surface method identifies promising solutions based on the performance
of previous solutions. The function value and expected improvement over the current
best solution of a certain point is estimated based on the function behavior learned from
previously sampled points and their calculated objective function values. The function
values of points are related by correlation functions that depend on each point’s distance
from the previously sampled points. From the correlation functions, the algorithm predicts
the probability that the best solution will improve at unexplored points in the solution
space. Using this probability, a promising solution is identified. For the BOO problem,
beam data only needs to be generated for these promising solutions, thus saving both
computation time and storage space.
The response surface method models the objective function as a stochastic process of
the form
F (θ) = µ + ε(θ), (3–1)
where µ is a constant representing an average of the function F and ε(θ) is a random error
term associated with the point θ. In the general case, the error terms between two points,
say θ(1) and θ(2), are correlated by
Corr(ε(θ(1)), ε(θ(2))) = exp[−d(θ(1), θ(2))],   (3–2)
where d(θ(1), θ(2)) is a weighted distance measure between θ(1) and θ(2). Intuitively, if two
points are very close together, the correlation between them will be close to one; similarly,
if two points are very far apart, the correlation between them will approach zero. Jones et
al. [69] propose the following weighted distance measure in general:
d(θ(1), θ(2)) = ∑_{h=1}^{k} c_h |θ(1)_h − θ(2)_h|^{p_h},
where the parameters ch and ph are weighting factors corresponding to the importance
of each variable h and the smoothness of the function F in the direction of variable h,
respectively. If small changes in variable h cause large changes in the function F , then ch
should be large to reflect that two points with relatively small differences in the value of
variable h should be “far” apart due to the large difference in their function values, and
thus have a low correlation. The parameter ch can take on any value, whereas 1 ≤ ph ≤ 2,
with ph = 2 corresponding to objective function smoothness and ph = 1 corresponding to
less objective function smoothness.
In the application to BOO, θ = (θ1, . . . , θk) is the vector of k angles from which
radiation will be delivered. Because no beam is more important than another beam, each
beam orientation h contributes equally to the FMO function, so ch = c and ph = p for
all h = 1, . . . , k. To maintain tractability of the subproblems described in the following
sections, the angles are treated as though they are points on a line rather than points on
a circle and so a Euclidean distance metric is used to determine the distance between two
points. The weighted distance measure for BOO is then
d(θ(1), θ(2)) = c‖θ(1) − θ(2)‖_p^p,   (3–3)

where ‖·‖_p denotes the ℓ_p-norm. To ensure tractability of the subproblems described in
Section 3.6.2, the value p = 2 is used.
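As a small illustration, the following Python sketch evaluates the BOO distance (3–3) and the induced correlation (3–2) for p = 2; the weight c = 0.001 is an arbitrary illustrative value:

    import numpy as np

    def boo_distance(theta1, theta2, c=0.001, p=2):
        # Weighted distance (3-3) between two beam vectors.
        diff = np.asarray(theta1, dtype=float) - np.asarray(theta2, dtype=float)
        return c * np.sum(np.abs(diff) ** p)

    def correlation(theta1, theta2, c=0.001, p=2):
        # Correlation (3-2) induced by the distance: near 1 for nearby beam
        # vectors, near 0 for distant ones.
        return np.exp(-boo_distance(theta1, theta2, c, p))

    print(correlation([10, 90], [12, 92]))     # close points: ~0.99
    print(correlation([10, 90], [200, 300]))   # distant points: ~0.0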
The idea of the RS method is to iteratively evaluate the true function F at certain
beam vectors θ, and then construct the conditional stochastic process given these function
values. This conditional stochastic process is then used to decide where to evaluate the
function F next. Due to the time and space required to generate the beam data necessary
to evaluate the function F , it is desirable to only evaluate points that will either improve
the best solution with a significant probability or significantly increase our knowledge of
the function. The optimization models to determine the next observation are described in
Section 3.6.2.
Let θ(1), . . . , θ(n) be the n previously sampled points, let R_n be the matrix of correlations
between the previously sampled points, let y_n be the vector of function values F(θ(i)) of the
previously sampled points, and let µ_n and σ_n be estimators of the average and variance of the
function F, respectively. The response surface algorithm is given by:
• Initialization:
1. Choose values for the parameters c and p.
2. Choose an initial sample size, n, and a set of angles θ(i), i = 1, . . . , n. Evaluate the function F at each of these points, yielding the values y_i, i = 1, . . . , n.
• Iteration:
1. Compute or update the values of R_n, R_n^{-1}, µ_n, σ_n, and F̄_n, the minimum observed objective function value.
2. Determine the next point to observe using one of the methods described in Section 3.6.2 and call this point θ(n+1).
3. Find the value y_{n+1} = F(θ(n+1)), set n ← n + 1, and repeat.
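The following Python sketch mirrors the structure of this loop. The estimator formulas used for µ_n and σ_n^2 are the standard kriging estimators (assumed here, following Jones et al. [69]); propose_next_point stands in for the subproblems of Section 3.6.2 and is supplied by the caller, here as a purely illustrative random proposal:

    import random
    import numpy as np

    def response_surface_loop(F, theta_init, propose_next_point, n_iters=10,
                              c=0.001, p=2):
        # Structural sketch of the iteration above. Nearly duplicated sample
        # points make R ill-conditioned; that instability is addressed in
        # Section 3.6.2.2.
        thetas = [np.asarray(t, dtype=float) for t in theta_init]
        ys = [F(t) for t in thetas]
        for _ in range(n_iters):
            n = len(thetas)
            R = np.array([[np.exp(-c * np.sum(np.abs(t1 - t2) ** p))
                           for t2 in thetas] for t1 in thetas])   # correlations R_n
            R_inv = np.linalg.inv(R)
            one, y = np.ones(n), np.array(ys)
            mu = (one @ R_inv @ y) / (one @ R_inv @ one)          # estimator of mu_n
            sigma2 = (y - mu * one) @ R_inv @ (y - mu * one) / n  # estimator of sigma_n^2
            f_best = y.min()                                      # F-bar_n
            theta_next = np.asarray(
                propose_next_point(thetas, R_inv, mu, sigma2, f_best), dtype=float)
            thetas.append(theta_next)
            ys.append(F(theta_next))
        best = int(np.argmin(ys))
        return thetas[best], ys[best]

    # Toy usage: a quadratic stand-in for the FMO value and a random proposal.
    random.seed(1)
    best_theta, best_val = response_surface_loop(
        F=lambda t: (t[0] - 120.0) ** 2 + (t[1] - 240.0) ** 2,
        theta_init=[[0.0, 0.0], [180.0, 180.0], [300.0, 60.0]],
        propose_next_point=lambda *args: [random.uniform(0, 360) for _ in range(2)],
        n_iters=20)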
3.6.2 Determining the Next Observation
Because the function F is expensive to evaluate, we want to sample as few points
as possible. Thus, in each iteration, an optimization problem is solved that determines
the “best” next point at which to observe the true function F . Some of the optimization
problems that have been proposed in the literature depend on the uncertainty of the
predictor as a function of θ, as well as the expected improvement over the current best
solution (Jones [68], Jones et al. [69]).
Let r_n(θ) be the vector of correlations between θ and the n previously sampled points.
The uncertainty is then given by

s_n^2(θ) = σ_n^2 [ 1 − r_n(θ)^T R_n^{-1} r_n(θ) + (1 − 1^T R_n^{-1} r_n(θ))^2 / (1^T R_n^{-1} 1) ],

where

σ_n^2 = (1/n) (y_n − 1µ_n)^T R_n^{-1} (y_n − 1µ_n)

is the estimator of the variance based on the n observations. The expected improvement,
denoted In(θ), is given by
I_n(θ) = s_n(θ) [ zΦ(z) + φ(z) ],   (3–4)

where

z = ( F̄_n − F̂_n(θ) ) / s_n(θ),   (3–5)

F̄_n = min{y_1, . . . , y_n} is the current best solution and F̂_n(θ) is the estimated function
value of θ given the n previously sampled points. Φ and φ are the c.d.f. and p.d.f. of a
standard normal random variable, respectively.
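A sketch of these two quantities in Python follows; R_inv, mu, sigma2 and f_best correspond to R_n^{-1}, µ_n, σ_n^2 and F̄_n from the loop above, and the standard kriging predictor F̂_n(θ) = µ_n + r_n(θ)^T R_n^{-1}(y_n − 1µ_n) is assumed, since the chapter uses it without restating it:

    import numpy as np
    from scipy.stats import norm

    def uncertainty_and_ei(theta, thetas, y, R_inv, mu, sigma2, f_best,
                           c=0.001, p=2):
        # Evaluate s_n^2(theta) and I_n(theta) at a candidate beam vector.
        theta = np.asarray(theta, dtype=float)
        r = np.array([np.exp(-c * np.sum(np.abs(theta - t) ** p)) for t in thetas])
        one = np.ones(len(thetas))
        f_hat = mu + r @ R_inv @ (y - mu * one)    # kriging predictor F-hat_n
        s2 = sigma2 * (1.0 - r @ R_inv @ r
                       + (1.0 - one @ R_inv @ r) ** 2 / (one @ R_inv @ one))
        s = np.sqrt(max(s2, 0.0))                  # guard round-off negatives
        if s == 0.0:
            return 0.0, 0.0
        z = (f_best - f_hat) / s                   # equation (3-5)
        ei = s * (z * norm.cdf(z) + norm.pdf(z))   # equation (3-4)
        return s2, ei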
The selection of the next point will be based on selecting the point that maximizes
either the uncertainty or the expected improvement, or a combination of both. Denote the
beam vector to be chosen as the vector θ.
3.6.2.1 Maximizing the expected improvement
Jones [68] and Jones et al. [69] recommend selecting the next point to sample as the
point θ for which the expected improvement over the current best solution value, In(θ), is
largest. This corresponds to solving the following optimization problem:
max In(θ)
subject to θ_h ∈ B,   h = 1, . . . , k.
Although this is a difficult optimization problem, it can be solved using a branch-and-bound
technique, but in order to do so, an upper bound on In(θ) must be obtained. This can
be done by solving for the expected improvement in equation (3–4) while substituting
an upper bound on the uncertainty and a lower bound on Fn(θ), used in equation (3–5)
to determine the value z. The method of bounding Fn(θ) is taken directly from Jones
[68] and Jones et al. [69] and is not discussed further here. The method of bounding
s_n^2(θ) is improved from the original formulation in Jones et al. [69] to overcome numerical
instabilities, and is presented in Section 3.6.2.2. The branch-and-bound algorithm used to
maximize In(θ) is described in Section 3.6.2.3.
3.6.2.2 Obtaining an upper bound on the uncertainty
Due to the complexity of the s_n^2(θ) function, maximizing the uncertainty is a difficult
problem to solve. It can be relaxed into a linearly constrained quadratic programming
problem as follows (Jones et al. [69]). The resulting solution to the relaxed uncertainty
maximization problem is an upper bound on the uncertainty that can be used in
determining an upper bound on In(θ) as described in Section 3.6.2.1.
Let r = {r1, . . . , rn}, where r is a vector of decision variables independent of θ. By
treating both r and θ as decision variables, a quadratic objective function is obtained.
59
Because r is now a decision variable independent of θ, an equality constraint must be
added to the problem to ensure that r assumes the correct correlation values according
to the correlation definition in equation (3–2). This constraint is nonlinear, but it can be
relaxed by expressing the single equality as two inequalities (≤ and ≥) and then replacing
the nonlinear terms generated by ln(ri) and c‖θ − θ(i)‖22 with linear underestimators
ai + biri and pi,h + qi,hθh, respectively. The different types of linear estimators require
different values for ai, bi, pi,h and qi,h, and are differentiated by a superscript c for the
chord underestimators and a superscript t for the tangent line underestimators in the
model formulation, denoted Problem s2-UB.
Unfortunately, this relaxation provided by Jones et al. [69] can become numerically
unstable if two sampled points are very close together. If such a situation arises, the
bounds of the corresponding correlation value can become so close that due to round-off
error, the lower bound r_i^L can become slightly larger than the upper bound r_i^U, resulting
in infeasibility. To avoid such an instability, instead of bounding r_i using constraints, the
amount by which r_i is outside of its feasible range is penalized by adding penalization
terms w_i^L = min{0, r_i − r_i^L} and w_i^U = min{0, r_i^U − r_i}. This final formulation is given in
Problem s2-UB. This formulation has only two more variables and two more constraints
for each sampled point, so the increased problem size does not significantly increase the
amount of time required to solve the problem.
PROBLEM s2-UB: Choose r and θ to

min  −σ_n^2 [ 1 − r^T R_n^{-1} r + (1 − 1^T R_n^{-1} r)^2 / (1^T R_n^{-1} 1) ] + ∑_{i=1}^{n} (w_i^L)^2 + ∑_{i=1}^{n} (w_i^U)^2

subject to  (a_i^c + b_i^c r_i) + c ∑_{h=1}^{k} (p_{i,h}^t + q_{i,h}^t θ_h) ≤ 0,   i = 1, . . . , n
            (a_i^t + b_i^t r_i) + c ∑_{h=1}^{k} (p_{i,h}^c + q_{i,h}^c θ_h) ≤ 0,   i = 1, . . . , n
            w_i^L ≤ 0,   i = 1, . . . , n
            w_i^L ≤ r_i − r_i^L,   i = 1, . . . , n
            w_i^U ≤ 0,   i = 1, . . . , n
            w_i^U ≤ r_i^U − r_i,   i = 1, . . . , n
            l_h ≤ θ_h ≤ u_h,   h = 1, . . . , k
Using the upper bound on the uncertainty provided by Problem s2-UB, the point
yielding the maximum uncertainty is obtained by using the same branch-and-bound
method described in Section 3.6.2.3, except that s_n^2(θ) is maximized rather than I_n(θ).
Alternatively, the next point could be chosen by maximizing the uncertainty rather than
the expected improvement; the branch-and-bound approach described in Section 3.6.2.3
can be adapted to solve that problem as well.
3.6.2.3 Branch-and-Bound
A branch-and-bound method is used to determine the maximum expected improvement
in each iteration. At some point in the algorithm, n points, θ(1), . . . ,θ(n), have already
been observed. The solution space is divided into regions based on these previously
sampled points, and each region is considered as a separate subproblem.
Each of these subproblems is solved using branch and bound. First, the upper bound
on the uncertainty is determined as described in Section 3.6.2.2 using the subregion’s
lower and upper bounds on θ. Next, the lower bound F^L on F̂_n(θ) is determined using the
method in Jones [68] and Jones et al. [69].
The upper bound on s_n^2(θ) and the lower bound on F̂_n(θ) are now used to determine an
upper bound on I_n(θ) over the current subregion by solving for I_n(θ) after substituting
F̂_n(θ) = F^L and s_n(θ) = s^U, as described in Jones [68] and Jones et al. [69]. In addition,
the θ that yielded the maximum uncertainty can be used to evaluate the function In(θ),
yielding a lower bound on In(θ) over the interval lh ≤ θh ≤ uh, h = 1, . . . , k. This value is
used to update the current best lower bound found (i.e., if the current best lower bound is
less than the new lower bound found, the current best lower bound is replaced by the new
one; otherwise, the current best lower bound is unchanged).
If the upper bound is less than the current best lower bound, the subregion is
discarded as not interesting. If the lower and upper bound are very close, we say that
we have found the optimum over the current subregion. Otherwise, the upper bound is
significantly larger than the current lower bound, so the subregion is further divided into
subregions as described below and the procedure is repeated for each of the new regions.
This is the branching step.
At some point, there are no more subregions to consider, as we have either decided
they are not interesting or have found the optimal solution for that subregion. Then, the
algorithm terminates and the current best lower bound is the optimal solution for In(θ)
over the current region.
This branch-and-bound procedure is applied to each of the regions, and the overall
largest In(θ) value is then the maximum In(θ), and the corresponding θ is the next point
at which to evaluate the FMO function.
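Structurally, the procedure is a standard fathoming loop; the sketch below (Python) shows the control flow only, with the bounding and branching routines passed in as functions because they are specific to Sections 3.6.2.1–3.6.2.3:

    def branch_and_bound(initial_regions, upper_bound_ei, lower_bound_ei, branch,
                         tol=1e-6):
        # Each region is a set of per-index (lower, upper) bounds.
        # upper_bound_ei(region) bounds I_n from above over the region;
        # lower_bound_ei(region) evaluates I_n at a feasible point, returning
        # (value, theta); branch(region) splits a region into subregions.
        best_value, best_theta = -float("inf"), None
        stack = list(initial_regions)
        while stack:
            region = stack.pop()
            ub = upper_bound_ei(region)
            if ub <= best_value:                   # fathom: cannot beat incumbent
                continue
            value, theta = lower_bound_ei(region)  # I_n at a feasible point
            if value > best_value:                 # update incumbent lower bound
                best_value, best_theta = value, theta
            if ub - value > tol:                   # bounds far apart: branch
                stack.extend(branch(region))
        return best_theta, best_value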
Selecting the subregions. An important component of the branch-and-bound
algorithm is the method of selecting the subregions. The definition of these subregions,
as well as the order in which they are explored, can have significant impact on both the
amount of time and memory required to perform the algorithm. As our implementation
of the branch-and-bound method requires that the entire solution space be divided into
subregions before the branch-and-bound algorithm begins, the selection of these initial
regions may also affect the speed of the algorithm.
Initial regions. Before beginning the branch-and-bound process, the solution space
of the decision variables, θh ∈ [0, 360] for all h = 1, . . . , k, is divided into a set of initial
regions. If θ represents non-coplanar orientations, we consider two ways of selecting the
regions defined by the non-coplanar orientations. First, we consider the entire solution
space as the only region, that is, instead of dividing the solution space into several
subregions, we only consider one subregion that encompasses the entire solution space (see
Figure 3-3A).
Second, denote a subset of variable indices H ⊆ {1, . . . , k}. For each index h ∈ H,
order the n previously sampled points increasingly by h. For each previously sampled
point i = 1, . . . , n − 1, consider the regions defined by l_h = 0 and u_h = 360 for h ∉ H,
and l_h = θ(i)_h and u_h = θ(i+1)_h. Also consider the region defined by l_h = 0 and u_h = 360
for h ∉ H, and l_h = 0 and u_h = θ(1)_h. Similarly, consider the region defined by l_h = 0 and
u_h = 360 for h ∉ H, and l_h = θ(n)_h and u_h = 360. Figures 3-3A–3-3D illustrate the initial
regions for different H values where k = 2. Denote the initial region set where H = ∅
as B0 (Figure 3-3A), H = {1} as B1 (Figure 3-3B), H = {2} as B2 (Figure 3-3C) and
H = {1, 2} as B3 (Figure 3-3D).
Note that in the coplanar case, it is only necessary to test the initial region scheme for
one angle because the angles are interchangeable.
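A small Python sketch of this construction, assuming 0-based variable indices and regions represented as lists of (lower, upper) pairs, might be:

    def initial_regions(samples, H, k, span=360):
        # Indices outside H keep the full [0, span] range, while each h in H is
        # cut at the sampled values of component h; multiple indices in H yield
        # the cross product of their intervals, as in Figure 3-3D.
        regions = [[(0, span)] * k]
        for h in sorted(H):
            cuts = sorted(s[h] for s in samples)
            intervals = list(zip([0] + cuts, cuts + [span]))
            regions = [r[:h] + [iv] + r[h + 1:] for r in regions for iv in intervals]
        return regions

    # Two sampled 2-beam solutions; H = {0} reproduces the vertical strips of B1:
    print(initial_regions([(100, 50), (250, 300)], H={0}, k=2))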
Bounds for discrete and continuous variables. If θ is discrete, the points on the
boundary between the two subregions will be contained in both subregions, thus
creating an inefficiency. This can be seen in Figure 3-4A, where θ_b^(1) is the point at which
we branch and the blue line represents the division of the region into two subregions. The
boundary line is contained in both the top interval and the bottom interval. This overlap
can be avoided when θ is integral by adjusting the bounds between subregions in such a
Figure 3-3. Initial regions in the branch-and-bound algorithm. A) Initial regions with H = ∅ (B0). B) Initial regions with H = {1} (B1). C) Initial regions with H = {2} (B2). D) Initial regions with H = {1, 2} (B3).
way as to prevent overlapping between any subregions. If the lower bound l_h on θ_h in a
subregion is fractional, then we discard the non-integral solutions by setting l_h = ⌈l_h⌉.
Similarly, if the upper bound u_h on θ_h in a subregion is fractional, then u_h = ⌊u_h⌋. If the
l_h and u_h bounds are integral and the lower bound l_h of one subregion coincides with the
upper bound u_h of the adjacent subregion, overlapping is avoided by setting u_h = l_h − 1
(see Figure 3-4B). If θ is continuous, the bounds cannot be adjusted.
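In code, these adjustments amount to rounding plus an off-by-one split; a minimal sketch assuming integral θ:

    import math

    def tighten_integer_bounds(l_h, u_h):
        # Fractional endpoints contain no feasible integral angles: round the
        # lower bound up and the upper bound down.
        return math.ceil(l_h), math.floor(u_h)

    def split_without_overlap(l_h, u_h, branch_point):
        # Split an integral interval [l_h, u_h] at branch_point so that the two
        # children [l_h, branch_point - 1] and [branch_point, u_h] are disjoint.
        return (l_h, branch_point - 1), (branch_point, u_h)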
Branching scheme. The basic principle of the branch-and-bound method is to
decompose regions into smaller subregions in such a way that as many subregions as
possible can be discarded as uninteresting, leaving a reduced number of subregions that
must actually be searched. The branch-and-bound method is well studied,
and as such, there are numerous methods of selecting the subregions. Regions may be
divided into two equal subregions (bisection), or more generally, into multiple subregions
which may or may not be equal in size (multisection) (Csallner et al. [70], Lagouanelle and
Soubry [71]). Some other common methods include selecting only a subset of variables on
which to branch (Epperly et al. [72]), using Lagrangian duality to obtain lower bounds
(Barrientos and Correa [73], Thoai [74], Tuy [75]) and applying decomposition algorithms
(Phong et al. [76], Bomze [77], Cambini and Sodini [78]).
In our branching step, we form the subregions based on some point in the region. The
region is divided at this point along one of the indices. In Figure 3-4A, θ_b^(1) is the point
at which we branch. We branch by dividing the region horizontally into two subregions
at θ_b^(1), taking into account the adjustments to the bounds described above so as to avoid
overlapping regions. For k = 2, in each branching step, we alternately divide the region
horizontally (along index 2) and vertically (along index 1) as shown in Figures 3-4B–3-4D.
After branching horizontally once at θ_b^(1) as shown in Figure 3-4B, we examine the top
region and select θ_b^(2) as the point at which we branch. We then branch by dividing this
subregion vertically at θ_b^(2). We proceed in the same manner for θ_b^(3), where we branch
horizontally, and so on until the convergence criterion is met.
In the general case, we divide the region into two subregions along the branching
index while cycling through each of the indices h = 1, . . . , k sequentially. For the
branching index h̄, the bounds for one new subregion are l̄_h̄ = l_h̄ and ū_h̄ = θ_b,h̄ − 1,
and the bounds for the other new subregion are l̄_h̄ = θ_b,h̄ and ū_h̄ = u_h̄. The lower
and upper bounds on the region for the remaining indices are unchanged for both new
subregions, i.e., l̄_h = l_h and ū_h = u_h for h ≠ h̄.
In the non-coplanar case, a beam in θ may be represented by more than one index.
For example, if a single non-coplanar beam consisting of couch and gantry rotation is
optimized, the vector θ consists of θ1 representing the gantry angle and θ2 representing
the couch angle. The branching index h ∈ {1, 2} represents branching on either the
gantry angle or on the couch angle. If two such non-coplanar beams are optimized,
then θ consists of θ1 and θ2 representing the gantry and couch angles of the first beam,
respectively, and θ3 and θ4 representing the gantry and couch angles of the second beam,
respectively. The branching index h ∈ {1, 2, 3, 4} then represents branching on a single
component of a single beam.
Accounting for symmetry. In the case where θ represents a set of coplanar beam
angles, the ordering of the variables in θ is irrelevant to the FMO value obtained at θ. For
example, if θ(1) = (10, 20, 30, 40) and θ(2) = (20, 30, 40, 10), then F (θ(1)) = F (θ(2)). Thus,
it is redundant to consider both θ(1) and θ(2), and elimination of these redundant regions
can greatly decrease the size of the solution space.
For example, if we consider the two-dimensional case (k = 2), the solution space is a
square region with 0 ≤ θ1 ≤ 360 and 0 ≤ θ2 ≤ 360. The points above the line θ1 ≤ θ2 are
equivalent to the points below the line, so we only need to consider one of these regions.
Say we branch by splitting the region into four equal quadrants, as shown in Figure 3-5A.
If we arbitrarily choose to only examine the points above the line θ1 ≤ θ2, then quadrant 4
can be eliminated.
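In code, this symmetry can be removed by mapping every coplanar beam vector to a sorted representative; a one-line sketch:

    def canonical(theta):
        # Sorting the angles picks one representative per set of equivalent
        # coplanar solutions, matching the constraint theta_1 <= ... <= theta_k.
        return tuple(sorted(theta))

    assert canonical((20, 30, 40, 10)) == canonical((10, 20, 30, 40))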
Figure 3-4. Partitioning a region into subregions. A) Partitioning a region into subregions without accounting for overlap. B) Preventing overlapping regions. C) Regions after two branches. D) Regions after three branches.
Figure 3-5. Accounting for symmetry. A) Accounting for symmetry in 2D. B) Accounting for symmetry in 3D.
In three dimensions, the solution space is a cube. If we branch by splitting the cube
into eight equal cubes, the region to be examined is shown in Figure 3-5B, where the
origin is the back bottom left corner of the cube. From this figure, we can see that a
sizable portion of the solution space can be discarded.
In regions where there are both viable and redundant solutions (for example,
quadrants 2 and 3 in Figure 3-5A), the addition of constraints requiring that θ1 ≤ . . . ≤ θk
in the problem of maximizing the expected improvement ensures that only the unique
portion of the region is considered.
If more than one non-coplanar orientation is optimized, a similar symmetry to the
multiple coplanar orientation symmetry exists. Consider an implementation where two
non-coplanar beam orientations are optimized, and these orientations are obtained from
rotating both the gantry and the couch. Each beam is represented by two variables
in the solution vector: one variable indicating the degree of gantry rotation, and one
variable indicating the degree of couch rotation. Let θ1 and θ2 be the gantry rotation and
couch rotation of the first beam, respectively, and θ3 and θ4 be the gantry rotation and
couch rotation of the second beam, respectively. Then, the solution vector {θ1, θ2, θ3, θ4}
is identical to the solution vector {θ3, θ4, θ1, θ2}. Because the couch angle selected is
dependent on the gantry angle (and vice versa), this symmetry can be exploited by only
removing redundant solutions from one of the beam variables, that is, by requiring that
θ1 ≤ θ3 (removing redundancy from the gantry angles) or θ2 ≤ θ4 (removing redundancy
from the couch angles). In general, if d degrees of motion are used to obtain m beam
orientations, and the linear accelerator motion variables are in the same order for each
beam, then θk ≤ θk+d ≤ θk+2d ≤ . . . ≤ θk+(m−1)d for some k ∈ {1, . . . , d}.
3.6.3 Method of Obtaining the Next Observation
The RS algorithm allows for two methods of selecting the next point to observe:
by maximizing the expected improvement or by maximizing the uncertainty. In these
tests, the point to observe is obtained by first selecting the point that maximizes the
expected improvement until the maximum expected improvement falls below a certain
threshold, and then switching to the point that maximizes the uncertainty. Once the
maximum uncertainty also falls below a certain threshold, the algorithm terminates. By
first selecting according to the expected improvement, the method quickly obtains a good
solution. By then selecting according to uncertainty, theoretical convergence to the global
minimum is ensured.
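As a concrete illustration of this two-phase rule, the following Python sketch selects the next observation (illustrative only; expected_improvement, uncertainty and the thresholds are hypothetical stand-ins for the RS machinery):

    def next_observation(candidates, expected_improvement, uncertainty,
                         ei_threshold, unc_threshold):
        """Two-phase selection: sample the maximizer of the expected
        improvement while it stays above its threshold, then switch to the
        maximizer of the uncertainty; return None to signal termination."""
        best_ei = max(candidates, key=expected_improvement)
        if expected_improvement(best_ei) >= ei_threshold:
            return best_ei
        best_unc = max(candidates, key=uncertainty)
        if uncertainty(best_unc) >= unc_threshold:
            return best_unc
        return None  # both criteria below threshold: terminate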
3.7 Neighborhood Search
3.7.1 Introduction
Following Aleman et al. [79], we test the simulated annealing algorithm on the BOO
problem, as well as existing and new variants of a greedy neighborhood search heuristic
called the Add/Drop algorithm (see Kumar [80]) to obtain a good solution to the BOO
problem. In each step of the Add/Drop algorithm, a beam in the current beam set is
replaced by a neighboring beam that yields an improving solution. As with the simulated
annealing implementation, we also apply our new neighborhood to the Add/Drop
algorithm and compare its performance to a commonly used neighborhood structure.
3.7.2 Neighborhood Search Approaches
Neighborhood search approaches are common methods of obtaining solutions to
global optimization problems. For a vector of decision variables, a neighbor is obtained by
perturbing one or more of the decision variables. A neighborhood for a particular vector
of decision variables is the set of all its neighbors for a given method of perturbing the
decision variable vector. A solution is considered to be locally optimal if there are no
improving solutions in its neighborhood.
Both deterministic and stochastic neighborhood search algorithms have been applied
to a wide variety of optimization problems. A deterministic neighborhood search algorithm
is one in which the entire neighborhood, or a pre-defined subset of the neighborhood,
is enumerated in each iteration to find an improving solution. Stochastic versions of
neighborhood search approaches, for example, simulated annealing, randomly select
neighboring solutions in an attempt to find an improving solution in each iteration.
For the BOO problem, we consider two neighborhood search methods. The first is
a deterministic neighborhood search algorithm that finds a locally optimal solution, and
the second is the simulated annealing algorithm, which, although based on neighborhood
searches, provably converges to the globally optimal solution for certain neighborhood
structures.
3.7.3 A Deterministic Neighborhood Search Method for BOO
Deterministic neighborhood search methods are optimization algorithms that
start from a given solution and then iteratively select the best point in the current
neighborhood as the next iterate. The best point in the neighborhood can be found by
complete enumeration if the neighborhood is small, or by optimization if the neighborhood
is large or if objective function evaluations are expensive. Due to the complexity of
the BOO problem, even when only a subset of available orientations is considered, we
will focus on smaller neighborhoods and use enumeration. The neighborhood could
alternatively be searched heuristically, for example by searching the neighborhood until
the first improving solution is found, rather than the best improving solution. If no
improved solution can be found, the current solution is a local optimum.
In our implementation of the Add/Drop algorithm, a small neighborhood is desired
for enumeration purposes. In each iteration, a neighborhood for just a single beam is
considered. Say a beam set consisting of k beams is desired. Letting the neighborhood of a
single beam θh in θ be denoted as Nh(θ), the Add/Drop algorithm is as follows:
• Initialization:
1. Choose an initial starting solution θ(0).
2. Set θ∗ = θ(0) and i = 0.
• Iteration:
1. Select h ∈ {1, . . . , k}, then generate θ ∈ Nh(θ(i)).
2. If F (θ) < F (θ∗), set θ∗ = θ(i+1) = θ and set i← i + 1.
3. If all points in $\cup_{h=1}^{k} N_h(\theta^{(i)})$ have been sampled without improvement, stop with θ∗ as a local minimum. Otherwise, repeat Step 1.
3.7.3.1 Neighborhood Definition
In each step of the Add/Drop algorithm, a beam in the current solution is replaced
with an improving beam in its neighborhood. Rather than define a neighbor as related
to an entire beam vector, the neighborhoods of individual beams are considered. The
neighborhood of a single beam θh in θ is defined as
$N_h(\theta) = \left\{ (\theta_1, \ldots, \theta_{h-1},\, \theta \bmod 360,\, \theta_{h+1}, \ldots, \theta_k) \in B^k \;:\; \theta_h - \delta \le \theta \le \theta_h + \delta \right\}$
In other words, the neighborhood of a beam is all beams within ± δ degrees taking into
account the cyclic nature of the angles. The cyclicality of the angles refers to the fact
that all angles can be represented by degrees in [0, 360). For example, 400◦ = 40◦ and
−100◦ = 260◦. The expression θ mod 360 captures this cyclicality.
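A minimal Python sketch of this cyclic single-beam neighborhood (illustrative; the list-based beam representation and the grid parameter are assumptions):

    def single_beam_neighborhood(theta, h, delta, grid=1):
        """All solution vectors obtained by replacing beam h of theta with an
        angle within +/- delta degrees of theta[h], wrapping modulo 360."""
        neighbors = []
        for offset in range(-delta, delta + 1, grid):
            if offset == 0:
                continue  # the current beam is not considered its own neighbor
            angle = (theta[h] + offset) % 360
            neighbors.append(theta[:h] + [angle] + theta[h + 1:])
        return neighbors

    # Example with wrapping past 0 degrees: offset -10 from beam 0 becomes 350.
    print(single_beam_neighborhood([0, 120, 240], h=0, delta=10, grid=5))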
3.7.3.2 Neighbor Selection
The process of selecting a neighboring point in each iteration consists of two steps:
selecting the index h to change and then selecting an improving angle in Nh(θ) to replace
θh. If h is selected as i mod k + 1, the algorithm will cycle through each index sequentially,
similar to a Gibbs Sampler (see, for example, Geman and Geman [81] and Gelfand and
Smith [82]). The Gibbs Sampler also uses a similar two-step approach to generating a
new point by sequentially generating a new value for each variable in turn. If h is selected
randomly in each iteration, the resulting algorithm is similar to a Hit-and-Run method
(see, for example, Smith [83] and Belisle [84]), in which a variable to be changed is selected
randomly, and then a new value for that variable is also selected randomly within a
neighborhood.
Once h is selected, the new value for θh can be generated by enumeration or by a
heuristic method. The Add/Drop algorithm compares the quality of the new solution to
the current solution, and then only accepts improving solutions. This greedy approach
results in a locally optimal solution.
3.7.3.3 Implementation
The index of beam angle to be changed in each iteration, h in Step 1 of the algorithm
in Section 3.7.3, is chosen as h = i mod k + 1 to cycle through each index in a sequential
manner. In the Add/Drop implementation, once h is determined, θ in iteration i is
chosen as $\theta = \arg\min_{\theta \in N_h(\theta^{(i)})}\{F(\theta)\}$. By replacing each beam by the most improving
neighbor, the Add/Drop algorithm is a greedy heuristic which terminates when there is no
improving neighbor for any beam.
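The following Python sketch outlines this greedy loop (a sketch under assumed interfaces, not the dissertation's code: fmo_value evaluates F(θ) and neighborhood(theta, h) enumerates Nh(θ)):

    def add_drop(theta0, fmo_value, neighborhood, k, max_iters=1000):
        """Greedy Add/Drop sketch: cycle through the beam indices, replacing
        each beam with its best improving neighbor, and stop when no beam has
        an improving neighbor (a local optimum)."""
        best, best_val = list(theta0), fmo_value(theta0)
        stale = 0  # consecutive indices tried without improvement
        for i in range(max_iters):
            h = i % k  # sequential index selection (h = i mod k + 1 in 1-based terms)
            trial = min(neighborhood(best, h), key=fmo_value)
            trial_val = fmo_value(trial)
            if trial_val < best_val:
                best, best_val, stale = trial, trial_val, 0
            else:
                stale += 1
                if stale == k:  # no improving neighbor for any beam
                    break
        return best, best_val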
A multi-start aspect is added by repeating the algorithm with multiple initial starting
points. For example, one strategy to select starting points would be to select a random
starting point according to a particular distribution. Another strategy would be to select
an equi-spaced solution and rotate it a fixed number of times to obtain new starting points
until the initial equi-spaced solution is repeated. Equi-spaced beam solutions are common
in clinical practice for an odd number of beams. The reason that such a method is not
generally used in practice for even numbers of beams is that the resulting beam set would
contain parallel-opposed beams (beams that lie on the same line), which are not used by
convention as it is believed that the effect of a parallel-opposed beam is very similar to
simply doubling the radiation delivered from a beam. If an equi-spaced solution is not
possible given a beam set of k beams and the discretization level of the candidate beam
set B, then the solution can be rounded so that $\theta^{(0)}_h \in B$, h = 1, . . . , k.
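For instance, the rotated equi-spaced starting points described above could be generated as follows (a sketch; the grid spacing and rounding behavior are assumptions for illustration):

    def equispaced_starts(k, grid=5):
        """Generate multi-start initial solutions: an equi-spaced k-beam plan
        rotated by the grid spacing until the initial solution would repeat.
        Each angle is rounded onto the candidate grid."""
        spacing = 360.0 / k
        starts, shift = [], 0
        while shift < spacing:
            beams = [round((shift + b * spacing) / grid) * grid % 360 for b in range(k)]
            starts.append(beams)
            shift += grid
        return starts

    print(equispaced_starts(k=3, grid=10))  # [[0, 120, 240], [10, 130, 250], ...]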
3.7.4 Simulated Annealing
The simulated annealing algorithm used is similar to the classical simulated annealing
approach proposed in Kirkpatrick et al. [85]. The simulated annealing algorithm is based
on the Metropolis algorithm, wherein a neighboring solution to the current iterate is
generated, and if it is an improving point, it becomes the current iterate. Otherwise, it
becomes the current iterate with probability exp{∆F/T}, where ∆F is the difference
in FMO value between the current iterate and the newly generated point and T is the
temperature, a measure of the randomness of the algorithm. If T = 0, then only improving
points are selected. If T is very large, then any move is accepted, which is essentially a
random search.
The simulated annealing algorithm starts with an initial temperature T0 and performs
a number of iterations of the Metropolis algorithm using T = T0. Then, the temperature is
decreased according to some cooling schedule such that {Ti} → 0.
Obvious parallels can be drawn between the simulated annealing algorithm and the
Add/Drop neighborhood search method described in Section 3.7.3. While the Add/Drop
algorithm deterministically searches the neighborhood for improving solutions, the
simulated annealing algorithm randomly selects neighboring solutions. Rather than
being limited by the ability to only move to improving solutions, the simulated annealing
algorithm may still move to a non-improving solution with a certain probability, thus
allowing for the escape from local minima. The Add/Drop algorithm, on the other hand,
is a greedy algorithm that is specifically designed to find local minima.
The simulated annealing algorithm is essentially a randomization of the Add/Drop
algorithm. In addition to the added randomness, the possibility of changing more than
one beam in each iteration is allowed by selecting a set of indices H ⊆ {1, . . . , k} to
change, rather than just selecting a single index h. The simulated annealing algorithm is
as follows:
• Initialization:

1. Choose an initial beam set θ(0) and calculate its FMO objective function value F0.

2. Set θ∗ = θ(0), F∗ = F0, i = 0.

• Iteration:

1. Select H ⊆ {1, . . . , k}, generate θ̄ ∈ ∪h∈H Nh(θ(i)), and calculate its FMO objective function value F̄.

2. If F̄ < F∗, set F∗ = F̄, Fi+1 = F̄, θ(i+1) = θ̄ and θ∗ = θ̄. Otherwise, set Fi+1 = F̄ and θ(i+1) = θ̄ with probability exp{(Fi − F̄)/Ti}.

3. Set i ← i + 1 and repeat Step 1.
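A compact Python sketch of this loop, with geometric cooling and the Metropolis acceptance rule (illustrative only; sample_neighbor is a hypothetical stand-in for drawing from ∪h∈H Nh(θ)):

    import math
    import random

    def simulated_annealing(theta0, fmo_value, sample_neighbor, T0=75.0,
                            alpha=0.9, n_outer=100, n_metropolis=1):
        """Simulated annealing sketch with geometric cooling T <- alpha * T."""
        theta, f = list(theta0), fmo_value(theta0)
        best, best_val = theta, f
        T = T0
        for _ in range(n_outer):
            for _ in range(n_metropolis):
                cand = sample_neighbor(theta)
                f_cand = fmo_value(cand)
                # Accept improving moves; accept non-improving moves with
                # probability exp((f - f_cand)/T), which is < 1 since f_cand > f.
                if f_cand < f or (T > 0 and random.random() < math.exp((f - f_cand) / T)):
                    theta, f = cand, f_cand
                if f < best_val:
                    best, best_val = theta, f
            T *= alpha  # cooling schedule T_{i+1} = alpha * T_i
        return best, best_val

With T0 = 0 this reduces to accepting only improving moves, as noted in the results below.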
The simulated annealing algorithm has been previously applied to the BOO problem.
Bortfeld and Schlegel [35] use the “fast” simulated annealing algorithm described by Szu
and Hartley [86] which employs a Cauchy distribution in generating neighboring points.
Stein et al. [40], Rowbottom [39] and Djajaputra et al. [36] also use a Cauchy distribution
in generating neighboring solutions. Lu et al. [37] randomly select new points satisfying
BEV and conventional wisdom criteria and Pugachev and Xing [38] randomly generate
new points and then vary them according to an exponential distribution. All accept
improving solutions, and with the exception of Rowbottom et al. [39] who only accept
improving solutions (essentially Ti = 0 for all i), all accept non-improving solutions with a
Boltzmann probability. None of the previous BOO studies employing simulated annealing
use the exact FMO as a measure of the quality of a beam set.
3.7.4.1 Neighborhood Definition
Two neighborhood structures are explored. The first neighborhood is similar to that
described in Section 3.7.3.1 in that a neighborhood Nh(θ) is considered for only a single
beam index h ∈ {1, . . . , k}, just as in the Add/Drop method.
As an extension to changing a single angle in each iteration, we also consider
a neighborhood that involves changing all beams in each iteration, corresponding to
H = {1, . . . , k} in Step 1 of the simulated annealing algorithm in Section 3.7.4. This
neighborhood is defined as $N(\theta) = \cup_{h=1}^{k} N_h(\theta)$. Again, the neighborhoods for the
individual beams are defined as in the first method, with bounds of ± δ degrees.
3.7.4.2 Neighbor Selection
The method of selecting a neighbor depends on the neighborhood structure as
described in Section 3.7.4.1. In the first method where only one beam is changed at a
time, a neighbor is selected using the randomized approach described in Section 3.7.3.2.
Once h is selected, the probability of selecting a particular solution in Nh(θ) where the
new θ is d degrees from θh is P{D = d}, where D is a random variable with some probability distribution defined on the set {−δ, −δ + 1, . . . , δ}.
For the neighborhood N (θ) where all beams are changed in an iteration, the new
value for each beam h ∈ {1, . . . , k} is generated from Nh(θ) in the same manner described
above.
3.7.4.3 Implementation
In addition to basing our algorithms on the exact FMO solution rather than on
heuristics or scoring measures, our simulated annealing approach differs from the previous
studies in the distribution used to generate neighbors, the definition of the neighborhood,
the cooling schedule and the number of iterations/restarts used. Not only do we use a
new neighborhood structure, but also a geometric probability distribution rather than a
uniform or Cauchy distribution on the neighborhood. The geometric distribution is similar
in shape to the Cauchy distribution in that they both can have fat tails depending on
the choice of probability parameters. The fat tails of these distributions allow for points
far away from the current solution to be selected as successive iterates, which potentially
increases the likelihood of finding a globally optimal solution. The geometric distribution
has the added attractiveness of producing discrete values, which suits the discretized candidate beam set used in the BOO problem.
By using the cooling schedule Ti+1 = αTi with α < 1, the sequence of temperatures
{Ti} converges to zero as the number of iterations increases. In our approach, the
neighborhood of a beam for both the Nh(θ) and N (θ) neighborhoods is defined using
δ = 180, that is, Nh(θ) = B. By defining the neighborhood of each beam to be the entire
single-beam solution space, the simulated annealing algorithm converges to the global
optimum when using the neighborhood N (θ) defined in Section 3.7.4.1. Though Nh(θ)
is large, each beam in Nh(θ) is assigned a probability so that only the beams closest to
θh have a significant probability of being selected. Figure 3-7A shows the probability of
replacing θh with beams at varying distances using probability p = 0.25 for the geometric
distribution. Note that the current beam cannot be selected as a replacement.
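For example, the signed grid offset of the replacement beam can be drawn as follows (a sketch of geometric sampling on the discretized circle; the truncation behavior is an assumption):

    import random

    def sample_geometric_offset(p=0.25, max_dist=36):
        """Draw a signed grid offset for the replacement beam: the magnitude
        follows a geometric distribution (support 1, 2, ...), truncated at the
        half-width of the candidate circle (e.g. 180/5 = 36 steps on a 5-degree
        grid), with a uniform sign, so nearby beams are most likely and the
        current beam (offset 0) is excluded."""
        while True:
            d = 1
            while random.random() >= p:  # failures before the first success
                d += 1
            if d <= max_dist:
                break
        return d if random.random() < 0.5 else -d

With p = 0.25 an offset of one grid step has probability 0.25, two steps 0.1875, and so on, so distant beams remain possible but unlikely.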
As with the Add/Drop method, a multi-start aspect is added to the simulated
annealing algorithm by repeating the algorithm using several different starting points.
3.7.4.4 Convergence
Unlike many previously proposed simulated annealing algorithms, our algorithm
converges to the globally optimal solution to the BOO problem under mild conditions.
The following theorem summarizes these conditions.
Theorem 3.7.1. Suppose that
• H = {1, . . . , k}
• limi→∞ Ti = 0
• δ = 180
• There is a positive probability of generating any solution in the neighborhood.
Then our simulated annealing algorithm converges to the global optimum solution in the sense that

$\lim_{i \to \infty} F_i = F^* \text{ in probability}$

where F∗ is the global optimum value of the BOO problem.
Proof. This follows from Theorem 1 in Belisle et al. [87].
3.7.5 A New Neighborhood Structure
For the BOO problem, the neighborhood structure that is typically used for a vector
of beam orientations is simply the collection of beam vectors obtained from changing one
or more of the beams to a neighboring beam, where each beam has its own neighborhood
Nh(θ).
In addition to Nh(θ), we consider a new neighborhood which we call a “flip” neighborhood. The flip neighborhood of a beam index h consists of Nh(θ) plus a neighborhood around the parallel opposed beam of θh, that is, the beam 180◦ away:

$\theta_{h'} = (\theta_h + 180) \bmod 360$

The flip neighborhood can be defined as

$N^F_h(\theta) = \left\{ (\theta_1, \ldots, \theta_{h-1},\, \theta \bmod 360,\, \theta_{h+1}, \ldots, \theta_k) \in B^k \;:\; \theta \in [\theta_h - \delta,\, \theta_h + \delta] \cup [\theta_h + 180 - \delta^F,\, \theta_h + 180 + \delta^F] \right\}$

Note that the values δ and δF may be different. Figure 3-6 depicts a flip neighborhood for a beam located at 0◦, the center of the top shaded wedge representing Nh(θ), where θh = 0.
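A small membership test for this neighborhood (illustrative Python, with the cyclic distance handled explicitly):

    def in_flip_neighborhood(angle, theta_h, delta, delta_f):
        """True if angle lies within +/- delta of the current beam theta_h, or
        within +/- delta_f of its parallel opposed beam, on the 360-degree circle."""
        def cyclic_dist(a, b):
            d = (a - b) % 360
            return min(d, 360 - d)
        opposed = (theta_h + 180) % 360
        return cyclic_dist(angle, theta_h) <= delta or cyclic_dist(angle, opposed) <= delta_f

    # A beam at 0 degrees with delta = 20, delta_f = 10:
    assert in_flip_neighborhood(350, 0, 20, 10)      # near the beam itself
    assert in_flip_neighborhood(185, 0, 20, 10)      # near the opposed beam at 180
    assert not in_flip_neighborhood(90, 0, 20, 10)   # in neither wedge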
The motivation for the flip neighborhoods arises from the observation that many
of the 3-beam simulated annealing plans generated using the regular neighborhood
contained two beams very close to two beams in the optimal solution (obtained by explicit
enumeration), while the third beam was very close to the parallel opposed beam of
the third beam in the optimal solution. Given this observation, it is intuitive that the
inclusion of the neighborhood around the parallel beam should provide improved solutions.
The neighborhoods Nh(θ) and $N^F_h(\theta)$ with varying δF values are applied to both
the Add/Drop and the simulated annealing frameworks. For the geometric probability
distribution used in the simulated annealing method, Figure 3-7B shows the probability of
selecting beams at different distances using a flip neighborhood with probability p = 0.25.
Note that the current beam cannot be selected as its own neighbor.
Figure 3-6. Nh(θ) (top shaded area) and $N^F_h(\theta)$ (top and bottom shaded areas) for θh = 0.
Figure 3-7. Selection probabilities under the geometric distribution with p = 0.25 (selection probability versus distance from the current beam). A) Nh(θ). B) $N^F_h(\theta)$.
3.8 Results
In addition to judging the BOO algorithms according to their computational time, the
plans must also be evaluated for clinical viability. All criteria used are those employed at
the Davis Cancer Center at Shands Hospital at the University of Florida.
3.8.1 Evaluating Plan Quality
In order to formulate an optimization problem, a quantitative measure of the
treatment plan quality is needed. This measure, the FMO function value, needs to
appropriately make the trade-off between the contradictory goals of covering targets and
sparing critical structures.
Typically, a good plan ensures that at least a certain percent of each target receives
the prescription dose. A coldspot occurs where less than a certain percent of the target
receives the prescription dose. Similarly, a hotspot occurs if a significant percentage of the
target receives more than the prescription dose.
3.8.1.1 Target coverage
Each of the plans contains two target structures, or planning tumor volumes (PTV):
one is the tumor mass observed from imaging scans, which we will call PTV2, and the
other is PTV2 plus some margin specified by the physician, which we will call PTV1.
The PTV1 structure is used by physicians in case there are elements of the tumor mass
that cannot be seen from the imaging scans. The dose prescribed for PTV1 is less than
the dose prescribed for PTV2.
For target structures, we require that at least 95% of the target receives the full
prescription dose, so the dose that is received by at least 95% of each of the targets is
measured. We want to restrict the amount of the target that receives more than the
prescription dose. Because PTV2 is contained inside PTV1 and has a higher prescription dose, PTV1 will necessarily have a sizable, but less important, area receiving more than its own prescription dose. Thus, we are only concerned with
PTV2 overdose. To evaluate the size of the hotspot, we check the percent volume of PTV2
that receives more than 110% of the prescription dose. To evaluate the coldspots, we check
the percent volume of both PTV1 and PTV2 that receives at least 93% of the prescription dose. The prescription doses are set to 54 Gy for PTV1 and 73.8 Gy for PTV2, which are the dose values used at Shands Hospital at the University of Florida.

Table 3-1. Sparing criteria vary for each critical structure

Structure              Percent (%)   ≤ Dose (Gy)
brain stem             100           55
eyes                   50            30
mandible               100           70
optic chiasm           100           55
optic nerves           100           50
parotid glands         50            30
skin                   100           60
spinal cord            100           45
submandibular glands   50            30
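These coverage checks amount to simple order statistics on the per-voxel target doses; a minimal numpy sketch with illustrative data (function and variable names are hypothetical):

    import numpy as np

    def coverage_metrics(dose, rx):
        """Target coverage metrics from a 1-D array of per-voxel target doses:
        the dose received by at least 95% of the voxels, the hotspot volume
        (> 110% of the prescription rx) and the volume at or above 93% of rx."""
        dose = np.asarray(dose, dtype=float)
        d95 = np.percentile(dose, 5)              # 95% of voxels receive at least this
        hot = 100.0 * np.mean(dose > 1.10 * rx)   # % of voxels over 110% of rx
        cov = 100.0 * np.mean(dose >= 0.93 * rx)  # % of voxels at or above 93% of rx
        return d95, hot, cov

    d95, hot, cov = coverage_metrics(np.random.normal(73.8, 2.0, 10000), rx=73.8)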
3.8.1.2 Critical structure sparing
The critical structures involved in each case vary, depending on their proximity to
the tumor. The critical structures can be classified into two general groups according
to their ability to survive radiation dose. Parallel structures, e.g., saliva glands, will
continue to function as long as a certain percentage of the organ receives less than a certain
amount of dose. Serial structures, on the other hand, will cease to function if any of the
organ receives over a certain amount of dose. The spinal cord is one example of a serial
structure—if it receives too much dose, the effect is equivalent to cutting it in half, leaving
the patient paralyzed. The sparing criteria for each of the common critical structures in
head-and-neck cases are listed in Table 4-2. The critical structures involved in each case
vary, depending on their proximity to the tumor.
There are four saliva glands: one submandibular and one parotid gland on each of
the right and left sides. The saliva glands are of particular importance because their loss
can greatly decrease the patient’s quality of life, but because of their location relative
to the usual tumor positions, they can be difficult to spare. Studies show that a patient
can lead a relatively normal life with three of the four glands spared. The loss of other
organs, especially the spinal cord, will also greatly affect the patient’s quality of life, but
head-and-neck tumors are usually situated in such a way that other organs can be easily
spared in the FMO optimization. Thus, the results presented place particular emphasis on
the sparing of saliva glands.
Rather than relying strictly on FMO value, a tool commonly used by physicians to
judge the quality of a treatment plan is the dose-volume histogram (DVH). This histogram
is a measure of the cumulative dose received by a given structure. It specifies the fraction
of each structure’s volume that receives at least a certain amount of dose. Although there
are several critical structures to be considered in head-and-neck cases, the saliva glands are
notoriously the most difficult to spare due to their proximity to common tumor locations.
Thus, for clarity, the DVH results provided include only target structures and saliva
glands. Each of the treatment plans spares all organs not shown in the DVHs.
In the DVH results provided, vertical lines indicate target prescription doses, and
asterisks mark the sparing criteria for the saliva glands.
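A DVH is straightforward to compute from per-voxel doses; the following sketch (illustrative Python with randomly generated doses, not clinical data) also evaluates the saliva gland sparing criterion used here:

    import numpy as np

    def dvh(dose, max_dose=90.0, step=0.1):
        """Cumulative dose-volume histogram: for each dose level d, the
        fraction of the structure's voxels receiving at least d Gy."""
        dose = np.asarray(dose, dtype=float)
        levels = np.arange(0.0, max_dose + step, step)
        volume = np.asarray([(dose >= d).mean() for d in levels])
        return levels, volume

    # Sparing check for a saliva gland: at most 50% of the gland above 30 Gy.
    gland_dose = np.random.gamma(shape=4.0, scale=6.0, size=5000)
    levels, volume = dvh(gland_dose)
    spared = volume[np.searchsorted(levels, 30.0)] <= 0.5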
3.8.2 Response Surface Method Results
The response surface method was tested on six head-and-neck cases using a Windows
XP computer with a 3.2 GHz Pentium IV processor and 2 GB of RAM. The sizes of the
test cases for plans with three beams are shown in Table 3-2. Each algorithm was allowed
to run for 12 hours, which is not an unreasonable run length because BOO will not be
performed on a day-to-day basis. It is anticipated that BOO will be performed once
overnight between the time the patient is imaged and the time the patient begins radiation
therapy. A good beam vector chosen before treatment begins should continue to provide
quality treatment plans throughout the patient’s treatment, which is typically 35 days.
The beam orientations from which linear accelerators are capable of delivering
radiation are not restricted to integer degrees. In this study, integral beam
orientations are desired to account for setup tolerances. For the same reasons, beam
orientations are considered on a 10◦ grid. To obtain integral solutions, in the subproblem
of maximizing I(θ), the integer constraint is relaxed in the problem of determining an upper bound on s2(θ), and the resulting solution is rounded to integer values.

Table 3-2. Sizes of test cases.

Case   # bixels   # voxels
1      514        345,629
2      546        352,284
3      613        347,233
4      549        268,823
5      423        271,156
6      585        389,565
Avg.   538        329,115
Low    423        268,823
High   613        389,565
The branching scheme used treats the rounded solution as integral and branches so
as to avoid overlapping subregions as described in Section 3.6.2.3. Results are provided
for each possible initial region scheme. The point at which branching is performed in each
region, θb in Section 3.6.2.3, is chosen as the midpoint of the region. Also, ri and θh in the
underestimating terms in Problem s2-UB in Section 3.6.2.2 are taken to be the midpoints
of their respective intervals.
It is anticipated that the weighted distance measure in equation 3–3 will have
a significant impact on the algorithm's performance. Intuitively, a small weighted distance corresponds to a large correlation between points, which will cause the algorithm
to behave locally. In order to induce the algorithm to behave globally, the algorithm
must assume less correlation between two points. If the points are less correlated, the
algorithm will be less likely to stay in the neighborhood of previously sampled points.
The correlation between two points can be decreased by increasing the weighted distance
between the points, which can be done by increasing c or p. If c becomes sufficiently large,
the correlation between points will be effectively zero, thus yielding an effectively random
search algorithm. To test these expectations, c was tested with values of 10.0, 100.0 and
500.0. In each test, five randomly selected starting points were used to initialize the RS
algorithm.
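Although equation 3–3 is not reproduced here, the interplay between c, p and the correlation can be illustrated with the common kriging-style form exp(−Σ c|θi − θ'i|^p); the following sketch is written under that assumption and is not the dissertation's exact measure:

    import math

    def correlation(theta1, theta2, c, p):
        """Correlation between two sampled points under an assumed
        kriging-style weighted distance sum_i c*|t1_i - t2_i|^p."""
        weighted_dist = sum(c * abs(a - b) ** p for a, b in zip(theta1, theta2))
        return math.exp(-weighted_dist)

    # Increasing c drives the correlation toward zero, so previously sampled
    # points exert less pull and the search behaves more globally
    # (angles are normalized to [0, 1] here purely for illustration):
    for c in (10.0, 100.0, 500.0):
        print(c, correlation([0.10, 0.20], [0.15, 0.25], c=c, p=2.0))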
To evaluate the algorithm's performance across all of the tested cases, we compare the relative improvements in FMO value over a 5-beam equispaced plan (denoted 5 equi), a 7-beam equispaced plan (denoted 7 equi) and a locally optimal 3-beam coplanar plan (denoted 3 A/D) obtained using the Add/Drop local search heuristic introduced by Kumar [80].
3.8.2.1 Proof of concept
To test the accuracy of the RS method, a single case was tested wherein the problem
of adding a single coplanar beam to an equi-spaced, coplanar 3-beam solution over a 1◦
grid was considered. The algorithm was initialized with two randomly selected starting
points. By considering such a small scale problem, the solution space in each iteration can
be explicitly enumerated in order to exactly obtain the next best point to sample. The
ability to enumerate the solution space will also allow us to determine how accurately the
RS method models the FMO objective function.
At each point that has been sampled, both the uncertainty and the expected
improvement will be zero. This result is not only theoretically true, but also intuitive
because once the FMO value at a certain point is known, there will be no improvement
over the current best FMO value by sampling that point again. It is also expected that as
the algorithm progresses, the approximation of the FMO function will become increasingly
accurate, with the approximation obtaining the exact FMO values at sampled points.
Figures 3-8A-3-8D demonstrate how the RS method behaved as predicted at different
points in the RS algorithm. The expected improvement is zero at sampled points and the
approximation of the FMO function almost perfectly fits the true FMO function by
the time the algorithm terminates.
The importance of the starting points, the points sampled before the algorithm begins
to give the method some baseline information about the FMO function, was also tested.
Figure 3-8. Proof of concept results at various stages of the RS algorithm (true FMO value, FMO approximation F(θ) and expected improvement I(θ) plotted over θ). A) After two points. B) After 20 points. C) After 80 points. D) After 148 points, when the algorithm terminates.
The RS method was run with 100 randomly generated sets of starting points, and the RS
method obtained the global optimum in 90.6% of trials, indicating that the performance of
the algorithm is not significantly dependent on the starting points.
3.8.2.2 Adding a non-coplanar beam to a coplanar solution
Next, the problem of adding a non-coplanar beam to a 3-beam locally optimal
coplanar solution was considered. The locally optimal solution is obtained using the
Add/Drop algorithm. The beam data for the non-coplanar beam being optimized is
generated on-the-fly, and consists of gantry and couch rotations, where both the gantry
and couch are allowed to rotate a full 360◦ on a 10◦ grid. As the final solution of the
non-coplanar RS plan will be a 4-beam plan, the results from the response surface solution
are compared to the locally optimal coplanar 4-beam Add/Drop plan, denoted 4 A/D.
The plans will also be compared to an equi-spaced, coplanar 7-beam plan, denoted 7 equi,
which is commonly used in practice to treat head-and-neck cancers.
There is relatively little deviation in the final solutions between the different
parameter choices and initial region schemes, as shown by Table 3-3. The results also
indicate that the starting points chosen do not significantly affect the outcome of the
algorithm. This implies that the response surface algorithm is robust with respect to
varying implementations.
Although the 4 RS solutions had FMO values an average of 5.44% worse than those of the 7 equi plans, the 4 RS solutions did in fact obtain an average of 16.12% improvement in FMO value over the 4 A/D solutions. Despite the differences in FMO
value, all treatment plans examined were similar in clinical quality, as discussed in Section
3.8.2.3.
Although the algorithm was allowed to run for 12 hours in each scenario, the
minimum FMO value obtained by the RS method was found early on. On average,
the best FMO value found was obtained in 6.15 hours after sampling 27-40 points.
For each of the RS method variations tested, both the number of points sampled
and the relative improvements in FMO value are nearly identical. This indicates that the
algorithm is robust with respect to parameter and implementation changes. The time
spent generating beam data comprises approximately 84% of the algorithm’s run time,
while the response surface portion on average accounts for only 13%. Thus, it is expected
that changes to the RS method, including improvements to the branch-and-bound routine,
will not have a very strong impact on the number of points the algorithm will sample in
its allotted run time.
3.8.2.3 Clinical results
The target coverage achieved by the different treatment plans is displayed in Table 3-4. On average, the 7 equi plan was able to deliver the most dose to PTV2, but the 4 RS plan is very close. Both of the 4-beam plans obtain smaller hotspots and better PTV1 target coverage than the 7 equi plan. The 4 A/D plan on average underdoses PTV2, which could lead to recurrence of the cancer. This underdosage could also account for the smaller hotspot in the 4 A/D plans.

Table 3-3. Minimum FMO value obtained and time required to obtain it.

       Min. FMO value       Time (hrs)
Case   Avg.      St. Dev.   Avg.   St. Dev.
1      565.24    8.82       5.35   5.07
2      570.51    12.83      7.49   3.78
3      927.34    20.60      7.05   2.21
4      710.92    7.72       6.54   3.39
5      512.22    20.04      6.96   3.33
6      799.95    34.07      3.48   3.40

Table 3-4. Target coverage achieved by the treatment plans

                                 4 RS       4 A/D      7 equi
PTV2 dose at 95% volume          73.16 Gy   72.56 Gy   73.81 Gy
PTV2 % receiving > 110% of Rx    23.18 %    15.07 %    31.63 %
PTV2 % receiving > 93% of Rx     98.87 %    98.67 %    99.57 %
PTV1 dose at 95% volume          54.71 Gy   54.41 Gy   55.09 Gy
PTV1 % receiving > 93% of Rx     97.95 %    98.01 %    97.46 %
Figures 3-9 and 3-10 illustrate two representative cases where the 4 RS, 4 A/D and
7 equi plans each have clinically acceptable target coverage. The vertical line at 73.8 Gy
indicates the prescription dose for PTV2.
The ability of each of the treatment plans to spare the organs in the cases tested is
shown in Table 3-5. Surprisingly, both the 4 RS and the 4 A/D plans are equivalent to
or outperform the 7 equi plan in terms of organ sparing. In the 4-beam plans, the left
submandibular gland is spared in 83% of the treatment plans developed, whereas it is only
spared in 67% of the 7 equi plans. One case illustrating equivalent organ sparing is shown in Figure 3-9, and one case demonstrating improved organ sparing over the 7 equi plan is shown in Figure 3-10. Just as PTV2 underdosage in the 4 A/D plans likely contributed to
the smaller hotspots, it is possible that the improved organ sparing in the 4 A/D plans is
also a result of the underdosage.
Table 3-5. Percentage of plans in which an organ is spared.

Structure             4 RS   4 A/D   7 equi
brain stem            100%   100%    100%
mandible              100%   100%    100%
left optic nerve      100%   100%    100%
right optic nerve     100%   100%    100%
left eye              100%   100%    100%
right eye             100%   100%    100%
optic chiasm          100%   100%    100%
left parotid gland    100%   100%    100%
right parotid gland   67%    67%     67%
left SMB gland        83%    83%     67%
right SMB gland       50%    50%     50%
spinal cord           100%   100%    100%
skin                  100%   100%    100%
Figure 3-9. Case001: 7-beam equi-spaced (dotted), 4-beam Add/Drop (dashed) and 4-beam RS non-coplanar (solid) target and select saliva gland DVHs. A) Target coverage is nearly identical. B) The tumor surrounds the right submandibular gland, so the FMO solver recognizes that it cannot be spared and allows it to receive as much dose as necessary to ensure good target coverage in all plans. All other saliva glands are spared in all plans.
Figure 3-10. Case005: 7-beam equi-spaced (dotted), 4-beam Add/Drop (dashed) and 4-beam RS non-coplanar (solid) target and select saliva gland DVHs. A) Target coverage is nearly identical. B) The left submandibular gland is spared by the two 4-beam plans, but not by the 7-beam plan. All other saliva glands are spared in all plans.
3.8.3 Neighborhood Search Method Results
The simulated annealing method was tested on six head-and-neck cases using a
Windows XP computer with a 2.13 GHz Pentium M processor and 2 GB of RAM. On
average, ≈ 340 FMOs were calculated in the 30-minute run time allowed for the simulated
annealing and Add/Drop algorithms. Beams were selected on a 5-degree grid, yielding 72
candidate coplanar beams.
The simulated annealing and Add/Drop algorithms were used to obtain 4-beam
coplanar plans using regular and flip neighborhoods. In order to compare the quality of
the treatment across different plans, the plans are compared in terms of the percentage
improvement of each plan's FMO value over the FMO value of the locally
optimal 3-beam plan obtained from the Add/Drop local search heuristic described
by Kumar [80]. The Add/Drop plans are denoted 3 A/D and 4 A/D for the 3- and
4-beam plans, respectively. The 4-beam plans generated by the simulated annealing and
Add/Drop algorithms are compared to the typical 5- and 7-beam equi-spaced plans,
denoted 5 equi and 7 equi, respectively. The simulated annealing plans are denoted by the
implementation numbers, which refer to the parameters used, given in Table 3-6.
Figure 3-11 demonstrates the improved convergence times possible using the flip
neighborhood.
3.8.3.1 Add/Drop algorithm results
The Add/Drop algorithm was allowed to run for 30 minutes to generate a 4-beam
plan. The Nh(θ) neighborhood with δ = 20 and the $N^F_h(\theta)$ neighborhoods with δF = 0 and δF = 20 are tested for the Add/Drop algorithm. The value δ = 20 is chosen
to approximate the neighborhood size that is expected from the simulated annealing
implementation using a large flip neighborhood, where δF = 180. More details on the
simulated annealing implementations are provided in Section 3.8.3.2.
Using Nh(θ), the 4-beam Add/Drop solution is nearly identical to the
7-beam equi-spaced plan, while the flip neighborhoods allow the Add/Drop algorithm
to find 4-beam solutions that exceed the quality of the 7-beam plans. Figure 3-12
demonstrates the quality of the solutions, while Figure 3-11A illustrates that the flip
neighborhoods provide faster FMO convergence than that of Nh(θ).
3.8.3.2 Simulated Annealing results
Several parameter sets were tested for the simulated annealing algorithm. For
simplicity, each of the parameter sets and methods of generating a neighboring solution
are numbered according to Table 3-6. Each implementation contains a total of 500
iterations, i.e., 500 sampled points, thus yielding a fair comparison between the parameters.
To ensure clinical practicality, the algorithm was allowed to run for a maximum of 30
minutes or 500 iterations, whichever came first.
For the cooling schedule, we update the temperature according to an exponential
cooling schedule, Ti+1 = αTi, where α < 1. Due to the random nature of the algorithm,
the algorithm is restarted five times, each time with a different initial starting point. The
first initial starting point is an equi-spaced solution, and each subsequent starting point is
the previous initial solution rotated by d degrees, where candidate angles are considered on a d-degree grid, that is, every dth angle is considered. The number of simulated annealing and Metropolis iterations is chosen such that the total number of iterations is 500.

Figure 3-11. Comparison of FMO convergence (minimum FMO value versus run time in minutes for the regular and flip neighborhoods). A) Add/Drop. B) Simulated annealing.
The initial temperature values tested are T0 = 0 and T0 = 75. T0 = 0 results in
the acceptance of only improving solutions, while the initial temperature value 75 was
selected as the value that would approximately yield a 50 percent probability of selecting a
non-improving solution for the initial iterations of the algorithm.
For both the Nh(θ) and $N^F_h(\theta)$ neighborhoods, δ = δF = 180 is used so that the
entire solution space is considered as a neighborhood. As shown in Figure 3-7A, the
probability of selecting a beam 20◦ away using the Nh(θ) neighborhood with geometric
distribution with p = 0.25 is only 0.39% on a 5◦ grid. We consider this sufficiently small to
not consider neighborhoods larger than δ = 20 for Nh(θ) and δF = 20 for $N^F_h(\theta)$ in the Add/Drop algorithm. Just as in the Add/Drop implementation, the neighborhood $N^F_h(\theta)$
with δF = 0 is also considered.
Table 3-6. Definitions of implementations.

Number   n     m    N     α      T0
1        100   1    1     0.9    0
2        10    10   1     0.9    0
3        100   1    1     0.99   0
4        10    10   1     0.99   0
5        100   1    1     0.9    75
6        10    10   1     0.9    75
7        100   1    1     0.99   75
8        10    10   1     0.99   75
9        100   1    all   0.9    0
10       10    10   all   0.9    0
11       100   1    all   0.99   0
12       10    10   all   0.99   0
13       100   1    all   0.9    75
14       10    10   all   0.9    75
15       100   1    all   0.99   75
16       10    10   all   0.99   75
Figure 3-11B shows that the flip neighborhoods converge in FMO value significantly
faster than does the Nh(θ) neighborhood, while Figure 3-13 shows that the flip neighborhoods
provide comparable solution quality to both the non-flip simulated annealing and 7-beam
equi-spaced solutions.
3.8.3.3 Clinical results
Because there is no fundamental way of quantifying a treatment plan, a tool
commonly used by physicians to judge the quality of a treatment plan is the dose-volume
histogram (DVH). A DVH is a graphical measure of the cumulative dose received by a
given structure. It specifies the percentage of each structure’s volume that receives at least
a certain amount of dose, thus providing an intuitive means of assessing the quality of a
treatment plan.
The tested plans each contain two target structures. The gross tumor volume
(GTV) is the tumor mass observed from imaging scans. The clinical tumor volume (CTV)
is the GTV plus some margin specified by the physician. The CTV is used by physicians
in case there are elements of the tumor mass that cannot be seen from the imaging scans,
and the dose prescribed for the CTV is less than the dose prescribed for the GTV.
DVHs for a representative case comparing the 7-beam equi-spaced plan with the
simulated annealing plans obtained using a regular neighborhood and flip neighborhoods
with δF = 0 and δF = 180 are shown in Figure 3-13. Comparison of the 7-beam
equi-spaced plan and the Add/Drop plans using a regular neighborhood and flip
neighborhoods with δF = 0 and δF = 20 are shown in Figure 3-12. The sparing criteria
used for the saliva glands, no more than 50% of the gland receiving more than 30 Gy, is marked by
the star in Figures 3-13 and 3-12. The prescription dose for the GTV is 73.8 Gy, which
is marked by the vertical line in Figures 3-13 and 3-12. As previously stated, for target
structures, we require that at least 95% of the target receives the full prescription dose.
Figure 3-13 reveals that the 7-beam equi-spaced plan actually overdoses the target
and has a larger hotspot than the 4-beam simulated annealing plans. The 7-beam
equi-spaced plan only spares three of the four saliva glands, whereas the 4-beam simulated
annealing plans spare three or more saliva glands. The simulated annealing plans obtained
using the flip neighborhoods spare all four saliva glands, while the plan obtained using the
Nh(θ) neighborhood only spares three saliva glands, indicating that the flip neighborhoods
do in fact find superior solutions in terms of clinical quality.
Figure 3-12 shows that the 4-beam Add/Drop plans obtain nearly identical solutions
when compared to the 7-beam equi-spaced DVHs. The flip neighborhoods perform
clinically comparably to the regular neighborhood plans, and all of the Add/Drop plans
are comparable to the 7-beam equi-spaced plan in terms of saliva gland sparing and target
coverage.
Figure 3-12. Comparison of Add/Drop and 7-beam equi-spaced plans (target and saliva gland DVHs). A) The Add/Drop plans achieve nearly identical target coverage when compared to the 7-beam equi-spaced plan. B) The saliva gland sparing in the Add/Drop plans and the 7-beam equi-spaced plan is clinically equivalent.

Figure 3-13. Comparison of simulated annealing and 7-beam equi-spaced plans (target and saliva gland DVHs). A) Unlike the 7-beam equi-spaced plan, the 4-beam simulated annealing plans do not overdose the target. B) The simulated annealing plans are also capable of sparing more saliva glands than the 7-beam equi-spaced plan.

3.9 Conclusions and Future Directions

3.9.1 Response Surface Conclusions

We have shown that for head-and-neck cases, quality plans with fewer beams than a standard treatment plan can be obtained if BOO is applied. The response surface
algorithm operates in a clinically reasonable time frame, and is generally successful in
selecting non-coplanar beam orientations to improve the FMO value over that of locally
optimal coplanar solutions. The FMO value of the 4-beam response surface plans was
also only slightly larger than that of the 7-beam equi-spaced coplanar treatment plans,
indicating comparable treatment plans despite the decrease in the number of beams used.
In terms of clinical results, the most significant benefit of the non-coplanar solutions
over the locally optimal coplanar solutions was the ability to deliver a higher amount
of dose to the target structures. Both the non-coplanar and locally optimal coplanar
solutions were able to obtain treatment plans with organ sparing that is comparable to or
improved upon the 7-beam equi-spaced coplanar treatment plans.
While the inclusion of non-coplanar orientations in BOO is useful in terms of FMO
value and target coverage, the resulting improvements in the treatment plan may not
always be clinically significant. With better parameter tuning or a better neighborhood structure,
it is possible that the Add/Drop algorithm can obtain coplanar treatment plans with more
desirable target coverage, thus making the response surface plans and the Add/Drop plans
clinically equivalent. This suggests that the inclusion of non-coplanar beam orientations
does not significantly improve the quality of a treatment plan. Although most BOO
research is restricted to coplanar orientations, there has not yet been a study assessing the
solution quality of coplanar versus non-coplanar solutions. With this study as evidence,
both researchers and practitioners now have a basis for restricting the solution space to the
smaller, more tractable set of coplanar beams for head-and-neck beam optimization.
The patient cases in this work were all head-and-neck cases. Different tumor sites,
e.g., breast, lung and prostate, could also benefit from BOO, and perhaps may experience
greater improvements in treatment plan quality. In future work, these sites will be tested
to assess the general clinical usefulness of non-coplanar orientations and the response
surface method.
3.9.2 Neighborhood Search Conclusions
We have shown that for head-and-neck cases, quality plans with fewer beams than
a standard treatment plan can be obtained if BOO is applied. The simulated annealing
and Add/Drop algorithms both regularly obtained quality treatment plans with as few
as four beams in only 30 minutes. The use of the flip neighborhood improves the rate of
FMO convergence in both algorithms, and even has the ability to improve organ sparing
as shown in the simulated annealing results. The simulated annealing and Add/Drop
algorithms performed comparably to each other, with neither algorithm indicating a
significant benefit over the other.
It is possible to incorporate flip neighborhoods into other BOO algorithms that rely
on neighborhood searches to yield improved treatment plans in clinically acceptable time
frames.
CHAPTER 4
FRACTIONATION
4.1 Introduction
Typically, head-and-neck treatment plans each contain two target structures, or
planning tumor volumes (PTV): PTV1 and PTV2. Let PTV1 be the tumor mass observed
from imaging scans, and let PTV2 be PTV1 plus some margin specified by the physician.
Rather than deliver an entire treatment plan in one session, a treatment plan is
divided into several sessions, called fractions. This is done to take advantage of the fact
that normal, healthy cells recover faster from the radiation than cancerous cells. To
obtain the treatment plans for the fractions, in practice, a single FMO treatment plan is
developed and then divided into the desired number of fractions, usually around 35. This
division of a treatment plan is a non-trivial task, as the target voxels must receive 1.8-2.0
Gy of radiation in each fraction.
With a single IMRT treatment plan, it is practically impossible to devise a constant
dose-per-fraction delivery technique because only a single FMO problem is solved to
obtain the treatment plan, which is then simply divided into a number of daily fractions.
If a single plan is optimized to deliver doses to multiple target-dose levels, then the dose
per fraction delivered to each target must change in the ratio of a given dose level to the
maximum dose level. For example, say PTV1 has a prescription dose of 70 Gy, PTV2 has
a prescription dose of 50 Gy, and the number of fractions is 35. If a single treatment plan
is divided among the 35 fractions, then PTV1 will receive 70/35 = 2.0 Gy in each fraction,
but PTV2 will only receive 50/35 = 1.4 Gy, and thus any cancerous cells in PTV2 may
not be eradicated by the treatment. Similarly, if only 25 fractions are used in order to
ensure that PTV2 receives 2.0 Gy per fraction, then PTV1 receives 70/25 = 2.8 Gy per
fraction, well above the desired dose.
We propose a new method of approaching the fractionation subproblem wherein an
FMO treatment plan is developed for each target structure, rather than developing a
single treatment plan for all target structures. The individual treatment plans can then be
easily divided into optimal fractions.
The primal-dual interior point algorithm presented by Aleman et al. [88] is used to
solve the FMO and fractionation models to optimality.
4.2 Model Formulation
The fractionation model builds on the FMO model described in Chapter 2. To solve
the fractionation problem, we consider developing an individual fluence map solution for
each target. For a case with two targets, two plans must be developed: (1) a plan that
delivers the prescription dose to PTV1 and PTV2, and (2) a plan that “boosts” the dose
received by PTV1 to reach the prescribed dose level. These two fluence maps can then be
divided into the appropriate number of fractions easily. For the example of 50 Gy and 70
Gy prescription doses for PTV2 and PTV1, respectively, this would yield 25 fractions of
treating both PTV1 and PTV2 to 50/25 = 2.0 Gy, and another 10 treatments of treating
just PTV1 to (70− 50)/10 = 2.0 Gy.
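The arithmetic of this division is simple; a short Python sketch (illustrative only; the 2.0 Gy-per-fraction target follows the text):

    def fraction_schedule(prescriptions, dose_per_fraction=2.0):
        """Split multi-target prescriptions into constant dose-per-fraction
        stages: treat all remaining targets to the next prescription level,
        then drop the targets that have reached their prescription. Returns
        (targets treated, stage dose in Gy, number of fractions) per stage."""
        stages, prev = [], 0.0
        for i, rx in enumerate(sorted(prescriptions)):
            n = (rx - prev) / dose_per_fraction
            stages.append((len(prescriptions) - i, rx - prev, int(n)))
            prev = rx
        return stages

    # 50 and 70 Gy prescriptions: 25 fractions to both targets, then 10 boosts.
    print(fraction_schedule([70.0, 50.0]))  # [(2, 50.0, 25), (1, 20.0, 10)]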
For simplicity, we call these individual fluence maps “fractions”, rather than using the
term to describe the daily treatments. The development of these fluence maps separately
would result in suboptimal solutions. To optimize these fluence map sets simultaneously,
we consider each bixel in each fraction as an individual decision variable. As the number
of fractions is equal to the number of targets (T ), this results in a fluence map developed
for each target.
In the single FMO formulation, the dose to voxel j in structure s is defined as $z_{js} = \sum_{i=1}^{N} D_{ij} x_i$, and the penalty associated with it as $F_s(z_{js})$. Because the fractionation model will be concerned with dose-per-fraction as well as cumulative dose, new variables must be defined to express these values.
Define $x^f_i$, f = 1, . . . , T, as the fluence of beamlet i in fraction f. The amount of dose received by a voxel j in structure s in fraction f is defined as

$z^f_{js} = \sum_{i=1}^{N} D_{ij} x^f_i, \quad j = 1, \ldots, v_s,\; s = 1, \ldots, T,\; f = 1, \ldots, T \qquad (4\text{--}1)$
Critical structures are thought to be affected by only the cumulative dose received from all treatments, rather than just the dose in any one particular fraction. This cumulative dose received by a voxel is

$z_{js} = \sum_{f=1}^{T} \sum_{i=1}^{N} D_{ij} x^f_i, \quad j = 1, \ldots, v_s,\; s = T+1, \ldots, S \qquad (4\text{--}2)$
Critical structures are penalized in the same manner as in the original FMO model, that
is, Fs(zjs), s = T + 1, . . . , S.
Targets require a more complex treatment in the fractionation model. In each
fraction, we are primarily concerned with dose received by the targets in that particular
fraction. Thus, new variables are needed to express the amount of dose per fraction
received by a voxel (zfjs in Equation (4–1)).
Since we must also ensure that the cumulative dose received by each target reaches
the prescribed dose, variables to express the cumulative dose received by a voxel are
required. Intuitively, this cumulative dose should be the sum of all the doses received in all
fractions. If the cumulative dose for targets is defined this way, then over/underdosing
in one fraction can result in under/overdosing in another to compensate, which is
undesirable. To prevent such a scenario, another new variable called the artificial dose
is required ($\bar{z}^f_{js}$ in Equation (4–3)). Rather than simply summing up the dose received
in each fraction, we will assume that in the previous fraction, the target voxel received
exactly the correct prescription dose for the previous fraction. Thus, no compensating will
be necessary. The artificial dose is just the prescription dose from the previous fraction
($P_{f-1}$) plus the dose received in the current fraction:

$\bar{z}^f_{js} = P_{f-1} + z^f_{js}, \quad j = 1, \ldots, v_s,\; s = 1, \ldots, T,\; f = 1, \ldots, T \qquad (4\text{--}3)$
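To make the bookkeeping concrete, a numpy sketch of the three dose quantities for T = 2 (the data are illustrative; variable names mirror the model above):

    import numpy as np

    rng = np.random.default_rng(0)
    n_voxels, n_beamlets, T = 100, 20, 2
    D = rng.random((n_voxels, n_beamlets))   # assumed dose deposition matrix
    x = rng.random((T, n_beamlets))          # x[f] is the fluence map of fraction f
    P = np.array([0.0, 50.0])                # P[f-1]: prescription of the previous fraction

    z_f = D @ x.T                            # z_f[j, f]: dose to voxel j in fraction f (4-1)
    z_cum = z_f.sum(axis=1)                  # cumulative dose, used for critical structures (4-2)
    z_bar = P[np.newaxis, :] + z_f           # artificial dose: previous prescription + current fraction (4-3)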
Since each of the target voxels being irradiated in fraction f is treated as target f, the penalty function for these voxels is

$\sum_{s=f}^{T} \sum_{j \in V_s} F_f(z^f_{js})$
Once a target has received its prescription dose, ideally, it should not receive any
further dose. As target f is treated in fraction f , for all fractions after f , target f should
be treated as normal tissue. Specifically, targets that no longer require dose will be treated
as skin, denoted structure S. Therefore, these target voxels, along with actual skin voxels,
will be penalized with penalty function FS. The dose received by these target voxels is
the prescription dose of the voxel ($P_s$) plus the dose received in all subsequent fractions ($\sum_{\ell=s+1}^{T} z^{\ell}_{js}$). This leads to the following penalty function for voxels penalized as normal tissue in fraction f:

$\sum_{s=1}^{f-1} \sum_{j \in V_s} F_S\!\left(P_s + \sum_{\ell=s+1}^{T} z^{\ell}_{js}\right) + \sum_{j \in V_S} F_S(z_{jS})$
As with the traditional FMO model, penalty functions are normalized according to
the number of voxels in the structure. For critical structures, this normalization factor
is still 1/vs since there are always vs voxels being treated as critical structure s. In each
fraction, the number of target voxels depends on which targets still need to be treated.
Each fluence map set will only “see” the target voxels that are included in its prescription
dose level. Thus, define the number of target voxels treated in fluence map f as
$v^f = \sum_{s=f}^{T} v_s, \quad f = 1, \ldots, T$
The number of voxels treated as skin in each iteration can be expressed as $v^1 - v^f + v_S$, where $v^1 - v^f$ is the number of target voxels being treated as skin and $v_S$ is the number of actual skin/unspecified tissue voxels.
Identical to the traditional FMO model, the critical structures are normalized and penalized by

$\sum_{s=T+1}^{S-1} \frac{1}{v_s} \sum_{j \in V_s} F_s(z_{js})$
Let z be a vector of all $z_{js}$, $z^f_{js}$ and $\bar{z}^f_{js}$ variables. The objective function is obtained by summing the normalized penalty functions:

$F_{\mathrm{frac}}(z) = \sum_{f=1}^{T} \left\{ \frac{1}{v^1 - v^f + v_S} \left[ \sum_{s=1}^{f-1} \sum_{j \in V_s} F_S\!\left(P_s + \sum_{\ell=s+1}^{T} z^{\ell}_{js}\right) + \sum_{j \in V_S} F_S(z_{jS}) \right] + \frac{1}{v^f} \sum_{s=f}^{T} \sum_{j \in V_s} F_f(z^f_{js}) + \sum_{s=T+1}^{S-1} \frac{1}{v_s} \sum_{j \in V_s} F_s(z_{js}) \right\}$
The fractionation model is then formulated as

minimize   $F_{\mathrm{frac}}(z)$
subject to $z^f_{js} = \sum_{i=1}^{N} D_{ij} x^f_i, \quad j = 1, \ldots, v_s,\; s = 1, \ldots, T,\; f = 1, \ldots, T$
           $z_{js} = \sum_{f=1}^{T} \sum_{i=1}^{N} D_{ij} x^f_i, \quad j = 1, \ldots, v_s,\; s = T+1, \ldots, S$
           $\bar{z}^f_{js} = P_{f-1} + z^f_{js}, \quad j = 1, \ldots, v_s,\; s = 1, \ldots, T,\; f = 1, \ldots, T$
           $x \ge 0$
As the objective function is the sum of quadratic functions and the constraints are all linear, the fractionation formulation, just like the basic FMO formulation, is a convex quadratic programming problem.
4.3 Results
The fractionation model is tested using the primal-dual interior point algorithm
in Aleman et al. [88]. One significant benefit of employing a primal-dual interior point
algorithm is that the solution generated is guaranteed to be optimal to within a certain
tolerance that can be specified by the user. Thirteen head-and-neck cases using five
equi-spaced beams are tested. Each test case consists of two targets, PTV1 and PTV2,
with prescription dose levels of 70 Gy and 50 Gy, respectively.
According to the suggestions made on algorithm parameters in Aleman et al. [88], the primal-dual interior point algorithm was implemented with a Single Hessian Approximation and a stopping criterion of a relative duality gap of 0.1%. Although it was also recommended to remove “insignificant” beamlets, the removal of these beamlets actually increases run time in the fractionation model. Thus, insignificant beamlets are left in the fractionation model.
4.3.1 Computational Results
The tests are run in Matlab (MathWorks, Inc.) on a 2.33GHz Intel Core 2 Duo
processor with 2GB of RAM. Table 4-1 shows the sizes of each case in terms of the
number of decision variables (the number of bixels) and the size of the patient area being
treated (the number of voxels). The computation times obtained are displayed in Table 4-1.
On average, the fractionation model was solved in 22.03 seconds. With the same algorithm
parameters and weighting parameters, a single FMO treatment plan can be determined
in an average of 16.28 seconds, thus there is only a 35% increase in computation time
required to develop two FMO plans for the fractionation model. This relatively small
increase in time could be accounted for by the fact that the weighting parameters used
in the objective function were specifically tuned for the fractionation model. Using
parameters specifically tuned to the single-FMO model, the single-FMO model can be
solved on average in 9.36 seconds. Compared to this average run time, the FMO model
requires 2.4 times as much computation time to develop two models as opposed to one,
which is a more intuitive expectation of the interior point method’s performance.
Table 4-1. Case sizes and run times using identical algorithm and weighting parameters.

                                Single FMO            Fractionation
  Case    Bixels    Voxels    Iterations  Time (s)  Iterations  Time (s)
  1          813    85,017        16        8.39        16       19.60
  2         1320   189,234       103       82.69        14       55.34
  3          935    86,255        24       11.75        11       18.79
  4          692    58,636        15        6.87        11       11.47
  5         1044   102,262        14       13.16        12       29.70
  6         1005    84,369        13       10.31        12       25.58
  7          822    71,873        17        9.14        14       18.88
  8          802    92,307        59       22.92        14       20.19
  9          911    65,541        18       10.84        17       26.12
  10         642    66,634        25        7.94        16       12.44
  11         279    56,847        29        2.75        14        2.99
  12         994    96,105        17       12.30        12       27.13
  13         823    72,729        33       12.55        14       18.15
  Average    852    86,755        29       16.28        14       22.03
4.3.2 Clinical Results
Because there is no fundamental way of quantifying the quality of a treatment plan, DVHs are examined in addition to objective function values.
The prescription doses used are 70 Gy for PTV1 and 50 Gy for PTV2. These are
common prescriptions used in the cancer center at Shands Hospital at the University of
Florida. Figures 4-1 through 4-7 show both dose volume histograms (DVHs) and axial slices for
several cases. The DVHs show that in the first fraction, both PTV1 and PTV2 are treated
to 50 Gy, and in the second fraction, only PTV1 is treated to an additional 20 Gy. The
prescription dose for the fraction is marked by a vertical line. The amount of dose received
by each target in each fraction is clinically acceptable.
As this study focuses on head-and-neck cases where the most conflict lies in treating
the targets while sparing the saliva glands, only DVHs of the saliva glands are shown. All
other organs, including skin/unspecified tissue, receive a low enough amount of dose to be
spared in the treatment. The sparing criteria for each of the common critical structures in
head-and-neck cases are listed in Table 4-2. The critical structures involved in each case
Table 4-2. Sparing criteria for each critical structure.

  Structure              Percent (%)   ≤ Dose (Gy)
  brain stem                 100            55
  eyes                        50            30
  mandible                   100            70
  optic chiasm               100            55
  optic nerves               100            50
  parotid glands              50            30
  skin                       100            60
  spinal cord                100            45
  submandibular glands        50            30
vary, depending on their proximity to the tumor, and thus DVHs for some cases do not
include all saliva glands.
DVHs of the saliva gland doses in Fraction 1 show that the saliva glands receive the
majority of dose in the first fraction. Because the cumulative amount of dose received
determines whether or not critical structures can be spared, the DVHs for Fraction 2
depict the cumulative dose of these organs. The sparing criterion used for the saliva glands is that no more than 50% of the gland can receive more than 30 Gy; this point is marked as a star. For most cases, all of the saliva glands are spared.
Figures 4-1 through 4-7 also show the dose received in each fraction as a colorwash of a slice
of the patient. Fraction 1 delivers a homogeneous dose of 50 Gy to both PTV1 and PTV2
while generally avoiding overdosing any of the marked critical structures. In Fraction 2,
the dose to PTV1 is boosted by 20 Gy without delivering any unnecessary dose.
4.3.3 Spatial Coefficient Results
The concept of employing spatial information as described in Section 2.4 is also
applied to the fractionation model. One set of spatial coefficients is used to obtain both
fractions. For the fractionation treatment plans, the spatial coefficients are λ = 1.02,
µ = −0.92, β = 0.97 and the minimum coefficient for target voxels is 0.6.
Generally, the DVHs for both targets and critical structures using spatial coefficients
are similar to those obtained without using spatial coefficients. In fact, in the cases tested,
Figure 4-1. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right).
Figure 4-2. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right).
Figure 4-3. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right).
Figure 4-4. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right).
Figure 4-5. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right).
Figure 4-6. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right).
Figure 4-7. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right).
there were no instances of either the spatial or the non-spatial treatment plans yielding clinically significant differences in the DVHs. The slices show that there is improved homogeneity in the target doses when spatial coefficients are used.
The slices also indicate that because the spatial coefficients cause the target voxels to be weighted more heavily than other voxels, the model is more willing to deliver dose to critical structures than to overdose or underdose the target. This helps provide a uniform dose in the target, and should still be acceptable, as the cumulative dose for all critical structures remains within acceptable levels and there are no instances of sacrificing organs that were not already sacrificed in the non-spatial plan.
Because more critical structure voxels receive dose in the spatial plans, the dose deposited outside the target structures is more spread out, and thus the maximum dose received by the critical structure voxels is less than in the non-spatial plans. This of course means that more voxels are exposed to radiation, but at lower levels, and the amount of radiation still falls within clinically acceptable limits. The resulting improvement in homogeneity is evident for each of the cases, but the effect of the more spread-out dose is best illustrated in the second fraction of each case.
Figures 4-8–4-14 show the DVHs and slices for some of the tested cases. In particular,
Figures 4-9, 4-10, 4-11 and 4-14 demonstrate that the spatial coefficients reduce the
amount of dose delivered outside of the targets when compared to their respective
non-spatial plans in Figures 4-2, 4-3, 4-4 and 4-7.
4.4 Conclusions and Future Directions
The fractionation model presented allows for the creation of guaranteed optimal
fluence maps for each fraction of a patient’s treatment. These fluence maps can be easily
divided into the appropriate number of fractions without sacrificing optimality. Using the
primal-dual interior point method, the fractionation model obtains fluence maps for each
target in a clinically feasible amount of time. As expected, the computation time required
to generate two fluence maps for a two-target case is more than the time necessary to
Figure 4-8. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right) using spatial coefficients.
Figure 4-9. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right) using spatial coefficients.
Figure 4-10. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right) using spatial coefficients.
Figure 4-11. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right) using spatial coefficients.
Figure 4-12. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right) using spatial coefficients.
Figure 4-13. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right) using spatial coefficients.
Figure 4-14. Target DVHs, saliva DVHs and axial slices in Fractions 1 (left) and 2 (right) using spatial coefficients.
generate a single FMO plan, but the computation times are still acceptable. Further
parameter tuning could possibly yield better results.
The addition of spatial coefficients in the model allows for improved homogeneity,
but does not seem likely to provide additional organ sparing. The improved homogeneity
alone is enough to warrant the inclusion of spatial information in the model. The model is sensitive to changes in the spatial coefficients, so further parameter tuning will have to proceed in small increments.
Currently, the model assumes that prior to each fraction, each target voxel has
received exactly the prescribed amount of dose up to that point in time. While we
have assumed that over/underdose in one fraction should not be compensated by
under/overdose in another fraction, it may in fact be advantageous to allow for some
degree of compensation. The fractionation formulation proposed affords enough flexibility
to model such a scenario. For example, say a physician would like to allow underdose in target $s$ in previous fractions to be compensated by up to $\xi$ Gy of overdose in the current fraction. Then, for target structure $s$, the $P_s$ term in the objective function would be replaced by the expression $\max\{z_{js}, P_s - \xi\}$. As this type of discontinuity already exists in the model, the structure of the model would not be altered by this modification.
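As a small numerical illustration of this rule (all values below are made up), the baseline that would replace $P_s$ can be computed as follows:

    % Sketch of the proposed compensation rule: underdose in earlier fractions
    % may be made up by at most xi Gy of overdose now, replacing the P_s term
    % with max{z_js, P_s - xi}.  All values are illustrative.
    Ps  = 50;                    % prescription already delivered to target s (Gy)
    xi  = 2;                     % allowed compensation (Gy)
    zjs = 48.5;                  % cumulative dose actually received by voxel j (Gy)
    ref = max(zjs, Ps - xi);     % effective baseline replacing P_s in the penalty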
Other future research possibilities include further parameter testing to employ the
model on other cancer site treatments, for example, lung and prostate cancers.
CHAPTER 5
A MONTE CARLO METHOD FOR MODELING DOSE DEPOSITION
5.1 Introduction
The FMO problem relies on the calculation of the amount of total radiation dose
received in each voxel. The dose in a voxel is determined by the paths the photons in
the radiation beams follow through the patient. Some photons may collide with particles
inside the patient and scatter in any direction, while others may collide with particles
and be absorbed. Still other photons may pass entirely through the patient with no
collisions. Due to the unpredictable nature of the radiation beam inside the patient, the
dose received in a voxel can only be accurately obtained through Monte Carlo simulations.
In IMRT optimization, however, a simple linear relationship between total dose and beamlet fluences is commonly assumed and accepted as a satisfactory dose approximation. Errors of as much as 30% have nevertheless been reported for photon beams near tissue inhomogeneities (Ma et al. [5]).
For IMRT optimization, particularly with the advent of image-guided IMRT (IGIMRT),
or 4D IMRT, the FMO problem must be solved extremely quickly to create real-time
treatment plans. Thus, the speed of the FMO problem is paramount. While Monte Carlo
simulation may provide the most accurate measure of dose, the lengthy computation
time renders the method impractical for clinical use. We propose a Monte Carlo method
that performs a limited number of histories to obtain a noisy approximation of the dose
distribution of each beamlet to which a smoothing function can be applied in order to
determine an accurate dose distribution. The anticipation is that few histories will be
required, making this approach clinically feasible.
Recently, a similar approach has been taken by Jelen and Alber [89] and Jelen et al.
[90] with good results. Jelen et al. [90] acknowledge that there is some loss of accuracy
at the beam’s edge due to a lack of lateral density correction and the effects arising from
MLC systems, for example, tongue-and-groove and inter-leaf scatter. Jelen and Alber
[89] pursue the issue of density scaling, but the MLC effects have not yet been addressed.
Section 5.6 proposes some possible methods of accounting for such MLC effects.
5.2 Monte Carlo Engine
The “Dose Planning Method” (DPM) (Sempau et al. [91]) program will be used
to perform the Monte Carlo simulations. DPM is designed to simulate the transport of
photons in radiotherapy class problems. DPM is primarily based on the public domain
code PENELOPE (Baro et al. [92], Sempau et al. [93]).
This study focuses on modeling a finite sized pencil beam emanating from a 6MV
linear accelerator. A finite sized pencil beam is a beam of finite size that is parallel to
the point source of radiation. To determine a reasonably accurate measure of the dose of
a single beamlet in a given tissue, approximately one billion histories are run in DPM.
As fewer histories are run, the inaccuracies of the dose resulting from the Monte Carlo
experiment grow. Figure 5-4 shows how the noise in the depth-dose curve of the beamlet
becomes increasingly pronounced as the number of histories decreases. As shown by
Table 5-1, the amount of time required to run each experiment is approximately linear in
the number of histories recorded. Thus, it is impractical to run the number of histories
necessary for acceptable accuracy.
5.3 Dose Distribution of a Beamlet
The accuracy of a treatment plan is contingent upon the accuracy of the calculated
dose deposited by each beamlet in the plan. Because the particles in a beamlet scatter in
three dimensional space, multiple dose distributions must be considered to satisfactorily
model the beamlet’s affect on the patient’s tissue. These distributions arise from the
amount of radiation the beamlet deposits as a function of depth (the depth-dose curve),
and from the amount of radiation radiating outward from the center of the beamlet (the
lateral penumbra).
5.3.1 Depth-Dose Curve
The depth-dose curve represents the radiation intensity deposited by the beamlet in the tissue through which it passes as a function of depth. Figure 5-1 shows the dose distribution of a single 6MV beamlet in various tissues obtained from the DPM simulations. The dose distribution of a beamlet in water is empirically known, and the results from the DPM simulation in water can easily be verified to be correct. Muscle, which has a density nearly identical to that of water (1.04 g/cm3 versus 1.00 g/cm3), has nearly the same depth-dose distribution as water. As expected, a beamlet passing through lung tissue, which is significantly less dense than water, does not lose its intensity as quickly. Lastly, a simulation with inhomogeneous tissue is considered: muscle with a 10-cm thick layer of lung located at a depth of 10 cm. The resulting dose distribution shows that when the beamlet reaches the less-dense segment of lung, its depth-dose curve becomes less steep, indicating that less radiation intensity is lost through the lung than through the muscle. Once the deeper layer of muscle is reached, the steepness of the depth-dose curve increases again.
Figure 5-1. Dose distribution of a single beamlet in various tissues (depth-dose curves in water, muscle, lung and muscle-lung-muscle after 1B histories).
Although it may seem unintuitive that the depth-dose curve increases at shallow
depths, this behavior is called the build-up curve, and is explained by the likelihood of
electrons scattering out of the tissue and into air at shallow depths. Because the density
of air is extremely small, an electron that reaches air is likely to travel very far away from
the tissue, and therefore unlikely to return to the tissue and deposit radiation dose. Once
the depth passes a certain point, the electrons can no longer escape the tissue, and the amount of dose received in the tissue increases up to that point. Beyond that point, the amount of radiation delivered by the beamlet decreases monotonically with depth, as would be expected.
5.3.2 Lateral Penumbra
In addition to the dose distribution occurring as the beamlet penetrates the tissue,
there is a dose distribution spreading away from the beamlet. Just as light emanating
from a flashlight in a dark room does not have a discrete boundary between light and
dark, the radiation delivered by a beamlet also does not have a discrete boundary between
what is and is not irradiated. With a circular flashlight beam shone onto a flat surface,
it is apparent from the distribution of the illuminated portion of the surface that some of
the light is diffused into the surrounding darkness as a result of scatter. If the distribution
of light in the circular projection of the flashlight beam is plotted, a bell-shaped curve results: brightness is greatest at the center of the illuminated disc and decreases toward the edge of the disc, eventually reaching complete darkness.
This behavior is parallel to the behavior of a beamlet passing through any medium.
From The Physics of Radiation Therapy [94], the penumbra of a beam is the region
at the edge of a radiation beam, over which the dose rate changes rapidly as a function of
distance from the beam axis. Hence, the distribution of radiation dose originating from the
beamlet described above is called the lateral penumbra. Figure 5-2 shows the colorwash
of the dose distribution constituting the lateral penumbra, while Figure 5-3 shows the dose distribution of the lateral penumbra at a fixed depth in one dimension, obtained from one billion Monte Carlo histories of a 5-cm finite sized pencil beam in water.

Figure 5-2. Colorwash of the lateral penumbra of a finite sized pencil beam.
5.4 Methodology to Model a Beamlet
Modeling the dose distribution of a beamlet is relatively straightforward for a beamlet in a single medium. The difficulty arises when the beamlet traverses multiple media, because the varying densities affect the particle scattering of the beam, thus affecting both the depth-dose curve and the lateral penumbra. As previously stated, errors of as much as 30% have been reported for photon beams near tissue inhomogeneities (Ma et al. [5]). Because there are numerous inhomogeneities in most cancer treatment sites, these inhomogeneities are of particular interest. The beamlet's behavior at the boundary of different tissue types cannot be determined as easily, and thus requires Monte Carlo simulation. In designing an IMRT treatment plan for a patient, there can be more than a dozen different structures (tissue types) with complicated boundary geometries.
Figure 5-3. Plot of the lateral penumbra of a finite sized pencil beam.
Knowledge of a beamlet's behavior given certain tissue inhomogeneities can be very useful in accurately determining the dose in a voxel.
5.4.1 Modeling the Depth-Dose Curve
In this section, we analyze the behavior of the depth-dose curve under both single tissue and multiple tissue scenarios. The goal of this analysis is to determine the minimum number of Monte Carlo histories required to obtain a reasonably accurate approximating function of the dose deposited at each depth in the tissue. For both the single-medium and multiple-media instances, this is done by fitting the depth-dose curve from Monte Carlo experiments with varying numbers of histories to high-degree polynomial functions. The polynomial fits are then compared to the polynomial fit of a very accurate measure of the depth-dose curve obtained from a number of Monte Carlo histories accepted to be satisfactorily accurate.
The number of histories recorded in the Monte Carlo simulation can have a drastic
effect on the accuracy of the data collected. For example, Figure 5-4 demonstrates the vast
Figure 5-4. Observed depth-dose curve in water for several histories (1B, 100M, 10M and 1M).
variation observed in the depth-dose curve of a beamlet in water for histories ranging from
one million to one billion. It is hoped that after a certain number of histories, the function
approximation of the data will closely follow the function approximation of very accurate
data obtained from a large number of histories.
For a beamlet in both homogeneous and heterogeneous tissue, the depth-dose curve
can be modeled using a polynomial function of order k. Although the depth-dose curve
may exhibit changes in concavity in the presence of tissue inhomogeneity, a high degree
polynomial will capture the curve’s behavior.
The variation of a $k$-degree polynomial fitted to $n$-history Monte Carlo data is measured by

$$v_{k,n,n'} = \left\| d^{(n')} - p^{(k,n)} \right\|_2,$$

where $d^{(n')}$ is the actual observed depth-dose curve from $n'$ Monte Carlo histories and $p^{(k,n)}$ is the vector of approximated depth-dose values obtained from a polynomial fit of degree $k$ to data obtained from $n$ Monte Carlo histories. It is desirable that $n' > n$, so that the quality of the polynomial fit is assessed against more accurate data.

Figure 5-5. Polynomial fits of several histories compared to the observed 1B-history depth-dose curve in water (1B: k=27, var=0.050278; 100M: k=23, var=0.080158; 10M: k=28, var=0.21877; 1M: k=24, var=0.54071).
In this study, the accuracy of the polynomial obtained is judged by its variation from the data observed from a very large number of Monte Carlo histories, that is, $n' \gg n$ in the calculation of $v_{k,n,n'}$. Figure 5-5 shows that for the illustrated numbers of histories,
the polynomial fit from 100 million histories closely resembles not only the polynomial fit
from one billion histories, but also the actual data collected from one billion histories. The
polynomial fit to one million histories is clearly an unsatisfactory approximation to the
data collected from one billion histories.
For several numbers of Monte Carlo histories, the best approximating polynomial function with degree in the range $[\underline{k}, \overline{k}]$ is found, that is, $k^* = \arg\min_{k \in [\underline{k}, \overline{k}]} \{v_{k,n,n'}\}$. Several degrees are tested because the degree of the polynomial can significantly affect the quality of the fit, even for polynomials that are only one degree apart. Figure 5-6 illustrates the amount of variation observed in the polynomial approximation as a function of the degree of the polynomial, for polynomials fitted to the depth-dose curve of a beamlet in water obtained from 1 billion histories.

Figure 5-6. Variation of polynomial fit as a function of degree.
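A minimal Matlab sketch of this degree search is given below; the synthetic curves merely stand in for the $n$-history and reference $n'$-history DPM data, and polyfit's centering/scaling form is used to keep high-degree fits numerically stable.

    % Sketch of the degree search for the depth-dose fit: for each degree k in
    % [kLo, kHi], fit the noisy n-history curve and keep the degree whose fit
    % deviates least from the high-accuracy reference curve.  The synthetic
    % curves below merely stand in for the DPM data.
    depth = linspace(0, 30, 61)';                     % depth (cm)
    dRef  = exp(-0.05*depth) .* (1 - exp(-2*depth));  % smooth stand-in for n'-history reference
    dN    = dRef + 0.05*randn(size(dRef));            % noisy stand-in for n-history data

    kLo = 10;  kHi = 45;                 % very high degrees may still warn about conditioning
    bestV = inf;  bestK = kLo;
    for k = kLo:kHi
        [p, ~, mu] = polyfit(depth, dN, k);       % centered/scaled fit for numerical stability
        fitted     = polyval(p, depth, [], mu);   % approximated depth-dose values p^(k,n)
        v          = norm(dRef - fitted, 2);      % variation v_{k,n,n'}
        if v < bestV
            bestV = v;  bestK = k;
        end
    end
    fprintf('best degree k* = %d, variation = %.4f\n', bestK, bestV);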
5.4.2 Modeling the Lateral Penumbra
In this section, we analyze the behavior of the lateral penumbra under both single tissue and multiple tissue scenarios. The lateral penumbra of a beam is a bell-shaped curve that can be approximated as the sum of error function pairs. The error function, $\operatorname{erf}(x)$, is twice the integral of the Gaussian distribution with mean 0 and variance 1/2:

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt.$$
Figure 5-7A demonstrates a sample error function. While a single side of the lateral penumbra of a beamlet resembles an error function, a closer approximation to a single side of the lateral penumbra is represented as the average of two error functions, given by

$$\frac{a}{2}\left[\operatorname{erf}\left(\frac{x + x_0}{\sigma}\right) - \operatorname{erf}\left(\frac{x - x_0}{\sigma}\right)\right],$$
Figure 5-7. An error function and an error function pair. A) Error function. B) Error function pair.
where $a$ is the amplitude, $x_0$ is the offset and $\sigma$ is the variation of the two error functions. The expression is divided by 2 to take the average of the error function pair. An example of an error function pair is given in Figure 5-7B.
Because the lateral penumbra of a beamlet resembles an error function on both the
left- and right-hand sides of the beam center, the lateral penumbra L(x) is represented as
the sum of the averages of $N$ error function pairs, given by

$$L(x) = \sum_{i=1}^{N} \frac{a_i}{2}\left[\operatorname{erf}\left(\frac{x + x_{0_i}}{\sigma_i}\right) - \operatorname{erf}\left(\frac{x - x_{0_i}}{\sigma_i}\right)\right],$$

where $a_i$ is the amplitude, $x_{0_i}$ is the offset and $\sigma_i$ is the variation of error function pair $i$, $i = 1, \ldots, N$.
To determine the parameters ai, x0iand σi for each of the N error function pairs, a
Levenberg-Marquardt quasi-Newton minimization method is employed. This method takes
as input N and an initial guess of the parameters and returns a locally optimal solution
to the problem of minimizing the variation between the real data and the sum of the error
function pairs.
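A minimal Matlab sketch of this fit, using lsqcurvefit from the Optimization Toolbox with the Levenberg-Marquardt algorithm, is given below; the penumbra data and the initial guess are illustrative stand-ins, not the DPM data or parameters used in this study.

    % Sketch of fitting L(x), a sum of N error function pairs, to lateral
    % penumbra data with Levenberg-Marquardt (lsqcurvefit, Optimization
    % Toolbox).  xdata/ydata and the initial guess are illustrative stand-ins.
    xdata = linspace(0, 12, 121)';
    ydata = 0.5*(erf((xdata + 2.5)/0.5) - erf((xdata - 2.5)/0.5)) ...
          + 0.01*(erf((xdata + 5)/2) - erf((xdata - 5)/2)) ...
          + 0.005*randn(size(xdata));              % noisy stand-in penumbra

    N  = 4;                                        % number of pairs; q0 has 3N entries
    q0 = [1 2.0 0.4,  1 3.0 0.4,  0.02 4 0.8,  0.02 6 0.8];   % [a x0 sigma] per pair
    opts = optimoptions('lsqcurvefit', 'Algorithm', 'levenberg-marquardt');
    qFit = lsqcurvefit(@erfPairSum, q0, xdata, ydata, [], [], opts);
    plot(xdata, ydata, '.', xdata, erfPairSum(qFit, xdata), '-');

    function L = erfPairSum(q, x)
    % L(x) = sum_i (a_i/2) * (erf((x + x0_i)/sigma_i) - erf((x - x0_i)/sigma_i))
    L = zeros(size(x));
    for i = 1:numel(q)/3
        a = q(3*i - 2);  x0 = q(3*i - 1);  sig = q(3*i);
        L = L + (a/2) .* (erf((x + x0)./sig) - erf((x - x0)./sig));
    end
    end

Note that lsqcurvefit does not accept bound constraints under the Levenberg-Marquardt algorithm, which is why the lb and ub arguments are passed as empty.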
At a given depth in the tissue, the amplitude of the error function is determined by
the value of the depth-dose curve at that depth. Thus, for each tissue type, it is only
Figure 5-8. Lateral penumbra for several numbers of Monte Carlo histories.
necessary to model a single lateral penumbra, and then that model can be extended to
all depths simply by manipulating the amplitude according to the depth-dose curve.
Figure 5-3 shows the lateral penumbra of a 5-cm finite sized beamlet at a fixed depth
in water for a number of Monte Carlo histories deemed to yield a satisfactorily accurate
representation of the dose deposited in the tissue. Using the method described above, the
lateral penumbra was modeled to yield the approximation to the observed data collected
for the various Monte Carlo histories shown in Figure 5-8.
In a similar fashion to the method for modeling the depth-dose curve, the method
for modeling the lateral penumbra consists of fitting the sum of error function pairs to
the lateral penumbra data. The quality of these fits is judged by their variation from the
observed data for a sufficiently large number of Monte Carlo histories to obtain accurate
dose information.
Figure 5-9. Error function fits of several histories compared to the observed 1B-history lateral penumbra of a beamlet in water (1B: var=0.070667; 100M: var=0.075414; 10M: var=0.14511; 1M: var=1.1829).
Just as in the method for determining the quality of the depth-dose curve approximation, the variation of the error function fit from the actual lateral penumbra is calculated as

$$\nu_{n,n'} = \left\| L^{(n')} - L^{(n,N)} \right\|_2,$$

where $L^{(n')}$ is the observed lateral penumbra data from a simulation of $n'$ histories, and $L^{(n,N)}$ is the approximated lateral penumbra obtained from the parameters fitted to the expression $L_N(x)$. It is desirable that $n' > n$.
Figure 5-9 displays the error function pair fits obtained from the Levenberg-Marquardt
method, as well as the variation of the fits from the observed data from one billion
histories. The variation is measured in the same manner as described in Section 5.4.1.
It is anticipated that although the lateral penumbra exhibits different dose distributions
in materials of different densities, the distribution will only show a fundamental change
in shape if the beam simultaneously hits multiple tissues of varying densities. In such a
situation, the penumbra, which is taken to be symmetric about the center of the beam in
Table 5-1. Computation times in minutes of Monte Carlo simulations.

  n        Water     Muscle    Lung      Muscle-Lung-Muscle
  1e9      222.184   211.887   111.318        186.894
  100e6     20.543    21.256    11.239         18.701
  10e6       2.210     2.234     1.269          1.986
  1e6        0.244     0.339     0.233          0.309
homogeneous tissue, will no longer be symmetric. To model the lateral penumbra under
inhomogeneous material, a sum of error function pairs can still be employed, though it
may be necessary to increase the number of error function pairs. The difficulty will lie in correctly determining when an additional error function pair is needed. A possible measure could be the variation between the lateral penumbra
approximation and the observed data.
5.5 Results
The homogeneous tissues tested are water, muscle and lung, and the heterogeneous
material tested consists of muscle and lung. Each scenario is considered to have a depth
of 30cm. The voxel sizes are 5mm × 5mm × 5mm, and a 5-cm finite sized pencil beam
is considered. For each simulation, tests were run with 1 billion, 100 million, 10 million
and 1 million Monte Carlo histories in DPM on a Mac OS X 10.4.6 machine with dual
2.3GHz PowerPC G5 processors and 8GB of RAM. Due to time constraints, the muscle
tests are run to a maximum of 100 million iterations, and all comparisons to the fit quality
are made to this 100-million-history data instead of the 1-billion-history data used for the
other simulations.
As can be seen from the computation times in Table 5-1, the run time of DPM is
approximately linear in the number of histories. Although a larger number of Monte Carlo
histories yields improved accuracy, the maximum number of histories considered is one
billion because of time limitations and the satisfactory accuracy of the 1-billion-history
runs.
For each of the tested tissue types, the depth-dose curves and lateral penumbras
were modeled using the methods described in Section 5.4. For the polynomial fits of the
depth-dose curve, the values $\underline{k}$ and $\overline{k}$ are chosen as 10 and 45, respectively. By searching for the polynomial approximation over such a large range of degree values, an acceptably accurate fit is likely to be found.
For the lateral penumbra, $N$ was chosen as 4 because, in addition to the obvious need for two error function pairs to model the sides of the lateral penumbra, an additional pair is needed to model each tail with reasonable accuracy. For example, the four error function pairs used to model the lateral penumbra of a beamlet in water (Figure 5-9) are shown separately in Figure 5-10. The computation times required to obtain each of the function approximations are displayed in Table 5-2.
The initial parameters $a_i$, $x_{0_i}$ and $\sigma_i$ for each error function pair $i$, $i = 1, \ldots, N$, used to approximate the lateral penumbra are obtained by the following method. Of the four error function pairs considered, two ($I = \{1, 2\}$) are used to model the steep sides of the lateral penumbra, and the other two ($\bar{I} = \{3, 4\}$) are used to model the tails of the dose distribution. At a given depth $z$, the amplitude $a_i$ is

$$a_i = \begin{cases} d(z) & i \in I \\ d(z)/50 & i \in \bar{I}, \end{cases}$$

where $d(z)$ represents the value of the depth-dose curve approximation at depth $z$. The expression for the amplitude when $i \in \bar{I}$ was obtained by experimenting with several different fractions of $d(z)$.
The $\sigma$ value of an error function pair determines the shape of the curve: as $\sigma$ increases, the curve becomes increasingly spread out. Thus, it is desirable to have a small $\sigma_i$ value for $i \in I$, since the error function pairs in $I$ need only model the sides of the lateral penumbra, and a larger $\sigma_i$ value for $i \in \bar{I}$, since the error function pairs in $\bar{I}$ must model the elongated tails of the lateral penumbra. For the tissues tested, the $\sigma_i$ values
Table 5-2. Computation times in seconds of approximating function fits to the dose distribution. The polynomial fits to the depth-dose curve are denoted D.D., and the error function fits to the lateral penumbra are denoted Lat.Pen.

            Water             Muscle            Lung              Muscle-Lung-Muscle
  n       D.D.   Lat.Pen.   D.D.   Lat.Pen.   D.D.   Lat.Pen.   D.D.   Lat.Pen.
  1e9     0.078   2.640     0.078   2.422     0.094   1.062     0.078    n/a
  100e6   0.078   1.172     0.078   2.625     0.828   0.906     0.109    n/a
  10e6    0.110   3.454     0.109   1.390     2.609   2.594     0.093    n/a
  1e6     0.094   1.407     0.094   1.172     1.063   0.953     0.078    n/a
used are

$$\sigma_i = \begin{cases} 0.4 & i \in I \\ 0.8 & i \in \bar{I}. \end{cases}$$

These values were obtained through experimentation.
For the 5-cm finite sized pencil beams used in this experiment, the offsets $x_{0_i}$ were empirically set at values of 8.5, -3.5, 11 and -1 for $i = 1, \ldots, N$, respectively. A method of identifying these offsets from the Monte Carlo data, by basing them on the slope of the observed data, is planned for future research.
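Packed in the same [a, x0, sigma] ordering as the fitting sketch in Section 5.4.2, the initial parameter vector implied by this scheme can be assembled as follows (d_z stands for the fitted depth-dose value at the depth of interest; all other numbers are those quoted above):

    % Sketch of the initial-parameter scheme described above.  d_z is the
    % value of the fitted depth-dose curve at the depth of interest.
    d_z   = 0.85;                           % illustrative depth-dose value
    aInit = [d_z, d_z, d_z/50, d_z/50];     % pairs in I model the sides; pairs in Ibar the tails
    sInit = [0.4, 0.4, 0.8, 0.8];           % small sigma for sides, larger for tails
    xInit = [8.5, -3.5, 11, -1];            % empirical offsets for the 5-cm beam
    q0    = reshape([aInit; xInit; sInit], 1, []);   % packed as [a, x0, sigma] per pair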
The results for the fits of both the depth-dose curve and the lateral penumbra of a
beamlet in water are shown in the examples in Section 5.4. Figures 5-11 through 5-14 show the
results of the fits for the muscle and lung tissues. From the computational results, it is
clear that the time to obtain fits to the Monte Carlo data is insignificant compared with
the amount of time required to run the Monte Carlo histories, even for as few as 1 million
histories.
To test the model in the presence of tissue inhomogeneity, a 10cm-thick layer of lung
between two 10cm-thick layers of muscle is considered. As expected, for the first 10cm,
the depth-dose curve of the muscle-lung-muscle case is identical to that of the muscle
depth-dose curve. Once the beamlet reaches the significantly less dense layer of lung (lung
has a density of 0.30g/cm3), a predominant change in the depth-dose curve is evident
(Figure 5-1). Once the layer of lung is reached, the rate of decrease in the amount of dose
deposited in the tissue decreases, that is, less radiation intensity is lost as the beamlet
Figure 5-10. Error function pairs summed to approximate a beamlet in water.
Figure 5-11. Depth-dose curves in muscle tissue. A) Monte Carlo histories. B) Polynomial fits (1B: k=27, var=0.046123; 100M: k=22, var=0.075212; 10M: k=22, var=0.20284; 1M: k=24, var=0.69751).
Figure 5-12. Lateral penumbra curves in muscle tissue. A) Monte Carlo histories. B) Error function fits (1B: var=0.10448; 100M: var=0.11761; 10M: var=0.12386; 1M: var=0.20889).
Figure 5-13. Depth-dose curves in lung tissue. A) Monte Carlo histories. B) Polynomial fits (1B: k=22, var=0.056789; 100M: k=22, var=0.11006; 10M: k=22, var=0.32993; 1M: k=24, var=1.0547).
Figure 5-14. Lateral penumbra curves in lung tissue. A) Monte Carlo histories. B) Error function fits (1B: var=0.097127; 100M: var=0.10275; 10M: var=0.12493; 1M: var=0.12937).
passes through the lung tissue. When the beamlet reaches the second layer of muscle, this
rate increases again. The same approach used to model the depth-dose curve in a single
tissue continues to work well in multiple tissues. Figures 5-15A and 5-15B illustrate the
ability of a polynomial to approximate the depth-dose curve in inhomogeneous tissue.
Because testing the beamlet in a scenario where it could hit multiple tissues
simultaneously is reserved for future research, results for modeling the lateral penumbra in
the multiple-tissue scenario tested are identical to those for the single-tissue scenario. The
lateral penumbra at a given depth in a certain tissue can be modeled by using the dose
from the depth-dose curve at the given depth as the amplitude of the lateral penumbra.
The dose distribution in the lateral penumbra can then be modeled according to the same
error function pairs used in modeling the lateral penumbra in a single-tissue scenario of
the same medium.
Figure 5-16 illustrates the variations of the fits used to approximate the depth-dose
and lateral penumbra distributions of a beamlet in water as a function of the number
of histories. From this data, it is very clear that the accuracy of the beamlet model is
directly correlated with the number of Monte Carlo histories. It is interesting that there
is not a significant improvement in the beamlet model accuracy from 100 million to 1
Figure 5-15. Depth-dose curves in heterogeneous muscle and lung tissue. A) Monte Carlo histories. B) Polynomial fits (1B: k=27, var=0.052096; 100M: k=27, var=0.10053; 10M: k=23, var=0.21323; 1M: k=22, var=0.88061).
Table 5-3. Variation of fits to several numbers of histories with n' = 1 billion. Columns list the depth-dose variation v_{k*,n,n'} and the lateral penumbra variation ν_{n,n'} for each material.

            Water            Muscle           Lung             Muscle-Lung-Muscle
  n       v       ν        v       ν        v       ν        v       ν
  1e9     0.050   0.071    0.046   0.105    0.057   0.097    0.052    n/a
  100e6   0.080   0.075    0.075   0.118    0.110   0.103    0.101    n/a
  10e6    0.219   0.145    0.203   0.124    0.330   0.125    0.213    n/a
  1e6     0.541   1.183    0.698   0.209    1.055   0.129    0.881    n/a
billion histories, even though computing 100 million histories requires approximately one tenth of the time required for 1 billion histories. Depending on the composition of the tissue, as few as 10 million histories may yield reasonable accuracy, particularly for the depth-dose curve approximation.
5.6 Conclusions and Future Directions
In conclusion, the Monte Carlo approach presented models the dose distribution of a beamlet using a limited number of histories. Using the polynomial
and error function pair fitting techniques described, dose distributions with satisfactory
accuracy can be obtained using at least a factor of 10 fewer Monte Carlo histories than
would otherwise be required. This can greatly decrease the amount of time required to
obtain dose data for beamlets in the FMO problem of IMRT treatment planning without
sacrificing accuracy.
Figure 5-16. Variations of the fits used to approximate the depth-dose and lateral penumbra distributions as a function of the number of histories.
For future work, more tests on the number of Monte Carlo histories needed will be run, particularly with histories in the range of 10-100 million. More tests of
varying tissues, both homogeneous and heterogeneous, will be run to determine a smaller
range of degrees to be evaluated for the polynomial fit to the depth-dose curve. An
automated method of determining a quality set of initial parameters to model the lateral
penumbra will also be developed. Lastly, the scenario where a beamlet hits multiple
tissues simultaneously will be tested using our model for approximating the lateral
penumbra.
Jelen and Alber [89] and Jelen et al. [90] have demonstrated that a beamlet can
be modeled very effectively using an approach based on the one described here. This
approach was improved upon by scaling the modeling parameters according to tissue
density in Jelen and Alber [89]. Despite the sophistication of the density scaling method
employed, the model loses accuracy in the penumbra regions and at the edge of tissue
heterogeneities. This study also used a Levenberg-Marquardt algorithm to determine the
modeling parameters, and although the details of the implementation are not provided, it
is possible that with an improved initial guess or damping parameter, the algorithm could
converge to better modeling parameters, thus providing improved prediction of beamlet
behavior at the penumbra.
To further improve upon their work, the effects of the MLC must be considered. One
method of accounting for these effects could be to model the dose deposition of an entire
aperture rather than just the dose deposition of a single beamlet. As the number and
shape of apertures required to deliver an FMO-based IMRT optimization are unknown,
this method would be most practical if an aperture modulation approach—where aperture
fluences from a pre-defined set of apertures are chosen, instead of fluences from individual
beamlets—is employed instead of an FMO approach, as the number and shape of the
apertures in consideration are predetermined.
REFERENCES
[1] American Cancer Society. Cancer Facts and Figures Report. 2006.

[2] Murphy GP, Lawrence WL, Lenhard RE, eds. American Cancer Society Textbook on Clinical Oncology. The American Cancer Society, 1995.

[3] Perez CA, Brady LW. Principles and Practice of Radiotherapy. Lippincott-Raven, 3rd edn., 1998.

[4] Steel GG. Basic Clinical Radiobiology for Radiation Oncologists. Edward Arnold Publishers, 1994.

[5] Ma CM, Mok E, Kapur A, Findley D, Brain S, Boyer AL. Clinical implementation of a Monte Carlo treatment planning system. Medical Physics 1999;26:2133–43.

[6] Bortfeld T. Optimized planning using physical objectives and constraints. Semin Radiat Oncol 1999;9:20–34.

[7] Alber M, Nusslin F. An objective function for radiation treatment optimization based on local biological measures. Phys Med Biol 1999;44:479–493.

[8] Jones LC, Hoban PW. Treatment plan comparison using equivalent uniform biologically effective dose (EUBED). Phys Med Biol 2000;45:159–170.

[9] Kallman P, Lind BK, Brahme A. An algorithm for maximizing the probability of complication-free tumor control in radiation therapy. Phys Med Biol 1992;37:871–890.

[10] Mavroidis P, Lind BK, Brahme A. Biologically effective uniform dose for specification, report and comparison of dose response relations and treatment plans. Phys Med Biol 2001;46:2607–2630.

[11] Niemierko A. Reporting and analyzing dose distributions: a concept of equivalent uniform dose. Medical Physics 1997;24:103–110.

[12] Niemierko A, Urie M, Goitein M. Optimization of 3D radiation therapy with both physical and biological end-points and constraints. Int J Radiat Oncol Biol Phys 1992;23:99–108.

[13] Wu QW, Djajaputra D, Wu Y, Zhou JN, Liu HH, Mohan R. Intensity-modulated radiotherapy optimization with gEUD-guided dose-volume objectives. Phys Med Biol 2003;48:279–291.

[14] Wu QW, Mohan R, Niemierko A, Schmidt-Ullrich R. Optimization of intensity-modulated radiotherapy plans based on the equivalent uniform dose. Int J Radiat Oncol Biol Phys 2002;52:224–235.
[15] Hamacher HW, Kufer KH. Inverse radiation therapy planning: a multiple objective optimization approach. Discrete Applied Mathematics 2002;118:145–161.

[16] Bednarz G, Michalski D, Houser C, Huq MS, Xiao Y, Anne PR, Galvin JM. The use of mixed-integer programming for inverse treatment planning with pre-defined field segments. Phys Med Biol 2002;47:2235–2245.

[17] Ferris MC, Meyer RR, D'Souza W. Radiation treatment planning: Mixed integer programming formulations and approaches. In G Appa, L Pitsoulis, HP Williams, eds., Handbook on Modelling for Discrete Optimization. Springer-Verlag, New York, NY, 2006;317–340.

[18] Langer M, Brown R, Urie M, Leong J, Stracher M, Shapiro J. Large-scale optimization of beam weights under dose-volume restrictions. Int J Radiat Oncol Biol Phys 1990;18:887–893.

[19] Langer M, Morrill S, Brown R, Lee O, Lane R. A comparison of mixed integer programming and fast simulated annealing for optimizing beam weights in radiation therapy. Medical Physics 1996;23:957–964.

[20] Lee EK, Fox T, Crocker I. Simultaneous beam geometry and intensity map optimization in intensity-modulated radiation therapy treatment planning. Annals of Operations Research 2003;119:165–181.

[21] Lee EK, Fox T, Crocker I. Integer programming applied to intensity-modulated radiation therapy treatment planning. Int J Radiat Oncol Biol Phys 2006;64:301–320.

[22] Shepard DM, Ferris MC, Olivera GH, Mackie TR. Optimizing the delivery of radiation therapy to cancer patients. SIAM Review 1999;41:721–744.

[23] Romeijn HE, Ahuja RK, Dempsey JF, Kumar A, Li JG. A novel linear programming approach to fluence map optimization for intensity modulated radiation therapy treatment planning. Phys Med Biol 2003;48:3521–3542.

[24] Romeijn HE, Ahuja RK, Dempsey JF, Kumar A, Li JG. A column generation approach to radiation therapy treatment planning using aperture modulation. SIAM Journal on Optimization 2005;15:838–862.

[25] Romeijn HE, Dempsey JF, Li JG. A unifying framework for multi-criteria fluence map optimization models. Phys Med Biol 2004;49:1991–2013.

[26] Romeijn HE, Ahuja RK, Dempsey JF, Kumar A. A new linear programming approach to radiation therapy treatment planning problems. Operations Research 2006;54:201–216.

[27] Das SK, Marks LB. Selection of coplanar or noncoplanar beams using three-dimensional optimization based on maximum beam separation and minimized nontarget irradiation. Int J Radiat Oncol Biol Phys 1997;38:643–655.
[28] Haas OC, Burnham KJ, Mills J. Optimization of beam orientation in radiotherapy using planar geometry. Phys Med Biol 1998;43:2179–2193.

[29] Schreibmann E, Lahanas M, Xing L, Baltas D. Multiobjective evolutionary optimization of the number of beams, their orientations and weights for intensity-modulated radiation therapy. Phys Med Biol 2004;49:747–770.

[30] Chao KSC, Blanco AI, Dempsey JF. A conceptual model integrating spatial information to assess target volume coverage for IMRT treatment planning. Int J Radiat Oncol Biol Phys 2003;56:1438–1449.

[31] Nocedal J, Wright SJ. Numerical Optimization. Springer-Verlag, 1999.

[32] Ezzell GA. Genetic and geometric optimization of three-dimensional radiation therapy treatment planning. Medical Physics 1996;23:293–305.

[33] Li Y, Yao J, Yao D. Automatic beam angle selection in IMRT planning using genetic algorithm. Phys Med Biol 2004;49:1915–1932.

[34] Li Y, Yao J, Yao D, Chen W. A particle swarm optimization algorithm for beam angle selection in intensity-modulated radiotherapy planning. Phys Med Biol 2005;50:3491–3514.

[35] Bortfeld T, Schlegel W. Optimization of beam orientations in radiation therapy: some theoretical considerations. Phys Med Biol 1993;38:291–304.

[36] Djajaputra D, Wu Q, Wu Y, Mohan R. Algorithm and performance of a clinical IMRT beam-angle optimization system. Phys Med Biol 2003;48:3191–3212.

[37] Lu HM, Kooy HM, Leber ZH, Ledoux RJ. Optimized beam planning for linear accelerator-based stereotactic radiosurgery. Int J Radiat Oncol Biol Phys 1997;39:1183–1189.

[38] Pugachev A, Xing L. Incorporating prior knowledge into beam orientation optimization in IMRT. Int J Radiat Oncol Biol Phys 2002;54:1565–1574.

[39] Rowbottom CG, Oldham M, Webb S. Constrained customization of non-coplanar beam orientations in radiotherapy of brain tumours. Phys Med Biol 1999a;44:383–399.

[40] Stein J, Mohan R, Wang XH, Bortfeld T, Wu Q, Preiser K, Ling CC, Schlegel W. Number and orientations of beams in intensity-modulated radiation treatments. Medical Physics 1997;24:149–160.

[41] Soderstrom S, Brahme A. Selection of suitable beam orientations in radiation therapy using entropy and Fourier transform measures. Phys Med Biol 1992;37:911–924.
[42] Soderstrom S, Brahme A. Which is the most suitable number of photon beam portals in coplanar radiation therapy? Int J Radiat Oncol Biol Phys 1995;33:151–59.

[43] Rowbottom CG, Webb S, Oldham M. Beam-orientation customization using an artificial neural network. Phys Med Biol 1999b;44:2251–2262.

[44] Gokhale P, Hussein EM, Kulkarni N. The use of beams eye view volumetrics in the selection of non-coplanar radiation portals. Medical Physics 1994;23:153–163.

[45] Meedt G, Alber M, Nusslin F. Non-coplanar beam direction optimization for intensity-modulated radiotherapy. Phys Med Biol 2003;48:2999–3019.

[46] Chen GT, Spelbring DR, Pelizzari CA, Balter JM, Myrianthopoulos LC, Vijayakumar S, Halpern H. The use of beams eye view volumetrics in the selection of non-coplanar radiation portals. Int J Radiat Oncol Biol Phys 1992;23:153–163.

[47] Cho BCJ, Roa HW, Robinson D, Murray B. The development of target-eye-view maps for selection of coplanar or noncoplanar beams in conformal radiotherapy treatment planning. Medical Physics 1999;26:2367–2372.

[48] Goitein M, Abrams M, Rowell D, Pollari H, Wiles J. Multi-dimensional treatment planning: II. Beam's eye-view, back projection, and projection through CT sections. Int J Radiat Oncol Biol Phys 1983;9:789–97.

[49] Pugachev A, Xing L. Computer-assisted selection of coplanar beam orientations in intensity-modulated radiation therapy. Phys Med Biol 2001;46:2467–2476.

[50] Pugachev A, Xing L. Pseudo beam's-eye-view as applied to beam orientation selection in intensity-modulated radiation therapy. Int J Radiat Oncol Biol Phys 2001;51:1361–1370.

[51] Holder A, Salter B. A tutorial on radiation oncology and optimization. In H Greenberg, ed., Tutorials on Emerging Methodologies and Applications in Operations Research. Kluwer Academic Press, Boston, MA, 2004.

[52] Morrill SM, Lane RG, Jacobson G, Rosen II. Treatment planning optimization using constrained simulated annealing. Phys Med Biol 1991;36:1341–61.

[53] Oldham M, Khoo V, Rowbottom CG, Bedford J, Webb S. A case study comparing the relative benefit of optimising beam-weights, wedge-angles, beam orientations and tomotherapy in stereotactic radiotherapy of the brain. Phys Med Biol 1998;43:2123–46.

[54] Rowbottom CG, Webb S, Oldham M. Improvements in prostate radiotherapy from the customization of beam directions. Medical Physics 1998;25:1171–1179.
[55] Wang X, Zhang X, Dong L, Liu H, Wu Q, Mohan R. Development of methods for beam angle optimization for IMRT using an accelerated exhaustive search strategy. Int J Radiat Oncol Biol Phys 2004;60:1325–1337.
[56] Wang X, Zhang X, Dong L, Liu H, Gillin M, Ahamad A, Ang K, Mohan R. Effectiveness of noncoplanar IMRT planning using a parallelized multiresolution beam angle optimization method for paranasal sinus carcinoma. Int J Radiat Oncol Biol Phys 2005;63:594–601.
[57] Woudstra E, Heijmen BJM. Automated beam angle and weight selection in radiotherapy treatment planning applied to pancreas tumors. Int J Radiat Oncol Biol Phys 2003;56:878–888.
[58] D’Souza WD, Meyer RR, Shi L. Selection of beam orientations in intensity-modulated radiation therapy using single-beam indices and integer programming. Phys Med Biol 2004;49:3465–3481.
[59] Ehrgott M, Johnston R. Optimisation of beam directions in intensity modulated radiation therapy planning. OR Spectrum 2003;25:251–264.
[60] Lim J, Ferris M, Shepard D, Wright S, Earl M. An optimization framework for conformal radiation treatment planning. INFORMS Journal on Computing 2006.
[61] Wang C, Dai J, Hu Y. Optimization of beam orientations and beam weights for conformal radiotherapy using mixed integer programming. Phys Med Biol 2003;48:4065–4076.
[62] Fox C, Romeijn HE, Dempsey JF. Fast voxel and polygon ray-tracing algorithms for IMRT treatment planning. Submitted to Med Phys, 2005.
[63] Siddon RL. Prism representation: a 3D ray-tracing algorithm for radiotherapy applications. Phys Med Biol 1985;30:817–824.
[64] Siddon RL. Fast calculation of the exact radiological path for a three-dimensional CT array. Med Phys 1985;12:252–255.
[65] Jacobs F, Sundermann E, Sutter BD, Christiaens M, Lemahieu I. A fast algorithm to calculate the exact radiological path through a pixel or voxel space. Journal of Computing and Information Technology (CIT) 1998;6:89–94.
[66] Aleman DM, Romeijn HE, Dempsey JF. Beam orientation optimization methods in intensity modulated radiation therapy treatment planning. IIE Conference Proceedings 2006.
[67] Aleman DM, Romeijn HE, Dempsey JF. A response surface approach to beam orientation optimization in intensity modulated radiation therapy treatment planning. In review, 2006.
[68] Jones DR. A taxonomy of global optimization methods based on response surfaces. Journal of Global Optimization 2001;21:345–383.
[69] Jones DR, Schonlau M, Welch WJ. Efficient global optimization of expensive black-box functions. Journal of Global Optimization 1998;13:455–492.
[70] Csallner AE, Csendes T, Markot MC. Multisection in interval branch-and-bound methods for global optimization I. Theoretical results. Journal of Global Optimization 2000;16:371–392.
[71] Lagouanelle J, Soubry G. Optimal multisections in interval branch-and-bound methods of global optimization. Journal of Global Optimization 2004;30:23–38.
[72] Epperly TGW, Pistikopoulos EN. A reduced space branch and bound algorithm for global optimization. Journal of Global Optimization 1997;11:287–311.
[73] Barrientos O, Correa R. An algorithm for global minimization of linearly constrained quadratic functions. Journal of Global Optimization 2000;16:77–93.
[74] Thoai NV. Convergence of duality bound method in partly convex programming. Journal of Global Optimization 2002;22:263–270.
[75] Tuy H. On solving nonconvex optimization problems by reducing the duality gap. Journal of Global Optimization 2005;32:349–365.
[76] Phong TQ, An LTH, Tao PD. Decomposition branch and bound method for globally solving linearly constrained indefinite quadratic minimization problems. Operations Research Letters 1995;17:215–220.
[77] Bomze I. Branch-and-bound approaches to standard quadratic optimization problems. Journal of Global Optimization 2002;22:17–37.
[78] Cambini R, Sodini C. Decomposition methods for solving nonconvex quadratic programs via branch and bound. Journal of Global Optimization 2005;33:313–336.
[79] Aleman DM, Kumar A, Ahuja RK, Romeijn HE, Dempsey JF. Neighborhood search approaches to beam orientation optimization in intensity modulated radiation therapy treatment planning. In review, 2007.
[80] Kumar A. Novel methods for intensity-modulated radiation therapy treatment planning. Ph.D. thesis, University of Florida, 2005.
[81] Geman S, Geman D. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 1984;6:721–741.
[82] Gelfand AE, Smith AFM. Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association 1990;85:398–409.
[83] Smith RL. A Monte Carlo procedure for the random generation of feasible solutions to mathematical programming problems. Bulletin of the TIMS/ORSA Joint National Meeting 1980:101.
[84] Belisle CJP, Romeijn HE, Smith RL. Hit-and-run algorithms for generating multivariate distributions. Mathematics of Operations Research 1993;18:255–266.
[85] Kirkpatrick S, Gelatt CD, Vecchi MP. Optimization by simulated annealing. Science 1983;220:671–680.
[86] Szu H, Hartley R. Fast simulated annealing. Physics Letters A 1987;122:157–162.
[87] Belisle CJP. Convergence theorems for a class of simulated annealing algorithms on R^d. Journal of Applied Probability 1992;29:885–895.
[88] Aleman DM, Glaser D, Romeijn HE, Dempsey JF. A primal-dual interior point algorithm for fluence map optimization in intensity modulated radiation therapy treatment planning. Work in progress, 2007.
[89] Jelen U, Alber M. A finite size pencil beam algorithm for IMRT dose optimization: density corrections. Phys Med Biol 2007;52:617–633.
[90] Jelen U, Sohn M, Alber M. A finite size pencil beam for IMRT dose optimization. Phys Med Biol 2005;50:1747–1766.
[91] Sempau J, Wilderman SJ, Bielajew AF. DPM, a fast, accurate Monte Carlo code optimized for photon and electron radiotherapy treatment planning dose calculations. Phys Med Biol 2000;45:2263–2291.
[92] Baro J, Sempau J, Fernandez-Varea JM, Salvat F. PENELOPE: an algorithm for Monte Carlo simulation of the penetration and energy loss of electrons and positrons in matter. Nuclear Instruments and Methods 1995;B100:31–46.
[93] Sempau J, Baro J, Fernandez-Varea JM, Salvat F. An algorithm for Monte Carlo simulation of coupled electron-photon showers. Nuclear Instruments and Methods 1997;B132:377–390.
[94] Khan FM. The Physics of Radiation Therapy. Lippincott Williams & Wilkins, 1994.
BIOGRAPHICAL SKETCH
Dionne M. Aleman completed her bachelor's degree in industrial and systems engineering at the University of Florida. She went on to study intensity modulated radiation therapy (IMRT) treatment planning optimization in the graduate program of the Department of Industrial and Systems Engineering at the University of Florida. She will receive her Doctor of Philosophy in Industrial and Systems Engineering in December 2007, after which she will join the Department of Mechanical and Industrial Engineering at the University of Toronto. Dionne plans to continue her research in cancer treatment planning, as well as in other applications of operations research techniques to the medical and healthcare industries.