Numerical Solution of Continuous-state Dynamic Programs Using Linear and Spline Interpolation
Author(s): Sharon A. Johnson, Jery R. Stedinger, Christine A. Shoemaker, Ying Li, Jose Alberto Tejada-Guibert
Source: Operations Research, Vol. 41, No. 3 (May - Jun., 1993), pp. 484-500
Published by: INFORMS
Stable URL: http://www.jstor.org/stable/171851
SHARON A. JOHNSON
Worcester Polytechnic Institute, Worcester, Massachusetts
JERY R. STEDINGER and CHRISTINE A. SHOEMAKER
Cornell University, Ithaca, New York
YING LI
Beijing Economic Research Institute of Water Resources and Electric Power, Beijing, People's Republic of China
JOSE ALBERTO TEJADA-GUIBERT
United Nations, New York, New York
(Received July 1990; revision received March 1991; accepted July 1992)
This paper demonstrates that the computational effort required to develop numerical solutions to continuous-state dynamic programs can be reduced significantly when cubic piecewise polynomial functions, rather than tensor product linear interpolants, are used to approximate the value function. Tensor product cubic splines, represented in either piecewise polynomial or B-spline form, and multivariate Hermite polynomials are considered. Computational savings are possible because of the improved accuracy of higher-order functions and because the smoothness of higher-order functions allows efficient quasi-Newton methods to be used to compute optimal decisions. The use of the more efficient piecewise polynomial form of the spline was slightly superior to the use of Hermite polynomials for the test problem and easier to program. In comparison to linear interpolation, use of splines in piecewise polynomial form reduced the CPU time to obtain results of equivalent accuracy by a factor of 250-330 for a stochastic 4-dimensional water supply reservoir problem with a smooth objective function, and by factors ranging from 25-400 for a sequence of 2-, 3-, 4-, and 5-dimensional problems. As a result, a problem that required two hours to solve with linear interpolation was solved in less than a minute with spline interpolation with no loss of accuracy.
Dynamic programming (DP) is a versatile and powerful optimization procedure because it exploits the sequential character of the operation of dynamic systems and allows nonlinearities, feedback, and stochastic inputs to be represented. However, the computational challenge posed by the numerical solution of dynamic programs with continuous-state spaces can be prohibitive for realistic problems because of the dramatic growth in memory and computational requirements with the problem size and the desired accuracy. Management problems that can be modeled as dynamic programs with essentially continuous-state spaces include the control of manufacturing processes, the operation of surface and ground water reservoirs, the management of fisheries, crops, insect pests, and forests, and the management of portfolios in markets with uncertain interest rates (see Shoemaker 1981, Stedinger, Sule and Loucks 1984, White 1985, 1988, Gal 1989, Shoemaker and Johnson 1989).
In a continuous-state DP model, the state space is usually discretized so that the DP functional equation that characterizes the solution need only be solved for a finite number of values of the state vector. The number of discrete values chosen affects the accuracy of the approximation to the continuous problem. A backward recursion algorithm can generally be used to solve the discretized functional equation. At each stage, the algorithm requires evaluation of the value function from the previous stage, which has only been calculated at the grid points defined by the discretization of the state space. When system dynamics require the values of the value function at points other than state-space grid points, interpolation or another functional approximation scheme can be used to generate them (Bellman and Dreyfus 1962).
Subject classifications: Dynamic programming, applications: numerical solution of continuous-state dynamic programs. Dynamic programming, Markov, infinite state: higher-order approximation of the value function.
Area of review: COMPUTING.
Operations Research, Vol. 41, No. 3, May-June 1993, p. 484. 0030-364X/93/4103-0484 $01.25. © 1993 Operations Research Society of America
This paper examines the reductions in the computational effort required to develop numerical solutions to continuous-state DP models that are possible when higher-order piecewise polynomial functions, rather than tensor product linear interpolants, are used to approximate the value function. The paper concentrates on the use of tensor product cubic splines (defined in Section 4) that are represented in piecewise polynomial form for evaluation (pp-form splines) because they were efficient in initial computational studies and easy to construct. Use of pp-form splines has not been proposed previously in the literature.
Two piecewise polynomial methods suggested by other authors are also considered: 1) cubic Hermite polynomials, which form the basis of the gradient dynamic programming algorithm proposed by Kitanidis and Foufoula-Georgiou (1987), and 2) tensor product cubic splines in B-spline form (B-form splines), proposed by Daniel (1976) and Birnbaum and Lapidus (1978).
The computational performance is illustrated in the numerical solution of a series of stochastic water supply reservoir management problems. The paper addresses whether use of a coarser discretization of the state space can compensate for the increased effort to evaluate a higher-order approximant numerically instead of a simpler tensor product linear interpolant, while still maintaining the same accuracy in the solution. The smoothness of a higher-order function is also important because it allows efficient quasi-Newton optimization algorithms to be used to determine optimal decisions. Previous papers that describe the use of higher-order approximants (Daniel 1976, Birnbaum and Lapidus 1978, L'Ecuyer 1985, Kitanidis and Foufoula-Georgiou 1987, Foufoula-Georgiou and Kitanidis 1988) did not provide numerical evidence that the use of higher-order approximations can yield a significant computational advantage over the more commonly used linear interpolation. The numerical studies reported here also address when such approximations are likely to be appropriate by considering stochastic problems of up to five dimensions and two different value functions. They demonstrate that use of higher-order approximations can yield significant computational savings on a collection of examples.
Section 1 describes the formulation of DP models and discusses algorithms for deriving numerical approximations of their solutions. Section 2 reviews other methods used to reduce the computational effort required to solve DP models numerically. The literature associated with analytical approximation of the value function is discussed in Section 3. The characteristics of algorithms that employ different approximations are discussed in Section 4, and the computational effort associated with tensor product spline interpolation is compared with tensor product linear interpolation in Section 5. Section 6 presents extensive computational results for a four-dimensional water supply example. In Section 7, additional computational results are presented for a series of water supply examples of varying dimension and two different benefit functions. The results are summarized in Section 8.
1. THE DP MODEL
Let $X_t$ be the continuous-state vector for a dynamic system that describes the system's status at the beginning of period or stage $t$. In the example problems described in Sections 6 and 7, the state vector will describe the volume of water in each of several reservoirs and could also include hydrologic information (Loucks, Stedinger and Haith 1981, Stedinger, Sule and Loucks 1984, Kelman et al. 1990). The dynamics of a system are written
$$X_{t+1} = g(X_t, R_t, Q_t), \tag{1}$$
where $R_t$ is the vector of decisions made in period $t$ and $Q_t$ represents random influences upon the system. In the case of reservoir operation, $R_t$ corresponds to the releases from the reservoirs and $Q_t$ to the inflows to the reservoirs during the period.
Denote the benefits from system operation in period $t$ by $B_t[X_t, R_t, Q_t]$, where the state is initially $X_t$, the decision is $R_t$, and $Q_t$ is the value of random influences upon system operation. The expected benefits to be maximized from system operation from period 1 until period $T$ are
$$J = E\left\{ \sum_{t=1}^{T} B_t[X_t, R_t, Q_t] + \Omega(X_{T+1}) \right\}, \tag{2}$$
where $\Omega(X_{T+1})$ is a terminal value function.
Such sequential decision problems can be solved through the recursive calculation of an optimal value function $F_t[X_t]$ that equals the expected future benefits obtainable from the optimal operation of the system from period $t$ through the end of period $T$, given that the system begins period $t$ in state $X_t$ (Bellman 1957). For $t = 1, \ldots, T$, $F_t[X_t]$ is defined by
$$F_t[X_t] = \max_{R_t} E_{Q_t}\left\{ B_t[X_t, R_t, Q_t] + F_{t+1}[X_{t+1}] \right\}, \tag{3}$$
where $X_{t+1} = g(X_t, R_t, Q_t)$ and $F_{T+1}(X_{T+1}) = \Omega(X_{T+1})$.
The probability distribution over which the expectation $E\{\cdot\}$ is calculated may depend upon $X_t$ and $R_t$.
For finite-horizon problems, the value function $F_t[X_t]$ and policy $R_t$ can be found by recursively solving (3) backward for $t = T, \ldots, 1$. When the general functional form of $F_t$ is not known, the continuous-state space for $X_t$ is replaced by a grid and the value of $F_t$ is computed from (3) at the grid points. Because the value of $F_{t+1}$ is only calculated at the grid points of the state space, the value of $F_{t+1}$ in (3) at other points is determined by interpolating among nearby grid points unless the transition function $g$ is defined so that a transition can only go to a grid point. For problems for which such a restriction is even possible, it is seldom attractive because of the resultant distortion of feasible system dynamics and/or the necessity of using a very fine state-space grid to allow reasonable resolution of any continuous control variables. Because of its simplicity, multidimensional linear interpolation is most often used to approximate $F_{t+1}$, but other approximation schemes have been suggested (see Section 3).
In this paper, the locations of the grid points are chosen without consideration of the structure of the optimal policy or value function, which is not known in many practical problems. If the general form of the value function or other characteristics of the model solution are known, this knowledge may be used to select the location of grid points or to suggest more efficient solution methods tailored to the problem.
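The backward recursion over a state-space grid described above can be sketched for a single state variable as follows. This is only an illustrative sketch under stated assumptions, not the program used in the paper: the function names `interp` and `backward_recursion`, the discrete inflow distribution, and the toy benefit function are all hypothetical, and the maximization in (3) is done by enumerating a short list of candidate decisions rather than by a nonlinear optimizer.

```python
def interp(grid, vals, x):
    """Piecewise linear interpolation of vals on an increasing grid (clamped)."""
    x = min(max(x, grid[0]), grid[-1])
    for j in range(len(grid) - 1):
        if x <= grid[j + 1]:
            a = (x - grid[j]) / (grid[j + 1] - grid[j])
            return (1 - a) * vals[j] + a * vals[j + 1]
    return vals[-1]


def backward_recursion(grid, T, benefit, transition, inflows, decisions, terminal):
    """Solve F_t[x] = max_r E_q{ B(x, r, q) + F_{t+1}[g(x, r, q)] } on a grid.

    inflows   : list of (q, probability) pairs (assumed discrete distribution)
    decisions : candidate releases; enumeration stands in for an optimizer
    Returns (F, policy) with F[t][i] and policy[t][i] referring to grid[i].
    """
    F_next = [terminal(x) for x in grid]
    F, policy = {}, {}
    for t in range(T, 0, -1):
        F_t, R_t = [], []
        for x in grid:
            best_val, best_r = float("-inf"), None
            for r in decisions:
                nxt = [transition(x, r, q) for q, _ in inflows]
                # Skip decisions that could push the state outside the grid.
                if min(nxt) < grid[0] or max(nxt) > grid[-1]:
                    continue
                # Expected benefit plus interpolated value at the next state.
                val = sum(p * (benefit(x, r, q) + interp(grid, F_next, xn))
                          for (q, p), xn in zip(inflows, nxt))
                if val > best_val:
                    best_val, best_r = val, r
            F_t.append(best_val)
            R_t.append(best_r)
        F[t], policy[t] = F_t, R_t
        F_next = F_t
    return F, policy


# Toy single-reservoir example: storage 0..10, target release of 2 units.
grid = [float(s) for s in range(11)]
F, policy = backward_recursion(
    grid, T=3,
    benefit=lambda x, r, q: -(r - 2.0) ** 2,
    transition=lambda x, r, q: x - r + q,
    inflows=[(1.0, 0.5), (2.0, 0.5)],
    decisions=[0.0, 1.0, 2.0, 3.0, 4.0],
    terminal=lambda x: 0.0,
)
print(policy[1])
```

The value function from stage t+1 is stored only at grid points, and every off-grid next state is handled by the interpolation call inside the expectation, which is the step the higher-order approximants discussed later would replace.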
Calculation of the optimal decision $R_t$ and the value of $F_t[X_t]$ on a fine grid poses a formidable computational task for multidimensional problems. If $k$ is the dimension of $X_t$, and each of $X_t$'s components is approximated by $N$ points, then such a grid would contain $N^k$ grid points. The optimization problem in (3) must be solved at each of the grid points; thus, the required computational effort $E$ at each stage can be as much as
$$E = W(p, k)\,N^k, \tag{4}$$
where $W(p, k)$ is the work or effort required to solve each optimization problem. The effort $W(p, k)$ will generally increase with $p$, the dimension of the decision vector $R_t$, and also with $k$, the dimension of the state vector $X_t$.
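To make the growth in (4) concrete, the short sketch below tabulates the number of grid points for a few values of N and k. The helper names `grid_points` and `stage_effort` are hypothetical and introduced only for illustration; the work per optimization W(p, k) is treated as a given constant.

```python
def grid_points(n_levels: int, k_dims: int) -> int:
    """Number of grid points N**k when each of k state variables has N levels."""
    return n_levels ** k_dims


def stage_effort(work_per_point: float, n_levels: int, k_dims: int) -> float:
    """Upper bound E = W(p, k) * N**k on the effort per stage, as in (4)."""
    return work_per_point * grid_points(n_levels, k_dims)


# Compare a finer grid (N = 10) with a coarser one (N = 5) as k grows.
for k in range(1, 6):
    fine, coarse = grid_points(10, k), grid_points(5, k)
    print(k, fine, coarse, fine // coarse)
```

Halving N reduces the number of optimizations per stage by a factor of 2^k, which is why an approximant that tolerates a coarser grid can pay for a more expensive evaluation.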
In the examples solved in Sections 6 and 7, the numerical calculation of a reasonable approximation of $F_t[X_t]$ for a 3-stage, 2-reservoir problem when linear interpolation was employed took 10 seconds on an IBM 3090-600E Supercomputer, and was estimated to require 70 hours for the corresponding 5-dimensional problem. The effort required to solve practical problems having nonlinear objectives with a dimension $k$ of more than 3 to 5 with reasonable accuracy (a large value of $N$) is generally prohibitive with the linear interpolation method commonly employed. Using tensor product cubic splines allowed solution of the 5-dimensional problem in just 10 minutes of 3090-CPU time.
The backward recursion algorithm can also be used to solve (3) approximately for the long-run, steady-state optimal value function of continuous-state, infinite-horizon problems when the state space has been discretized (White 1963, Su and Deininger 1972, Federgruen and Schweitzer 1979). This value iteration approach is advantageous for numerically solving the infinite-horizon, multistage periodic DP problem posed by monthly reservoir operating problems (Roefs and Guitron 1975). Policy iteration methodologies can also be employed to solve infinite-horizon problems numerically (Howard 1960). When the decision space is discrete, policy iteration can be cast as a linear programming problem with decision variables defined as the value function at the grid points (Heyman and Sobel 1984), or the coefficients of polynomials and spline functions that approximate the value function (Schweitzer and Seidmann 1985, L'Ecuyer 1989). Hence, the higher-order approximations developed here for finite-horizon problems can also be used in the value iteration and policy iteration approaches for continuous-state, infinite-horizon DP problems.
2. REVIEW OF EFFORTS TO REDUCE THE COMPUTATIONAL BURDEN FOR DP
The computational effort required to solve a discretized version of a continuous DP problem has long been an issue. Several different strategies have been employed to combat the computational burden; better methods of approximating the value function are just one strategy.
2.1. Modeling an Approximate Problem
Approximate solutions to complex discrete- and continuous-state DP problems are often obtained by reformulating the problem using a simpler model (Whitt 1978, Morin 1979). For example, the dimension of the state space may be reduced by aggregating state-space grid points. Bean, Birge and Smith (1987) develop a state aggregation/disaggregation algorithm for deterministic, discrete dynamic programs that produces a feasible solution to the original problem. In the reservoir literature, several reservoirs have been combined into a composite reservoir to reduce the number of state variables (Arvanitidis and Rosing 1970a, b, Turgeon 1980, 1981, Terry et al. 1986).
Shoemaker reduced the state-space dimension for pest problems by using analytical solutions to describe system dynamics (Shoemaker 1982) and by using previous decisions as state variables (Shoemaker 1979).
Other methods for approximating complicated problems include substituting deterministic equivalents for stochastic components (Lovejoy 1986, Braga et al. 1991), and partitioning the original problem into smaller separable problems (Trott and Yeh 1973, Turgeon 1980, Lovejoy 1986, Braga et al. 1991). Saad and Turgeon (1988) considered only the states spanned by three principal components of the covariance matrix for an entire reservoir system's operation when its operation was optimized deterministically. This approach recognizes that many combinations of state variables are either impossible or unlikely.
Solving an approximate problem introduces error. Lovejoy (1986) gives conditions that ensure that his solution of approximate discrete-state problems will provide a bound on the optimal policies of the original problem. Haurie and L'Ecuyer (1986), Bean, Birge and Smith (1987), White (1979), and Whitt (1978, 1979) give bounds on the error in the value functions obtained by solving approximate problems.
2.2. Better Initial Approximations
In discrete or discretized infinite-horizon problems, the backward solution of (3) until a stationary solution is identified can be accelerated by use of a good initial estimate of $F(X)$ (Mawer and Thorn 1974). A simpler problem can be used to generate initial estimates of the infinite-horizon optimal value function (see Bras, Buchanan and Curry 1983; Wang and Adams 1986), or a coarse grid can be employed initially.
3. REVIEW OF EFFORTS TO APPROXIMATE THE OPTIMAL VALUE FUNCTION
The idea of approximating the value function $F_t[X_t]$ by an analytical function dates back to early publications on dynamic programming; Bellman (1957), Bellman and Dreyfus (1962), Bellman, Kalaba and Kotkin (1963) and White (1969) employed polynomials to approximate $F_t[X_t]$ over the entire state space. Takeuchi and Moreau (1974) used least-squares to fit lower-order polynomials over the entire space to solve a water resources problem numerically. Gal (1979, 1989) developed an iterative algorithm where the optimal value function is approximated only on the region of the state space encountered in previous simulations. Read (1990) developed an efficient computational scheme by approximating the marginal value function with piecewise linear functions, and identifying significant points in the dual space. Pereira and Pinto (1991) propose a similar algorithm based on Benders decomposition.
Polynomials are often used for approximation because they are easy to evaluate and differentiate, and any continuous function on a bounded interval can be approximated arbitrarily well by polynomials of sufficient order (Rudin 1976, Schumaker 1981). However, high-degree polynomials can oscillate severely (Prenter 1981). In addition, in DP models with continuous decision variables $R_t$, the optimization step in (3) may seek out local optima induced by the oscillations (Ginn 1986). Low-order polynomials may be unable to approximate the value function adequately over the entire state space. As a consequence, piecewise polynomial approximations that employ different polynomials in different subregions of the domain have been suggested. With these piecewise polynomial approximation schemes, the smoothness of the approximating function can be ensured by requiring that the values and derivatives of the polynomials match at the grid points and boundaries of the hypercubes they define.
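The oscillation of a single high-degree polynomial mentioned above can be seen on a standard textbook test function. The sketch below is an illustration only and is not taken from the paper: the test function 1/(1 + 25x^2), the 11 equally spaced nodes, and all function names are assumptions chosen for the demonstration. It compares one degree-10 interpolating polynomial with a piecewise linear interpolant on the same nodes.

```python
def runge(x):
    """Smooth test function often used to show polynomial oscillation."""
    return 1.0 / (1.0 + 25.0 * x * x)


def lagrange(nodes, vals, x):
    """Single interpolating polynomial through all nodes (Lagrange form)."""
    total = 0.0
    for i, xi in enumerate(nodes):
        term = vals[i]
        for j, xj in enumerate(nodes):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def piecewise_linear(nodes, vals, x):
    """Different linear polynomial on each subinterval of the node grid."""
    for j in range(len(nodes) - 1):
        if x <= nodes[j + 1]:
            a = (x - nodes[j]) / (nodes[j + 1] - nodes[j])
            return (1 - a) * vals[j] + a * vals[j + 1]
    return vals[-1]


def max_error(approx, nodes, vals, samples=401):
    """Largest absolute error of an approximant over [-1, 1]."""
    worst = 0.0
    for s in range(samples):
        x = -1.0 + 2.0 * s / (samples - 1)
        worst = max(worst, abs(approx(nodes, vals, x) - runge(x)))
    return worst


nodes = [-1.0 + 0.2 * i for i in range(11)]
vals = [runge(x) for x in nodes]
print("degree-10 polynomial:", max_error(lagrange, nodes, vals))
print("piecewise linear    :", max_error(piecewise_linear, nodes, vals))
```

On this example the global polynomial has a much larger maximum error than the piecewise linear interpolant, with the error concentrated in oscillations near the ends of the interval, which is the behavior that motivates piecewise schemes.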
Kitanidis and Foufoula-Georgiou, and Foufoula-Georgiou and Kitanidis proposed an algorithm they called gradient dynamic programming (GDP) that employs cubic Hermite polynomials to approximate the optimal value function for continuous-state DPs. Nonlinear programming is used to determine optimal decisions. Kitanidis and Foufoula-Georgiou show that asymptotically and for sufficiently smooth functions, the one-stage error in the control policy is of the order $(\Delta X)^3$ for a one-dimensional GDP algorithm, compared to $(\Delta X)$ with linear interpolation, where $\Delta X$ is the length of each interval in the state-space grid. The error in the value function at each stage is of order $(\Delta X)^4$ for the GDP algorithm and $(\Delta X)^2$ with linear interpolation. Thus, for small $\Delta X$, GDP can yield more accurate solutions than a backward recursion algorithm that uses linear interpolation. Kitanidis and Foufoula-Georgiou compared the accuracy of the solutions produced using GDP and an algorithm that employed linear interpolation on deterministic and stochastic versions of a one-dimensional reservoir problem, but did not report computation times. In both cases, GDP yielded solutions of equivalent accuracy with less than half the number of discrete levels for the state variable. Foufoula-Georgiou and Kitanidis presented numerical results for the 4-dimensional problem described in Section 6 using GDP, but did not provide comparable numerical results for tensor product linear interpolation. Hence, the results presented here are the first numerical
evidence that GDP with Hermite polynomials is computationally preferable to linear interpolation for multidimensional problems.
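As a rough reading of the value-function error orders quoted above, the following asymptotic sketch relates the number of grid levels per dimension that each scheme would need to reach the same tolerance. It ignores the constants in front of each error term, assumes an evenly spaced grid so that the interval length is inversely proportional to the number of levels, and introduces the symbols $\varepsilon$, $C_{\mathrm{lin}}$, $C_{\mathrm{cub}}$, $N_{\mathrm{lin}}$ and $N_{\mathrm{cub}}$ purely for illustration; none of them appear in the paper.

```latex
\begin{align*}
C_{\mathrm{lin}}\,(\Delta X)^2 \le \varepsilon
  &\;\Longrightarrow\; N_{\mathrm{lin}} \propto \varepsilon^{-1/2}, \\
C_{\mathrm{cub}}\,(\Delta X)^4 \le \varepsilon
  &\;\Longrightarrow\; N_{\mathrm{cub}} \propto \varepsilon^{-1/4}, \\
\text{halving } \Delta X:\quad
  (\Delta X/2)^2 &= \tfrac{1}{4}(\Delta X)^2, \qquad
  (\Delta X/2)^4 = \tfrac{1}{16}(\Delta X)^4 .
\end{align*}
```

Because a $k$-dimensional grid holds $N^k$ points, any reduction in $N$ is compounded across dimensions. How large the reduction is in practice depends on the constants and on the smoothness of the value function, which this sketch does not capture.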
Daniel (1976) and Birnbaum and Lapidus (1978) suggested that tensor product cubic splines in B-spline form (B-form splines) be used to approximate the optimal value function. Daniel noted that theoretically these splines have the same approximating power as orthogonal polynomials with the same number of coefficients, but require less effort to compute. Birnbaum and Lapidus compared splines and polynomial approximations on a two-dimensional, continuous-time, continuous-state, deterministic control problem. After discretizing the control variables, they used enumeration to determine the optimal decision. Different computers were used and slightly different versions of the problem were solved in the comparison. They concluded that in terms of accuracy and computing time B-form splines were superior to orthogonal polynomials with the same number of coefficients. Birnbaum and Lapidus also noted that the gradient of the objective in (3) can be calculated when spline approximations are employed, thus allowing Newton-type optimization schemes to be used to determine the optimal control. However, they employed enumeration in their computational studies to allow comparison with previous literature.
This paper considers the computational advantage associated with using the more efficient piecewise polynomial form of the spline (pp-form spline) for evaluation, rather than the B-form spline suggested by Daniel and Birnbaum and Lapidus. The value of employing a quasi-Newton algorithm to solve the nonlinear programming problem in (3), which is possible because the gradient of a spline is easily calculated, is also examined.
4. COMPARISON OF ALTERNATIVE APPROXIMATION SCHEMES
This paper examines the benefits of a backward recursion algorithm for numerically solving finite-horizon, continuous-state DP problems, the spline DP algorithm, which employs tensor product cubic pp-form splines to approximate the value function $F_t(X_t)$. Use of the B-form spline proposed by Daniel and Birnbaum and Lapidus is also considered. The spline DP algorithm is compared with a backward recursion algorithm that employs multidimensional linear interpolation to approximate $F_t(X_t)$, the multilinear DP algorithm, as well as the GDP algorithm developed by Foufoula-Georgiou and Kitanidis.
4.1. The Multilinear DP Algorithm
In the multilinear DP algorithm, piecewise linear interpolation is used to approximate the value function $F_t[X_t]$ in each dimension. The resulting continuous multilinear approximation of $F_t[X_t]$ is then used to generate the value of $F_{t+1}$ at points in the state space where it is needed when solving (3). The multilinear approximation is local, depending only on the values of $F_t$ at the vertices of the hypercube in the state-space grid that contains the point to be approximated (de Boor 1978).
The value of the multilinear approximant to $F_t[X_t]$ at a point $X_t$ can be calculated efficiently using tensor product methods, which divide the $k$-dimensional interpolation problem into $k$ parts, each treating one dimension. Consider a two-dimensional problem where $X_{t+1} = (x_1, x_2)$, with $x_1 \in (p_{1,i}, p_{1,i+1})$ and $x_2 \in (p_{2,j}, p_{2,j+1})$, where $p_{k,i}$ denotes a state-space grid point. Let $H(x_1, x_2)$ represent the bilinear approximant to $F_{t+1}$ and let $\alpha = (x_1 - p_{1,i})/(p_{1,i+1} - p_{1,i})$ and $\beta = (x_2 - p_{2,j})/(p_{2,j+1} - p_{2,j})$. The value of $H$ at $(x_1, x_2)$ is calculated by first carrying out linear interpolation in $x_1$, with $x_2$ fixed at $p_{2,j}$ and $p_{2,j+1}$, yielding
$$\begin{aligned} h_1(p_{2,j}) &= \alpha\,F_{t+1}(p_{1,i+1}, p_{2,j}) + (1-\alpha)\,F_{t+1}(p_{1,i}, p_{2,j}), \\ h_2(p_{2,j+1}) &= \alpha\,F_{t+1}(p_{1,i+1}, p_{2,j+1}) + (1-\alpha)\,F_{t+1}(p_{1,i}, p_{2,j+1}). \end{aligned} \tag{5}$$
Then linear interpolation is carried out with respect to $x_2$, so that
$$H(x_1, x_2) = \beta\,h_2(p_{2,j+1}) + (1-\beta)\,h_1(p_{2,j}). \tag{6}$$
Using this procedure, the value of the approximant is generated each time it is needed when evaluating (3); the coefficients of the multilinear approximation are never explicitly calculated or stored. For a $k$-dimensional problem, first $2^{k-1}$ one-dimensional interpolations are done, then $2^{k-2}$ and so forth, for a total of $[2^k - 1]$ one-dimensional interpolations altogether.
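The dimension-by-dimension procedure in (5) and (6) extends to k dimensions as sketched below. This is a minimal illustration under assumptions: the function names `locate` and `multilinear`, the nested-list storage of grid values, and the returned interpolation counter are all hypothetical choices made for the example, not the paper's code.

```python
from bisect import bisect_right


def locate(grid, x):
    """Index j with grid[j] <= x <= grid[j+1], clamped to the grid range."""
    j = bisect_right(grid, x) - 1
    return min(max(j, 0), len(grid) - 2)


def multilinear(grids, values, point):
    """Tensor product linear interpolation in k dimensions.

    grids  : list of k increasing lists of grid points
    values : nested lists with values[j1][j2]...[jk] = F at that grid point
    point  : sequence of k coordinates
    Returns (interpolated value, number of one-dimensional interpolations).
    """
    k = len(grids)
    idx = [locate(g, x) for g, x in zip(grids, point)]
    w = [(x - g[j]) / (g[j + 1] - g[j]) for g, x, j in zip(grids, point, idx)]

    # Collect the 2**k vertex values of the enclosing hypercube.
    # Bit d of `bits` selects the lower (0) or upper (1) grid point in dimension d.
    corners = []
    for bits in range(2 ** k):
        v = values
        for d in range(k):
            v = v[idx[d] + ((bits >> d) & 1)]
        corners.append(v)

    # Collapse one dimension at a time: 2**(k-1) interpolations, then 2**(k-2), ...
    count = 0
    for d in range(k):
        corners = [(1 - w[d]) * corners[2 * m] + w[d] * corners[2 * m + 1]
                   for m in range(len(corners) // 2)]
        count += len(corners)
    return corners[0], count


# Bilinear check on a function that is itself bilinear, so the result is exact.
gx = [0.0, 1.0, 2.0]
f = lambda x, y: 2 * x + 3 * y + x * y
vals = [[f(x, y) for y in gx] for x in gx]
print(multilinear([gx, gx], vals, (0.5, 1.5)))
```

As in the text, no coefficients are stored: only the vertex values of the enclosing hypercube are read, and the counter confirms the total of 2^k - 1 one-dimensional interpolations.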
The first derivative of the multilinear approximant is discontinuous at the faces of the rectangular volumes defined by the grid so that the optimization problem in (3) cannot be solved using quasi-Newton methods designed for use with smooth functions. As a result, the multilinear DP program used the E04CCF algorithm in the NAG subroutine library, which is an implementation of the polytope algorithm (Gill, Murray and Wright 1981). The polytope method finds the maximum of a function $G$ by comparing the function values at the $n + 1$ vertices of a simplex and
replacing the vertex with the lowest function value until the simplex contracts to a final maximum. The method does not require evaluation of the gradient and is robust but slow (NAG FORTRAN Library Manual 1984).
4.2. The Spline DP Algorithm
A one-dimensional cubic spline is composed of individual cubic polynomials, each defined over a subinterval of the domain with end conditions specified so that overall the spline has continuous second derivatives. Because they have excellent approximation power, and are easy to evaluate and manipulate, splines have found uses in many applications (Schumaker 1981).
The tensor product spline approximant of $F_t(X_t)$ used in the spline DP algorithm consists of individual multivariate cubic polynomials, each of which is defined over a subregion of the state-space domain. The points that subdivide the domain are determined by the grid points of the state space in each dimension, so that each polynomial is defined over one of the rectangular volumes that make up the state space. Thus, for $X_t = (x_1, \ldots, x_k)$, with $x_i \in [p_i(j_i), p_i(j_i + 1)]$, where $p_i(j_i)$ denotes the $j_i$th state-space grid point that subdivides the domain in the $i$th dimension, the spline approximant $H(x_1, \ldots, x_k)$ equals
$$\sum_{l_1=1}^{4} \sum_{l_2=1}^{4} \cdots \sum_{l_k=1}^{4} c_{(j_1, j_2, \ldots, j_k)}(l_1, l_2, \ldots, l_k)\,(x_1 - p_1(j_1))^{l_1 - 1} (x_2 - p_2(j_2))^{l_2 - 1} \cdots (x_k - p_k(j_k))^{l_k - 1}. \tag{7}$$
In this paper, a spline written in the form given by (7) is called a pp-form spline. Birnbaum and Lapidus subdivide the domain at points other than the state-space grid points.
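Once the cell containing a point has been located, evaluating the pp-form in (7) amounts to evaluating a tensor product cubic polynomial in the local offsets. The sketch below shows one way this could be done with nested Horner evaluation. It is an assumption-laden illustration: the function name `eval_pp_cell` and the nested-list coefficient layout are hypothetical, and the powers are indexed from 0 to 3 here, corresponding to the exponents l - 1 for l = 1, ..., 4.

```python
def eval_pp_cell(coef, offsets):
    """Evaluate a tensor product cubic on one grid cell by nested Horner.

    coef    : nested lists, 4 entries per level and k levels deep, where
              coef[l1][l2]...[lk] multiplies offsets[0]**l1 * ... * offsets[k-1]**lk
    offsets : the k local coordinates x_i - p_i(j_i)
    """
    if not offsets:
        # All dimensions consumed: coef is now a single number.
        return coef
    s = offsets[0]
    result = 0.0
    # Horner's rule in the first offset; each coef[l] is itself a polynomial
    # in the remaining offsets and is evaluated recursively.
    for l in range(3, -1, -1):
        result = result * s + eval_pp_cell(coef[l], offsets[1:])
    return result


# One dimension: 1 + 2s + 3s^2 + 4s^3 at s = 2.
print(eval_pp_cell([1.0, 2.0, 3.0, 4.0], [2.0]))

# Two dimensions: only the s1 * s2^2 term is present.
c2 = [[0.0] * 4 for _ in range(4)]
c2[1][2] = 1.0
print(eval_pp_cell(c2, [2.0, 3.0]))
```

Each evaluation touches the 4^k coefficients of a single cell, so the cost per point is independent of the number of grid levels N once the cell index is known.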
The coefficients $c_{(j_1, \ldots, j_k)}(l_1, \ldots, l_k)$ of the pp-form spline are determined by requiring that it interpolate the values of $F_t[X_t]$ at each state-space grid point. In addition, the first and second derivatives of the cubic polynomials defined within each volume are required to match at the boundaries of the rectangular volume so that the resulting spline is smooth with second-degree continuity. Given these conditions, there are two extra degrees of freedom per grid line. In the spline DP algorithm these were resolved by using not-a-knot splines (de Boor). Thus, in a one-dimensional problem, with an $N$-point grid defining $N - 1$ intervals, only $N - 3$ separate cubic polynomials would be defined; the second and second-to-last points would be used only to specify function values that the first and last cubic polynomials must interpolate. Such one-dimensional splines have the same approximating power as cubic Hermite polynomials (de Boor). When not-a-knot end conditions are used in each dimension, the resulting spline has $[4(N - 3)]^k$ coefficients $c_{(j_1, \ldots, j_k)}(l_1, \ldots, l_k)$, where $j_i = 1, \ldots, N - 3$ and $l_i = 1, \ldots, 4$ for all $i$.
In (7), the cubic spline is written as a polynomial in powers of the components of a point $X_t$ and the coefficients $c_{(j_1, \ldots, j_k)}(l_1, \ldots, l_k)$ are determined by both smoothness requirements and interpolation conditions. Every cubic spline defined by (7) can also be written as a combination of one-dimensional cubic splines, called B-splines (de Boor), which take values between 0 and 1. Suppose that $[p_5, \ldots, p_N]$ represent the points that subdivide the domain $[a, b]$ of a one-dimensional cubic spline. To define a sequence of B-splines over the interval $[a, b]$, the points $[p_5, \ldots, p_N]$ must be augmented by an additional 8 points at or beyond the endpoints $a$ and $b$. Because their exact placement does not affect the accuracy of the approximation, these points may be picked arbitrarily (de Boor). The cubic B-spline $B_i$ associated with point $p_i$ is zero over all intervals except four adjacent ones. Figure 1 shows B-splines $B_i$ and $B_{i+1}$ on the interval $[a, b]$; the points $[p_i, \ldots, p_{i+5}]$ are equally spaced for convenience. Any one-dimensional cubic spline whose domain is defined by $[a, p_5, \ldots, p_N, b]$ can be represented as a linear combination of the B-splines $B_1, \ldots, B_N$. A more complete discussion of B-splines, including methods for their efficient evaluation, can be found in de Boor (1978) and Johnson (1989).
In a $k$-dimensional, continuous-state DP problem, where each state variable is discretized into $N$ levels, the spline approximant $H(x_1, \ldots, x_k)$ with not-a-knot end conditions can also be written as
$$H(x_1, \ldots, x_k) = \sum_{i_1=1}^{N} \sum_{i_2=1}^{N} \cdots \sum_{i_k=1}^{N} a(i_1, i_2, \ldots, i_k)\,B_{i_1}(x_1) B_{i_2}(x_2) \cdots B_{i_k}(x_k), \tag{8}$$
where $B_j(x_i)$ represents the $j$th B-spline in the $i$th dimension. A spline written in the form in (8) is called a B-form spline. Because each B-spline has the required second-degree continuity, only the conditions that $H(\cdot)$ interpolate $F_{t+1}[\cdot]$ at each state-space grid point must be specified to determine the $N^k$ coefficients $a(i_1, \ldots, i_k)$. In Section 6, the performance of the spline DP algorithm is examined using both the pp-form and B-form spline for evaluation. In Section 7, only the pp-form spline is used.
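The one-dimensional B-splines that make up the B-form can be evaluated with the standard Cox-de Boor recursion. The sketch below is a plain recursive illustration, not the efficient evaluation scheme referenced in the text; the function names `bspline` and `bform_value`, the zero-based indexing, and the uniform knot sequence are assumptions made for the example.

```python
def bspline(i, degree, knots, x):
    """Value at x of the i-th B-spline of the given degree (Cox-de Boor recursion)."""
    if degree == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = right = 0.0
    d1 = knots[i + degree] - knots[i]
    if d1 > 0:
        left = (x - knots[i]) / d1 * bspline(i, degree - 1, knots, x)
    d2 = knots[i + degree + 1] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + degree + 1] - x) / d2 * bspline(i + 1, degree - 1, knots, x)
    return left + right


def bform_value(a, knots, x, degree=3):
    """One-dimensional B-form: sum of a[i] * B_i(x) over all B-splines."""
    return sum(a[i] * bspline(i, degree, knots, x) for i in range(len(a)))


# Eleven equally spaced knots support 11 - 4 = 7 cubic B-splines.
knots = [float(i) for i in range(11)]
print(bspline(0, 3, knots, 2.0))   # peak of the first cubic B-spline
print(bspline(0, 3, knots, 5.0))   # outside its four supporting intervals
print(sum(bspline(i, 3, knots, 4.3) for i in range(7)))
```

Each cubic B-spline is nonzero on only four adjacent knot intervals and stays between 0 and 1. Where a full set of B-splines overlaps, they sum to one, so at most four terms per dimension contribute to (8) at any point.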
Figure 1. Some of the cubic B-splines $B_i$ defined on an interval $[a, b]$. The points $\{p_i\}$ subdivide the domain $[a, b]$.

Using either representation, calculation of the coefficients of a one-dimensional cubic spline whose second derivatives must match at every grid point requires solution of a set of tridiagonal (or almost tridiagonal) linear equations to obtain the values of those second derivatives, and implicitly the first derivatives as well. With multidimensional tensor product cubic splines, the required second derivatives are obtained by solving a sequence of sets of linear equations, one set for each dimension (Pereyra and Scherer 1973, de Boor 1978, Johnson 1989). Because the spline approximation is not local, the coefficients of the spline are calculated at the beginning of each stage, then stored for use in finding the value of $F_{t+1}$ at any point where it is needed.
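The one-dimensional tridiagonal solve described above can be sketched as follows. This is only an illustration under a simplifying assumption: it uses natural end conditions (zero second derivative at both ends), which give a strictly tridiagonal system, rather than the not-a-knot conditions used in the paper, which give the "almost tridiagonal" case. The function names are hypothetical.

```python
def natural_spline_second_derivatives(x, y):
    """Second derivatives m[i] of the natural cubic spline through (x[i], y[i])."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    # Rows 0 and n-1 encode the assumed natural end conditions m[0] = m[n-1] = 0.
    sub, diag, sup, rhs = [0.0] * n, [1.0] * n, [0.0] * n, [0.0] * n
    for i in range(1, n - 1):
        sub[i] = h[i - 1]
        diag[i] = 2.0 * (h[i - 1] + h[i])
        sup[i] = h[i]
        rhs[i] = 6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
    # Thomas algorithm: forward elimination followed by back substitution.
    for i in range(1, n):
        w = sub[i] / diag[i - 1]
        diag[i] -= w * sup[i - 1]
        rhs[i] -= w * rhs[i - 1]
    m = [0.0] * n
    m[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        m[i] = (rhs[i] - sup[i] * m[i + 1]) / diag[i]
    return m


def spline_value(x, y, m, t):
    """Evaluate the spline at t using the stored second derivatives m."""
    i = min(len(x) - 2, sum(1 for xi in x[1:-1] if xi <= t))
    h = x[i + 1] - x[i]
    a = (x[i + 1] - t) / h
    b = (t - x[i]) / h
    return (a * y[i] + b * y[i + 1]
            + ((a ** 3 - a) * m[i] + (b ** 3 - b) * m[i + 1]) * h * h / 6.0)
```

The second derivatives are computed once and stored, after which each evaluation is cheap, mirroring the calibrate-then-evaluate split described in the text.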
The second-degree continuity of the cubic spline approximant of $F_t$ is very important because it allows the use of efficient Newton-type or quasi-Newton algorithms (Gill, Murray and Wright) when solving the optimization problem in (3) at each grid point. The spline DP program uses the sequential quadratic optimization routine E04VCF (NAG FORTRAN Library Manual). A sequential quadratic programming algorithm is an active set method in which the direction for a one-dimensional line search at each iteration is determined by solving a quadratic program (NAG FORTRAN Library Manual). The objective function of the quadratic program is a quadratic model of the Lagrangian associated with the optimization problem at the current estimate of the solution, and requires the evaluation of the objective function and its gradient at that point. The value of any nonlinear constraints and their Jacobian must also be determined. An approximation to the Hessian is built up as the iterations proceed.

The solution of the optimization in (3) at each state-space grid point $X_t$ using the quasi-Newton algorithm requires the evaluation of the objective $E\{B_t[X_t, R_t, Q_t] + F_{t+1}[X_{t+1}]\}$ and its gradient at each iteration. For the test problems considered here, which have linear constraints and dynamics, constraints on $X_{t+1}$ were converted into constraints on $R_t$ and the optimization was solved for $R_t$. In problems with nonlinear dynamics or constraints, it may be more natural to solve for both $R_t$ and $X_{t+1}$ in the optimization, subject to the additional constraints $X_{t+1} = g(X_t, R_t, Q_t)$ (Gill, Murray and Wright). In either case, after the spline coefficients have been determined and stored for a given stage $t$, the value of $F_{t+1}(X_{t+1})$ and its gradient are easily calculated using tensor product methods for any $X_{t+1} = g(X_t, R_t, Q_t)$ (de Boor 1978, Johnson 1989).

To increase the efficiency of the DP algorithms, the best available solution is used to start each numerical optimization to compute $R_t$ in (3). When the optimization problems at each stage are similar, the approximate Hessian from the previous stage is also used to start each optimization after the first stage in the spline DP algorithm.

Concavity of $F_{t+1}[X_{t+1}]$ does not ensure the concavity of a cubic spline approximant. Identification of the global optimum in (3) is not guaranteed by a quasi-Newton algorithm if the cubic spline approximation to $F_{t+1}[X_{t+1}]$ is not concave. If the spline approximation of $F_{t+1}[X_{t+1}]$ fails to be concave when $F_{t+1}[X_{t+1}]$ is, it may be an indication that the spline is providing a poor approximation or that the optimizing algorithm is terminating unsuccessfully, introducing spurious values of $F_{t+1}[X_{t+1}]$. A concavity checker, described in the Appendix, was developed to detect such problems and to verify the in-progress performance of the spline DP algorithm.
4.3. The Gradient Dynamic Programming (GDP) Algorithm

Cubic Hermite polynomials are the multivariate cubic polynomials within each rectangular state-space volume that match the given values of $F_t[X_t]$ and its gradients at the vertices of the hypercube. When (3) is solved in the GDP algorithm, the gradient of $F_t[X_t]$ with respect to the state variable $X_t$ at each grid point is also calculated and stored. Because these Hermite polynomials provide a local approximation, it is possible to calculate the coefficients for each polynomial only when they are needed for interpolation in (3).

Our implementation of Foufoula-Georgiou and Kitanidis's GDP algorithm employed their FORTRAN program modified to use the same NAG quasi-Newton routine E04VCF as the spline DP program. Such algorithms generally perform well even when the objective function does not have continuous second derivatives (Gill, Murray and Wright), as is the case here because the second derivatives of the Hermite polynomials are not continuous at the faces of the hypercubes defined by the state-space grid. The GDP program recomputes the coefficients of the Hermite polynomial each time an interpolation occurs except when successive interpolations are within the same hypercube.
5. COMPUTATIONAL EFFORT FOR MULTIVARIATE SPLINE INTERPOLATION

Of interest is the computational gain that can be achieved by using higher-order interpolation schemes. Equation 4 indicates that the effort required to solve a DP numerically at each stage equals the number of grid points $N^k$ at which (3) must be solved times the effort required to solve (3), denoted $W(p, k)$. Here $W(p, k)$ is essentially the number of times the objective function $E\{B_t[X_t, R_t, Q_t] + F_{t+1}[X_{t+1}]\}$ and perhaps its gradient must be evaluated times the effort required to evaluate that function. In most applications, the expectation is computed by evaluating its argument at a set of discrete points. If $B_t[X_t, R_t, Q_t]$ is a simple and easy to compute analytical function of its arguments, then interpolating the value of $F_{t+1}[X_{t+1}]$ can be the costly step of the objective function evaluation in multidimensional problems.

Computational effort is often measured in floating point operations or flops (a floating point multiply or divide and an associated floating point addition or subtraction, plus any indexing; Golub and Van Loan 1983). With this metric, the effort required to compute the value of a $k$-dimensional multilinear interpolant is $[k + 2(2^k - 1)]$ flops (Johnson 1989).
5.1. Effort for Tensor Product PP-Form Splines

The effort required to compute an approximation to $F_t[X_t]$ using tensor product cubic splines can be divided into three parts. The LU decompositions of the coefficient matrices needed to solve for the second derivatives of the spline function only need to be calculated once for a given state-space grid, and may be determined at the start of the algorithm unless the grid changes (Daniel).

Then, at the beginning of each stage, the coefficients $c_{(j_1, \ldots, j_k)}(l_1, \ldots, l_k)$ of the interpolating cubic polynomials in (7) need to be calculated given the new values $F_{t+1}[X_{t+1}]$. The coefficients could be calculated by directly solving the $N^k$ linear equations generated by the interpolation conditions. However, the coefficients are more efficiently computed using tensor product methods, which consider one dimension at a time and take advantage of special structure. Solving the required equations in this calibration step to determine the coefficients requires approximately $4.67 N^k (4^k - 1)$ flops (Johnson).

Finally, evaluation of the interpolatory cubic spline requires $(4^k - 1)$ flops (Johnson). Because this evaluation must be done many times to compute the expectation in (3) at each iteration of the numerical optimization algorithm for each of the $N^k$ grid points, it dominates the effort required for calibration.

After the cubic polynomials' coefficients have been computed and stored, evaluating the spline interpolant requires $(4^k - 1)/[k + 2(2^k - 1)]$ or roughly $2^{(k-1)}$ times more effort than computing the value of a multilinear interpolant. For $k = 1$ there is no difference in effort, whereas for $k = 4$ the factor is nearly 8.
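The two evaluation counts quoted above are easy to tabulate. The short sketch below simply codes the stated formulas (the function names are illustrative, not from the paper) and prints the exact ratio next to the rough $2^{(k-1)}$ estimate for $k = 1, \ldots, 5$.

```python
def multilinear_flops(k):
    """Flops to evaluate a k-dimensional multilinear interpolant: k + 2(2^k - 1)."""
    return k + 2 * (2 ** k - 1)


def pp_spline_flops(k):
    """Flops to evaluate a k-dimensional pp-form cubic spline: 4^k - 1."""
    return 4 ** k - 1


for k in range(1, 6):
    ratio = pp_spline_flops(k) / multilinear_flops(k)
    # Columns: k, multilinear flops, spline flops, exact ratio, rough estimate 2^(k-1)
    print(k, multilinear_flops(k), pp_spline_flops(k), round(ratio, 2), 2 ** (k - 1))
```

For $k = 4$ the counts are 34 and 255 flops, a ratio of 7.5, which is the "nearly 8" quoted in the text.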
5.2. Effort for Tensor Product B-Form Splines

The coefficients $a(i_1, \ldots, i_k)$ of the B-form spline in (8) can also be calculated using tensor product methods. The effort required to calculate the coefficients $a(i_1, \ldots, i_k)$ at each stage is $k N^{(k-1)} [5N - 4]$ flops (Johnson).

The calibration effort associated with the B-form spline is substantially less than that associated with the pp-form spline, but the dominant evaluation costs are six times larger, approximately $6(4^k - 1)$ flops per function evaluation (Johnson). Because of the number of times evaluation is required when solving (3), use of the piecewise polynomial representation in (7) yields a more efficient algorithm. However, the storage requirements for the coefficients of the B-spline representation of the spline, equal to $8N^k$ bytes with double precision arithmetic, can be substantially less than the $8[4(N - 3)]^k$ bytes required when the interpolant is expressed as a piecewise polynomial (Johnson).

An advantage of the B-spline representation is that it is both general and flexible. The coefficients of the piecewise polynomial approximation can be calculated efficiently by converting from the B-spline representation (de Boor 1978, Johnson 1989) in approximately $N^k (4^k - 1)$ flops (Johnson). The spline DP algorithm computes the coefficients of the piecewise polynomial splines by converting from the B-spline representation.
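To give a feel for the storage trade-off, the two byte counts quoted above can be compared directly. The sketch below codes only the stated formulas; the function names and the choice of $N = 17$, $k = 4$ (the largest grid used in Section 6) as the example are illustrative.

```python
def b_form_bytes(n, k):
    """Double-precision storage for the B-form coefficients: 8 * N^k bytes."""
    return 8 * n ** k


def pp_form_bytes(n, k):
    """Double-precision storage for the pp-form coefficients: 8 * [4(N - 3)]^k bytes."""
    return 8 * (4 * (n - 3)) ** k


n, k = 17, 4
print(b_form_bytes(n, k), pp_form_bytes(n, k),
      round(pp_form_bytes(n, k) / b_form_bytes(n, k), 1))
```

For this example the B-form needs well under a megabyte while the pp-form needs tens of megabytes, which is the price paid for the cheaper evaluation.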
5.3. Advantages of Splines

While the computation of the spline approximation takes more effort than multilinear interpolation as $k$ increases, cubic splines should provide a more accurate approximation, as would Hermite polynomials, thus allowing a coarser state-space grid corresponding to a smaller value of $N$ to be employed. This can make up for the factor of roughly $2^{(k-1)}$ increase in evaluation effort. If $N$ could be halved by use of cubic splines, a computational saving would always result, because the number of grid points $N^k$ would fall by a factor of $2^k$, which exceeds $2^{(k-1)}$.

As noted in Section 4, the use of efficient quasi-Newton optimization methods made possible by the smoothness of the spline interpolant can significantly reduce the effort $W(p, k)$ required to solve (3) numerically. The value of $F_{t+1}$ and its gradient are required when evaluating the expectation in (3) at each iteration of a quasi-Newton algorithm. The evaluation of the cubic spline and its gradient with respect to $X_{t+1}$ requires $2.67(4^k - 1)$ flops (Johnson), or approximately $2.67(2^{(k-1)})$ times more effort than required to evaluate a multilinear interpolant.
6. COMPUTATIONAL RESULTS FOR A SAMPLE FOUR-RESERVOIR PROBLEM

The performance of the spline DP, GDP, and multilinear DP algorithms was compared on a series of stochastic water supply problems to examine the issues of accuracy, efficiency, and the advantages of smoothness.

6.1. Four-Reservoir Test Problem

The sample four-reservoir system described here is the same test problem used by Foufoula-Georgiou and Kitanidis and has served as a benchmark for testing numerical methods in the water resources literature (Yakowitz 1982). Let $R(t)$ be the decision vector (release) and $S(t)$ the state vector (beginning storage) for period $t$. The inflows to reservoirs 1 and 2 in period $t$ are assumed to be independent and denoted by $q_1(t)$ and $q_2(t)$, respectively. The releases from reservoir 2 flow into reservoir 3, and the releases from reservoirs 1 and 3 flow into reservoir 4. The state transition equations are:

$$\begin{aligned}
S_1(t+1) &= S_1(t) - R_1(t) + q_1(t) \\
S_2(t+1) &= S_2(t) - R_2(t) + q_2(t) \\
S_3(t+1) &= S_3(t) - R_3(t) + R_2(t), \text{ and} \\
S_4(t+1) &= S_4(t) - R_4(t) + [R_1(t) + R_3(t)].
\end{aligned} \tag{9}$$

The cost function associated with current period operations is

$$B[S(t), R(t), q(t)] = \sum_{i=1}^{4} C_i(t)\,[R_i(t) - 1]^2. \tag{10}$$

The terminal cost

$$C[S(T+1)] = \sum_{i=1}^{4} [S_i(T+1) - m_i]^2$$

is incurred at the end of the operating horizon, where $m_i$ is the desired volume of water in reservoir $i$ at the end of period $T$. Considering the constraints on both decision and state vectors, the problem is to determine the policy $R_i(t)$ ($i = 1, \ldots, 4$; $t = 1, \ldots, T$) that minimizes the objective function in (2) subject to

$$0 \le S_i(t) \le S_i^{\max}(t) \quad \text{for } i = 1, \ldots, 4;\ t = 1, \ldots, T \tag{11a}$$

$$R_i^{\min}(t) \le R_i(t) \le R_i^{\max}(t) \quad \text{for } i = 1, \ldots, 4;\ t = 1, \ldots, T \tag{11b}$$

and the state transition equations (9).

Following Foufoula-Georgiou and Kitanidis, $T = 3$ operating periods are considered, the maximum storage $S_i^{\max}(t)$ for all reservoirs is 12, and the desired storages $m_i$ at the end of the horizon equal (5, 5, 5, 7). The cost coefficients $C_i(t)$ are (1.1, 1.2, 1.0, 1.3) for all $t$.
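The test problem defined by (9), (10), the terminal cost, and (11a) is small enough to write down directly. The following sketch uses the parameter values given above; the function and constant names are hypothetical, and the release bounds in (11b) are omitted because their values are not stated in this section.

```python
COST_COEF = (1.1, 1.2, 1.0, 1.3)   # C_i(t), the same for all t
TARGET = (5.0, 5.0, 5.0, 7.0)      # m_i, desired storage at the end of period T
S_MAX = 12.0                       # maximum storage for every reservoir


def next_storage(s, r, q):
    """State transition (9): storages s, releases r, inflows q = (q1, q2)."""
    q1, q2 = q
    return (
        s[0] - r[0] + q1,
        s[1] - r[1] + q2,
        s[2] - r[2] + r[1],            # reservoir 2 releases into reservoir 3
        s[3] - r[3] + (r[0] + r[2]),   # reservoirs 1 and 3 release into reservoir 4
    )


def period_cost(r):
    """Current-period cost (10)."""
    return sum(c * (ri - 1.0) ** 2 for c, ri in zip(COST_COEF, r))


def terminal_cost(s):
    """Terminal cost on the storages at the end of the horizon."""
    return sum((si - mi) ** 2 for si, mi in zip(s, TARGET))


def storage_feasible(s):
    """Storage bounds (11a)."""
    return all(0.0 <= si <= S_MAX for si in s)
```

For example, starting from storages (5, 5, 5, 5) with unit releases and the deterministic inflows (2, 4), `next_storage` gives (6, 8, 5, 6) at zero current-period cost.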
In the deterministic version of the problem, the inflows $q_1(t)$ and $q_2(t)$ have values (2, 4) in all periods. In the stochastic version, the inflows are independent lognormal random variables, with means (2, 4) and standard deviations (1.5, 1.5). Both continuous inflow distributions were replaced by a three-point discrete approximation with probabilities of 1/6, 2/3, and 1/6 assigned, respectively, to the 5, 50, and 95 percentiles. The releases $R_i(t)$ were constrained so that $S_i(t + 1)$ satisfied (11a) for each discrete inflow.
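A three-point approximation of this kind can be sketched as below. The paper does not say how the lognormal parameters were obtained, so the sketch assumes they are chosen to match the stated mean and standard deviation; the function name `three_point_lognormal` is hypothetical.

```python
import math
from statistics import NormalDist


def three_point_lognormal(mean, sd,
                          percentiles=(0.05, 0.50, 0.95),
                          probs=(1 / 6, 2 / 3, 1 / 6)):
    """Return [(inflow, probability), ...] at the given lognormal percentiles."""
    # Assumption: parameters of log(q) follow from moment matching.
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    sigma = math.sqrt(sigma2)
    std_normal = NormalDist()
    points = [math.exp(mu + sigma * std_normal.inv_cdf(p)) for p in percentiles]
    return list(zip(points, probs))


inflow_1 = three_point_lognormal(2.0, 1.5)
inflow_2 = three_point_lognormal(4.0, 1.5)
```

Under the moment-matching assumption the median of the first inflow is $2/\sqrt{1 + 0.75^2} = 1.6$, which is the middle point returned for `inflow_1`.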
6.2. Results

Both the deterministic and stochastic versions of the 4-reservoir problem were solved using the three interpolation schemes with the discretization level $N$ ranging from 3 to 17 points per dimension. The optimal solution to the continuous deterministic problem was determined using nonlinear programming, and used to measure error in discretized problems. The first-period value function obtained from the 17-point/dimension spline run defined the true or best available solution in the stochastic case and was used to define the errors in the value functions calculated using other values of $N$. At a particular point in the state space, the error is calculated by dividing the difference between the true value function and the calculated value function by the true value function, then taking the absolute value. The overall error is calculated by averaging over the errors computed at all the individual points in the state space. The error was examined for systematic bias, which would inflate error estimates but is not important operationally because the shape of the value function determines a policy. Bias was not found to be significant (Tejada-Guibert 1990).

Table I displays the results for the stochastic problems. The CPU times are in seconds on the Cornell National Supercomputer Center's IBM 3090-600E (Johnson, Stedinger and Shoemaker 1988).
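The error measure described above is a plain average of pointwise relative absolute errors. A minimal sketch, with a hypothetical function name and made-up sample values used only to show the arithmetic:

```python
def average_relative_error(true_values, approx_values):
    """Average over state-space points of |(F_true - F_approx) / F_true|, in percent."""
    errors = [abs((ft - fa) / ft) for ft, fa in zip(true_values, approx_values)]
    return 100.0 * sum(errors) / len(errors)


# Illustrative values only; they are not taken from the paper.
are = average_relative_error([10.0, 20.0, 40.0], [11.0, 19.0, 40.0])
```

Here the pointwise errors are 10%, 5%, and 0%, so the average relative error is 5%.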
As indicated by a comparison of the fourth and fifth columns of Table I, use of the B-spline representation for evaluation in the spline DP algorithm instead of the piecewise polynomial representation increased the actual run time by approximately a factor of 3. This increase is less than the factor of six suggested by the analysis in Section 5, but clearly indicates the computational advantage associated with using the piecewise polynomial representation.

For a given discretization level $N$, the spline DP program (using pp-form splines) actually ran faster than the multilinear DP program, despite the roughly 20-fold increase in the number of flops required to do the spline interpolation and to calculate the spline's gradient analytically. This result is due to the greater efficiency of the quasi-Newton algorithm, which employs the calculated gradients to select a search direction and to build an approximation of the Hessian matrix. Of greater significance, for the same discretization level, the spline DP program yielded solutions that were approximately two orders of magnitude more accurate than the solutions obtained using the multilinear DP program.

Figure 2 compares the average of the relative absolute errors with which multilinear, Hermite GDP, and pp-form spline interpolants in a DP algorithm approximated the optimal value function $F_1(S(1))$ as a function of total CPU time. Results for both the stochastic and deterministic versions are presented. The use of 13 points per dimension with multilinear interpolation yielded an average error of 1.09% in the stochastic version of the problem, whereas the spline DP solution with 4 points per dimension had an error of 1.02%. If a problem can tolerate a 1% error in the optimal value function, the spline DP algorithm can yield the answer with 255 times less effort; for a 0.5% average error the gain increased to 330 times. On the deterministic version of the 4-reservoir problem, the spline DP algorithm yields a solution with 1% error 180 times faster than the multilinear DP program, and for a 0.5% error was 280 times faster. While results for deterministic and stochastic versions of the problem are qualitatively similar, the gains are greater in the stochastic problems because more interpolations are required per iteration.

The dramatic savings in CPU time associated with
the DP spline algorithm are due both to the greater accuracy of the spline and the ability to use quasi-Newton methods to determine optimal decisions. For the 4-, 5-, and 7-point discretization schemes in the stochastic problem, the spline DP algorithm with the quasi-Newton routine E04VCF ran approximately 10 times faster than when the polytope routine E04CCF was employed, despite the extra effort required to calculate the gradients analytically (Tejada-Guibert). In a case with a 250-fold reduction in CPU time, this implies that a 10-fold reduction is associated with the use of a quasi-Newton algorithm and a 25-fold reduction is associated with the increased accuracy of the spline.

Table I
Average Relative Absolute Error and CPU Time for Solution of the Stochastic 4-Reservoir Test Problem

N             Multilinear           Cubic Splines                                Hermite GDP
(Pts./Dim.)   CPU(s)    Error(%)    B-Form CPU(s)   PP-Form CPU(s)   Error(%)    CPU(s)    Error(%)
 3                25     52.527          -                -             -            28      2.779
 4                64     21.733         72               26           1.024          87      0.478
 5               145     12.859        181               63           0.270         212      0.169
 7               462      5.385        641              225           0.087         770      0.061
 9             1,096      3.126      1,610              604           0.052       2,124      0.018
13             4,043      1.092      6,976            2,444           0.009      12,800      0.007
17            10,520      0.860          -            6,950           0               -        -

Figure 2. Error vs. CPU Time for 4-Reservoir Problem. Average relative errors in the optimal value function are plotted against the CPU time for solutions of the 4-reservoir test problem with quadratic penalties over a range of values of the points/dimension $N$, and use of either multilinear (ML), pp-form spline (SP), or Hermite polynomial interpolation (GDP) for both deterministic (Det) and stochastic (St) problems, as noted.
The GDP algorithm using the derivatives of $F_t[X_t]$ and Hermite polynomials with a quasi-Newton optimization scheme required more effort than use of splines and linear interpolation for the same discretization level $N$, but yielded solutions with greater accuracy than the spline DP program. Figure 2 shows that for the same effort the GDP algorithm using Hermite polynomials was substantially more efficient than multilinear interpolation, but generally not as efficient as the spline DP algorithm.

An advantage of the use of splines over the use of GDP with Hermite polynomials is that a spline approximation is constructed in much the same way as a multilinear approximation, allowing the development of spline subroutines that can be incorporated into a computer program with no greater difficulty than subroutines for multilinear interpolation. By contrast, the Hermite polynomials require that the gradient of the optimal value function $F(X)$ with respect to the decision vector $R$ be computed, which requires a more elaborate computer code. Such programming decisions play a larger role in DP applications because general purpose algorithms are not available.
7. OTHER EXAMPLES
7.1. Effect of State Variable Dimension on CPU Ratio

To illustrate the relative change in the performance of the spline and multilinear DP algorithms with problem dimension, the series of 2-, 3-, 4-, and 5-dimensional problems described in Table II was created (Tejada-Guibert). They correspond to the 4-dimensional problem described in Section 6 with the addition or deletion of reservoirs in order to generate problems with state spaces of different dimensions.

Figure 3 contains plots of the average relative errors versus the CPU time for solutions obtained using the multilinear DP and spline DP algorithms (using the pp-form spline) for the cases in Table II. Table III reports estimates of the ratio of the CPU time required to solve these problems with a 0.5% average error using the multilinear DP and the spline DP algorithms. The numerical advantage of splines increases rapidly with problem size, although not as quickly as (4) might suggest.
7.2. Effect of the Smoothness of the Objective Function on the CPU Ratio

To test the effect of the smoothness of the sample problems on the results, the quadratic penalties on deviations of releases and final storage volumes from the specified targets in (10) were replaced in the 2- and 4-reservoir problems by absolute values to generate two new test cases (Tejada-Guibert) where

$$B[S(t), R(t), q(t)] = \sum_{i=1}^{4} C_i\, |R_i(t) - 1| \quad \text{and} \tag{12a}$$

$$C[S(T+1)] = \sum_{i=1}^{4} |S_i(T+1) - m_i|. \tag{12b}$$

Because the current period objective function (12a) is piecewise linear, the transition equations are linear, and the problem is constrained, the value function $F_t(X_t)$ is composed of linear hyperplanes. Hence, the multilinear interpolant may conceivably provide a better approximation of $F_t(X_t)$ than a smooth spline. However, when the value function is made up of a great many hyperplanes and the boundaries of each hyperplane are not identified, the tensor product cubic spline might approximate the value function better than the multilinear interpolant.

Table II
Sequence of DP Reservoir Problems

DP Functional Equation
  $F_t[S(t)] = \min_{R(t)} E_{q(t)} \{ B[S(t), R(t), q(t)] + F_{t+1}[S(t+1)] \}$

Current Period Objective Function $B[S(t), R(t), q(t)]$
  For 3-, 4- and 5-reservoir problems: $\sum_i C_i (R_i(t) - 1)^2$
  For 2-reservoir problem: $(R_1(t) + R_2(t) - 2)^2$

Terminal Value Function
  $C[S(T+1)] = \sum_i [S_i(T+1) - m_i]^2$; $T = 3$

Cases
  2: 2 reservoirs in parallel. $\{C_i\}$ = (1.0, 1.0), $\{m_i\}$ = (5.0, 7.0).
  3: 2 reservoirs in parallel, 1 reservoir downstream after confluence. $\{C_i\}$ = (1.1, 1.2, 1.3), $\{m_i\}$ = (5.0, 5.0, 7.0).
  4: 1 reservoir in parallel with respect to 2 reservoirs in series, fourth reservoir downstream from confluence. $\{C_i\}$ = (1.1, 1.2, 1.0, 1.3), $\{m_i\}$ = (5.0, 5.0, 5.0, 7.0).
  5: As for 4-reservoir problem, with fifth reservoir below the fourth. $\{C_i\}$ = (1.1, 1.2, 1.0, 1.3, 1.1), $\{m_i\}$ = (5.0, 5.0, 5.0, 7.0, 7.0).

Constraints
  $S_i^{\min}(t) = 0$ and $S_i^{\max}(t) = 12$ for all reservoirs $i$ and periods $t$.
  $R_i^{\min}(t) = 0$ for all $i$ and $t$; releases are also constrained so that $S(t + 1)$ remains within its bound for all levels of the discrete inflow.

Figure 4 contains plots of the average relative errors versus CPU time obtained by solving the 2- and 4-reservoir problem with absolute value penalty functions. For the 2-reservoir problem, the spline algorithm has trouble approximating the optimal value function reliably when there are few discretization levels. However, the spline DP algorithm performed significantly better than the multilinear DP algorithm when the state space contained 9 or more points per dimension. The Appendix describes the results of using the concavity checker to verify the behavior of the spline approximation in this problem.

In the 4-dimensional problem, the spline DP algorithm always performed better than the multilinear DP algorithm, although neither algorithm produced solutions that were as accurate as those achieved for similar CPU times when the quadratic objective function (10) was employed. Because there are many interpolation points in a 4-dimensional problem even for small values of $N$, the spline DP was better behaved and for an average error of 2% provided the solution 40 times faster than the multilinear algorithm.
Figure 3. Error vs. CPU Time for 2-, 3-, and 5-Reservoir Problems. Average relative errors in the optimal value function are plotted against the CPU time for a range of values of the points/dimension $N$ obtained using multilinear (ML) and pp-form spline (SP) interpolation for solutions of the 2-, 3-, and 5-reservoir test problems with quadratic penalties, as noted.

Table III
Ratio of Multilinear to Spline CPU Times for 0.5% Relative Error

Number of Reservoirs    Ratio
2                        25
3                       110
4                       330
5                       400
Figure 4. Error vs. CPU Time for Absolute-Value Penalties. Average relative errors in the optimal value function are plotted against the CPU time over a range of values of the points/dimension $N$ obtained using multilinear (ML) and pp-form spline (SP) interpolation for solutions of the 2- and 4-reservoir test problems with absolute value penalties, as noted.

8. SUMMARY

Employing multivariate, cubic, piecewise polynomial functions to approximate the value function in the numerical solution of continuous-state dynamic programs can reduce the computational effort required to solve such problems because of the ability to use a coarser grid and to employ efficient quasi-Newton optimization algorithms. Using a test problem that has served as a benchmark in the water resources area, we compared the performance of our spline DP algorithm, which uses the piecewise polynomial form of the spline approximant, to algorithms employing multivariate Hermite polynomials and splines in B-spline form. Use of our piecewise polynomial representation decreased the CPU time by a factor of 3 over that required by the B-spline representation proposed in the literature. For the same computational effort, the solution to the 4-dimensional test problem obtained using the spline DP algorithm had slightly greater accuracy than the solution obtained using the GDP algorithm that employs multivariate Hermite polynomials. Moreover, because spline interpolants do not require that the gradient at each grid point be calculated and stored, the spline DP algorithm has the same simple structure as an algorithm that employs tensor product linear interpolation.

Computational experiments on a series of water supply reservoir problems of varying dimension and two different value functions were carried out to investigate the impact of these parameters on computational savings. For smooth problems, the observed computational advantage of the spline DP algorithm relative to the multilinear DP algorithm increased with the state-space dimension and the required accuracy. For the 4-reservoir stochastic test problem with a quadratic penalty function, the computational savings were in excess of a factor of 250 for a 1% average relative error and 330 for a 0.5% average relative error.

The spline DP algorithm outperformed the multilinear DP algorithm on test problems where the optimal value function was piecewise linear, although the computational advantage was not as significant as for the case with a quadratic objective function. Neither the spline DP nor the multilinear DP algorithm produced very accurate solutions for the 4-dimensional stochastic test problem with an absolute value cost function; the computational savings were approximately 40 for an average relative error of 2%.

The computational studies reported here suggest that significant computational savings may be achieved when tensor product, cubic spline interpolation is used to develop numerical solutions to continuous-state dynamic programs, particularly those with several continuous state variables and smooth objectives with curvature. In Tejada-Guibert, Johnson and Stedinger (1993) and Tejada-Guibert, the spline DP and multilinear DP algorithms were used to generate operating policies for the two-reservoir Shasta/Trinity system in northern California. When energy targets were high, and large penalties were placed on water and power shortages, the spline DP algorithm generated value functions with substantially smaller error. However, for relatively flat objectives with moderate water and power targets, the spline and multilinear DP algorithms produced solutions of similar accuracy for the same effort. In other cases, savings may also be significant when the value function consists of many hyperplanes approximating an almost smooth surface. Employing Hermite polynomials will likely result in similar computational savings. Although only demonstrated in this paper for a backward recursion algorithm, approximation of the value function with tensor product cubic splines should also yield improvements in policy iteration and linear programming models.
APPENDIX
A Concavity Checker for the Spline DP Algorithm
(J. Alberto Tejada-Guibert and Jery R. Stedinger)

To demonstrate the concavity of a function it is sufficient to show the negative semidefiniteness of its Hessian matrix of second partial derivatives (Sydsaeter 1981). Demonstrating the concavity of the entire surface $F_t(X_t)$ would be a large numerical task because the entire state space would have to be examined. However, interactions among many of the components of reservoir systems and other systems are often weak. If the objective in the optimization in (3) were composed of separable terms, it would be sufficient to verify the concavity properties of $F_t[X_t]$ in each dimension separately. Our concavity test considers whether the second derivatives of $F_t[X_t]$ with respect to each state variable at each state-space grid point are negative. Second partial derivatives of cubic splines vary linearly between grid points and cannot become positive in the interior of a hypercube unless they are also positive at a vertex.

A Nonconcavity Index

If nonconcavities are present in the spline approximant of $F_t[X_t]$, an index is needed to identify situations likely to pose a problem. The desirable properties of an index include: ease and economy of computation, unitlessness and scale invariance, and generality so that solutions to different problems of different dimensions can be compared. For a $k$-dimensional DP problem, let $X_{i,\max}$ and $X_{i,\min}$, $i = 1, \ldots, k$, be the maximum and minimum values of each state variable in the rectangular state space, and $L_i = X_{i,\max} - X_{i,\min}$ be their range. If $F''_{ii}$ is the second derivative of $F_t[X_t]$, the average positive second partial derivative with respect to a given state variable $X_i$ (the subscript $t$ has been dropped) over a $k$-dimensional state space equals
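
$$\overline{[F''_{ii}]^{(+)}} = \frac{1}{V_S} \int_{X_{k,\min}}^{X_{k,\max}} \cdots \int_{X_{1,\min}}^{X_{1,\max}} [F''_{ii}]^{(+)} \, dX_1 \cdots dX_k, \tag{A.1}$$
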
where $V_S$ is the state-space volume equal to ($\prod_i L_i$ for $i = 1, \ldots, k$), and $[F''_{ii}]^{(+)}$ is the value of the second derivative of $F(X)$ with respect to $X_i$ if it is positive, and zero otherwise.

This expression is not unitless. A reasonable normalizing factor could employ the range $L_i$, $i = 1, \ldots, k$, squared, and

$$M_F = [F_{\max} - F_{m\%}], \tag{A.2}$$

where $F_{\max}$ is the maximum value of $F$ and $F_{m\%}$ is the $m$th quantile ($m\%$ of the values of $F$ are higher than $F_{m\%}$). Here $F_{m\%}$ is employed in (A.2) rather than the minimum value of $F$ because penalties at boundaries or on infeasible conditions can result in unusual $F$-values that would result in large and unusual $M_F$.

To translate the index (A.1) into a computable form, the integral was discretized using a trapezoid-like rule so that the second derivative at each of the $2^k$ vertices in a $k$-dimensional problem has a weight of $1/2^k$ toward the value of the integral. For example, for a 2-dimensional ($k = 2$) problem, the index becomes
I 21
nl-1 JA 1jhA.2,jh
F VS+ h= = 1 2k [([Fli] )j,h
+ ([Fh](d+)%+1,h + ([F]ll(+)),h+l
+ ([F!J](+))j+lh+l]j for i = 1and 2, (A.3)
where n1 and n2 are the number of discretization
intervals for state variables X1 and X2, respectively,and I1Ajh and A2,j,h are the length of each hyper-
cube along dimensions X1 and X2. There are
(n, - 1)(n2 - 1) hypercubes in a 2-dimensional
state space.Using the normalizing factors and the averages
[F1'J(+),he overall nonconcavity index is:
NCI = (A.4)
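As a rough illustration of the ingredients of the index, the following sketch evaluates the 2-dimensional trapezoid-like average of (A.3) and the value range $M_F$ of (A.2). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the grids are passed as lists of grid-point coordinates (so each dimension has one fewer cell than points), the nearest-rank quantile rule is an assumption, and the per-dimension scaling by $L_i^2 / M_F$ is only inferred from the description of the normalizing factors. The way these terms are combined into (A.4) is not reproduced here.

```python
def positive_part(v):
    """[v]^(+): v if positive, zero otherwise."""
    return v if v > 0.0 else 0.0


def avg_positive_second_deriv_2d(d2, x1, x2):
    """Trapezoid-like average of the positive part of a second partial derivative.

    d2[j][h] is the second derivative at grid point (x1[j], x2[h]).
    Each of the 4 vertices of a cell contributes with weight 1/4 of the cell area.
    """
    total_area = (x1[-1] - x1[0]) * (x2[-1] - x2[0])
    total = 0.0
    for j in range(len(x1) - 1):
        for h in range(len(x2) - 1):
            cell_area = (x1[j + 1] - x1[j]) * (x2[h + 1] - x2[h])
            corners = (d2[j][h], d2[j + 1][h], d2[j][h + 1], d2[j + 1][h + 1])
            total += cell_area / 4.0 * sum(positive_part(c) for c in corners)
    return total / total_area


def value_range(f_values, m=50.0):
    """M_F = F_max - F_m%, where about m% of the values lie above F_m%.

    Assumption: a simple nearest-rank rule picks F_m%.
    """
    ordered = sorted(f_values, reverse=True)
    idx = min(len(ordered) - 1, int(len(ordered) * m / 100.0))
    return ordered[0] - ordered[idx]


def scaled_term(avg_pos, length, m_f):
    """Unitless term L_i^2 * average / M_F (assumed form of the normalization)."""
    return length ** 2 * avg_pos / m_f


if __name__ == "__main__":
    x = [0.0, 1.0, 2.0]
    d2 = [[-1.0, -1.0, -1.0], [-1.0, 4.0, -1.0], [-1.0, -1.0, -1.0]]
    avg = avg_positive_second_deriv_2d(d2, x, x)
    m_f = value_range([0.0, 1.0, 2.0, 3.0, 10.0])
    print(avg, m_f, scaled_term(avg, x[-1] - x[0], m_f))
```

Using a quantile rather than the minimum in `value_range` mirrors the reasoning in the text: a few extreme penalty values should not inflate the normalizing factor.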
Results
Using the median of the $F$-values, $F_{50\%}$, to compute $M_F$, numerical tests indicated that the NCI index (A.4) provided a consistent and general measure of nonconcavity on a series of two- and four-dimensional problems, for different patterns and densities of state-space discretization grids. Our experience indicates that NCI > 1.0 corresponds to unsatisfactory situations, and that problems with NCI < 0.1 did not have computational problems.

Table IV
Nonconcavity Test Results for the Two-Reservoir Problem With Absolute-Value Penalties

Discretization Level N    NCI       ARE(a) (%)
 4                        1.230     11.00
 5                        0.0518     0.523
 7                        0.0917     1.75
 9                        0.0153     0.122
13                        1.400      0.591
13                        0.0238     0.0999
17                        2.510      0.463
17                        0.0203     0.0724
25                        0.292      0.0508
33                        0.867      0.0535
49                        0.220      0.0237
65                        0.427      -

(a) Average relative errors (ARE) are included using the 65 discretization level run as base case.

Table IV shows results for the 2-reservoir absolute-value penalty problem reported in Figure 4. For N = 4 (NCI = 1.23, ARE = 11.0%), the splines seem unable to describe the value function well. There are two runs for N = 13 and 17; in each case the first experienced unsuccessful terminations in the optimization steps. They have high NCI's and also large errors, ARE. The N = 7 case exhibits low nonconcavity, but a relatively high ARE. However, this problem was found to have a high NCI after the first stage; the bias from the initial errors did not disappear. Our experience was that NCI was useful in identifying cases where something went wrong, and as a consequence had larger errors in the resultant approximation of $F[X]$.

The CPU time required by the concavity checker was 5% or less of the average time taken to run each stage of the spline DP algorithm for the problem tested, with a slight additional memory requirement. The run-time cost of checking concavity therefore is small, particularly in view of the benefits of obtaining some assurance that the algorithm is working as anticipated.
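The two thresholds reported above (NCI > 1.0 and NCI < 0.1) can be applied mechanically to the Table IV runs. The sketch below is only illustrative: the function name and the three labels are hypothetical, the rows are transcribed from Table IV as read here, and the text says nothing about values between 0.1 and 1.0, so that band is simply labeled "intermediate".

```python
# (N, NCI, ARE in %) rows transcribed from Table IV; the N = 65 base case has no ARE.
TABLE_IV = [
    (4, 1.230, 11.00), (5, 0.0518, 0.523), (7, 0.0917, 1.75), (9, 0.0153, 0.122),
    (13, 1.400, 0.591), (13, 0.0238, 0.0999), (17, 2.510, 0.463), (17, 0.0203, 0.0724),
    (25, 0.292, 0.0508), (33, 0.867, 0.0535), (49, 0.220, 0.0237), (65, 0.427, None),
]


def classify_nci(nci, low=0.1, high=1.0):
    """Label a run using the two thresholds reported in the text."""
    if nci > high:
        return "unsatisfactory"
    if nci < low:
        return "no problems expected"
    return "intermediate"  # band not characterized in the text


if __name__ == "__main__":
    for n, nci, are in TABLE_IV:
        print(n, nci, are, classify_nci(nci))
```

The runs flagged as unsatisfactory are N = 4 and the first runs for N = 13 and 17, matching the discussion above. The N = 7 run falls below the lower threshold despite its relatively high ARE, which is why the text also points to the NCI observed after the first stage.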
ACKNOWLEDGMENT
The authors thank E. Foufoula-Georgiou and P. Kitanidis for generously providing the original version of their GDP code. This research was supported by National Science Foundation grant #CEE-8351819, Pacific Gas and Electric Company of San Francisco, and a fellowship from the U.S. Army Mathematical Sciences Institute at Cornell. Computing resources were provided by the Cornell National Supercomputer Facility, a resource of the Cornell Theory Center, which receives major funding from the National Science Foundation and IBM Corp., with support from New York State and members of the Corporate Research Institute.
REFERENCES
ARVANITIDIS, N. V., AND J. ROSING. 1970a. Optimal Operation of Multireservoir Systems Using a Composite Representation. IEEE Trans. Power Appar. and Syst. PAS-89, 327-335.
ARVANITIDIS, N. V., AND J. ROSING. 1970a. Composite Representation of a Multireservoir Hydroelectric Power System. IEEE Trans. Power Appar. and Syst. PAS-89, 319-326.
BEAN, J. C., J. R. BIRGE AND R. L. SMITH. 1987. Aggregation in Dynamic Programming. Opns. Res. 35, 215-220.
BELLMAN, R. E. 1957. Dynamic Programming. Princeton University Press, Princeton, N.J.
BELLMAN, R. E., AND S. DREYFUS. 1962. Applied Dynamic Programming. Princeton University Press, Princeton, N.J.
BELLMAN, R. E., R. KALABA AND B. KOTKIN. 1963. Polynomial Approximation-A New Computational Technique in Dynamic Programming. Math. Comp. 17, 155-161.
BIRNBAUM, I., AND L. LAPIDUS. 1978. Studies in Approximation Methods-I: Splines and Control Via Discrete Dynamic Programming. Chem. Eng. Sci. 33, 415-426.
BRAGA, B. P. F., W. W. YEH, L. BECKER AND M. T. L. BARROS. 1991. Stochastic Optimization of Multiple-Reservoir System Operation. J. Water Resour. Plan. and Mgmt. 117 (4), 471-481.
BRAS, R. L., R. BUCHANAN AND K. C. CURRY. 1983. Real Time Adaptive Closed Loop Control of Reservoirs With the High Aswan Dam as a Case Study. Water Resour. Res. 19 (1), 33-52.
DANIEL, J. W. 1976. Splines and Efficiency in Dynamic Programming. J. Math. Anal. & Appl. 54, 402-407.
DE BOOR, C. 1978. A Practical Guide to Splines. Springer-Verlag, New York.
FEDERGRUEN, A., AND P. J. SCHWEITZER. 1979. Discounted and Undiscounted Value-Iteration in Markov Decision Problems: A Survey. In Dynamic Programming and Its Applications, M. Puterman (ed.). Academic Press, New York, 23-52.
FOUFOULA-GEORGIOU, E., AND P. K. KITANIDIS. 1988. Gradient Dynamic Programming for Stochastic Optimal Control of Multidimensional Water Resources Systems. Water Resour. Res. 24 (8), 1345-1359.
GAL, S. 1979. Optimal Management of a Multireservoir Water Supply System. Water Resour. Res. 15 (4), 737-748.
GAL, S. 1989. The Parameter Iteration Method in Dynamic Programming. Mgmt. Sci. 35, 675-684.
GILL, P. E., W. MURRAY AND M. H. WRIGHT. 1981. Practical Optimization. Academic Press, London.
GINN, T. R. 1986. Personal communication (August).
GOLUB, G. H., AND C. H. VAN LOAN. 1983. Matrix Computations. The Johns Hopkins University Press, Baltimore, Md.
HAURIE, A., AND P. L'ECUYER. 1986. Approximation and Bounds in Discrete Event Dynamic Programming. IEEE Trans. Auto. Control AC-31, 227-235.
HEYMAN, D. P., AND M. J. SOBEL. 1984. Stochastic Models in Operations Research, Vol. II. Stochastic Optimization. McGraw-Hill, New York.
HOWARD, R. A. 1960. Dynamic Programming and Markov Processes. John Wiley, New York.
JOHNSON, S. A. 1989. Spline Approximation in Discrete Dynamic Programming With Application to Stochastic Multireservoir Systems. Ph.D. Dissertation, Cornell University, Ithaca, N.Y.
JOHNSON, S. A., J. R. STEDINGER AND C. A. SHOEMAKER. 1988. Computational Improvements in Dynamic Programming. FOREFRONTS, Center for Theory and Simulation in Science and Engineering, Cornell University, 7, 3-7.
KELMAN, J., J. R. STEDINGER, L. A. COOPER, E. HSU AND S. YUAN. 1990. Sampling Stochastic Dynamic Programming Applied to Reservoir Operation. Water Resour. Res. 26 (3), 447-454.
KITANIDIS, P. K., AND E. FOUFOULA-GEORGIOU. 1987. Error Analysis of Conventional Discrete and Gradient Dynamic Programming. Water Resour. Res. 23 (5), 845-858.
L'ECUYER, P. 1985. Computing Transfer Lines Performance Measures Using Dynamic Programming. Comput. and Indus. Eng. 9, 387-393.
L'ECUYER, P. 1989. Computing Approximate Solutions to Markov Renewal Programs With Continuous State Spaces. Research Report DIUL-RR-8912, Universite Laval, Quebec, Canada.
LOUCKS, D. P., J. R. STEDINGER AND D. A. HAITH. 1981. Water Resource Systems Planning and Analysis. Prentice-Hall, Englewood Cliffs, N.J.
LOVEJOY, W. S. 1986. Policy Bounds for Markov Decision Processes. Opns. Res. 34, 630-637.
MAWER, P. A., AND D. THORN. 1974. Improved Dynamic Programming Procedures and Their Practical Application to Water Resource Systems. Water Resour. Res. 10 (2), 183-190.
MORIN, T. L. 1979. Computational Advances in Dynamic Programming. In Dynamic Programming and Its Applications, M. Puterman (ed.). Academic Press, New York, 53-90.
NAG FORTRAN Library Manual. 1984. Mark 11, Vol. 3, Numerical Algorithms Group, Downers Grove, Ill.
PEREIRA, M. V. F., AND L. M. V. G. PINTO. 1991. Multi-Stage Stochastic Optimization Applied to Energy Planning. Math. Prog. 22, 359-375.
PEREYRA, V., AND G. SCHERER. 1973. Efficient Computer Manipulation of Tensor Products With Applications to Multidimensional Approximation. Math. Comp. 27, 595-605.
PRENTER, P. M. 1975. Splines and Variational Methods. John Wiley, New York.
READ, E. G. 1990. Dual Dynamic Programming for Linear Production/Inventory Systems. Computers Math. Applic. 19, 29-42.
ROEFS, T. G., AND A. GUITRON. 1975. Stochastic Reservoir Models: Relative Computational Effort. Water Resour. Res. 11 (6), 801-804.
RUDIN, W. 1976. Principles of Mathematical Analysis. McGraw-Hill, New York.
SAAD, M., AND A. TURGEON. 1988. Application of Principal Component Analysis to Long-Term Reservoir Management. Water Resour. Res. 24 (7), 907-912.
SCHUMAKER, L. L. 1981. Spline Functions: Basic Theory. John Wiley, New York.
SCHWEITZER, P. J., AND A. SEIDMANN. 1985. Generalized Polynomial Approximations in Markovian Decision Processes. J. Math. Anal. & Appl. 110, 568-582.
SHOEMAKER, C. A. 1979. Optimal Timing of Multiple Applications of Pesticides With Residual Toxicity. Biometrics 35, 803-812.
SHOEMAKER, C. A. 1981. The Application of Dynamic Programming and Other Optimization Methods to Pest Management. IEEE Trans. Auto. Control 26, 1125-1132.
SHOEMAKER, C. A. 1982. Optimal Integrated Control of Univoltine Pest Populations With Age Structure. Opns. Res. 30, 40-61.
SHOEMAKER, C. A., AND S. A. JOHNSON. 1989. Stochastic Nonlinear Optimal Control of Populations: Computational Difficulties and Possible Solutions. In Mathematical Approaches to Problems in Resource Management and Epidemiology, C. Castillo-Chavez, S. A. Levin and C. A. Shoemaker (eds.). Springer-Verlag, New York, 67-81.
STEDINGER, J. R., B. F. SULE AND D. P. LOUCKS. 1984. Stochastic Dynamic Programming Models for Reservoir-Operation Optimization. Water Resour. Res. 20 (11), 1499-1505.
SU, Y., AND R. DEININGER. 1972. Generalization of White's Method of Successive Approximations to Periodic Markovian Processes. Opns. Res. 20, 318-326.
SYDSAETER, K. 1981. Topics in Mathematical Analysis for Economists. Academic Press, London.
TAKEUCHI, K., AND D. H. MOREAU. 1974. Optimal Control of Multiunit Interbasin Water Resource Systems. Water Resour. Res. 10 (3), 407-414.
TEJADA-GUIBERT, J. A. 1990. Spline Stochastic Dynamic Programming for Multiple Reservoir System Operation Optimization. Ph.D. Dissertation, Cornell University, Ithaca, New York.
TEJADA-GUIBERT, J. A., S. A. JOHNSON AND J. R. STEDINGER. 1993. Comparison of Two Approaches for Implementing Multireservoir Operating Policies Using Stochastic Dynamic Programming. (provisionally accepted).
TERRY, L. A., M. V. F. PEREIRA, T. A. ARARIPE NETO, L. F. C. A. SILVA AND P. R. H. SALES. 1986. Coordinating the Energy Generation of the Brazilian National Hydrothermal Electrical Generating System. Interfaces 16, 16-38.
TROTT, W. J., AND W. W-G. YEH. 1973. Optimization of Multiple Reservoir Systems. J. Hydraulic Eng. 99, 1865-1884.
TURGEON, A. 1980. Optimal Operation of Multireservoir Power Systems With Stochastic Inflows. Water Resour. Res. 16 (2), 275-283.
TURGEON, A. 1981. A Decomposition Method for the Long-Term Scheduling of Reservoirs in Series. Water Resour. Res. 17 (6), 1565-1570.
WANG, D., AND B. J. ADAMS. 1986. Optimization of Real-Time Reservoir Operation With Markov Decision Processes. Water Resour. Res. 22 (3), 345-352.
WHITE, D. J. 1963. Dynamic Programming, Markov Chains, and the Method of Successive Approximations. J. Math. Anal. & Appl. 6, 373-376.
WHITE, D. J. 1969. Dynamic Programming. Holden-Day, San Francisco.
WHITE, D. J. 1979. Elimination of Non Optimal Actions in Markov Decision Processes. In Dynamic Programming and Its Applications, M. Puterman (ed.). Academic Press, New York, 131-160.
WHITE, D. J. 1985. Real Applications of Markov Decision Processes. Interfaces 15, 73-83.
WHITE, D. J. 1988. Further Real Applications of Markov Decision Processes. Interfaces 18, 55-61.
WHITT, W. 1978. Approximation of Dynamic Programs I and II. Math. Opns. Res. 3, 231-243 (see also 1979, 179-185).
YAKOWITZ, S. 1982. Dynamic Programming Applications in Water Resources. Water Resour. Res. 18, 673-696.