the additional cost accrued after time t if we were to implement the optimal inputs at all
subsequent times s ∈ [t, t_H],

$$V(t, x) = \min_{u(s>t)} \left\{ \int_t^{t_H} \sigma(s, x(s), u(s))\,ds + \pi(x(t_H)) \right\} \qquad (5.156)$$
V(t_0, x^{[0]}) is the optimal value of F[u(t); x^{[0]}]. At the horizon time V(t_H, x) = π(x), but
how do we work backwards in time from this known result, and how, from such a calculation,
do we determine the optimal u(t)?
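Before turning to the backward-in-time construction, it may help to see how the bracketed cost in (5.156) is evaluated for one particular control trajectory. The sketch below is purely illustrative: the dynamics f, running cost σ, and terminal cost π are hypothetical choices (not from the text), and the integral is approximated by simple explicit Euler steps.

```python
def cost_functional(x0, u_of_t, f, sigma, pi, t0, tH, N=1000):
    """Evaluate F[u(t); x0] = int_{t0}^{tH} sigma(s, x, u) ds + pi(x(tH))
    for one candidate control trajectory u_of_t, by explicit Euler integration."""
    dt = (tH - t0) / N
    x = float(x0)
    J = 0.0
    for k in range(N):
        t = t0 + k * dt
        u = u_of_t(t)
        J += sigma(t, x, u) * dt   # accumulate the running cost
        x += f(t, x, u) * dt       # advance the state along dx/dt = f(t, x, u)
    return J + pi(x)               # add the terminal cost pi(x(t_H))

# Hypothetical example problem: dx/dt = -x + u, sigma = x^2 + u^2, pi(x) = x^2.
f     = lambda t, x, u: -x + u
sigma = lambda t, x, u: x**2 + u**2
pi    = lambda x: x**2
print(cost_functional(1.0, lambda t: 0.0, f, sigma, pi, 0.0, 1.0))
```

V(t, x) is then the minimum of this quantity over all admissible control trajectories u(s > t).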
                    We obtain a differential equation for V (t, x) by taking the derivative of (5.156), evaluated
                  at the optimal control input u(t),
$$\frac{d}{dt} V(t, x) = -\sigma(t, x(t), u(t)) \qquad (5.157)$$
                  We relate this total time derivative to partial derivatives of V (t, x),
$$\frac{d}{dt} V(t, x(t)) = \frac{\partial V}{\partial t} + \nabla V \cdot \frac{dx}{dt} = \frac{\partial V}{\partial t} + \nabla V \cdot f(t, x, u) \qquad (5.158)$$
                  Thus, for the optimal control trajectory, (5.157) and (5.158) yield the partial differential
                  equation

$$\left.\frac{\partial V}{\partial t}\right|_{\mathrm{opt}} = -\sigma(t, x, u) - \nabla V \cdot f(t, x, u) \qquad (5.159)$$
We next define the “backward” time τ = t_H − t and rewrite the Bellman function as
ϕ(τ, x) = V(t, x), which satisfies the “initial” condition ϕ(0, x) = π(x). We then
rewrite (5.159) as

$$\left.\frac{\partial \varphi}{\partial \tau}\right|_{\mathrm{opt}} = \sigma(t_H - \tau, x, u) + \nabla\varphi \cdot f(t_H - \tau, x, u) \qquad (5.160)$$
                  For nonoptimal trajectories, ϕ(τ, x) increases faster with increasing τ than for the optimal
                  trajectory, so that, in general,
$$\frac{\partial \varphi}{\partial \tau} \ge \sigma(t_H - \tau, x, u) + \nabla\varphi \cdot f(t_H - \tau, x, u) \qquad (5.161)$$
                  Thus, for the optimal trajectory, we have the following PDE for ϕ(τ, x), known as the
                  Hamilton–Jacobi–Bellman (HJB) equation:
$$\frac{\partial \varphi}{\partial \tau} = \min_{u(\tau, x)} \left[ \sigma(t_H - \tau, x, u) + \nabla\varphi \cdot f(t_H - \tau, x, u) \right] \qquad (5.162)$$
                  We work backwards in time, and at each (τ, x), use the input u(τ, x) that minimizes the term
in the square brackets. Note that it is possible to have ∂ϕ/∂τ < 0 even if σ(t_H − τ, x, u) > 0,
since the requirement for monotonic decrease in V(t, x) with increasing t is only that
dϕ/dτ = ∂ϕ/∂τ − ∇ϕ · f > 0.
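As a concrete picture of this backward marching, the following sketch advances ϕ(τ, x) from ϕ(0, x) = π(x) according to (5.162) for a hypothetical scalar problem (the same illustrative f, σ, and π as in the sketch above, not from the text). Here ∇ϕ is approximated by finite differences and the minimization over u is done by brute force over a candidate grid; a production solver would use an upwind discretization and a stability-limited step in τ, which this sketch omits.

```python
import numpy as np

# Hypothetical scalar problem: dx/dt = -x + u, sigma = x^2 + u^2, pi(x) = x^2.
f     = lambda t, x, u: -x + u
sigma = lambda t, x, u: x**2 + u**2
pi    = lambda x: x**2

tH, n_tau, n_x = 1.0, 200, 101
dtau = tH / n_tau
x_grid = np.linspace(-2.0, 2.0, n_x)
dx = x_grid[1] - x_grid[0]
u_cand = np.linspace(-2.0, 2.0, 41)         # brute-force candidate control values

phi = pi(x_grid)                             # "initial" condition phi(0, x) = pi(x)
u_opt = np.zeros((n_tau, n_x))               # minimizing control u(tau, x) at each step

for k in range(n_tau):
    tau = k * dtau
    grad_phi = np.gradient(phi, dx)          # finite-difference estimate of grad(phi)
    # Right-hand side of (5.162) for every (candidate u, grid point x) pair.
    rhs = (sigma(tH - tau, x_grid[None, :], u_cand[:, None])
           + grad_phi[None, :] * f(tH - tau, x_grid[None, :], u_cand[:, None]))
    best = rhs.argmin(axis=0)                # index of the minimizing control at each x
    u_opt[k, :] = u_cand[best]
    phi = phi + dtau * rhs[best, np.arange(n_x)]   # explicit Euler step in tau
```

Storing the minimizing u(τ, x) at every step yields a feedback law defined over the whole state grid, which is what distinguishes the dynamic programming approach from computing a single optimal trajectory.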
                    In an open-loop control problem, we compute the optimal control input trajectory and
then fully implement it over the entire period t_0 ≤ t ≤ t_H. For this, the direct approach