3.6.2 Gradient-Based Optimization
Most optimization algorithms are iterative. We begin with an initial guess for our design variables, denoted \(x^0\), and then iteratively update this guess until the optimal design is achieved:
\[
x^{q+1} = x^q + \alpha^q S^q,
\]
where
\(q\) is the iteration number,
\(x^q\) is our guess for \(x\) at iteration \(q\),
\(S^q \in \mathbb{R}^n\) is our vector search direction at iteration \(q\),
\(\alpha^q\) is the scalar step length at iteration \(q\), and
\(x^0\) is the given initial guess.
At each iteration we have two decisions to make: in which direction to move (i.e., which \(S^q\) to choose) and how far to move along that direction (i.e., how large \(\alpha^q\) should be). Optimization algorithms determine the search direction \(S^q\) according to some criterion; gradient-based algorithms use gradient information to compute the search direction.
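A minimal sketch of this update loop in Python, assuming steepest descent (the search direction is the negative gradient, \(S^q = -\nabla J(x^q)\)) and a fixed step length \(\alpha^q = \alpha\); the function names, tolerance, and iteration limit are illustrative choices, not prescribed here:

```python
import numpy as np

def steepest_descent(grad_J, x0, alpha=0.01, tol=1e-6, max_iter=1000):
    """Iterate x^{q+1} = x^q + alpha^q * S^q with S^q = -grad J(x^q)."""
    x = np.asarray(x0, dtype=float)
    for q in range(max_iter):
        S = -grad_J(x)               # search direction: negative gradient
        if np.linalg.norm(S) < tol:  # stop when the gradient is (nearly) zero
            break
        x = x + alpha * S            # fixed step length alpha^q = alpha
    return x

# Example usage on a simple quadratic, J(x) = x1^2 + 2*x2^2
grad_J = lambda x: np.array([2 * x[0], 4 * x[1]])
x_star = steepest_descent(grad_J, x0=[1.0, 1.0])
```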
For \(J(x)\) a scalar objective function that depends on \(n\) design variables, the gradient of \(J\) with respect to \(x = [x_1 \; x_2 \; \ldots \; x_n]^T\) is a vector of length \(n\). In general, we need the gradient evaluated at some point \(x^k\):
\[
\nabla J(x^k) =
\begin{bmatrix}
\dfrac{\partial J}{\partial x_1}\\[4pt]
\dfrac{\partial J}{\partial x_2}\\
\vdots\\
\dfrac{\partial J}{\partial x_n}
\end{bmatrix}_{x = x^k}
\]
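When analytic derivatives are not available, the gradient can be approximated numerically. A central-difference sketch, with the step size `h` as a placeholder value:

```python
import numpy as np

def gradient_fd(J, x, h=1e-6):
    """Approximate the gradient of a scalar function J at x by central differences."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (J(x + e) - J(x - e)) / (2 * h)  # approximates dJ/dx_i
    return g
```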
The second derivative of \(J\) with respect to \(x\) is a matrix, called the Hessian matrix, of dimension \(n \times n\). In general, we need the Hessian evaluated at some point \(x^k\):
\[
H(x^k) =
\begin{bmatrix}
\dfrac{\partial^2 J}{\partial x_1^2} & \cdots & \dfrac{\partial^2 J}{\partial x_1 \partial x_n}\\
\vdots & \ddots & \vdots\\
\dfrac{\partial^2 J}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 J}{\partial x_n^2}
\end{bmatrix}_{x = x^k}
\]
In other words, the \((i,j)\) entry of the Hessian is given by
\[
H_{ij} = \frac{\partial^2 J}{\partial x_i \partial x_j}.
\]
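The Hessian entries can likewise be approximated numerically; a self-contained sketch using second-order central differences of \(J\) itself (step size `h` again a placeholder):

```python
import numpy as np

def hessian_fd(J, x, h=1e-4):
    """Approximate the n x n Hessian of a scalar function J at x by central differences."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            # second-order central difference for d^2 J / (dx_i dx_j)
            H[i, j] = (J(x + ei + ej) - J(x + ei - ej)
                       - J(x - ei + ej) + J(x - ei - ej)) / (4 * h**2)
    return H
```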
Consider the function \(J(x)=3x_1 + x_1 x_2 + x_3^{2} + 6x_2^{3} x_3\).
Enter the gradient evaluated at the point \((x_1,x_2,x_3)=(1,1,1)\):
Enter the last row of the Hessian evaluated at the point \((x_1,x_2,x_3)=(1,1,1)\):
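One way to check a hand computation of these derivatives is symbolically, for example with `sympy` (the library choice here is just one option); the sketch below differentiates the given \(J\) and substitutes the point \((1,1,1)\):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
J = 3*x1 + x1*x2 + x3**2 + 6*x2**3*x3

# Gradient: vector of first partial derivatives
grad = sp.Matrix([sp.diff(J, v) for v in (x1, x2, x3)])

# Hessian: matrix of second partial derivatives
H = sp.hessian(J, (x1, x2, x3))

point = {x1: 1, x2: 1, x3: 1}
print(grad.subs(point))  # gradient evaluated at (1, 1, 1)
print(H.subs(point))     # Hessian evaluated at (1, 1, 1); its last row answers the second question
```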