Optimality conditions

Assume u is the solution of a problem parameterized by r. The state equation \(E(r,u)=0\) defines implicitly u with respect to r. The derivative \(A=E_u\) is the matrix of the system, assumed regular. The derivative of the implicit function gives:

\[E_r+E_u\frac{\partial u}{\partial r}=0\]

that is:

\[A\frac{\partial u}{\partial r}=-E_r\]

These are the sensitivities of the solution with respect to the parameters.

Having a cost function \(J = J(r,u)\), the Euler-Lagrange conditions of optimality are derived -derivated- from the Lagrangian function \(L=J+\lambda.E\), where \(\lambda\) is the lagrange multiplier of the state equation, named adjoint state.

\[\begin{split}J_r+\lambda.E_r=0\\ J_u+\lambda.E_u=0\\ E=0\end{split}\]

The variable \(\lambda\) can be deduced from the second equation re-written as \(A^T\lambda=-J_u\). Substituting \(\lambda\) in the first equation, we get

\[\begin{split}J_r+(-A^{-T}J_u).E_r=0\\ \text{i.e. } J_r+J_u.(-A^{-1}E_r)=0\\ \text{i.e. } J_r+J_u.\frac{\partial u}{\partial r}=0\end{split}\]

as could have been derived from the composition of \(J=J(r,u(r))\). This is the total derivative of J vs r. It needs only one resolution of a linear system the matrix of which is \(A^T\), giving the adjoint state \(\lambda\) - that method is called adjoint, meanwhile the direct method needs the resolution of as many linear systems of matrix A as there are parameters r, that is the sensitivities. This is why it is preferred in gradient-base methods of optimisation. For higher derivatives or higher order methods, like second-order so-called Newton’s algorithm, the adjoint state and the derivatives of u are useful.

Solving the optimality conditions by Newton’s algorithm needs to get one order higher derivative than in the simple resolution of the state equation, since the first order derivatives of the state function appear yet in the conditions. Derivating the matrix of the linearized state equation? Isn’it too hard and too big, that is foolish?

In fact, as we want the linearized system of optimality conditions reduced to the parameters space, these worrying derivatives appear in a matrix product (Schur complement in the right-hand side), so that it should remain feasible with standard computations of right-hand sides and matrices.

Inequality conditions

Optimisations seldom come without these inequalities, such as some interval of definition for r or other constraints ensuring the validity of the state equation. Just consider a domain with some thickness necesseraliy positive…

What Navpactos can do

Since all we need to formulate the problem is J and E, in Navpactos a ScalarBuilder for J and a VectorBuilder for E are the only entries. The ScalarBuilder and the VectorBuilder are designed to accumulate tensorial loops. As an example, J could be the sum of different integrals over surfaces of a mesh and E an integral over the whole domain.

If a first order method (gradient-based) is used, the adjoint state will be computed, and the inequality part with its lagrange multiplier be added to the gradient.

For second order methods the hessian of the lagrangian with the inequality part will be computed. All automatically, and exactly!

Nodal constraints

In geometrical or shape design, the nodes of the mesh must follow the evolution of the geometrical parameters, be it either explicit when shapes are simple or implicit. In those cases, u involves the position of the nodes and a derivative with respect to them is required. Navpactos just does it for you.

Your variant

Optimisation methods have a lot of variants: for the line search and, before the line search, for the correction of the direction given by Newton’s algorithm, specially in the case of inequalities. Interior-point methods, deflection of the direction towards a more inner descent in order to move off the inequality limits, taking in account the curvature of the boundaries…

All these methods can be easily controlled in the high level main algorithm, probably calling some function built with Navpactos' formulating system. Navpactos is aimed at easing the computations that involve complicated tensorial loops like finite elements ones, not replacing the methods you are designing. If you can easily get the hessian, you can vary about the variants… Navpactos will neithertheless give you some default algorithms.

Some examples…

… to come.