9.4 Exploiting Structure
- Providing a function for solving KKT equations.
- The most expensive step of each iteration of cp() is the
solution of a set of linear equations (`KKT equations') of the form
where
The matrix W depends on the current iterates and is defined as follows. Suppose
Then W is a block-diagonal matrix,
with the following diagonal blocks.- The first block is a positive diagonal scaling with a
vector
:
This transformation is symmetric:
- The second block is a positive diagonal scaling with a
vector
:
This transformation is symmetric:
- The next M blocks are positive multiples of hyperbolic
Householder transformations:
where
These transformations are also symmetric:
- The last N blocks are congruence transformations with
nonsingular matrices:
In general, this operation is not symmetric, and
- W['dnl'] is the positive vector that defines the diagonal scaling for the nonlinear inequalities. W['dnli'] is its componentwise inverse.
- W['d'] is the positive vector that defines the diagonal scaling for the componentwise linear inequalities. W['di'] is its componentwise inverse.
- W['beta'] and W['v'] are lists of length M with the coefficients and vectors that define the hyperbolic Householder transformations.
- W['r'] is a list of length N with the matrices that define the the congruence transformations. W['rti'] is a list of length N with the transposes of the inverses of the matrices in W['r'].
The function call "f = kktsolver(x, z, W)" should return a routine for solving the KKT system (9.2) defined by x, z, W. It will be called as "f(bx, by, bz)". On entry, bx, by, bz contain the righthand side. On exit, they should contain the solution of the KKT system, with the last component scaled, i.e., on exit,
The role of the argument kktsolver in the function cpl() is similar, except that in (9.2),
- The first block is a positive diagonal scaling with a
vector
- Specifying constraints via Python functions.
- In the default use of cp(), the arguments G and
A are the coefficient matrices in the constraints
of (9.2).
It is also possible to specify these matrices by providing Python
functions that evaluate the corresponding matrix-vector products and
their adjoints.
If the argument G of conelp() is a Python function, it should be defined as follows:
G(u, v [, alpha[, beta[, trans]]])
-
- This evaluates the matrix-vector products
The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = 'N'.
Similarly, if the argument A is a Python function, then it must be defined as follows.
A(u, v [, alpha[, beta[, trans]]])
-
- This evaluates the matrix-vector products
The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = 'N'.
In a similar way, when the first argument F() of cp() returns matrices of first derivatives or second derivatives Df, H, these matrices can be specified as Python functions. If Df is a Python function, it should be defined as follows:
Df(u, v [, alpha[, beta[, trans]]])
-
- This evaluates the matrix-vector products
The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = 'N'.
If H is a Python function, it should be defined as follows:
H(u, v [, alpha[, beta]])
-
- This evaluates the matrix-vector product
The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = 'N'.
If G, A, Df, or H are Python functions, then the argument kktsolver must also be provided.
-
As an example, we consider the unconstrained problem
where A is an m by n matrix with m less than n. The Hessian of the objective is diagonal plus a low-rank term:
We can exploit this property when solving (9.2) by applying the matrix inversion lemma. We first solve
and then obtain
The following code follows this method. It also uses BLAS functions for matrix-matrix and matrix-vector products.
from cvxopt.base import matrix, spdiag, mul, div, log
from cvxopt import base, blas, lapack, solvers, base
def l2ac(A, b):
"""
Solves
minimize (1/2) * ||A*x-b||_2^2 - sum log (1-xi^2)
assuming A is m x n with m << n.
"""
m, n = A.size
def F(x = None, z = None):
if x is None:
return 0, matrix(0.0, (n,1))
if max(abs(x)) >= 1.0:
return None
# r = A*x - b
r = -b
blas.gemv(A, x, r, beta = -1.0)
w = x**2
f = 0.5 * blas.nrm2(r)**2 - sum(log(1-w))
# gradf = A'*r + 2.0 * x ./ (1-w)
gradf = div(x, 1.0 - w)
blas.gemv(A, r, gradf, trans = 'T', beta = 2.0)
if z is None:
return f, gradf.T
else:
def Hf(u, v, alpha = 1.0, beta = 0.0):
# v := alpha * (A'*A*u + 2*((1+w)./(1-w)).*u + beta *v
v *= beta
v += 2.0 * alpha * mul(div(1.0+w, (1.0-w)**2), u)
blas.gemv(A, u, r)
blas.gemv(A, r, v, alpha = alpha, beta = 1.0, trans = 'T')
return f, gradf.T, Hf
# Custom solver for the Newton system
#
# z[0]*(A'*A + D)*x = bx
#
# where D = 2 * (1+x.^2) ./ (1-x.^2).^2. We apply the matrix inversion
# lemma and solve this as
#
# (A * D^-1 *A' + I) * v = A * D^-1 * bx / z[0]
# D * x = bx / z[0] - A'*v.
S = matrix(0.0, (m,m))
Asc = matrix(0.0, (m,n))
v = matrix(0.0, (m,1))
def Fkkt(x, z, W):
ds = (2.0 * div(1 + x**2, (1 - x**2)**2))**-0.5
base.gemm(A, spdiag(ds), Asc)
blas.syrk(Asc, S)
S[::m+1] += 1.0
lapack.potrf(S)
a = z[0]
def g(x, y, z):
x[:] = mul(x, ds) / a
blas.gemv(Asc, x, v)
lapack.potrs(S, v)
blas.gemv(Asc, v, x, alpha = -1.0, beta = 1.0, trans = 'T')
x[:] = mul(x, ds)
return g
return solvers.cp(F, kktsolver = Fkkt)['x']
![\begin{displaymath}
\left[\begin{array}{ccc}
H & A^T & \tilde G^T \\
A & 0 &...
...= \left[\begin{array}{c} b_x \\ b_y \\ b_z \end{array}\right],
\end{displaymath}](img189.gif)