ADMM¶
The fundamental class from which all ADMM algorithms are derived is
ADMM
, which supports problems of the form
See SplineL1
as an example of a class derived directly from
ADMM
. Additional classes within the admm.admm
module
support less general forms of problem; this specialisation allows for
a smaller number of methods that have to be overriden in derived
classes.
Classes derived from ADMM
should override/define the methods
and attributes in the following sections.
Initialisation¶
The __init__
method of the derived class should call the
ADMM
__init__
method to ensure proper initialisation.
State variables \(\mathbf{y}\) and \(\mathbf{u}\) are
initialised to zero by ADMM.yinit
and ADMM.uinit
respectively. These methods should be overridden if a different
initialization is desired.
Update Steps¶
The ADMM updates steps are defined by the following methods:
-
Solve
\[\mathbf{x}^{(j+1)} = \mathrm{argmin}_{\mathbf{x}} \;\; f(\mathbf{x}) + \frac{\rho}{2} \left\| A\mathbf{x} - \left( -B\mathbf{y}^{(j)} + \mathbf{c} - \mathbf{u}^{(j)} \right) \right\|_2^2\]This method should set
self.X
as a function ofself.Y
andself.U
. -
Solve
\[\mathbf{y}^{(j+1)} = \mathrm{argmin}_{\mathbf{y}} \;\; g(\mathbf{y}) + \frac{\rho}{2} \left\| B\mathbf{y} - \left( -A\mathbf{x}^{(j+1)} + \mathbf{c} - \mathbf{u}^{(j)} \right) \right\|_2^2\]This method should set
self.Y
as a function ofself.AX
andself.U
. The use ofself.AX
(i.e. \(A \mathbf{x}^{(j+1)}\) in the equation above) instead ofself.X
(i.e. \(\mathbf{x}^{(j+1)}\) in the equation above) is to allow relaxation (see Sec. 3.4.3 of [9]), as implemented inADMM.relax_AX
, to occur without need for explicit implementation in classes derived fromADMM
. -
Update dual variable
\[\mathbf{u}^{(j+1)} = \mathbf{u}^{(j)} + A\mathbf{x}^{(j+1)} + B\mathbf{y}^{(j+1)} - \mathbf{c}\]This method should set
self.U
as a function of the previous value ofself.U
andself.X
andself.Y
.
A derived class implementing a fully-specified ADMM problem (as
opposed to a partial specialisation) must define ADMM.xstep
and ADMM.ystep
. It is usually not necessary to override
ADMM.ustep
since it is defined in ADMM
in terms of
a call to ADMM.rsdl_r
: the \(\mathbf{u}\) update is
and the primal residual is
so we can express the \(\mathbf{u}\) update as
It is quite common for one of the update steps (ADMM.xstep
in
particular) to make use of pre-computed values, such as factorisations
of matrices involved in computing the update. If these pre-computed
values depend on the penalty parameter self.rho
they need to be
recomputed when the penalty parameter changes when the AutoRho
mechanism is enabled (see ADMM.update_rho
); this will take
place automatically if ADMM.rhochange
is overridden with a
method that updates these pre-computed values.
Constraint Definition¶
The ADMM problem constraint is defined by the following methods, which define the linear operators \(A\) and \(B\) and the transpose of \(A\) (required for computing the dual residual), and the constant vector \(\mathbf{c}\):
-
Compute and return \(A \mathbf{x}\)
-
Compute and return \(A^T \mathbf{u}\)
-
Compute and return \(B \mathbf{y}\)
-
Return constant \(\mathbf{c}\)
A derived class implementing a fully-specified ADMM problem (as opposed to a partial specialisation) will usually define all of these methods. If it does not, it is necessary to override all of the methods in Residual Evaluation.
Residual Evaluation¶
The following methods support evaluation of the primal and dual residuals:
-
Compute primal residual
\[\mathbf{r} = A\mathbf{x}^{(j+1)} + B\mathbf{y}^{(j+1)} - \mathbf{c}\] -
Compute dual residual
\[\mathbf{s} = \rho A^T B (\mathbf{y}^{(j+1)} - \mathbf{y}^{(j)})\] -
Compute primal residual normalisation factor
\[\mathrm{rn} = \mathrm{max}(\|A\mathbf{x}^{(j+1)}\|_2, \|B\mathbf{y}^{(j+1)}\|_2, \|\mathbf{c}\|_2)\] -
Compute dual residual normalisation factor
\[\mathrm{sn} = \rho \|A^T \mathbf{u}^{(j+1)} \|_2\]
These methods need not be overridden if those in
Constraint Definition are defined since ADMM
includes definitions of ADMM.rsdl_r
, ADMM.rsdl_s
,
ADMM.rsdl_rn
, and ADMM.rsdl_sn
in terms of
ADMM.cnst_A
, ADMM.cnst_AT
, ADMM.cnst_B
, and
ADMM.cnst_c
.
Iteration Statistics¶
There is a flexible but relatively complex mechanism supporting the recording of statistics such as objective function and residual values for each iteration.
IterationStats Definition¶
These statistics are recorded as a collections.namedtuple
class, self.IterationStats
. The fields of this namedtuple
are
defined by class method IterativeSolver.itstat_fields
, which
returns a tuple of fields consisting of the following components:
Iter
: Iteration numberA tuple of field names in
ADMM.itstat_fields_objfn
: Fields representing the objective function and and its individual termsPrimalRsdl
: Norm of primal residualDualRsdl
: Norm of dual ResidualEpsPrimal
: Primal residual stopping tolerance \(\epsilon_{\mathrm{pri}}\)EpsDual
: Dual residual stopping tolerance \(\epsilon_{\mathrm{dua}}\)Rho
: Penalty parameterA tuple of field names in
ADMM.itstat_fields_extra
: Optional extra fieldsTime
: Cumulative run time
In most cases a derived class will simply override
ADMM.itstat_fields_objfn
and
ADMM.itstat_fields_extra
to customise the desired iteration
statistics fields, but if more flexibility is required,
IterativeSolver.itstat_fields
should be overridden.
IterationStats Construction¶
The actual construction of the self.IterationStats
namedtuple
for each iteration is performed by ADMM.iteration_stats
,
which expects that self.IterationStats
follows the structure
defined by IterativeSolver.itstat_fields
. Tuples of values
corresponding to the fields defined in
ADMM.itstat_fields_objfn
and
ADMM.itstat_fields_extra
should be returned by
ADMM.eval_objfn
and ADMM.itstat_extra
respectively.
In ADMM
, ADMM.itstat_fields_objfn
is defined as the
tuple ('ObjFun', 'FVal', 'GVal')
, and ADMM.eval_objfn
constructs the corresponding field values by calls to
ADMM.obfn_f
and ADMM.obfn_g
, which are expected to
return the values of \(f(\mathbf{x})\) and \(g(\mathbf{y})\)
respectively. In the simplest case it is sufficient to just define
ADMM.obfn_f
and ADMM.obfn_g
in a derived class, but
in most cases one would instead override
ADMM.itstat_fields_objfn
and ADMM.eval_objfn
(and
possibly ADMM.itstat_fields_extra
and
ADMM.itstat_extra
as well).
Status Display¶
When option Verbose
is enabled, a summary of the iterations
statistics is printed to the standard output. The printing of this
summary is controlled by ADMM.display_start
,
ADMM.display_status
, and ADMM.display_end
, which
will usually not need to be overridden since there is a flexible
method of customising the information displayed by these methods.
Class method ADMM.hdrtxt
returns a tuple of strings which
will be displayed as the headings for the displayed columns of
iteration statistics, and class method ADMM.hdrval
constructs
a dictionary that defines a mapping between these column heading
strings and corresponding field names in the self.IterationStats
namedtuple
. These two methods can be overridden if necessary, but
in most cases it is sufficient to override ADMM.hdrtxt_objfn
and ADMM.hdrval_objfun
, which respectively define the header
strings and mappings for the statistics related to the objective
function (see ADMM.itstat_fields_objfn
and
ADMM.eval_objfn
in IterationStats Construction).