ADMM

The fundamental class from which all ADMM algorithms are derived is ADMM, which supports problems of the form

\[\mathrm{argmin}_{\mathbf{x},\mathbf{y}} \;\; f(\mathbf{x}) + g(\mathbf{y}) \;\;\mathrm{such\;that}\;\; A\mathbf{x} + B\mathbf{y} = \mathbf{c} \;\;.\]

See SplineL1 as an example of a class derived directly from ADMM. Additional classes within the admm.admm module support less general forms of problem; this specialisation allows for a smaller number of methods that have to be overriden in derived classes.

Classes derived from ADMM should override/define the methods and attributes in the following sections.

Initialisation

The __init__ method of the derived class should call the ADMM __init__ method to ensure proper initialisation.

State variables \(\mathbf{y}\) and \(\mathbf{u}\) are initialised to zero by ADMM.yinit and ADMM.uinit respectively. These methods should be overridden if a different initialization is desired.

Update Steps

The ADMM updates steps are defined by the following methods:

  • ADMM.xstep

    Solve

    \[\mathbf{x}^{(j+1)} = \mathrm{argmin}_{\mathbf{x}} \;\; f(\mathbf{x}) + \frac{\rho}{2} \left\| A\mathbf{x} - \left( -B\mathbf{y}^{(j)} + \mathbf{c} - \mathbf{u}^{(j)} \right) \right\|_2^2\]

    This method should set self.X as a function of self.Y and self.U.

  • ADMM.ystep

    Solve

    \[\mathbf{y}^{(j+1)} = \mathrm{argmin}_{\mathbf{y}} \;\; g(\mathbf{y}) + \frac{\rho}{2} \left\| B\mathbf{y} - \left( -A\mathbf{x}^{(j+1)} + \mathbf{c} - \mathbf{u}^{(j)} \right) \right\|_2^2\]

    This method should set self.Y as a function of self.AX and self.U. The use of self.AX (i.e. \(A \mathbf{x}^{(j+1)}\) in the equation above) instead of self.X (i.e. \(\mathbf{x}^{(j+1)}\) in the equation above) is to allow relaxation (see Sec. 3.4.3 of [9]), as implemented in ADMM.relax_AX, to occur without need for explicit implementation in classes derived from ADMM.

  • ADMM.ustep

    Update dual variable

    \[\mathbf{u}^{(j+1)} = \mathbf{u}^{(j)} + A\mathbf{x}^{(j+1)} + B\mathbf{y}^{(j+1)} - \mathbf{c}\]

    This method should set self.U as a function of the previous value of self.U and self.X and self.Y.


A derived class implementing a fully-specified ADMM problem (as opposed to a partial specialisation) must define ADMM.xstep and ADMM.ystep. It is usually not necessary to override ADMM.ustep since it is defined in ADMM in terms of a call to ADMM.rsdl_r: the \(\mathbf{u}\) update is

\[\mathbf{u}^{(j+1)} = \mathbf{u}^{(j)} + A\mathbf{x}^{(j+1)} + B\mathbf{y}^{(j+1)} - \mathbf{c}\]

and the primal residual is

\[\mathbf{r}^{(j+1)} = A\mathbf{x}^{(j+1)} + B\mathbf{y}^{(j+1)} - \mathbf{c} \;,\]

so we can express the \(\mathbf{u}\) update as

\[\mathbf{u}^{(j+1)} = \mathbf{u}^{(j)} + \mathbf{r}^{(j+1)} \;.\]

It is quite common for one of the update steps (ADMM.xstep in particular) to make use of pre-computed values, such as factorisations of matrices involved in computing the update. If these pre-computed values depend on the penalty parameter self.rho they need to be recomputed when the penalty parameter changes when the AutoRho mechanism is enabled (see ADMM.update_rho); this will take place automatically if ADMM.rhochange is overridden with a method that updates these pre-computed values.

Constraint Definition

The ADMM problem constraint is defined by the following methods, which define the linear operators \(A\) and \(B\) and the transpose of \(A\) (required for computing the dual residual), and the constant vector \(\mathbf{c}\):


A derived class implementing a fully-specified ADMM problem (as opposed to a partial specialisation) will usually define all of these methods. If it does not, it is necessary to override all of the methods in Residual Evaluation.

Residual Evaluation

The following methods support evaluation of the primal and dual residuals:

  • ADMM.rsdl_r

    Compute primal residual

    \[\mathbf{r} = A\mathbf{x}^{(j+1)} + B\mathbf{y}^{(j+1)} - \mathbf{c}\]
  • ADMM.rsdl_s

    Compute dual residual

    \[\mathbf{s} = \rho A^T B (\mathbf{y}^{(j+1)} - \mathbf{y}^{(j)})\]
  • ADMM.rsdl_rn

    Compute primal residual normalisation factor

    \[\mathrm{rn} = \mathrm{max}(\|A\mathbf{x}^{(j+1)}\|_2, \|B\mathbf{y}^{(j+1)}\|_2, \|\mathbf{c}\|_2)\]
  • ADMM.rsdl_sn

    Compute dual residual normalisation factor

    \[\mathrm{sn} = \rho \|A^T \mathbf{u}^{(j+1)} \|_2\]

These methods need not be overridden if those in Constraint Definition are defined since ADMM includes definitions of ADMM.rsdl_r, ADMM.rsdl_s, ADMM.rsdl_rn, and ADMM.rsdl_sn in terms of ADMM.cnst_A, ADMM.cnst_AT, ADMM.cnst_B, and ADMM.cnst_c.

Iteration Statistics

There is a flexible but relatively complex mechanism supporting the recording of statistics such as objective function and residual values for each iteration.

IterationStats Definition

These statistics are recorded as a collections.namedtuple class, self.IterationStats. The fields of this namedtuple are defined by class method IterativeSolver.itstat_fields, which returns a tuple of fields consisting of the following components:

  • Iter : Iteration number

  • A tuple of field names in ADMM.itstat_fields_objfn : Fields representing the objective function and and its individual terms

  • PrimalRsdl : Norm of primal residual

  • DualRsdl : Norm of dual Residual

  • EpsPrimal : Primal residual stopping tolerance \(\epsilon_{\mathrm{pri}}\)

  • EpsDual : Dual residual stopping tolerance \(\epsilon_{\mathrm{dua}}\)

  • Rho : Penalty parameter

  • A tuple of field names in ADMM.itstat_fields_extra : Optional extra fields

  • Time : Cumulative run time

In most cases a derived class will simply override ADMM.itstat_fields_objfn and ADMM.itstat_fields_extra to customise the desired iteration statistics fields, but if more flexibility is required, IterativeSolver.itstat_fields should be overridden.

IterationStats Construction

The actual construction of the self.IterationStats namedtuple for each iteration is performed by ADMM.iteration_stats, which expects that self.IterationStats follows the structure defined by IterativeSolver.itstat_fields. Tuples of values corresponding to the fields defined in ADMM.itstat_fields_objfn and ADMM.itstat_fields_extra should be returned by ADMM.eval_objfn and ADMM.itstat_extra respectively.

In ADMM, ADMM.itstat_fields_objfn is defined as the tuple ('ObjFun', 'FVal', 'GVal'), and ADMM.eval_objfn constructs the corresponding field values by calls to ADMM.obfn_f and ADMM.obfn_g, which are expected to return the values of \(f(\mathbf{x})\) and \(g(\mathbf{y})\) respectively. In the simplest case it is sufficient to just define ADMM.obfn_f and ADMM.obfn_g in a derived class, but in most cases one would instead override ADMM.itstat_fields_objfn and ADMM.eval_objfn (and possibly ADMM.itstat_fields_extra and ADMM.itstat_extra as well).

Status Display

When option Verbose is enabled, a summary of the iterations statistics is printed to the standard output. The printing of this summary is controlled by ADMM.display_start, ADMM.display_status, and ADMM.display_end, which will usually not need to be overridden since there is a flexible method of customising the information displayed by these methods.

Class method ADMM.hdrtxt returns a tuple of strings which will be displayed as the headings for the displayed columns of iteration statistics, and class method ADMM.hdrval constructs a dictionary that defines a mapping between these column heading strings and corresponding field names in the self.IterationStats namedtuple. These two methods can be overridden if necessary, but in most cases it is sufficient to override ADMM.hdrtxt_objfn and ADMM.hdrval_objfun, which respectively define the header strings and mappings for the statistics related to the objective function (see ADMM.itstat_fields_objfn and ADMM.eval_objfn in IterationStats Construction).