Prox and gradient
ProximalOperators relies on the first-order primitives defined in ProximalCore. The following methods allow one to evaluate the proximal mapping (and the gradient, when defined) of functions constructed as described in Functions and Calculus rules.
ProximalCore.prox — Function

    prox(f, x, gamma=1)

Proximal mapping for `f`, evaluated at `x`, with stepsize `gamma`.

The proximal mapping is defined as

\[\mathrm{prox}_{\gamma f}(x) = \arg\min_z \left\{ f(z) + \tfrac{1}{2\gamma}\|z-x\|^2 \right\}.\]

Returns a tuple `(y, fy)` consisting of

- `y`: the output of the proximal mapping of `f` at `x` with stepsize `gamma`
- `fy`: the value of `f` at `y`

See also: `prox!`.
ProximalCore.prox! — Function

    prox!(y, f, x, gamma=1)

In-place proximal mapping for `f`, evaluated at `x`, with stepsize `gamma`.

The proximal mapping is defined as

\[\mathrm{prox}_{\gamma f}(x) = \arg\min_z \left\{ f(z) + \tfrac{1}{2\gamma}\|z-x\|^2 \right\}.\]

The result is written to the (pre-allocated) array `y`, which should have the same shape/size as `x`.

Returns the value of `f` at `y`.

See also: `prox`.
ProximalCore.gradient — Function

    gradient(f, x)

Gradient (and value) of `f` at `x`.

Returns a tuple `(y, fx)` consisting of

- `y`: the gradient of `f` at `x`
- `fx`: the value of `f` at `x`

See also: `gradient!`.
ProximalCore.gradient! — Function

    gradient!(y, f, x)

In-place gradient (and value) of `f` at `x`.

The gradient is written to the (pre-allocated) array `y`, which should have the same shape/size as `x`.

Returns the value of `f` at `x`.

See also: `gradient`.
Complex and matrix variables
The proximal mapping is usually discussed in the case of functions over $\mathbb{R}^n$. However, by adapting the inner product $\langle\cdot,\cdot\rangle$ and associated norm $\|\cdot\|$ adopted in its definition, one can extend the concept to functions over more general spaces. When functions of unidimensional arrays (vectors) are concerned, the standard Euclidean product and norm are used in the definition of `prox` (and therefore `prox!`, but also `gradient` and `gradient!`). These are the inner product and norm computed by `dot` and `norm` in Julia.
For bidimensional, tridimensional (matrices and tensors), and higher-dimensional arrays, the definitions of proximal mapping and gradient extend naturally, by considering the appropriate inner product. For $k$-dimensional arrays of size $n_1 \times n_2 \times \ldots \times n_k$, we consider the inner product

\[\langle A, B \rangle = \sum_{i_1,\ldots,i_k} A_{i_1,\ldots,i_k} \cdot B_{i_1,\ldots,i_k}\]

which reduces to the usual Euclidean product for unidimensional arrays, and to the trace product $\langle A, B \rangle = \mathrm{tr}(A^\top B)$ for matrices (bidimensional arrays). This inner product, and the associated norm, are again the ones computed by `dot` and `norm` in Julia.
Multiple variable blocks
By combining functions together through `SeparableSum`, the resulting function will have multiple inputs, i.e., it will be defined over the Cartesian product of the domains of the individual functions. To represent elements (points) of such a product space, here we use Julia's `Tuple` objects.
Example. Suppose that the following function needs to be represented:
\[f(x, Y) = \|x\|_1 + \|Y\|_*,\]
that is, the sum of the $L_1$ norm of some vector $x$ and the nuclear norm (the sum of the singular values) of some matrix $Y$. This is accomplished as follows:
```julia
using ProximalOperators
f = SeparableSum(NormL1(), NuclearNorm());
```
Now, function `f` is defined over pairs of appropriate `Array` objects. Likewise, the `prox` method will take pairs of `Array`s as inputs, and return pairs of `Array`s as output:
```julia
x = randn(10);     # some random vector
Y = randn(20, 30); # some random matrix
f_xY = f((x, Y));  # evaluates f at (x, Y)
(u, V), f_uV = prox(f, (x, Y), 1.3); # computes prox at (x, Y)
```
The same holds for the separable sum of more than two functions, in which case "pairs" are to be replaced with `Tuple`s of the appropriate length.
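For instance, a separable sum of three terms (here adding `IndBox` from ProximalOperators, the indicator of a box constraint) is evaluated, and its prox computed, over triples:

```julia
using ProximalOperators

g = SeparableSum(NormL1(), NuclearNorm(), IndBox(-1, 1));
x = randn(10); Y = randn(20, 30); z = randn(5);
(u, V, w), g_uVw = prox(g, (x, Y, z), 0.5);  # Tuple of three arrays in, three out
# u, V, w have the same shapes as x, Y, z respectively
```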