Custom objective terms
ProximalAlgorithms relies on the first-order primitives defined in ProximalCore. While ProximalOperators provides a rich library of function types implementing these primitives, one may need to formulate problems using custom objective terms. In that case, it is enough to implement the appropriate first-order primitive, $\nabla f$ or $\operatorname{prox}_{\gamma f}$ or both, for algorithms to be able to work with $f$.
Defining the proximal mapping for a custom function type requires adding a method for ProximalCore.prox!.
To compute gradients, ProximalAlgorithms provides a fallback definition for ProximalCore.gradient!, relying on Zygote for automatic differentiation. Therefore, you can provide any differentiable Julia function wherever gradients need to be taken, and everything will work out of the box. If, however, you would like to provide your own gradient implementation (e.g. for efficiency reasons), simply implement a method for ProximalCore.gradient!.
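As a minimal sketch of a hand-written gradient, consider a custom quadratic term; the Quadratic type and its fields below are illustrative, not part of any package:

```julia
using LinearAlgebra
using ProximalCore

# Hypothetical custom term f(x) = x'Qx/2 + q'x with a hand-written gradient.
struct Quadratic{M,V}
    Q::M
    q::V
end

(f::Quadratic)(x) = dot(x, f.Q, x) / 2 + dot(f.q, x)

function ProximalCore.gradient!(y, f::Quadratic, x)
    mul!(y, f.Q, x)   # y = Q x
    y .+= f.q         # y = Q x + q, the gradient of f at x
    return dot(x, y) / 2 + dot(f.q, x) / 2   # f(x), reusing y = Q x + q
end
```

Since the value of f can be recovered from the gradient computation, returning it from gradient! avoids a second pass over the data.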
ProximalCore.prox — Function

prox(f, x, gamma=1)

Proximal mapping for f, evaluated at x, with stepsize gamma.

The proximal mapping is defined as

\[\mathrm{prox}_{\gamma f}(x) = \arg\min_z \left\{ f(z) + \tfrac{1}{2\gamma}\|z-x\|^2 \right\}.\]

Returns a tuple (y, fy) consisting of
- y: the output of the proximal mapping of f at x with stepsize gamma
- fy: the value of f at y

See also: prox!.
ProximalCore.prox! — Function

prox!(y, f, x, gamma=1)

In-place proximal mapping for f, evaluated at x, with stepsize gamma.

The proximal mapping is defined as

\[\mathrm{prox}_{\gamma f}(x) = \arg\min_z \left\{ f(z) + \tfrac{1}{2\gamma}\|z-x\|^2 \right\}.\]

The result is written to the (pre-allocated) array y, which should have the same shape/size as x.

Returns the value of f at y.

See also: prox.
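As a minimal sketch of the interface (assuming, as in this package, that the allocating prox falls back to the in-place prox!): for the illustrative term f(x) = ‖x‖²/2 the proximal mapping has the closed form x/(1+γ):

```julia
using ProximalCore

# Illustrative type: f(x) = ||x||^2 / 2, with closed-form prox x / (1 + gamma).
struct SqrNormHalf end

(::SqrNormHalf)(x) = sum(abs2, x) / 2

function ProximalCore.prox!(y, ::SqrNormHalf, x, gamma)
    y .= x ./ (1 + gamma)
    return sum(abs2, y) / 2   # value of f at y
end

y, fy = ProximalCore.prox(SqrNormHalf(), [1.0, 2.0], 1.0)
# expect y ≈ [0.5, 1.0] and fy ≈ 0.625
```

Only prox! needs to be implemented; the allocating prox then comes for free.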
ProximalCore.gradient — Function

gradient(f, x)

Gradient (and value) of f at x.

Returns a tuple (y, fx) consisting of
- y: the gradient of f at x
- fx: the value of f at x

See also: gradient!.
ProximalCore.gradient! — Function

gradient!(y, f, x)

In-place gradient (and value) of f at x.

The gradient is written to the (pre-allocated) array y, which should have the same shape/size as x.

Returns the value of f at x.

See also: gradient.
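For example, with ProximalAlgorithms loaded (which, as noted above, provides the Zygote-based fallback), a plain Julia function can be differentiated directly; a small sketch:

```julia
using ProximalAlgorithms  # loads the Zygote-based gradient! fallback
using ProximalCore

# Gradient of x -> ||x||^2 is 2x; value at [1, 2] is 5.
grad, fx = ProximalCore.gradient(x -> sum(abs2, x), [1.0, 2.0])
# expect grad ≈ [2.0, 4.0] and fx ≈ 5.0
```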
Example: constrained Rosenbrock
Let's try to minimize the celebrated Rosenbrock function, but constrained to the unit norm ball. The cost function is
rosenbrock2D(x) = 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
rosenbrock2D (generic function with 1 method)
To enforce the constraint, we define the indicator of the unit ball, together with its proximal mapping: this is simply projection onto the unit norm ball, so it is sufficient to normalize any given point that lies outside of the set.
using LinearAlgebra
using ProximalCore
struct IndUnitBall end
(::IndUnitBall)(x) = norm(x) > 1 ? eltype(x)(Inf) : eltype(x)(0)
function ProximalCore.prox!(y, ::IndUnitBall, x, gamma)
    if norm(x) > 1
        y .= x ./ norm(x)
    else
        y .= x
    end
    return zero(eltype(x))
end
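Before plugging this into an algorithm, one can sanity-check the projection through the allocating prox (assuming, as above, that it falls back to the prox! method just defined):

```julia
# Continuing the session above: project a point lying outside the ball.
y, fy = ProximalCore.prox(IndUnitBall(), [2.0, 0.0])
# expect y ≈ [1.0, 0.0] (the normalized point) and fy == 0.0
```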
We can now minimize the function, for which we will use PANOC, a Newton-type method:
using ProximalAlgorithms
panoc = ProximalAlgorithms.PANOC()
solution, iterations = panoc(x0=-ones(2), f=rosenbrock2D, g=IndUnitBall())
([0.7864151541667157, 0.6176983125255732], 36)
Plotting the solution against the contour of the cost function and the constraint set gives an idea of its correctness.
using Plots
contour(-2:0.1:2, -2:0.1:2, (x,y) -> rosenbrock2D([x, y]), fill=true, framestyle=:none, background=nothing)
plot!(Shape(cos.(0:0.01:2*pi), sin.(0:0.01:2*pi)), opacity=.5, label="feasible set")
scatter!([solution[1]], [solution[2]], color=:red, markershape=:star5, label="computed solution")
Example: counting operations
It is often interesting to measure how many operations (gradient or prox evaluations) an algorithm takes. In algorithms involving backtracking or other line-search logic, the iteration count may not be representative of the number of operations being performed; some implementations may also require additional operations when checking stopping conditions. All of this makes it difficult to quantify the exact iteration complexity.
We can achieve this by wrapping functions in a dedicated Counting type:
mutable struct Counting{T}
    f::T
    gradient_count::Int
    prox_count::Int
end

Counting(f::T) where T = Counting{T}(f, 0, 0)
Main.Counting
Now we only need to intercept any call to gradient! and prox! and increase the counters there:
function ProximalCore.gradient!(y, f::Counting, x)
    f.gradient_count += 1
    return ProximalCore.gradient!(y, f.f, x)
end

function ProximalCore.prox!(y, f::Counting, x, gamma)
    f.prox_count += 1
    return ProximalCore.prox!(y, f.f, x, gamma)
end
We can run the previous example again, this time wrapping the objective terms within Counting:
f = Counting(rosenbrock2D)
g = Counting(IndUnitBall())
solution, iterations = panoc(x0=-ones(2), f=f, g=g)
([0.7864151541667157, 0.6176983125255732], 36)
and check how many operations were actually performed:
println(f.gradient_count)
println(g.prox_count)
115
79
This page was generated using Literate.jl.