Custom objective terms

ProximalAlgorithms relies on the first-order primitives defined in ProximalCore. While ProximalOperators provides a rich library of function types implementing such primitives, one may need to formulate problems using custom objective terms. When that is the case, one only needs to implement the appropriate first-order primitive, $\nabla f$ or $\operatorname{prox}_{\gamma f}$ or both, for algorithms to be able to work with $f$.

Defining the proximal mapping for a custom function type requires adding a method for ProximalCore.prox!.

To compute gradients, algorithms use value_and_gradient_closure: when functions are wrapped in AutoDifferentiable, this relies on AbstractDifferentiation for automatic differentiation with any of its supported backends, as the examples below show.

If, however, you would like to provide your own gradient implementation (e.g. for efficiency reasons), you can simply implement a method for value_and_gradient_closure on your own function type.
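
For example, here is a minimal sketch of such a method for a hypothetical type representing f(x) = ½‖x − b‖² (the HalfSquaredDistance name is made up for illustration); the second returned value is a closure that yields the gradient when called, consistently with how it is used in the counting example further below.

using ProximalAlgorithms

# Hypothetical type representing f(x) = (1/2) * ||x - b||^2 for a fixed vector b.
struct HalfSquaredDistance{T}
    b::T
end

(f::HalfSquaredDistance)(x) = sum(abs2, x - f.b) / 2

function ProximalAlgorithms.value_and_gradient_closure(f::HalfSquaredDistance, x)
    res = x - f.b
    # Return the value of f at x and a closure computing the gradient x - b.
    return sum(abs2, res) / 2, () -> res
end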

ProximalCore.prox — Function
prox(f, x, gamma=1)

Proximal mapping for f, evaluated at x, with stepsize gamma.

The proximal mapping is defined as

\[\mathrm{prox}_{\gamma f}(x) = \arg\min_z \left\{ f(z) + \tfrac{1}{2\gamma}\|z-x\|^2 \right\}.\]

Returns a tuple (y, fy) consisting of

  • y: the output of the proximal mapping of f at x with stepsize gamma
  • fy: the value of f at y

See also: prox!.

ProximalCore.prox! — Function
prox!(y, f, x, gamma=1)

In-place proximal mapping for f, evaluated at x, with stepsize gamma.

The proximal mapping is defined as

\[\mathrm{prox}_{\gamma f}(x) = \arg\min_z \left\{ f(z) + \tfrac{1}{2\gamma}\|z-x\|^2 \right\}.\]

The result is written to the (pre-allocated) array y, which should have the same shape/size as x.

Returns the value of f at y.

See also: prox.

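To make the interface above concrete, here is a minimal sketch defining prox! for a custom type and then calling prox on it; the SquaredTwoNorm type and its closed-form proximal mapping, prox_{γf}(x) = x / (1 + γ) for f(x) = ½‖x‖², are used purely for illustration, and the out-of-place prox is assumed to fall back to prox! on a freshly allocated output.

using ProximalCore

# Hypothetical function type: f(x) = (1/2) * ||x||^2,
# whose proximal mapping has the closed form prox_{γf}(x) = x / (1 + γ).
struct SquaredTwoNorm end

(::SquaredTwoNorm)(x) = sum(abs2, x) / 2

function ProximalCore.prox!(y, ::SquaredTwoNorm, x, gamma)
    y .= x ./ (1 + gamma)
    return sum(abs2, y) / 2
end

y, fy = ProximalCore.prox(SquaredTwoNorm(), [2.0, -4.0], 1.0)  # y == [1.0, -2.0]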

Example: constrained Rosenbrock

Let's try to minimize the celebrated Rosenbrock function, but constrained to the unit norm ball. The cost function is

using Zygote
using AbstractDifferentiation: ZygoteBackend
using ProximalAlgorithms

rosenbrock2D = ProximalAlgorithms.AutoDifferentiable(
    x -> 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2,
    ZygoteBackend(),
)
ProximalAlgorithms.AutoDifferentiable{Main.var"#1#2", AbstractDifferentiation.ReverseRuleConfigBackend{Zygote.ZygoteRuleConfig{Zygote.Context{false}}}}(Main.var"#1#2"(), AbstractDifferentiation.ReverseRuleConfigBackend{Zygote.ZygoteRuleConfig{Zygote.Context{false}}}(Zygote.ZygoteRuleConfig{Zygote.Context{false}}(Zygote.Context{false}(nothing))))

To enforce the constraint, we define the indicator of the unit ball, together with its proximal mapping: this is simply the projection onto the unit norm ball, so it is sufficient to normalize any given point that lies outside the set.

using LinearAlgebra
using ProximalCore

struct IndUnitBall end

(::IndUnitBall)(x) = norm(x) > 1 ? eltype(x)(Inf) : eltype(x)(0)

function ProximalCore.prox!(y, ::IndUnitBall, x, gamma)
    if norm(x) > 1
        y .= x ./ norm(x)
    else
        y .= x
    end
    return zero(eltype(x))
end
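
As a quick sanity check, we can evaluate the proximal mapping (through the out-of-place prox, which is assumed to fall back to the prox! method above) at a point outside the ball and verify that it gets projected onto the boundary; the values below simply follow from the definition above.

y, fy = ProximalCore.prox(IndUnitBall(), [2.0, 0.0], 1.0)
# y == [1.0, 0.0], fy == 0.0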

We can now minimize the function; to do so we use PANOC, a Newton-type method:

panoc = ProximalAlgorithms.PANOC()
solution, iterations = panoc(x0 = -ones(2), f = rosenbrock2D, g = IndUnitBall())
([0.7864151541667181, 0.6176983125255701], 36)

Plotting the solution against the cost function contour and the constraint gives an idea of its correctness.

using Plots

contour(
    -2:0.1:2,
    -2:0.1:2,
    (x, y) -> rosenbrock2D([x, y]),
    fill = true,
    framestyle = :none,
    background = nothing,
)
plot!(Shape(cos.(0:0.01:2*pi), sin.(0:0.01:2*pi)), opacity = 0.5, label = "feasible set")
scatter!(
    [solution[1]],
    [solution[2]],
    color = :red,
    markershape = :star5,
    label = "computed solution",
)
(Figure: contours of the Rosenbrock function, the feasible set, and the computed solution.)

Example: counting operations

It is often interesting to measure how many operations (gradient or prox evaluations) an algorithm takes. In algorithms involving backtracking or some other line-search logic, the iteration count may not be representative of the number of operations being performed; or some implementations may require additional operations when checking stopping conditions. All of this makes it difficult to quantify the exact iteration complexity.

We can achieve this by wrapping functions in a dedicated Counting type:

mutable struct Counting{T}
    f::T
    eval_count::Int
    gradient_count::Int
    prox_count::Int
end

Counting(f::T) where {T} = Counting{T}(f, 0, 0, 0)
Main.Counting

Now we only need to intercept any call to value_and_gradient_closure and prox! and increase counters there:

function ProximalAlgorithms.value_and_gradient_closure(f::Counting, x)
    f.eval_count += 1
    fx, pb = ProximalAlgorithms.value_and_gradient_closure(f.f, x)
    function counting_pullback()
        f.gradient_count += 1
        return pb()
    end
    return fx, counting_pullback
end

function ProximalCore.prox!(y, f::Counting, x, gamma)
    f.prox_count += 1
    return ProximalCore.prox!(y, f.f, x, gamma)
end

We can now run the previous example again, this time wrapping the objective terms within Counting:

f = Counting(rosenbrock2D)
g = Counting(IndUnitBall())

solution, iterations = panoc(x0 = -ones(2), f = f, g = g)
([0.7864151541667181, 0.6176983125255701], 36)

and check how many operations were actually performed:

println("function evals: $(f.eval_count)")
println("gradient evals: $(f.gradient_count)")
println("    prox evals: $(g.prox_count)")
function evals: 115
gradient evals: 107
    prox evals: 79

This page was generated using Literate.jl.