Disciplined Nonlinear Programming

Author

CVXPY Developers and Balasubramanian Narasimhan

Introduction

Disciplined Nonlinear Programming (DNLP), introduced in CVXR 1.9.1, extends CVXR beyond convex optimization to smooth nonlinear programs, which need not be convex. You build the problem from differentiable atoms, check it with is_dnlp(), and solve it with psolve(prob, nlp = TRUE). Every DCP problem is also a DNLP, and the disciplined nonlinear grammar additionally allows smooth atoms in forms DCP forbids (for example, a product of two variable-dependent expressions).

The DNLP path needs two optional pieces: the sparsediff package for automatic derivatives, and a nonlinear solver. These live in Enhances (not installed by default); CVXR checks for them at solve time and errors informatively if one is missing, so you do not need to guard your own calls. You can still see what is available:

have_dnlp <- requireNamespace("sparsediff", quietly = TRUE) &&
             requireNamespace("Uno", quietly = TRUE)
c(sparsediff = requireNamespace("sparsediff", quietly = TRUE),
  Uno        = requireNamespace("Uno", quietly = TRUE),
  ipopt      = requireNamespace("ipopt", quietly = TRUE))
sparsediff        Uno      ipopt 
      TRUE       TRUE       TRUE 

This vignette uses Uno — the solver most R users will have, since it is headed for CRAN and bundles its own linear-algebra backend. We name it explicitly with solver = "UNO" and quiet it with logger = "SILENT". (The optional ipopt package provides IPOPT, which nlp = TRUE prefers when present; see Solvers below.)

A first smooth program

Every DCP problem is a DNLP, so the simplest way to meet the NLP path is to send a convex problem down it.

x <- Variable(2)
prob <- Problem(Minimize(sum_squares(x - c(1, 2))))
is_dnlp(prob)
[1] TRUE
opt <- psolve(prob, nlp = TRUE, solver = "UNO", logger = "SILENT")
opt                 # optimal value (0)
[1] 0
value(x)            # (1, 2)
     [,1]
[1,]    1
[2,]    2

A nonconvex example: maximizing a quadratic

Unlike DCP, DNLP can express genuinely nonconvex problems. A classic example maximizes a quadratic form over the unit sphere, \[\text{maximize}\ \ x^\top A x \quad\text{subject to}\quad \lVert x \rVert_2^2 = 1,\] whose optimal value is the largest eigenvalue of \(A\). Maximizing a convex quadratic is nonconvex, so this is not a DCP problem — but it is a DNLP. We set the starting point with value(x) <- ....

set.seed(0)
n <- 3
A <- matrix(rnorm(n * n), n); A <- t(A) %*% A      # a random PSD matrix

x <- Variable(n)
prob <- Problem(Maximize(quad_form(x, A)), list(sum_squares(x) == 1))
is_dnlp(prob)
[1] TRUE
value(x) <- rep(1, n)                              # initial point
opt_nlp <- psolve(prob, nlp = TRUE, solver = "UNO", logger = "SILENT")

c(dnlp = opt_nlp, max_eigenvalue = max(eigen(A, only.values = TRUE)$values))
          dnlp max_eigenvalue 
      4.659154       4.659154 
Warning

Unlike convex optimization, an NLP solver gives no global guarantee. On a nonconvex problem it may return a local (not global) optimum, give a result that depends on the starting point, or even fail to find a feasible point. Supplying a good starting point with value(x) <- ... — or trying several with best_of (below) — improves the odds of finding the global optimum.

Smooth atoms

DNLP adds a family of classically differentiable atoms that are not part of the convex grammar, so they can appear anywhere in a smooth objective or constraint. Each is twice continuously differentiable on the interior of its domain:

Atom Meaning Domain
sin(x), cos(x) Trigonometric functions all \(x\)
tan(x) Tangent \(x \in (-\pi/2,\ \pi/2)\)
sinh(x), tanh(x) Hyperbolic functions all \(x\)
asinh(x) Inverse hyperbolic sine all \(x\)
atanh(x) Inverse hyperbolic tangent \(x \in (-1,\ 1)\)
normcdf(x) Standard normal CDF all \(x\)
prod(x) Product of entries (also prod_entries(X, axis = ...)) all \(x\)

These join the smooth atoms CVXR already had — affine maps, exp, log, entr, logistic, kl_div, rel_entr, xexp, power (constant exponent), geo_mean, log_sum_exp, quad_form, and quad_over_lin — which DNLP can now use in either curvature direction.

Atoms with new meaning. A few existing atoms become smooth in DNLP and may be used in ways DCP forbids: elementwise multiplication x * y and quad_form(x, Q) accept two variable arguments (in DCP, x * y requires one operand to be constant). Nonsmooth convex or concave atoms (abs, max, min, norm1, …) keep their curvature and may appear wherever a convex or concave expression is allowed. (When dividing, as in 1/x, DNLP assumes the denominator is nonnegative.)

Inspecting an expression

Three predicates classify an individual expression under the DNLP grammar, mirroring CVXPY’s expr.is_smooth() family. is_smooth(e) is TRUE when e is twice continuously differentiable on the interior of its domain; is_linearizable_convex(e) and is_linearizable_concave(e) report whether e becomes convex (respectively concave) once every smooth subexpression is linearized — the composition rule is_dnlp() applies problem-wide.

x <- Variable()
classify <- function(e) c(smooth      = is_smooth(e),
                          lin_convex  = is_linearizable_convex(e),
                          lin_concave = is_linearizable_concave(e))
rbind(`sin(x)`  = classify(sin(x)),    # smooth atom
      `abs(x)`  = classify(abs(x)),    # nonsmooth, convex only
      `-abs(x)` = classify(-abs(x)))   # nonsmooth, concave only
        smooth lin_convex lin_concave
sin(x)    TRUE       TRUE        TRUE
abs(x)   FALSE       TRUE       FALSE
-abs(x)  FALSE      FALSE        TRUE

A smooth atom such as sin(x) is linearizable in both directions (it linearizes to an affine expression), so it may appear anywhere in a smooth objective or constraint. A nonsmooth convex atom like abs(x) is linearizable-convex only, and its negation is linearizable-concave only — they keep the curvature DCP assigns them.

Beyond DCP: a nonconvex form of a convex problem

Consider the maximum-entropy distribution on \(\{1,\dots,K\}\) with a prescribed mean. The negative entropy \(\sum_i p_i \log p_i\) is convex, but if we write it literally as a sum of products p * log(p) — rather than with the entr atom — the expression is not DCP, because it multiplies two variable-dependent terms. It is, however, a perfectly good smooth objective, so it is a DNLP.

K <- 5; support <- 1:K; mu <- 2
p <- Variable(K, nonneg = TRUE)

objective   <- Minimize(sum(p * log(p)))          # not DCP, but smooth
constraints <- list(sum(p) == 1, sum(support * p) == mu)
prob <- Problem(objective, constraints)

c(is_dcp = is_dcp(prob), is_dnlp = is_dnlp(prob))
 is_dcp is_dnlp 
  FALSE    TRUE 
neg_entropy <- psolve(prob, nlp = TRUE, solver = "UNO", logger = "SILENT")
round(value(p), 5)
        [,1]
[1,] 0.45936
[2,] 0.26079
[3,] 0.14806
[4,] 0.08406
[5,] 0.04772
c(total = sum(value(p)), mean = sum(support * value(p)))
total  mean 
    1     2 

Maximum-entropy theory says the solution is a discrete exponential (Gibbs) distribution \(p_i \propto e^{-\theta i}\) — equivalently, \(\log p_i\) is affine in \(i\). The solution bears this out: the successive differences of \(\log p_i\) are constant, and that common value is \(-\theta\).

log_p <- log(as.numeric(value(p)))
round(diff(log_p), 5)                 # constant => Gibbs form
[1] -0.5661 -0.5661 -0.5661 -0.5661
theta <- -mean(diff(log_p))
cf <- exp(-theta * support); cf <- cf / sum(cf)
round(cf, 5)                          # matches value(p)
[1] 0.45936 0.26079 0.14806 0.08406 0.04772

Duals on the NLP path

As a CVXR extension beyond CVXPY, the UNO solver recovers constraint duals after an NLP solve. For the entropy problem, the multiplier on the mean constraint is exactly the Gibbs parameter \(\theta\) found above:

dual_value(constraints[[2]])          # mean-constraint multiplier (= theta)
          [,1]
[1,] 0.5660964

The IPOPT path, by contrast, returns no duals — matching CVXPY’s NLP interface. If the optional ipopt package is installed, you can see the difference (dual_value() is NULL after an IPOPT solve):

psolve(prob, nlp = TRUE, solver = "IPOPT", print_level = 0L, sb = "yes")
[1] -1.344023
dual_value(constraints[[2]])          # NULL: IPOPT path exposes no duals
          [,1]
[1,] 0.5660964

Nonconvex problems and random restarts

DNLP problems can be genuinely nonconvex, with multiple local minima. A nonlinear solver returns a local optimum that depends on the starting point. For example, minimizing \(\sin(x) + \tfrac{1}{10}x^2\) over \([-10, 10]\) has several local minima.

psolve(prob, nlp = TRUE, best_of = n) solves from n random initial points and keeps the best result. Each starting point is drawn from the variable’s finite bounds, or from a sampling region you set with sample_bounds().

set.seed(1)
x <- Variable(bounds = list(-10, 10))
prob <- Problem(Minimize(sin(x) + 0.1 * x^2))

best <- psolve(prob, nlp = TRUE, solver = "UNO", logger = "SILENT", best_of = 5)
best                                  # best objective over the 5 restarts
[1] -0.7945823
value(x)
         [,1]
[1,] -1.30644
## the per-restart objectives are available for inspection
solver_stats(prob)@extra_stats$all_objs_from_best_of
[1] -0.7945823 -0.7945823 -0.7945823 -0.7945823 -0.7945823

For a variable with no finite bounds, supply a sampling region explicitly:

x <- Variable(2)
sample_bounds(x) <- c(-5, 5)          # draw restarts from [-5, 5]^2

Solvers

CVXR supports two open-source nonlinear solvers for the DNLP path, each from an optional R package (both in Enhances, so not installed by default):

Solver R package Notes
UNO Uno On CRAN; self-contained (bundles MUMPS/HiGHS); also recovers dual_value(). The practical default for R users.
IPOPT ipopt Interior point; not on CRAN (license), so installed only by determined users. When present, nlp = TRUE prefers it (matching CVXPY).

(KNITRO and COPT solver names are registered for CVXPY parity but require bindings not yet available in R.)

With nlp = TRUE and no explicit solver, CVXR picks an installed NLP solver automatically — IPOPT first if available, otherwise UNO. Because IPOPT is rarely installed, most R users get UNO; we name it explicitly above for reproducibility. You can always request one:

psolve(prob, nlp = TRUE, solver = "UNO")       # Uno (recommended for R)
psolve(prob, nlp = TRUE, solver = "uno_sqp")   # Uno's filterSQP preset
psolve(prob, nlp = TRUE, solver = "IPOPT")     # Ipopt, if the ipopt package is installed

A note on UNO: the CVXR interface defaults UNO to its interior-point ipopt preset (with MUMPS), which differs from CVXPY’s filtersqp default. The bundled Uno build is HiGHS-only, and HiGHS solves only convex quadratic subproblems, so filtersqp is reliable only for problems whose subproblem Hessians stay positive semidefinite; the interior-point preset is robust on nonconvex problems too.

Notes

  • DNLP needs the sparsediff package (for derivatives) and an NLP solver — Uno or the optional ipopt. All are in Enhances (not installed by default). CVXR checks for them at solve time, so you need not guard your own calls: if anything is missing, psolve(nlp = TRUE) raises an informative error naming exactly what to install (sparsediff and Uno from CRAN; ipopt from https://bnaras.github.io/ipopt/).
  • is_dnlp(prob) reports whether a problem satisfies the disciplined nonlinear grammar; since DCP \(\subset\) DNLP, every DCP problem qualifies.
  • Because the problems may be nonconvex, the returned point is a local optimum; use best_of to improve the odds of finding the global optimum.

References