Sensitivity Analysis and Gradients

Introduction

The derivative functionality (backward, derivative, gradient, delta, requires_grad) is not implemented in CVXR at this time. The code below illustrates the intended API, modeled on CVXPY, but is not executable.

An optimization problem can be viewed as a function mapping parameters to solutions. This solution map is sometimes differentiable. CVXR has built-in support for computing the derivative of the optimal variable values of a problem with respect to small perturbations of the parameters (i.e., the Parameter instances appearing in a problem).

The Problem class exposes two methods related to computing the derivative:

- derivative() evaluates the derivative given perturbations to the parameters. This lets you calculate how the solution to a problem would change given small changes to the parameters, without re-solving the problem.
- backward() evaluates the adjoint of the derivative, computing the gradient of the solution with respect to the parameters. This can be useful when combined with automatic differentiation software.

The derivative() and backward() methods are only meaningful when the problem contains parameters. In order for a problem to be differentiable, it must be DPP-compliant. CVXR can compute the derivative of any DPP-compliant DCP or DGP problem. At non-differentiable points, CVXR computes a heuristic quantity.

Example: A Trivial Quadratic

As a first example, we solve a trivial problem with an analytical solution, to illustrate the usage of backward() and derivative(). We construct a problem with a scalar variable x and a scalar parameter p. The problem is to minimize the quadratic (x - 2p)^2:

x <- Variable()
p <- Parameter()
quadratic <- power(x - 2 * p, 2)
problem <- Problem(Minimize(quadratic))
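Since the CVXR derivative API is illustrative only, the analytical solution of this toy problem can be checked directly in base R; the sketch below uses optimize() and is independent of CVXR (p_val and f are hypothetical local names, not part of the problem above):

```r
# Minimize (x - 2p)^2 for p = 3 using base R's one-dimensional optimizer.
# The minimizer should be x* = 2p = 6, with objective value 0.
p_val <- 3
f <- function(x) (x - 2 * p_val)^2
opt <- optimize(f, interval = c(-10, 10))
opt$minimum    # close to 6 (within optimize()'s default tolerance)
opt$objective  # close to 0
```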
Next, we solve the problem for the particular value p = 3, passing requires_grad = TRUE to psolve():
value(p) <- 3.0
result <- psolve(problem, requires_grad = TRUE)
cat("Optimal value:", result$value, "\n")
cat("x:", value(x), "\n")

Using backward()
Having solved the problem with requires_grad = TRUE, we can now use backward() to differentiate through the problem. We compute the gradient of the solution with respect to its parameter by calling backward(). As a side-effect, backward() populates the gradient() attribute on all parameters with the gradient of the solution with respect to that parameter.
backward(problem)
cat("The gradient is", gradient(p), "\n")

In this case, the problem has the trivial analytical solution x* = 2p, so the gradient of the solution with respect to p is 2.
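The analytical gradient can be confirmed numerically with a finite difference on the solution map; a base R sketch (x_star below is the hand-derived solution map of this toy problem, not a CVXR function):

```r
# Solution map of the toy problem: x*(p) = 2p, so dx*/dp = 2.
x_star <- function(p) 2 * p   # analytical solution map
h <- 1e-6                     # finite-difference step
grad_fd <- (x_star(3 + h) - x_star(3)) / h
grad_fd                       # 2, since the map is linear in p
```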
Using derivative()
Next, we use derivative() to see how a small change in p would affect the solution x. We perturb p by setting delta(p) <- 1e-5; calling derivative() then populates the delta() attribute of x with the change in x predicted by a first-order approximation (delta(x) ≈ (dx*/dp) delta(p)).
delta(p) <- 1e-5
derivative(problem)
cat("x delta is", delta(x), "\n")

In this case the solution is trivial and its derivative is just dx*/dp = 2, so the predicted change in x is 2e-5.
We emphasize that this example is trivial, because it has a trivial analytical solution with a trivial derivative. The backward() and derivative() methods are useful because the vast majority of convex optimization problems do not have analytical solutions: in these cases, CVXR can compute solutions and their derivatives, even though it would be impossible to derive them by hand.
When to Use backward() vs. derivative()
- backward() should be used when you need the gradient of (a scalar-valued function of) the solution with respect to the parameters.
- derivative() should be used for sensitivity analysis, i.e., when you want to know how the solution would change if one or more parameters were changed.
When there are multiple variables, it is much more efficient to compute sensitivities using derivative() than it would be to compute the entire Jacobian (which can be done by calling backward() multiple times, once for each standard basis vector).
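The trade-off can be sketched in base R with a made-up vector solution map (solution_map below is hypothetical, not a CVXR function): a single directional finite difference plays the role of derivative(), while assembling the full Jacobian requires one pass per standard basis vector:

```r
solution_map <- function(p) c(2 * p[1], p[1] + 3 * p[2])  # hypothetical solution map
p0 <- c(3, 1)
h  <- 1e-6

# Sensitivity in one direction (one pass, like derivative()):
dp   <- c(1, 0)
sens <- (solution_map(p0 + h * dp) - solution_map(p0)) / h

# Full Jacobian: one pass per standard basis vector.
J <- sapply(seq_along(p0), function(i) {
  e <- replace(numeric(length(p0)), i, 1)
  (solution_map(p0 + h * e) - solution_map(p0)) / h
})
sens  # matches the first column of J
J     # approximately rbind(c(2, 0), c(1, 3))
```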
A Note on backward() with Multiple Variables
In this simple example, the variable x was a scalar, so backward() computed the gradient of x with respect to p. When there is more than one scalar variable, by default, backward() computes the gradient of the sum of the optimal variable values with respect to the parameters.
More generally, backward() can be used to compute the gradient of a scalar-valued function f of the solution: if dx is the gradient of f with respect to the optimal value of x, set gradient(x) <- dx before calling backward(problem), and backward() will compute the gradient of f with respect to the parameters via the chain rule.
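For the trivial quadratic above, this chain rule can be checked by hand in base R (x_star and f below are hypothetical local functions, independent of CVXR): with f(x) = x^2 and solution map x*(p) = 2p, the gradient is d f(x*(p))/dp = f'(x*) * dx*/dp = 2(2p) * 2 = 8p.

```r
# Chain rule check: d f(x*(p)) / dp = f'(x*(p)) * dx*/dp.
x_star <- function(p) 2 * p   # solution map of the toy problem
f <- function(x) x^2          # scalar-valued function of the solution
p0 <- 3
h  <- 1e-6
grad_fd <- (f(x_star(p0 + h)) - f(x_star(p0))) / h
grad_fd                       # approximately 8 * p0 = 24
```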
Example: Least Squares with Regularization
Here we demonstrate sensitivity analysis on a more practical problem: a regularized least-squares problem where we want to understand how the solution changes with respect to the regularization parameter.
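This problem has a closed-form solution, x*(lambda) = (A'A + lambda*I)^{-1} A'b, and differentiating the optimality condition (A'A + lambda*I) x* = A'b with respect to lambda gives dx*/dlambda = -(A'A + lambda*I)^{-1} x*. A base R sketch (independent of CVXR, using its own copy of the data below) compares this analytic sensitivity to a finite difference:

```r
set.seed(42)
n <- 5; m <- 3
A <- matrix(rnorm(n * m), n, m)
b <- rnorm(n)

# Closed-form ridge solution: x*(lambda) = (A'A + lambda I)^{-1} A'b
ridge <- function(lambda) solve(crossprod(A) + lambda * diag(m), crossprod(A, b))

lambda <- 1.0
x_opt  <- ridge(lambda)

# Analytic sensitivity: dx*/dlambda = -(A'A + lambda I)^{-1} x*
dx_analytic <- -solve(crossprod(A) + lambda * diag(m), x_opt)

# Finite-difference check
h <- 1e-6
dx_fd <- (ridge(lambda + h) - ridge(lambda)) / h
max(abs(dx_analytic - dx_fd))  # small
```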
set.seed(42)
n <- 5
m <- 3
A <- matrix(rnorm(n * m), n, m)
b <- rnorm(n)
x <- Variable(m)
lam <- Parameter(nonneg = TRUE)
objective <- sum_squares(A %*% x - b) + lam * sum_squares(x)
problem <- Problem(Minimize(objective))
value(lam) <- 1.0
result <- psolve(problem, requires_grad = TRUE)
cat("Optimal value:", result$value, "\n")
cat("x:", value(x), "\n")

Now compute the gradient of the solution with respect to the regularization parameter:
backward(problem)
cat("Gradient of solution w.r.t. lambda:", gradient(lam), "\n")

And use derivative() to predict the effect of a small perturbation in lam:
delta(lam) <- 0.01
derivative(problem)
cat("Predicted change in x:", delta(x), "\n")

Session Info
R version 4.5.2 (2025-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.3
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] CVXR_1.8.0.9214
loaded via a namespace (and not attached):
[1] slam_0.1-55 cli_3.6.5 knitr_1.51 ECOSolveR_0.6.1
[5] rlang_1.1.7 xfun_0.56 clarabel_0.11.2 otel_0.2.0
[9] gurobi_13.0-1 Rglpk_0.6-5.1 highs_1.12.0-3 cccp_0.3-3
[13] scs_3.2.7 S7_0.2.1 jsonlite_2.0.0 Rcplex_0.3-8
[17] backports_1.5.0 htmltools_0.5.9 Rmosek_11.1.1 gmp_0.7-5.1
[21] piqp_0.6.2 rmarkdown_2.30 grid_4.5.2 evaluate_1.0.5
[25] fastmap_1.2.0 yaml_2.3.12 compiler_4.5.2 codetools_0.2-20
[29] htmlwidgets_1.6.4 Rcpp_1.1.1 osqp_1.0.0 lattice_0.22-9
[33] digest_0.6.39 checkmate_2.3.4 Matrix_1.7-4 tools_4.5.2
References
- Agrawal, A., Barratt, S., Boyd, S., Busseti, E., Moursi, W. M. (2019). Differentiating through a cone program. Journal of Applied and Numerical Optimization, 1(2), 107–115.
- Agrawal, A., Verschueren, R., Diamond, S., Boyd, S. (2018). A rewriting system for convex optimization problems. Journal of Control and Decision, 5(1), 42–60.