x <- Variable(pos = TRUE)
y <- Variable(pos = TRUE)
z <- Variable(pos = TRUE)
a <- Parameter(pos = TRUE)
b <- Parameter(pos = TRUE)
cc <- Parameter() # 'c' is reserved in R
objective_fn <- 1 / (x * y * z)
objective <- Minimize(objective_fn)
constraints <- list(a * (x * y + x * z + y * z) <= b, x >= y^cc)
problem <- Problem(objective, constraints)Derivatives Fundamentals
Introduction
This example introduces the fundamentals of computing the derivative of the solution map to optimization problems. The derivative can be used for sensitivity analysis, to see how a solution would change given small changes to the parameters, and to compute gradients of scalar-valued functions of the solution.
We will consider a simple disciplined geometric program (DGP). The geometric program under consideration is
\[ \begin{array}{ll} \text{minimize} & 1/(xyz) \\ \text{subject to} & a(xy + xz + yz) \leq b \\ & x \geq y^c, \end{array} \]
where \(x \in \mathbf{R}_{++}\), \(y \in \mathbf{R}_{++}\), and \(z \in \mathbf{R}_{++}\) are the variables, and \(a \in \mathbf{R}_{++}\), \(b \in \mathbf{R}_{++}\), and \(c \in \mathbf{R}\) are the parameters. The vector
\[ \alpha = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \]
is the vector of parameters.
Setting Up the Problem
We verify that the problem is a DGP.
is_dgp(problem)[1] TRUE
Solving the Problem
Next, we solve the problem, setting the parameters \(a\), \(b\), and \(c\) to \(2\), \(1\), and \(0.5\).
value(a) <- 2.0
value(b) <- 1.0
value(cc) <- 0.5
result <- psolve(problem, gp = TRUE, requires_grad = TRUE)
cat(sprintf("x = %f\n", value(x)))x = 0.561214
cat(sprintf("y = %f\n", value(y)))y = 0.314961
cat(sprintf("z = %f\n", value(z)))z = 0.368921
Notice the keyword argument requires_grad = TRUE; this is necessary to subsequently compute derivatives.
Solution Map
The solution map of the above problem is a function
\[ \mathcal{S} : \mathbf{R}^2_{++} \times \mathbf{R} \to \mathbf{R}^3_{++} \]
which maps the parameter vector to the vector of optimal solutions
\[ \mathcal{S}(\alpha) = \begin{bmatrix} x(\alpha) \\ y(\alpha) \\ z(\alpha) \end{bmatrix}. \]
Here, \(x(\alpha)\), \(y(\alpha)\), and \(z(\alpha)\) are the optimal values of the variables corresponding to the parameter vector.
As an example, we just saw that
\[ \mathcal{S}((2.0, 1.0, 0.5)) \approx \begin{bmatrix} 0.5612 \\ 0.3150 \\ 0.3690 \end{bmatrix}. \]
Sensitivity Analysis
When the solution map is differentiable, we can use its derivative
\[ \mathsf{D}\mathcal{S}(\alpha) \in \mathbf{R}^{3 \times 3} \]
to perform a sensitivity analysis, which studies how the solution would change given small changes to the parameters.
Suppose we perturb the parameters by a vector of small magnitude \(\mathsf{d}\alpha \in \mathbf{R}^3\). We can approximate the change \(\Delta\) in the solution due to the perturbation using the derivative, as
\[ \Delta = \mathcal{S}(\alpha + \mathsf{d}\alpha) - \mathcal{S}(\alpha) \approx \mathsf{D}\mathcal{S}(\alpha) \mathsf{d}\alpha. \]
We can compute this in CVXR as follows. Partition the perturbation as
\[ \mathsf{d}\alpha = \begin{bmatrix} \mathsf{d}a \\ \mathsf{d}b \\ \mathsf{d}c \end{bmatrix}. \]
We set the delta attributes of the parameters to their perturbations, and then call the derivative method.
da <- 1e-2
db <- 1e-2
dc <- 1e-2
delta(a) <- da
delta(b) <- db
delta(cc) <- dc
derivative(problem)The derivative method populates the delta attributes of the variables as a side-effect, with the predicted change in the variable. We can compare the predictions to the actual solution of the perturbed problem.
x_hat <- value(x) + delta(x)
y_hat <- value(y) + delta(y)
z_hat <- value(z) + delta(z)
value(a) <- value(a) + da
value(b) <- value(b) + db
value(cc) <- value(cc) + dc
psolve(problem, gp = TRUE)[1] 15.18549
cat(sprintf("x: predicted %.5f actual %.5f\n", x_hat, value(x)))x: predicted 0.55728 actual 0.55733
cat(sprintf("y: predicted %.5f actual %.5f\n", y_hat, value(y)))y: predicted 0.31782 actual 0.31782
cat(sprintf("z: predicted %.5f actual %.5f\n", z_hat, value(z)))z: predicted 0.37181 actual 0.37178
## Restore original parameter values
value(a) <- value(a) - da
value(b) <- value(b) - db
value(cc) <- value(cc) - dcIn this case, the predictions and the actual solutions are fairly close.
Gradient
We can compute the gradient of a scalar-valued function of the solution with respect to the parameters. Let \(f : \mathbf{R}^{3} \to \mathbf{R}\), and suppose we want to compute the gradient of the composition \(f \circ \mathcal{S}\). By the chain rule,
\[ \nabla f(\mathcal{S}(\alpha)) = \mathsf{D}^T\mathcal{S}(\alpha) \begin{bmatrix} \mathsf{d}x \\ \mathsf{d}y \\ \mathsf{d}z \end{bmatrix}, \]
where \(\mathsf{D}^T\mathcal{S}\) is the adjoint (or transpose) of the derivative operator, and \(\mathsf{d}x\), \(\mathsf{d}y\), and \(\mathsf{d}z\) are the partial derivatives of \(f\) with respect to its arguments.
We can compute the gradient in CVXR, using the backward method. As an example, suppose
\[ f(x, y, z) = \frac{1}{2}(x^2 + y^2 + z^2), \]
so that \(\mathsf{d}x = x\), \(\mathsf{d}y = y\), and \(\mathsf{d}z = z\). Let \(\mathsf{d}\alpha = \nabla f(\mathcal{S}(\alpha))\), and suppose we subtract \(\eta \, \mathsf{d}\alpha\) from the parameter, where \(\eta\) is a positive constant. Using the following code, we can compare \(f(\mathcal{S}(\alpha - \eta \, \mathsf{d}\alpha))\) with the value predicted by the gradient,
\[ f(\mathcal{S}(\alpha - \eta \, \mathsf{d}\alpha)) \approx f(\mathcal{S}(\alpha)) - \eta \, \mathsf{d}\alpha^T \mathsf{d}\alpha. \]
result <- psolve(problem, gp = TRUE, requires_grad = TRUE)
f_val <- function(xv, yv, zv) 0.5 * (xv^2 + yv^2 + zv^2)
original <- f_val(value(x), value(y), value(z))
gradient(x) <- value(x)
gradient(y) <- value(y)
gradient(z) <- value(z)
backward(problem)
eta <- 0.5
da_grad <- gradient(a)
db_grad <- gradient(b)
dc_grad <- gradient(cc)
dalpha <- c(da_grad, db_grad, dc_grad)
predicted <- original - eta * sum(dalpha^2)
value(a) <- value(a) - eta * da_grad
value(b) <- value(b) - eta * db_grad
value(cc) <- value(cc) - eta * dc_grad
psolve(problem, gp = TRUE)[1] 19.34202
actual <- f_val(value(x), value(y), value(z))
cat(sprintf("original %.5f predicted %.5f actual %.5f\n",
original, predicted, actual))original 0.27513 predicted 0.22703 actual 0.22939
Session Info
R version 4.6.0 (2026-04-24)
Platform: aarch64-apple-darwin23
Running under: macOS Tahoe 26.5.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] CVXR_1.9.1
loaded via a namespace (and not attached):
[1] slam_0.1-55 cli_3.6.6 knitr_1.51 ECOSolveR_0.6.1
[5] rlang_1.2.0 xfun_0.57 clarabel_0.11.2 otel_0.2.0
[9] Rglpk_0.6-5.1 highs_1.12.0-3 cccp_0.3-3 scs_3.2.7
[13] S7_0.2.2 jsonlite_2.0.0 backports_1.5.1 htmltools_0.5.9
[17] gmp_0.7-5.1 diffcp_0.1.1 rmarkdown_2.31 piqp_0.6.2
[21] grid_4.6.0 evaluate_1.0.5 fastmap_1.2.0 yaml_2.3.12
[25] compiler_4.6.0 codetools_0.2-20 htmlwidgets_1.6.4 Rcpp_1.1.1-1.1
[29] osqp_1.0.0 lattice_0.22-9 digest_0.6.39 checkmate_2.3.4
[33] Matrix_1.7-5 tools_4.6.0