Getting Faster Results
Warning
The solution described below is useful when you know mathematically that a problem is DCP-compliant and none of your data inputs will change the nature of the problem. We recommend that users check the DCP-compliance of a problem (via a call to is_dcp(prob), for example) at least once to ensure this is the case. Not verifying DCP-compliance may result in garbage!
Note also that the large speed gains seen in previous versions are no longer evident in version 1.x because the new reduction framework has made CVXR much faster.
Introduction
As was remarked in the introduction to CVXR, its chief advantage is flexibility: you can specify a problem in close to mathematical form and CVXR solves it for you, if it can. Behind the scenes, CVXR compiles the domain specific language and verifies the convexity of the problem before sending it off to solvers. If the problem violates the rules of Disciplined Convex Programming, it is rejected. Because of this overhead, CVXR is generally slower than tailor-made solutions to a given problem.
An Example
To understand the speed issues, let us consider the global warming data from the Carbon Dioxide Information Analysis Center (CDIAC) again. The data points are the annual temperature anomalies relative to the 1961–1990 mean. We will fit the nearly-isotonic approximation \(\beta \in {\mathbf R}^m\) by solving
\[ \begin{array}{ll} \underset{\beta}{\mbox{Minimize}} & \frac{1}{2}\sum_{i=1}^m (y_i - \beta_i)^2 + \lambda \sum_{i=1}^{m-1}(\beta_i - \beta_{i+1})_+, \end{array} \] where \(\lambda \geq 0\) is a penalty parameter and \(x_+ =\max(x,0)\).
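To make the two terms of the objective concrete, here is a minimal base R sketch that evaluates the formula exactly as written for a candidate \(\beta\). The function name neariso_objective and the toy data are illustrative assumptions, not part of CVXR.

```r
# Hypothetical helper: evaluate the nearly-isotonic objective as written in the text.
neariso_objective <- function(beta, y, lambda) {
  m <- length(y)
  stopifnot(length(beta) == m, lambda >= 0)
  fit_term <- 0.5 * sum((y - beta)^2)     # squared-error term
  drops <- pmax(beta[-m] - beta[-1], 0)   # (beta_i - beta_{i+1})_+ for i = 1, ..., m - 1
  fit_term + lambda * sum(drops)
}

y_toy <- c(1, 3, 2)
# beta = y: zero squared error, one drop of size 1 (from 3 to 2), so the value is 0.5 * 1
obj_interp <- neariso_objective(beta = y_toy, y = y_toy, lambda = 0.5)
# a non-decreasing beta: no penalty, squared error 0.5 * (0.25 + 0.25)
obj_monotone <- neariso_objective(beta = c(1, 2.5, 2.5), y = y_toy, lambda = 0.5)
c(obj_interp, obj_monotone)   # 0.50 0.25
```

Evaluating the objective like this only scores a given candidate; finding the minimizer is the job of the solver.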
This can be solved as follows.
suppressMessages(suppressWarnings(library(CVXR)))
data(cdiac)
y <- cdiac$annual
m <- length(y)
lambda <- 0.44
beta <- Variable(m)
obj <- 0.5 * sum((y - beta)^2) + lambda * sum(pos(diff(beta)))
prob <- Problem(Minimize(obj))
soln <- solve(prob, solver = "ECOS")
betaHat <- soln$getValue(beta)
This is the recommended way to solve a problem.
However, suppose we wished to construct bootstrap confidence intervals for the estimate using 100 resamples. Each resample requires a full solve, so the computation time can quickly become limiting.
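As a rough illustration of why the cost multiplies, the sketch below runs one full fit per resample and then takes pointwise percentile bands. The helper bootstrap_bands, its case-resampling scheme (resample indices, then restore time order), and the stand-in fit cummax are all assumptions made for this example; in practice fit_fun would wrap the CVXR fit shown above.

```r
# Hypothetical helper: pointwise percentile bootstrap bands for a fitted curve.
# `fit_fun` is any function mapping a numeric vector to a fitted vector of equal length.
bootstrap_bands <- function(y, fit_fun, B = 100, level = 0.95) {
  m <- length(y)
  fits <- matrix(NA_real_, nrow = B, ncol = m)
  for (b in seq_len(B)) {
    idx <- sort(sample.int(m, m, replace = TRUE))  # resample cases, keep time order
    fits[b, ] <- fit_fun(y[idx])                   # one full fit per resample
  }
  alpha <- (1 - level) / 2
  list(lower = apply(fits, 2, quantile, probs = alpha),
       upper = apply(fits, 2, quantile, probs = 1 - alpha))
}

set.seed(1)
y_toy <- cumsum(rnorm(20))
# cummax is only a cheap placeholder for the real fitting routine
bands <- bootstrap_bands(y_toy, fit_fun = cummax, B = 100)
```

With B = 100, any per-fit overhead is paid 100 times, which is what makes the savings discussed here worthwhile.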
Below, we show how one can get at the problem data and directly call a solver to get faster results.
Profile the Code
Profiling a single fit to the model is useful to figure out where most of the time is spent.
library(profvis)
y <- cdiac$annual
profvis({
    beta <- Variable(m)
    obj <- Minimize(0.5 * sum((y - beta)^2) + lambda * sum(pos(diff(beta))))
    prob <- Problem(obj)
    soln <- solve(prob, solver = "ECOS")
    betaHat <- soln$getValue(beta)
})
It is especially instructive to click on the data tab and open up the tree for solve to see the sequence of calls and cumulative time used.
The profile shows that most of the total time (2400 ms for one of our runs) is spent in the call to the is_dcp generic (about 2000 ms). This generic is responsible for ensuring that the problem is DCP-compliant by checking the nature of each of the components that make up the problem. The actual solving took a much smaller fraction of the time.
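For a coarse cross-check of such numbers without the profiler, individual stages can be timed with base R's system.time. The helper name elapsed_ms is hypothetical, and the commented-out calls assume the objects prob and solver choice from the example above.

```r
# Hypothetical helper: evaluate `expr` and return elapsed wall-clock time in milliseconds.
elapsed_ms <- function(expr) {
  timing <- system.time(expr)   # `expr` is a promise, so it is evaluated (and timed) here
  unname(timing[["elapsed"]]) * 1000
}

t_sleep <- elapsed_ms(Sys.sleep(0.05))   # roughly 50 ms or a little more

# With CVXR loaded and `prob` defined as above, the same helper could be applied to
# the stages of interest, for example:
# elapsed_ms(solve(prob, solver = "ECOS"))
# elapsed_ms(get_problem_data(prob, solver = "ECOS"))
```

Single timings are noisy, so repeated runs give a more reliable picture than one measurement.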
Directly Calling the Solver
We are mathematically certain that the above problem is convex, so we can avoid the is_dcp hit. We can obtain the problem data for a particular solver (like OSQP, ECOS, or SCS) using the function get_problem_data and directly hand that data to the solver to get the solution.
prob_data <- get_problem_data(prob, solver = "ECOS")
ASIDE: How did we know ECOS was the solver to use? Future versions will provide a function to match a solver to a problem. (Actually, it is available already, but not exported yet!) For now, a single call to solve with the verbose option set to TRUE can provide that information.
soln <- solve(prob, verbose = TRUE)
Now that we have the problem data and know which solver to use, we can call the ECOS solver with the right arguments. (The ECOS solver is provided by the package ECOSolveR, which CVXR imports.)
## Newer versions of CVXR store the cone dimensions in a form that must be
## converted before being passed to ECOS.
if (packageVersion("CVXR") > "0.99-7") {
    ECOS_dims <- ECOS.dims_to_solver_dict(prob_data$data[["dims"]])
} else {
    ECOS_dims <- prob_data$data[["dims"]]
}
## Hand the conic problem data straight to the solver.
solver_output <- ECOSolveR::ECOS_csolve(c = prob_data$data[["c"]],
                                        G = prob_data$data[["G"]],
                                        h = prob_data$data[["h"]],
                                        dims = ECOS_dims,
                                        A = prob_data$data[["A"]],
                                        b = prob_data$data[["b"]])
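As background on what these arguments mean (this is the standard form documented for ECOS, not something derived in this article), the solver receives a cone program of the form
\[ \begin{array}{ll} \underset{x}{\mbox{Minimize}} & c^T x \\ \mbox{subject to} & Ax = b, \\ & h - Gx \in \mathcal{K}, \end{array} \]
where \(\mathcal{K}\) is a product of a nonnegative orthant, second-order cones, and exponential cones whose sizes are given by dims (the elements l, q and e in ECOSolveR). Here \(x\) is the solver's decision vector, which generally stacks \(\beta\) together with auxiliary variables introduced when the problem is put into conic form; it is unrelated to the \(x\) in the definition of \(x_+\) earlier.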
Finally, we can obtain the results by asking CVXR to unpack the solver results for us. (See ?unpack_results for further examples.)
if (packageVersion("CVXR") > "0.99-7") {
    direct_soln <- unpack_results(prob, solver_output, prob_data$chain, prob_data$inverse_data)
} else {
    direct_soln <- unpack_results(prob, "ECOS", solver_output)
}
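The three steps above (get the problem data, call the solver, unpack the results) can be collected into one function for repeated use, for instance inside a resampling loop. This is a sketch only: the name solve_ecos_direct and the toy data are hypothetical, it uses just the calls shown above, and it assumes a CVXR version newer than 0.99-7 and a problem whose DCP-compliance has already been checked.

```r
library(CVXR)

# Hypothetical wrapper: solve `prob` by passing its ECOS data directly to ECOSolveR.
# Assumes CVXR > 0.99-7 and that is_dcp(prob) has been verified at least once.
solve_ecos_direct <- function(prob) {
  prob_data <- get_problem_data(prob, solver = "ECOS")
  ecos_dims <- ECOS.dims_to_solver_dict(prob_data$data[["dims"]])
  solver_output <- ECOSolveR::ECOS_csolve(c = prob_data$data[["c"]],
                                          G = prob_data$data[["G"]],
                                          h = prob_data$data[["h"]],
                                          dims = ecos_dims,
                                          A = prob_data$data[["A"]],
                                          b = prob_data$data[["b"]])
  unpack_results(prob, solver_output, prob_data$chain, prob_data$inverse_data)
}

# Toy data with the same problem structure as the example in the text
y_toy <- c(0.1, 0.3, 0.2, 0.5, 0.4)
beta_toy <- Variable(length(y_toy))
prob_toy <- Problem(Minimize(0.5 * sum((y_toy - beta_toy)^2) +
                             0.44 * sum(pos(diff(beta_toy)))))
stopifnot(is_dcp(prob_toy))   # check compliance once, up front
fit_toy <- solve_ecos_direct(prob_toy)
beta_hat_toy <- fit_toy$getValue(beta_toy)
```

Because the problem data depend on y, get_problem_data still has to be called for every new data vector; only the DCP check is skipped.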
Profile the Direct Call
We can profile this direct call now.
profvis({
    beta <- Variable(m)
    obj <- Minimize(0.5 * sum((y - beta)^2) + lambda * sum(pos(diff(beta))))
    prob <- Problem(obj)
    prob_data <- get_problem_data(prob, solver = "ECOS")
    if (packageVersion("CVXR") > "0.99-7") {
        ECOS_dims <- ECOS.dims_to_solver_dict(prob_data$data[["dims"]])
    } else {
        ECOS_dims <- prob_data$data[["dims"]]
    }
    solver_output <- ECOSolveR::ECOS_csolve(c = prob_data$data[["c"]],
                                            G = prob_data$data[["G"]],
                                            h = prob_data$data[["h"]],
                                            dims = ECOS_dims,
                                            A = prob_data$data[["A"]],
                                            b = prob_data$data[["b"]])
    if (packageVersion("CVXR") > "0.99-7") {
        direct_soln <- unpack_results(prob, solver_output, prob_data$chain, prob_data$inverse_data)
    } else {
        direct_soln <- unpack_results(prob, "ECOS", solver_output)
    }
})
For one of our runs, the total time went down from 2400 ms to 690 ms, more than a 3-fold speedup! In cases where the objective function and constraints are more complex, the speedup can be more than 10-fold.
Same Answer?
Of course, we should also verify that the results obtained in both cases are the same.
identical(betaHat, direct_soln$getValue(beta))
## [1] TRUE
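Note that identical demands bit-for-bit equality, which held in the run shown here. Should that check ever fail on another setup, a tolerance-based comparison with all.equal is a reasonable fallback before concluding that the answers differ. The base R sketch below, using made-up numbers rather than the fitted values, shows the difference between the two checks.

```r
a <- 0.1 + 0.2
b <- 0.3
exact_same <- identical(a, b)          # FALSE: the two doubles differ in their last bits
close_same <- isTRUE(all.equal(a, b))  # TRUE: equal within the default tolerance (about 1.5e-8)
max_gap <- max(abs(a - b))             # size of the largest discrepancy
```

Applied to the fits, the analogous check would be isTRUE(all.equal(betaHat, direct_soln$getValue(beta))).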
Session Info
sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-apple-darwin20
## Running under: macOS Sequoia 15.1
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/Los_Angeles
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices datasets utils methods base
##
## other attached packages:
## [1] profvis_0.4.0 CVXR_1.0-15
##
## loaded via a namespace (and not attached):
## [1] slam_0.1-54 cli_3.6.3 knitr_1.48 ECOSolveR_0.5.5
## [5] rlang_1.1.4 xfun_0.49 clarabel_0.9.0.1 gurobi_11.0-0
## [9] Rglpk_0.6-5.1 cccp_0.3-1 assertthat_0.2.1 jsonlite_1.8.9
## [13] bit_4.5.0 Rcplex_0.3-6 Rmosek_10.2.0 htmltools_0.5.8.1
## [17] rcbc_0.1.0.9001 sass_0.4.9 gmp_0.7-5 rmarkdown_2.29
## [21] grid_4.4.2 evaluate_1.0.1 jquerylib_0.1.4 fastmap_1.2.0
## [25] yaml_2.3.10 lifecycle_1.0.4 bookdown_0.41 compiler_4.4.2
## [29] codetools_0.2-20 htmlwidgets_1.6.4 Rcpp_1.0.13-1 blogdown_1.19
## [33] lattice_0.22-6 digest_0.6.37 R6_2.5.1 bslib_0.8.0
## [37] Matrix_1.7-1 tools_4.4.2 bit64_4.5.2 Rmpfr_0.9-5
## [41] cachem_1.1.0