data("pituitary")
str(pituitary)'data.frame': 11 obs. of 2 variables:
$ age : num 8 8 8 10 10 10 12 12 12 14 ...
$ size: num 21 23.5 23 24 21 25 21.5 22 19 23.5 ...
Anqi Fu and Balasubramanian Narasimhan
Isotonic regression is regression with monotonic constraints. There are several packages in R to fit isotonic regression models. In this example, we consider isotone which uses a pooled-adjacent-violators algorithm (PAVA) and active set methods to perform the fit.
We will use data from the isotone package (see de Leeuw, Hornik, and Mair (2009)) on the size of pituitary fissures for 11 subjects between 8 and 14 years of age.
'data.frame': 11 obs. of 2 variables:
$ age : num 8 8 8 10 10 10 12 12 12 14 ...
$ size: num 21 23.5 23 24 21 25 21.5 22 19 23.5 ...
Since the size is expected to increase with age, an isotonic fit is suggested.
The CVXR formulation expresses this pretty much in the mathematical form.
We define a variable x of size n, the number of observations. The objective to be minimized is the least-squares error (p_norm). The constraint specifies monotonicity and it uses the diff function familiar to R users (see ?diff), except that it now refers to elements of the x. A problem is created and solved via psolve. The resulting optimal value for x is retrieved using value(x).
As the output below shows, the results are very close.
Package isotone provides additional methods for handling tied data besides the default ties = "primary" method; ties = "secondary" enforces equality within ties, and ties = "tertiary" enforces monotonicity on the means. (The latter may cause individual fits to be non-monotonic.)
In CVXR, the secondary method just requires an additional constraint to enforce equality within tied values; no other modification is necessary. We do that below by figuring out the tied observation indices using base::split and force those x values to be equal (i.e. diff == 0).
x_s <- with(pituitary, {
n <- length(size)
x <- Variable(n)
objective <- Minimize(p_norm(size - x, 2))
secondary_constraints <- lapply(base::split(x = seq_len(n),
f = age),
function(i) diff(x[i]) == 0)
constraint <- c(diff(x) >= 0,
secondary_constraints)
problem <- Problem(objective, constraint)
psolve(problem)
check_solver_status(problem)
value(x)
})Finally, the tertiary method requires computing the block means to be computed for use in enforcing monotonicity.
x_t <- with(pituitary, {
n <- length(size)
x <- Variable(n)
objective <- Minimize(p_norm(size - x, 2))
blocks <- base::split(x = seq_len(n),
f = pituitary$age)
block_means <- lapply(blocks, function(i) {
v <- numeric(n)
v[i] <- 1.0 / length(i)
matrix(v, nrow = 1) %*% x
})
block_mean_vector <- do.call(vstack, block_means)
constraint <- list(diff(block_mean_vector) >= 0)
problem <- Problem(objective, constraint)
psolve(problem)
check_solver_status(problem)
value(x)
})In the above, we make use of the vstack function to create a single vector of the block means.
We can now compare all three solutions.
| Isotone (P) | CVXR (P) | Isotone (S) | CVXR(S) | Isotone (T) | CVXR (T) |
|---|---|---|---|---|---|
| 21.00 | 21.00 | 22.22 | 22.22 | 20.72 | 20.72 |
| 22.38 | 22.37 | 22.22 | 22.22 | 23.22 | 23.22 |
| 22.38 | 22.37 | 22.22 | 22.22 | 22.72 | 22.72 |
| 22.38 | 22.37 | 22.22 | 22.22 | 22.89 | 22.89 |
| 22.38 | 22.37 | 22.22 | 22.22 | 19.89 | 19.89 |
| 22.38 | 22.37 | 22.22 | 22.22 | 23.89 | 23.89 |
| 22.38 | 22.37 | 22.22 | 22.22 | 22.89 | 22.89 |
| 22.38 | 22.37 | 22.22 | 22.22 | 23.39 | 23.39 |
| 22.38 | 22.37 | 22.22 | 22.22 | 20.39 | 20.39 |
| 23.50 | 23.50 | 24.25 | 24.25 | 23.50 | 23.50 |
| 25.00 | 25.00 | 24.25 | 24.25 | 25.00 | 25.00 |
Comparison of Solutions(P = Primary, S = Secondary, T = Tertiary)
Other arbitrary desired behavior can be specified for tied observations using the above as a guide.
R version 4.5.2 (2025-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.3.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] kableExtra_1.4.0 isotone_1.1-2 CVXR_1.8.1
loaded via a namespace (and not attached):
[1] gmp_0.7-5.1 clarabel_0.11.2 xml2_1.5.2 slam_0.1-55
[5] stringi_1.8.7 lattice_0.22-9 digest_0.6.39 magrittr_2.0.4
[9] evaluate_1.0.5 grid_4.5.2 RColorBrewer_1.1-3 fastmap_1.2.0
[13] rprojroot_2.1.1 jsonlite_2.0.0 Matrix_1.7-4 ECOSolveR_0.6.1
[17] backports_1.5.0 scs_3.2.7 Rmosek_11.1.1 viridisLite_0.4.3
[21] scales_1.4.0 codetools_0.2-20 textshaping_1.0.4 cli_3.6.5
[25] rlang_1.1.7 Rglpk_0.6-5.1 yaml_2.3.12 otel_0.2.0
[29] tools_4.5.2 osqp_1.0.0 Rcplex_0.3-8 checkmate_2.3.4
[33] here_1.0.2 gurobi_13.0-1 R6_2.6.1 lifecycle_1.0.5
[37] stringr_1.6.0 htmlwidgets_1.6.4 cccp_0.3-3 glue_1.8.0
[41] Rcpp_1.1.1 systemfonts_1.3.1 xfun_0.56 rstudioapi_0.18.0
[45] knitr_1.51 dichromat_2.0-0.1 highs_1.12.0-3 farver_2.1.2
[49] htmltools_0.5.9 rmarkdown_2.30 svglite_2.2.2 piqp_0.6.2
[53] compiler_4.5.2 S7_0.2.1 nnls_1.6