Isotonic Regression

Author

Anqi Fu and Balasubramanian Narasimhan

Introduction

Isotonic regression is regression with monotonic constraints. There are several packages in R to fit isotonic regression models. In this example, we consider isotone which uses a pooled-adjacent-violators algorithm (PAVA) and active set methods to perform the fit.

Pituitary Data Example

We will use data from the isotone package (see de Leeuw, Hornik, and Mair ()) on the size of pituitary fissures for 11 subjects between 8 and 14 years of age.

data("pituitary")
str(pituitary)
'data.frame':   11 obs. of  2 variables:
 $ age : num  8 8 8 10 10 10 12 12 12 14 ...
 $ size: num  21 23.5 23 24 21 25 21.5 22 19 23.5 ...

Since the size is expected to increase with age, an isotonic fit is suggested.

res_p <- with(pituitary, gpava(age, size))

The CVXR formulation expresses this pretty much in the mathematical form.

x_p <- with(pituitary, {
      n <- length(size)
      x <- Variable(n)
      objective <- Minimize(p_norm(size - x, 2))
      constraint <- list(diff(x) >= 0)
      problem <- Problem(objective, constraint)
      psolve(problem)
      check_solver_status(problem)
      value(x)
})

We define a variable x of size n, the number of observations. The objective to be minimized is the least-squares error (p_norm). The constraint specifies monotonicity and it uses the diff function familiar to R users (see ?diff), except that it now refers to elements of the x. A problem is created and solved via psolve. The resulting optimal value for x is retrieved using value(x).

As the output below shows, the results are very close.

cbind(res_p$x, x_p)
        [,1]     [,2]
 [1,] 21.000 20.99971
 [2,] 22.375 22.37496
 [3,] 22.375 22.37496
 [4,] 22.375 22.37496
 [5,] 22.375 22.37496
 [6,] 22.375 22.37496
 [7,] 22.375 22.37496
 [8,] 22.375 22.37496
 [9,] 22.375 22.37496
[10,] 23.500 23.50023
[11,] 25.000 25.00039

Handling Ties

Package isotone provides additional methods for handling tied data besides the default ties = "primary" method; ties = "secondary" enforces equality within ties, and ties = "tertiary" enforces monotonicity on the means. (The latter may cause individual fits to be non-monotonic.)

res_s <- with(pituitary, gpava(age, size, ties = "secondary"))
res_t <- with(pituitary, gpava(age, size, ties = "tertiary"))

In CVXR, the secondary method just requires an additional constraint to enforce equality within tied values; no other modification is necessary. We do that below by figuring out the tied observation indices using base::split and force those x values to be equal (i.e. diff == 0).

x_s <- with(pituitary, {
    n <- length(size)
    x <- Variable(n)
    objective <- Minimize(p_norm(size - x, 2))
    secondary_constraints <- lapply(base::split(x = seq_len(n),
                                                f = age),
                                    function(i) diff(x[i]) == 0)
    constraint <- c(diff(x) >= 0,
                    secondary_constraints)
    problem <- Problem(objective, constraint)
    psolve(problem)
    check_solver_status(problem)
    value(x)
})

Finally, the tertiary method requires computing the block means to be computed for use in enforcing monotonicity.

x_t <- with(pituitary, {
    n <- length(size)
    x <- Variable(n)
    objective <- Minimize(p_norm(size - x, 2))
    blocks <- base::split(x = seq_len(n),
                          f = pituitary$age)
    block_means <- lapply(blocks, function(i) {
        v <- numeric(n)
        v[i] <- 1.0 / length(i)
        matrix(v, nrow = 1) %*% x
    })
    block_mean_vector <- do.call(vstack, block_means)
    constraint <- list(diff(block_mean_vector) >= 0)
    problem <- Problem(objective, constraint)
    psolve(problem)
    check_solver_status(problem)
    value(x)
})

In the above, we make use of the vstack function to create a single vector of the block means.

We can now compare all three solutions.

P = primary, S = Secondary, T = Tertiary
Isotone (P) CVXR (P) Isotone (S) CVXR(S) Isotone (T) CVXR (T)
21.00 21.00 22.22 22.22 20.72 20.72
22.38 22.37 22.22 22.22 23.22 23.22
22.38 22.37 22.22 22.22 22.72 22.72
22.38 22.37 22.22 22.22 22.89 22.89
22.38 22.37 22.22 22.22 19.89 19.89
22.38 22.37 22.22 22.22 23.89 23.89
22.38 22.37 22.22 22.22 22.89 22.89
22.38 22.37 22.22 22.22 23.39 23.39
22.38 22.37 22.22 22.22 20.39 20.39
23.50 23.50 24.25 24.25 23.50 23.50
25.00 25.00 24.25 24.25 25.00 25.00

Comparison of Solutions(P = Primary, S = Secondary, T = Tertiary)

Notes

Other arbitrary desired behavior can be specified for tied observations using the above as a guide.

Session Info

R version 4.5.2 (2025-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.3.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] kableExtra_1.4.0 isotone_1.1-2    CVXR_1.8.1      

loaded via a namespace (and not attached):
 [1] gmp_0.7-5.1        clarabel_0.11.2    xml2_1.5.2         slam_0.1-55       
 [5] stringi_1.8.7      lattice_0.22-9     digest_0.6.39      magrittr_2.0.4    
 [9] evaluate_1.0.5     grid_4.5.2         RColorBrewer_1.1-3 fastmap_1.2.0     
[13] rprojroot_2.1.1    jsonlite_2.0.0     Matrix_1.7-4       ECOSolveR_0.6.1   
[17] backports_1.5.0    scs_3.2.7          Rmosek_11.1.1      viridisLite_0.4.3 
[21] scales_1.4.0       codetools_0.2-20   textshaping_1.0.4  cli_3.6.5         
[25] rlang_1.1.7        Rglpk_0.6-5.1      yaml_2.3.12        otel_0.2.0        
[29] tools_4.5.2        osqp_1.0.0         Rcplex_0.3-8       checkmate_2.3.4   
[33] here_1.0.2         gurobi_13.0-1      R6_2.6.1           lifecycle_1.0.5   
[37] stringr_1.6.0      htmlwidgets_1.6.4  cccp_0.3-3         glue_1.8.0        
[41] Rcpp_1.1.1         systemfonts_1.3.1  xfun_0.56          rstudioapi_0.18.0 
[45] knitr_1.51         dichromat_2.0-0.1  highs_1.12.0-3     farver_2.1.2      
[49] htmltools_0.5.9    rmarkdown_2.30     svglite_2.2.2      piqp_0.6.2        
[53] compiler_4.5.2     S7_0.2.1           nnls_1.6          

References

de Leeuw, J., K. Hornik, and P. Mair. 2009. “Isotone Optimization in R: Pool-Adjacent-Violators Algorithm (PAVA) and Active Set Methods.” Journal of Statistical Software 32 (5): 1–24.