Getting Equivalent Results from `glmnet` and `CVXR`

Introduction

We’ve had several questions of the following type:

When I fit the same model in glmnet and CVXR, why are the results different?

For example, see this.

Unless both packages actually solve the same problem, there is no reason to expect the same result. The documentation for glmnet::glmnet states its optimization objective explicitly, so one only has to ensure that the CVXR objective matches it.

We illustrate below.

Lasso

Consider a simple Lasso fit from the glmnet example, for a fixed \(\lambda\).

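suppressMessages(suppressWarnings(library(glmnet)))
set.seed(123)
# n observations, p predictors; thresh is the convergence tolerance, lambda the fixed penalty
n <- 100; p <- 20; thresh <- 1e-12; lambda <- .05
x <- matrix(rnorm(n * p), n, p)
xDesign <- cbind(1, x)  # design matrix with an explicit intercept column, used by CVXR
y <- rnorm(n)
fit1 <- glmnet(x, y, lambda = lambda, thresh = thresh)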

The glmnet documentation notes that the objective being minimized, in the default invocation, is

\[ \frac{1}{2n}\|(y - X\beta)\|_2^2 + \lambda \|\beta_{-1}\|_1, \]

where \(\beta_{-1}\) is the beta vector excluding the first component, the intercept. Yes, the intercept is not penalized in the default invocation!

So we will use this objective with CVXR in the problem specification.

suppressMessages(suppressWarnings(library(CVXR)))
beta <- Variable(p + 1)
obj <- sum_squares(y - xDesign %*% beta) / (2 * n) + lambda * p_norm(beta[-1], 1)
prob <- Problem(Minimize(obj))
result <- solve(prob, FEASTOL = thresh, RELTOL = thresh, ABSTOL = thresh)
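As a quick sanity check on the objective stated above, one can evaluate it directly for any coefficient vector. The sketch below uses only base R; `lasso_objective` and the `_demo` data are hypothetical names introduced here for illustration and are not part of glmnet or CVXR.

```r
# Evaluate the glmnet-style Lasso objective for a coefficient vector whose
# first entry is the intercept (hypothetical helper, for illustration only).
lasso_objective <- function(beta, x, y, lambda) {
    n <- nrow(x)
    resid <- y - cbind(1, x) %*% beta   # residuals, intercept column prepended
    # squared-error loss scaled by 1 / (2n), plus an L1 penalty that skips the intercept
    sum(resid^2) / (2 * n) + lambda * sum(abs(beta[-1]))
}

# Small synthetic example (assumed data, unrelated to the fit above)
set.seed(1)
x_demo <- matrix(rnorm(20), 10, 2)
y_demo <- rnorm(10)
beta_demo <- c(0.5, 1, -2)
obj_demo <- lasso_objective(beta_demo, x_demo, y_demo, lambda = 0.05)
```

Applied to the fits above, `lasso_objective(as.vector(coef(fit1)), x, y, lambda)` and `lasso_objective(as.vector(result$getValue(beta)), x, y, lambda)` should give nearly equal values if both solvers reached the same optimum.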

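We can print the coefficients from glmnet and CVXR side by side to compare. The estimates below should agree closely; the small remaining differences are due to the different solver implementations and may also reflect glmnet's default standardization of the predictors (standardize = TRUE).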

library(kableExtra)  # provides %>%, kable_styling() and column_spec()
# Collect both sets of estimates, intercept first
est.table <- data.frame("CVXR.est" = result$getValue(beta), "GLMNET.est" = as.vector(coef(fit1)))
rownames(est.table) <- paste0("$\\beta_{", 0:p, "}$")
knitr::kable(est.table, format = "html", digits = 3) %>%
    kable_styling("striped") %>%
    column_spec(1:3, background = "#ececec")
Coefficient CVXR.est GLMNET.est
\(\beta_{0}\) -0.125 -0.126
\(\beta_{1}\) -0.022 -0.028
\(\beta_{2}\) 0.000 -0.002
\(\beta_{3}\) 0.101 0.104
\(\beta_{4}\) 0.000 0.000
\(\beta_{5}\) 0.000 0.000
\(\beta_{6}\) 0.000 0.000
\(\beta_{7}\) 0.000 0.000
\(\beta_{8}\) -0.094 -0.091
\(\beta_{9}\) 0.000 0.000
\(\beta_{10}\) 0.000 0.000
\(\beta_{11}\) 0.106 0.105
\(\beta_{12}\) 0.000 0.000
\(\beta_{13}\) -0.057 -0.063
\(\beta_{14}\) 0.000 0.000
\(\beta_{15}\) 0.000 0.000
\(\beta_{16}\) 0.000 0.000
\(\beta_{17}\) 0.000 0.000
\(\beta_{18}\) 0.000 0.000
\(\beta_{19}\) 0.000 0.000
\(\beta_{20}\) -0.087 -0.083
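To put a number on "close", one option is to look at the largest absolute gap between the two columns. The following is a minimal sketch using the rounded values from the table above; the variable names and the tolerance `tol` are assumptions chosen for illustration.

```r
# Rounded estimates copied from the table above for rows beta_0, beta_1, beta_2,
# beta_3, beta_8, beta_11, beta_13 and beta_20; all other rows are 0 in both columns.
cvxr_est   <- c(-0.125, -0.022,  0.000, 0.101, -0.094, 0.106, -0.057, -0.087)
glmnet_est <- c(-0.126, -0.028, -0.002, 0.104, -0.091, 0.105, -0.063, -0.083)

# Largest coefficient-wise discrepancy between the two solutions
max_abs_diff <- max(abs(cvxr_est - glmnet_est))

# Hypothetical tolerance for calling the two fits equivalent
tol <- 0.01
within_tol <- all(abs(cvxr_est - glmnet_est) < tol)
```

With the full-precision fits, the same comparison would be `max(abs(result$getValue(beta) - as.vector(coef(fit1))))`.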

A Penalized Logistic Example

We now consider a logistic regression fit, again with an \(\ell_1\) penalty term and a specified \(\lambda\).

lambda <- .025
y2 <- sample(x = c(0, 1), size = n, replace = TRUE)
fit2 <-  glmnet(x, y2, lambda = lambda, thresh = thresh, family = "binomial")

For logistic regression, the glmnet documentation states that the objective minimized is the negative log-likelihood divided by \(n\), plus the penalty term, which once again excludes the intercept in the default invocation. Below is the CVXR formulation, where we use the logistic atom, as noted earlier in our other example on logistic regression.

beta <- Variable(p + 1)
obj2 <- (sum(xDesign[y2 <= 0, ] %*% beta) + sum(logistic(-xDesign %*% beta))) / n +
    lambda * p_norm(beta[-1], 1)
prob <- Problem(Minimize(obj2))
result <- solve(prob, FEASTOL = thresh, RELTOL = thresh, ABSTOL = thresh)
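The split form of the objective used above may look unfamiliar. Writing \(\eta_i = x_i^T\beta\) for responses \(y_i \in \{0, 1\}\), and using \(\log(1 + e^{\eta}) = \eta + \log(1 + e^{-\eta})\), the negative log-likelihood can be rearranged as

\[ \sum_{i} \left[ \log(1 + e^{\eta_i}) - y_i \eta_i \right] = \sum_{i:\, y_i = 0} \eta_i + \sum_{i} \log(1 + e^{-\eta_i}), \]

which is the expression inside the parentheses of the CVXR objective, since the logistic atom computes \(\log(1 + e^{z})\). The base R sketch below checks this identity numerically on made-up data; all `_demo` names are hypothetical and independent of the fit above.

```r
# Numerical check that the split form equals the Bernoulli negative log-likelihood
set.seed(42)
x_demo <- cbind(1, matrix(rnorm(30), 10, 3))  # design matrix with intercept column
y_demo <- rep(c(0, 1), 5)                     # binary responses
b_demo <- c(0.2, -1, 0.5, 0.3)                # arbitrary coefficient vector
eta <- drop(x_demo %*% b_demo)                # linear predictor

# Direct negative log-likelihood via the Bernoulli density
nll_direct <- -sum(dbinom(y_demo, size = 1, prob = plogis(eta), log = TRUE))

# Split form: linear term over the y = 0 rows plus log(1 + exp(-eta)) over all rows
nll_split <- sum(eta[y_demo <= 0]) + sum(log1p(exp(-eta)))

identity_holds <- isTRUE(all.equal(nll_direct, nll_split))
```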

Once again, the two sets of estimates below should agree closely.

est.table <- data.frame("CVXR.est" = result$getValue(beta), "GLMNET.est" = as.vector(coef(fit2)))
rownames(est.table) <- paste0("$\\beta_{", 0:p, "}$")
knitr::kable(est.table, format = "html", digits = 3) %>%
    kable_styling("striped") %>%
    column_spec(1:3, background = "#ececec")
Coefficient CVXR.est GLMNET.est
\(\beta_{0}\) -0.139 -0.138
\(\beta_{1}\) 0.167 0.178
\(\beta_{2}\) -0.023 -0.027
\(\beta_{3}\) -0.035 -0.040
\(\beta_{4}\) -0.029 -0.026
\(\beta_{5}\) -0.275 -0.277
\(\beta_{6}\) 0.000 0.000
\(\beta_{7}\) -0.281 -0.278
\(\beta_{8}\) 0.041 0.036
\(\beta_{9}\) 0.084 0.081
\(\beta_{10}\) 0.245 0.243
\(\beta_{11}\) 0.000 0.000
\(\beta_{12}\) 0.000 0.000
\(\beta_{13}\) 0.000 0.000
\(\beta_{14}\) 0.000 0.000
\(\beta_{15}\) -0.141 -0.145
\(\beta_{16}\) 0.052 0.066
\(\beta_{17}\) 0.000 0.000
\(\beta_{18}\) -0.381 -0.380
\(\beta_{19}\) 0.071 0.068
\(\beta_{20}\) 0.000 0.000

Session Info

sessionInfo()
## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin17.7.0 (64-bit)
## Running under: macOS  10.14.1
## 
## Matrix products: default
## BLAS/LAPACK: /usr/local/Cellar/openblas/0.3.3/lib/libopenblasp-r0.3.3.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices datasets  utils     methods   base     
## 
## other attached packages:
## [1] CVXR_0.99-2      glmnet_2.0-16    foreach_1.4.4    Matrix_1.2-15   
## [5] kableExtra_0.9.0
## 
## loaded via a namespace (and not attached):
##  [1] gmp_0.5-13.2      Rcpp_0.12.19      highr_0.7        
##  [4] pillar_1.3.0      compiler_3.5.1    R.methodsS3_1.7.1
##  [7] R.utils_2.7.0     iterators_1.0.10  tools_3.5.1      
## [10] bit_1.1-14        digest_0.6.18     evaluate_0.12    
## [13] tibble_1.4.2      viridisLite_0.3.0 lattice_0.20-35  
## [16] pkgconfig_2.0.2   rlang_0.3.0.1     rstudioapi_0.8   
## [19] yaml_2.2.0        blogdown_0.9.2    xfun_0.4         
## [22] Rmpfr_0.7-1       ECOSolveR_0.4     httr_1.3.1       
## [25] stringr_1.3.1     knitr_1.20        xml2_1.2.0       
## [28] hms_0.4.2         bit64_0.9-7       rprojroot_1.3-2  
## [31] grid_3.5.1        R6_2.3.0          rmarkdown_1.10   
## [34] bookdown_0.7      readr_1.1.1       magrittr_1.5     
## [37] scs_1.1-1         backports_1.1.2   scales_1.0.0     
## [40] codetools_0.2-15  htmltools_0.3.6   rvest_0.3.2      
## [43] colorspace_1.3-2  stringi_1.2.4     munsell_0.5.0    
## [46] crayon_1.3.4      R.oo_1.22.0
