set.seed(1)
n <- 50
m <- 50
sigmoid <- function(z) 1 / (1 + exp(-z))
beta_true <- c(1, 0.5, -0.5, rep(0, n - 3))
X <- (matrix(runif(m * n), nrow = m, ncol = n) - 0.5) * 10
Y <- round(sigmoid(X %*% beta_true + rnorm(m, sd = 0.5)))
X_test <- (matrix(runif(2 * m * n), nrow = 2 * m, ncol = n) - 0.5) * 10
Y_test <- round(sigmoid(X_test %*% beta_true + rnorm(2 * m, sd = 0.5)))Logistic Regression with l1 Regularization
Introduction
In this example, we use CVXR to train a logistic regression classifier with
Our goal is to construct a linear classifier
This implies that
We fit
Because
Generating Data
In the following code we generate data with
Formulating the Problem
We next formulate the optimization problem using CVXR. The logistic atom in CVXR computes
beta <- Variable(n)
log_likelihood <- sum_entries(
multiply(Y, X %*% beta) - logistic(X %*% beta)
)Warning: `multiply()` is deprecated. Use `x * y` instead.
This warning is displayed once per session.
Fitting the Model
We solve the optimization problem for a range of
## Classification error function
calc_error <- function(scores, labels) {
preds <- ifelse(scores > 0, 1, 0)
sum(abs(preds - labels)) / length(labels)
}trials <- 100
train_error <- numeric(trials)
test_error <- numeric(trials)
lambda_vals <- 10^seq(-2, 0, length.out = trials)
beta_vals <- vector("list", trials)
for (i in seq_len(trials)) {
problem <- Problem(Maximize(log_likelihood / m -
lambda_vals[i] * p_norm(beta, 1)))
result <- psolve(problem)
check_solver_status(problem)
beta_hat <- value(beta)
train_error[i] <- calc_error(X %*% beta_hat, Y)
test_error[i] <- calc_error(X_test %*% beta_hat, Y_test)
beta_vals[[i]] <- beta_hat
}Warning: Solution may be inaccurate. Try another solver, adjusting the solver settings,
or solve with `verbose = TRUE` for more information.
df_error <- data.frame(lambda = lambda_vals,
Train = train_error,
Test = test_error) |>
pivot_longer(-lambda, names_to = "Set", values_to = "Error")
ggplot(df_error, aes(x = lambda, y = Error, color = Set)) +
geom_line() +
scale_x_log10() +
scale_color_manual(values = c(Train = "blue", Test = "red")) +
labs(x = expression(lambda), y = "Classification Error",
title = "Train vs. Test Error") +
theme_minimal()
Regularization Path
We also plot the regularization path, or the
beta_mat <- do.call(cbind, beta_vals)
df_path <- data.frame(lambda = lambda_vals, t(beta_mat)) |>
pivot_longer(-lambda, names_to = "Coefficient", values_to = "Value")
ggplot(df_path, aes(x = lambda, y = Value, color = Coefficient)) +
geom_line() +
scale_x_log10() +
labs(x = expression(lambda), y = expression(beta[i]),
title = "Regularization Path") +
theme_minimal() +
guides(color = "none")
Comparing True and Reconstructed Coefficients
We plot the true
idx <- which.min(test_error)
df_coef <- data.frame(i = seq_along(beta_true),
True = beta_true,
Reconstructed = drop(beta_vals[[idx]])) |>
pivot_longer(-i, names_to = "Type", values_to = "Value")
ggplot(df_coef, aes(x = i, y = Value, color = Type)) +
geom_line() +
scale_color_manual(values = c(True = "blue", Reconstructed = "red")) +
labs(x = "i", y = expression(beta[i]),
title = "True vs. Reconstructed Coefficients") +
theme_minimal()
Session Info
R version 4.5.2 (2025-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.3
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] tidyr_1.3.2 ggplot2_4.0.2 CVXR_1.8.1
loaded via a namespace (and not attached):
[1] gmp_0.7-5.1 generics_0.1.4 clarabel_0.11.2 slam_0.1-55
[5] lattice_0.22-9 digest_0.6.39 magrittr_2.0.4 evaluate_1.0.5
[9] grid_4.5.2 RColorBrewer_1.1-3 fastmap_1.2.0 rprojroot_2.1.1
[13] jsonlite_2.0.0 Matrix_1.7-4 ECOSolveR_0.6.1 backports_1.5.0
[17] scs_3.2.7 purrr_1.2.1 Rmosek_11.1.1 scales_1.4.0
[21] codetools_0.2-20 cli_3.6.5 rlang_1.1.7 Rglpk_0.6-5.1
[25] withr_3.0.2 yaml_2.3.12 otel_0.2.0 tools_4.5.2
[29] osqp_1.0.0 Rcplex_0.3-8 checkmate_2.3.4 dplyr_1.2.0
[33] here_1.0.2 gurobi_13.0-1 vctrs_0.7.1 R6_2.6.1
[37] lifecycle_1.0.5 htmlwidgets_1.6.4 pkgconfig_2.0.3 cccp_0.3-3
[41] pillar_1.11.1 gtable_0.3.6 glue_1.8.0 Rcpp_1.1.1
[45] xfun_0.56 tibble_3.3.1 tidyselect_1.2.1 knitr_1.51
[49] dichromat_2.0-0.1 highs_1.12.0-3 farver_2.1.2 htmltools_0.5.9
[53] labeling_0.4.3 rmarkdown_2.30 piqp_0.6.2 compiler_4.5.2
[57] S7_0.2.1