Quantile regression is another variation on least squares . The loss is the tilted function,
where specifies the quantile. The problem as before is to minimize the total residual loss. This model is commonly used in ecology, healthcare, and other fields where the mean alone is not enough to capture complex relationships between variables. CVXR allows us to create a function to represent the loss and integrate it seamlessly into the problem definition, as illustrated below.
Example
We will use an example from the quantreg package. The vignette provides an example of the estimation and plot.
data(engel)p <-ggplot(data = engel) +geom_point(mapping =aes(x = income, y = foodexp), color ="blue")taus <-c(0.1, 0.25, 0.5, 0.75, 0.90, 0.95)fits <-data.frame(coef(lm(foodexp ~ income, data = engel)),sapply(taus, function(x) coef(rq(formula = foodexp ~ income, data = engel, tau = x))))names(fits) <-c("OLS", sprintf("\\(\\tau_{%0.2f}\\)", taus))nf <-ncol(fits)colors <-colorRampPalette(colors =c("black", "red"))(nf)p <- p +geom_abline(intercept = fits[1, 1], slope = fits[2, 1], color = colors[1], linewidth =1.5)for (i inseq_len(nf)[-1]) { p <- p +geom_abline(intercept = fits[1, i], slope = fits[2, i], color = colors[i])}p
The above plot shows the quantile regression fits for . The OLS fit is the thick black line.
The following is a table of the estimates.
knitr::kable(fits, format ="html", escape =FALSE, caption ="Fits from OLS and `quantreg`") |>kable_styling("striped") |>column_spec(1:8, background ="#ececec")
Fits from OLS and `quantreg`
OLS
(Intercept)
147.4753885
110.1415742
95.4835396
81.4822474
62.3965855
67.3508721
64.1039632
income
0.4851784
0.4017658
0.4741032
0.5601806
0.6440141
0.6862995
0.7090685
The CVXR formulation follows. Note we make use of model.matrix to get the intercept column painlessly.
X <-model.matrix(foodexp ~ income, data = engel)y <-matrix(engel[, "foodexp"], ncol =1)beta <-Variable(2)quant_loss <-function(u, tau) { 0.5*abs(u) + (tau -0.5) * u }solutions <-sapply(taus, function(tau) { obj <-sum(quant_loss(y - X %*% beta, tau = tau)) prob <-Problem(Minimize(obj))## THE OSQP solver returns an error for tau = 0.5psolve(prob, solver ="CLARABEL")check_solver_status(prob)value(beta)})fits <-data.frame(coef(lm(foodexp ~ income, data = engel)), solutions)names(fits) <-c("OLS", sprintf("\\(\\tau_{%0.2f}\\)", taus))
Here is a table similar to the above with the OLS estimate added in for easy comparison.
knitr::kable(fits, format ="html", escape =FALSE, caption ="Fits from OLS and `CVXR`") |>kable_styling("striped") |>column_spec(1:8, background ="#ececec")