Sparse Inverse Covariance Estimation
Introduction
Assume we are given i.i.d. observations \(x_i \sim N(0,\Sigma)\) for \(i = 1,\ldots,m\), and the covariance matrix \(\Sigma \in {\mathbf S}_+^n\), the set of symmetric positive semidefinite matrices, has a sparse inverse \(S = \Sigma^{-1}\). Let \(Q = \frac{1}{m-1}\sum_{i=1}^m (x_i - \bar x)(x_i - \bar x)^T\) be our sample covariance. One way to estimate \(\Sigma\) is to maximize the log-likelihood with the prior knowledge that \(S\) is sparse (Friedman, Hastie, and Tibshirani 2008), which amounts to the optimization problem:
\[ \begin{array}{ll} \underset{S}{\mbox{maximize}} & \log\det(S) - \mbox{tr}(SQ) \\ \mbox{subject to} & S \in {\mathbf S}_+^n, \quad \sum_{i=1}^n \sum_{j=1}^n |S_{ij}| \leq \alpha. \end{array} \]
The parameter \(\alpha \geq 0\) controls the degree of sparsity. The problem
is convex, so we can solve it using CVXR
.
Example
We’ll create a sparse positive semi-definite matrix \(S\) using synthetic data
set.seed(1)
n <- 10 ## Dimension of matrix
m <- 1000 ## Number of samples
## Create sparse, symmetric PSD matrix S
A <- rsparsematrix(n, n, 0.15, rand.x = stats::rnorm)
Strue <- A %*% t(A) + 0.05 * diag(rep(1, n)) ## Force matrix to be strictly positive definite
We can now create the covariance matrix \(R\) as the inverse of \(S\).
R <- base::solve(Strue)
As test data, we sample from a multivariate normal with the fact that if \(Y \sim N(0, I)\), then \(R^{1/2}Y \sim N(0, R)\) since \(R\) is symmetric.
x_sample <- matrix(stats::rnorm(n * m), nrow = m, ncol = n) %*% t(expm::sqrtm(R))
Q <- cov(x_sample) ## Sample covariance matrix
Finally, we solve our convex program for a set of \(\alpha\) values.
Version 1.0 Note: Positive semi-definite variables are now
designated using PSD = TRUE
rather than the Semidef
function!
alphas <- c(10, 8, 6, 4, 1)
if (packageVersion("CVXR") > "0.99-7") {
S <- Variable(n, n, PSD = TRUE)
} else {
S <- Semidef(n) ## Variable constrained to positive semidefinite cone
}
obj <- Maximize(log_det(S) - matrix_trace(S %*% Q))
S.est <- lapply(alphas,
function(alpha) {
constraints <- list(sum(abs(S)) <= alpha)
## Form and solve optimization problem
prob <- Problem(obj, constraints)
result <- solve(prob)
## Create covariance matrix
R_hat <- base::solve(result$getValue(S))
Sres <- result$getValue(S)
Sres[abs(Sres) <= 1e-4] <- 0
Sres
})
In the code above, the Semidef
constructor restricts S
to the
positive semidefinite cone. In our objective, we use CVXR
functions
for the log-determinant and trace. The expression matrix_trace(S %*% Q)
is equivalent to `sum(diag(S %*% Q))}, but the former is preferred
because it is more efficient than making nested function calls.
However, a standalone atom does not exist for the determinant, so we
cannot replace log_det(S)
with log(det(S))
since det
is
undefined for a Semidef
object.
Results
The figures below depict the solutions for the above dataset with \(m = 1000, n = 10\), and \(S\) containing 26% non-zero entries, represented by the dark squares in the images below. The sparsity of our inverse covariance estimate decreases for higher \(\alpha\), so that when \(\alpha = 1\), most of the off-diagonal entries are zero, while if \(\alpha = 10\), over half the matrix is dense. At \(\alpha = 4\), we achieve the true percentage of non-zeros.
do.call(multiplot, args = c(list(plotSpMat(Strue)),
mapply(plotSpMat, S.est, alphas, SIMPLIFY = FALSE),
list(layout = matrix(1:6, nrow = 2, byrow = TRUE))))
Session Info
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-apple-darwin21.6.0 (64-bit)
## Running under: macOS Ventura 13.0
##
## Matrix products: default
## BLAS: /usr/local/Cellar/openblas/0.3.21/lib/libopenblasp-r0.3.21.dylib
## LAPACK: /usr/local/Cellar/r/4.2.1_4/lib/R/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid stats graphics grDevices datasets utils methods
## [8] base
##
## other attached packages:
## [1] expm_0.999-6 Matrix_1.5-1 ggplot2_3.3.6 CVXR_1.0-11
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.0 xfun_0.34 bslib_0.4.0 slam_0.1-50
## [5] lattice_0.20-45 Rmosek_10.0.25 colorspace_2.0-3 vctrs_0.5.0
## [9] generics_0.1.3 htmltools_0.5.3 yaml_2.3.6 gmp_0.6-6
## [13] utf8_1.2.2 rlang_1.0.6 jquerylib_0.1.4 pillar_1.8.1
## [17] glue_1.6.2 Rmpfr_0.8-9 withr_2.5.0 DBI_1.1.3
## [21] Rcplex_0.3-5 RColorBrewer_1.1-3 bit64_4.0.5 scs_3.0-1
## [25] lifecycle_1.0.3 stringr_1.4.1 munsell_0.5.0 blogdown_1.13
## [29] gtable_0.3.1 gurobi_9.5-2 codetools_0.2-18 evaluate_0.17
## [33] labeling_0.4.2 knitr_1.40 fastmap_1.1.0 fansi_1.0.3
## [37] cccp_0.2-9 highr_0.9 Rcpp_1.0.9 scales_1.2.1
## [41] cachem_1.0.6 jsonlite_1.8.3 farver_2.1.1 bit_4.0.4
## [45] digest_0.6.30 stringi_1.7.8 bookdown_0.29 dplyr_1.0.10
## [49] Rglpk_0.6-4 cli_3.4.1 tools_4.2.1 magrittr_2.0.3
## [53] sass_0.4.2 tibble_3.1.8 pkgconfig_2.0.3 rcbc_0.1.0.9001
## [57] assertthat_0.2.1 rmarkdown_2.17 R6_2.5.1 compiler_4.2.1