channel_capacity <- function(n, m, P, sum_x = 1) {
## x is probability distribution of the input signal X(t)
x <- Variable(n)
## y is the probability distribution of the output signal Y(t)
y <- P %*% x
## Compute c_j = sum_i p_{ij} log2(p_{ij})
## Use the convention 0*log(0) = 0 by handling zeros
P_log <- ifelse(P > 0, log2(P), 0)
c_vec <- colSums(P * P_log)
## Mutual information: I = c'x + sum(entr(y)) / log(2)
I <- t(c_vec) %*% x + sum_entries(entr(y)) / log(2)
## Channel capacity is maximized by maximising the mutual information
obj <- Maximize(I)
constraints <- list(sum(x) == sum_x, x >= 0)
## Form and solve problem
prob <- Problem(obj, constraints)
result <- psolve(prob)
list(status = status(prob),
capacity = result,
x = value(x))
}Channel Capacity
Introduction
This example is adapted from Boyd and Vandenberghe, Convex Optimization, exercise 4.57, pages 207–208.
Convex optimization can be used to find the channel capacity \(C\) of a discrete memoryless channel. Consider a communication channel with input \(X(t) \in \{1, 2, \ldots, n\}\) and output \(Y(t) \in \{1, 2, \ldots, m\}\). The random variables \(X\) and \(Y\) can take \(n\) and \(m\) different values, respectively.
In a discrete memoryless channel, the relation between the input and the output is given by the transition probability:
\[ p_{ij} = \mathbb{P}(Y(t) = i \mid X(t) = j). \]
These transition probabilities form the channel transition matrix \(P \in \mathbb{R}^{m \times n}\). The columns of \(P\) must sum to 1 and all elements of \(P\) must be non-negative.
Assume that \(X\) has a probability distribution denoted by \(x \in \mathbb{R}^n\), meaning that \(x_j = \mathbb{P}(X(t) = j)\) for \(j \in \{1, \ldots, n\}\).
From Shannon’s theory, the channel capacity is given by the maximum possible mutual information \(I\) between \(X\) and \(Y\):
\[ C = \sup_x I(X; Y), \]
where
\[ I(X;Y) = -\sum_{i=1}^{m} y_i \log_2 y_i + \sum_{j=1}^{n} \sum_{i=1}^{m} x_j p_{ij} \log_2 p_{ij}, \]
and \(y = Px\) is the output distribution.
Given that \(x \log x\) is convex for \(x \geq 0\), we can formulate this as a convex optimization problem:
\[ \begin{array}{ll} \mbox{maximize} & I(X; Y) \\ \mbox{subject to} & \sum_{i=1}^{n} x_i = 1, \quad x \succeq 0. \end{array} \]
Due to the entropy function in CVXR, this can be written quite easily in DCP form.
Channel Capacity Function
We write a function that computes the channel capacity for a given transition matrix \(P\) with \(n\) input and \(m\) output symbols.
Example
In this example we consider a communication channel with two possible inputs and outputs, so \(n = m = 2\). The channel transition matrix we use is:
\[ P = \begin{pmatrix} 0.75 & 0.25 \\ 0.25 & 0.75 \end{pmatrix}. \]
Note that the columns of \(P\) must sum to 1 and all elements of \(P\) must be positive.
n <- 2
m <- 2
P <- matrix(c(0.75, 0.25,
0.25, 0.75), nrow = m, ncol = n, byrow = TRUE)
result <- channel_capacity(n, m, P)
cat("Problem status:", result$status, "\n")
cat(sprintf("Optimal value of C = %.4g\n", result$capacity))
cat("Optimal variable x =\n")
print(round(result$x, 4))Problem status: optimal
Optimal value of C = 0.1887
Optimal variable x =
[,1]
[1,] 0.5
[2,] 0.5
The optimal input distribution is uniform (\(x = [0.5, 0.5]\)) and the channel capacity is approximately \(C \approx 0.1887\) bits.
A Larger Example
We can also consider a non-symmetric channel with \(n = 4\) inputs and \(m = 3\) outputs.
n2 <- 4
m2 <- 3
set.seed(42)
## Generate a random transition matrix (columns sum to 1)
P2 <- matrix(runif(m2 * n2), nrow = m2, ncol = n2)
P2 <- sweep(P2, 2, colSums(P2), "/")
result2 <- channel_capacity(n2, m2, P2)
cat("Problem status:", result2$status, "\n")
cat(sprintf("Optimal value of C = %.4g\n", result2$capacity))
cat("Optimal variable x =\n")
print(round(result2$x, 4))Problem status: optimal
Optimal value of C = 0.152
Optimal variable x =
[,1]
[1,] 0.4806
[2,] 0.0000
[3,] 0.5194
[4,] 0.0000
Session Info
R version 4.5.3 (2026-03-11)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.3.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] CVXR_1.8.1
loaded via a namespace (and not attached):
[1] slam_0.1-55 cli_3.6.5 knitr_1.51 ECOSolveR_0.6.1
[5] rlang_1.1.7 xfun_0.56 clarabel_0.11.2 otel_0.2.0
[9] gurobi_13.0-1 Rglpk_0.6-5.1 highs_1.12.0-3 cccp_0.3-3
[13] scs_3.2.7 S7_0.2.1 jsonlite_2.0.0 backports_1.5.0
[17] htmltools_0.5.9 Rmosek_11.1.1 gmp_0.7-5.1 piqp_0.6.2
[21] rmarkdown_2.30 grid_4.5.3 evaluate_1.0.5 fastmap_1.2.0
[25] yaml_2.3.12 compiler_4.5.3 codetools_0.2-20 htmlwidgets_1.6.4
[29] Rcpp_1.1.1 osqp_1.0.0 lattice_0.22-9 digest_0.6.39
[33] checkmate_2.3.4 Matrix_1.7-4 tools_4.5.3
References
- Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press. Exercise 4.57, pages 207–208.
- Shannon, C.E. (1948). A Mathematical Theory of Communication. Bell System Technical Journal, 27, 379–423, 623–656.