Channel Capacity

Author

CVXPY Developers and Balasubramanian Narasimhan

Introduction

This example is adapted from Boyd and Vandenberghe, Convex Optimization, exercise 4.57, pages 207–208.

Convex optimization can be used to find the channel capacity $C$ of a discrete memoryless channel. Consider a communication channel with input $X (t) \in {1, 2, \dots, n}$ and output $Y (t) \in {1, 2, \dots, m}$ . The random variables $X$ and $Y$ can take $n$ and $m$ different values, respectively.

In a discrete memoryless channel, the relation between the input and the output is given by the transition probability:

$p_{i j} = P (Y (t) = i ∣ X (t) = j) .$

These transition probabilities form the channel transition matrix $P \in R^{m \times n}$ . The columns of $P$ must sum to 1 and all elements of $P$ must be non-negative.

Assume that $X$ has a probability distribution denoted by $x \in R^{n}$ , meaning that $x_{j} = P (X (t) = j)$ for $j \in {1, \dots, n}$ .

From Shannon’s theory, the channel capacity is given by the maximum possible mutual information $I$ between $X$ and $Y$ :

$C = sup_{x} I (X; Y),$

where

$I (X; Y) = - \sum_{i = 1}^{m} y_{i} \log_{2} y_{i} + \sum_{j = 1}^{n} \sum_{i = 1}^{m} x_{j} p_{i j} \log_{2} p_{i j},$

and $y = P x$ is the output distribution.

Given that $x \log x$ is convex for $x \geq 0$ , we can formulate this as a convex optimization problem:

$\begin{array}{ll} maximize & I (X; Y) \\ subject to & \sum_{i = 1}^{n} x_{i} = 1, x ⪰ 0. \end{array}$

Due to the entropy function in CVXR, this can be written quite easily in DCP form.

Channel Capacity Function

We write a function that computes the channel capacity for a given transition matrix $P$ with $n$ input and $m$ output symbols.

channel_capacity <- function(n, m, P, sum_x = 1) {
    ## x is probability distribution of the input signal X(t)
    x <- Variable(n)

    ## y is the probability distribution of the output signal Y(t)
    y <- P %*% x

    ## Compute c_j = sum_i p_{ij} log2(p_{ij})
    ## Use the convention 0*log(0) = 0 by handling zeros
    P_log <- ifelse(P > 0, log2(P), 0)
    c_vec <- colSums(P * P_log)

    ## Mutual information: I = c'x + sum(entr(y)) / log(2)
    I <- t(c_vec) %*% x + sum_entries(entr(y)) / log(2)

    ## Channel capacity is maximized by maximising the mutual information
    obj <- Maximize(I)
    constraints <- list(sum(x) == sum_x, x >= 0)

    ## Form and solve problem
    prob <- Problem(obj, constraints)
    result <- psolve(prob)

    list(status = status(prob),
         capacity = result,
         x = value(x))
}

Example

In this example we consider a communication channel with two possible inputs and outputs, so $n = m = 2$ . The channel transition matrix we use is:

$P = (\begin{matrix} 0.75 & 0.25 \\ 0.25 & 0.75 \end{matrix}) .$

Note that the columns of $P$ must sum to 1 and all elements of $P$ must be positive.

n <- 2
m <- 2
P <- matrix(c(0.75, 0.25,
              0.25, 0.75), nrow = m, ncol = n, byrow = TRUE)

result <- channel_capacity(n, m, P)
cat("Problem status:", result$status, "\n")
cat(sprintf("Optimal value of C = %.4g\n", result$capacity))
cat("Optimal variable x =\n")
print(round(result$x, 4))

Problem status: optimal 
Optimal value of C = 0.1887
Optimal variable x =
     [,1]
[1,]  0.5
[2,]  0.5

The optimal input distribution is uniform ( $x = [0.5, 0.5]$ ) and the channel capacity is approximately $C \approx 0.1887$ bits.

A Larger Example

We can also consider a non-symmetric channel with $n = 4$ inputs and $m = 3$ outputs.

n2 <- 4
m2 <- 3
set.seed(42)
## Generate a random transition matrix (columns sum to 1)
P2 <- matrix(runif(m2 * n2), nrow = m2, ncol = n2)
P2 <- sweep(P2, 2, colSums(P2), "/")

result2 <- channel_capacity(n2, m2, P2)
cat("Problem status:", result2$status, "\n")
cat(sprintf("Optimal value of C = %.4g\n", result2$capacity))
cat("Optimal variable x =\n")
print(round(result2$x, 4))

Problem status: optimal 
Optimal value of C = 0.152
Optimal variable x =
       [,1]
[1,] 0.4805
[2,] 0.0000
[3,] 0.5195
[4,] 0.0000

Session Info

R version 4.5.2 (2025-10-31)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.3

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] CVXR_1.8.0.9207

loaded via a namespace (and not attached):
 [1] slam_0.1-55       cli_3.6.5         knitr_1.51        ECOSolveR_0.6.1  
 [5] rlang_1.1.7       xfun_0.56         clarabel_0.11.2   otel_0.2.0       
 [9] gurobi_13.0-1     Rglpk_0.6-5.1     highs_1.12.0-3    cccp_0.3-3       
[13] scs_3.2.7         S7_0.2.1          jsonlite_2.0.0    Rcplex_0.3-8     
[17] backports_1.5.0   Rmosek_11.1.1     htmltools_0.5.9   gmp_0.7-5.1      
[21] rmarkdown_2.30    piqp_0.6.2        grid_4.5.2        evaluate_1.0.5   
[25] fastmap_1.2.0     yaml_2.3.12       compiler_4.5.2    codetools_0.2-20 
[29] htmlwidgets_1.6.4 Rcpp_1.1.1        osqp_1.0.0        lattice_0.22-9   
[33] digest_0.6.39     checkmate_2.3.4   Matrix_1.7-4      tools_4.5.2

References

Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press. Exercise 4.57, pages 207–208.
Shannon, C.E. (1948). A Mathematical Theory of Communication. Bell System Technical Journal, 27, 379–423, 623–656.