Queuing Design

Author

CVXPY Developers and Balasubramanian Narasimhan

Introduction

In this example, we consider the design of an \(M/M/N\) queuing system, and we study the sensitivity of the design with respect to various parameters, such as the maximum total delay and total service rate. This example was described in the paper Differentiating through Log-Log Convex Programs.

We consider the optimization of a (Markovian) queuing system, with \(N\) queues. A queuing system is a collection of queues, in which queued items wait to be served; the queued items might be threads in an operating system, or packets in an input or output buffer of a networking system. A natural goal is to minimize the service load of the system, given constraints on various properties of the queuing system, such as limits on the maximum delay or latency. In this example, we formulate this design problem as a log-log convex program (LLCP), and compute the sensitivity of the design variables with respect to the parameters.

Problem Formulation

We assume that items arriving at the \(i\)th queue are generated by a Poisson process with rate \(\lambda_i\), and that the service times for the \(i\)th queue follow an exponential distribution with parameter \(\mu_i\), for \(i = 1, \ldots, N\). The service load of the queuing system is a function \(\ell : \mathbf{R}^{N}_{++} \times \mathbf{R}^{N}_{++} \to \mathbf{R}^{N}_{++}\) of the arrival rate vector \(\lambda\) and the service rate vector \(\mu\), with components

\[ \ell_i(\lambda, \mu) = \frac{\mu_i}{\lambda_i}, \quad i = 1, \ldots, N. \]

(This is the reciprocal of the traffic load, which is usually denoted by \(\rho\).) Similarly, the queue occupancy, the average delay, and the total delay of the system are (respectively) functions \(q\), \(w\), and \(d\) of \(\lambda\) and \(\mu\), with components

\[ q_i(\lambda, \mu) = \frac{\ell_i(\lambda, \mu)^{-2}}{1 - \ell_i(\lambda, \mu)^{-1}}, \quad w_i(\lambda, \mu) = \frac{q_i(\lambda, \mu)}{\lambda_i} + \frac{1}{\mu_i}, \quad d_i(\lambda, \mu) = \frac{1}{\mu_i - \lambda_i}. \]

The queuing system has limits on the queue occupancy, average queuing delay, and total delay, which must satisfy

\[ q(\lambda, \mu) \leq q_{\max}, \quad w(\lambda, \mu) \leq w_{\max}, \quad d(\lambda, \mu) \leq d_{\max}, \]

where \(q_{\max}\), \(w_{\max}\), and \(d_{\max} \in \mathbf{R}^{N}_{++}\) are parameters and the inequalities are meant elementwise. Additionally, the arrival rate vector \(\lambda\) must be at least \(\lambda_{\min} \in \mathbf{R}^{N}_{++}\), and the sum of the service rates must be no greater than \(\mu_{\max} \in \mathbf{R}_{++}\).

Our design problem is to choose the arrival rates and service times to minimize a weighted sum of the service loads, \(\gamma^T \ell(\lambda, \mu)\), where \(\gamma \in \mathbf{R}^{N}_{++}\) is the weight vector, while satisfying the constraints. The problem is

\[ \begin{array}{ll} \text{minimize} & \gamma^T \ell(\lambda, \mu) \\ \text{subject to} & q(\lambda, \mu) \leq q_{\max} \\ & w(\lambda, \mu) \leq w_{\max} \\ & d(\lambda, \mu) \leq d_{\max} \\ & \lambda \geq \lambda_{\min}, \quad \sum_{i=1}^{N} \mu_i \leq \mu_{\max}. \end{array} \]

Here, \(\lambda, \mu \in \mathbf{R}^{N}_{++}\) are the variables and \(\gamma, q_{\max}, w_{\max}, d_{\max}, \lambda_{\min} \in \mathbf{R}^{N}_{++}\) and \(\mu_{\max} \in \mathbf{R}_{++}\) are the parameters. This problem is an LLCP.

Setting Up the Problem

N <- 2

mu  <- Variable(N, pos = TRUE, name = "mu")
lam <- Variable(N, pos = TRUE, name = "lambda")
ell <- Variable(N, pos = TRUE, name = "ell")

w_max   <- Parameter(N, pos = TRUE, name = "w_max",   value = c(2.5, 3.0))
d_max   <- Parameter(N, pos = TRUE, name = "d_max",   value = c(2.0, 2.0))
q_max   <- Parameter(N, pos = TRUE, name = "q_max",   value = c(4.0, 5.0))
lam_min <- Parameter(N, pos = TRUE, name = "lam_min", value = c(0.5, 0.8))
mu_max  <- Parameter(1, pos = TRUE, name = "mu_max",  value = 3.0)
gamma   <- Parameter(N, pos = TRUE, name = "gamma",   value = c(1.0, 2.0))

## Queuing system functions
lq <- (ell)^(-2) / one_minus_pos(ell^(-1))
q  <- lq
w  <- lq / lam + 1 / mu
d  <- 1 / diff_pos(mu, lam)

constraints <- list(
    w <= w_max,
    d <= d_max,
    q <= q_max,
    lam >= lam_min,
    sum_entries(mu) <= mu_max,
    ell == mu / lam
)

objective_fn <- t(gamma) %*% ell

problem <- Problem(Minimize(objective_fn), constraints)

Solving the Problem

result <- psolve(problem, requires_grad = TRUE, gp = TRUE, solve_method = "SCS",
                 eps = 1e-14, max_iters = 10000, mode = "dense")
cat(sprintf("Optimal value: %f\n", result))

Optimal value: 4.457107

The solution is printed below.

cat(sprintf("mu:  [%f, %f]\n", value(mu)[1], value(mu)[2]))

mu:  [1.328427, 1.671573]

cat(sprintf("lam: [%f, %f]\n", value(lam)[1], value(lam)[2]))

lam: [0.828427, 1.171573]

cat(sprintf("ell: [%f, %f]\n", value(ell)[1], value(ell)[2]))

ell: [1.603553, 1.426777]

Sensitivities

We compute the derivative of each variable with respect to the parameters. One takeaway is that the solution is highly sensitive to the values of \(d_{\max}\) and \(\mu_{\max}\), and that increasing these parameters would decrease the service loads, especially on the first queue.

result <- psolve(problem, requires_grad = TRUE, gp = TRUE, solve_method = "SCS",
                 eps = 1e-14, max_iters = 10000, mode = "dense")

params_list <- list(gamma = gamma, w_max = w_max, d_max = d_max,
                    q_max = q_max, lam_min = lam_min, mu_max = mu_max)

for (var in list(lam, mu, ell)) {
    cat(sprintf("Variable: %s\n", name(var)))

    cat("Gradient with respect to first component\n")
    gradient(var) <- c(1.0, 0.0)
    backward(problem)
    for (nm in names(params_list)) {
        p <- params_list[[nm]]
        g <- gradient(p)
        if (length(g) == 2) {
            cat(sprintf("  %s: %.3g, %.3g\n", nm, g[1], g[2]))
        } else {
            cat(sprintf("  %s: %.3g\n", nm, g))
        }
    }

    cat("Gradient with respect to second component\n")
    gradient(var) <- c(0.0, 1.0)
    backward(problem)
    for (nm in names(params_list)) {
        p <- params_list[[nm]]
        g <- gradient(p)
        if (length(g) == 2) {
            cat(sprintf("  %s: %.3g, %.3g\n", nm, g[1], g[2]))
        } else {
            cat(sprintf("  %s: %.3g\n", nm, g))
        }
    }

    gradient(var) <- c(0.0, 0.0)
    cat("\n")
}

Variable: lambda
Gradient with respect to first component
  gamma: 0.154, -0.0771
  w_max: -2.74e-14, 7.7e-15
  d_max: -0.404, -0.162
  q_max: -5.44e-15, 7.61e-15
  lam_min: 2.22e-14, -1.46e-14
  mu_max: 0.899
Gradient with respect to second component
  gamma: -0.331, 0.166
  w_max: -3.16e-14, 1.01e-14
  d_max: -0.119, -0.361
  q_max: -6.55e-15, 8.89e-15
  lam_min: 2.04e-14, -1.85e-14
  mu_max: 1.07

Variable: mu
Gradient with respect to first component
  gamma: 0.154, -0.0771
  w_max: -2.14e-16, -1.18e-15
  d_max: -0.654, -0.162
  q_max: 0, -8.4e-16
  lam_min: 5.77e-15, 6.4e-15
  mu_max: -0.101
Gradient with respect to second component
  gamma: -0.331, 0.166
  w_max: -4.5e-14, 7.7e-15
  d_max: -0.119, -0.611
  q_max: -9.77e-15, 9.94e-15
  lam_min: 4e-14, -1.2e-14
  mu_max: 0.0706

Variable: ell
Gradient with respect to first component
  gamma: -0.177, 0.0884
  w_max: 8.27e-15, -2.89e-15
  d_max: -0.289, -0.164
  q_max: 1.44e-15, -2.71e-15
  lam_min: -6.22e-15, 8.41e-15
  mu_max: -0.302
Gradient with respect to second component
  gamma: 0.0884, -0.0442
  w_max: -2.39e-14, 4.14e-15
  d_max: -0.0975, -0.223
  q_max: -4.44e-15, 4.71e-15
  lam_min: 2.4e-14, -5.82e-15
  mu_max: -0.213

Perturbation Analysis

Next, we perturb each parameter by a small amount, and compare the prediction of a first-order approximation to the solution of the perturbed problem to the true solution.

result <- psolve(problem, requires_grad = TRUE, gp = TRUE, solve_method = "SCS",
                 eps = 1e-14, max_iters = 10000, mode = "dense")

mu_value  <- value(mu)
lam_value <- value(lam)

delta_frac <- 0.01
for (nm in names(params_list)) {
    p <- params_list[[nm]]
    delta(p) <- value(p) * delta_frac
}

derivative(problem)

lam_pred <- (delta(lam) / lam_value) * 100
mu_pred  <- (delta(mu) / mu_value) * 100

cat(sprintf("lam predicted (percent change): [%.4f, %.4f]\n", lam_pred[1], lam_pred[2]))

lam predicted (percent change): [2.0000, 2.0000]

cat(sprintf("mu  predicted (percent change): [%.4f, %.4f]\n", mu_pred[1], mu_pred[2]))

mu  predicted (percent change): [0.8708, 1.1026]

## Now solve the perturbed problem
for (nm in names(params_list)) {
    p <- params_list[[nm]]
    value(p) <- value(p) + delta(p)
}
psolve(problem, solver = "SCS", gp = TRUE, eps = 1e-14, max_iters = 10000)

[1] 4.458205

lam_actual <- ((value(lam) - lam_value) / lam_value) * 100
mu_actual  <- ((value(mu) - mu_value) / mu_value) * 100

cat(sprintf("lam actual    (percent change): [%.4f, %.4f]\n", lam_actual[1], lam_actual[2]))

lam actual    (percent change): [2.0174, 2.0167]

cat(sprintf("mu  actual    (percent change): [%.4f, %.4f]\n", mu_actual[1], mu_actual[2]))

mu  actual    (percent change): [0.8830, 1.1153]

Session Info

R version 4.6.0 (2026-04-24)
Platform: aarch64-apple-darwin23
Running under: macOS Tahoe 26.5.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] CVXR_1.9.1

loaded via a namespace (and not attached):
 [1] backports_1.5.1   digest_0.6.39     fastmap_1.2.0     xfun_0.58        
 [5] Matrix_1.7-5      lattice_0.22-9    osqp_1.0.0        knitr_1.51       
 [9] gmp_0.7-5.1       htmltools_0.5.9   rmarkdown_2.31    cli_3.6.6        
[13] S7_0.2.2          clarabel_0.11.2   grid_4.6.0        scs_3.2.7        
[17] compiler_4.6.0    highs_1.14.0-2    tools_4.6.0       checkmate_2.3.4  
[21] evaluate_1.0.5    diffcp_0.1.1      Rcpp_1.1.1-1.1    yaml_2.3.12      
[25] otel_0.2.0        rlang_1.2.0       jsonlite_2.0.0    htmlwidgets_1.6.4