Survey Calibration

Introduction

Calibration is a widely used technique in survey sampling. Suppose \(m\) sampling units in a survey have been assigned initial weights \(d_i\) for \(i = 1,\ldots,m\), and furthermore, there are \(n\) auxiliary variables whose values in the sample are known. Calibration seeks to improve the initial weights \(d_i\) by finding new weights \(w_i\) that incorporate this auxiliary information while perturbing the initial weights as little as possible, , the ratio \(g_i = w_i/d_i\) must be close to one. Such reweighting improves precision of estimates (Chapter 7, T. S. Lumley (2010)).

Let \(X \in {\mathbf R}^{m \times n}\) be the matrix of survey samples, with each column corresponding to an auxiliary variable. Reweighting can be expressed as the optimization problem (see Davies, Gillard, and Zhigljavsky (2016)):

\[ \begin{array}{ll} \mbox{minimize} & \sum_{i=1}^m d_i\phi(g_i) \\ \mbox{subject to} & A^Tg = r \end{array} \]

with respect to \(g \in {\mathbf R}^m\), where \(\phi:{\mathbf R} \rightarrow {\mathbf R}\) is a strictly convex function with \(\phi(1) = 0\), \(r \in {\mathbf R}^n\) are the known population totals of the auxiliary variables, and \(A \in {\mathbf R}^{m \times n}\) is related to \(X\) by \(A_{ij} = d_iX_{ij}\) for \(i = 1,\ldots,m\) and \(j = 1,\ldots,n\).

Raking

A common calibration technique is raking, which uses the penalty function \(\phi(g_i) = g_i\log(g_i) - g_i + 1\) as the calibration metric.

We illustrate with the California Academic Performance Index data in the survey package (T. Lumley (2018)) which also supplies facilities for calibration via the function calibrate. Both the population dataset (apipop) and a simple random sample of \(m = 200\) (apisrs) are provided. Suppose that we wish to reweight the observations in the sample using known totals for two variables from the population: stype, the school type (elementary, middle or high) and sch.wide, whether the school met the yearly target or not. This reweighting would make the sample more representative of the general population.

The code below estimates the weights using survey::calibrate.

data(api)
design_api <- svydesign(id = ~dnum, weights = ~pw, data = apisrs)
## Warning in svydesign(id = ~dnum, weights = ~pw, data = apisrs): partial
## argument match of 'id' to 'ids'
## Warning in svydesign.default(id = ~dnum, weights = ~pw, data = apisrs): partial
## argument match of 'id' to 'ids'
## Warning in match.call(definition, call, expand.dots, envir): partial argument
## match of 'id' to 'ids'
formula <- ~stype + sch.wide
T <- apply(model.matrix(object = formula, data = apipop),
           2,
           sum)

cal_api <- calibrate(design_api, formula, population = T, calfun = cal.raking)
w_survey <- weights(cal_api)

The CVXR formulation follows.

di <- apisrs$pw
X <- model.matrix(object = formula, data = apisrs)
A <- di * X
n <- nrow(apisrs)
g <- Variable(n)
constraints <- list(t(A) %*% g == T)

## Raking
Phi_R <- Minimize(sum(di * (-entr(g) - g + 1)))
p <- Problem(Phi_R, constraints)
res <- solve(p)
w_cvxr <- di * res$getValue(g)

We compare the results below in a table which show them to be identical.

## Using functions in the *un echoed* preamble of this document...
build_table(d1 = build_df(apisrs, "Survey", w_survey),
            d2 = build_df(apisrs, "CVXR", w_cvxr),
            title = "Calibration weights from Raking")
Table 1: Calibration weights from Raking
stype sch.wide Survey wts. Frequency CVXR wts. Frequency
E No 28.911 15 28.911 15
E Yes 31.396 127 31.396 127
H No 29.003 13 29.003 13
H Yes 31.497 12 31.497 12
M No 29.033 9 29.033 9
M Yes 31.529 24 31.529 24

Other Calibration Metrics

Two other penalty functions commonly used are:

  • Quadratic \[ \phi^{Q}(g) = \frac{1}{2}(g-1)^2; \]
  • Logit \[ \phi^{L}(g; l, u) = \frac{1}{C}\biggl[ (g-l)\log\left(\frac{g-l}{1-l}\right) + (u-g)\log\left(\frac{u-g}{u-1}\right) \biggr] \mbox{ for } C = \frac{u-l}{(1-l)(u-1)}. \]

It is again easy to incorporate these in our example and compare to survey results.

Quadratic

The survey function for this calibration is invoked as cal.linear.

## Quadratic
Phi_Q  <- Minimize(sum_squares(g - 1) / 2)
p <- Problem(Phi_Q, constraints)
res <- solve(p)
w_cvxr_q <- di * res$getValue(g)
w_survey_q <- weights(calibrate(design_api, formula, population = T, calfun = cal.linear))

N.B. This example used the SCS solver (before CVXR v0.99-7) as the default ECOS solver produced a different number of unique weights. With updates to both ECOSolveR and scs, the situation is now reversed. Such differences are not unheard of among solvers! However, one can obtain the same results by using additional arguments solver = "SCS" and acceleration_lookback = 5L in the call to CVXR::solve() above.

Table 2: Calibration weights from Quadratic metric
stype sch.wide Survey wts. Frequency CVXR wts. Frequency
E No 28.907 15 28.907 15
E Yes 31.397 127 31.397 127
H No 29.005 13 29.005 13
H Yes 31.495 12 31.495 12
M No 29.037 9 29.037 9
M Yes 31.528 24 31.528 24

Logistic

Finally, the logistic, which requires bounds \(l\) and \(u\) on the coefficients; we use \(l=0.9\) and \(u=1.1\).

u <- 1.10; l <- 0.90
w_survey_l <- weights(calibrate(design_api, formula, population = T, calfun = cal.linear,
                                bounds = c(l, u)))
Phi_L <- Minimize(sum(-entr((g - l) / (u - l))  -
                      entr((u - g) / (u - l)))) 
p <- Problem(Phi_L, c(constraints, list(l <= g, g <= u)))
res <- solve(p)
w_cvxr_l <- di * res$getValue(g)
Table 3: Calibration weights from Logit metric
stype sch.wide Survey wts. Frequency CVXR wts. Frequency
E No 28.907 15 28.929 15
E Yes 31.397 127 31.394 127
H No 29.005 13 28.995 13
H Yes 31.495 12 31.505 12
M No 29.037 9 29.014 9
M Yes 31.528 24 31.536 24

Further Metrics

Following examples in survey::calibrate, we can try a few other metrics.

First, the hellinger distance.

hellinger <- make.calfun(Fm1 = function(u, bounds)  ((1 - u / 2)^-2) - 1,
                         dF= function(u, bounds) (1 -u / 2)^-3 ,
                         name = "Hellinger distance")
w_survey_h <- weights(calibrate(design_api, formula, population = T, calfun = hellinger))
Phi_h <- Minimize(sum((1 - g / 2)^(-2)))
p <- Problem(Phi_h, constraints)
res <- solve(p)
w_cvxr_h <- di * res$getValue(g)
Table 4: Calibration weights from Hellinger distance metric
stype sch.wide Survey wts. Frequency CVXR wts. Frequency
E No 28.913 15 28.890 15
E Yes 31.396 127 31.399 127
H No 29.002 13 29.011 13
H Yes 31.498 12 31.488 12
M No 29.031 9 29.056 9
M Yes 31.530 24 31.521 24

Next, the derivative of the inverse hyperbolic sine.

w_survey_s <- weights(calibrate(design_api, formula, population = T, calfun = cal.sinh,
                                bounds = c(l, u)))
Phi_s <- Minimize(sum( 0.5 * (exp(g) + exp(-g))))
p <- Problem(Phi_s, c(constraints, list(l <= g, g <= u)))
res <- solve(p)
w_cvxr_s <- di * res$getValue(g)
Table 5: Calibration weights from derivative of sinh metric
stype sch.wide Survey wts. Frequency CVXR wts. Frequency
E No 28.911 15 28.904 15
E Yes 31.396 127 31.397 127
H No 29.003 13 29.006 13
H Yes 31.497 12 31.494 12
M No 29.033 9 29.041 9
M Yes 31.529 24 31.526 24
## Testthat Results: No output is good

Session Info

sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-apple-darwin20
## Running under: macOS Sequoia 15.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/Los_Angeles
## tzcode source: internal
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices datasets  utils     methods  
## [8] base     
## 
## other attached packages:
## [1] survey_4.4-2     survival_3.7-0   Matrix_1.7-1     dplyr_1.1.4     
## [5] kableExtra_1.4.0 CVXR_1.0-15      testthat_3.2.1.1 here_1.0.1      
## 
## loaded via a namespace (and not attached):
##  [1] xfun_0.49         bslib_0.8.0       lattice_0.22-6    Rmosek_10.2.0    
##  [5] vctrs_0.6.5       tools_4.4.2       generics_0.1.3    tibble_3.2.1     
##  [9] fansi_1.0.6       highr_0.11        pkgconfig_2.0.3   desc_1.4.3       
## [13] assertthat_0.2.1  lifecycle_1.0.4   compiler_4.4.2    stringr_1.5.1    
## [17] brio_1.1.5        munsell_0.5.1     gurobi_11.0-0     mitools_2.4      
## [21] codetools_0.2-20  ECOSolveR_0.5.5   htmltools_0.5.8.1 sass_0.4.9       
## [25] cccp_0.3-1        yaml_2.3.10       gmp_0.7-5         pillar_1.9.0     
## [29] jquerylib_0.1.4   MASS_7.3-61       rcbc_0.1.0.9001   clarabel_0.9.0.1 
## [33] Rcplex_0.3-6      cachem_1.1.0      tidyselect_1.2.1  digest_0.6.37    
## [37] stringi_1.8.4     slam_0.1-54       bookdown_0.41     splines_4.4.2    
## [41] rprojroot_2.0.4   fastmap_1.2.0     colorspace_2.1-1  cli_3.6.3        
## [45] magrittr_2.0.3    utf8_1.2.4        Rmpfr_0.9-5       scales_1.3.0     
## [49] bit64_4.5.2       rmarkdown_2.29    bit_4.5.0         blogdown_1.19    
## [53] evaluate_1.0.1    knitr_1.48        Rglpk_0.6-5.1     viridisLite_0.4.2
## [57] rlang_1.1.4       Rcpp_1.0.13-1     glue_1.8.0        DBI_1.2.3        
## [61] xml2_1.3.6        pkgload_1.4.0     osqp_0.6.3.3      svglite_2.1.3    
## [65] rstudioapi_0.17.1 jsonlite_1.8.9    R6_2.5.1          systemfonts_1.1.0

Source

R Markdown

References

Davies, G., J. Gillard, and A. Zhigljavsky. 2016. “Comparative Study of Different Penalty Functions and Algorithms in Survey Calibration.” In Advances in Stochastic and Deterministic Global Optimization, 87–127. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-29975-4_6.
Lumley, Thomas. 2018. “Survey: Analysis of Complex Survey Samples.”
Lumley, Thomas S. 2010. Complex Surveys: A Guide to Analysis Using r. Wiley Publishing.