Research question

The Refugee Act of 1980 extended asylum protections in U.S. immigration law. Because the Act was sponsored by Democrats, a plausible hypothesis is that support was partisan: Democrats would tend to vote in favor (Pro) and Republicans against (Con).1

The data below record the vote of 407 U.S. House members alongside their party affiliation. Optimal Data Analysis (UniODA) is used to determine whether party affiliation discriminates vote direction, and to quantify the strength of the association.

Data

Party affiliation (0 = Republican, 1 = Democrat) is the attribute; vote (0 = Con, 1 = Pro) is the class variable. Published cell frequencies are reconstructed directly into observation-level vectors, so no external data file is required.

library(oda)

# Cross-classification: rows = vote (class), cols = party (attribute).
#          Rep (0)   Dem (1)
#  Con (0)   118       78     n(Con) = 196
#  Pro (1)    34      177     n(Pro) = 211
vote  <- c(rep(0L, 118), rep(0L,  78), rep(1L,  34), rep(1L, 177))
party <- c(rep(0L, 118), rep(1L,  78), rep(0L,  34), rep(1L, 177))

table(vote, party,
      dnn = c("Vote (0=Con, 1=Pro)", "Party (0=Rep, 1=Dem)"))
#>                    Party (0=Rep, 1=Dem)
#> Vote (0=Con, 1=Pro)   0   1
#>                   0 118  78
#>                   1  34 177
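
As a quick sanity check on the reconstruction, the margins of the rebuilt vectors can be compared against the published totals. The following is a base-R sketch only; `tab` is an illustrative name and nothing here calls the oda package.

```r
# Rebuild the observation-level vectors from the published cell counts.
vote  <- c(rep(0L, 118), rep(0L, 78), rep(1L, 34), rep(1L, 177))
party <- c(rep(0L, 118), rep(1L, 78), rep(0L, 34), rep(1L, 177))

tab <- table(vote, party)

# Row margins are the class sizes: n(Con) = 196, n(Pro) = 211.
rowSums(tab)

# Column margins are the party sizes (Republicans, Democrats).
colSums(tab)

# Share of each party voting Con / Pro (each column sums to 1).
round(prop.table(tab, margin = 2), 3)
```

If the row margins do not reproduce 196 and 211, the `rep()` counts were mistyped and the fitted model would not match the published analysis.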

Fit the ODA model

Party affiliation is binary (0/1); ODA scans it as an ordered attribute (no categorical flag), which is consistent with the MegaODA reference analysis. Because the analysis specifies a directional hypothesis a priori (Democrats favor the Act, i.e. higher party values predict vote = Pro), direction = "greater" requests the directional ordered ODA analysis of MPE Chapter 2. A leave-one-out (LOO) jackknife validity analysis is included.

# Canonical reference run (mc_iter = 25000L; not evaluated in CRAN vignette)
fit <- oda_fit(
  x         = party,
  y         = vote,
  attr_type = "ordered",
  direction = "greater",
  mc_iter   = 25000L,
  loo       = "on"
)

# CRAN-safe run: mc_iter = 500L for vignette rendering speed.
# Training rule, ESS, and confusion matrix are identical to the canonical run.
# The MC p-value reflects fewer permutations; use the canonical run for publication.
fit <- oda_fit(
  x         = party,
  y         = vote,
  attr_type = "ordered",
  direction = "greater",
  mc_iter   = 500L,
  mc_seed   = 42L,
  loo       = "on"
)

Rule and confusion matrix

print(fit)
#> 
#> ODA (binary)  attr_type=ordered  priors=TRUE  n=407
#> 
#> Rule: <= 0.5 --> 0   |   > 0.5 --> 1
#> 
#>   CLASS       n     PAC
#>       0     196   60.2%
#>       1     211   83.9%
#> 
#>   Mean PAC: 72.05%   ESS: 44.09%  p(MC): < .001
#> 
#>   -- LOO --
#>   CLASS       n     PAC
#>       0     196   60.2%
#>       1     211   83.9%
#> 
#>   LOO ESS: 44.09%  p(LOO): < .001

ODA identified a single cut at 0.5, separating the two party values:

  • If party <= 0.5 (Republican) -> predict vote = Con (0)
  • If party > 0.5 (Democrat) -> predict vote = Pro (1)

This rule is consistent with the directional hypothesis: Democrats tended to support the Act and Republicans tended to oppose it.
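
Because the attribute takes only two values, the rule can also be applied by hand. The sketch below uses base R only and does not call the oda package; `pred` and `manual_conf` are illustrative names.

```r
vote  <- c(rep(0L, 118), rep(0L, 78), rep(1L, 34), rep(1L, 177))
party <- c(rep(0L, 118), rep(1L, 78), rep(0L, 34), rep(1L, 177))

# Apply the cut at 0.5: party > 0.5 -> predict Pro (1), otherwise Con (0).
pred <- as.integer(party > 0.5)

# Cross-tabulate actual vote (rows) against predicted vote (columns).
manual_conf <- table(Actual = vote, Predicted = pred)
manual_conf

# Overall proportion of members classified correctly.
mean(pred == vote)
```

The resulting table should have the same four cell counts as the data cross-classification, since the prediction is simply the party code.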

# Confusion matrix: actual vote (rows) x predicted vote (cols)
conf_mat <- matrix(
  c(fit$confusion$TN, fit$confusion$FP,
    fit$confusion$FN, fit$confusion$TP),
  nrow = 2L, byrow = TRUE,
  dimnames = list(Actual    = c("Con(0)", "Pro(1)"),
                  Predicted = c("Con(0)", "Pro(1)"))
)
print(conf_mat)
#>         Predicted
#> Actual   Con(0) Pro(1)
#>   Con(0)    118     78
#>   Pro(1)     34    177

ESS / PAC / PV interpretation

summary(fit)
#> 
#> ODA Summary (binary)  status=valid  n=407
#>   attr_type=ordered  priors=TRUE  weights=FALSE
#>   Rule: <= 0.5 --> 0   |   > 0.5 --> 1
#> 
#>   -- Train --
#>     Mean PAC (wt): 72.05%   ESS: 44.09%
#>     Sensitivity: 0.839   Specificity: 0.602
#>     p(MC): < .001  [MC permutation, one-tailed]
#>   -- LOO --
#>     CLASS       n     PAC
#>         0     196   60.2%
#>         1     211   83.9%
#>     LOO ESS: 44.09%
#>     LOO Mean PAC: 72.05%
#>     p(LOO): < .001  [Fisher exact (2x2), one-tailed]

# Predictive value: accuracy when the model makes a prediction into each class
pv_con <- fit$confusion$TN / (fit$confusion$TN + fit$confusion$FN)
pv_pro <- fit$confusion$TP / (fit$confusion$TP + fit$confusion$FP)
cat("PV Con (0):", round(pv_con * 100, 1), "%\n")
#> PV Con (0): 77.6 %
cat("PV Pro (1):", round(pv_pro * 100, 1), "%\n")
#> PV Pro (1): 69.4 %

  • PAC (sensitivity per class): 60.2% of Con votes and 83.9% of Pro votes are classified correctly. Because 50% correct per class is expected by chance, the model classifies Pro votes at roughly 1.7 times the chance rate (83.9 / 50 ≈ 1.68).
  • ESS = 44.09% indicates a moderate effect.2 The asymmetry (60% vs. 84%) arises because Pro votes came overwhelmingly from Democrats (177 of 211), whereas Con votes were more evenly split between the parties (118 Republicans, 78 Democrats).
  • PV: When the model predicts a Con vote, it is correct ~77.6% of the time; when it predicts a Pro vote, ~69.4%. Equivalently, 77.6% of Republicans voted Con and 69.4% of Democrats voted Pro.
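
The indices above can be reproduced from the four confusion counts. This is a base-R sketch that takes the counts from the printed confusion matrix; it assumes the standard two-class ESS definition, in which mean PAC is rescaled so that chance (50%) maps to 0 and perfect classification maps to 100.

```r
# Confusion counts (actual x predicted), copied from the matrix above.
TN <- 118; FP <- 78    # actual Con: predicted Con / predicted Pro
FN <- 34;  TP <- 177   # actual Pro: predicted Con / predicted Pro

# PAC: percentage of each actual class that is classified correctly.
pac_con <- 100 * TN / (TN + FP)
pac_pro <- 100 * TP / (TP + FN)

# Mean PAC across the two classes.
mean_pac <- (pac_con + pac_pro) / 2

# ESS: mean PAC rescaled relative to the 50% chance level for two classes.
ess <- 100 * (mean_pac - 50) / (100 - 50)

# PV: percentage of predictions into each class that are correct.
pv_con <- 100 * TN / (TN + FN)
pv_pro <- 100 * TP / (TP + FP)

round(c(pac_con = pac_con, pac_pro = pac_pro, mean_pac = mean_pac,
        ess = ess, pv_con = pv_con, pv_pro = pv_pro), 2)
```

PAC conditions on the actual vote (rows of the confusion matrix), while PV conditions on the predicted vote (columns), which is why the two pairs of percentages differ.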

Monte Carlo and LOO validity

The directional MC p-value and LOO result are shown in the summary output above.

  • LOO stability: The leave-one-out ESS equals the training ESS exactly (44.09%), indicating that the rule is stable: no single observation alters the model.
  • LOO Fisher exact p < .001: Statistical significance is confirmed in the leave-one-out analysis; the one-sided Fisher test is appropriate for the directional hypothesis.
  • MC p-value: The printed p(MC) is a directional Fisher-randomization p-value (one-tailed), consistent with the a priori hypothesis that Democrats vote Pro. When the observed effect lies in the hypothesized direction, the directional p-value is typically about half the nondirectional p-value. Interpret it against a decision threshold chosen in advance (e.g., p < 0.05).
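
Both p-values can be approximated outside the package. The sketch below is base R only and is not the package's algorithm: it assumes that, for a binary attribute under the directional constraint, the only candidate rule is the cut at 0.5, so the permutation test re-scores that fixed rule on shuffled class labels. `ess_rule`, `n_perm`, and the seed are illustrative choices.

```r
vote  <- c(rep(0L, 118), rep(0L, 78), rep(1L, 34), rep(1L, 177))
party <- c(rep(0L, 118), rep(1L, 78), rep(0L, 34), rep(1L, 177))

# One-sided Fisher exact test on the 2x2 table. With rows = vote and
# columns = party, alternative = "greater" tests odds ratio > 1, i.e.
# Democrats (party = 1) are more likely to vote Pro (vote = 1).
ft <- fisher.test(table(vote, party), alternative = "greater")
ft$p.value

# ESS (in %) of the fixed directional rule: party > 0.5 -> predict Pro.
ess_rule <- function(y, x) {
  pred <- as.integer(x > 0.5)
  pac0 <- mean(pred[y == 0L] == 0L)
  pac1 <- mean(pred[y == 1L] == 1L)
  100 * ((pac0 + pac1) / 2 - 0.5) / 0.5
}

# Directional permutation sketch: shuffle the class labels and count how
# often the shuffled ESS reaches the observed ESS.
set.seed(42)                 # illustrative seed, for repeatability only
n_perm   <- 500L
obs_ess  <- ess_rule(vote, party)
perm_ess <- replicate(n_perm, ess_rule(sample(vote), party))
p_perm   <- mean(perm_ess >= obs_ess)
p_perm
```

Only permutations whose ESS is at least as large as the observed ESS in the hypothesized direction count toward `p_perm`, which is what makes the estimate one-tailed.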

Notes on reproducibility and current scope

Fixture parity. The training rule, confusion matrix, and ESS are verified against MegaODA.exe output in the package test suite (tests/testthat/test-fixture-vignettes.R, Example 1).

MC p-value calibration. The MC p shown here reflects mc_iter = 500L for CRAN build speed. MegaODA reports p = 0.000000 at 25000 iterations; with 500 iterations a near-zero p is still identified as near zero (STOP fires early), but the estimate is coarser. Use the canonical run with mc_iter = 25000L (chunk fit-canonical, eval=FALSE) for publication-quality results. Training ESS and confusion matrix are unaffected by mc_iter.
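
To see why the iteration count matters, the precision of a Monte Carlo p-value can be worked out with generic binomial arithmetic. The sketch below is not specific to the oda package or MegaODA and ignores any early-stopping rule; `mc_resolution` is a hypothetical helper and `p_true = 0.01` is an arbitrary illustrative value.

```r
# Resolution and sampling error of a Monte Carlo p-value estimate,
# treating each iteration as an independent Bernoulli trial.
mc_resolution <- function(n_iter, p_true = 0.01) {
  c(
    n_iter       = n_iter,
    smallest_p   = 1 / n_iter,                            # smallest nonzero estimate
    se_at_p_true = sqrt(p_true * (1 - p_true) / n_iter)   # binomial standard error
  )
}

rbind(mc_resolution(500L), mc_resolution(25000L))
```

Under these assumptions, 500 iterations cannot distinguish p-values below 1/500, whereas 25000 iterations resolve values down to 1/25000 and give a smaller standard error at any fixed true p.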

Directional analysis. This vignette uses direction = "greater" (MPE Chapter 2 binary ordered directional ODA). The directional constraint restricts the MC permutation and LOO searches to the hypothesized direction only, yielding a one-tailed p-value consistent with the a priori hypothesis. The MPE Chapter 4 categorical/table DIRECTIONAL analysis is assigned to Phase 6C and is not yet implemented.