Estimates the precision of an observed binary classification effect by comparing bootstrapped model and chance distributions. Based on the NOVOboot methodology (Yarnold 2020; Yarnold & Soltysik 2016).

Fixed-confusion bootstrap: This function resamples from the observed confusion matrix only. It does not refit ODA or CTA models, so it does not estimate model-selection variability. The model distribution is generated by resampling paired (actual, predicted) rows from the expanded confusion table; the chance distribution is generated by resampling actual and predicted labels independently, which breaks their association. Novometric significance (Axiom 1) is declared when the 95% confidence intervals for model and chance ESS do not overlap, with the model interval lying entirely above the chance interval.

Usage

novo_boot_ci(x, ...)

# Default S3 method
novo_boot_ci(x,
             nboot       = 5000L,
             seed        = NULL,
             sample_frac = 0.5,
             probs       = c(0, .025, .05, .25, .5, .75, .95, .975, 1),
             alternative = c("two.sided", "greater", "less"),
             ...)

# S3 method for class 'oda_fit'
novo_boot_ci(x,
             nboot       = 5000L,
             seed        = NULL,
             sample_frac = 0.5,
             probs       = c(0, .025, .05, .25, .5, .75, .95, .975, 1),
             alternative = c("two.sided", "greater", "less"),
             ...)

# S3 method for class 'cta_tree'
novo_boot_ci(x,
             nboot       = 5000L,
             seed        = NULL,
             sample_frac = 0.5,
             probs       = c(0, .025, .05, .25, .5, .75, .95, .975, 1),
             alternative = c("two.sided", "greater", "less"),
             node_id     = NULL,
             weighted    = FALSE,
             ...)

# S3 method for class 'cta_ort'
novo_boot_ci(x,
             nboot       = 5000L,
             seed        = NULL,
             sample_frac = 0.5,
             probs       = c(0, .025, .05, .25, .5, .75, .95, .975, 1),
             alternative = c("two.sided", "greater", "less"),
             stratum_id  = NULL,
             weighted    = FALSE,
             ...)

# S3 method for class 'novo_boot_ci'
print(x, ...)

Arguments

x

For the default method: a 2x2 integer matrix, rows = actual class, columns = predicted class. This is the same [actual, predicted] convention used by training_confusion in a cta_tree and by oda_confusion(). Use byrow = TRUE when constructing with matrix(). For the oda_fit, cta_tree, and cta_ort methods: a fitted model object from which the training confusion matrix is extracted.

nboot

Number of bootstrap replicates. Default 5000.

seed

Integer seed passed to set.seed, or NULL to use the current RNG state. Use a fixed seed for reproducibility.

sample_frac

Fraction of n sampled per replicate (with replacement). Default 0.5, matching NOVOboot.

probs

Quantile probability levels for the summary table.

alternative

Direction for exact Fisher p-values: "two.sided" (default), "greater", or "less".

node_id

Integer node id of a terminal (leaf) node in a cta_tree. When supplied, the bootstrap uses the class counts for that specific terminal node rather than the full-tree confusion. Only valid for novo_boot_ci.cta_tree.

stratum_id

Integer stratum id from cta_ort$strata. When supplied, the bootstrap uses the class counts for that single terminal LORT stratum rather than the full-LORT confusion. Only valid for novo_boot_ci.cta_ort.

weighted

Logical. When node_id or stratum_id is supplied, weighted = TRUE uses case-weighted class counts and weighted = FALSE (default) uses raw integer counts. Ignored for full-tree paths.

...

For the generic and S3 fit methods: additional arguments passed to novo_boot_ci.default. For print.novo_boot_ci: currently ignored.
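As a minimal sketch of the [actual, predicted] convention for x, the snippet below builds the same 2x2 matrix two ways in base R: from literal counts with byrow = TRUE, and from label vectors with table(). The vectors actual and predicted and the 0/1 class coding are hypothetical illustration data, not part of the package.

```r
# Counts typed row by row: rows = actual class, columns = predicted class.
conf <- matrix(c(146L, 40L,
                  36L, 33L), nrow = 2, byrow = TRUE)

# Hypothetical 0/1 label vectors that reproduce the same four cell counts.
actual    <- rep(c(0L, 0L, 1L, 1L), times = c(146L, 40L, 36L, 33L))
predicted <- rep(c(0L, 1L, 0L, 1L), times = c(146L, 40L, 36L, 33L))

# table() places its first argument on the rows, so pass `actual` first.
conf_tab <- unclass(table(actual, predicted))

# Cell-by-cell comparison; dimnames on conf_tab do not affect the result.
same_counts <- all(conf == conf_tab)
same_counts
```

Without byrow = TRUE, matrix() fills column by column, which would swap the two off-diagonal cells and misstate the errors for each class.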

Value

An object of class novo_boot_ci, a list with:

call

The matched call.

confusion

Input confusion matrix (integer, 2x2).

n

Total observations (sum(x)).

k

Observations sampled per replicate (round(sample_frac * n)).

nboot, sample_frac, probs, alternative

Input parameters.

has_zero_cells

Logical; TRUE if any cell of x is zero. Does not stop computation; NA propagates for affected metrics in affected replicates.

observed

Data frame with one row per metric. Columns: metric, value. Reports the observed (not bootstrapped) sensitivity, specificity, mean_pac, ess, odds_ratio, and risk_ratio computed directly from the input confusion matrix.

model

Data frame (nboot rows). Per-replicate model bootstrap distributions: sensitivity, specificity, mean_pac, ess (all in %), odds_ratio, risk_ratio, p_value. NA for undefined OR/RR.

chance

Data frame (nboot rows). Same columns as model. Generated by independently resampling actual and predicted labels (null of no classification association).

quantiles

Data frame (length(probs) rows). Quantiles of each metric for model and chance across all replicates, including p_value_model and p_value_chance.

ci

Data frame (one row per metric). Fixed 95% CI bounds (2.5th and 97.5th percentiles) for model and chance. Columns: metric, model_lower, model_upper, chance_lower, chance_upper, overlap.

significant

Logical scalar. TRUE if the ESS model 95% CI lower bound exceeds the ESS chance 95% CI upper bound (the novometric Axiom 1 CI non-overlap criterion).

source_type

Character. Evidence provenance tag: "matrix", "oda_fit", "cta_tree", "cta_tree_node", "cta_ort", or "cta_ort_stratum".

source_id

Integer or NA. Node or stratum id when evidence came from a specific sub-unit; NA for full-tree paths.

weighted

Logical or NA. TRUE when weighted class counts were used; FALSE for raw counts; NA for the default matrix path.

Details

Model distribution: The input confusion matrix is expanded to n paired (actual, predicted) observation rows. For each replicate, k row indices are drawn with replacement, preserving the observed (actual, predicted) joint distribution. This mirrors the NOVOboot row-resampling approach.
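The row-resampling step described above could be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal code: the 0/1 class coding and the names cells, rows, idx, and rep_conf are hypothetical.

```r
conf <- matrix(c(146L, 40L,
                  36L, 33L), nrow = 2, byrow = TRUE)
n <- sum(conf)          # total observations
k <- round(0.5 * n)     # rows drawn per replicate at sample_frac = 0.5

# One (actual, predicted) combination per cell. expand.grid() varies the
# first column fastest, which matches the column-major order of as.vector().
cells <- expand.grid(actual = 0:1, predicted = 0:1)

# Expand to n paired rows by repeating each combination by its cell count.
rows <- cells[rep(seq_len(nrow(cells)), times = as.vector(conf)), ]

set.seed(42)
# Draw k row indices with replacement; each draw keeps its pair intact.
idx <- sample.int(n, size = k, replace = TRUE)

# Re-tabulate one replicate; factor levels keep the table 2x2 even if a
# cell happens to be empty in this draw.
rep_conf <- table(factor(rows$actual[idx],    levels = 0:1),
                  factor(rows$predicted[idx], levels = 0:1))
```

Repeating the draw nboot times and computing the metrics on each rep_conf would give a per-replicate distribution of the kind stored in the model data frame.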

Chance distribution: Actual and predicted labels are resampled independently for each replicate, breaking any association between them. This generates the null distribution against which the model effect is compared.
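One way to realise the independent resampling described above is to draw the actual and predicted labels with separate index vectors, so that only the marginal class frequencies are retained. The sketch below assumes a 0/1 class coding; all variable names are hypothetical and the package may implement this step differently.

```r
conf <- matrix(c(146L, 40L,
                  36L, 33L), nrow = 2, byrow = TRUE)
n <- sum(conf)
k <- round(0.5 * n)

# Marginal label vectors: row sums give actual classes, column sums give
# predicted classes.
actual    <- rep(0:1, times = rowSums(conf))
predicted <- rep(0:1, times = colSums(conf))

set.seed(42)
# Two separate draws: the i-th actual label and the i-th predicted label
# no longer come from the same observation, so their pairing is broken.
a <- actual[sample.int(n, size = k, replace = TRUE)]
p <- predicted[sample.int(n, size = k, replace = TRUE)]

chance_conf <- table(factor(a, levels = 0:1),
                     factor(p, levels = 0:1))
```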

p-values: An exact 2x2 Fisher p-value is computed for every replicate confusion matrix for both model and chance distributions. These form precision distributions and complement the CI non-overlap criterion; they are not a substitute for it.
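For a single 2x2 table, an exact Fisher p-value can be obtained with fisher.test() from the stats package. Whether novo_boot_ci calls this function internally is an assumption; the sketch only shows how the alternative argument changes the test on the example confusion matrix.

```r
conf <- matrix(c(146L, 40L,
                  36L, 33L), nrow = 2, byrow = TRUE)

# Two-sided exact test of association between actual and predicted class.
p_two <- fisher.test(conf, alternative = "two.sided")$p.value

# One-sided test: the alternative is an odds ratio greater than 1.
p_greater <- fisher.test(conf, alternative = "greater")$p.value

c(two.sided = p_two, greater = p_greater)
```

Applying the same call to every replicate table would yield a distribution of p-values like the p_value column of the model and chance data frames.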

Novometric Axiom 1: A statistically significant effect exists when the exact discrete confidence intervals for model and chance performance do not overlap. significant = TRUE indicates the ESS model 95% CI lies entirely above the ESS chance 95% CI.
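The non-overlap rule reduces to a single comparison of interval bounds. The sketch below applies it to the ESS bounds printed in the Examples output (model 15.11 to 51.79, chance -17.11 to 21.60); the names model_ci and chance_ci are hypothetical.

```r
# ESS 95% CI bounds copied from the Examples output.
model_ci  <- c(lower = 15.11,  upper = 51.79)
chance_ci <- c(lower = -17.11, upper = 21.60)

# Significant only if the model interval lies entirely above chance.
significant <- model_ci[["lower"]] > chance_ci[["upper"]]

# Generic interval-overlap test, for comparison with the Overlap column.
overlap <- model_ci[["lower"]] <= chance_ci[["upper"]] &&
           chance_ci[["lower"]] <= model_ci[["upper"]]

significant  # FALSE: 15.11 does not exceed 21.60
overlap      # TRUE
```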

ESS formula: ESS(%) = 100 * (mean_PAC - 0.5) / 0.5, where mean_PAC is on the proportion scale (0 to 1); consistent with oda_ess_from_meanpac.
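As a worked check of the ESS formula, the sketch below recomputes the observed values for the example confusion matrix in base R. It assumes row and column 2 hold the positive class, which is consistent with the sensitivity and specificity ranges in the Examples output.

```r
conf <- matrix(c(146L, 40L,
                  36L, 33L), nrow = 2, byrow = TRUE)

# Per-class accuracy: correct predictions divided by the actual-class total.
specificity <- conf[1, 1] / sum(conf[1, ])   # 146 / 186
sensitivity <- conf[2, 2] / sum(conf[2, ])   #  33 / 69

# Mean PAC as a proportion, then ESS rescaled so chance = 0 and perfect = 100.
mean_pac <- (sensitivity + specificity) / 2
ess      <- 100 * (mean_pac - 0.5) / 0.5

round(c(mean_pac_pct = 100 * mean_pac, ess = ess), 2)
# mean_pac_pct is 63.16 and ess is 26.32, matching the Observed line
# in the Examples output.
```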

OR: Diagnostic odds ratio (TP * TN) / (FP * FN). NA when FP = 0 or FN = 0 in a replicate.

RR: Positive predictive value divided by false omission rate, [TP / (TP + FP)] / [FN / (FN + TN)]. NA when undefined.
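The OR and RR definitions above can be sketched for one confusion matrix as follows. The helper safe_ratio is hypothetical and simply returns NA for a zero or missing denominator; the package's exact handling of undefined cases may differ. Row and column 2 are assumed to hold the positive class.

```r
conf <- matrix(c(146L, 40L,
                  36L, 33L), nrow = 2, byrow = TRUE)

# Cell labels under the [actual, predicted] convention.
TN <- conf[1, 1]; FP <- conf[1, 2]
FN <- conf[2, 1]; TP <- conf[2, 2]

# Hypothetical helper: NA instead of Inf or NaN when the ratio is undefined.
safe_ratio <- function(num, den) {
  if (is.na(num) || is.na(den) || den == 0) NA_real_ else num / den
}

# Diagnostic odds ratio: (TP * TN) / (FP * FN).
odds_ratio <- safe_ratio(TP * TN, FP * FN)

# Risk ratio: positive predictive value over false omission rate.
ppv        <- safe_ratio(TP, TP + FP)
false_omit <- safe_ratio(FN, FN + TN)
risk_ratio <- safe_ratio(ppv, false_omit)

c(odds_ratio = odds_ratio, risk_ratio = risk_ratio)
```

With a zero in the FP or FN cell, odds_ratio evaluates to NA in this sketch, which mirrors the NA entries described for affected replicates.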

References

Yarnold PR (2020). Reformulating the First Axiom of Novometric Theory: Assessing Minimum Sample Size in Experimental Design. Optimal Data Analysis 9, 7–8.

Yarnold PR, Soltysik RC (2016). Maximizing Predictive Accuracy. ODA Books.

Examples

# Myeloma MINDENOM=1 confusion (actual x predicted, byrow = TRUE)
conf <- matrix(c(146, 40,
                  36, 33), nrow = 2, byrow = TRUE)
ci <- novo_boot_ci(conf, nboot = 200L, seed = 42L)
ci$significant
#> [1] FALSE
print(ci)
#> Novometric fixed-confusion bootstrap
#>   n = 255   k = 128   nboot = 200   sample_frac = 0.50
#> 
#> Confusion matrix (actual x predicted):
#>      [,1] [,2]
#> [1,]  146   40
#> [2,]   36   33
#> 
#> Observed:  ESS = 26.32%   Mean PAC = 63.16%
#> 
#> 95% CI (2.5% -- 97.5%):
#>   ESS (%)            Model [ 15.11,  51.79]  Chance [-17.11,  21.60]  Overlap: TRUE
#>   Mean PAC (%)       Model [ 57.56,  75.89]  Chance [ 41.45,  60.80]  Overlap: TRUE
#>   Sensitivity (%)    Model [ 35.07,  69.23]  Chance [ 15.38,  44.84]  Overlap: TRUE
#>   Specificity (%)    Model [ 73.17,  88.51]  Chance [ 64.70,  82.29]  Overlap: TRUE
#> 
#> Novometric significance (ESS CI non-overlap): FALSE