Converts the data.frame returned by cta_confusion_table() (columns actual, predicted, n) to a 2x2 integer matrix suitable for passing to novo_boot_ci().

Usage

as_confusion_matrix(df)

Arguments

df

A data.frame with integer columns actual, predicted, and n. Must represent a binary (2-class) classification with class labels 0 and 1.

Value

A 2x2 integer matrix with rows = actual class (0/1) and columns = predicted class (0/1), matching the training_confusion convention used throughout oda. Row and column names are "0" and "1".
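The reshaping described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the name sketch_confusion_matrix is hypothetical, and treating absent (actual, predicted) combinations as zero and summing duplicate rows are guesses about how such input might be handled.

```r
# Hypothetical stand-in for illustration only; the real as_confusion_matrix()
# may validate its input or handle edge cases differently.
sketch_confusion_matrix <- function(df) {
  stopifnot(all(c("actual", "predicted", "n") %in% names(df)))
  stopifnot(all(df$actual %in% c(0L, 1L)), all(df$predicted %in% c(0L, 1L)))

  # Start from an all-zero 2x2 integer matrix: rows = actual, columns = predicted.
  m <- matrix(0L, nrow = 2, ncol = 2,
              dimnames = list(c("0", "1"), c("0", "1")))

  # Add each row's count to its cell (assumption: duplicate rows are summed).
  for (i in seq_len(nrow(df))) {
    r_lab <- as.character(df$actual[i])
    c_lab <- as.character(df$predicted[i])
    m[r_lab, c_lab] <- m[r_lab, c_lab] + as.integer(df$n[i])
  }
  m
}

df <- data.frame(
  actual    = c(0L, 0L, 1L, 1L),
  predicted = c(0L, 1L, 0L, 1L),
  n         = c(146L, 40L, 36L, 33L)
)
m <- sketch_confusion_matrix(df)
```

With this layout, m["0", "1"] holds the count of class-0 observations predicted as class 1.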

Examples

# From a raw data frame:
df <- data.frame(
  actual    = c(0L, 0L, 1L, 1L),
  predicted = c(0L, 1L, 0L, 1L),
  n         = c(146L, 40L, 36L, 33L)
)
m <- as_confusion_matrix(df)
novo_boot_ci(m, nboot = 200L, seed = 1L)
#> Novometric fixed-confusion bootstrap
#>   n = 255   k = 128   nboot = 200   sample_frac = 0.50
#> 
#> Confusion matrix (actual x predicted):
#>       predicted
#> actual   0  1
#>      0 146 40
#>      1  36 33
#> 
#> Observed:  ESS = 26.32%   Mean PAC = 63.16%
#> 
#> 95% CI (2.5% -- 97.5%):
#>   ESS (%)            Model [  4.94,  37.50]  Chance [-18.93,  17.03]  Overlap: TRUE
#>   Mean PAC (%)       Model [ 52.47,  68.75]  Chance [ 40.53,  58.52]  Overlap: TRUE
#>   Sensitivity (%)    Model [ 31.24,  62.24]  Chance [ 18.14,  48.58]  Overlap: TRUE
#>   Specificity (%)    Model [ 64.88,  81.84]  Chance [ 56.17,  75.25]  Overlap: TRUE
#> 
#> Novometric significance (ESS CI non-overlap): FALSE

# From a fitted tree:
# \donttest{
fit <- cta_fit(data.frame(x = seq_len(8L)),
               c(0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L),
               mindenom = 2L, mc_iter = 100L, loo = "off")
ct <- cta_confusion_table(fit)
m  <- as_confusion_matrix(ct)
novo_boot_ci(m, nboot = 200L, seed = 42L)
#> Novometric fixed-confusion bootstrap
#>   n = 8   k = 4   nboot = 200   sample_frac = 0.50
#>   [note: input confusion has one or more zero cells]
#> 
#> Confusion matrix (actual x predicted):
#>       predicted
#> actual 0 1
#>      0 4 0
#>      1 0 4
#> 
#> Observed:  ESS = 100.00%   Mean PAC = 100.00%
#> 
#> 95% CI (2.5% -- 97.5%):
#>   ESS (%)            Model [100.00, 100.00]  Chance [-90.00, 100.00]  Overlap: TRUE
#>   Mean PAC (%)       Model [100.00, 100.00]  Chance [  5.00, 100.00]  Overlap: TRUE
#>   Sensitivity (%)    Model [100.00, 100.00]  Chance [  0.00, 100.00]  Overlap: TRUE
#>   Specificity (%)    Model [100.00, 100.00]  Chance [  0.00, 100.00]  Overlap: TRUE
#> 
#> Novometric significance (ESS CI non-overlap): FALSE
# }
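
The "Observed" line in the first example can be checked by hand from the 2x2 matrix. The sketch below assumes the usual two-class definitions: Mean PAC is the average of sensitivity and specificity, and ESS rescales Mean PAC so that 50% (chance) maps to 0 and 100% maps to 100. It rebuilds the matrix directly and does not call any oda function.

```r
# Confusion matrix from the first example: rows = actual, columns = predicted.
# matrix() fills column by column, so the first column is (146, 36).
m <- matrix(c(146L, 36L, 40L, 33L), nrow = 2,
            dimnames = list(c("0", "1"), c("0", "1")))

specificity <- m["0", "0"] / sum(m["0", ])  # class 0 correctly predicted
sensitivity <- m["1", "1"] / sum(m["1", ])  # class 1 correctly predicted

# Assumed two-class definitions, expressed in percent.
mean_pac <- 100 * (sensitivity + specificity) / 2
ess      <- (mean_pac - 50) / 50 * 100

round(c(ESS = ess, MeanPAC = mean_pac), 2)
#>     ESS MeanPAC
#>   26.32   63.16
```

These values agree with the ESS = 26.32% and Mean PAC = 63.16% reported by novo_boot_ci above.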