Multiclass ODA (C >= 3) with oda

When the class variable has three or more values, oda_fit() dispatches to the MultiODA engine. The objective, chance benchmark, and output fields all adapt to the multiclass case. This article covers the key differences from binary ODA.

The K-class chance benchmark

For a binary problem, random guessing achieves 50% accuracy per class on average (Mean PAC = 50%, ESS = 0%). For a K-class problem, random guessing achieves 100/K percent per class on average:

$\text{chance benchmark} = 100/K\,\%$

$\text{ESS} = \frac{\text{Mean PAC} - 100/K}{100 - 100/K} \times 100\%$

Here K is the number of classes (written C elsewhere in this article).

For K = 3 (three classes), chance is 33.3%; for K = 4, it is 25%. ESS = 0% still means no better than chance; ESS = 100% still means perfect per-class classification. The scale is the same regardless of K.
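
ESS rescales Mean PAC so that the K-class chance benchmark maps to 0% and perfect per-class classification maps to 100%. A minimal base-R sketch of that rescaling follows; ess_from_mean_pac() is a hypothetical helper written for this illustration, not a function exported by oda.

```r
# Hypothetical helper (not part of oda): ESS from Mean PAC for a K-class problem.
# Both the input and the result are on the percent scale.
ess_from_mean_pac <- function(mean_pac, k) {
  chance <- 100 / k                          # K-class chance benchmark
  (mean_pac - chance) / (100 - chance) * 100
}

ess_from_mean_pac(100 / 3, k = 3)   # Mean PAC at chance for K = 3: 0
ess_from_mean_pac(100,     k = 4)   # perfect classification for K = 4: 100
ess_from_mean_pac(63.2183, k = 4)   # about 50.96
```

The same Mean PAC therefore yields a different ESS for different K, because the distance from chance to 100% changes with K.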

Ordered multiclass: cut values and segment-class assignments

For an ordered attribute with C classes, the MultiODA engine finds C - 1 cut values partitioning the attribute range into C contiguous segments, each assigned to one class. The assignment is monotone when no directional constraint is given: the engine searches all valid ordered segment-class assignments and returns the one maximising Mean PAC.

The rule is stored in $rule$cut_values (length C - 1) and $rule$seg_classes (length C, one class per segment).

iris example

library(oda)

x <- iris$Petal.Length
y <- as.integer(iris$Species)   # 1=setosa, 2=versicolor, 3=virginica

fit3 <- oda_fit(
  x         = x,
  y         = y,
  attr_type = "ordered",
  mcarlo    = FALSE,   # skip Monte Carlo permutation (instant); set mc_iter for a p-value
  loo       = "off"
)
print(fit3)
#> 
#> ODA (multiclass)  attr_type=ordered  priors=TRUE  n=150
#> 
#> Rule: <= 2.45 --> 1   |   (2.45, 4.75] --> 2   |   > 4.75 --> 3
#> 
#>   CLASS     PAC
#>       1  100.0%
#>       2   88.0%
#>       3   98.0%
#> 
#>   Mean PAC: 95.33%   ESS: 93.00%

fit3$rule$cut_values   # two cutpoints separating three segments
#> [1] 2.45 4.75
fit3$rule$seg_classes  # class predicted in each segment
#> [1] 1 2 3

The two cut values divide Petal.Length into three segments: setosa (short petals, <= 2.45), versicolor (medium, above 2.45 and up to 4.75), and virginica (long petals, > 4.75).
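
To make the rule concrete, the sketch below applies the printed cut values and segment classes by hand in base R and recomputes the per-class PAC. It uses no oda functions; the cut values and segment classes are copied from the output above rather than read from a fitted object, and treating each segment as open on the left and closed on the right is an assumption based on the printed rule.

```r
# Base-R sketch: apply the printed ordered rule by hand (no oda functions).
x <- iris$Petal.Length
y <- as.integer(iris$Species)      # 1=setosa, 2=versicolor, 3=virginica

cut_values  <- c(2.45, 4.75)       # copied from fit3$rule$cut_values
seg_classes <- c(1L, 2L, 3L)       # copied from fit3$rule$seg_classes

# left.open = TRUE yields segments (-Inf, 2.45], (2.45, 4.75], (4.75, Inf)
seg  <- findInterval(x, cut_values, left.open = TRUE) + 1L
pred <- seg_classes[seg]           # class assigned to each observation

pac <- tapply(pred == y, y, mean) * 100
pac                                # 100, 88, 98
mean(pac)                          # Mean PAC, about 95.33
```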

Categorical multiclass: level-class mapping

For a nominal attribute the engine searches all valid partitions of attribute categories into classes. The best partition maximises Mean PAC. With direction = "ascending" and k = C categories matching C classes, the engine constrains the search to the identity mapping (category i -> class i).

protein type example

Nishikawa et al. (1983) classified 325 proteins into four types by two independent methods. The convergent-validity hypothesis is that the two classification systems agree - i.e., amino acid type i predicts biological type i. direction = "ascending" imposes the identity mapping.¹

biological_type <- c(
  rep(1L, 98), rep(2L, 13), rep(3L,  6), rep(4L,  7),  # amino_acid = 1
  rep(1L, 16), rep(2L, 50), rep(3L,  4), rep(4L, 19),  # amino_acid = 2
  rep(1L,  5), rep(2L,  2), rep(3L, 23), rep(4L, 14),  # amino_acid = 3
  rep(1L,  3), rep(2L,  8), rep(3L, 12), rep(4L, 45)   # amino_acid = 4
)
amino_acid_type <- c(rep(1L, 124), rep(2L, 89), rep(3L, 44), rep(4L, 68))

fit4 <- oda_fit(
  x         = amino_acid_type,
  y         = biological_type,
  attr_type = "categorical",
  direction = "ascending",    # identity map: AA type i -> biological type i
  mc_iter   = 500L,
  mc_seed   = 42L,
  loo       = "off"
)
print(fit4)
#> 
#> ODA (multiclass)  attr_type=categorical  priors=TRUE  n=325
#> 
#> Rule: 1 --> 1   |   2 --> 2   |   3 --> 3   |   4 --> 4
#> 
#>   CLASS     PAC
#>       1   80.3%
#>       2   68.5%
#>       3   51.1%
#>       4   52.9%
#> 
#>   Mean PAC: 63.22%   ESS: 50.96%  p(MC): < .001

This example demonstrates the directional identity-map API (direction = "ascending"). The nondirectional MegaODA parity run for these data - using nondirectional search to reproduce the gold EXE output - is in the package vignette protein-type-multiclass-oda.
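
Mean PAC is a sum of per-category contributions: assigning category j to class c adds the share of class c that falls in category j. When nothing ties the categories together, the best mapping can therefore be found one category at a time. The sketch below does this in base R for the protein data. It is a simplified illustration, not the engine's search: it ignores whatever validity rules the engine applies to partitions, and the object names tab, prop, and best are introduced here.

```r
# Cross-tab of the protein data: rows = biological type (class),
# columns = amino acid type (attribute category).
tab <- matrix(
  c(98, 16,  5,  3,
    13, 50,  2,  8,
     6,  4, 23, 12,
     7, 19, 14, 45),
  nrow = 4, byrow = TRUE,
  dimnames = list(class = as.character(1:4), category = as.character(1:4))
)

# Share of each class that falls in each category (every row sums to 1)
prop <- tab / rowSums(tab)

# Unconstrained sketch: send each category to the class with the largest share
best <- apply(prop, 2, which.max)
best                               # 1 2 3 4, i.e. the identity mapping

# Mean PAC of that mapping, in percent (about 63.22)
mean_pac <- sum(prop[cbind(best, seq_along(best))]) / nrow(tab) * 100
mean_pac
```

Under this simplified reading, the identity mapping imposed by direction = "ascending" is also the mapping an unconstrained per-category search would pick for these data.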

Interpreting per-class PAC at K > 2

m <- oda_metrics(fit4)
cat("Mean PAC:", round(m$mean_pac, 2), "%  (chance = 25% for K=4)\n")  # mean_pac is already in percent
#> Mean PAC: 63.22 %  (chance = 25% for K=4)
cat("ESS:     ", round(m$ess, 2), "%\n")
#> ESS:      50.96 %
cat("PAC by class:", paste(round(m$pac_by_class, 1), collapse = ", "), "%\n")
#> PAC by class: 80.3, 68.5, 51.1, 52.9 %

All four PAC values must exceed the K-class chance benchmark (25%) for the model to beat chance for every class; here they do, with the lowest (class 3, 51.1%) roughly double the benchmark. A PAC below chance for a class means the model does worse than random prediction for that class - even when Mean PAC and ESS are positive.

Overall accuracy vs. Mean PAC:

# Overall accuracy gives large classes disproportionate influence
overall_acc <- sum(diag(fit4$confusion)) / sum(fit4$confusion) * 100
cat("Overall accuracy:", round(overall_acc, 2), "%\n")
#> Overall accuracy: 66.46 %
cat("Mean PAC:        ", round(m$mean_pac, 2), "%\n")  # already in percent
#> Mean PAC:         63.22 %

For balanced class sizes these two measures agree. For imbalanced classes they diverge, and Mean PAC is the appropriate summary: it treats every class equally and scales ESS consistently regardless of class frequency.
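
The divergence is easiest to see with a deliberately lopsided example. The confusion matrix below is hypothetical (made up for this illustration, not taken from any dataset in this article): one large class is classified perfectly while two small classes are mostly missed.

```r
# Hypothetical 3-class confusion matrix (actual x predicted), illustration only
conf <- matrix(
  c(90, 0, 0,     # class 1: 90 cases, all correct
     4, 1, 0,     # class 2:  5 cases, 1 correct
     4, 0, 1),    # class 3:  5 cases, 1 correct
  nrow = 3, byrow = TRUE
)

overall_acc <- sum(diag(conf)) / sum(conf) * 100               # 92
pac         <- diag(conf) / rowSums(conf) * 100                # 100, 20, 20
mean_pac    <- mean(pac)                                       # about 46.67
ess         <- (mean_pac - 100 / 3) / (100 - 100 / 3) * 100    # 20

c(overall_acc = overall_acc, mean_pac = mean_pac, ess = ess)
```

Overall accuracy looks strong because the large class dominates the count, while Mean PAC and ESS show that two of the three classes are predicted below the 33.3% chance benchmark.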

Confusion matrix: For C >= 3, $confusion is a C x C integer matrix (actual x predicted):

fit4$confusion
#>       predicted
#> actual  1  2  3  4
#>      1 98 16  5  3
#>      2 13 50  2  8
#>      3  6  4 23 12
#>      4  7 19 14 45
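
The orientation of the matrix can be checked by rebuilding it in base R. The sketch below assumes the identity rule fitted above, so the predicted class is simply the amino acid type; predicted and conf are names introduced for this illustration, and the data vectors are repeated from the protein example so the block runs on its own.

```r
# Protein data, repeated from the example above
biological_type <- c(
  rep(1L, 98), rep(2L, 13), rep(3L,  6), rep(4L,  7),  # amino_acid = 1
  rep(1L, 16), rep(2L, 50), rep(3L,  4), rep(4L, 19),  # amino_acid = 2
  rep(1L,  5), rep(2L,  2), rep(3L, 23), rep(4L, 14),  # amino_acid = 3
  rep(1L,  3), rep(2L,  8), rep(3L, 12), rep(4L, 45)   # amino_acid = 4
)
amino_acid_type <- c(rep(1L, 124), rep(2L, 89), rep(3L, 44), rep(4L, 68))

# Under the identity rule, category i is predicted as class i
predicted <- amino_acid_type
conf <- table(actual = biological_type, predicted = predicted)
conf

# Row i holds the cases whose actual class is i, so PAC = diagonal / row total
round(diag(conf) / rowSums(conf) * 100, 1)   # 80.3 68.5 51.1 52.9
```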

Notes on LOO for multiclass

Leave-one-out is supported for ordered multiclass (loo = "on") and for unweighted categorical multiclass. Weighted categorical multiclass LOO is explicitly forbidden - the engine raises an error because refit-per-fold with fixed categorical weights is not well-defined.

fit3_loo <- oda_fit(
  x         = x,
  y         = y,
  attr_type = "ordered",
  mcarlo    = FALSE,
  loo       = "on"
)
s3 <- summary(fit3_loo)
cat("Training ESS:", round(fit3_loo$ess, 2), "%\n")
#> Training ESS: 93 %
cat("LOO ESS:     ", round(s3$loo$ess_loo, 2), "%\n")
#> LOO ESS:      91 %
cat("LOO status:  ", if (isTRUE(s3$loo$allowed)) "stable" else "not allowed", "\n")
#> LOO status:   stable

Summary

Feature	Binary ODA	Multiclass ODA
Class values	2	>= 3
Chance benchmark	50%	100/K%
Ordered rule	1 cutpoint	C-1 cutpoints
Confusion matrix	TN/FP/FN/TP	CxC integer matrix
LOO	Supported	Ordered and unweighted categorical supported; weighted categorical errors
