
Returns the stored full-tree training confusion matrix for the final selected CTA model in tidy long format (one row per actual x predicted class pair).

The confusion matrix is captured at fit time, at the moment the winning candidate is selected, from the same predictions that were used to score that candidate. For the expanded ENUMERATE phase, predictions use majority-fallback for missing attributes. For the root-only stump phase, predictions are path-local: observations whose root attribute is missing are excluded.

This function does not report split-node local confusion. Split-node confusion takes all observations at a node and classifies them by that node's rule alone, so for trees with more than one split it differs from full-tree confusion. The two happen to coincide for stumps, but this function always reports final-tree confusion.
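The difference between split-node and full-tree confusion can be illustrated with a small base R sketch. It does not call the package: the toy data, the two hard-coded rules, and the helper `long_confusion()` are all hypothetical and exist only to show why the two tables differ once a tree has a second split.

```r
# Hypothetical toy data: two attributes and a binary class.
x1 <- c(-1, -1, -1,  1,  1, 1, 1, 1)
x2 <- c( 0,  1,  0, -1, -1, 1, 1, 1)
y  <- c(0L, 0L, 1L, 0L, 0L, 1L, 1L, 0L)

# Root rule alone: every observation is classified by the root split only.
root_pred <- ifelse(x1 <= 0, 0L, 1L)

# Full tree: the right branch of the root is split again on x2.
full_pred <- ifelse(x1 <= 0, 0L, ifelse(x2 <= 0, 0L, 1L))

# Hypothetical helper: tabulate actual x predicted counts in long format,
# sorted by actual then predicted.
long_confusion <- function(actual, predicted) {
  tab <- as.data.frame(table(actual = actual, predicted = predicted),
                       stringsAsFactors = FALSE)
  names(tab)[3] <- "n"
  tab$actual    <- as.integer(tab$actual)
  tab$predicted <- as.integer(tab$predicted)
  tab <- tab[order(tab$actual, tab$predicted), ]
  rownames(tab) <- NULL
  tab
}

root_local <- long_confusion(y, root_pred)  # split-node local confusion at the root
full_tree  <- long_confusion(y, full_pred)  # final-tree confusion
```

In this toy case the root-local counts are 2, 3, 1, 2 and the full-tree counts are 4, 1, 1, 2 (rows ordered as 0/0, 0/1, 1/0, 1/1). Only the second kind of table corresponds to what this function returns.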

Usage

cta_confusion_table(tree)

Arguments

tree

A cta_tree from oda_cta_fit.

Value

A data.frame with columns:

actual

Integer actual class label.

predicted

Integer predicted class label.

n

Integer raw count of observations with this actual x predicted combination in the final selected tree.

Rows are sorted by actual then predicted. For a no-tree fit (or if training_confusion is absent), the returned data frame has zero rows but the correct column structure.

Examples

data(mtcars)
X    <- mtcars[, c("cyl", "disp", "hp", "wt")]
y    <- as.integer(mtcars$am)
tree <- oda_cta_fit(X, y, mindenom = 5L, mc_iter = 500L, mc_seed = 42L)
cta_confusion_table(tree)
#>   actual predicted  n
#> 1      0         0 17
#> 2      0         1  2
#> 3      1         0  1
#> 4      1         1 12
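The long-format result can be reshaped into a square matrix for summary statistics. The following is a minimal base R sketch, not part of the package: `summarize_confusion()` is a hypothetical helper, and `conf` is typed in by hand to match the output shown above so that the block runs without a fitted tree.

```r
# Long-format table copied from the example output above.
conf <- data.frame(
  actual    = c(0L, 0L, 1L, 1L),
  predicted = c(0L, 1L, 0L, 1L),
  n         = c(17L, 2L, 1L, 12L)
)

# Hypothetical helper: build a square actual x predicted matrix and
# derive overall accuracy and per-class sensitivity from it.
summarize_confusion <- function(conf) {
  # A no-tree fit yields zero rows; there is nothing to summarize.
  if (nrow(conf) == 0L) return(NULL)

  # Use one shared class set so the matrix stays square even if some
  # class never appears in one of the two columns.
  classes <- sort(unique(c(conf$actual, conf$predicted)))
  labels  <- as.character(classes)
  mat <- matrix(0L, nrow = length(classes), ncol = length(classes),
                dimnames = list(actual = labels, predicted = labels))
  mat[cbind(match(conf$actual, classes),
            match(conf$predicted, classes))] <- conf$n

  list(
    matrix      = mat,
    accuracy    = sum(diag(mat)) / sum(mat),   # correct / total
    sensitivity = diag(mat) / rowSums(mat)     # per actual class
  )
}

res <- summarize_confusion(conf)
res$accuracy
#> [1] 0.90625
```

With a fitted tree, `cta_confusion_table(tree)` could be passed in place of the hand-built `conf`. The zero-row guard matters because a no-tree fit returns an empty data frame, for which the ratios would be undefined.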