
Fits a single CTA model via cta_fit, with group as the class variable and all columns of X as candidate predictors. Returns a structured summary of the CTA balance result.

Usage

cta_balance_table(
  group,
  X,
  w = NULL,
  mindenom = 1L,
  alpha = 0.05,
  loo = "off",
  mc_iter = 5000L,
  mc_seed = NULL,
  ...
)

Arguments

group

Integer (or coercible) binary group indicator. Must have exactly two distinct non-missing values.

X

Data frame of baseline covariate columns.

w

Optional numeric case-weight vector. When supplied, the CTA model is fitted with case weights and has_weights is TRUE in the result.

mindenom

Integer minimum endpoint denominator passed to cta_fit. Default 1L.

alpha

Numeric significance threshold stored in the result and used in the no_tree_message of cta_balance_plot_data. Default 0.05. Does not override alpha_split; pass alpha_split via ... to change the CTA node-level threshold.

loo

LOO gate mode passed to cta_fit. Default "off".

mc_iter

Integer MC iterations per CTA node. Default 5000L.

mc_seed

Integer RNG seed; NULL (the default) leaves the run unseeded.

...

Additional arguments forwarded to cta_fit (e.g., alpha_split, prune_alpha, priors_on).
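
As a rough illustration of the difference between alpha and alpha_split described above, the following sketch passes both in one call. The data are simulated, the weights are hypothetical, and the values 0.05 and 0.01 are arbitrary. The call is guarded so the snippet does nothing when the package providing cta_balance_table is not attached.

```r
# Simulated inputs (illustrative only, not from the package).
set.seed(1)
n <- 60L
X <- data.frame(
  age = rnorm(n, mean = 50, sd = 10),
  sex = rbinom(n, size = 1, prob = 0.5)
)
group <- rep(c(0L, 1L), each = n / 2L)
w <- runif(n, min = 0.5, max = 2)  # hypothetical case weights

# Guard: only run when cta_balance_table is available in the session.
if (exists("cta_balance_table", mode = "function")) {
  bt <- cta_balance_table(
    group, X,
    w = w,               # per the argument docs, has_weights is TRUE in the result
    alpha = 0.05,        # stored for reporting; does not override alpha_split
    alpha_split = 0.01,  # forwarded through ... to cta_fit
    mc_iter = 200L,
    mc_seed = 1L
  )
  bt$has_weights
  bt$alpha
}
```

Keeping alpha and alpha_split separate in the call makes it explicit which threshold governs node-level splitting and which is only carried along for downstream reporting.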

Value

A list of class "cta_balance_table" with fields:

status

Character: "valid_tree", "stump", "no_tree", or "fit_error".

balance_interpretation

Character: "discriminating" or "no_discriminating_combinations" (when no_tree); NA on fit error.

root_attribute

Character; root split variable name; NA when no_tree.

n_endpoints

Integer; number of terminal endpoints; NA when no_tree.

overall_ess

Numeric; full-tree ESS (%) when weights are not active; NA otherwise.

overall_wess

Numeric; full-tree WESS (%) when weights are active; NA otherwise.

ess_display

Numeric; the operative measure (overall_wess when weights are active, otherwise overall_ess); NA for no_tree.

d_stat

Numeric; parsimony-adjusted D statistic; NA for no_tree.

mindenom

Integer; minimum endpoint denominator (MINDENOM) used.

alpha

Numeric; significance threshold stored for downstream use.

has_weights

Logical; whether case weights were active.

tree

The raw cta_tree object; NULL on fit error.

endpoint_table

Data frame from cta_endpoint_table; zero-row for no_tree.

node_table

Data frame from cta_node_table.

fit_error

Logical; TRUE when cta_fit threw.

fit_reason

Character; error message when fit_error; NA otherwise.
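
The relationship among has_weights, overall_ess, overall_wess, and ess_display stated in the field descriptions can be sketched as below. The helper operative_ess is hypothetical (it is not part of the package), and the two lists are mock stand-ins with made-up numbers rather than real return values.

```r
# Hypothetical helper (not part of the package): pick the operative
# effect-strength measure from a result-like list, mirroring the
# documented rule for ess_display.
operative_ess <- function(res) {
  if (isTRUE(res$has_weights)) res$overall_wess else res$overall_ess
}

# Mock results with made-up numbers, standing in for real return values.
unweighted <- list(has_weights = FALSE, overall_ess = 62.5, overall_wess = NA_real_)
weighted   <- list(has_weights = TRUE,  overall_ess = NA_real_, overall_wess = 58.0)

operative_ess(unweighted)  # selects overall_ess
operative_ess(weighted)    # selects overall_wess
```

In practice ess_display already carries this value, so a helper like this is only useful as a consistency check on a returned object.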

Details

A status = "no_tree" result means no combination of baseline covariates in X predicted group membership at the declared significance level, LOO constraint, and minimum endpoint denominator. This is favorable evidence of multivariable covariate balance under the declared analytic constraints. It must not be interpreted as a model failure; in balance analysis, inability to discriminate groups is the goal.
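
One way to keep a "no_tree" result from being misread as a failure in downstream reports is to translate status into a plain-language statement. The sketch below does this with a hypothetical helper, describe_balance, which is not part of the package. It relies only on the status and fit_reason fields documented above and is exercised here on mock lists.

```r
# Hypothetical helper (not part of the package): map a result's status
# to a plain-language balance statement.
describe_balance <- function(res) {
  switch(
    res$status,
    no_tree   = "No discriminating covariate combination found: consistent with balance.",
    fit_error = paste0("cta_fit failed: ", res$fit_reason),
    # Any other status: defer to the detailed tables rather than guess.
    paste0("Status '", res$status, "': inspect endpoint_table and node_table.")
  )
}

# Mock results standing in for real return values.
describe_balance(list(status = "no_tree"))
describe_balance(list(status = "fit_error", fit_reason = "example error message"))
describe_balance(list(status = "valid_tree"))
```

Only "fit_error" is reported as a failure here; "no_tree" is reported as the favorable outcome, in line with the interpretation above.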

group vs. outcome: group is the binary class variable whose membership the covariates in X are tested against. The scientific outcome of the study is strictly out of scope for this function.

Implementation constraint: this function calls cta_fit once; it does not reimplement ENUMERATE or node-growth logic.

References

Linden A, Yarnold PR (2016). Using machine learning to assess covariate balance in matching studies. Journal of Evaluation in Clinical Practice, 22(6), 861-867.

Examples

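# Toy data: covariate B coincides with group membership,
# so the two groups are deliberately not balanced on B.
X <- data.frame(
  A = c(rep(0L, 20), rep(1L, 20), rep(1L, 20)),
  B = c(rep(0L, 20), rep(0L, 20), rep(1L, 20))
)
group <- c(rep(0L, 40), rep(1L, 20))
# A small mc_iter keeps the example fast; mc_seed makes it reproducible.
ct <- cta_balance_table(group, X, mindenom = 5L,
                         mc_iter = 200L, mc_seed = 42L)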
ct$status
#> [1] "valid_tree"
ct$balance_interpretation
#> [1] "discriminating"