Low-level engine for binary-class Optimal Data Analysis. Handles ordered, categorical, and binary attributes, with optional prior-odds weighting, a Monte Carlo p-value, and leave-one-out (LOO) validity analysis. Most users should call oda_fit instead.

Usage

oda_univariate_core(x, y, w = NULL,
  attr_type = c("auto","ordered","categorical","binary"),
  priors_on = TRUE, primary = NULL, secondary = NULL,
  miss_codes = NULL, missing_code = NULL,
  loo = c("off","stable","pvalue"), loo_alpha = 0.05,
  mcarlo = TRUE, mc_iter = 25000L, mc_target = 0.05,
  mc_stop = 99.9, mc_stopup = NA_real_, mc_adjust = FALSE,
  mc_seed = NULL, chance_model = c("class","attribute"),
  eval_order = c("mc_then_loo","loo_then_mc"),
  mindenom = 1L,
  direction = c("both","off","greater","less"),
  direction_map = NULL)
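
A minimal sketch of a call built from the signature above. The toy data and the argument values are made up for illustration, and the call is guarded with exists() so the snippet still runs in a session where the function is not attached. The fields read from the result are those listed under Value.

```r
# Toy data (hypothetical): an ordered attribute and a 0/1 class label.
x <- c(1.2, 2.5, 3.1, 4.8, 5.0, 6.3, 7.7, 8.4)
y <- c(0L, 0L, 0L, 1L, 0L, 1L, 1L, 1L)

# Argument names come from the Usage signature; values are illustrative.
call_args <- list(
  x = x, y = y,
  attr_type = "ordered",
  loo = "stable",
  mcarlo = TRUE, mc_iter = 5000L, mc_seed = 42L,
  direction = "greater"
)

# Only call the engine if it is available in the current session.
if (exists("oda_univariate_core", mode = "function")) {
  fit <- do.call(oda_univariate_core, call_args)
  if (isTRUE(fit$ok)) {
    print(fit$rule)  # fitted classification rule
    print(fit$ess)   # effect strength
    print(fit$p_mc)  # Monte Carlo p-value
  }
}
```

Fixing mc_seed makes repeated runs comparable. Checking ok before reading other fields matters because some argument combinations are documented to return ok = FALSE.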

Arguments

x

Attribute values.

y

Binary class labels, coercible to 0/1 integers.

w

Optional numeric case weights.

attr_type

Attribute type. "auto" detects from data.

priors_on

If TRUE, use inverse-frequency weighting.

primary

Primary tie-break heuristic. NULL selects a default that depends on priors_on.

secondary

Secondary tie-break. NULL = "samplerep".

miss_codes

Additional missing-value codes.

missing_code

Scalar alias for miss_codes.

loo

"off", "stable", or "pvalue".

loo_alpha

Alpha threshold for loo = "pvalue".

mcarlo

If TRUE, estimate a Monte Carlo p-value.

mc_iter

Maximum MC iterations.

mc_target

Significance threshold.

mc_stop

Confidence level (percent) for STOP early stopping.

mc_stopup

Confidence level (percent) for STOPUP.

mc_adjust

Legacy parameter; unused.

mc_seed

RNG seed.

chance_model

"class" (1/2) or "attribute" (1/k_attr) baseline.

eval_order

Controls whether Monte Carlo testing is run before LOO validation or whether eligible ordered-cut LOO stability is checked before Monte Carlo. The default "mc_then_loo" preserves standalone UniODA behaviour. CTA tree building uses "loo_then_mc" internally to reject LOO-unstable ordered-cut candidates before spending MC iterations.

mindenom

Minimum raw observation count required in each child node for a candidate cut to be evaluated. Default 1 (no enforcement).

direction

Directional hypothesis (MPE Chapter 2 scope): "both" (default) or its synonym "off" for a nondirectional analysis, "greater" (high x predicts class 1; Chapter 2 DIRECTION < 0 1), or "less" (low x predicts class 1; Chapter 2 DIRECTION > 0 1). Without direction_map, directional hypotheses apply to ordered and binary attributes only: passing "greater" or "less" with a categorical attribute and no direction_map returns ok = FALSE. When direction_map is supplied, categorical DIRECTIONAL is also supported via a fixed mapping.

direction_map

Named integer vector for categorical fixed-partition DIRECTIONAL (MPE Chapter 4). Names are attribute levels (character); values are 0/1 coded class labels. All attribute levels must be covered. When supplied for a categorical attribute, the specified partition is evaluated without searching alternatives; LOO predictions are trivially stable. Default NULL.
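
The constraints on direction_map described above can be checked before the map is passed in. The sketch below uses base R only. The attribute levels and the mapping are hypothetical, and the lookup at the end only illustrates what a fixed partition implies; it is an assumption, not a description of how the engine computes predictions.

```r
# Hypothetical categorical attribute and the fixed mapping to evaluate.
x <- c("red", "blue", "green", "blue", "red", "green")
direction_map <- c(red = 1L, blue = 0L, green = 0L)

# Check the documented constraints: integer values coded 0/1,
# and every observed attribute level covered by a name in the map.
levels_seen <- unique(as.character(x))
missing_levels <- setdiff(levels_seen, names(direction_map))
stopifnot(
  is.integer(direction_map),
  all(direction_map %in% c(0L, 1L)),
  length(missing_levels) == 0L
)

# Class implied for each case by the fixed partition (illustrative lookup).
predicted <- unname(direction_map[as.character(x)])
```

Running the coverage check first gives a clear error at the call site if a level is missing from the map.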

Value

Named list. Key fields: ok, rule, confusion (list with integer counts TP, TN, FP, FN and rate fields sensitivity, specificity as proportions in [0,1]), ess, pac, p_mc, loo, n_eff.
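
As a rough guide to how the confusion fields relate to each other, the sketch below derives sensitivity and specificity from hypothetical counts and then computes an effect strength. The ESS formula is an assumption based on the usual ODA definition (mean per-class accuracy rescaled so that chance is 0 and perfect classification is 100). This page does not state the formula or the scale used for the ess and pac fields.

```r
# Hypothetical confusion counts, using the documented field names.
confusion <- list(TP = 40L, TN = 30L, FP = 10L, FN = 20L)

# Rates as proportions in [0, 1].
sensitivity <- confusion$TP / (confusion$TP + confusion$FN)  # class 1 accuracy
specificity <- confusion$TN / (confusion$TN + confusion$FP)  # class 0 accuracy

# Assumed ESS definition: rescale mean per-class accuracy against chance.
chance <- 1 / 2  # the "class" baseline for two classes
mean_pac <- (sensitivity + specificity) / 2
ess <- 100 * (mean_pac - chance) / (1 - chance)
```

With chance_model = "attribute", the baseline would be 1/k_attr rather than 1/2; how that enters the package's own computation is not covered here.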