nicetoolbox.evaluation.metrics.categorical.confusion_matrix.ConfusionMatrixMetric

class nicetoolbox.evaluation.metrics.categorical.confusion_matrix.ConfusionMatrixMetric(metrics_config: BaseMetricConfig, config_handler: ConfigHandler)

Bases: BaseMetric

Binary confusion matrix with precision, recall, F1 per pre-compute group.

Pools all pred/gt boolean values within each group, then runs sklearn’s confusion_matrix once per group. This avoids Simpson’s paradox from averaging per-pair F1s across groups with different support.
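The difference between pooling and averaging can be illustrated with a small sketch. This is not the class's actual implementation: the helper names (`binary_counts`, `f1_from_counts`), the two example pairs, and the convention of returning 0.0 when F1 is undefined are all assumptions made for illustration. Only `sklearn.metrics.confusion_matrix` is a real library call.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def binary_counts(gt, pred):
    """Return (tp, fp, fn, tn) for two boolean arrays."""
    # labels=[False, True] forces a 2x2 matrix even if a class is absent.
    tn, fp, fn, tp = confusion_matrix(gt, pred, labels=[False, True]).ravel()
    return int(tp), int(fp), int(fn), int(tn)


def f1_from_counts(tp, fp, fn):
    """F1 from counts; 0.0 when undefined (an assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Two hypothetical pairs in the same group, with different support.
pairs = {
    "pair_a": (
        np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool),  # gt
        np.array([1, 1, 1, 0, 0, 0, 0, 1], dtype=bool),  # pred
    ),
    "pair_b": (
        np.array([1, 0], dtype=bool),  # gt
        np.array([0, 0], dtype=bool),  # pred
    ),
}

# Averaging per-pair F1s weights the 2-sample pair as heavily as the 8-sample one.
per_pair_f1 = []
for gt, pred in pairs.values():
    tp, fp, fn, _ = binary_counts(gt, pred)
    per_pair_f1.append(f1_from_counts(tp, fp, fn))
mean_f1 = float(np.mean(per_pair_f1))

# Pooling concatenates every sample first, then computes one confusion matrix.
gt_all = np.concatenate([gt for gt, _ in pairs.values()])
pred_all = np.concatenate([pred for _, pred in pairs.values()])
tp, fp, fn, tn = binary_counts(gt_all, pred_all)
pooled_f1 = f1_from_counts(tp, fp, fn)

print(round(mean_f1, 3))    # 0.375
print(round(pooled_f1, 3))  # 0.667
```

In this toy group the averaged score is dominated by a pair that contributes only two samples, while the pooled score reflects all ten samples equally.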

Input arrays must have bool dtype (or be castable to bool). Confidence floats should use a separate metric: routing floats here silently corrupts results.
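To see why floats are a problem, note that NumPy's bool cast maps every nonzero value to True, so a confidence of 0.1 becomes a positive prediction. The guard below is a hypothetical helper (`require_bool` is not part of nicetoolbox) showing one way a caller might validate inputs before they reach the metric. Accepting 0/1 integers while rejecting floats is an assumption of this sketch.

```python
import numpy as np


def require_bool(arr, name="array"):
    """Return arr as a bool array, rejecting inputs whose cast would lose information.

    Hypothetical helper for illustration only.
    """
    arr = np.asarray(arr)
    if arr.dtype == np.bool_:
        return arr
    # 0/1 integers cast to bool without changing meaning.
    if np.issubdtype(arr.dtype, np.integer) and np.isin(arr, (0, 1)).all():
        return arr.astype(bool)
    raise TypeError(
        f"{name} has dtype {arr.dtype}; expected bool. "
        "Threshold confidence scores explicitly before computing the metric."
    )


confidences = np.array([0.1, 0.9, 0.0])

# A plain cast treats the low-confidence 0.1 as a positive.
naive = confidences.astype(bool)
print(naive.tolist())  # [True, True, False]

# An explicit threshold (0.5 is an arbitrary example value) keeps the intent visible.
thresholded = require_bool(confidences >= 0.5, name="pred")
print(thresholded.tolist())  # [False, True, False]
```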

Parameters:
  • metrics_config – Metric-specific configuration.

  • config_handler – Shared config handler with project and evaluation configs.

Methods

compute

Execute the metric end-to-end: load data, compute, return results.

Attributes

metric_config

metric_name

config_handler

compute() → MetricResult

Execute the metric end-to-end: load data, compute, return results.

Returns:

MetricResult containing summary tables, per-frame arrays, and plots.