nicetoolbox.evaluation.metrics.categorical.confusion_matrix.ConfusionMatrixMetric

class nicetoolbox.evaluation.metrics.categorical.confusion_matrix.ConfusionMatrixMetric(metrics_config: BaseMetricConfig, config_handler: ConfigHandler)

Bases: BaseMetric

Binary confusion matrix with precision, recall, F1 per pre-compute group.

Pools all prediction (pred) and ground-truth (gt) boolean values within each group, then runs sklearn’s confusion_matrix once per group. This avoids the Simpson’s paradox that can arise when per-pair F1 scores with different support are averaged.
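A minimal sketch of why pooling matters, assuming two hypothetical pred/gt pairs of different length inside one group. The helper names (`counts`, `f1_from_counts`) and the data are illustrative only and not part of nicetoolbox; the counts are tallied by hand to keep the sketch dependency-free, whereas the class is documented to use sklearn’s confusion_matrix.

```python
def counts(pred, gt):
    """Return (tp, fp, fn) for two equal-length sequences of booleans."""
    tp = sum(p and g for p, g in zip(pred, gt))
    fp = sum(p and not g for p, g in zip(pred, gt))
    fn = sum(g and not p for p, g in zip(pred, gt))
    return tp, fp, fn


def f1_from_counts(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when the score is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical (pred, gt) pairs in one group: 4 samples vs. 10 samples.
pairs = [
    ([True, True, False, False], [True, False, False, False]),
    ([True] * 8 + [False] * 2, [True] * 10),
]

# Averaging per-pair F1 weights the short pair as heavily as the long one.
mean_pair_f1 = sum(f1_from_counts(*counts(p, g)) for p, g in pairs) / len(pairs)

# Pooling concatenates all samples first, then scores once.
pooled_pred = [v for p, _ in pairs for v in p]
pooled_gt = [v for _, g in pairs for v in g]
pooled_f1 = f1_from_counts(*counts(pooled_pred, pooled_gt))

print(f"mean of per-pair F1: {mean_pair_f1:.4f}")
print(f"pooled F1:           {pooled_f1:.4f}")
```

With this data the two numbers differ, because the pooled score weights every sample equally instead of weighting every pair equally.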

Input arrays must have bool dtype (or be castable to bool). Confidence floats should use a separate metric: a cast to bool maps every nonzero value to True, so routing floats here silently corrupts results.
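The following sketch illustrates the failure mode with NumPy, under the assumption that float input would be cast with `astype(bool)`. The scores, the 0.5 threshold, and the `require_bool` guard are hypothetical and not part of nicetoolbox.

```python
import numpy as np

# Hypothetical confidence scores from some detector.
confidences = np.array([0.05, 0.40, 0.90, 0.0])

# A plain cast treats every nonzero score as a positive prediction.
naive = confidences.astype(bool)

# An explicit threshold (0.5 is an assumed value) gives the intended labels.
thresholded = confidences >= 0.5


def require_bool(arr):
    """Hypothetical guard: reject non-bool input instead of casting it."""
    arr = np.asarray(arr)
    if arr.dtype != np.bool_:
        raise TypeError(f"expected bool dtype, got {arr.dtype}")
    return arr


print(naive.tolist())        # low-confidence scores 0.05 and 0.40 count as True
print(thresholded.tolist())  # only 0.90 counts as True
```

Thresholding before the data reaches the metric, or failing loudly on non-bool input, keeps the confusion matrix tied to actual decisions rather than to whether a score happens to be nonzero.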

Parameters:

metrics_config – Metric-specific configuration.
config_handler – Shared config handler with project and evaluation configs.

Methods

compute

Execute the metric end-to-end: load data, compute, return results.

Attributes

`metric_config`
`metric_name`
`config_handler`

compute() → MetricResult

Execute the metric end-to-end: load data, compute, return results.

Returns: MetricResult containing summary tables, per-frame arrays, and plots.