nicetoolbox.evaluation.metrics.evaluate

Evaluation metrics runner and result processing.

Classes

EvalResults

Container for evaluation results.

MetricRunner

Drives all metrics: initialize, process samples, evaluate, collect results.

class nicetoolbox.evaluation.metrics.evaluate.EvalResults(file_groups: typing.List[nicetoolbox.evaluation.metrics.results_schema.ResultFileGroup] = <factory>, summaries: typing.List[nicetoolbox.evaluation.metrics.results_schema.AggregatedResult] = <factory>)

Container for evaluation results.

Stores file groups, which hold frame-level metrics, and summaries, which carry aggregated metrics. Provides a save method to export the results to disk.

Structure of saved file groups:

NPZ file path - <experiment_folder>/<dataset_name>__<session>__<sequence>/<component>/<algorithm>__<metric_type>.npz

NPZ entries:

  • data_description.npy: {"data_description": {metric_name: description}}, where each description is a dictionary with {"axis0": ["person"], "axis1": ["camera"], "axis2": ["frames"], "axis3": ["metric_dim"]}

  • <metric_name>.npy: ndarray of metric results, shape: [#person x #camera x #frames x #metric_out_dim]

Structure of saved summaries:

CSV file path - <experiment_folder>/<dataset_name>_summary.csv

CSV entries - metric_type, metric, component, algorithm, value
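As a rough illustration of the layout described above, the following sketch builds the expected NPZ path for one file group and parses a summary row with the documented columns. The helper `npz_path` and every example value (folder, dataset, component, algorithm, and metric names) are hypothetical and not part of nicetoolbox. The sketch also assumes the summary CSV starts with a header row, which the documentation does not state.

```python
import csv
import io
from pathlib import PurePosixPath


def npz_path(experiment_folder, dataset_name, session, sequence,
             component, algorithm, metric_type):
    """Build the documented NPZ path (hypothetical helper, not a nicetoolbox API)."""
    sequence_dir = f"{dataset_name}__{session}__{sequence}"
    file_name = f"{algorithm}__{metric_type}.npz"
    return PurePosixPath(experiment_folder) / sequence_dir / component / file_name


# All values below are made up for illustration.
path = npz_path("experiments/run1", "mydataset", "S1", "seq01",
                "my_component", "my_algorithm", "my_metric_type")
print(path)

# Summary rows use the documented columns; the header row is an assumption.
summary_text = (
    "metric_type,metric,component,algorithm,value\n"
    "my_metric_type,my_metric,my_component,my_algorithm,0.12\n"
)
rows = list(csv.DictReader(io.StringIO(summary_text)))
print(rows[0]["metric"], float(rows[0]["value"]))
```

Note that the path components are joined with a double underscore, so dataset, session, and sequence names that themselves contain `__` would make the folder name ambiguous to split again.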

save(io_manager: IO) → None

Saves all evaluation results to disk for the given dataset.

Parameters:

io_manager (IO) – IO manager for file operations.

class nicetoolbox.evaluation.metrics.evaluate.MetricRunner(loader, eval_cfg: dict)

Drives all metrics: initialize, process samples, evaluate, collect results.

Initializes the metric runner with a data loader and evaluation configuration. Calls the MetricFactory to create all metric handlers.

Parameters:
  • loader – DataLoader that yields batches of data.

  • eval_cfg (dict) – Configuration dictionary for evaluation, including device and metric settings.

evaluate() → EvalResults

Runs the full evaluation process: dispatches batches to metric handlers for processing, computes final metric results, and formats results.

Returns:

The final structured evaluation results.

Return type:

EvalResults
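The flow described for evaluate() (dispatch batches to metric handlers, then compute the final results) resembles a common accumulate-then-finalize pattern. The sketch below illustrates that pattern with stand-in code only: `MeanErrorHandler` and `run_evaluation` are hypothetical names and do not reflect the actual nicetoolbox handler interface, the MetricFactory, or the EvalResults structure.

```python
class MeanErrorHandler:
    """Hypothetical metric handler that accumulates absolute errors per batch."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def process(self, batch):
        # Each batch is a list of (prediction, ground_truth) pairs.
        for pred, gt in batch:
            self.total += abs(pred - gt)
            self.count += 1

    def evaluate(self):
        # Final aggregated value; None if no samples were seen.
        return self.total / self.count if self.count else None


def run_evaluation(loader, handlers):
    """Dispatch every batch to every handler, then collect the final results."""
    for batch in loader:
        for handler in handlers.values():
            handler.process(batch)
    return {name: handler.evaluate() for name, handler in handlers.items()}


# A plain list of batches stands in for the data loader.
loader = [[(1.0, 1.5), (2.0, 2.0)], [(3.0, 2.0)]]
results = run_evaluation(loader, {"mean_abs_error": MeanErrorHandler()})
print(results)
```

Keeping per-batch accumulation separate from the final computation means the full dataset never has to be held in memory at once, which is one plausible reason for a runner to be structured this way.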