nicetoolbox.evaluation.results_wrapper.core

A Pandas-backed API for querying, aggregating, and exporting evaluation results.

Classes

EvaluationResults


class nicetoolbox.evaluation.results_wrapper.core.EvaluationResults(root: Path)[source]

A Pandas-backed API for querying, aggregating, and exporting evaluation results.

This class wraps a Pandas DataFrame constructed from a filesystem index of npz files and provides a small API for:

  • filtering rows by index levels, by metadata (e.g. dataset, algorithm, …) and npz dimensions (person, camera, …)

  • filtering columns by metrics

  • performing aggregations with selected aggregation methods

  • exporting the current view to CSV

  • returning the current view as a pandas DataFrame with optional flattening and NaN dropping

The object is stateful: mutating operations (query, aggregate, reset) modify the instance in-place and return self to allow method chaining. The original unmodified DataFrame built at initialization is retained in self._full_df and can be restored via reset().
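The in-place, return-self behaviour described above can be sketched with a small stand-in class. This is only an illustration of the pattern: StatefulView, select_metrics, and the sample data are hypothetical names invented here, and the actual EvaluationResults implementation may differ.

```python
from typing import List

import pandas as pd


class StatefulView:
    """Hypothetical stand-in for the pattern; not the EvaluationResults code."""

    def __init__(self, df: pd.DataFrame) -> None:
        self._full_df = df    # original data, never modified
        self._df = df.copy()  # current view, replaced by each operation

    def select_metrics(self, metrics: List[str]) -> "StatefulView":
        # Narrow the current view to the requested columns, in place.
        self._df = self._df[metrics]
        return self  # returning self is what enables method chaining

    def reset(self) -> "StatefulView":
        # Discard the current view and start again from the original data.
        self._df = self._full_df.copy()
        return self


view = StatefulView(pd.DataFrame({"jpe": [1.0, 2.0], "pck": [0.9, 0.8]}))
narrowed_columns = list(view.select_metrics(["jpe"])._df.columns)
restored_columns = list(view.reset()._df.columns)
```

Because every operation replaces the current view, a chain that should start from the full data needs a reset() first.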

Examples

>>> from pathlib import Path
>>> from nicetoolbox.evaluation.results_wrapper.core import EvaluationResults
>>> results = EvaluationResults(root=Path("/path/to/evaluation/results"))
>>> # Query for specific dataset and metric, aggregate by camera and export
>>> csv_path = (
...     results.query(dataset="my_dataset", metrics=["jpe", "pck"])
...     .aggregate(group_by=["camera"])
...     .to_csv(output_dir=Path("/path/to/output"), file_name="camera_summary")
... )
>>> print(f"Exported summary to: {csv_path}")

Builds the pandas DataFrame index from the provided root folder.

Parameters:

root – The root folder containing evaluation results.

aggregate(group_by: Iterable[str], agg_funcs: str | List[str] = 'mean') EvaluationResults[source]

Performs a flexible aggregation using pandas groupby. Any index levels not included in group_by are automatically aggregated.

Parameters:
  • group_by (Iterable[str]) – An iterable of index level names to group by.

  • agg_funcs (Union[str, List[str]], optional) – Aggregation function(s) to apply. Defaults to “mean”. Can be any valid pandas aggregation function name or a list of such names.

Returns:

Returns self after applying the aggregation to allow for method chaining. Note that applying multiple aggregation functions results in a MultiIndex on the columns.

Return type:

EvaluationResults
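The column layout mentioned above follows from how pandas itself behaves. The sketch below uses plain pandas on invented data (the index level names and values are assumptions for illustration) to show that grouping by a subset of index levels collapses the remaining ones, and that passing a list of functions produces MultiIndex columns.

```python
import pandas as pd

# Invented sample data: one metric column, three index levels.
index = pd.MultiIndex.from_tuples(
    [
        ("ds_a", "alg1", "cam1"),
        ("ds_a", "alg1", "cam2"),
        ("ds_a", "alg2", "cam1"),
        ("ds_a", "alg2", "cam2"),
    ],
    names=["dataset", "algorithm", "camera"],
)
df = pd.DataFrame({"jpe": [1.0, 3.0, 2.0, 6.0]}, index=index)

# 'camera' is not in the grouping, so it is aggregated away.
single = df.groupby(level=["dataset", "algorithm"]).agg("mean")

# A list of functions yields one (metric, function) column per combination.
multi = df.groupby(level=["dataset", "algorithm"]).agg(["mean", "std"])
```

With a single function name the columns stay flat, so downstream code that selects a metric by name may need to handle the two cases differently.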

Examples

>>> results = EvaluationResults(root=Path("/path/to/evaluation/results"))
>>> # Group the results by 'dataset' and 'algorithm', computing the mean and
>>> # standard deviation of each metric for each dataset-algorithm pair.
>>> results.aggregate(
...     group_by=["dataset", "algorithm"], agg_funcs=["mean", "std"]
... )
>>> # Group the results by 'dataset', 'sequence', 'algorithm', and 'camera',
>>> # computing the mean of each metric for each unique combination of these
>>> # dimensions, i.e. a breakdown per camera within each sequence.
>>> results.reset().aggregate(
...     group_by=["dataset", "sequence", "algorithm", "camera"],
...     agg_funcs="mean",
... )

property available_metrics: List[str]

Returns a list of available metric names in the current DataFrame.

Returns:

A list of metric names.

Return type:

List[str]

query(**filters: Dict[str, str | List[str]]) EvaluationResults[source]

Filters NICE Toolbox evaluation results by index levels and metrics.

Parameters:

**filters (Dict[str, str | List[str]]) – Keyword arguments mapping DataFrame index levels to selection values. Each keyword must match one of the DataFrame index level names. The provided value may be:

  • a single value (e.g., person='p1'),

  • an iterable of values (e.g., dataset=['dataset_A', 'dataset_B']).

Returns:

Returns self after applying the requested row/column indexing to allow for method chaining.

Return type:

EvaluationResults
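Filtering by index level with either a single value or an iterable of values can be sketched in plain pandas as follows. The helper filter_levels and the sample data are hypothetical and only illustrate the idea; they are not the library's implementation.

```python
import numpy as np
import pandas as pd


def filter_levels(df: pd.DataFrame, **filters) -> pd.DataFrame:
    """Hypothetical helper: keep rows whose index levels match all filters."""
    mask = np.ones(len(df), dtype=bool)
    for level, wanted in filters.items():
        # Treat a single string as a one-element selection.
        values = [wanted] if isinstance(wanted, str) else list(wanted)
        mask &= df.index.get_level_values(level).isin(values)
    return df.loc[mask]


# Invented sample data with two index levels.
index = pd.MultiIndex.from_tuples(
    [
        ("my_dataset", "alg1"),
        ("my_dataset", "alg2"),
        ("my_dataset", "alg3"),
        ("other_dataset", "alg1"),
    ],
    names=["dataset", "algorithm"],
)
df = pd.DataFrame({"jpe": [1.0, 2.0, 3.0, 4.0]}, index=index)

subset = filter_levels(df, dataset="my_dataset", algorithm=["alg1", "alg2"])
```

In this sketch, filters on different levels are combined with a logical AND, while the values given for one level act as alternatives.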

Examples

>>> results = EvaluationResults(root=Path("/path/to/evaluation/results"))
>>> # Keep only rows where 'dataset' is 'my_dataset' and 'algorithm' is
>>> # either 'alg1' or 'alg2'.
>>> results.query(dataset="my_dataset", algorithm=["alg1", "alg2"])
>>> # Keep only rows where 'metric_name' is 'jpe' and 'label' is either
>>> # 'left_knee' or 'right_knee'.
>>> results.query(metric_name="jpe", label=["left_knee", "right_knee"])

reset() EvaluationResults[source]

Restore the view to the originally loaded evaluation results.

Returns:

Returns self after resetting to allow method chaining.

Return type:

EvaluationResults

to_csv(output_dir: Path, file_name: str = 'summary') Path[source]

Exports the current state to a CSV with a meaningful name.

Parameters:
  • output_dir (Path) – The output directory to save the CSV file.

  • file_name (str, optional) – The base name for the CSV file. Defaults to “summary”.

Returns:

The path to the exported CSV file.

Return type:

Path
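As a rough illustration of an export that returns the written path, the sketch below uses plain pandas. The helper export_view is hypothetical, and building the file name as file_name plus ".csv" is an assumption made here; the actual method may compose the name differently.

```python
import tempfile
from pathlib import Path

import pandas as pd


def export_view(df: pd.DataFrame, output_dir: Path, file_name: str = "summary") -> Path:
    # Hypothetical helper: the real method may build the file name differently.
    output_dir.mkdir(parents=True, exist_ok=True)
    csv_path = output_dir / f"{file_name}.csv"
    df.to_csv(csv_path)  # index levels are written as leading columns
    return csv_path


# Invented sample data with two index levels.
index = pd.MultiIndex.from_tuples(
    [("ds_a", "cam1"), ("ds_a", "cam2")], names=["dataset", "camera"]
)
view = pd.DataFrame({"jpe": [1.0, 2.0]}, index=index)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = export_view(view, Path(tmp) / "out", file_name="camera_summary")
    written_name = csv_path.name
    # Reading the file back with the same index columns restores the view.
    round_trip = pd.read_csv(csv_path, index_col=["dataset", "camera"])
```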

to_dataframe(flatten: bool = False, dropna: bool = False) DataFrame[source]

Returns current view as DataFrame.

Parameters:
  • flatten (bool, optional) – If True, resets the index to a flat structure. Defaults to False.

  • dropna (bool, optional) – If True, drops rows in which all values are NaN. Defaults to False.

Returns:

A copy of the current view as a pandas DataFrame.
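In plain pandas terms, the two options plausibly correspond to reset_index() and dropna(how="all"). The sketch below shows those two calls on invented data; whether the method uses exactly these calls, or also flattens MultiIndex columns, is an assumption not confirmed by this page.

```python
import numpy as np
import pandas as pd

# Invented sample view: the second row contains only NaN values.
index = pd.MultiIndex.from_tuples(
    [("ds_a", "cam1"), ("ds_a", "cam2"), ("ds_b", "cam1")],
    names=["dataset", "camera"],
)
view = pd.DataFrame(
    {"jpe": [1.0, np.nan, np.nan], "pck": [0.9, np.nan, 0.5]}, index=index
)

# flatten=True analogue: index levels become ordinary columns.
flat = view.reset_index()

# dropna=True analogue: only rows where every value is NaN are removed.
dense = view.dropna(how="all")
```

Rows that are only partially NaN, such as the third row here, are kept by dropna(how="all").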