nicetoolbox.evaluation.results_wrapper.core

A Pandas-backed API for querying, aggregating, and exporting evaluation results.

Classes

EvaluationResults

class nicetoolbox.evaluation.results_wrapper.core.EvaluationResults(root: Path)

A Pandas-backed API for querying, aggregating, and exporting evaluation results.

This class wraps a Pandas DataFrame constructed from a filesystem index of npz files and provides a small API for:

- filtering pandas rows by index levels, by metadata (e.g. dataset, algorithm, …) and npz dimensions (person, camera, …)
- filtering pandas columns by metrics
- performing aggregations with selected aggregation methods
- exporting the current view to CSV
- returning the current view as a pandas DataFrame with optional flattening and NaN dropping

The object is stateful: mutating operations (query, aggregate, reset) modify the instance in-place and return self to allow method chaining. The original unmodified DataFrame built at initialization is retained in self._full_df and can be restored via reset().
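The stateful, chainable behaviour described above can be illustrated with a minimal sketch. `ChainableView` and its `keep_rows` method are hypothetical stand-ins written for illustration only, not the toolbox's implementation; only the `_full_df` attribute name and the `reset()` semantics are taken from the description, and the data is made up.

```python
import pandas as pd


class ChainableView:
    """Hypothetical stand-in illustrating the in-place, return-self pattern."""

    def __init__(self, df: pd.DataFrame) -> None:
        self._full_df = df      # untouched original, kept for reset()
        self._df = df.copy()    # current view, replaced by each operation

    def keep_rows(self, level: str, value: str) -> "ChainableView":
        # Mutating step: narrow the current view, then return self for chaining.
        mask = self._df.index.get_level_values(level) == value
        self._df = self._df[mask]
        return self

    def reset(self) -> "ChainableView":
        # Restore the view from the retained original.
        self._df = self._full_df.copy()
        return self


index = pd.MultiIndex.from_tuples(
    [("ds_a", "cam1"), ("ds_a", "cam2"), ("ds_b", "cam1")],
    names=["dataset", "camera"],
)
view = ChainableView(pd.DataFrame({"jpe": [1.0, 2.0, 3.0]}, index=index))

# The filter changes the instance itself; reset() brings back all rows.
filtered_rows = len(view.keep_rows("dataset", "ds_a")._df)
restored_rows = len(view.reset()._df)
```

Because every mutating call returns the same instance, a chain that does not start with `reset()` operates on whatever view earlier calls left behind.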

Examples

>>> from pathlib import Path
>>> from nicetoolbox.evaluation.results_wrapper.core import EvaluationResults
>>> results = EvaluationResults(root=Path("/path/to/evaluation/results"))
>>> # Query for specific dataset and metric, aggregate by camera and export
>>> csv_path = (
...     results.query(dataset="my_dataset", metrics=["jpe", "pck"])
...     .aggregate(group_by=["camera"])
...     .to_csv(output_dir=Path("/path/to/output"), file_name="camera_summary")
... )
>>> print(f"Exported summary to: {csv_path}")

Builds the pandas DataFrame index from the provided root folder.

Parameters: root (Path) – The root folder containing evaluation results.

aggregate(group_by: Iterable[str], agg_funcs: str | List[str] = 'mean') → EvaluationResults

Performs a flexible aggregation using pandas groupby. Any index levels not included in group_by are collapsed by the aggregation.

Parameters:

group_by (Iterable[str]) – An iterable of index level names to group by.
agg_funcs (Union[str, List[str]], optional) – Aggregation function(s) to apply. Defaults to “mean”. Can be any valid pandas aggregation function name or a list of such names.

Returns:

Returns self after applying the aggregation to allow for method chaining. Note that applying multiple aggregation functions results in a MultiIndex on the columns.

Return type:

EvaluationResults

Examples

>>> results = EvaluationResults(root=Path("/path/to/evaluation/results"))
>>> results.aggregate(
...     group_by=["dataset", "algorithm"], agg_funcs=["mean", "std"]
... )
# This groups the results by 'dataset' and 'algorithm', computing the mean
# and standard deviation of each metric for each dataset-algorithm pair.

>>> results.reset().aggregate(
...     group_by=["dataset", "sequence", "algorithm", "camera"],
...     agg_funcs="mean",
... )
# This groups the results by 'dataset', 'sequence', 'algorithm', and
# 'camera', computing the mean of each metric for each unique combination
# of these dimensions, i.e. a breakdown per camera within each sequence.
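
The two behaviours described for aggregate (unlisted index levels being collapsed, and several aggregation functions producing MultiIndex columns) can be reproduced in plain pandas. The sketch below uses made-up data and does not call `EvaluationResults`; that the class does something equivalent internally is an assumption.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [
        ("ds_a", "alg1", "cam1"),
        ("ds_a", "alg1", "cam2"),
        ("ds_a", "alg2", "cam1"),
        ("ds_a", "alg2", "cam2"),
    ],
    names=["dataset", "algorithm", "camera"],
)
df = pd.DataFrame({"jpe": [1.0, 3.0, 2.0, 6.0]}, index=index)

# Group by two of the three index levels; 'camera' is aggregated away.
single = df.groupby(level=["dataset", "algorithm"]).agg("mean")

# A list of functions produces one column per (metric, function) pair.
multi = df.groupby(level=["dataset", "algorithm"]).agg(["mean", "std"])
```

Here `single` keeps a flat `jpe` column, while `multi` has the column tuples `("jpe", "mean")` and `("jpe", "std")`, which matters when selecting columns after the aggregation.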

property available_metrics: List[str]

Returns a list of available metric names in the current DataFrame.

Returns: A list of metric names.
Return type: List[str]

query(**filters: Dict[str, str | List[str]]) → EvaluationResults

Filters NICE Toolbox evaluation results by index levels and metrics.

Parameters:

**filters (Dict[str, str | List[str]]) – Keyword arguments mapping DataFrame index levels to selection values. Each keyword must match one of the DataFrame index level names. The provided value may be:

- a single value (e.g., person='p1'),
- an iterable of values (e.g., dataset=['dataset_A', 'dataset_B'])

Returns:

Returns self after applying the requested row/column indexing to allow for method chaining.

Return type:

EvaluationResults

Examples

>>> results = EvaluationResults(root=Path("/path/to/evaluation/results"))
>>> results.query(dataset="my_dataset", algorithm=["alg1", "alg2"])
# This will filter the results to only include rows where the 'dataset'
# is 'my_dataset' and the 'algorithm' is either 'alg1' or 'alg2'.

>>> results.query(metric_name="jpe", label=["left_knee", "right_knee"])
# This will filter the results to only include rows where the 'metric_name'
# is 'jpe' and the 'label' is either 'left_knee' or 'right_knee'.
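
The filtering semantics described above, where each keyword names an index level and the value is a single item or a list, can be sketched in plain pandas. `filter_levels` is a hypothetical helper written for this illustration, not part of the toolbox, and the data is made up.

```python
from typing import List, Union

import numpy as np
import pandas as pd


def filter_levels(df: pd.DataFrame, **filters: Union[str, List[str]]) -> pd.DataFrame:
    """Keep rows whose index levels match every given filter (hypothetical helper)."""
    mask = np.ones(len(df), dtype=bool)
    for level, wanted in filters.items():
        if level not in df.index.names:
            raise KeyError(f"Unknown index level: {level!r}")
        # Normalise a single value to a one-element list.
        values = [wanted] if isinstance(wanted, str) else list(wanted)
        # Filters are combined with logical AND across levels.
        mask &= df.index.get_level_values(level).isin(values)
    return df[mask]


index = pd.MultiIndex.from_product(
    [["ds_a", "ds_b"], ["alg1", "alg2", "alg3"]],
    names=["dataset", "algorithm"],
)
df = pd.DataFrame({"jpe": range(6)}, index=index)

subset = filter_levels(df, dataset="ds_a", algorithm=["alg1", "alg2"])
```

In this sketch, values within one keyword are alternatives (OR), while separate keywords must all match (AND), mirroring the wording of the first example above.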

reset() → EvaluationResults

Restore the view to the originally loaded evaluation results.

Returns: Returns self after resetting to allow method chaining.
Return type: EvaluationResults

to_csv(output_dir: Path, file_name: str = 'summary') → Path

Exports the current state to a CSV with a meaningful name.

Parameters:

output_dir (Path) – The output directory to save the CSV file.
file_name (str, optional) – The base name for the CSV file. Defaults to “summary”.

Returns:

The path to the exported CSV file.

Return type:

Path

to_dataframe(flatten: bool = False, dropna: bool = False) → DataFrame

Returns the current view as a pandas DataFrame.

Parameters:

flatten (bool, optional) – If True, reset the index to a flat structure. Defaults to False.
dropna (bool, optional) – If True, drop rows in which all values are NaN. Defaults to False.

Returns:

pandas DataFrame (copy of current view)
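
The two flags presumably correspond to standard pandas operations. The sketch below works under that assumption, using `reset_index()` for flattening and `dropna(how="all")` for removing all-NaN rows on made-up data; it does not show how `to_dataframe` is actually implemented.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("ds_a", "cam1"), ("ds_a", "cam2"), ("ds_b", "cam1")],
    names=["dataset", "camera"],
)
df = pd.DataFrame(
    {"jpe": [1.0, np.nan, np.nan], "pck": [0.9, np.nan, 0.5]},
    index=index,
)

# dropna(how="all") removes only rows in which every value is NaN.
cleaned = df.dropna(how="all")

# reset_index() turns the index levels into ordinary columns.
flat = cleaned.reset_index()
```

In plain pandas the order matters: once the index levels have become columns, no row is entirely NaN any more, so dropping has to happen before flattening to have an effect.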