Cross Validation

Cross-validation evaluators for regression models.

class openadmet.models.eval.cross_validation.CrossValidationBase(*, n_resamples: int = 9999, axes_labels: list[str] = ['Measured', 'Predicted'], title: str = 'Pred vs ', pXC50: bool = False, plot_errbars: bool = False, confidence_level: float = 0.95, min_val: float = None, max_val: float = None, **extra_data: Any)[source]

Bases: EvalBase

Base class for cross-validation evaluators.

Variables:

_evaluated (bool) – Whether the evaluator has been run.
axes_labels (list[str]) – Labels for the axes in plots.
title (str) – Title for the plots.
pXC50 (bool) – Whether to plot for pXC50, highlighting the 0.5 and 1.0 log-unit ranges.
plot_errbars (bool) – Whether to plot error bars for ensemble predictions.
confidence_level (float) – Confidence level for the confidence interval.
_metrics (dict) – Dictionary of metrics to evaluate.
min_val (float) – Minimum value for the axes.
max_val (float) – Maximum value for the axes.
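
To make the meaning of confidence_level concrete, here is a minimal sketch of a percentile bootstrap interval over per-fold scores. This page does not say how the library computes its intervals, so the method shown is an assumption. The function percentile_bootstrap_ci and the fold_scores data are hypothetical names used only for illustration. The defaults mirror the n_resamples and confidence_level defaults in the signature above.

```python
import random
import statistics


def percentile_bootstrap_ci(values, confidence_level=0.95, n_resamples=9999, seed=0):
    """Percentile bootstrap interval for the mean of `values` (illustrative only)."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    # Cut (1 - confidence_level) / 2 of the resampled means off each tail.
    alpha = (1.0 - confidence_level) / 2.0
    lo_idx = int(alpha * (n_resamples - 1))
    hi_idx = int((1.0 - alpha) * (n_resamples - 1))
    return means[lo_idx], means[hi_idx]


# Hypothetical per-fold scores from a 5-fold run (not real results).
fold_scores = [0.71, 0.68, 0.74, 0.66, 0.72]
low, high = percentile_bootstrap_ci(fold_scores)
```

A higher confidence_level widens the interval, and a larger n_resamples makes its endpoints more stable from run to run.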

property active_metrics: Return metrics applicable to the current target scale.

axes_labels: list[str]

confidence_level: float

is_cross_val: ClassVar[bool] = True

max_val: float

property metric_names

Get the list of metric names.

Returns: List of metric names.
Return type: list of str

min_val: float

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:

self – The BaseModel instance.
context – The context.

pXC50: bool

plot_errbars: bool

title: str

class openadmet.models.eval.cross_validation.PytorchLightningRepeatedKFoldCrossValidation(*, n_resamples: int = 9999, axes_labels: list[str] = ['Measured', 'Predicted'], title: str = 'Pred vs ', pXC50: bool = False, plot_errbars: bool = False, confidence_level: float = 0.95, min_val: float = None, max_val: float = None, n_splits: int = 5, n_repeats: int = 1, random_seed: int = 42, use_wandb: bool = False, **extra_data: Any)[source]

Bases: CrossValidationBase

Cross-validation evaluator for PyTorch Lightning models.

Variables:

n_splits (int) – Number of splits for cross-validation.
n_repeats (int) – Number of repeats for cross-validation.
random_seed (int) – Random seed for reproducibility. The legacy random_state name is accepted as a deprecated alias.
_evaluated (bool) – Whether the evaluator has been run.
axes_labels (list[str]) – Labels for the axes in plots.
title (str) – Title for the plots.
pXC50 (bool) – Whether to plot for pXC50, highlighting the 0.5 and 1.0 log-unit ranges.
confidence_level (float) – Confidence level for the confidence interval.
_metrics (dict) – Dictionary of metrics to evaluate.
min_val (float) – Minimum value for the axes.
max_val (float) – Maximum value for the axes.
use_wandb (bool) – Whether to use wandb for logging.

property active_metrics: Return metrics applicable to Lightning CV using raw metric callables.

axes_labels: list[str]

confidence_level: float

evaluate(model=None, X_train=None, y_true=None, y_pred=None, y_train=None, X_all=None, y_all=None, groups=None, featurizer=None, trainer=None, tag=None, use_wandb=False, target_labels=None, **kwargs)[source]

Evaluate the regression model using repeated K-fold cross-validation with PyTorch Lightning.

Parameters:

model (LightningModelBase) – The PyTorch Lightning model to evaluate.
X_train (array-like) – Training features.
y_true (array-like) – True values for the full dataset.
y_pred (array-like) – Predicted values for the full dataset.
y_train (array-like) – Training targets.
X_all (array-like) – All data features.
y_all (array-like) – All data targets.
groups (array-like, optional) – Group labels for the samples used while splitting the dataset.
featurizer (object) – Featurizer instance for data preprocessing.
trainer (LightningTrainer) – Trainer instance for model training.
tag (str, optional) – Tag for the evaluation run.
use_wandb (bool, optional) – Whether to use Weights & Biases logging.
target_labels (list of str, optional) – List of target names.
kwargs (Dict) – Additional keyword arguments.

Returns:

Dictionary containing cross-validation metrics and confidence intervals.

Return type:

dict

get_stat_caption(t_label)[source]

Get a formatted statistics caption for a given task.

Parameters: t_label (str) – Task label.
Returns: Caption string with statistics.
Return type: str

get_stat_dict(t_label)[source]

Get a statistics dictionary for a given task.

Parameters: t_label (str) – Task label.
Returns: Dictionary of statistics for the task.
Return type: dict

max_val: float

min_val: float

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:

self – The BaseModel instance.
context – The context.

n_repeats: int

n_splits: int

pXC50: bool

random_seed: int

report(write=False, output_dir=None)[source]

Report the evaluation results, optionally writing to disk.

Parameters:

write (bool, optional) – Whether to write the report to disk.
output_dir (str, optional) – Output directory for the report.

Returns:

Dictionary of computed metrics.

Return type:

dict

property task_names

Get the task names after evaluation.

Returns: List of task names.
Return type: list of str

title: str

use_wandb: bool

write_report(output_dir)[source]

Write the evaluation report and plots to disk.

Parameters: output_dir (str) – Output directory for the report and plots.

class openadmet.models.eval.cross_validation.SKLearnRepeatedKFoldCrossValidation(*, n_resamples: int = 9999, axes_labels: list[str] = ['Measured', 'Predicted'], title: str = 'Pred vs ', pXC50: bool = False, plot_errbars: bool = False, confidence_level: float = 0.95, min_val: float = None, max_val: float = None, n_splits: int = 5, n_repeats: int = 1, random_seed: int = 42, **extra_data: Any)[source]

Bases: CrossValidationBase

Cross-validation evaluator for sklearn models (single-task regression).

Variables:

n_splits (int) – Number of splits for cross-validation.
n_repeats (int) – Number of repeats for cross-validation.
random_seed (int) – Random seed for reproducibility. The legacy random_state name is accepted as a deprecated alias.
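
The way n_splits, n_repeats, and random_seed interact can be sketched in plain Python. This is not the library's implementation. The generator repeated_k_fold_indices is a hypothetical helper that only illustrates the general repeated K-fold scheme: each repeat reshuffles the samples and partitions them into n_splits test folds, giving n_splits * n_repeats train/test pairs in total.

```python
import random


def repeated_k_fold_indices(n_samples, n_splits=5, n_repeats=1, random_seed=42):
    """Yield (train, test) index lists for n_splits * n_repeats folds."""
    rng = random.Random(random_seed)
    for _ in range(n_repeats):
        # A fresh shuffle per repeat gives a different partition each time.
        order = list(range(n_samples))
        rng.shuffle(order)
        for k in range(n_splits):
            # Every n_splits-th shuffled index goes to fold k's test set.
            test = sorted(order[k::n_splits])
            held_out = set(test)
            train = [i for i in range(n_samples) if i not in held_out]
            yield train, test


folds = list(repeated_k_fold_indices(n_samples=10, n_splits=5, n_repeats=2))
```

Fixing random_seed makes the sequence of partitions reproducible, so two runs with the same seed evaluate on identical folds.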

evaluate(model=None, X_train=None, y_train=None, y_pred=None, y_true=None, X_all=None, y_all=None, groups=None, tag=None, target_labels=None, **kwargs)[source]

Evaluate the regression model using repeated K-fold cross-validation.

Parameters:

model (sklearn-like estimator) – The regression model to evaluate.
X_train (array-like) – Training features.
y_train (array-like) – Training targets.
y_pred (array-like) – Predicted values (not used in cross-validation, but required for interface).
y_true (array-like) – True values (not used in cross-validation, but required for interface).
X_all (array-like) – All data features.
y_all (array-like) – All data targets.
groups (array-like, optional) – Group labels for the samples used while splitting the dataset.
tag (str, optional) – Tag for the evaluation run.
target_labels (list of str, optional) – List of target names.
kwargs (Dict) – Additional keyword arguments.

Returns:

Dictionary containing cross-validation metrics and confidence intervals.

Return type:

dict

get_stat_caption(t_label)[source]

Get a formatted statistics caption for a given task.

Parameters: t_label (str) – Task label.
Returns: Caption string with statistics.
Return type: str

get_stat_dict(t_label)[source]

Get a statistics dictionary for a given task.

Parameters: t_label (str) – Task label.
Returns: Dictionary of statistics for the task.
Return type: dict

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:

self – The BaseModel instance.
context – The context.

n_repeats: int

n_splits: int

random_seed: int

report(write=False, output_dir=None)[source]

Report the evaluation results, optionally writing to disk.

Parameters:

write (bool, optional) – Whether to write the report to disk.
output_dir (str, optional) – Output directory for the report.

Returns:

Dictionary of computed metrics.

Return type:

dict

write_report(output_dir)[source]

Write the evaluation report and plots to disk.

Parameters: output_dir (str) – Output directory for the report and plots.

openadmet.models.eval.cross_validation.repeated_group_k_fold(X, y, groups, n_splits, n_repeats, random_state)[source]

Generate train/test indices for Repeated Group K-Fold cross-validation.

Parameters:

X (array-like) – Feature data.
y (array-like) – Target data.
groups (array-like) – Group labels for the samples used while splitting the dataset.
n_splits (int) – Number of splits for cross-validation.
n_repeats (int) – Number of repeats for cross-validation.
random_state (int) – Random seed for reproducibility.

Returns:

train_inds (list of np.ndarray) – List of training set indices for each fold.
test_inds (list of np.ndarray) – List of test set indices for each fold.
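
The idea behind grouped splitting can be sketched as follows. This is not the library's code. The function repeated_group_k_fold_sketch is a hypothetical stand-in that takes only the group labels, returns plain Python lists rather than np.ndarray, and uses a simple round-robin assignment of shuffled groups to folds, which is an assumption. The property it demonstrates is that all samples sharing a group label land on the same side of every split.

```python
import random


def repeated_group_k_fold_sketch(groups, n_splits, n_repeats, random_state):
    """Return (train_inds, test_inds): one index list per fold, groups kept intact."""
    rng = random.Random(random_state)
    unique_groups = sorted(set(groups))
    train_inds, test_inds = [], []
    for _ in range(n_repeats):
        # Shuffle whole groups, not samples, so a group never straddles a split.
        shuffled = unique_groups[:]
        rng.shuffle(shuffled)
        for k in range(n_splits):
            test_groups = set(shuffled[k::n_splits])
            test = [i for i, g in enumerate(groups) if g in test_groups]
            train = [i for i, g in enumerate(groups) if g not in test_groups]
            train_inds.append(train)
            test_inds.append(test)
    return train_inds, test_inds


# Hypothetical group labels, one per sample.
groups = ["a", "a", "b", "b", "c", "c", "d", "e", "e", "f"]
train_inds, test_inds = repeated_group_k_fold_sketch(
    groups, n_splits=3, n_repeats=2, random_state=0
)
```

Both returned lists have n_splits * n_repeats entries, and entry i of each list describes the same fold.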

openadmet.models.eval.cross_validation.wrap_ktau(y_true, y_pred)[source]: Wrapper around Kendall's tau that omits NaN values.

openadmet.models.eval.cross_validation.wrap_spearmanr(y_true, y_pred)[source]: Wrapper around Spearman's rank correlation that omits NaN values.
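
NaN omission for a rank correlation can be illustrated with a small standard-library sketch. It is not how these wrappers are implemented, since this page does not show their internals. The helpers drop_nan_pairs and kendall_tau_a are hypothetical, and the tau-a variant (which ignores ties) is chosen only for brevity. The point is that a pair is dropped when either value is NaN, so the two sequences stay aligned.

```python
import math


def drop_nan_pairs(y_true, y_pred):
    """Keep only the positions where neither value is NaN."""
    kept = [
        (t, p) for t, p in zip(y_true, y_pred)
        if not (math.isnan(t) or math.isnan(p))
    ]
    return [t for t, _ in kept], [p for _, p in kept]


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    if n < 2:
        return math.nan  # Undefined for fewer than two observations.
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical measured and predicted values with missing entries.
y_true = [5.1, 6.3, math.nan, 7.0, 4.2]
y_pred = [5.0, 6.0, 6.5, math.nan, 4.5]
clean_true, clean_pred = drop_nan_pairs(y_true, y_pred)
tau = kendall_tau_a(clean_true, clean_pred)
```

Without the filtering step, a single NaN would make every comparison involving it meaningless, which is why the omission happens before the statistic is computed.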