Cross Validation

Cross-validation evaluators for regression models.

class openadmet.models.eval.cross_validation.CrossValidationBase(*, n_resamples: int = 9999, axes_labels: list[str] = ['Measured', 'Predicted'], title: str = 'Pred vs ', pXC50: bool = False, plot_errbars: bool = False, confidence_level: float = 0.95, min_val: float = None, max_val: float = None, **extra_data: Any)[source]

Bases: EvalBase

Base class for cross-validation evaluators.

Variables:
  • _evaluated (bool) – Whether the evaluator has been run.

  • axes_labels (list[str]) – Labels for the axes in plots.

  • title (str) – Title for the plots.

  • pXC50 (bool) – Whether the plotted values are pXC50, in which case the 0.5 and 1.0 log-unit ranges are highlighted.

  • plot_errbars (bool) – Whether to plot error bars for ensemble predictions.

  • confidence_level (float) – Confidence level for the confidence interval.

  • _metrics (dict) – Dictionary of metrics to evaluate.

  • min_val (float) – Minimum value for the axes.

  • max_val (float) – Maximum value for the axes.

property active_metrics

Return metrics applicable to the current target scale.

axes_labels: list[str]
confidence_level: float
is_cross_val: ClassVar[bool] = True
max_val: float
property metric_names

Get the list of metric names.

Returns:

List of metric names.

Return type:

list of str

min_val: float
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:
  • self – The BaseModel instance.

  • context – The context.

pXC50: bool
plot_errbars: bool
title: str
class openadmet.models.eval.cross_validation.PytorchLightningRepeatedKFoldCrossValidation(*, n_resamples: int = 9999, axes_labels: list[str] = ['Measured', 'Predicted'], title: str = 'Pred vs ', pXC50: bool = False, plot_errbars: bool = False, confidence_level: float = 0.95, min_val: float = None, max_val: float = None, n_splits: int = 5, n_repeats: int = 1, random_state: int = 42, use_wandb: bool = False, **extra_data: Any)[source]

Bases: CrossValidationBase

Cross-validation evaluator for PyTorch Lightning models.

Variables:
  • n_splits (int) – Number of splits for cross-validation.

  • n_repeats (int) – Number of repeats for cross-validation.

  • random_state (int) – Random state for reproducibility.

  • _evaluated (bool) – Whether the evaluator has been run.

  • axes_labels (list[str]) – Labels for the axes in plots.

  • title (str) – Title for the plots.

  • pXC50 (bool) – Whether the plotted values are pXC50, in which case the 0.5 and 1.0 log-unit ranges are highlighted.

  • confidence_level (float) – Confidence level for the confidence interval.

  • _metrics (dict) – Dictionary of metrics to evaluate.

  • min_val (float) – Minimum value for the axes.

  • max_val (float) – Maximum value for the axes.

  • use_wandb (bool) – Whether to use wandb for logging.

property active_metrics

Return metrics applicable to Lightning CV using raw metric callables.

axes_labels: list[str]
confidence_level: float
evaluate(model=None, X_train=None, y_true=None, y_pred=None, y_train=None, X_all=None, y_all=None, groups=None, featurizer=None, trainer=None, tag=None, use_wandb=False, target_labels=None, **kwargs)[source]

Evaluate the regression model using repeated K-fold cross-validation with PyTorch Lightning.

Parameters:
  • model (LightningModelBase) – The PyTorch Lightning model to evaluate.

  • X_train (array-like) – Training features.

  • y_true (array-like) – True values for the full dataset.

  • y_pred (array-like) – Predicted values for the full dataset.

  • y_train (array-like) – Training targets.

  • X_all (array-like) – All data features.

  • y_all (array-like) – All data targets.

  • groups (array-like, optional) – Group labels for the samples used while splitting the dataset.

  • featurizer (object) – Featurizer instance for data preprocessing.

  • trainer (LightningTrainer) – Trainer instance for model training.

  • tag (str, optional) – Tag for the evaluation run.

  • use_wandb (bool, optional) – Whether to use Weights & Biases logging.

  • target_labels (list of str, optional) – List of target names.

  • kwargs (Dict) – Additional keyword arguments.

Returns:

Dictionary containing cross-validation metrics and confidence intervals.

Return type:

dict
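The returned dictionary pairs each metric with a confidence interval, and the constructor exposes n_resamples and confidence_level. The source does not say how the interval is computed, so the sketch below is only an illustration that assumes a plain percentile bootstrap over per-fold scores. The function name bootstrap_ci and the fold values are hypothetical and not part of openadmet.

```python
import random
import statistics


def bootstrap_ci(values, stat=statistics.mean, n_resamples=9999,
                 confidence_level=0.95, seed=42):
    """Percentile bootstrap confidence interval for `stat` over `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Recompute the statistic on n_resamples resamples drawn with replacement.
    estimates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    # Cut off (1 - confidence_level) / 2 of the estimates in each tail.
    alpha = (1.0 - confidence_level) / 2.0
    lower = estimates[int(alpha * (n_resamples - 1))]
    upper = estimates[int((1.0 - alpha) * (n_resamples - 1))]
    return lower, upper


# Hypothetical per-fold MAE values from a 5-fold run (illustrative only).
fold_mae = [0.41, 0.38, 0.45, 0.40, 0.43]
low, high = bootstrap_ci(fold_mae, n_resamples=999)
```

A wider confidence_level trims less from each tail and so yields a wider interval; a larger n_resamples mainly stabilises the interval endpoints.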

get_stat_caption(t_label)[source]

Get a formatted statistics caption for a given task.

Parameters:

t_label (str) – Task label.

Returns:

Caption string with statistics.

Return type:

str

get_stat_dict(t_label)[source]

Get a statistics dictionary for a given task.

Parameters:

t_label (str) – Task label.

Returns:

Dictionary of statistics for the task.

Return type:

dict

max_val: float
min_val: float
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:
  • self – The BaseModel instance.

  • context – The context.

n_repeats: int
n_splits: int
pXC50: bool
random_state: int
report(write=False, output_dir=None)[source]

Report the evaluation results, optionally writing to disk.

Parameters:
  • write (bool, optional) – Whether to write the report to disk.

  • output_dir (str, optional) – Output directory for the report.

Returns:

Dictionary of computed metrics.

Return type:

dict

property task_names

Get the task names after evaluation.

Returns:

List of task names.

Return type:

list of str

title: str
use_wandb: bool
write_report(output_dir)[source]

Write the evaluation report and plots to disk.

Parameters:

output_dir (str) – Output directory for the report and plots.

class openadmet.models.eval.cross_validation.SKLearnRepeatedKFoldCrossValidation(*, n_resamples: int = 9999, axes_labels: list[str] = ['Measured', 'Predicted'], title: str = 'Pred vs ', pXC50: bool = False, plot_errbars: bool = False, confidence_level: float = 0.95, min_val: float = None, max_val: float = None, n_splits: int = 5, n_repeats: int = 1, random_state: int = 42, **extra_data: Any)[source]

Bases: CrossValidationBase

Cross-validation evaluator for sklearn models (single-task regression).

Variables:
  • n_splits (int) – Number of splits for cross-validation.

  • n_repeats (int) – Number of repeats for cross-validation.

  • random_state (int) – Random state for reproducibility.

evaluate(model=None, X_train=None, y_train=None, y_pred=None, y_true=None, X_all=None, y_all=None, groups=None, tag=None, target_labels=None, **kwargs)[source]

Evaluate the regression model using repeated K-fold cross-validation.

Parameters:
  • model (sklearn-like estimator) – The regression model to evaluate.

  • X_train (array-like) – Training features.

  • y_train (array-like) – Training targets.

  • y_pred (array-like) – Predicted values (not used in cross-validation, but required for interface).

  • y_true (array-like) – True values (not used in cross-validation, but required for interface).

  • X_all (array-like) – All data features.

  • y_all (array-like) – All data targets.

  • groups (array-like, optional) – Group labels for the samples used while splitting the dataset.

  • tag (str, optional) – Tag for the evaluation run.

  • target_labels (list of str, optional) – List of target names.

  • kwargs (Dict) – Additional keyword arguments.

Returns:

Dictionary containing cross-validation metrics and confidence intervals.

Return type:

dict
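As a rough picture of what repeated K-fold evaluation involves, the sketch below reshuffles the sample indices once per repeat, cuts them into n_splits folds, and scores a stand-in model on each held-out fold. It uses only the standard library; the helper names, the toy data, and the mean-predictor "model" are assumptions for illustration and do not reflect how this class is implemented.

```python
import random


def repeated_k_fold(n_samples, n_splits=5, n_repeats=1, random_state=42):
    """Yield (train_idx, test_idx) for each fold of each repeat."""
    rng = random.Random(random_state)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh shuffle for every repeat
        for k in range(n_splits):
            test_idx = indices[k::n_splits]  # every n_splits-th shuffled sample
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            yield train_idx, test_idx


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy targets and a stand-in "model" that always predicts the training mean.
y_all = [5.1, 6.3, 4.8, 7.0, 5.9, 6.1, 5.5, 6.8, 4.9, 6.0]
fold_scores = []
for train_idx, test_idx in repeated_k_fold(len(y_all), n_splits=5, n_repeats=2):
    train_mean = sum(y_all[i] for i in train_idx) / len(train_idx)
    y_test = [y_all[i] for i in test_idx]
    fold_scores.append(mean_absolute_error(y_test, [train_mean] * len(y_test)))
```

With n_splits=5 and n_repeats=2 the loop produces ten fold scores, one per fold per repeat.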

get_stat_caption(t_label)[source]

Get a formatted statistics caption for a given task.

Parameters:

t_label (str) – Task label.

Returns:

Caption string with statistics.

Return type:

str

get_stat_dict(t_label)[source]

Get a statistics dictionary for a given task.

Parameters:

t_label (str) – Task label.

Returns:

Dictionary of statistics for the task.

Return type:

dict

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:
  • self – The BaseModel instance.

  • context – The context.

n_repeats: int
n_splits: int
random_state: int
report(write=False, output_dir=None)[source]

Report the evaluation results, optionally writing to disk.

Parameters:
  • write (bool, optional) – Whether to write the report to disk.

  • output_dir (str, optional) – Output directory for the report.

Returns:

Dictionary of computed metrics.

Return type:

dict

write_report(output_dir)[source]

Write the evaluation report and plots to disk.

Parameters:

output_dir (str) – Output directory for the report and plots.

openadmet.models.eval.cross_validation.repeated_group_k_fold(X, y, groups, n_splits, n_repeats, random_state)[source]

Generate train/test indices for Repeated Group K-Fold cross-validation.

Parameters:
  • X (array-like) – Feature data.

  • y (array-like) – Target data.

  • groups (array-like) – Group labels for the samples used while splitting the dataset.

  • n_splits (int) – Number of splits for cross-validation.

  • n_repeats (int) – Number of repeats for cross-validation.

  • random_state (int) – Random state for reproducibility.

Returns:

  • train_inds (list of np.ndarray) – List of training set indices for each fold.

  • test_inds (list of np.ndarray) – List of test set indices for each fold.
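The idea behind group-aware splitting is that every sample sharing a group label ends up on the same side of each split. The standard-library sketch below shows one way to do that: shuffle the distinct groups once per repeat and deal them into n_splits folds. It is a hypothetical stand-in (note the _sketch suffix), takes only groups rather than X and y, and returns plain lists instead of np.ndarray; the library's actual implementation may differ.

```python
import random


def repeated_group_k_fold_sketch(groups, n_splits, n_repeats, random_state):
    """Return (train_inds, test_inds): one index list per fold per repeat."""
    rng = random.Random(random_state)
    unique_groups = sorted(set(groups))
    if n_splits > len(unique_groups):
        raise ValueError("n_splits cannot exceed the number of distinct groups")
    train_inds, test_inds = [], []
    for _ in range(n_repeats):
        shuffled = unique_groups[:]
        rng.shuffle(shuffled)  # new group ordering for every repeat
        for k in range(n_splits):
            # Whole groups are assigned to the test fold, never single samples.
            test_groups = set(shuffled[k::n_splits])
            test_inds.append(
                [i for i, g in enumerate(groups) if g in test_groups])
            train_inds.append(
                [i for i, g in enumerate(groups) if g not in test_groups])
    return train_inds, test_inds


# Hypothetical group labels, for example one label per chemical series.
groups = ["a", "a", "b", "b", "c", "c", "d", "e", "e", "f"]
train_inds, test_inds = repeated_group_k_fold_sketch(
    groups, n_splits=3, n_repeats=2, random_state=42)
```

Each entry of train_inds lines up with the entry of test_inds at the same position, so the two lists can be zipped to iterate over folds.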

openadmet.models.eval.cross_validation.wrap_ktau(y_true, y_pred)[source]

Wrap Kendall's tau (ktau) so that NaN values are omitted.
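To make the effect of NaN omission concrete, here is a standard-library sketch that drops every pair in which either value is NaN and then computes Kendall's tau-b on what remains. The function name kendall_tau_omit_nan is hypothetical, and the choice of the tau-b variant is an assumption; the wrapper documented here may delegate to a library routine instead.

```python
import math


def kendall_tau_omit_nan(y_true, y_pred):
    """Kendall's tau-b over the pairs where neither value is NaN."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred)
             if not (math.isnan(t) or math.isnan(p))]
    concordant = discordant = tied_true = tied_pred = 0
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            dt = pairs[i][0] - pairs[j][0]
            dp = pairs[i][1] - pairs[j][1]
            if dt == 0 and dp == 0:
                continue  # tied in both variables: counted in neither term
            elif dt == 0:
                tied_true += 1
            elif dp == 0:
                tied_pred += 1
            elif dt * dp > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + tied_true)
                      * (concordant + discordant + tied_pred))
    return (concordant - discordant) / denom if denom else float("nan")


y_true = [1.0, 2.0, float("nan"), 4.0, 5.0]
y_pred = [1.1, 1.9, 3.0, float("nan"), 5.2]
tau = kendall_tau_omit_nan(y_true, y_pred)  # uses only the 3 complete pairs
```

Without the omission step, a single NaN in either array would typically make the whole statistic NaN.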

openadmet.models.eval.cross_validation.wrap_spearmanr(y_true, y_pred)[source]

Wrap Spearman's rank correlation (spearmanr) so that NaN values are omitted.