Workflow Base Classes

Base class for Anvil workflows.

class openadmet.models.anvil.workflow_base.AnvilWorkflowBase(*, metadata: openadmet.models.anvil.specification.Metadata, data_spec: openadmet.models.anvil.specification.DataSpec, transform: openadmet.models.transforms.transform_base.TransformBase | None = None, split: openadmet.models.split.split_base.SplitterBase, feat: openadmet.models.features.feature_base.FeaturizerBase, model: openadmet.models.architecture.model_base.ModelBase, ensemble: openadmet.models.active_learning.ensemble_base.EnsembleBase | None = None, trainer: openadmet.models.trainer.trainer_base.TrainerBase, evals: list[openadmet.models.eval.eval_base.EvalBase], model_kwargs: dict = <factory>, ensemble_kwargs: dict = <factory>, debug: bool = False, resolved_output_dir: pathlib.Path | None = None)

Bases: BaseModel

Base class for Anvil workflows.

Variables:
  • metadata (Metadata) – Metadata for the workflow.

  • data_spec (DataSpec) – Data specification for the workflow.

  • transform (Optional[TransformBase]) – Optional transform step.

  • split (SplitterBase) – Data splitting strategy.

  • feat (FeaturizerBase) – Feature extraction method.

  • model (ModelBase) – The model to be used.

  • ensemble (Optional[EnsembleBase]) – Optional ensemble model.

  • trainer (TrainerBase) – The trainer for the model.

  • evals (list[EvalBase]) – List of evaluation metrics.

  • model_kwargs (dict) – Runtime model settings from the specification domain.

  • ensemble_kwargs (dict) – Runtime ensemble settings from the specification domain.

  • debug (bool) – Whether to run in debug mode.

check_model_trainer_compatibility() → AnvilWorkflowBase

Validate that the model and trainer are compatible.

Raises:

ValueError – If the model and trainer driver types do not match.

Returns:

The validated workflow instance.

Return type:

AnvilWorkflowBase
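
The check can be pictured as a comparison of two driver-type tags. The following is a minimal stand-in that uses only the standard library; it is not the openadmet implementation. The `driver_type` attribute name, the `"sklearn"` and `"lightning"` values, and the `Toy*` classes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToyModel:
    driver_type: str  # hypothetical attribute name


@dataclass
class ToyTrainer:
    driver_type: str  # hypothetical attribute name


@dataclass
class ToyWorkflow:
    model: ToyModel
    trainer: ToyTrainer

    def check_model_trainer_compatibility(self) -> "ToyWorkflow":
        # Raise if the two components were built for different drivers.
        if self.model.driver_type != self.trainer.driver_type:
            raise ValueError(
                f"model driver {self.model.driver_type!r} does not match "
                f"trainer driver {self.trainer.driver_type!r}"
            )
        # Return the instance so the check can be chained.
        return self
```

Returning the instance mirrors the documented return type, which lets the caller keep using the validated workflow directly.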

check_multitask_compatibility() → None

Validate that the model and data specification are compatible for multitask learning.

Raises:

ValueError – If the model is multitask but the data specification does not support multitask learning.

check_trainer_cv_compatibility() → AnvilWorkflowBase

Validate that the trainer supports cross-validation if any evaluation requires it.

Raises:

ValueError – If the trainer does not support cross-validation but an evaluation requires it.

Returns:

The validated workflow instance.

Return type:

AnvilWorkflowBase
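
As an illustration of the rule described above, the sketch below rejects a workflow when any evaluation needs cross-validation and the trainer cannot provide it. It is a standard-library stand-in under stated assumptions: the `requires_cv` and `supports_cv` flags and the `Toy*` classes are hypothetical and may not match how openadmet exposes this information.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEval:
    name: str
    requires_cv: bool = False  # hypothetical flag


@dataclass
class ToyTrainer:
    supports_cv: bool = False  # hypothetical flag


@dataclass
class ToyWorkflow:
    trainer: ToyTrainer
    evals: list = field(default_factory=list)

    def check_trainer_cv_compatibility(self) -> "ToyWorkflow":
        # Collect the evaluations that need cross-validation.
        needing_cv = [e.name for e in self.evals if e.requires_cv]
        if needing_cv and not self.trainer.supports_cv:
            raise ValueError(
                f"evaluations {needing_cv} require cross-validation, "
                "but the trainer does not support it"
            )
        return self
```

A workflow whose evaluations never ask for cross-validation passes regardless of the trainer, which matches the "if any evaluation requires it" wording.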

data_spec: DataSpec
debug: bool
ensemble: EnsembleBase | None
ensemble_kwargs: dict
evals: list[openadmet.models.eval.eval_base.EvalBase]
feat: FeaturizerBase
metadata: Metadata
model: ModelBase
model_config: ClassVar[ConfigDict] = {}

Pydantic configuration for this class (not the ML model held in model); should be a dictionary conforming to pydantic.config.ConfigDict.

model_kwargs: dict
no_ensemble_cross_val() → AnvilWorkflowBase

Validate that ensemble models are not used with cross-validation.

Raises:

ValueError – If an ensemble model is used with cross-validation.

Returns:

The validated workflow instance.

Return type:

AnvilWorkflowBase

resolved_output_dir: Path | None
abstract run(output_dir: PathLike = 'anvil_training', debug: bool = False) → Any

Run the workflow.

Parameters:
  • output_dir (PathLike, optional) – Directory in which to save outputs. Defaults to “anvil_training”.

  • debug (bool, optional) – Whether to run in debug mode. Defaults to False.

Returns:

Result of the workflow run.

Return type:

Any
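
Because run is abstract, a concrete workflow has to supply its own implementation. The sketch below shows the general pattern with `abc.ABC` as a stand-in for the real Pydantic-based class; `ToyWorkflowBase` and `ToyEchoWorkflow` are hypothetical names, and the returned dictionary is only an illustration of an `Any` result.

```python
from abc import ABC, abstractmethod
from os import PathLike
from pathlib import Path
from typing import Any


class ToyWorkflowBase(ABC):
    @abstractmethod
    def run(self, output_dir: "str | PathLike" = "anvil_training", debug: bool = False) -> Any:
        """Run the workflow."""


class ToyEchoWorkflow(ToyWorkflowBase):
    def run(self, output_dir="anvil_training", debug=False):
        # A real workflow would train and evaluate here; this only echoes its inputs.
        return {"output_dir": Path(output_dir), "debug": debug}
```

In this sketch, instantiating `ToyWorkflowBase` directly raises `TypeError`, while `ToyEchoWorkflow().run()` returns its defaults.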

split: SplitterBase
trainer: TrainerBase
transform: TransformBase | None