Predict CLI Guide
The predict command-line interface (CLI) generates predictions using trained Anvil models. It supports inference from CSV or SDF input files, hardware accelerator configuration, and optional active learning acquisition functions.
Usage
predict --input-path PATH --model-dir MODEL_DIR [OPTIONS]
Options
- --input-path PATH
Required. Path to the input file containing molecular structures. Supported formats: CSV or SDF.
Example:
predict --input-path ./data/molecules.csv --model-dir ./models/my_model
- --input-col NAME
Column name in the CSV file that contains the molecular structures (SMILES strings). Defaults to OPENADMET_SMILES if not specified.
Example:
predict --input-path ./data/molecules.csv --model-dir ./models/my_model --input-col smiles
- --model-dir PATH
Required. Path to one or more trained model directories produced by openadmet anvil. Can be specified multiple times to run predictions with multiple models.
Example:
predict \
--input-path ./data/molecules.csv \
--model-dir ./models/model_a \
--model-dir ./models/model_b
- --output-csv FILE
Path to the output CSV file where predictions will be written. Defaults to predictions.csv.
Example:
predict \
--input-path ./data/molecules.csv \
--model-dir ./models/my_model \
--output-csv ./results/preds.csv
- --accelerator {cpu,gpu,tpu,ipu,mps,auto}
Hardware accelerator to use for inference. Defaults to gpu if available.
Choices:
cpu – Run inference on the CPU.
gpu – Run inference on the GPU (default).
tpu – Use TPU hardware.
ipu – Use IPU hardware.
mps – Use Apple MPS backend.
auto – Automatically select available hardware.
Example:
predict \
--input-path ./data/molecules.csv \
--model-dir ./models/my_model \
--accelerator cpu
- --aq-fxn {ucb,ei,pi}
Acquisition function(s) for active learning. Can be specified multiple times to combine different functions. Supported values:
ucb – Upper Confidence Bound (requires --beta).
ei – Expected Improvement (requires --best-y and --xi).
pi – Probability of Improvement (requires --best-y and --xi).
Examples:
predict \
--input-path ./data/molecules.csv \
--model-dir ./models/my_model \
--aq-fxn ucb --beta 0.5

predict \
--input-path ./data/molecules.csv \
--model-dir ./models/my_model \
--aq-fxn ei --best-y 1.0 --xi 0.1
- --beta VALUE
Parameter for the ucb acquisition function.
Example:
predict \
--input-path ./data/molecules.csv \
--model-dir ./models/my_model \
--aq-fxn ucb --beta 2.0
- --best-y VALUE
Parameter for the ei and pi acquisition functions. Must be specified once per acquisition function.
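The options above name the parameters each acquisition function needs, but this guide does not give the formulas. The Python sketch below uses the common textbook definitions for a maximization problem with a Gaussian predictive mean and standard deviation. It is an assumption for illustration only: the expressions and sign conventions used inside predict may differ. The function names and the mean/std inputs are hypothetical; beta, best_y, and xi mirror the CLI flags.

```python
import math


def _norm_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ucb(mean, std, beta):
    """Upper Confidence Bound: mean plus beta times the uncertainty."""
    return mean + beta * std


def ei(mean, std, best_y, xi):
    """Expected Improvement over best_y, with exploration margin xi."""
    improvement = mean - best_y - xi
    if std <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / std
    return improvement * _norm_cdf(z) + std * _norm_pdf(z)


def pi(mean, std, best_y, xi):
    """Probability of Improvement over best_y, with exploration margin xi."""
    improvement = mean - best_y - xi
    if std <= 0.0:
        return 1.0 if improvement > 0.0 else 0.0
    return _norm_cdf(improvement / std)


# Score a prediction of 1.2 with standard deviation 0.3
# against a best observed value of 1.0 (hypothetical numbers).
print(ucb(1.2, 0.3, beta=0.5))
print(ei(1.2, 0.3, best_y=1.0, xi=0.1))
print(pi(1.2, 0.3, best_y=1.0, xi=0.1))
```

Under these definitions, a larger beta or xi shifts the ranking toward molecules with uncertain predictions, while best_y sets the value a candidate has to beat.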
Description
The predict CLI:
- Reads molecular input data from CSV or SDF files.
- Loads one or more trained Anvil models from --model-dir.
- Runs inference on the specified hardware accelerator.
- Optionally applies active learning acquisition functions (UCB, EI, PI).
- Writes predictions to the output CSV file.
Example Workflow
Run:
predict \
--input-path ./data/test_set.csv \
--input-col smiles \
--model-dir ./models/final_model \
--output-csv ./results/predictions.csv \
--accelerator gpu \
--aq-fxn ei --best-y 0.9 --xi 0.05 \
--debug
Expected output:
Predictions written to ./results/predictions.csv
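The workflow above passes --input-col smiles because the SMILES column in that file is not named OPENADMET_SMILES. Before a long inference run, it can help to confirm that the expected column is actually present. The following is a minimal Python sketch and not part of the predict CLI. The helper name has_column is hypothetical; only the default column name comes from this guide.

```python
import csv
import io

DEFAULT_SMILES_COLUMN = "OPENADMET_SMILES"  # default documented for --input-col


def has_column(csv_file, column=DEFAULT_SMILES_COLUMN):
    """Return True if the CSV header row contains `column`.

    `csv_file` is any open text file object; only the first row is read.
    """
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    return column in header


# An in-memory CSV stands in for a real input file here.
sample = io.StringIO("smiles,name\nCCO,ethanol\n")
print(has_column(sample))            # False: the default column is absent
sample.seek(0)
print(has_column(sample, "smiles"))  # True: pass --input-col smiles for this file
```

With a real file, open it with open(path, newline="") and pass the file object to has_column.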
Example: Predict from an SDF File
Suppose you have an input file molecules.sdf containing a set of molecular structures. You can run inference with a trained model directory as follows:
predict \
--input-path ./data/molecules.sdf \
--model-dir ./models/final_model \
--output-csv ./results/predictions_from_sdf.csv \
--accelerator gpu
Notes:
- The --input-col option is not required when using SDF input.
- Predictions will be saved in ./results/predictions_from_sdf.csv.
- If metadata fields (e.g., <ID>) are present in the SDF, they will be included in the output CSV alongside predictions.
- Hardware can be selected with --accelerator (e.g., cpu, gpu).
Expected output:
Predictions written to ./results/predictions_from_sdf.csv
Exit Codes
- 0: Prediction completed successfully.
- Non-zero: Prediction encountered an error (see logs or use --debug).
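When predict is called from a script or pipeline, the exit code is the simplest signal of success. The Python sketch below uses the standard subprocess module. The wrapper name run_and_check is hypothetical, and the demonstration launches the Python interpreter itself rather than predict so that it runs on any machine.

```python
import subprocess
import sys


def run_and_check(argv):
    """Run a command and return (succeeded, exit_code).

    Follows the convention above: 0 means success, anything else is an error.
    """
    completed = subprocess.run(argv)
    return completed.returncode == 0, completed.returncode


# Stand-in commands. In practice argv would start with "predict", for example:
# ["predict", "--input-path", "./data/molecules.csv", "--model-dir", "./models/my_model"]
print(run_and_check([sys.executable, "-c", "raise SystemExit(0)"]))  # (True, 0)
print(run_and_check([sys.executable, "-c", "raise SystemExit(3)"]))  # (False, 3)
```

On a non-zero code, a caller would typically rerun the same command with --debug appended to collect detailed logs.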
Notes
- Multiple models can be specified by repeating --model-dir to perform ensemble predictions.
- Acquisition functions must be configured with their required parameters; otherwise execution will fail.
- Debug mode (--debug) provides detailed logging for troubleshooting.
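Because a missing acquisition parameter makes the run fail, a calling script can check the requirements before launching predict. The Python sketch below encodes the requirements stated in this guide (ucb needs --beta; ei and pi need --best-y and --xi) and repeats --model-dir once per model. The helper names missing_params and build_argv are hypothetical and are not part of the CLI.

```python
# Required parameters per acquisition function, as listed under --aq-fxn.
REQUIRED_PARAMS = {
    "ucb": ("beta",),
    "ei": ("best_y", "xi"),
    "pi": ("best_y", "xi"),
}


def missing_params(aq_fxn, **params):
    """Return the names of required parameters not supplied for aq_fxn."""
    if aq_fxn not in REQUIRED_PARAMS:
        raise ValueError(f"unsupported acquisition function: {aq_fxn}")
    return [name for name in REQUIRED_PARAMS[aq_fxn] if params.get(name) is None]


def build_argv(input_path, model_dirs, aq_fxn=None, **params):
    """Assemble a predict argument list, repeating --model-dir per model."""
    argv = ["predict", "--input-path", input_path]
    for model_dir in model_dirs:
        argv += ["--model-dir", model_dir]
    if aq_fxn is not None:
        missing = missing_params(aq_fxn, **params)
        if missing:
            raise ValueError(f"{aq_fxn} is missing: {', '.join(missing)}")
        argv += ["--aq-fxn", aq_fxn]
        for name in REQUIRED_PARAMS[aq_fxn]:
            # best_y becomes --best-y, matching the flag spelling in this guide.
            argv += ["--" + name.replace("_", "-"), str(params[name])]
    return argv


print(build_argv(
    "./data/molecules.csv",
    ["./models/model_a", "./models/model_b"],
    aq_fxn="ucb",
    beta=0.5,
))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems with paths that contain spaces.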