ChemProp
ChemProp featurizer implementation.
- class openadmet.models.features.chemprop.ChemPropFeaturizer(*, normalize_targets: bool = True, n_jobs: int = 4, batch_size: int = 128, shuffle: bool = False)
Bases: DeepLearningFeaturizer
ChemPropFeaturizer is a featurizer for molecules that relies on chemprop.
- Parameters:
normalize_targets (bool, optional) – Whether to normalize the targets using StandardScaler, by default True
n_jobs (int, optional) – Number of parallel workers to use, by default 4
batch_size (int, optional) – Batch size for the DataLoader, by default 128
shuffle (bool, optional) – Whether to shuffle the data in the DataLoader, by default False
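To make the normalize_targets option concrete, here is a minimal plain-Python sketch of standard scaling. It assumes the usual zero-mean, unit-variance definition (population standard deviation); the helper names and target values are hypothetical and are not part of openadmet or chemprop.

```python
from statistics import fmean, pstdev


def standardize(targets):
    """Return (scaled, mean, std) using zero-mean, unit-variance scaling."""
    mean = fmean(targets)
    std = pstdev(targets)  # population standard deviation (ddof=0)
    if std == 0:
        std = 1.0  # avoid division by zero when all targets are identical
    scaled = [(t - mean) / std for t in targets]
    return scaled, mean, std


def unstandardize(scaled, mean, std):
    """Map scaled values back to the original target units."""
    return [s * std + mean for s in scaled]


# Made-up target values, for illustration only.
targets = [1.0, 2.0, 3.0, 4.0]
scaled, mean, std = standardize(targets)
restored = unstandardize(scaled, mean, std)
```

A model trained on scaled targets predicts in scaled units, so the stored mean and standard deviation are needed to convert predictions back, which is presumably why featurize returns the scaler alongside the data.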
- static dataset_to_dataloader(dataset: MoleculeDataset, batch_size: int = 128, shuffle: bool = False, sampler=None, **kwargs) -> DataLoader
Convert a MoleculeDataset to a PyTorch DataLoader.
- Parameters:
dataset (MoleculeDataset) – The dataset containing the molecules to load.
batch_size (int, optional) – Number of samples per batch to load (default is 128).
shuffle (bool, optional) – Whether to shuffle the data at every epoch (default is False).
sampler (torch.utils.data.Sampler, optional) – Custom sampler to use for loading data (default is None).
**kwargs – Additional keyword arguments passed to the DataLoader.
- Returns:
A PyTorch DataLoader for the given MoleculeDataset.
- Return type:
DataLoader
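The effect of batch_size and shuffle can be illustrated with a plain-Python stand-in. This is only a sketch of the batching idea, not the DataLoader itself: iter_batches and its seed argument are hypothetical, it shuffles once per call rather than once per epoch, and it always keeps a final, smaller batch.

```python
import random


def iter_batches(items, batch_size=128, shuffle=False, seed=None):
    """Yield consecutive batches of `items`, optionally in shuffled order."""
    order = list(range(len(items)))
    if shuffle:
        # A seeded generator keeps the shuffled order reproducible.
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [items[i] for i in order[start:start + batch_size]]


# 300 placeholder items split into batches of at most 128.
molecules = [f"mol_{i}" for i in range(300)]
batch_sizes = [len(batch) for batch in iter_batches(molecules, batch_size=128)]

# Shuffling changes the order of items but not which items are served.
shuffled = [m for batch in iter_batches(molecules, shuffle=True, seed=0) for m in batch]
```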
- featurize(smiles: Iterable[str], y: Iterable[Any] = None) -> tuple[DataLoader, np.ndarray, StandardScaler, MoleculeDataset | ReactionDataset | MulticomponentDataset]
Featurize a list of SMILES strings.
- Parameters:
smiles (Iterable[str]) – List or iterable of SMILES strings to featurize.
y (Iterable[Any], optional) – Target values corresponding to the SMILES strings.
- Returns:
Tuple containing:
DataLoader – PyTorch DataLoader for the dataset.
np.ndarray – Array of indices corresponding to the original input.
StandardScaler – Scaler used for any scaling during featurization.
Union[MoleculeDataset, ReactionDataset, MulticomponentDataset] – PyTorch Dataset containing the features and targets.
- Return type:
tuple
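One possible use of the returned index array is to line results back up with the original input. The sketch below assumes, without confirmation from the documentation above, that the indices give the positions of the input entries that were kept during featurization. The align_to_input helper and all values are hypothetical.

```python
def align_to_input(values, indices, n_input, fill=None):
    """Place each value at its original input position; gaps get `fill`."""
    aligned = [fill] * n_input
    for value, idx in zip(values, indices):
        aligned[int(idx)] = value
    return aligned


# Per the signature above, the tuple would be unpacked as:
#   dataloader, indices, scaler, dataset = featurizer.featurize(smiles, y)
# The values below are made up to show the alignment step only.
smiles = ["CCO", "not-a-smiles", "c1ccccc1", "CC(=O)O"]
indices = [0, 2, 3]               # ASSUMPTION: positions of retained inputs
predictions = [0.12, 0.58, 0.33]  # one value per retained molecule

aligned = align_to_input(predictions, indices, n_input=len(smiles))
```

If the indices turn out to have a different meaning in practice, the alignment step should be adapted accordingly.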
- make_new() -> ChemPropFeaturizer
Copy parameters to a new ChemPropFeaturizer instance.
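The copy-parameters pattern can be sketched with a small stand-in class. SketchFeaturizer is hypothetical and only mirrors the four constructor parameters listed above; the actual make_new implementation may differ.

```python
from dataclasses import asdict, dataclass


@dataclass
class SketchFeaturizer:
    """Stand-in holding the same four parameters as the constructor above."""

    normalize_targets: bool = True
    n_jobs: int = 4
    batch_size: int = 128
    shuffle: bool = False

    def make_new(self):
        # Build a fresh instance from this instance's parameter values.
        return type(self)(**asdict(self))


original = SketchFeaturizer(batch_size=64, shuffle=True)
fresh = original.make_new()
```

The new object carries the same configuration but is a distinct instance, so any state built up on one does not leak into the other.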