Molecule Fingerprints

Fingerprint featurizer using the molfeat library.

class openadmet.models.features.molfeat_fingerprint.FingerprintFeaturizer(*args, fp_type: str, dtype: Any = numpy.float32, n_jobs: int = -1)

Bases: MolfeatFeaturizer

Fingerprint featurizer for molecules; relies on the molfeat backend.

Variables:
  • type (ClassVar[str]) – The type of the featurizer.

  • fp_type (str) – The type of fingerprint to use (e.g., ‘ecfp4’, ‘morgan’, ‘rdkit’, etc.).

  • dtype (Any) – The data type to use for the fingerprint (e.g., np.float32).

  • n_jobs (int) – The number of jobs to use for featurization, -1 for maximum parallelism.

dtype: Any
featurize(smiles: Iterable[str]) → tuple[numpy.ndarray, numpy.ndarray]

Featurize a list of SMILES strings.

Parameters:

smiles (Iterable[str]) – List or iterable of SMILES strings to featurize.

Returns:

Tuple of (features, indices). features is a 2D numpy array of shape (n_samples, n_features), and indices is a 1D numpy array of the indices of the successfully featurized molecules.

Return type:

tuple
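The return contract above can be handled as in the following minimal sketch. It assumes that indices holds zero-based positions into the input iterable and that features has one row per entry in indices; neither detail is spelled out on this page. StubFeaturizer, build_featurizer, and featurize_and_align are hypothetical names introduced for illustration only. The stub stands in for the real class so the sketch runs without openadmet installed.

```python
from typing import Any, Iterable

import numpy as np


def build_featurizer(fp_type: str = "ecfp4") -> Any:
    """Construct the real featurizer (needs openadmet and its molfeat backend)."""
    # Imported lazily so the rest of this sketch runs without the package.
    from openadmet.models.features.molfeat_fingerprint import FingerprintFeaturizer

    return FingerprintFeaturizer(fp_type=fp_type, dtype=np.float32, n_jobs=-1)


def featurize_and_align(featurizer: Any, smiles: Iterable[str]):
    """Return (features, kept_smiles, failed_smiles) for one featurize() call."""
    smiles = list(smiles)
    features, indices = featurizer.featurize(smiles)
    # Assumption: indices are zero-based positions into the input list,
    # and features has one row per entry in indices.
    indices = np.asarray(indices, dtype=int)
    kept = [smiles[i] for i in indices]
    succeeded = set(indices.tolist())
    failed = [s for i, s in enumerate(smiles) if i not in succeeded]
    return features, kept, failed


class StubFeaturizer:
    """Hypothetical stand-in that mimics the documented return contract."""

    def featurize(self, smiles):
        smiles = list(smiles)
        # Demo rule only: treat any string containing a space as unparseable.
        ok = [i for i, s in enumerate(smiles) if " " not in s]
        features = np.zeros((len(ok), 8), dtype=np.float32)
        return features, np.array(ok, dtype=int)


features, kept, failed = featurize_and_align(
    StubFeaturizer(), ["CCO", "not a smiles", "c1ccccc1"]
)
```

With openadmet installed, passing build_featurizer() in place of StubFeaturizer() should exercise the real backend. Keeping the kept and failed lists alongside features makes it easier to keep labels aligned with feature rows when some molecules are dropped.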

fp_type: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:
  • self – The BaseModel instance.

  • context – The context.

n_jobs: int
type: ClassVar[str] = 'FingerprintFeaturizer'