Example Tutorial Notebooks

We recommend starting with our interactive tutorials hosted as a web-book. These tutorials introduce the fundamentals of using OpenADMET Models and provide hands-on, end-to-end examples. The corresponding source code and materials are available on GitHub.

Overview

In these tutorials, we will walk through a typical ADMET modeling workflow — from data curation to model training, comparison, and inference.

As a case study, we focus on Cytochrome P450 (CYP450) inhibition, specifically CYP3A4, the most abundant hepatic isoform and the enzyme responsible for metabolizing nearly 50% of marketed drugs. CYP3A4 inhibition is a major driver of drug–drug interactions (DDIs) and is therefore a critical endpoint in early-stage drug discovery.

What is CYP3A4 inhibition and how is it measured?

CYP3A4 inhibition occurs when a compound decreases the enzymatic activity of CYP3A4, slowing or blocking substrate metabolism. This can occur through:

  • Reversible inhibition: The inhibitor binds transiently, and normal activity resumes upon dissociation.

  • Irreversible inhibition: The inhibitor permanently inactivates the enzyme, requiring new CYP3A4 synthesis to restore activity.

Inhibition is typically measured in vitro using enzyme assays with probe substrates. The most common metric is the \(IC_{50}\) — the concentration of inhibitor required to reduce enzyme activity by 50%. Lower \(IC_{50}\) values indicate stronger inhibition.
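
The relationship between inhibitor concentration and remaining enzyme activity can be sketched with a simple Hill-type model. This is an illustrative assumption, not the analysis used by any particular assay or by the tutorials; the function names and the example value of 250 nM are hypothetical. The sketch also shows the common conversion from \(IC_{50}\) to \(pIC_{50} = -\log_{10}(IC_{50}\ \text{in mol/L})\), under which stronger inhibitors have higher values.

```python
import math


def fraction_activity(conc, ic50, hill=1.0):
    """Remaining enzyme activity (0 to 1) under a simple Hill-type model.

    conc and ic50 must share the same units. The Hill-type form is an
    assumption made for illustration only.
    """
    return 1.0 / (1.0 + (conc / ic50) ** hill)


def pic50_from_nm(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 (1 nM = 1e-9 mol/L)."""
    return -math.log10(ic50_nm * 1e-9)


ic50_nm = 250.0  # hypothetical inhibitor, not a measured value

# At a concentration equal to the IC50, half of the activity remains.
print(fraction_activity(250.0, ic50_nm))

# A lower IC50 (stronger inhibition) gives a higher pIC50.
print(pic50_from_nm(ic50_nm))
print(pic50_from_nm(ic50_nm / 10))
```

Working on the \(pIC_{50}\) scale is a common modeling choice because it compresses values that span several orders of magnitude into a roughly linear range.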

Tutorial Structure

All tutorial notebooks are available in the demos/ directory of the GitHub repository. Each notebook demonstrates a key step in building and deploying a CYP3A4 inhibition model using OpenADMET tooling.

  1. Curating CYP3A4 data from ChEMBL: Retrieve relevant CYP3A4 inhibition data from public sources (e.g., ChEMBL) and perform essential cleaning and preprocessing to prepare data for modeling.

  2. Training CYP3A4 models with Anvil: Train machine learning models using Anvil, OpenADMET’s YAML-based infrastructure for scalable and reproducible model development.

  3. Comparing trained models: Evaluate and compare model performance across metrics, and generate standardized reports. Learn how to bring your own (BYO) models into the comparison.

  4. Training a CYP3A4 model ensemble: Use the best-performing models to create an ensemble. Ensembles provide uncertainty estimates, helping contextualize predictions and improve decision-making.

  5. Running model ensemble inference: Apply trained ensembles to predict CYP3A4 inhibition on unseen datasets, such as lead series or screening compounds. This notebook also introduces active learning workflows for prioritizing compounds for testing.

Curating CYP3A4 Inhibition Data from ChEMBL

Learn how to retrieve and preprocess CYP3A4 inhibition data from ChEMBL for downstream modeling.
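
As a rough idea of what cleaning and preprocessing can involve, the sketch below filters activity records and aggregates replicate measurements per compound. It is a minimal illustration under stated assumptions, not the tutorial's actual pipeline: the records are made up, the `row` and `curate` helpers are hypothetical, and only the field names mirror those of ChEMBL activity records.

```python
import math
from collections import defaultdict
from statistics import median


def row(mol_id, smiles, value, relation="=", kind="IC50", units="nM"):
    """Build a record using ChEMBL-style field names (values are made up)."""
    return {
        "molecule_chembl_id": mol_id,
        "canonical_smiles": smiles,
        "standard_type": kind,
        "standard_relation": relation,
        "standard_value": value,
        "standard_units": units,
    }


# Hypothetical records, not real ChEMBL data.
records = [
    row("MOL_A", "CCO", "1000"),
    row("MOL_A", "CCO", "10000"),             # replicate measurement
    row("MOL_B", "CCN", "5000", relation=">"),  # censored value, dropped
    row("MOL_C", "CCC", "200", kind="Ki"),      # different endpoint, dropped
    row("MOL_D", "CCCl", None),                 # missing value, dropped
]


def curate(rows):
    """Keep exact IC50 values in nM and take the median pIC50 per compound."""
    by_mol = defaultdict(list)
    for r in rows:
        if r["standard_type"] != "IC50" or r["standard_units"] != "nM":
            continue
        if r["standard_relation"] != "=" or r["standard_value"] is None:
            continue
        value_nm = float(r["standard_value"])
        if value_nm <= 0:
            continue
        # pIC50 = 9 - log10(IC50 in nM)
        key = (r["molecule_chembl_id"], r["canonical_smiles"])
        by_mol[key].append(9.0 - math.log10(value_nm))
    return {
        mol_id: {"smiles": smi, "pIC50": median(vals), "n": len(vals)}
        for (mol_id, smi), vals in by_mol.items()
    }


curated = curate(records)
print(curated)
```

Whether to drop or keep censored values (relations such as ">") is a design decision that depends on the modeling goal; this sketch simply discards them.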

https://demos.openadmet.org/en/latest/demos/01_Data_Curation/01_Curate_ChEMBL_Data.html

Training Models

Follow a step-by-step guide to train machine learning models using OpenADMET and Anvil.

https://demos.openadmet.org/en/latest/demos/02_Model_Training/02_Training_Models.html

Comparing Models

Learn how to compare model performance, visualize results, and benchmark across approaches.
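
To make the idea of comparing models across metrics concrete, here is a small sketch that scores two sets of predictions with MAE, RMSE, and \(R^2\), then ranks them. The model names and numbers are hypothetical, and the helper functions are written from scratch for illustration; they do not represent the reporting tools used in the tutorial.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination; can be negative for poor models."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def compare(y_true, predictions):
    """Return (name, metrics) pairs sorted from lowest to highest RMSE."""
    table = {
        name: {"mae": mae(y_true, p), "rmse": rmse(y_true, p), "r2": r2(y_true, p)}
        for name, p in predictions.items()
    }
    return sorted(table.items(), key=lambda item: item[1]["rmse"])


# Hypothetical held-out pIC50 values and predictions from two models.
y_true = [5.0, 6.0, 7.0, 8.0]
predictions = {
    "model_a": [5.1, 5.9, 7.2, 7.8],
    "model_b": [6.0, 6.0, 6.0, 6.0],  # constant predictor as a weak baseline
}

for name, metrics in compare(y_true, predictions):
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Including a trivial baseline such as the constant predictor helps show whether a trained model has learned anything beyond the average of the data.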

https://demos.openadmet.org/en/latest/demos/03_Model_Comparison/03_Comparing_Models.html

Ensemble Model Training

Build ensemble models and integrate active learning strategies to quantify uncertainty and improve prediction robustness.
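
One common way an ensemble yields an uncertainty estimate is to treat the spread of its members' predictions as a measure of disagreement. The sketch below assumes that approach; the `ensemble_summary` helper and the numbers are hypothetical and are not taken from the OpenADMET implementation.

```python
from statistics import mean, stdev


def ensemble_summary(member_predictions):
    """Summarize an ensemble as (mean, sample standard deviation) per compound.

    member_predictions holds one list per ensemble member, with all lists
    aligned so that position i always refers to the same compound.
    At least two members are required to compute a standard deviation.
    """
    per_compound = zip(*member_predictions)
    return [(mean(values), stdev(values)) for values in per_compound]


# Hypothetical pIC50 predictions from three members for two compounds.
member_predictions = [
    [6.0, 5.0],
    [6.2, 7.0],
    [5.8, 6.0],
]

for i, (mu, sigma) in enumerate(ensemble_summary(member_predictions)):
    print(f"compound {i}: mean={mu:.2f}, std={sigma:.2f}")
```

Both compounds have the same mean prediction here, but the members agree far more closely on the first one, which is the kind of context a single model cannot provide.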

https://demos.openadmet.org/en/latest/demos/04_Ensemble_Model_Training/04_Ensemble_Model_Training_Active_Learning.html

Model Inference

Run ensemble inference on new datasets and interpret CYP3A4 inhibition predictions.
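
When predictions come with uncertainty estimates, one simple way to prioritize compounds for testing is uncertainty sampling: measure the compounds the ensemble is least sure about first. The sketch below assumes that strategy purely for illustration; the tutorial may use a different acquisition function, and the compound identifiers, values, and `select_for_testing` helper are hypothetical.

```python
def select_for_testing(predictions, k):
    """Return the k compound IDs with the largest predicted uncertainty.

    predictions maps a compound ID to a (mean_pIC50, std) pair.
    """
    ranked = sorted(predictions.items(), key=lambda item: item[1][1], reverse=True)
    return [compound_id for compound_id, _ in ranked[:k]]


# Hypothetical ensemble output for three unseen compounds.
predictions = {
    "cmpd_1": (6.1, 0.05),
    "cmpd_2": (5.4, 0.80),
    "cmpd_3": (7.0, 0.30),
}

print(select_for_testing(predictions, k=2))
```

Other strategies weigh the predicted value together with the uncertainty, which can be preferable when the goal is to find compounds with a specific inhibition profile rather than to improve the model.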

https://demos.openadmet.org/en/latest/demos/05_Ensemble_Model_Inference/05_Model_Ensemble_Inference.html

Showcase Notebook

Explore a comprehensive, end-to-end showcase notebook hosted on Google Colab — ideal for a quick overview of OpenADMET Models in action.

https://try.openadmet.org