Combining biological and computational approaches
Award Winners Series: Combining biological and computational approaches
Feb 21, 2019 10:00 AM EST
Developing Mechanism-Based Animal Toxicity Models: A Chemocentric Approach Using Big Data
D. P. Russo1, J. Strickland2, A. Karmaus2, W. Wang1, S. Shende1,3, T. Hartung4,5, L. M. Aleksunes6, H. Zhu1,7
High-throughput in vitro bioassays show potential as alternatives to animal models for toxicity testing. Big data resources, e.g., PubChem, are updated continuously with new in vitro bioassay data, resulting in a massive amount of ever-changing public data. However, incorporating in vitro bioassays from these resources into chemical toxicity evaluations requires significant data curation and analysis based on knowledge of relevant toxicity mechanisms. In this work, we aimed to develop a computational method to automatically extract useful bioassay data from PubChem and to assess its ability to predict animal toxicity using read-across. To achieve this, a database containing 7,385 compounds with diverse rat acute oral toxicity data was searched against PubChem to establish in vitro bioprofiles. Using a novel subspace clustering algorithm, bioassays were grouped together based upon chemical substructures identified as significant to bioassay activity. Several bioassay groups showed high predictivity for animal acute oral toxicity using read-across through a cross-validation process (positive prediction rates ranged from 62% to 100%). The predictivity of these models was further validated using a set of over 600 new compounds. When individual clusters were incorporated into a consensus model, chemical toxicants in the validation set were prioritized (positive prediction rate of 76%). In addition to high predictivity, chemical fragment – in vitro – in vivo relationships can be highlighted in bioassay clusters to illustrate new animal toxicity mechanisms. This data-driven profiling strategy meets the urgent needs of computational toxicology in the big data era and can be extended to develop predictive models for other complex toxicity endpoints.
1Center for Computational and Integrative Biology, Rutgers University, 2Integrated Laboratory Systems, Inc., 3Department of Computer Science, Rutgers University, 4Johns Hopkins Bloomberg School of Public Health, Center for Alternatives to Animal Testing (CAAT), 5University of Konstanz, CAAT-Europe, 6Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, 7Department of Chemistry, Rutgers University
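As a rough illustration of the read-across idea in the abstract above, the sketch below predicts the toxicity of a query compound from its most similar neighbors and then scores a set of predictions. It is not the authors' method: the Tanimoto similarity on feature sets, the `k` and `threshold` parameters, the fragment names, and the reading of "positive prediction rate" as TP / (TP + FP) are all assumptions made for this example.

```python
# Minimal read-across sketch (illustrative assumptions, not the published pipeline).
# Each compound is represented by a set of hypothetical feature labels
# (e.g. chemical fragments or active bioassays) and a binary toxicity label.

def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def read_across(query: set, training: list, k: int = 3, threshold: float = 0.5) -> bool:
    """Predict toxic (True) if the mean label of the k most similar
    training compounds is at least `threshold`."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    neighbors = ranked[:k]
    score = sum(label for _, label in neighbors) / len(neighbors)
    return score >= threshold

def positive_prediction_rate(predictions: list, truths: list) -> float:
    """Assumed definition: true positives / all positive predictions."""
    tp = sum(1 for p, t in zip(predictions, truths) if p and t)
    fp = sum(1 for p, t in zip(predictions, truths) if p and not t)
    return tp / (tp + fp) if (tp + fp) else float("nan")

# Toy training set with made-up fragment names; 1 = toxic, 0 = non-toxic.
training = [
    ({"frag_a", "frag_b"}, 1),
    ({"frag_a", "frag_c"}, 1),
    ({"frag_d"}, 0),
    ({"frag_d", "frag_e"}, 0),
]

if __name__ == "__main__":
    print(read_across({"frag_a", "frag_b", "frag_c"}, training))
    print(read_across({"frag_d"}, training))
```

In a cross-validation setting, each compound would be held out in turn, predicted from the remaining compounds, and the resulting predictions passed to `positive_prediction_rate`.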
Integrating Genomics and Epigenomics into Predictive Toxicology of the Aryl Hydrocarbon Receptor
*Departments of Biomedical Engineering and Pharmacology & Toxicology, Center for Research on Ingredient Safety, Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
In the canonical model of gene regulation by the ligand-inducible transcription factor aryl hydrocarbon receptor (AHR), the AHR forms a heterodimer with the related nuclear protein aryl hydrocarbon receptor nuclear translocator (ARNT) and binds cognate recognition sites (AHR response elements, AHREs), which harbor the 5-bp core motif 5’-GCGTG-3’, in the promoter regions of target genes. However, the vast majority of 5’-GCGTG-3’ sequences in the genome do not bind the AHR upon ligand activation. Moreover, only a minority of genes induced by 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) show AHR binding to AHREs in their proximal promoters. What, then, are the determinants of AHR binding genome-wide, and what are the mechanisms by which the AHR regulates target genes?
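To make the core-motif search concrete, here is a minimal sketch that scans a DNA sequence for 5’-GCGTG-3’ on both strands and returns each hit with its flanking bases. The function names, the `flank` parameter, and the example sequence are hypothetical and chosen only for illustration.

```python
# Sketch: locate the AHRE core motif on both strands of a DNA sequence.
# The example sequence and the flank width are made up for illustration.

CORE = "GCGTG"
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))

def find_core_motifs(seq: str, flank: int = 7) -> list:
    """Return sorted (start, strand, window) tuples for every core-motif hit.

    `start` is the 0-based plus-strand position of the match. `window` is the
    match plus up to `flank` bases on each side, always reported in
    plus-strand orientation (also for minus-strand hits).
    """
    seq = seq.upper()
    hits = []
    for strand, motif in (("+", CORE), ("-", reverse_complement(CORE))):
        start = seq.find(motif)
        while start != -1:
            lo = max(0, start - flank)
            hi = min(len(seq), start + len(motif) + flank)
            hits.append((start, strand, seq[lo:hi]))
            start = seq.find(motif, start + 1)  # allow overlapping matches
    return sorted(hits)

if __name__ == "__main__":
    for hit in find_core_motifs("AAGCGTGTTCACGCAA", flank=2):
        print(hit)
```

A scan like this finds every occurrence of the core motif, which is exactly why it cannot by itself separate bound from unbound sites.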
We have used a combination of published ChIP-Seq data, bioinformatic analysis, and machine learning models to identify sequences flanking the 5-bp core motif that distinguish bound from unbound AHREs. In addition, we found considerable genome-wide overlap between AHR binding and binding by the CCCTC-binding factor (CTCF) and the cohesin protein complex, suggesting a significant role for DNA looping and long-range chromatin interactions in AHR-mediated gene regulation. We present a predictive model of proximal and distal gene regulation by the AHR that can be validated by a combination of long-range chromatin interaction assays and targeted epigenome editing (CRISPR interference).
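One simple way to quantify the kind of binding overlap described above is the fraction of peaks in one set that intersect at least one peak in another. The sketch below assumes peaks are given as half-open `(chrom, start, end)` intervals, as in BED-style files; the coordinates are invented, and the abstract does not state how overlap was actually computed. Real analyses would typically rely on a dedicated interval tool rather than this quadratic loop.

```python
# Sketch: fraction of query peaks overlapping at least one reference peak.
# Intervals are half-open (chrom, start, end); all coordinates are made up.
from collections import defaultdict

def fraction_overlapping(query_peaks: list, reference_peaks: list) -> float:
    """Return the share of query peaks that overlap any reference peak
    on the same chromosome by at least one base."""
    if not query_peaks:
        return 0.0
    by_chrom = defaultdict(list)
    for chrom, start, end in reference_peaks:
        by_chrom[chrom].append((start, end))
    n_hit = 0
    for chrom, start, end in query_peaks:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, ())):
            n_hit += 1
    return n_hit / len(query_peaks)

# Hypothetical AHR and CTCF peak lists.
ahr_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
ctcf_peaks = [("chr1", 150, 250), ("chr2", 300, 400)]

if __name__ == "__main__":
    print(fraction_overlapping(ahr_peaks, ctcf_peaks))
```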
This work provides a general framework for mapping gene regulatory networks mediated by nuclear receptors and other ligand-activated transcription factors, and for developing more mechanistic predictive models of the toxicity associated with activation of these factors.