This paper was originally published by Andreas H. Göller, Lara Kuhnke, Floriane Montanari, Anne Bonin, Sebastian Schneckener, Antonius ter Laak, Jörg Wichard, Mario Lobell, and Alexander Hillisch in Drug Discovery Today (Vol 25, Iss 9) under a Creative Commons license.
Abstract
Over the past two decades, an in silico absorption, distribution, metabolism, and excretion (ADMET) platform has been created at Bayer Pharma with the goal to generate models for a variety of pharmacokinetic and physicochemical endpoints in early drug discovery. These tools are accessible to all scientists within the company and can be useful in assisting with the selection and design of novel leads, as well as the process of lead optimization. Here, we discuss the development of machine-learning (ML) approaches with special emphasis on data, descriptors, and algorithms. We show that high company-internal data quality and tailored descriptors, as well as a thorough understanding of the experimental endpoints, are essential to the utility of our models. We discuss the recent impact of deep neural networks and show selected application examples.
Introduction
ADMET are crucial parameters for the discovery and optimization of new drugs. This has long been recognized, and pharmaceutical companies have invested heavily to develop new assays and to ramp up their testing capacity, enabling them to characterize thousands of compounds in high-quality in vitro ADMET assays. The stored structure–activity/structure–property relationship (SAR/SPR) data can be a treasure and have the potential to impact research beyond the specific project for which these measurements were conducted. Computational groups have since been using those data to generate an understanding for the principles underlying certain ADMET endpoints and to develop in silico models that can serve as an additional tool to assist researchers in their quest for new compounds 1, 2, 3, 4, 5. The primary goal of those models is not to reduce the overall number of in vitro or in vivo ADMET experiments, but to allow scientists to better focus their experiments on the most promising compounds.
In this review, we discuss the development of in silico ADMET approaches generated at Bayer over the past 20 years. There are two conceptually different approaches to in silico ADMET. The first is a protein structure-based approach, in which the interaction of compounds with defined proteins, important for ADMET properties, is modeled and used to design better compounds. It requires ADMET effects clearly associated with single ADMET-relevant proteins (e.g., cytochrome P450 enzymes, PXR, hERG, PgP, or HSA) and high-resolution X-ray structures of those proteins. The application and utility of such approaches has been discussed elsewhere 6, 7. Here, we concentrate on the second conceptual approach, in which measured in vitro/in vivo data of many compounds are used to build models using ML (Fig. 1). We summarize the experience at Bayer over 20 years in ML by pointing out our ‘working-horse’ approaches and we discuss the latest developments in tailored descriptors and algorithms, such as deep neural networks. Finally, we provide application examples and specialized approaches.
Data
Although algorithms and descriptors receive more attention in academic research (where, because of the lack of real-world data, the same standard benchmark sets are used repeatedly), it is the robustness of the underlying data set that defines model performance. Therefore, the usefulness of our in silico ADMET models depends heavily on the quality of the internal, proprietary Bayer data sets.
When we started with ML in 2001, data were sparse, sometimes only a few hundred values per endpoint. With the merger of Bayer and Schering in 2007, we not only had to harmonize data, but also made a leap in structural diversity and numbers [8]. Our pharmacokinetic and physicochemistry assays at Bayer now typically produce thousands of homogeneous new data points per year (compared with the compiled heterogeneous public sets), which is still extremely low compared with our compound collection 8, 9 and even more so compared with the drug-like chemical universe 10, 11. Moreover, the data produced have multiple issues and biases regarding chemical structures and assay values when used for ML. For example, compound purity is typically <100% and the crystalline state is often not well defined. Lipophilic compounds sometimes stick to the test apparatus, resulting in false substance concentrations. Working on congeneric series restricts structural diversity.
In addition, experiments are sometimes performed without reconfirmation. Errors in IC50s [12] can occur because of assay miniaturization, read-outs (e.g., missing chromophores) or automatic curve-fitting procedures. Also, assays have a certain experimental window. Minima are defined by the analytical detection method, maxima often by the solubility of the compound or the typical safety factor to pharmaceutical target concentration. Finally, there can be an outcome bias in terms of which compounds are selected for further testing by the scientists working in drug discovery projects.
Before data are amenable for modeling, some data preparation steps have to be performed. For chemical structures, there is a generally applied procedure [13]. Salts are stripped, charges and tautomer state are standardized, and stereochemistry typically flattened.
Assay data typically need even more attention. Uncertain data have to be removed. Censored data (outside the experimental window) [14] might add noise; they can be kept for classifiers in unambiguous cases but have to be either removed or adjusted for regression models. In the case of multiple measurements, values have to be aggregated. Median values are better here than mean values, because they are less influenced by outliers. For some endpoints, we apply minimum values for assessing the highest known associated risk.
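As a minimal sketch of this aggregation step (the compound identifiers and values below are invented for illustration), the median of replicate measurements is barely affected by a single outlier, whereas the mean would be dragged far from the consensus:

```python
from collections import defaultdict
from statistics import median

def aggregate_measurements(records, mode="median"):
    """Aggregate replicate assay values per compound.

    records: iterable of (compound_id, value) pairs.
    mode: 'median' (robust to outliers) or 'min' (worst-case risk
          assessment, as used for some safety endpoints).
    """
    by_compound = defaultdict(list)
    for cid, value in records:
        by_compound[cid].append(value)
    agg = median if mode == "median" else min
    return {cid: agg(vals) for cid, vals in by_compound.items()}

# one replicate set contains an outlier (81.8): the median stays at 6.2,
# while the mean of the same three values would be ~31.3
data = [("cmpd-1", 6.0), ("cmpd-1", 6.2), ("cmpd-1", 81.8),
        ("cmpd-2", 4.5)]
print(aggregate_measurements(data))  # {'cmpd-1': 6.2, 'cmpd-2': 4.5}
```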
In case the assay protocol changes, different scenarios can be applied. Depending on the assay, the data are merged in such a way that the values stay comparable, if possible. This always requires close collaboration with experimentalists to gain a deeper understanding of assay setup and interpretation of the results. This is more challenging if the data are used for regression rather than for classification models.
If it is not possible to merge the data in a meaningful way, there is the option to use both assays as different tasks in a multitask model. Only as worst case, we would neglect the old data coming from a previous assay setup and create a completely new model.
The importance of data curation and extension is nicely illustrated by two examples. In a collaboration with Simulations Plus, we provided ∼19 500 compounds with experimental pKa values and, by this, reduced the mean absolute error from 0.72 to 0.5 log units [15], resulting in one of the best-performing pKa models [16]. The rigorous curation of the Accelrys Metabolite Database, yielding 18 000 high-quality metabolic reactions, provided a quality increase in the Site-of-Metabolism (SoM) models [17] (see ‘Application examples’ section) compared with the former CypScore [18], which was based on only 2400 transformations.
Generally, the larger the data set, the more diverse the chemistry, the broader and more even the distribution of values, and the lower the experimental error, the better the model performs. Over the years, with the increasing size of our proprietary data set, we were able to turn models from classifiers into regression models (Fig. 2).
Descriptors
Molecules are dynamic multiconformational 3D entities comprising nuclei and electrons. Translation into machine-readable descriptors will result in information loss. Descriptors can be classified by increasing complexity into 1D constitutional descriptors, such as molecular weight, 2D or topological descriptors (e.g., fingerprints), and 3D descriptors, based on molecular fields or quantum chemistry from 3D structures [19].
Our work-horse descriptors since 2001, confirmed by many publications, are circular extended connectivity [20] fingerprints (ECFP), which encode properties of atoms and their neighbors into a bit vector of certain radius, feature type, and fold. The best setting for each task, depending on structure diversity and endpoint, always has to be empirically identified by trial.
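The circular-fingerprint idea can be illustrated with a deliberately simplified, pure-Python sketch: start from per-atom invariants, iteratively hash each atom's code together with its neighbours' codes, and fold every intermediate code into a fixed-length bit vector. Real ECFPs (e.g., as implemented in RDKit) use richer atom invariants and a different hashing scheme; this is only a toy illustration of the principle:

```python
def circular_fingerprint(atoms, bonds, radius=2, n_bits=64):
    """Toy ECFP-like circular fingerprint (illustrative only).

    atoms: list of initial atom invariants (here just element symbols).
    bonds: list of (i, j) atom-index pairs.
    Returns the sorted set bit positions of a folded bit vector.
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    codes = [hash(a) for a in atoms]  # radius-0 atom identifiers
    bits = set()
    for _ in range(radius):
        bits.update(c % n_bits for c in codes)
        # each new code combines an atom's code with its sorted
        # neighbour codes, growing the encoded environment by one bond
        codes = [hash((codes[i],
                       tuple(sorted(codes[j] for j in neighbours[i]))))
                 for i in range(len(atoms))]
    bits.update(c % n_bits for c in codes)
    return sorted(bits)

# ethanol as a heavy-atom graph: C-C-O
print(circular_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)]))
```

Note that Python's string hashing is salted per process, so the exact bit positions vary between runs; within one run, identical structures always map to identical fingerprints.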
Nevertheless, we asked ourselves: can we do better? Inspired by machine translation models [21], we developed a method to encode the SMILES of a molecule into a 512D continuous space, and a recurrent network to translate that embedding back into a canonical form of SMILES [22]. This type of network, depending only on chemical structure, can be trained on very large data sets of tens of millions of structures. The resulting molecular descriptor was useful for building quantitative structure–activity relationship (QSAR) models (especially in combination with support vector machines) and for virtual screening [23]. The presence of the decoder, which can map the embedding of a compound back to its chemical structure, is also significant in the sense that the chemical representation is now reversible. This means that navigating the embedding space in directions that improve the properties of interest can lead to the generation of new ideas for lead optimization projects, after decoding the optimized embedding [22].
Alternatively, a molecule can be seen as a graph, with atoms being the vertices and bonds the edges. This opens another way to generate machine-learned descriptors. Graph convolutional networks [24] are a specific neural network architecture that can learn vertex (and bond) features in an end-to-end fashion. The feature representation of each node is aggregated by summing or averaging with those of its neighbors using a so-called ‘adjacency matrix’. The aggregated nodes are then fed into neural networks applying affine transformation with learned weights followed by a nonlinear activation function. The effect is that the atom features are learned and the neighboring atoms in the graph can influence each other. This approach has been implemented specifically for chemistry applications 25, 26, 27. However, because the training is end-to-end (meaning the features are tailored to the problem at hand), large training sets are needed to avoid overfitting.
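A single graph-convolution step as described above (aggregate each node with its neighbours, apply an affine transformation with learned weights, then a nonlinear activation) can be sketched in a few lines; the shapes, the mean aggregation, and the example values are illustrative, not tied to any particular published architecture:

```python
def gcn_layer(adjacency, features, weights, bias):
    """One minimal graph-convolution layer (illustrative sketch).

    adjacency: n x n 0/1 matrix of bonds; self-contribution is added
               so each atom keeps its own features.
    features:  n x d_in atom-feature matrix.
    weights:   d_in x d_out learned weight matrix.
    bias:      length-d_out bias vector.
    Returns the n x d_out ReLU-activated output features.
    """
    n = len(features)
    out = []
    for i in range(n):
        # 1. aggregate: average the atom's features with its neighbours'
        group = [i] + [j for j in range(n) if adjacency[i][j]]
        agg = [sum(features[k][c] for k in group) / len(group)
               for c in range(len(features[0]))]
        # 2. affine transformation with learned weights, then ReLU
        row = [max(0.0, sum(agg[c] * weights[c][o]
                            for c in range(len(agg))) + bias[o])
               for o in range(len(weights[0]))]
        out.append(row)
    return out

# three-atom chain (e.g. C-C-O) with 2-dim input features, 2 output channels
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
X = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
W = [[1.0, -1.0], [1.0, 1.0]]
b = [0.0, 0.0]
print(gcn_layer(A, X, W, b))
```

Stacking several such layers lets information propagate across more bonds, which is how neighbouring atoms come to influence each other's learned features.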
Earlier, we discussed molecular descriptors (i.e., describing the molecule as a whole). By contrast, SoM modeling of the regioselectivity of reactions catalyzed by metabolic enzymes requires ML on individual atoms, not whole molecules. We recently developed anisotropic atomic reactivity descriptors from quantum mechanical atomic charges [28]. A SoM model was trained using ∼18 000 chemical transformations of 18 different enzymatic reactions, these descriptors, and random forests 17, 29. The model performs well, with a precision of 0.51 for a hard test set extracted from the most recent literature (see ‘Application examples’ section). Over the past year, the descriptor was further applied in an attempt to assess the risk of Ames mutagenicity of primary aromatic amines [30] and the strengths of hydrogen bonding 31, 32, 33.
Especially with regard to compounds beyond the rule-of-five 34, 35 chemical space, and the recent interest in macrocycles, it is believed that descriptors derived from 2D representations are not sufficient. Starting from 3D structures adds one level of complexity because of conformational ensembles [36], but enables researchers to derive descriptors for the fluctuation of, for example, the polar surface area [37] or (intramolecular) hydrogen bonding 17, 29, 33, 38. One such descriptor is the MDFP derived from molecular dynamics for modeling solvation-free energies and distribution coefficients [39]. Clever descriptions of 3D features of molecules would constitute one approach towards the improvement of in silico ADMET and other ML models.
Algorithms
The dependence between descriptors and endpoints is nonlinear, which also requires nonlinear algorithms. Our observation over the years [40], again supported by many publications, is that support vector machines [41] and random forests [42] are typically among the most useful algorithms. Partial least squares [43] sometimes yields models that are more stable over longer periods of time. Regression models are always preferred over classifiers and, therefore, new algorithms are constantly evaluated.
Over the past 5 years, there has been a steep increase in the use of deep neural networks in computational chemistry [44]. They are well suited for multitask learning and are promising because they can extract chemical features unbiased by the choice of particular fingerprints (see ‘Descriptors’ section). Intriguing examples are the modeling of cellular toxicity [45] and the Bayer model for rat oral bioavailability from chemical structures [46]. Whereas the classical approach of fingerprints plus random forest was identical in classification performance to deep learning, regression models for exposure were only achievable with a deep-learning approach.
Deep learning also makes multitask learning (i.e., simultaneously learning several related tasks in one model) very natural. We performed comparisons on in-house ADMET data sets and found that, in general, in line with earlier publications 47, 48, deep neural networks outperform random forest or similar methods by a small margin, whereas combined training of related endpoints allows tasks with fewer measurements to benefit from the jointly learned representations in a multitask setting [49]. Particularly for physicochemical properties, combining graph convolutional networks with multitask training led to a significant increase in performance for all modeled endpoints compared with the original methodology [24]. One of the endpoints that especially benefitted from this new multitask deep-learning network was solubility. Here, we were able to replace the former classification with a regression model for solubility (Fig. 2).
The need for novel methods that allow for good regression models is based on the requirement to feed the model outcomes into multiple parameter optimization processes, which does not work with classifiers. Numerical data also allow for more flexible decision making because each regression model can be turned into a classification model at any desired threshold.
For endpoints with few data points, or with imbalanced or censored data, for which it is challenging to build good regression models, we can still derive robust classifications. Other candidates for classification models are endpoints with a binary output, such as phospholipidosis (Fig. 2). Finally, traffic lights [50] are often the clearest and most user-friendly visualization, especially if they are validated with meaningful drug-like datasets.
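Turning a regression output into a classification, or a traffic light, is then just a matter of applying thresholds at the point of use; the cut-offs and the solubility framing below are hypothetical illustrations, not our actual decision limits:

```python
def traffic_light(value, green_above, red_below):
    """Map a regression-model output onto a traffic light.

    green_above / red_below are freely chosen, endpoint-specific
    cut-offs (placeholders here); anything in between is yellow.
    """
    if value >= green_above:
        return "green"
    if value < red_below:
        return "red"
    return "yellow"

# e.g. a predicted solubility in mg/L with hypothetical cut-offs
print(traffic_light(250.0, green_above=200.0, red_below=50.0))  # green
print(traffic_light(120.0, green_above=200.0, red_below=50.0))  # yellow
print(traffic_light(10.0, green_above=200.0, red_below=50.0))   # red
```

Because the numeric prediction is kept, the same model output can be re-thresholded later for a different project context without retraining.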
Quality assessment of models
Proper evaluation of models via nested cross-validation (CV) and on independent test sets is crucial to ensure robust modeling outside the chemical space used for training. Different metrics have to be applied for categorical or regression problems.
Common metrics for classification models are derived from the so-called ‘confusion matrix’, which provides numbers for true positives, true negatives, false positives, and false negatives and, as derivatives, overall accuracy, sensitivity, specificity, negative/positive precision values, and the Matthews Correlation Coefficient (MCC). Another popular metric is the area under the ROC curve (AUC), which summarizes the classification performance of the model over every possible class threshold based on the true-positive and false-positive rates. For regression models, common metrics are R2 (coefficient of determination), root mean square error (standard deviation of the residuals), and Spearman’s rho (nonparametric rank correlation coefficient).
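The confusion-matrix-derived metrics can be computed directly from paired label vectors (the labels below are invented for illustration):

```python
from math import sqrt

def classification_metrics(y_true, y_pred):
    """Confusion-matrix counts and the derived metrics named above.

    Labels are 1 (positive) and 0 (negative).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# tp=2, tn=2, fp=1, fn=1 for this toy example
print(classification_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1]))
```

The MCC is popular for imbalanced ADMET endpoints precisely because, unlike accuracy, it uses all four confusion-matrix cells.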
Besides choosing the right metrics, well-chosen statistical validation techniques are also crucial to find a suitable balance between the two extremes of overfitted and underfitted models [51]. We typically put 20% of the data aside as an external test set for assessing the quality of the final model. The other 80% is used as training data in a CV setup. CV with random split is inadequate for pharma-like congeneric chemical series. Chronological ‘time-dependent’ CV or ‘leave-cluster-out’ CV are more realistic estimators for the ability of a model to extrapolate into new, unforeseen chemical spaces. K-means clustering is our preferred method for the ‘leave-cluster-out’ validation because it is convenient to compute and leads to good results, but any clustering method can be used. Which type of CV shows the most realistic predictivity is data set dependent. Whereas our best logD (pH 7.4) model is relatively robust, with a leave-one-cluster-out CV R2 of 0.88 compared with 0.91 at random split [49], our all-time best DMSO solubility model showed an R2 of only 0.59 in leave-one-cluster-out CV compared with 0.68 at random split.
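The leave-cluster-out splitting itself can be sketched as follows, assuming cluster labels (e.g., from k-means on fingerprints) have already been computed upstream; the labels below are invented:

```python
def leave_cluster_out_splits(cluster_labels):
    """Yield (train_idx, test_idx) pairs, one per cluster.

    Each test fold is one whole cluster, forcing the model to
    extrapolate to chemistry it has not seen, unlike a random split
    that scatters members of a congeneric series across folds.
    """
    for cluster in sorted(set(cluster_labels)):
        test = [i for i, c in enumerate(cluster_labels) if c == cluster]
        train = [i for i, c in enumerate(cluster_labels) if c != cluster]
        yield train, test

# three chemical series labelled 0, 1, 2
labels = [0, 0, 1, 1, 1, 2]
for train, test in leave_cluster_out_splits(labels):
    print(train, "->", test)
```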
Our current minimum quality requirements for implementation are primarily based on leave-one-cluster-out CV. For classification, the MCC should be >0.4 and for regression models the Pearson R2 should be >0.3 and Spearman R2 >0.6.
Rodgers et al. [52] reported the importance of regular model updates to keep them useful for current compounds. The achievable improvement from regular retraining depends on the model building technique used and on the studied property.
We have already implemented a weekly automated data download and filtering process and an automated model retraining for endpoints such as microsomal and hepatocyte stability (Fig. 2) and are currently implementing it for further endpoints. It is vital here to include the time stamps of measured data to track retraining effects and check the quality of the automated models.
For instance, for metabolic stability, we observed that, over time, although global CV performance stays steady, retraining has a positive effect on modeling accuracy of upcoming chemical classes, even if a low number of molecules (<50) are being added to the training data. In our experience, the risk of frequent ‘flipping’ between classes, which might be annoying for the medicinal chemists, is rather low.
As mentioned earlier, the accuracy of property modeling for completely novel molecules has limitations because the drug-like [53] space is immense and our models are based on limited sets from our confined chemical space.
The test set-derived prediction error provides information on the average performance on this set, but not on the prediction reliability for individual future molecules. Therefore, many different so-called ‘applicability domain’ (AD) measures have been introduced in recent years, which can be grouped into two classes. Methods that apply distance measures on how well the future object is embedded in the training set, are termed ‘novelty detection’. Methods that quantify the distance to the decision boundary of the classifier are called ‘confidence estimation’. Whereas the former can be applied for any algorithm using, for example, cosine, Tanimoto, or Mahalanobis distance [54] to the full training set [55], the latter measures are algorithm dependent. Recently, it was shown that confidence measures are superior in general 56, 57. Alternative concepts, such as conformal prediction [58] or using one ML approach for estimation of the AD of another one [59], still have to prove their usefulness.
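A minimal novelty-detection measure of the first kind, using Tanimoto similarity on fingerprints represented as sets of on-bit indices (a sketch of the concept, not our production implementation):

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as bit-index sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def novelty_score(query_fp, training_fps):
    """Distance of a query compound to its nearest training neighbour.

    1.0 means the compound shares no bits with anything in the training
    set (far outside the applicability domain); 0.0 is an exact match.
    """
    return 1.0 - max(tanimoto(query_fp, fp) for fp in training_fps)

train = [{1, 4, 7, 9}, {2, 4, 8}]
print(novelty_score({1, 4, 7}, train))  # close to the first compound
print(novelty_score({3, 5, 6}, train))  # shares no bits -> 1.0
```

Because this only needs the training fingerprints, the same measure works for any underlying ML algorithm, which is exactly why novelty detection is algorithm independent while confidence estimation is not.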
Most of the Bayer models provide algorithm-inherent reliability estimates together with the actual values. In case of random forests, this is the percentage of tree votes or, in the case of SVM, distance from the hyperplane. Predictions below certain predefined thresholds (e.g., 0.6 for random forest) are not reported. Multitask models do not yet provide a reliability metric.
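The tree-vote reliability filter for a random forest classifier can be sketched as follows (the class labels and example votes are invented; 0.6 is the threshold mentioned above):

```python
def rf_prediction_with_confidence(tree_votes, threshold=0.6):
    """Report a random-forest prediction with a tree-vote confidence.

    tree_votes: per-tree class predictions for one compound.
    The fraction of trees voting for the majority class serves as the
    reliability estimate; predictions below the threshold are
    suppressed (returned as None) rather than reported.
    """
    majority = max(set(tree_votes), key=tree_votes.count)
    confidence = tree_votes.count(majority) / len(tree_votes)
    if confidence < threshold:
        return None, confidence  # too unreliable to report
    return majority, confidence

print(rf_prediction_with_confidence(["active"] * 8 + ["inactive"] * 2))
print(rf_prediction_with_confidence(["active"] * 5 + ["inactive"] * 5))
```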
Application examples
Our internal data informatics platform has become a useful tool for assisting the processes of lead selection, compound design, and synthesis planning. It allows all scientists at Bayer quick access to the newest models. A typical decision spreadsheet is shown in Fig. 3.
In the early hit-to-lead phase, hit lists are clustered and prioritized by Bayer expert teams using standardized reports that incorporate both experimental data and a full in silico profile, allowing documented data-driven decision taking. Our models are applied for in silico property characterization and help to assess virtual compounds during lead identification and optimization in a variety of drug discovery projects at Bayer (e.g., 60, 61, 62). For example, during the refinement of PTGES inhibitors, experimental and predicted Caco-2 data and calculated pKa values for a series of 30 compounds implicated a narrow balance between acidity and basicity, which could then be translated into good oral bioavailability in rats [62].
The in silico ADMET platform was also an integral part of the Next Generation Library Initiative (NGLI) [9], aiming to enhance the screening collection with 500 000 newly designed compounds, applying Pareto design to achieve favorable physicochemical and ADMET properties 50, 63. Figure 4a shows the distribution of oral PhysChem scores for the NGLI compounds compared with the historic Bayer high-throughput screening (HTS) library; the distribution is significantly left-shifted (favorable compound properties) for the NGLI compounds. The oral PhysChem score is a composite of five predicted physicochemical properties (solubility, topological polar surface area, molecular weight, lipophilicity, and flexibility). The score ranges from 0 to 10; the lower, the better.
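A hypothetical composite in the spirit of the oral PhysChem score might look like the sketch below: each of the five properties contributes 0 (good), 1, or 2 (poor) penalty points, so the total falls in the 0–10 range with lower being better. All cut-offs are placeholder assumptions, not the published Bayer thresholds:

```python
def oral_physchem_score(solubility_mg_l, tpsa, mol_weight, logd, rot_bonds):
    """Illustrative 0-10 composite score; lower is better.

    All numeric cut-offs below are hypothetical placeholders chosen
    only to demonstrate the scoring scheme.
    """
    def penalty(value, soft, hard, invert=False):
        # 0 on the good side of the soft limit, 2 beyond the hard limit
        if invert:  # higher is better (solubility)
            return 0 if value >= soft else (1 if value >= hard else 2)
        return 0 if value <= soft else (1 if value <= hard else 2)

    return (penalty(solubility_mg_l, 100, 10, invert=True)
            + penalty(tpsa, 120, 150)
            + penalty(mol_weight, 450, 550)
            + penalty(logd, 3.0, 4.5)
            + penalty(rot_bonds, 7, 10))

# a compact, soluble compound scores low (favorable)...
print(oral_physchem_score(250, 85, 380, 2.1, 5))
# ...a large, poorly soluble, lipophilic one scores high
print(oral_physchem_score(5, 160, 610, 5.2, 12))
```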
Over the years, we developed two methods for regioselectivity modeling of metabolic transformations, namely CypScore [18] and MetScore 17, 29, which have been applied in various projects in an attempt to reduce hepatic clearance. One example is a series of mineralocorticoid receptor antagonists (Fig. 4b) [64], for which an as yet unknown metabolic clearance route was hypothesized.
We have also been looking into protein structure-based ADMET design for several ADMET-relevant proteins [6], but successful examples are rare. Those off-target proteins are characterized by huge, highly flexible ligand-binding sites recognizing a spectrum of ligands. Typically, docking approaches are overburdened in those cases, when X-ray crystallography reveals novel protein conformations with each newly co-crystallized ligand. In an attempt to address this issue, we applied a combined strategy of co-crystallization of novel ligands with pregnane-X-receptor (PXR), overlaying/docking compounds and using the protein structure information to reduce protein–ligand contacts and to overcome persistent Cyp3A4 induction problems [60]. Figure 4c shows a high-affinity thrombin inhibitor [65] with a significant PXR-binding liability. Introduction of a phenolic OH group in a highly lipophilic region in the PXR ligand-binding site (between Trp299 and Phe288, orange surface, red arrow) led to a significant reduction in binding interactions.
Modeling is but one of several tools that can help guide the drug discovery process. The latter remains a challenging process still dominated by trial and error. Modeling has its limitations in terms of how much useful information it can provide, and the development of our modeling program over the past two decades has produced successes as well as failures. Even with our robust internal data set and advanced techniques, modeling ultimately does not eliminate the need for experimental data, especially when it comes to biological experiments.
Concluding remarks and outlook
Over the past two decades, in developing, applying, and experimenting with in silico ADMET models, we have learned that the successful application of those tools mainly depends on: (i) the model quality; (ii) model relevance for research processes; and (iii) easy access and interpretability of results. Data, algorithms, and descriptors all contribute to model quality. A large amount of homogeneous data and descriptors that are tailored to the underlying experimental endpoint are essential to achieve robust models. The automated generation of large numbers of models (combinations of data splits, descriptors, and ML algorithms) and selection of the most accurate ones is helpful. It is also the technical basis for regular model retraining to cover the most current chemical space and to ensure a positive impact on drug discovery projects. The alignment of in silico models with readily available in vitro/in vivo assays is important for the acceptance and usage of in silico approaches in drug discovery projects by many scientists from different fields. Finally, easy access to in silico models is achieved in software tools that allow compound properties to be modeled with a few clicks in seconds. Color coding and estimates of the applicability domain increase the interpretability of the output and ensure usage in drug discovery projects. Although modeling of physicochemical endpoints works reasonably well, the main optimization parameters for oral bioavailability, such as cellular permeation and metabolic clearance, and in vivo approaches still need significant improvements. Quantitative models for these endpoints and cytochrome inhibition would be desirable. The limited datasets of single pharmaceutical companies and those that are published might not be enough to reach that goal. Thus, novel approaches of privacy-preserving data sharing could be one solution to overcome the lack of data and further advance the field.
Additional improvements could come from better 3D-based descriptors of molecules taking intramolecular hydrogen bonds and tautomers into account. Current and future challenges are the proper embedding of in silico ADMET models in holistic artificial intelligence approaches (together with estimation of binding affinity and compound synthesizability) and superior solutions for applicability domain estimates.
Conflict of interest
All authors are employees of Bayer AG. A.B., S.S., A.tL., J.W., M.L. and A.H. own stock in Bayer AG.
Acknowledgments
We thank Michael Beck, Matthias Busemann, Djork-Arne Clevert, Ursula Ganzer, Mark Gnoth, Francois Guillou, Michael Grimm, Nikolaus Heinrich, Jörg Keldenich, Ursula Krenz, Dieter Lang, Philip Lienau, Klemens Lustig, Heinrich Meier, Stephan Menz, Britta Nisius, Martin Radke, Andreas Reichel, Karl-Heinz Schlemmer, Rolf Schönneis, Rudolf Schohe-Loop, Thomas Steger-Hartmann, Andreas Sutter, and many other colleagues from Medicinal Chemistry, Toxicology and DMPK for valuable contributions to the in silico ADMET platform. Special thanks go to Christian Paulitz-Erdmann and Gabriele Handke-Ergüden for critical reviewing of the manuscript.
References
- H. van de Waterbeemd, E. Gifford ADMET in silico modelling: towards prediction paradise? Nat. Rev. Drug Discov., 2 (2003), pp. 192-204
- M.P. Gleeson, et al. In-silico ADME models: a general assessment of their utility in drug discovery applications Curr. Top. Med. Chem., 11 (2011), pp. 358-381
- F. Cheng, et al. In silico ADMET prediction: recent advances, current challenges and future trends Curr. Top. Med. Chem., 13 (2013), pp. 1273-1289
- S. Alqahtani In silico ADME-Tox modeling: progress and prospects Expert Opin. Drug Metab. Toxicol., 13 (2017), pp. 1147-1158
- F. Lombardo, et al. In silico absorption, distribution, metabolism, excretion, and pharmacokinetics (ADME-PK): utility and best practices. An industry perspective from the International Consortium for Innovation through Quality in Pharmaceutical Development J. Med. Chem., 60 (2017), pp. 9097-9113
- F. Stoll, et al. Utility of protein structures in overcoming ADMET-related issues of drug-like compounds Drug Discov. Today, 16 (2011), pp. 530-538
- G. Moroy, et al. Toward in silico structure-based ADMET prediction in drug discovery Drug Discov. Today, 17 (2012), pp. 44-55
- J. Schamberger, et al. Rendezvous in chemical space? Comparing the small molecule compound libraries of Bayer and Schering Drug Discov. Today, 16 (2011), pp. 636-641
- M. Follmann, et al. An approach towards enhancement of a screening library: The Next Generation Library Initiative (NGLI) at Bayer – against all odds? Drug Discov. Today, 24 (2019), pp. 668-672
- R.S. Bohacek, et al. The art and practice of structure-based drug design: a molecular modeling perspective Med. Res. Rev., 16 (1996), pp. 3-50
- C.M. Dobson Chemical space and biology Nature, 432 (2004), pp. 824-828
- S.P. Brown, et al. Healthy skepticism: assessing realistic model performance Drug Discov. Today, 14 (2009), pp. 420-427
- D. Fourches, et al. Trust, but Verify II: a practical guide to chemogenomics data curation J. Chem. Inf. Model., 56 (2016), pp. 1243-1252
- J. Buckley, I. James Linear regression with censored data Biometrika, 66 (1979), pp. 429-436
- R. Fraczkiewicz, et al. Best of both worlds: combining pharma data and state of the art modeling technology to improve in silico pKa prediction J. Chem. Inf. Model., 55 (2015), pp. 389-397
- Drug Design Data Resource SAMPL6 (2018)
- A.R. Finkelmann, et al. MetScore: site of metabolism prediction beyond cytochrome P450 enzymes ChemMedChem, 13 (2018), pp. 2281-2289
- M. Hennemann, et al. CypScore: quantitative prediction of reactivity toward cytochromes P450 based on semiempirical molecular orbital theory ChemMedChem, 4 (2009), pp. 657-669
- R.C. Todeschini, V. Consonni Handbook of Molecular Descriptors Wiley-VCH (2000)
- D. Rogers, M. Hahn Extended-connectivity fingerprints J. Chem. Inf. Model., 50 (2010), pp. 742-754
- K. Cho, et al. On the properties of neural machine translation: encoder-decoder approaches arXiv, 2014 (2014) arXiv:1409.1259
- R. Winter, et al. Efficient multi-objective molecular optimization in a continuous latent space Chem. Sci., 10 (2019), pp. 8016-8024
- R. Winter, et al. Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations Chem. Sci., 10 (2019), pp. 1692-1701
- T.N. Kipf, M. Welling Semi-supervised classification with graph convolutional networks arXiv, 2016 (2016) arXiv:1609.02907
- D. Duvenaud, et al. Convolutional networks on graphs for learning molecular fingerprints arXiv, 2015 (2015) arXiv:1509.09292
- B.L. Ramsundar, et al. Deep Learning for the Life Sciences: Applying Deep Learning to Genomics, Microscopy, Drug Discovery, and More O’Reilly Media (2019)
- C.W. Coley, et al. A graph-convolutional neural network model for the prediction of chemical reactivity Chem. Sci., 10 (2019), pp. 370-377
- A.R. Finkelmann, et al. Robust molecular representations for modelling and design derived from atomic partial charges Chem. Commun. (Camb.), 52 (2016), pp. 681-684
- A.R. Finkelmann, et al. Site of metabolism prediction based on ab initio derived atom representations ChemMedChem, 12 (2017), pp. 606-612
- L. Kuhnke, et al. Mechanistic reactivity descriptors for the prediction of Ames mutagenicity of primary aromatic amines J. Chem. Inf. Model., 59 (2019), pp. 668-672
- C.A. Bauer, et al. Gaussian process regression models for the prediction of hydrogen bond acceptor strengths Mol. Inf., 38 (2018), p. 1800115
- C.A. Bauer, et al. Machine learning models for hydrogen bond donor and acceptor strengths using large and diverse training data generated by first-principles interaction free energies J. Cheminf., 11 (2019), p. 59
- C.A. Bauer How to model inter- and intramolecular hydrogen bond strengths with quantum chemistry J. Chem. Inf. Model., 59 (2019), pp. 3735-3743
- E. Valeur, et al. New modalities for challenging targets in drug discovery Angew. Chem. Int. Ed. Engl., 56 (2017), pp. 10294-10323
- M. Egbert, et al. Why some targets benefit from beyond rule of five drugs J. Med. Chem., 62 (2019), pp. 10005-10025
- A.T. Cavasin, et al. Reliable and performant identification of low-energy conformers in the gas phase and water J. Chem. Inf. Model., 58 (2018), pp. 1005-1020
- V. Poongavanam, et al. Conformational sampling of macrocyclic drugs in different environments: can we find the relevant conformations? ACS Omega, 3 (2018), pp. 11742-11757
- G. Caron, et al. Intramolecular hydrogen bonding: an opportunity for improved design in medicinal chemistry Med. Res. Rev., 39 (2019), pp. 1707-1729
- S. Riniker Molecular dynamics fingerprints (MDFP): machine learning from MD data to predict free-energy differences J. Chem. Inf. Model., 57 (2017), pp. 726-741
- A. Hillisch, et al. Computational chemistry in the pharmaceutical industry: from childhood to adolescence ChemMedChem, 10 (2015), pp. 1958-1962
- C. Cortes, V. Vapnik Support-vector networks Mach. Learn., 20 (1995), pp. 273-297
- L. Breiman Random forests Mach. Learn., 45 (2001), pp. 5-32
- S. Wold, et al. The collinearity problem in linear regression. the partial least squares (PLS) approach to generalized inverses SIAM J. Sci. Stat. Comput., 5 (1984), pp. 735-743
- G.B. Goh, et al. Deep learning for computational chemistry J. Comput. Chem., 38 (2017), pp. 1291-1307
- A. Mayr, G. Klambauer, T. Unterthiner, S. Hochreiter DeepTox: toxicity prediction using deep learning Front. Environ. Sci., 3 (2016), p. 80
- S. Schneckener, et al. Prediction of oral bioavailability in rats: transferring insights from in vitro correlations to (deep) machine learning models using in silico model outputs and chemical structure parameters J. Chem. Inf. Model., 59 (2019), pp. 4893-4905
- S. Kearnes, et al. Modeling industrial ADMET data with multitask networks arXiv, 2016 (2016) arXiv:1606.08793
- B. Ramsundar, et al. Massively multitask networks for drug discovery arXiv, 2015 (2015) arXiv:1502.02072
- F. Montanari, et al. Modeling physico-chemical ADMET endpoints with multitask graph convolutional networks Molecules, 25 (2019), p. 44
- M. Lobell, et al. In silico ADMET traffic lights as a tool for the prioritization of HTS hits ChemMedChem, 1 (2006), pp. 1229-1236
- OECD Guidance Document on Good In Vitro Method Practices (GIVIMP) OECD (2018)
- S.L. Rodgers, et al. Predictivity of simulated ADME AutoQSAR models over time Mol. Inf., 30 (2011), pp. 256-266
- C.A. Lipinski, et al. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings Adv. Drug Deliv. Rev., 23 (1997), pp. 3-25
- P.C. Mahalanobis On the generalised distance in statistics Proc. Nat. Inst. Sci. India, 2 (1936), pp. 49-55
- R.P. Sheridan, et al. Similarity to molecules in the training set is a good discriminator for prediction accuracy in QSAR J. Chem. Inf. Comput. Sci., 44 (2004), pp. 1912-1928
- T.I. Oprea, J. Gottfries Chemography: the art of navigating in chemical space J. Comb. Chem., 3 (2001), pp. 157-166
- W. Klingspohn, et al. Efficiency of different measures for defining the applicability domain of classification models J. Cheminf., 9 (2017), p. 44
- U. Norinder, et al. Introducing conformal prediction in predictive modeling. A transparent and flexible alternative to applicability domain determination J. Chem. Inf. Model., 54 (2014), pp. 1596-1603
- R.P. Sheridan Using random forest to model the domain applicability of another random forest model J. Chem. Inf. Model., 53 (2013), pp. 2837-2850
- S. Baurle, et al. Identification of a benzimidazolecarboxylic acid derivative (BAY 1316957) as a potent and selective human prostaglandin E2 receptor subtype 4 (hEP4-R) antagonist for the treatment of endometriosis J. Med. Chem., 62 (2019), pp. 2541-2563
- M. Koppitz, et al. Discovery and optimization of pyridyl-cycloalkyl-carboxylic acids as inhibitors of microsomal prostaglandin E synthase-1 for the treatment of endometriosis Bioorg. Med. Chem. Lett., 29 (2019), pp. 2700-2705
- S. Werner, et al. Discovery and characterization of the potent and selective P2X4 inhibitor N-[4-(3-chlorophenoxy)-3-sulfamoylphenyl]-2-phenylacetamide (BAY-1797) and structure-guided amelioration of its CYP3A4 induction profile J. Med. Chem., 62 (2019), pp. 11194-11217
- T. Wunberg, et al. Improving the hit-to-lead process: data-driven assessment of drug-like and lead-like screening hits Drug Discov. Today, 11 (2006), pp. 175-180
- P. Kolkhof, et al. Bayer Schering Pharma AG. Substituted 7-sulfanylmethyl, 7-sulfinylmethyl and 7-sulfonylmethyl indoles and use thereof. WO2009156072.
- S. Allerheiligen, et al. Bayer Pharma Aktiengesellschaft. Substituted benzoxazoles. WO2014195230.