Prediction of hERG Liability - Using SVM Classification, Bootstrapping and Jackknifing

Mol Inform. 2017 Apr;36(4):10.1002/minf.201600126. doi: 10.1002/minf.201600126. Epub 2016 Dec 21.

Abstract

Drug-induced QT prolongation leads to life-threatening cardiotoxicity, mostly through blockage of the potassium ion (K+) channels encoded by the human ether-à-go-go-related gene (hERG). The hERG channel is one of the most important antitargets to be addressed in the early stages of the drug discovery process, in order to avoid more costly failures in the development phase. Using a thallium flux assay, 4,323 molecules were screened for hERG channel inhibition in a quantitative high-throughput screening (qHTS) format. Here, we present support vector classification (SVC) models of hERG channel inhibition with an averaged area under the receiver operating characteristic curve (AUC-ROC) of 0.93 for the tested compounds. Both jackknifing and bootstrapping were employed to rebalance the heavily imbalanced training datasets, and the impact of these two under-sampling rebalancing methods on the performance of the predictive models is discussed. Our results indicate that the rebalancing techniques did not enhance the predictive power of the resulting models; instead, adopting optimal cutoffs could restore the desirable balance of sensitivity and specificity of the binary classifiers. In an external validation set of 66 drug molecules, the SVC model exhibited an AUC-ROC of 0.86, further demonstrating the utility of this modeling approach for predicting hERG liabilities.
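The point about optimal cutoffs can be illustrated with a minimal sketch. The abstract does not say how the cutoffs were chosen, so the criterion used here (maximizing Youden's J, i.e. sensitivity + specificity - 1) is an assumption, as are the function names and the toy labels and scores, which are invented for illustration and are not data from the study. The sketch shows that on an imbalanced set a classifier can have a high AUC-ROC and still miss every active at the default 0.5 cutoff, and that moving the cutoff restores the balance without retraining.

```python
# Illustrative sketch only: toy data and the Youden's J criterion are
# assumptions, not the method or data reported in the abstract.

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(labels, scores, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called active."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(labels, scores):
    """Observed score that maximizes Youden's J = sensitivity + specificity - 1."""
    def youden_j(cutoff):
        sensitivity, specificity = sens_spec(labels, scores, cutoff)
        return sensitivity + specificity - 1
    return max(sorted(set(scores)), key=youden_j)


# Hypothetical imbalanced set: 2 actives (label 1) and 8 inactives (label 0).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.45, 0.30, 0.35, 0.20, 0.15, 0.10, 0.10, 0.05, 0.05, 0.02]

auc = roc_auc(labels, scores)
default_balance = sens_spec(labels, scores, 0.5)
cutoff = best_cutoff(labels, scores)
tuned_balance = sens_spec(labels, scores, cutoff)

print("AUC-ROC:", auc)
print("sensitivity, specificity at 0.5:", default_balance)
print("chosen cutoff:", cutoff)
print("sensitivity, specificity at chosen cutoff:", tuned_balance)
```

The ranking of the compounds, and therefore the AUC-ROC, is unchanged by the choice of cutoff; only the trade-off between sensitivity and specificity moves. In practice the cutoff would be chosen on training or validation data rather than on the set used to report performance.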

Keywords: ROC; bootstrap; hERG; jackknife; rebalance; support vector classification.

MeSH terms

  • Animals
  • Area Under Curve
  • Cell Line
  • Ether-A-Go-Go Potassium Channels / antagonists & inhibitors
  • Ether-A-Go-Go Potassium Channels / genetics
  • Ether-A-Go-Go Potassium Channels / metabolism*
  • High-Throughput Screening Assays
  • Humans
  • Models, Molecular
  • Patch-Clamp Techniques
  • Pharmaceutical Preparations / chemistry
  • Pharmaceutical Preparations / metabolism
  • Protein Structure, Tertiary
  • Quantitative Structure-Activity Relationship
  • ROC Curve
  • Support Vector Machine

Substances

  • Ether-A-Go-Go Potassium Channels
  • Pharmaceutical Preparations