Combining machine learning and pharmacophore-based interaction fingerprint for in silico screening

Tomohiro Sato; Teruki Honma; Shigeyuki Yokoyama

doi:10.1021/ci900382e

J Chem Inf Model. 2010 Jan;50(1):170-85. doi: 10.1021/ci900382e.

Authors

Tomohiro Sato¹, Teruki Honma, Shigeyuki Yokoyama

Affiliation

¹ Department of Biophysics and Biochemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan.

PMID: 20038188

Abstract

In this study, we developed a new pharmacophore-based interaction fingerprint (Pharm-IF) and examined its usefulness for in silico screening with machine learning techniques, namely support vector machine (SVM) and random forest (RF), in place of similarity-based ranking. Using the docking results for PKA, SRC, cathepsin K, carbonic anhydrase II, and HIV-1 protease, we compared the screening efficiencies of the Pharm-IF models with those of the GLIDE score and of the residue-based IF (PLIF) models. The combination of SVM and Pharm-IF achieved a higher enrichment factor at 10% (5.7 on average) than the GLIDE score (4.2) or PLIF (4.3). With respect to training set size, training on more than five crystal structures enabled the machine learning models to consistently achieve better efficiencies than the GLIDE score. We also used the docking poses of known active compounds, in addition to the crystal structures, as positive samples in the training sets. For SRC and cathepsin K, the enrichment factors at 10% of the RF models trained with the docking poses (6.5 and 6.3, respectively) were significantly higher than those of the models trained on the crystal structures alone (3.9 and 3.2, respectively).
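The enrichment factor at 10% reported above can be illustrated with a minimal sketch. It assumes the commonly used definition (the fraction of actives among the top-ranked 10% of compounds, divided by the fraction of actives in the whole screened set), which the abstract itself does not spell out. The function name `enrichment_factor`, the rounding rule for the cutoff, and the convention that a higher score means "more likely active" are assumptions made for this illustration, not details taken from the paper; scores where lower is better would need to be negated first.

```python
def enrichment_factor(scores, labels, fraction=0.10):
    """Enrichment factor of a ranked screen at the given top fraction.

    scores: one number per compound; higher is assumed to mean
            "more likely active" (an assumption of this sketch).
    labels: 1 for a known active, 0 for a decoy, in the same order.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")

    # Size of the top-ranked subset; the rounding rule is an assumption.
    n_top = max(1, round(fraction * n_total))

    # Rank compounds from best to worst score and count actives in the top subset.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])

    # Ratio of the hit rate in the top subset to the hit rate overall.
    return (hits_in_top / n_top) / (n_actives / n_total)


# Toy example: 20 compounds, 2 actives, scored 20 (best) down to 1 (worst).
toy_scores = list(range(20, 0, -1))
toy_labels = [1, 0, 0, 0, 0, 1] + [0] * 14  # actives ranked 1st and 6th
toy_ef = enrichment_factor(toy_scores, toy_labels)
```

Under this definition, random ranking gives a value near 1, and the maximum attainable value at 10% is 10, which puts the reported averages (5.7 versus 4.2 and 4.3) in context.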

MeSH terms

Algorithms
Artificial Intelligence*
Computational Biology*
Drug Evaluation, Preclinical / methods*
Enzyme Inhibitors / chemistry
Enzyme Inhibitors / metabolism
Enzyme Inhibitors / pharmacology
Humans
Ligands
Models, Molecular
Protein Binding
Protein Conformation
Structure-Activity Relationship

Substances

Enzyme Inhibitors
Ligands