PyAutoFEP: An Automated Free Energy Perturbation Workflow for GROMACS Integrating Enhanced Sampling Methods

Luan Carvalho Martins; Elio A Cino; Rafaela Salgado Ferreira

doi:10.1021/acs.jctc.1c00194

J Chem Theory Comput. 2021 Jul 13;17(7):4262-4273. doi: 10.1021/acs.jctc.1c00194. Epub 2021 Jun 18.

Authors

Luan Carvalho Martins¹, Elio A Cino², Rafaela Salgado Ferreira²

Affiliations

¹ Graduate Program in Bioinformatics. Institute for Biological Sciences, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil.
² Biochemistry and Immunology Department, Institute for Biological Sciences, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil.

PMID: 34142828
DOI: 10.1021/acs.jctc.1c00194

Abstract

Free energy perturbation (FEP) calculations are now routinely used in drug discovery to estimate the relative free energy of binding (RFEB) of small molecules to a biomolecular target of interest. Enhanced sampling can improve the correlation between predictions and experimental data, especially in systems that undergo conformational changes. Because drug discovery campaigns require a large number of perturbations, the manual setup of FEP calculations is no longer viable. Here, we introduce PyAutoFEP, a flexible and open-source tool to aid the setup of RFEB FEP calculations. PyAutoFEP is written in Python 3 and automates the generation of perturbation maps, the construction of dual topologies, system building, molecular dynamics (MD) runs, and analysis. PyAutoFEP supports multiple force fields, incorporates the replica exchange with solute tempering (REST) and replica exchange with solute scaling (REST2) enhanced sampling methods, and allows flexible λ values along perturbation windows. To validate PyAutoFEP, we applied it to a set of 14 farnesoid X receptor ligands, a system included in the Drug Design Data Resource (D3R) Grand Challenge 2. On average, 88% of the predictions had the correct sign, and 75% of the predictions had an error below 1.5 kcal/mol. Results using Amber03/GAFF, CHARMM36m/CGenFF, and OPLS-AA/M/LigParGen had Pearson's r values of 0.71 ± 0.13, 0.30 ± 0.27, and 0.66 ± 0.20, respectively. The Amber03/GAFF and OPLS-AA/M/LigParGen results were on par with the top Grand Challenge 2 submissions. Applying REST2 improved the results using CHARMM36m/CGenFF (Pearson's r = 0.43 ± 0.21) but had little impact on the other force fields. The CHARMM36-YF and CHARMM36-WYF modifications did not yield improved predictions compared to CHARMM36m. Finally, we estimated the probability of finding a molecule 1 pKi unit better than a lead when using PyAutoFEP to screen 10 or 100 analogues. Compared to random sampling, the probabilities increased up to sevenfold when 100 molecules were screened, suggesting that PyAutoFEP would likely be useful for lead optimization. PyAutoFEP is available on GitHub at https://github.com/lmmpf/PyAutoFEP.
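
The three validation metrics quoted in the abstract (Pearson's r, the fraction of predictions with the correct sign, and the fraction with an error below 1.5 kcal/mol) can be illustrated with a minimal Python sketch. This is not PyAutoFEP's analysis code: the function names, the metric definitions as written here, and the numbers in the example are all assumptions made for illustration, and the exact definitions and averaging used by the authors may differ.

```python
import math


def pearson_r(predicted, experimental):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov / math.sqrt(var_p * var_e)


def sign_agreement(predicted, experimental):
    """Fraction of pairs whose predicted and experimental values share a sign.

    Pairs with an experimental value of exactly zero are skipped
    (an assumption of this sketch).
    """
    pairs = [(p, e) for p, e in zip(predicted, experimental) if e != 0]
    return sum(1 for p, e in pairs if p * e > 0) / len(pairs)


def fraction_within(predicted, experimental, tolerance=1.5):
    """Fraction of predictions with absolute error below `tolerance` (kcal/mol)."""
    hits = sum(1 for p, e in zip(predicted, experimental) if abs(p - e) < tolerance)
    return hits / len(predicted)


# Hypothetical relative binding free energies in kcal/mol, NOT data from the paper.
predicted_ddg = [-1.2, 0.4, 2.1, -0.3, 1.0]
experimental_ddg = [-0.8, 0.9, 1.5, 0.6, 2.9]

print("Pearson's r:", round(pearson_r(predicted_ddg, experimental_ddg), 2))
print("Correct sign:", sign_agreement(predicted_ddg, experimental_ddg))
print("Error < 1.5 kcal/mol:", fraction_within(predicted_ddg, experimental_ddg))
```

The functions use only the standard library so that the definitions stay visible; in practice, a routine such as `scipy.stats.pearsonr` would likely replace the hand-written correlation.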