Retrosynthesis Zero: Self-Improving Global Synthesis Planning Using Reinforcement Learning

Jiasheng Guo; Chenning Yu; Kenan Li; Yijian Zhang; Guoqiang Wang; Shuhua Li; Hao Dong

doi:10.1021/acs.jctc.4c00071

J Chem Theory Comput. 2024 May 15. doi: 10.1021/acs.jctc.4c00071. Online ahead of print.

Authors

Jiasheng Guo¹, Chenning Yu¹, Kenan Li¹, Yijian Zhang¹, Guoqiang Wang², Shuhua Li², Hao Dong¹,³

Affiliations

¹ Kuang Yaming Honors School, Nanjing University, Nanjing 210023, China.
² School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing 210023, China.
³ State Key Laboratory of Analytical Chemistry for Life Science, Chemistry and Biomedicine Innovation Center (ChemBIC), Institute for Brain Sciences, Nanjing University, Nanjing 210023, China.

PMID: 38747149
DOI: 10.1021/acs.jctc.4c00071

Abstract

The field of computer-aided synthesis planning (CASP) has grown significantly in recent years. Still, many CASP programs rely on large data sets to train neural networks, which leaves them limited by data quality and by the prior knowledge of chemists. In response, we propose Retrosynthesis Zero (ReSynZ), a reaction template-based method that combines Monte Carlo Tree Search with reinforcement learning, inspired by AlphaGo Zero. Unlike other single-step reaction template-based CASP methods, ReSynZ takes complete synthesis paths for complex molecules, determined by reaction rules, as input for training the neural network. ReSynZ enables neural networks trained with relatively small reaction data sets (tens of thousands of data points) to generate multiple synthesis pathways for a target molecule and suggest possible reaction conditions. On multiple data sets of molecular retrosynthesis, ReSynZ demonstrates excellent predictive performance compared to existing algorithms. Its advantages, including self-improving model features, flexible reward settings, and the potential to surpass human limitations in chemical synthesis route planning, make ReSynZ a valuable tool in chemical synthesis design.
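The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it names: an AlphaGo Zero-style tree search in which each child of a node corresponds to applying one reaction template, and a route-level reward is propagated back along the visited path. The PUCT selection rule shown is the one commonly associated with AlphaGo Zero; whether ReSynZ uses exactly this rule is an assumption. The names `Node`, `puct_score`, `select_template`, and `backup`, the constant `c_puct = 1.5`, and the template identifiers used as dictionary keys are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node, e.g. a set of precursors still to be resolved."""
    prior: float                 # policy-network prior P(s, a) for the move into this node
    visit_count: int = 0         # N(s, a)
    value_sum: float = 0.0       # accumulated reward W(s, a)
    children: dict = field(default_factory=dict)  # template id -> Node

    def mean_value(self) -> float:
        # Q(s, a); unvisited nodes default to 0
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Q + U, where U favours high-prior, rarely visited children."""
    exploration = (
        c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    )
    return child.mean_value() + exploration


def select_template(parent: Node, c_puct: float = 1.5):
    """Return the (template id, child) pair with the highest PUCT score."""
    return max(
        parent.children.items(),
        key=lambda item: puct_score(parent, item[1], c_puct),
    )


def backup(path, reward: float) -> None:
    """Propagate a route-level reward to every node on the visited path."""
    for node in path:
        node.visit_count += 1
        node.value_sum += reward
```

In a sketch like this, the "flexible reward settings" mentioned in the abstract would enter only through the `reward` passed to `backup`, for example a score that depends on route length or on whether all leaves are purchasable; the specific reward used by ReSynZ is not stated here.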