Multi-reward Reinforcement Learning Based Bond-Order Potential to Study Strain-Assisted Phase Transitions in Phosphorene

J Phys Chem Lett. 2022 Feb 24;13(7):1886-1893. doi: 10.1021/acs.jpclett.1c03551. Epub 2022 Feb 17.

Abstract

We introduce a multi-reward reinforcement learning (RL) approach to train a flexible bond-order potential (BOP) for 2D phosphorene based on ab initio training data sets. Our approach is based on a continuous-action-space Monte Carlo tree search algorithm that is general and scalable and provides an efficient multiobjective optimization scheme for high-dimensional materials design problems. As a proof of concept, we deploy this scheme to parametrize a BOP against multiple structural and dynamical properties of 2D phosphorene polymorphs. Our RL-trained BOP model adequately captures the structure, energetics, transformation barriers, equation of state, elastic constants, and phonon dispersions of various 2D P polymorphs. We use this model to probe the impact of temperature and strain rate on the phase transition from black (α-P) to blue phosphorene (β-P) through molecular dynamics simulations. We observe that the critical strain for this phase transition decreases with increasing temperature, and we discuss the underlying atomistic mechanisms.
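The multi-reward idea can be illustrated with a toy sketch: each target property gets its own reward, and the rewards are combined into one score that a search over continuous parameters tries to maximize. Everything in the block below is an assumption made for illustration, since the abstract does not give the reward definitions or the search details: a Morse pair potential stands in for the BOP, the reference values are invented, the rewards are combined by a weighted sum, and a simple shrinking-window random search stands in for the continuous-action-space Monte Carlo tree search.

```python
import math
import random

# Toy reference parameters (hypothetical values, NOT phosphorene data).
TARGET = (1.0, 1.5, 2.2)  # (well depth D, width a, equilibrium distance r0)


def morse(r, D, a, r0):
    """Morse pair energy, used only as a stand-in for a bond-order potential."""
    return D * (1.0 - math.exp(-a * (r - r0))) ** 2 - D


def properties(params):
    """Toy 'properties' on which the separate rewards are scored."""
    D, a, r0 = params
    return {
        "energies": [morse(r, D, a, r0) for r in (1.9, 2.2, 2.6, 3.0)],
        "stiffness": 2.0 * D * a * a,  # second derivative of the Morse energy at r0
        "bond_length": r0,
    }


REFERENCE = properties(TARGET)


def rewards(params):
    """One reward per objective; each is <= 0 and equals 0 for a perfect match."""
    p = properties(params)
    sq = sum((x - y) ** 2 for x, y in zip(p["energies"], REFERENCE["energies"]))
    rmse = math.sqrt(sq / len(p["energies"]))
    return {
        "energies": -rmse,
        "stiffness": -abs(p["stiffness"] - REFERENCE["stiffness"]) / REFERENCE["stiffness"],
        "bond_length": -abs(p["bond_length"] - REFERENCE["bond_length"]) / REFERENCE["bond_length"],
    }


def combined_reward(params, weights=None):
    """Weighted sum of the per-objective rewards (an assumed combination rule)."""
    weights = weights or {"energies": 1.0, "stiffness": 1.0, "bond_length": 1.0}
    r = rewards(params)
    return sum(weights[k] * r[k] for k in r)


def search(start, steps=4000, window=0.5, shrink=0.999, seed=0):
    """Shrinking-window random search over continuous parameters (not MCTS)."""
    rng = random.Random(seed)
    best, best_score = tuple(start), combined_reward(start)
    for _ in range(steps):
        # Perturb every parameter within the current window; keep values positive.
        trial = tuple(max(1e-3, x + rng.uniform(-window, window)) for x in best)
        score = combined_reward(trial)
        if score > best_score:
            best, best_score = trial, score
        window *= shrink  # narrow the sampling window as the search proceeds
    return best, best_score


if __name__ == "__main__":
    start = (0.5, 1.0, 2.5)
    best, score = search(start)
    print("initial reward:", combined_reward(start))
    print("best parameters:", best)
    print("best reward:", score)
```

Keeping the per-objective rewards separate until the final weighted sum makes it easy to inspect which property limits the fit and to reweight objectives without touching the search loop.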