A probabilistic view of protein stability, conformational specificity, and design

Jacob A Stern; Tyler J Free; Kimberlee L Stern; Spencer Gardiner; Nicholas A Dalley; Bradley C Bundy; Joshua L Price; David Wingate; Dennis Della Corte

doi:10.1038/s41598-023-42032-1

Sci Rep. 2023 Sep 19;13(1):15493. doi: 10.1038/s41598-023-42032-1.

Authors

Jacob A Stern¹, Tyler J Free², Kimberlee L Stern³, Spencer Gardiner⁴, Nicholas A Dalley³, Bradley C Bundy², Joshua L Price³, David Wingate¹, Dennis Della Corte⁵

Affiliations

¹ Department of Computer Science, Brigham Young University, Provo, UT, USA.
² Department of Chemical Engineering, Brigham Young University, Provo, UT, USA.
³ Department of Chemistry and Biochemistry, Brigham Young University, Provo, UT, USA.
⁴ Department of Physics and Astronomy, Brigham Young University, Provo, UT, USA.
⁵ Department of Physics and Astronomy, Brigham Young University, Provo, UT, USA. dennis.dellacorte@byu.edu.

Abstract

Various approaches have used neural networks as probabilistic models for the design of protein sequences. These "inverse folding" models employ different objective functions, which come with trade-offs that have not been assessed in detail before. This study introduces probabilistic definitions of protein stability and conformational specificity and demonstrates the relationship between these chemical properties and the p(structure | sequence) Boltzmann probability objective. This links the Boltzmann probability objective function to experimentally verifiable outcomes. We propose a novel sequence decoding algorithm, referred to as "BayesDesign", that leverages Bayes' Rule to maximize the p(structure | sequence) objective instead of the p(sequence | structure) objective common in inverse folding models. The efficacy of BayesDesign is evaluated in the context of two protein model systems, the NanoLuc enzyme and the WW structural motif. Both BayesDesign and the baseline ProteinMPNN algorithm increase the thermostability of NanoLuc and increase the conformational specificity of WW. The possible sources of error in the model are analyzed.
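
To make the role of Bayes' Rule concrete: since p(structure | sequence) = p(sequence | structure) p(structure) / p(sequence), and p(structure) does not depend on the sequence being designed, maximizing p(structure | sequence) over sequences amounts to maximizing log p(sequence | structure) - log p(sequence). The following is a minimal toy sketch of that idea, not the authors' implementation. The probability tables, the three-letter alphabet, the per-position independence of the prior, and the function name `decode` are all assumptions made up for illustration.

```python
import math

# Toy alphabet and a two-residue "protein". Every number below is invented
# for illustration; none of it comes from a real model or from the paper.
ALPHABET = "AGW"

# Hypothetical inverse-folding output: p(residue | structure) per position.
P_SEQ_GIVEN_STRUCT = [
    {"A": 0.5, "G": 0.3, "W": 0.2},
    {"A": 0.6, "G": 0.3, "W": 0.1},
]

# Hypothetical sequence prior: p(residue) per position. Treating positions
# as independent is a simplifying assumption of this sketch.
P_SEQ = [
    {"A": 0.7, "G": 0.2, "W": 0.1},
    {"A": 0.8, "G": 0.1, "W": 0.1},
]


def decode(use_bayes):
    """Greedily pick one residue per position.

    use_bayes=False scores by log p(seq | structure) only.
    use_bayes=True scores by log p(seq | structure) - log p(seq), which is
    proportional to log p(structure | seq) up to a sequence-independent term.
    """
    chosen = []
    for cond, prior in zip(P_SEQ_GIVEN_STRUCT, P_SEQ):
        def score(aa, cond=cond, prior=prior):
            s = math.log(cond[aa])
            if use_bayes:
                s -= math.log(prior[aa])
            return s
        chosen.append(max(ALPHABET, key=score))
    return "".join(chosen)


print(decode(use_bayes=False))  # AA
print(decode(use_bayes=True))   # WG
```

In this toy setting the conditional-only objective favors residues that are common everywhere ("A"), while the Bayes-corrected objective favors residues that are unusually likely given the structure relative to their background frequency.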

MeSH terms

Algorithms*
Amino Acid Sequence
Bayes Theorem
Likelihood Functions
Protein Stability