Protein evolution often proceeds at unequal rates at different sites along an amino acid chain. Site-specific evolutionary rates have been linked to several structural and functional properties of proteins. Previous analyses of this phenomenon have involved relatively small datasets and, in some cases, did not evaluate the interactions among multiple structural factors. Here, we present the results of a large-scale phylogenetic and statistical analysis, testing the effects and interactions of three structural properties on amino acid replacement rates. We used sequence-based computational methods to predict (i) intrinsic disorder propensity, (ii) secondary structure, and (iii) functional domain involvement across millions of amino acid sites in thousands of sequence alignments of metazoan proteins. Our results partially corroborate earlier findings that intrinsically disordered sites tend to be more variable than ordered sites, but there is considerable overlap between their rate distributions, and a significant confounding interaction exists between intrinsic disorder and secondary structure. Notably, protein sites that are consistently predicted to be both intrinsically disordered and involved in secondary structures tend to be the most conserved at the amino acid level, suggesting that they are highly constrained and functionally important. In addition, a significant interaction exists between functional domain involvement and secondary structure. These findings suggest that multiple structural drivers of protein evolution should be evaluated simultaneously to obtain a clear picture of their individual effects as well as any confounding interactions among them.
Keywords: Pfam domain; divergence; evolutionary rate; intrinsic disorder; secondary structure; site-specific rate.
© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.