'Genome order index' should not be used for defining compositional constraints in nucleotide sequences

Comput Biol Chem. 2008 Apr;32(2):147. doi: 10.1016/j.compbiolchem.2007.11.003. Epub 2007 Dec 15.

Abstract

A "genome order index," defined as S = a^2 + c^2 + t^2 + g^2, where a, c, t, and g are the nucleotide frequencies of A, C, T, and G, respectively, was used to suggest that there exist genome-specific constraints on nucleotide composition. We show that the "evidence" for constraint, S<1/3, is in fact a mathematical property that is always true regardless of data. Moreover, we show that S is strictly equivalent to and derivable from the Shannon H-function and has no advantage over it.
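As an illustration of the two quantities named in the abstract, the following minimal sketch computes the sum of squared nucleotide frequencies (S) and the Shannon H-function for a sequence. The function names are hypothetical, and the choice of base-2 logarithms and the skipping of non-ACGT symbols are assumptions made here, not details taken from the paper.

```python
import math
from collections import Counter


def nucleotide_frequencies(seq):
    """Return relative frequencies of A, C, G, T (other symbols are ignored; an assumption)."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return {base: counts[base] / total for base in "ACGT"}


def genome_order_index(freqs):
    """S = a^2 + c^2 + t^2 + g^2, the sum of squared nucleotide frequencies."""
    return sum(p * p for p in freqs.values())


def shannon_entropy(freqs):
    """Shannon H in bits (base 2 is an assumption); zero-frequency terms contribute nothing."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the calls.
    freqs = nucleotide_frequencies("ACGTACGTAACC")
    print("S =", genome_order_index(freqs))
    print("H =", shannon_entropy(freqs))
```

For a perfectly uniform composition (a = c = g = t = 1/4), this sketch gives S = 1/4 and H = 2 bits; for a sequence made of a single nucleotide it gives S = 1 and H = 0.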

Publication types

  • Comment
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Base Composition*
  • Base Sequence*
  • Genome*
  • Sequence Analysis, DNA