The signature molecular descriptor. 2. Enumerating molecules from their extended valence sequences

J Chem Inf Comput Sci. 2003 May-Jun;43(3):721-34. doi: 10.1021/ci020346o.

Abstract

We present a new algorithm that enumerates molecular structures matching a predefined extended valence sequence or signature. The algorithm can construct molecular structures composed of about 50 non-hydrogen atoms on a time scale of CPU seconds. The algorithm is run to produce all molecular structures matching the binding affinities (IC50) of some HIV-1 protease inhibitors. The algorithm is also used to compute the degeneracy of a given signature, that is, the number of molecular structures corresponding to it. Signature degeneracy is systematically studied for varying signature heights on four molecular series: alkanes, alcohols, fullerene-type structures, and peptides. It is then compared with similar results obtained with popular topological indices (TIs). As a general rule, we find that signature degeneracy decreases as the signature height increases. We also find that alkanes, alcohols, and fullerene-type structures comprising n non-hydrogen atoms are uniquely characterized by signatures of height n/4, while peptides up to 4000 amino acids can be singled out with signatures of heights as small as 2 and 3.
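The relationship between signature height and degeneracy can be illustrated with a minimal sketch. It is not the paper's enumeration algorithm, and it assumes a simplified notion of signature: each atom is described by a canonical string of its neighborhood tree up to a given height, and a molecule by the multiset of those strings. The function names (`atom_signature`, `molecule_signature`, `degeneracy`), the adjacency-dictionary graph encoding, and the use of hydrogen-suppressed graphs are all illustrative assumptions. Degeneracy is counted here only within a small hand-supplied set of candidates, not over all structures that could be enumerated.

```python
from collections import Counter

def atom_signature(adj, labels, atom, height, parent=None):
    """Canonical string for the tree of neighbors around `atom`, `height` bonds deep.

    Simplifying assumption: a branch never steps straight back to the atom it came from.
    """
    if height == 0:
        return labels[atom]
    children = sorted(
        atom_signature(adj, labels, nbr, height - 1, parent=atom)
        for nbr in adj[atom]
        if nbr != parent
    )
    if not children:
        return labels[atom]
    return labels[atom] + "(" + "".join(children) + ")"

def molecule_signature(adj, labels, height):
    """Multiset of atom signatures for the whole molecule."""
    return Counter(atom_signature(adj, labels, a, height) for a in adj)

def degeneracy(molecules, height):
    """Map each distinct signature to the number of supplied molecules sharing it."""
    groups = Counter()
    for adj, labels in molecules:
        sig = molecule_signature(adj, labels, height)
        groups[tuple(sorted(sig.items()))] += 1
    return groups

# Hydrogen-suppressed graphs of the two C4H10 isomers (hypothetical encoding).
carbon = {i: "C" for i in range(4)}
butane = ({0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}, carbon)
isobutane = ({0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}, carbon)

for h in (0, 1):
    worst = max(degeneracy([butane, isobutane], h).values())
    print(f"height {h}: largest degeneracy = {worst}")
# height 0: largest degeneracy = 2
# height 1: largest degeneracy = 1
```

In this toy case the two isomers share a height-0 signature (four carbon atoms) but are separated at height 1, which is consistent with the general trend reported in the abstract that degeneracy decreases as height increases.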

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Chemistry, Pharmaceutical / methods
  • Computer-Aided Design
  • HIV Protease Inhibitors / chemistry
  • HIV Protease Inhibitors / pharmacology
  • Hydrocarbons / chemistry
  • Inhibitory Concentration 50
  • Isomerism
  • Models, Chemical*
  • Molecular Structure
  • Peptides / chemistry

Substances

  • HIV Protease Inhibitors
  • Hydrocarbons
  • Peptides