Microdeletions and microinsertions causing human genetic disease: common mechanisms of mutagenesis and the role of local DNA sequence complexity

Hum Mutat. 2005 Sep;26(3):205-13. doi: 10.1002/humu.20212.

Abstract

In the Human Gene Mutation Database (www.hgmd.org), microdeletions and microinsertions causing inherited disease (both defined as involving ≤20 bp of DNA) account for 8,399 (17%) and 3,345 (7%) logged mutations, in 940 and 668 genes, respectively. A positive correlation was noted between the microdeletion and microinsertion frequencies for 564 genes for which both microdeletions and microinsertions are reported in HGMD, consistent with the view that the propensity of a given gene/sequence to undergo microdeletion is related to its propensity to undergo microinsertion. While microdeletions and microinsertions of 1 bp constitute respectively 48% and 66% of the corresponding totals, the relative frequency of the remaining lesions correlates negatively with the length of the DNA sequence deleted or inserted. Many of the microdeletions and microinsertions of more than 1 bp are potentially explicable in terms of slippage mutagenesis, involving the addition or removal of one copy of a mono-, di-, or trinucleotide tandem repeat. The frequency of in-frame 3-bp and 6-bp microinsertions and microdeletions was, however, found to be significantly lower than that of mutations of other lengths, suggesting that some of these in-frame lesions may not have come to clinical attention. Various sequence motifs were found to be overrepresented in the vicinity of both microinsertions and microdeletions, including the heptanucleotide CCCCCTG that shares homology with the complement of the 8-bp human minisatellite conserved sequence/chi-like element (GCWGGWGG). The previously reported indel hotspot GTAAGT and its complement ACTTAC were also found to be overrepresented in the vicinity of both microinsertions and microdeletions, thereby providing a first example of a mutational hotspot that is common to different types of gene lesion. Other motifs overrepresented in the vicinity of microdeletions and microinsertions included DNA polymerase pause sites and topoisomerase cleavage sites. 
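As an illustration only, the motif search described above could be sketched as follows. This is not the authors' pipeline: the function names (`reverse_complement`, `find_motif`) and the example sequences are hypothetical, and the sketch assumes the flanking sequence is available as a plain DNA string. It expands IUPAC codes such as W (A or T) in GCWGGWGG and looks for a motif together with its reverse complement, which is the sense in which ACTTAC is the complement of GTAAGT.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes (e.g. W = A or T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif):
    """Return the reverse complement of a (possibly IUPAC-coded) motif."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, motif, both_strands=True):
    """Return sorted 0-based start positions of motif in seq.

    Overlapping matches are reported (via a lookahead); if both_strands is
    True, matches to the reverse complement of the motif are included.
    """
    patterns = {motif.upper()}
    if both_strands:
        patterns.add(reverse_complement(motif))
    hits = set()
    for pattern in patterns:
        regex = re.compile("(?=(" + "".join(IUPAC[b] for b in pattern) + "))")
        hits.update(m.start() for m in regex.finditer(seq.upper()))
    return sorted(hits)


# Hypothetical flanking sequence containing GTAAGT and ACTTAC.
example = "AAGTAAGTCCACTTACGG"
positions = find_motif(example, "GTAAGT")
```

Counting such hits in windows around each lesion, and comparing against suitable control sequence, would be one way to approach an over-representation test; the statistical procedure actually used is not stated in the abstract.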
Several novel microdeletion/microinsertion hotspots were noted and some of these exhibited sufficient similarity to one another to justify terming them "super-hotspot" motifs. Analysis of sequence complexity also demonstrated that a combination of slipped mispairing mediated by direct repeats, and secondary structure formation promoted by symmetric elements, can account for the majority of microdeletions and microinsertions. Thus, microinsertions and microdeletions exhibit strong similarities in terms of the characteristics of their flanking DNA sequences, implying that they are generated by very similar underlying mechanisms.
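The two mechanisms named above, slipped mispairing at direct repeats and secondary structure formation at symmetric elements, can be illustrated with a minimal sketch. It is not the sequence complexity analysis used in the study: the function names, the arm length, and the loop limit are assumptions chosen for illustration. The first function asks whether a deletion removes exactly one copy of a mono-, di-, or trinucleotide tandem repeat; the second lists short inverted repeats that could in principle form a hairpin.

```python
# Plain A/C/G/T complement table (no ambiguity codes in this sketch).
_COMP = str.maketrans("ACGT", "TGCA")


def is_repeat_unit_deletion(ref, start, length, max_unit=3):
    """True if deleting ref[start:start+length] removes one copy of a
    tandemly repeated unit of 1..max_unit bp (slippage-compatible)."""
    if not 1 <= length <= max_unit:
        return False
    unit = ref[start:start + length]
    before = ref[max(0, start - length):start]       # copy just upstream
    after = ref[start + length:start + 2 * length]   # copy just downstream
    return unit == before or unit == after


def inverted_repeats(seq, arm=4, max_loop=6):
    """Return (left_start, right_start, left_arm) tuples where the right arm
    is the reverse complement of the left arm, with 0..max_loop bases between
    the arms. Arm length and loop limit are illustrative assumptions."""
    seq = seq.upper()
    found = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = left.translate(_COMP)[::-1]
        for loop in range(max_loop + 1):
            j = i + arm + loop
            if j + arm > len(seq):
                break
            if seq[j:j + arm] == target:
                found.append((i, j, left))
    return found


# Hypothetical examples: a CA deletion inside a (CA)n run, and a
# GACT ... AGTC inverted repeat separated by a 3-bp loop.
slippage_like = is_repeat_unit_deletion("GGCACACATT", start=4, length=2)
hairpins = inverted_repeats("TTGACTAAAAGTCTT")
```

A lesion that fails the first check may still sit next to an inverted repeat, which is why the two tests are kept separate here.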

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods
  • DNA-Directed DNA Polymerase / genetics
  • Databases, Genetic
  • Gene Deletion
  • Genetic Diseases, Inborn / genetics*
  • Genetic Variation
  • Humans
  • Mutagenesis*
  • Mutation
  • Repetitive Sequences, Nucleic Acid
  • Sequence Analysis, DNA / methods*

Substances

  • DNA-Directed DNA Polymerase