Identification of sequence motifs causing band compressions on human cDNA sequencing

DNA Res. 1996 Apr 30;3(2):81-6. doi: 10.1093/dnares/3.2.81.

Abstract

To characterize DNA sequences leading to band compressions in an automated dideoxy-DNA sequencing system that uses fluorescent dye primers, we compiled DNA sequences at compression sites from accumulated sequence data of human cDNAs (about 205 kb in total length). The results clearly showed that almost all the 3'-end regions at the compression sites (> 98%) carried one of two common sequence motifs. The predominant one (about 68%) contained a sequence of 5'-Y'GN1-2AR'-3' (Y' and R': pyrimidine and purine residues capable of base pairing). The remainder (about 32%) carried a hairpin motif with a relatively stable GC-rich stem (≥ 3 bp) connected by a loop of 3 or 4 nucleotides. The occurrence of compressions at these motif sites was further confirmed by using synthetic DNAs with random sequences (about 58 kb in total length). Since the DNA sequences at all compression sites analyzed so far shared one of these two motif types in the sequencing system employed here, it was possible to predict the nucleotide residue located at a compression site by carefully checking the sequence preceding the site.
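
The predominant motif, 5'-Y'GN1-2AR'-3', lends itself to a simple sequence scan. The following Python sketch is only an illustration and not the authors' procedure. It assumes that "capable of base pairing" means Watson-Crick pairs only (C with G, T with A) and that N1-2 stands for one or two arbitrary nucleotides. The names PAIRS and find_ygnar_motifs are hypothetical. The hairpin motif is not covered, because the abstract does not define "GC-rich" precisely enough to code it without further assumptions.

```python
# Assumption: Y' (pyrimidine) and R' (purine) pair by Watson-Crick rules only.
PAIRS = {"C": "G", "T": "A"}


def find_ygnar_motifs(seq):
    """Return (start, matched_substring) for every 5'-Y'GN(1-2)AR'-3' hit.

    Positions are 0-based. Both spacer lengths (1 and 2) are tested
    at each start position, so overlapping hits are all reported.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq)):
        for n in (1, 2):               # length of the N spacer
            end = i + 2 + n + 2        # Y' G + spacer + A R'
            if end > len(seq):
                continue
            y, g = seq[i], seq[i + 1]
            a, r = seq[i + 2 + n], seq[i + 3 + n]
            # Y' must be a pyrimidine whose partner purine sits at R'.
            if g == "G" and a == "A" and PAIRS.get(y) == r:
                hits.append((i, seq[i:end]))
    return hits


if __name__ == "__main__":
    for start, motif in find_ygnar_motifs("AACGTAGTTTGCCAA"):
        print(start, motif)
```

A hit only flags a candidate site. According to the abstract, compressions were associated with the 3'-end regions carrying such motifs, so any predicted position would still need to be checked against the actual sequencing trace.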

MeSH terms

  • Artifacts*
  • Base Sequence
  • DNA, Complementary / chemical synthesis
  • DNA, Complementary / chemistry*
  • Electrophoresis / instrumentation
  • Electrophoresis / methods
  • Humans
  • Sequence Analysis, DNA / instrumentation
  • Sequence Analysis, DNA / methods*

Substances

  • DNA, Complementary