An efficient and high-throughput approach for experimental validation of novel human gene predictions

Genomics. 2006 Apr;87(4):437-45. doi: 10.1016/j.ygeno.2005.11.016. Epub 2006 Jan 9.

Abstract

A highly automated RT-PCR-based approach has been established to validate novel human gene predictions with no prior experimental evidence of mRNA splicing (ab initio predictions). Ab initio gene predictions were selected for high-throughput validation using predicted protein classification, sequence similarity to other genomes, colocalization with an MPSS tag, or microarray expression. Initial microarray prioritization followed by RT-PCR validation was the most efficient combination, resulting in approximately 35% of the ab initio predictions being validated by RT-PCR. Of the 7252 novel genes that were prioritized and processed, 796 constituted real transcripts. In addition, high-throughput RACE successfully extended the 5' and/or 3' ends of >60% of RT-PCR-validated genes. Reevaluation of these transcripts produced 574 novel transcripts using RefSeq as a reference. RT-PCR sequencing in combination with RACE on ab initio gene predictions could be used to define the transcriptome across all species.
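The two yield figures in the abstract describe different things: the roughly 35% rate applies to the microarray-then-RT-PCR combination, while the counts 796 and 7252 cover all prioritized predictions. A minimal sketch of the overall rate implied by those two counts follows. The function and variable names are illustrative assumptions and do not come from the paper.

```python
# Counts taken from the abstract; names are hypothetical, chosen for illustration.
PRIORITIZED_PREDICTIONS = 7252  # novel gene predictions prioritized and processed
VALIDATED_TRANSCRIPTS = 796     # predictions that constituted real transcripts


def validation_rate(validated: int, processed: int) -> float:
    """Return the fraction of processed predictions that were validated."""
    if processed <= 0:
        raise ValueError("processed must be positive")
    return validated / processed


overall_rate = validation_rate(VALIDATED_TRANSCRIPTS, PRIORITIZED_PREDICTIONS)
print(f"Overall validation rate: {overall_rate:.1%}")
```

Across all selection strategies the overall rate comes to about 11%, well below the approximately 35% reported for the microarray-prioritized subset, which is consistent with the abstract's statement that this combination was the most efficient.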

Publication types

  • Comparative Study

MeSH terms

  • Algorithms
  • Alternative Splicing
  • Computational Biology
  • Gene Expression Profiling
  • Genome
  • Genome, Human*
  • Humans
  • Oligonucleotide Array Sequence Analysis
  • Predictive Value of Tests
  • Proteins / classification
  • Reproducibility of Results
  • Reverse Transcriptase Polymerase Chain Reaction
  • Sequence Analysis, DNA
  • Software

Substances

  • Proteins