Genome wide identification and classification of alternative splicing based on EST data

Bioinformatics. 2004 Nov 1;20(16):2579-85. doi: 10.1093/bioinformatics/bth288. Epub 2004 Apr 29.

Abstract

Motivation: Alternative splicing is currently thought to explain the vast disparity between the number of predicted genes in the human genome and the highly diverse proteome. Mapping expressed sequence tag (EST) consensus sequences derived from the GeneNest database onto the genome provides an efficient way of predicting exon-intron boundaries, gene structure and alternative splicing events. However, the alternative splicing events are obscured by a large number of putatively artificial exon boundaries arising from genomic contamination or alignment errors. The current work describes a methodology for associating quality values with the predicted exon-intron boundaries. High-quality exon-intron boundaries are used to predict constitutive and alternative splicing, ranked by confidence values, with the aim of facilitating large-scale analysis of alternative splicing and of splicing in general.
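The idea of attaching quality values to predicted boundaries and ranking them can be illustrated with a minimal Python sketch. The abstract does not state how the quality values are computed, so the scoring rule, the threshold, the field names and the example data below are all hypothetical and serve only to show the general shape of such a filter.

```python
from dataclasses import dataclass


@dataclass
class Boundary:
    position: int         # genomic coordinate of the predicted exon-intron boundary
    supporting_ests: int  # ESTs whose alignment uses this boundary
    spanning_ests: int    # ESTs whose alignment covers this position
    canonical_site: bool  # True if flanked by a canonical GT/AG dinucleotide pair


def quality(b: Boundary) -> float:
    """Hypothetical quality value in [0, 1]; NOT the formula used in the paper."""
    support = min(1.0, b.supporting_ests / 5)  # assumed: saturate at 5 supporting ESTs
    return support if b.canonical_site else 0.5 * support  # assumed penalty


def rank_boundaries(boundaries, min_quality=0.3):
    """Drop low-quality boundaries and sort the rest by descending quality."""
    kept = [b for b in boundaries if quality(b) >= min_quality]
    return sorted(kept, key=quality, reverse=True)


def classify(b: Boundary) -> str:
    """Assumed rule: a boundary used by every spanning EST is constitutive."""
    return "constitutive" if b.supporting_ests == b.spanning_ests else "alternative"


# Made-up example data, not taken from GeneNest or SpliceNest.
examples = [
    Boundary(position=2000, supporting_ests=3, spanning_ests=10, canonical_site=True),
    Boundary(position=3000, supporting_ests=1, spanning_ests=10, canonical_site=False),
    Boundary(position=1000, supporting_ests=10, spanning_ests=10, canonical_site=True),
]
ranked = rank_boundaries(examples)
labels = [(b.position, classify(b)) for b in ranked]
```

In this toy setup the weakly supported, non-canonical boundary is discarded as a likely artifact before any splicing classification is attempted, mirroring the filtering step described above.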

Results: Applying the current methodology, constitutive splicing is observed in 33,270 EST clusters, of which 45% are alternatively spliced. The classification derived from the computed confidence values for 17 of these splice events correlates in most cases (15/17) with RT-PCR experiments performed on 40 different tissue samples. As an application of the confidence measure, an evaluation of the distribution of alternative splicing revealed that the majority of variants correspond to the coding regions of the genes. However, a significant fraction still maps to non-coding regions, indicating a functional relevance of alternative splicing in untranslated regions.
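For orientation, the reported figures can be turned into approximate absolute numbers. The short Python sketch below uses only the values stated above; because the 45% figure is rounded, the derived cluster count is an estimate rather than a number reported by the authors.

```python
# Values as stated in the Results paragraph.
spliced_clusters = 33_270      # EST clusters showing constitutive splicing
alt_percent = 45               # percentage reported as alternatively spliced

# Integer arithmetic avoids floating-point rounding surprises.
approx_alt_clusters = spliced_clusters * alt_percent // 100

# RT-PCR validation: 15 of 17 tested splice events agreed with the classification.
agreeing, tested = 15, 17
agreement = agreeing / tested

print(f"about {approx_alt_clusters:,} alternatively spliced clusters")
print(f"RT-PCR agreement: {agreement:.1%}")
```

This gives roughly 15,000 alternatively spliced clusters and an agreement rate of about 88% on the 17 tested events.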

Availability: The predicted alternative splice variants are visualized in the SpliceNest database at http://splicenest.molgen.mpg.de

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Algorithms*
  • Alternative Splicing / genetics*
  • Artificial Intelligence
  • Chromosome Mapping / methods*
  • Chromosomes, Human / genetics
  • Consensus Sequence / genetics
  • Expressed Sequence Tags*
  • Humans
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*