Computational methods for ab initio and comparative gene finding

Methods Mol Biol. 2010;609:269-84. doi: 10.1007/978-1-60327-241-4_16.

Abstract

High-throughput DNA sequencing is increasing the number of publicly available complete genomes, even though a precise gene catalogue is not yet available for each organism. In this context, computational gene finders play a key role in producing a first, cost-effective annotation. Many gene prediction tools are now available to the scientific community and, despite their number, they fall into two main categories: (1) ab initio and (2) evidence based. Here we provide an overview of the main methodologies in these categories for predicting correct exon-intron structures of eukaryotic genes. We also consider newer strategies that refine ab initio predictions using comparative genomics or other evidence such as expression data. Finally, we briefly introduce metrics for in-house evaluation of gene predictions in terms of sensitivity and specificity at the nucleotide, exon, and gene levels.
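
As a rough illustration of the evaluation metrics mentioned above, the following minimal sketch computes sensitivity and specificity at the nucleotide and exon levels. It is not taken from the chapter. It assumes the convention common in gene-finding benchmarks, where sensitivity is TP / (TP + FN) and "specificity" is TP / (TP + FP), and it assumes exons are given as half-open (start, end) coordinate pairs on a single strand. The function names are hypothetical.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def nucleotide_level(true_exons, pred_exons):
    """Sensitivity and specificity over individual coding positions.

    Exons are (start, end) pairs, half-open, on one strand (an assumption
    made for this sketch).
    """
    true_pos = {p for start, end in true_exons for p in range(start, end)}
    pred_pos = {p for start, end in pred_exons for p in range(start, end)}
    tp = len(true_pos & pred_pos)   # coding positions predicted as coding
    fn = len(true_pos - pred_pos)   # coding positions that were missed
    fp = len(pred_pos - true_pos)   # non-coding positions predicted as coding
    return _ratio(tp, tp + fn), _ratio(tp, tp + fp)


def exon_level(true_exons, pred_exons):
    """Sensitivity and specificity counting only exons with both
    boundaries predicted exactly."""
    true_set, pred_set = set(true_exons), set(pred_exons)
    exact = len(true_set & pred_set)
    return _ratio(exact, len(true_set)), _ratio(exact, len(pred_set))


if __name__ == "__main__":
    annotated = [(0, 100), (200, 300)]   # toy reference annotation
    predicted = [(0, 100), (250, 350)]   # toy gene-finder output
    print("nucleotide Sn, Sp:", nucleotide_level(annotated, predicted))
    print("exon Sn, Sp:", exon_level(annotated, predicted))
```

Gene-level figures can be obtained in the same way by treating each gene as the ordered tuple of its exons and counting only genes whose whole structure matches exactly.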

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Algorithms
  • Animals
  • Computational Biology*
  • Data Mining*
  • Databases, Genetic*
  • Databases, Protein
  • Exons
  • Humans
  • Introns
  • Markov Chains
  • Models, Statistical
  • Reproducibility of Results
  • Sequence Analysis, DNA*
  • Sequence Analysis, Protein