Assessing the accuracy of prediction algorithms for classification: an overview

P Baldi; S Brunak; Y Chauvin; C A Andersen; H Nielsen

doi:10.1093/bioinformatics/16.5.412

Bioinformatics. 2000 May;16(5):412-24. doi: 10.1093/bioinformatics/16.5.412.

Authors

P Baldi¹, S Brunak, Y Chauvin, C A Andersen, H Nielsen

Affiliation

¹ Department of Information and Computer Science, University of California, Irvine, CA 92697, USA. pfbaldi@ics.uci.edu

PMID: 10871264

Abstract

We provide a unified overview of methods currently in wide use for assessing the accuracy of prediction algorithms, ranging from raw percentages, quadratic error measures and other distances, and correlation coefficients to information-theoretic measures such as relative entropy and mutual information. We briefly discuss the advantages and disadvantages of each approach. For classification tasks, we derive new learning algorithms for the design of prediction systems by directly optimising the correlation coefficient. We observe and prove several results relating the sensitivity and specificity of optimal systems. While the principles are general, we illustrate their applicability on specific problems such as protein secondary structure and signal peptide prediction.
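As a rough illustration of the kinds of measures the abstract lists, the following minimal sketch computes several of them from a two-class confusion matrix. It is not taken from the paper: the function name `binary_metrics`, the argument names, and the choice to report mutual information in bits are assumptions made for this example. Note also that "specificity" is defined differently across fields, as TN/(TN+FP) in some and as TP/(TP+FP) in others, so the sketch returns both quantities under separate names rather than assuming which one the authors intend.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Illustrative accuracy measures for a 2x2 confusion matrix.

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives and false negatives (hypothetical argument names).
    """
    n = tp + fp + tn + fn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    # Raw percentage of correct predictions.
    accuracy = (tp + tn) / n

    sensitivity = ratio(tp, tp + fn)      # fraction of actual positives found
    tn_rate = ratio(tn, tn + fp)          # "specificity" in one convention
    precision = ratio(tp, tp + fp)        # "specificity" in another convention

    # Correlation coefficient between actual and predicted labels
    # (often called the Matthews correlation coefficient).
    denom = math.sqrt((tp + fn) * (tp + fp) * (tn + fp) * (tn + fn))
    correlation = ratio(tp * tn - fp * fn, denom)

    # Mutual information (in bits) between actual and predicted labels.
    # Keys are (actual, predicted) with 1 = positive, 0 = negative.
    joint = {(1, 1): tp, (1, 0): fn, (0, 1): fp, (0, 0): tn}
    p_actual = {1: (tp + fn) / n, 0: (tn + fp) / n}
    p_pred = {1: (tp + fp) / n, 0: (tn + fn) / n}
    mutual_information = 0.0
    for (a, p), count in joint.items():
        if count:  # empty cells contribute nothing to the sum
            p_joint = count / n
            mutual_information += p_joint * math.log2(
                p_joint / (p_actual[a] * p_pred[p])
            )

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "true_negative_rate": tn_rate,
        "precision": precision,
        "correlation": correlation,
        "mutual_information_bits": mutual_information,
    }


if __name__ == "__main__":
    # Made-up counts, used only to exercise the function.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

With these conventions, a perfect predictor on balanced classes yields a correlation of 1 and one bit of mutual information, while a predictor that is independent of the true labels yields 0 for both, even though its raw accuracy can still be 50% or higher.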

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, P.H.S.
Review

MeSH terms

Algorithms*
Classification / methods*
Computational Biology
Learning
Models, Statistical
Neural Networks, Computer