Understanding the characteristics of mass spectrometry data through the use of simulation

Cancer Inform. 2005;1(1):41-52.

Abstract

Background: Mass spectrometry is actively being used to discover disease-related proteomic patterns in complex mixtures of proteins derived from tissue samples or from easily obtained biological fluids. The potential importance of these clinical applications has made the development of better methods for processing and analyzing the data an active area of research. It is, however, difficult to determine which methods are better without knowing the true biochemical composition of the samples used in the experiments.

Methods: We developed a mathematical model based on the physics of a simple MALDI-TOF mass spectrometer with time-lag focusing. Using this model, we implemented a statistical simulation of mass spectra. We used the simulation to explore some of the basic operating characteristics of MALDI or SELDI instruments.

Results: The simulation reproduced several characteristics of actual instruments. We found that the relative mass error is affected by the time discretization of the detector (about 0.01%) and the spread of initial velocities (about 0.1%). The accuracy of calibration based on external standards decays rapidly outside the range spanned by the calibrants. Natural isotope distributions play a major role in broadening peaks associated with individual proteins. The area of a peak is a more accurate measure of its size than the height.
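
To illustrate how detector time discretization can turn into a relative mass error, the following minimal sketch uses an idealized linear time-of-flight relation (flight time proportional to the square root of mass). It is not the model described in this paper: it ignores time-lag focusing and the spread of initial velocities, and the accelerating voltage (20 kV), drift length (1 m), clock tick (4 ns), and all function names are assumptions chosen only for illustration.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, coulombs
DALTON = 1.66053906660e-27   # unified atomic mass unit, kilograms


def flight_time(mass_da, charge=1, voltage=20000.0, length=1.0):
    """Idealized drift time (s) of an ion accelerated through `voltage`
    and flying `length` meters. Assumed toy model, no time-lag focusing."""
    m = mass_da * DALTON
    return length * math.sqrt(m / (2.0 * charge * E_CHARGE * voltage))


def mass_from_time(t, charge=1, voltage=20000.0, length=1.0):
    """Invert flight_time: recover the mass (Da) from a drift time (s)."""
    m = 2.0 * charge * E_CHARGE * voltage * (t / length) ** 2
    return m / DALTON


def discretized_relative_error(mass_da, tick=4e-9, **instrument):
    """Relative mass error caused by rounding the flight time to the
    nearest detector clock tick (`tick` seconds, an assumed value)."""
    t = flight_time(mass_da, **instrument)
    t_binned = round(t / tick) * tick
    recovered = mass_from_time(t_binned, **instrument)
    return abs(recovered - mass_da) / mass_da


if __name__ == "__main__":
    for mass in (1000.0, 10000.0, 50000.0):
        err = discretized_relative_error(mass)
        print(f"{mass:8.0f} Da: relative error {100 * err:.5f}%")
```

Because mass scales with the square of the flight time in this toy model, a timing error of at most half a tick gives a relative mass error of at most about one tick divided by the flight time, so the effect shrinks for heavier, slower ions.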

Conclusions: The model described here is capable of simulating realistic mass spectra. The simulation should become a useful tool for generating spectra where the true inputs are known, allowing researchers to evaluate the performance of new methods for processing and analyzing mass spectra.

Availability: http://bioinformatics.mdanderson.org/cromwell.html.

Keywords: MALDI; SELDI; isotope distribution; mass resolution; mass spectrometry; peak capacity; peak quantification; simulation.