Biomarker discovery for arsenic exposure using functional data. Analysis and feature learning of mass spectrometry proteomic data

J Proteome Res. 2008 Jan;7(1):217-24. doi: 10.1021/pr070491n. Epub 2008 Jan 4.

Abstract

Plasma biomarkers of exposure to environmental contaminants play an important role in early detection of disease. The emerging field of proteomics presents an attractive opportunity for candidate biomarker discovery, as it simultaneously measures and analyzes a large number of proteins. This article presents a case study of measuring arsenic concentrations in a population residing in an arsenic-endemic region of Bangladesh, using plasma protein expression measured by SELDI-TOF mass spectrometry. We analyze the data using a unified statistical method based on functional learning, both to preprocess the mass spectra and extract mass spectrometry (MS) features and to associate the selected MS features with arsenic exposure measurements. The task is challenging due to several factors: the high dimensionality of mass spectrometry data, complicated error structures, and a multiple comparison problem. We use nonparametric functional regression techniques for MS modeling, peak detection based on the significant zero-downcrossing method, and peak alignment using a warping algorithm. Our results show significant associations between arsenic exposure and under- or overexpression of 20 proteins.
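The peak-detection idea mentioned in the abstract can be illustrated with a minimal sketch: smooth the spectrum, then report locations where the slope of the smoothed curve changes from positive to negative (a zero-downcrossing of the derivative). This is not the authors' implementation. The function names, the Gaussian kernel smoother, the bandwidth, and the synthetic spectrum are all assumptions made for illustration, and the simple `min_height` threshold stands in for the statistical significance test that the significant zero-downcrossing method would apply.

```python
import math

def gaussian_smooth(y, bandwidth):
    """Kernel-weighted local average on an evenly spaced grid (illustrative smoother)."""
    n = len(y)
    radius = int(math.ceil(3 * bandwidth))  # truncate the kernel at 3 bandwidths
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        weights = [math.exp(-0.5 * ((j - i) / bandwidth) ** 2) for j in range(lo, hi)]
        total = sum(weights)
        smoothed.append(sum(w * y[j] for w, j in zip(weights, range(lo, hi))) / total)
    return smoothed

def zero_downcrossings(y, bandwidth=2.0, min_height=0.0):
    """Indices where the slope of the smoothed curve turns from positive to non-positive.

    min_height is a crude, hypothetical substitute for a significance test:
    candidate peaks whose smoothed intensity falls below it are discarded.
    """
    s = gaussian_smooth(y, bandwidth)
    slope = [s[i + 1] - s[i] for i in range(len(s) - 1)]  # finite-difference derivative
    peaks = []
    for i in range(1, len(slope)):
        if slope[i - 1] > 0 and slope[i] <= 0 and s[i] >= min_height:
            peaks.append(i)
    return peaks

# Synthetic, noise-free "spectrum" with two peaks centered at grid points 30 and 70.
spectrum = [
    10.0 * math.exp(-0.5 * ((i - 30) / 3.0) ** 2)
    + 6.0 * math.exp(-0.5 * ((i - 70) / 4.0) ** 2)
    for i in range(100)
]
peaks = zero_downcrossings(spectrum, bandwidth=2.0, min_height=1.0)
print(peaks)
```

On real spectra, the bandwidth controls how much noise is suppressed before the derivative is examined, and a principled significance criterion would replace the fixed height threshold used here.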

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Arsenic / pharmacology*
  • Arsenic Poisoning / diagnosis*
  • Bangladesh
  • Biomarkers
  • Blood Proteins / analysis*
  • Environmental Exposure*
  • Gene Expression Regulation / drug effects*
  • Humans
  • Mass Spectrometry / methods*
  • Models, Statistical
  • Proteomics / methods*

Substances

  • Biomarkers
  • Blood Proteins
  • Arsenic