A pipeline for automated analysis of flow cytometry data: preliminary results on lymphoma sub-type diagnosis

Annu Int Conf IEEE Eng Med Biol Soc. 2009:2009:4945-8. doi: 10.1109/IEMBS.2009.5332710.

Abstract

Flow cytometry (FCM) is a technique, widely used in health research, for measuring cell properties such as phenotype and cytokine expression in up to millions of cells from a sample. Manual FCM data analysis is tedious, subjective, and time-consuming (to the point of impracticality for some data), and it relies on intuition rather than standardized statistical inference. This study proposes a pipeline for automated analysis of FCM data. The pipeline identifies biomarkers that correlate with physiological or pathological conditions and assigns samples to specific physiological or pathological entities. It uses a model-based clustering approach to identify cell populations that share similar biological functions. Support vector machine (SVM) and random forest (RF) classifiers are then used to classify the samples and identify biomarkers associated with disease status. The pipeline was evaluated on samples from lymphoma patients. Preliminary results show more than 90% accuracy in differentiating between some subtypes of lymphoma. The pipeline also finds biologically meaningful biomarkers that differ between lymphoma subtypes.
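
As a rough illustration of the clustering step, the sketch below fits a two-component, one-dimensional Gaussian mixture by expectation-maximization and reduces each sample to one feature: the fraction of its events that fall in the higher-intensity population. The abstract does not specify the mixture model, the markers, or the features used, so the Gaussian form, the function names (`fit_gmm_1d`, `high_population_fraction`), and the synthetic samples `A` and `B` are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative assumption)."""
    means = [min(values), max(values)]  # crude initialisation at the extremes
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each event
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate each component from its weighted events
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # keep the variance strictly positive
            weights[k] = nk / len(values)
    return weights, means, variances


def high_population_fraction(values, model):
    """Fraction of events assigned to the component with the larger mean."""
    weights, means, variances = model
    high = 0 if means[0] > means[1] else 1
    count = 0
    for x in values:
        p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
        if p[high] > p[1 - high]:
            count += 1
    return count / len(values)


# Synthetic, hypothetical data: one marker intensity per event, two samples
# that differ only in the proportion of the high-intensity population.
rng = random.Random(0)


def simulate_sample(n_low, n_high):
    low = [rng.gauss(0.0, 1.0) for _ in range(n_low)]
    high = [rng.gauss(5.0, 1.0) for _ in range(n_high)]
    return low + high


samples = {"A": simulate_sample(400, 100), "B": simulate_sample(150, 350)}

# Fit one mixture on the pooled events, then describe each sample by the
# share of its events in the high-intensity population.
pooled = [x for events in samples.values() for x in events]
model = fit_gmm_1d(pooled)
features = {name: high_population_fraction(events, model)
            for name, events in samples.items()}
print(features)
```

In a fuller pipeline, per-sample fractions like these would form the feature vectors passed to a classifier such as an SVM or random forest; that step is omitted here.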

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Flow Cytometry / methods*
  • Humans
  • Lymphoma / classification*
  • Lymphoma / diagnosis*
  • Statistics as Topic / methods*