Functional clustering algorithm for high-dimensional proteomics data

Halima Bensmail; Buddana Aruna; O John Semmes; Abdelali Haoudi

doi:10.1155/JBB.2005.80

J Biomed Biotechnol. 2005 Jun 30;2005(2):80-6. doi: 10.1155/JBB.2005.80.

Authors

Halima Bensmail¹, Buddana Aruna, O John Semmes, Abdelali Haoudi

Affiliation

¹ Department of Statistics, Operations, and Management Science (SOMS), The University of Tennessee, Knoxville, TN 37996, USA.

Abstract

Clustering proteomics data is a challenging problem for traditional clustering algorithms. Usually, the number of samples is much smaller than the number of protein peaks. A clustering algorithm that does not depend on the number of features (here, the number of peaks) is therefore needed. An innovative hierarchical clustering algorithm may be a good approach. We propose here a new dissimilarity measure for hierarchical clustering, combined with functional data analysis (FDA). We present a specific application of FDA to a high-throughput proteomics study. The performance of the proposed algorithm is compared with that of two popular dissimilarity measures in clustering samples from normal and human T-cell leukemia virus type 1 (HTLV-1)-infected patients.
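
The general idea of combining functional data analysis with hierarchical clustering can be illustrated with a minimal sketch. It does not reproduce the dissimilarity measure proposed in the paper, which the abstract does not specify. As a stand-in, it assumes cubic smoothing splines for the functional representation, Euclidean distance between the smoothed curves, and average linkage. The synthetic spectra and the names `mz`, `smooth_spectra`, and `cluster_curves` are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster


def smooth_spectra(mz, spectra, smoothing):
    """Fit a cubic smoothing spline to each spectrum and evaluate it on the mz grid."""
    return np.array(
        [UnivariateSpline(mz, y, k=3, s=smoothing)(mz) for y in spectra]
    )


def cluster_curves(curves, n_clusters=2):
    """Average-linkage hierarchical clustering of smoothed curves.

    Euclidean distance is an assumed stand-in, not the paper's measure.
    """
    dist = pdist(curves, metric="euclidean")
    tree = linkage(dist, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic example: 10 samples observed at 200 points (fewer samples than features).
rng = np.random.default_rng(0)
mz = np.linspace(0.0, 1.0, 200)                  # hypothetical rescaled m/z axis
peak_a = np.exp(-((mz - 0.3) ** 2) / 0.002)      # group 1: peak near 0.3
peak_b = np.exp(-((mz - 0.7) ** 2) / 0.002)      # group 2: peak near 0.7
noise_sd = 0.1
spectra = np.vstack(
    [peak_a + noise_sd * rng.standard_normal(mz.size) for _ in range(5)]
    + [peak_b + noise_sd * rng.standard_normal(mz.size) for _ in range(5)]
)

# Smoothing factor set to the expected residual sum of squares of the noise.
curves = smooth_spectra(mz, spectra, smoothing=mz.size * noise_sd**2)
labels = cluster_curves(curves, n_clusters=2)
print(labels)
```

Because each sample is treated as a smooth curve rather than as a vector of independent peak intensities, the pairwise distances are computed between functions, and the hierarchical step needs only the resulting sample-by-sample distance matrix, whose size is set by the number of samples rather than the number of peaks.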

Grants and funding

U01 CA085067/CA/NCI NIH HHS/United States