CLIMP: Clustering Motifs via Maximal Cliques with Parallel Computing Design

PLoS One. 2016 Aug 3;11(8):e0160435. doi: 10.1371/journal.pone.0160435. eCollection 2016.

Abstract

A set of conserved binding sites recognized by a transcription factor is called a motif. Motifs can be found by many comparative genomics applications that identify over-represented segments. When numerous putative motifs are predicted from a collection of genome-wide data, their similarity data can be represented as a large graph in which these motifs are connected to one another. An efficient clustering algorithm is then needed to group the motifs that belong together, separate the motifs that belong to different groups, and even discard spurious ones. In this work, a new motif clustering algorithm, CLIMP, is proposed; it uses maximal cliques and is sped up by parallelizing its program. CLIMP is compared with two other high-performance algorithms on three datasets: a synthetic motif dataset from the JASPAR database, a set of putative motifs from a phylogenetic footprinting dataset, and a set of putative motifs from a ChIP dataset. The results demonstrate that CLIMP mostly outperforms the two algorithms on the three datasets for motif clustering, so it can be a useful complement to the clustering procedures in some genome-wide motif prediction pipelines. CLIMP is available at http://sqzhang.cn/climp.html.
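
The abstract does not describe the individual steps of CLIMP, so the following is only a minimal sketch of the general idea of clustering motifs through maximal cliques of a similarity graph, not the authors' implementation. The function names, the similarity scores, the threshold of 0.7, the greedy "largest clique first" assignment, and the rule that singletons count as spurious are all assumptions made for illustration. Maximal cliques are enumerated with the standard Bron-Kerbosch algorithm with pivoting, and parallelization is left out.

```python
def build_graph(similarity, threshold):
    """Connect two motifs when their similarity reaches the (assumed) threshold."""
    graph = {}
    for (a, b), score in similarity.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate all maximal cliques with Bron-Kerbosch and pivoting."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already processed vertices
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pick the pivot with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


def cluster_by_cliques(graph, min_size=2):
    """Greedy assumption: repeatedly take the largest maximal clique
    among the motifs that have not been assigned yet."""
    remaining = {v: set(nbrs) for v, nbrs in graph.items()}
    clusters, spurious = [], []
    while remaining:
        cliques = maximal_cliques(remaining)
        # Largest clique first; ties broken alphabetically for determinism.
        best = min(cliques, key=lambda c: (-len(c), sorted(c)))
        if len(best) >= min_size:
            clusters.append(sorted(best))
        else:
            spurious.extend(sorted(best))
        for v in best:
            del remaining[v]
        for nbrs in remaining.values():
            nbrs -= best
    return clusters, sorted(spurious)


# Hypothetical pairwise similarity scores between six putative motifs.
similarity = {
    ("A", "B"): 0.90, ("A", "C"): 0.85, ("B", "C"): 0.80,
    ("C", "D"): 0.75, ("D", "E"): 0.90, ("E", "F"): 0.20,
}
graph = build_graph(similarity, threshold=0.7)
clusters, spurious = cluster_by_cliques(graph)
print(clusters, spurious)
```

In this toy graph, motifs A, B, and C form one clique, D and E form another once C has been assigned, and F has no edge above the threshold, so it ends up in the spurious list. Any real pipeline would need a motif similarity measure and a threshold chosen for its own data.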

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Computational Biology / methods*
  • Computer Simulation
  • Nucleotide Motifs*
  • Sequence Alignment / methods
  • Sequence Alignment / statistics & numerical data
  • Sequence Analysis, DNA* / methods
  • Sequence Analysis, DNA* / statistics & numerical data
  • Software*

Grants and funding

The publication of this article has been funded by two grants (61572358 to SZ, 61273228 to YC) from the National Natural Science Foundation of China and two grants (15JCYBJC46600 and 16JCYBJC23600 to SZ) from the Natural Science Foundation of Tianjin.