Constraint neighborhood projections for semi-supervised clustering

IEEE Trans Cybern. 2014 May;44(5):636-43. doi: 10.1109/TCYB.2013.2263383.

Abstract

Semi-supervised clustering aims to incorporate known prior knowledge into the clustering algorithm. Pairwise constraints and constraint projections are two popular techniques in semi-supervised clustering. However, both consider only the given constraints and ignore the neighbors of the constrained data points. This paper presents a new technique, denoted constraint neighborhood projections (CNP), that utilizes the constrained pairs of data points together with their neighbors; it requires fewer labeled data points (constraints) and can naturally deal with constraint conflicts. It includes two steps: 1) the constraint neighbors are chosen according to the pairwise constraints and a given radius, so that the pairwise constraint relationships can be extended to their neighbors, and 2) the original data points are projected into a new low-dimensional space learned from the pairwise constraints and their neighbors. A CNP-Kmeans algorithm is developed based on CNP. Extensive experiments on University of California Irvine (UCI) datasets demonstrate the effectiveness of the proposed method. Our study also shows that CNP has some favorable features compared with the previous techniques.
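
The two steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the function names (`extend_constraints`, `learn_projection`, `kmeans`) are hypothetical, neighbors are taken to be all points within Euclidean distance `radius` of a constrained point, and the projection is learned by a generic scatter-difference objective (cannot-link scatter minus must-link scatter, top eigenvectors) rather than the paper's own objective. In this sketch a pair that ends up in both extended sets simply cancels in the scatter difference; whether the paper resolves conflicts this way is not stated in the abstract.

```python
import numpy as np

def extend_constraints(X, pairs, radius):
    """Step 1 (assumed form): pair every point within `radius` of i
    with every point within `radius` of j, for each constraint (i, j)."""
    extended = set()
    for i, j in pairs:
        near_i = np.flatnonzero(np.linalg.norm(X - X[i], axis=1) <= radius)
        near_j = np.flatnonzero(np.linalg.norm(X - X[j], axis=1) <= radius)
        for a in near_i:
            for b in near_j:
                if a != b:
                    extended.add((int(min(a, b)), int(max(a, b))))
    return extended

def scatter(X, pairs):
    """Sum of outer products of pairwise difference vectors."""
    S = np.zeros((X.shape[1], X.shape[1]))
    for a, b in pairs:
        diff = X[a] - X[b]
        S += np.outer(diff, diff)
    return S

def learn_projection(X, must, cannot, dim):
    """Step 2 (assumed objective): keep directions that spread
    cannot-link pairs apart and pull must-link pairs together."""
    M = scatter(X, cannot) - scatter(X, must)
    _, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    return vecs[:, ::-1][:, :dim]        # top `dim` eigenvectors

def kmeans(Z, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-first seeding."""
    centers = [Z[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(Z - c, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dists = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new = np.array([Z[labels == c].mean(axis=0) if np.any(labels == c)
                        else centers[c] for c in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

# Toy data (made up): two classes separated along x, spread widely along y.
X = np.array([[0.0, -5.0], [0.1, -4.9], [0.0, 5.0], [0.1, 5.1],   # class A
              [1.0, -5.0], [1.1, -4.9], [1.0, 5.0], [1.1, 5.1]])  # class B
must = extend_constraints(X, [(0, 2), (4, 6)], radius=0.2)
cannot = extend_constraints(X, [(0, 4), (2, 6)], radius=0.2)
W = learn_projection(X, must, cannot, dim=1)
labels = kmeans(X @ W, k=2)
```

On this toy data, four given constraints grow to sixteen extended ones, and the learned one-dimensional projection aligns with the x axis, so k-means in the projected space separates the two classes even though the raw data vary far more along y.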

Publication types

  • Research Support, Non-U.S. Gov't