Identification and correction of previously unreported spatial phenomena using raw Illumina BeadArray data

Mike L Smith; Mark J Dunning; Simon Tavaré; Andy G Lynch

doi:10.1186/1471-2105-11-208

BMC Bioinformatics. 2010 Apr 27;11:208. doi: 10.1186/1471-2105-11-208.

Authors

Mike L Smith¹, Mark J Dunning, Simon Tavaré, Andy G Lynch

Affiliation

¹ Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK. mike.l.smith@cancer.org.uk

Abstract

Background: A key stage for all microarray analyses is the extraction of feature-intensities from an image. If this step goes wrong, then subsequent preprocessing and processing stages will stand little chance of rectifying the matter. Illumina employ random construction of their BeadArrays, making feature-intensity extraction even more important for the Illumina platform than for other technologies. In this paper we show that using raw Illumina data it is possible to identify, control, and perhaps correct for a range of spatial-related phenomena that affect feature-intensity extraction.

Results: We note that feature intensities can be unnaturally high when in the proximity of a number of phenomena relating either to the images themselves or to the layout of the beads on an array. Additionally, we note that beads neighbour beads of the same type more often than one might expect, which may cause concern in some models of hybridization. We highlight issues in the identification of a bead's location, and in particular how this both affects and is affected by its intensity. Finally, we show that beads can be wrongly identified in the image on either a local or array-wide scale, with obvious implications for data quality.

Conclusions: The image processing issues identified will often pass unnoticed by an analysis of the standard data returned from an experiment. We detail some simple diagnostics that can be implemented to identify problems of this nature, and outline approaches to correcting for such problems. These approaches require access to the raw data from the arrays, not just the summarized data usually returned, making the acquisition of such raw data highly desirable.
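
As an illustration of the kind of simple diagnostic mentioned above, the observation that beads neighbour beads of the same type more often than expected could be checked along the following lines. This is a minimal sketch and not the authors' method: it assumes that bead centre coordinates and bead-type identifiers have already been read from the raw bead-level data, it uses a brute-force nearest-neighbour search that only suits small examples, and all function names are hypothetical. It compares the observed fraction of beads whose nearest neighbour shares their type with the fractions obtained when the type labels are randomly shuffled over the same positions.

```python
import math
import random


def nearest_neighbour_indices(coords):
    """Return, for each bead, the index of its closest other bead (brute force)."""
    nearest = []
    for i, (xi, yi) in enumerate(coords):
        best_j, best_d = -1, math.inf
        for j, (xj, yj) in enumerate(coords):
            if i == j:
                continue
            d = (xi - xj) ** 2 + (yi - yj) ** 2  # squared distance is enough for ranking
            if d < best_d:
                best_j, best_d = j, d
        nearest.append(best_j)
    return nearest


def same_type_fraction(nearest, types):
    """Fraction of beads whose nearest neighbour has the same bead type."""
    matches = sum(types[i] == types[j] for i, j in enumerate(nearest))
    return matches / len(types)


def permutation_test(coords, types, n_perm=999, seed=0):
    """Compare the observed same-type fraction with random relabelling.

    Returns the observed fraction and a one-sided permutation p-value.
    """
    rng = random.Random(seed)
    nearest = nearest_neighbour_indices(coords)  # positions are fixed, so compute once
    observed = same_type_fraction(nearest, types)
    shuffled = list(types)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between position and type
        if same_type_fraction(nearest, shuffled) >= observed:
            at_least += 1
    p_value = (1 + at_least) / (1 + n_perm)
    return observed, p_value


if __name__ == "__main__":
    # Toy data (invented for illustration): two tight pairs of same-type beads.
    toy_coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
    toy_types = ["a", "a", "b", "b"]
    print(permutation_test(toy_coords, toy_types, n_perm=99))
```

A small p-value would suggest that same-type beads sit next to each other more often than random placement would produce. For a real array section, with its very large number of beads, the quadratic search would need replacing with a spatial index.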

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Computational Biology / methods*
Databases, Genetic
Gene Expression Profiling / methods
Oligonucleotide Array Sequence Analysis / methods*

Grants and funding

Cancer Research UK/United Kingdom