Machine Learning and Assay Development for Image-based Phenotypic Profiling of Drug Treatments

Review
In: Assay Guidance Manual [Internet]. Bethesda (MD): Eli Lilly & Company and the National Center for Advancing Translational Sciences; 2004.

Excerpt

High content imaging produces large volumes of data on individual cells. The number of discrete measurements per cell can be in the thousands for highly multiplexed assays. Typically, one of these measurements is used as the assay metric (for example, cell cycle phase, post-translational modification of a protein, reporter expression, or shape change), and one or a few others are used as counter-screen measures (frequently cell death or stress, sometimes activity of an orthogonal pathway). However, machine learning can integrate many features from these rich data sets into a single assay metric, which can enhance assay performance and increase the information content of a screen, allowing a more judicious choice of hits. This chapter reviews the potential benefits of implementing a machine learning approach in screening, including examples where it provides more information or better hit selection than single metrics. Examples of machine learning in screening design are provided that cover a few of the key methods, including regression analyses, decision trees, linear discriminant analyses, and support vector machines. The chapter emphasizes key elements of assay validation, including increasing general reproducibility and robustness through tracking algorithm performance, and linking feature measurements to the underlying biology. The basic role of an assay, to identify perturbations that are most similar to a positive control, is covered in depth. The ability to identify novel phenotypes, such as those encountered in phenotypic profiling or “Cell Painting”, is also presented.
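
As an illustration of how several features might be integrated into one assay metric, the following minimal sketch applies a two-feature Fisher linear discriminant to positive and negative control wells and compares the separation of each single feature with that of the combined score. It is a toy example under stated assumptions, not the chapter's method: the well values are synthetic, the function names (`z_prime`, `fisher_direction`, `project`) are hypothetical, and the Z'-factor is assumed here as the measure of assay performance.

```python
from statistics import mean, stdev


def z_prime(pos, neg):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    return 1 - 3 * (stdev(pos) + stdev(neg)) / abs(mean(pos) - mean(neg))


def fisher_direction(pos, neg):
    """Two-feature Fisher discriminant: w = Sw^-1 (mean_pos - mean_neg)."""
    mp = [mean(col) for col in zip(*pos)]
    mn = [mean(col) for col in zip(*neg)]
    # Pooled within-class scatter matrix [[a, b], [b, d]].
    a = b = d = 0.0
    for rows, m in ((pos, mp), (neg, mn)):
        for x, y in rows:
            dx, dy = x - m[0], y - m[1]
            a += dx * dx
            b += dx * dy
            d += dy * dy
    det = a * d - b * b
    gx, gy = mp[0] - mn[0], mp[1] - mn[1]
    # Closed-form inverse of the symmetric 2x2 scatter matrix, applied
    # to the difference between the class means.
    return ((d * gx - b * gy) / det, (-b * gx + a * gy) / det)


def project(rows, w):
    """Collapse each (feature1, feature2) well to one discriminant score."""
    return [w[0] * x + w[1] * y for x, y in rows]


# Synthetic per-well values for two correlated features (made-up numbers).
neg_wells = [(1.0, 1.2), (2.0, 1.8), (3.0, 3.3), (4.0, 3.9), (5.0, 5.1)]
# Toy positive controls: the same wells shifted by a fixed treatment effect.
pos_wells = [(x + 1.5, y - 0.5) for x, y in neg_wells]

z_f1 = z_prime([p[0] for p in pos_wells], [n[0] for n in neg_wells])
z_f2 = z_prime([p[1] for p in pos_wells], [n[1] for n in neg_wells])

w = fisher_direction(pos_wells, neg_wells)
z_combined = z_prime(project(pos_wells, w), project(neg_wells, w))

print(f"feature 1 Z' = {z_f1:.2f}")
print(f"feature 2 Z' = {z_f2:.2f}")
print(f"combined  Z' = {z_combined:.2f}")
```

In this toy data set neither feature separates the controls on its own, because the well-to-well variation is larger than the treatment shift. The variation is strongly correlated between the two features, however, so a weighted combination cancels most of it and gives a positive Z'-factor.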

Publication types

  • Review