Have Standard Formulas Correcting Correlations for Range Restriction Been Adequately Tested?: Minor Sampling Distribution Quirks Distort Them

Educ Psychol Meas. 2018 Dec;78(6):1021-1055. doi: 10.1177/0013164417736092. Epub 2017 Oct 26.

Abstract

Most study samples show less variability in key variables than their source populations do, most often because of indirect selection into study participation associated with a wide range of personal and circumstantial characteristics. Formulas exist to correct the resulting distortions of population-level correlations. Formula accuracy has been tested using simulated normally distributed data, but empirical data are rarely available for testing. We tested it in a rare data set in which this was possible: the 6-Day Sample, a representative subsample of 1,208 from the Scottish Mental Survey 1947 of cognitive ability in 1936-born Scottish schoolchildren (N = 70,805). 6-Day Sample participants completed a follow-up assessment in childhood and were re-recruited for study at age 77 years. We compared correlations among early-life variables in the full 6-Day Sample with the range-restricted correlations in the later-participating subsample, before and after adjustment for direct and indirect range restriction. Results differed, especially for two highly correlated cognitive tests; neither adjustment reproduced full-sample correlations well, owing to small deviations from the normal distribution in skew and kurtosis. Maximum likelihood estimates did little better. To assess how typical these results were, we simulated sample selection and made similar comparisons using the 42 cognitive ability tests administered in the Minnesota Study of Twins Reared Apart, with very similar results. We discuss problems in developing further adjustments to offset range-restriction distortions and possible approaches to solutions.
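
For readers unfamiliar with this kind of correction, the following is a minimal sketch of one widely used formula for direct range restriction, commonly called Thorndike's Case II. The abstract does not say which specific formulas were tested, so treating this as representative is an assumption, as are all function names, the simulated correlation of 0.6, and the selection cutoff. The sketch covers only the idealized case of bivariate normal data with direct selection on one variable, where the correction is expected to work; it does not reproduce the skew and kurtosis effects the study reports.

```python
import math
import random


def correct_direct_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case II correction for direct selection on x.

    r_c = r * U / sqrt(1 - r^2 + r^2 * U^2), with U = SD_unrestricted / SD_restricted.
    """
    if sd_unrestricted <= 0 or sd_restricted <= 0:
        raise ValueError("standard deviations must be positive")
    u = sd_unrestricted / sd_restricted
    r2 = r_restricted ** 2
    return (r_restricted * u) / math.sqrt(1.0 - r2 + r2 * u ** 2)


def sd(values):
    """Sample standard deviation (n - 1 denominator)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_direct_selection(rho=0.6, n=20000, cutoff=0.0, seed=1):
    """Hypothetical illustration: bivariate normal data, keep cases with x > cutoff."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
    kept = [(x, y) for x, y in zip(xs, ys) if x > cutoff]
    kx = [x for x, _ in kept]
    ky = [y for _, y in kept]
    full_r = pearson(xs, ys)
    restricted_r = pearson(kx, ky)
    corrected_r = correct_direct_range_restriction(restricted_r, sd(xs), sd(kx))
    return full_r, restricted_r, corrected_r


if __name__ == "__main__":
    full_r, restricted_r, corrected_r = simulate_direct_selection()
    print(f"full sample r:      {full_r:.3f}")
    print(f"restricted r:       {restricted_r:.3f}")
    print(f"corrected estimate: {corrected_r:.3f}")
```

Under these simulated normal conditions the corrected estimate should land close to the full-sample correlation. The study's point is that real data with small departures from normality, and with indirect rather than direct selection, may not behave this way.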

Keywords: adjustment formulas; distortion; range restriction; skew; statistical bias; study participation.