Principal components analysis of diet and alternatives for identifying the combination of foods that are associated with the risk of disease: a simulation study

Br J Nutr. 2014 Jul 14;112(1):61-9. doi: 10.1017/S0007114514000221. Epub 2014 Apr 11.

Abstract

Dietary patterns derived empirically using principal components analysis (PCA) are widely employed for investigating diet-disease relationships. In the present study, we investigated whether PCA performed better at identifying such associations than an analysis of each food on a food frequency questionnaire (FFQ) separately, referred to here as an exhaustive single food analysis (ESFA). Data on diet and disease were simulated using real FFQ data and by assuming that a number of food intakes, in combination, were associated with the risk of disease. In each simulation, ESFA and PCA were employed to identify the combinations of foods that are associated with the risk of disease using logistic regression, allowing for multiple testing and adjusting for energy intake. ESFA was also separately adjusted for principal components of diet, for foods that were significant in the unadjusted ESFA, and for propensity scores. For each method, we investigated the power with which an association between diet and disease could be identified, and the power and false discovery rate (FDR) for identifying the specific combination of food intakes. In some scenarios, ESFA had greater power to detect a diet-disease association than PCA. ESFA also typically had greater power and a lower FDR for identifying the combinations of food intakes that are associated with the risk of disease. The FDR of both methods increased with increasing sample size, but when ESFA was adjusted for foods that were significant in the unadjusted ESFA, the FDR was controlled at the desired level. These results question the widespread use of PCA in nutritional epidemiology. The adjusted ESFA identifies the combinations of foods that are causally linked to the risk of disease with low FDR and surprisingly good power.
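
The two performance measures named in the abstract, power and FDR for identifying the specific foods, can be illustrated for a single simulated dataset with a minimal Python sketch. The abstract does not say which multiple-testing procedure was used or exactly how power and FDR were computed, so the Benjamini-Hochberg rule, the per-simulation definitions, the function names, and the p-values below are all assumptions chosen for illustration; they are not the authors' method or data.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices rejected by the Benjamini-Hochberg step-up rule.

    Assumption: the abstract only says the analysis 'allowed for multiple
    testing'; this procedure is a stand-in, not the one the study used.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff = rank
    return set(order[:cutoff])


def power_and_fdp(selected, truly_associated):
    """Summarise one simulation (assumed definitions).

    power: share of truly associated foods that were selected.
    fdp:   share of selected foods that are not truly associated
           (averaging this over simulations estimates the FDR).
    """
    selected = set(selected)
    truly_associated = set(truly_associated)
    true_hits = len(selected & truly_associated)
    power = true_hits / len(truly_associated) if truly_associated else 0.0
    fdp = (len(selected) - true_hits) / len(selected) if selected else 0.0
    return power, fdp


# Hypothetical p-values from six single-food logistic regressions.
p_values = [0.001, 0.004, 0.30, 0.012, 0.60, 0.85]
# Hypothetical ground truth: foods 0, 1 and 2 drive disease risk.
truly_associated = {0, 1, 2}

selected = benjamini_hochberg(p_values, alpha=0.05)
power, fdp = power_and_fdp(selected, truly_associated)
print(sorted(selected), round(power, 3), round(fdp, 3))
```

In this made-up example, foods 0, 1 and 3 are selected, so two of the three truly associated foods are found and one of the three selections is false. Repeating the calculation over many simulated datasets and averaging would give estimates of the kind the abstract reports.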

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Chronic Disease / epidemiology
  • Chronic Disease / prevention & control*
  • Computer Simulation
  • Databases, Factual
  • Diet* / adverse effects
  • Energy Intake
  • Female
  • Functional Food*
  • Health Promotion*
  • Humans
  • Male
  • Middle Aged
  • Models, Biological*
  • Nutrition Policy*
  • Nutrition Surveys
  • Principal Component Analysis
  • Risk Factors
  • United Kingdom / epidemiology
  • Young Adult