Comparison of cluster-based and source-attribution methods for estimating transmission risk using large HIV sequence databases

Epidemics. 2018 Jun:23:1-10. doi: 10.1016/j.epidem.2017.10.001. Epub 2017 Oct 20.

Abstract

Phylogenetic clustering of HIV sequences from a random sample of patients can reveal epidemiological transmission patterns, but interpretation is hampered by limited theoretical support, and the statistical properties of clustering analysis remain poorly understood. Alternatively, source attribution methods allow fitting of HIV transmission models and thereby quantify aspects of disease transmission. A simulation study was conducted to assess error rates of clustering methods for detecting transmission risk factors. We modeled HIV epidemics among men having sex with men and generated phylogenies comparable to those that can be obtained from HIV surveillance data in the UK. Clustering and source attribution approaches were applied to evaluate their ability to identify patient attributes as transmission risk factors. We find that commonly used methods show a misleading association between cluster size or odds of clustering and covariates that are correlated with time since infection, regardless of their influence on transmission. Clustering methods usually have higher error rates and lower sensitivity than the source attribution method for identifying transmission risk factors. However, neither method provides robust estimates of transmission risk ratios. Source attribution methods can alleviate the drawbacks of phylogenetic clustering, but formal population genetic modeling may be required to estimate quantitative transmission risk factors.
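
As a minimal illustration of the "odds of clustering" measure mentioned in the abstract, the sketch below computes an odds ratio from a 2x2 table of clustered versus unclustered patients, split by a binary covariate. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the study, and this is not the authors' analysis pipeline.

```python
def clustering_odds_ratio(clustered_with, unclustered_with,
                          clustered_without, unclustered_without):
    """Odds ratio of clustering for patients with vs. without a covariate.

    All four arguments are counts from a 2x2 table (hypothetical layout):
        clustered_with      : patients with the covariate who are in a cluster
        unclustered_with    : patients with the covariate who are not
        clustered_without   : patients without the covariate who are in a cluster
        unclustered_without : patients without the covariate who are not
    """
    # The ratio is undefined when either denominator cell is zero.
    if unclustered_with == 0 or clustered_without == 0:
        raise ValueError("odds ratio undefined: zero count in a denominator cell")
    # (a / b) / (c / d) rewritten as (a * d) / (b * c).
    return (clustered_with * unclustered_without) / (
        unclustered_with * clustered_without
    )


# Hypothetical counts: 30 of 100 patients with the covariate cluster,
# versus 10 of 100 patients without it.
example_or = clustering_odds_ratio(30, 70, 10, 90)
```

An odds ratio above 1 computed this way shows only that the covariate is associated with clustering. As the abstract notes, such an association can arise for covariates correlated with time since infection even when they have no influence on transmission.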

Keywords: Cluster analysis; Computer simulation; HIV epidemiology; Phylodynamics; Phylogenetic analysis.

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Cluster Analysis
  • Computer Simulation*
  • Databases, Factual / statistics & numerical data*
  • HIV Infections / epidemiology*
  • HIV Infections / transmission
  • Homosexuality, Male / statistics & numerical data
  • Humans
  • Male
  • Middle Aged
  • Phylogeny
  • Reproducibility of Results
  • Risk Factors
  • Sexual and Gender Minorities / statistics & numerical data
  • United Kingdom
  • Young Adult