Profiles of tobacco smokers and ex-smokers in a large-scale random sample survey across Wales: an unsupervised machine-learning cluster analysis

Lancet. 2023 Nov:402 Suppl 1:S7. doi: 10.1016/S0140-6736(23)02070-6.

Abstract

Background: The Welsh Government recently set a target for Wales to be smoke-free by 2030, which means reducing the prevalence of tobacco smoking in adults to 5% by then. The goal is to improve health and population life expectancy. To support this strategy, we identified profile groups with different sets of socioeconomic and demographic characteristics within the population of smokers. We compared these profiles with those identified in the ex-smoker population to provide a broader understanding of smokers and to inform the targeting of interventions and policy.

Methods: We did a cross-sectional study using data from the National Survey for Wales, a random-sample telephone survey of individuals aged 16 years and older across Wales, carried out from Sept 1, 2021, to Jan 31, 2022, and weighted to be representative of the Welsh population. For the subgroup of smokers, we did a weighted hierarchical cluster analysis, using multiple imputation for missing data, and repeated the analysis for ex-smokers. In total, 63 survey variables were used in the analysis. These variables included smoking history, e-cigarette use, sociodemographics, lifestyle factors, individual-level deprivation, general health and long-term conditions, mental health, and wellbeing.
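As an illustration of the kind of weighted hierarchical clustering described in the Methods, the following is a minimal sketch and not the authors' implementation. It assumes a Gower dissimilarity for mixed numeric and categorical variables and a survey-weighted average linkage. Neither choice is stated in the abstract. Missing values are simply skipped in the distance, so multiple imputation is omitted. The records, weights, and variable names are hypothetical toy data.

```python
from itertools import combinations


def gower(a, b, ranges):
    """Gower dissimilarity between two records with mixed variable types.

    ranges[k] is the numeric range of variable k, or None if it is categorical.
    Variables missing (None) in either record are skipped (no imputation here).
    """
    total, used = 0.0, 0
    for x, y, r in zip(a, b, ranges):
        if x is None or y is None:
            continue
        if r is None:                      # categorical: simple mismatch
            total += 0.0 if x == y else 1.0
        else:                              # numeric: range-scaled difference
            total += abs(x - y) / r if r else 0.0
        used += 1
    return total / used if used else 1.0


def weighted_average_linkage(rows, weights, ranges, k):
    """Agglomerative clustering: merge the closest pair until k clusters remain.

    Cluster-to-cluster distance is the mean pairwise Gower dissimilarity,
    with each pair weighted by the product of the two survey weights
    (an assumed way of bringing case weights into the linkage).
    """
    n = len(rows)
    d = {(i, j): gower(rows[i], rows[j], ranges)
         for i, j in combinations(range(n), 2)}

    def dist(A, B):
        num = sum(weights[i] * weights[j] * d[min(i, j), max(i, j)]
                  for i in A for j in B)
        den = sum(weights[i] for i in A) * sum(weights[j] for j in B)
        return num / den

    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]   # a < b, so index a stays valid
        del clusters[b]
    return clusters


# Hypothetical toy records: (age, housing tenure, tried e-cigarettes)
rows = [
    (22, "rent", True),
    (25, "rent", True),
    (24, "rent", None),      # missing e-cigarette answer
    (68, "own", False),
    (71, "own", False),
    (66, "social", False),
]
weights = [1.2, 0.8, 1.0, 0.9, 1.1, 1.0]   # hypothetical survey weights
ranges = [71 - 22, None, None]             # age is numeric, the rest categorical

clusters = weighted_average_linkage(rows, weights, ranges, k=2)
print(sorted(sorted(c) for c in clusters))
```

On this toy input the three younger renters and the three older respondents end up in separate clusters. A real analysis with 63 variables and hundreds of respondents would normally use an established statistical package rather than this quadratic-memory sketch.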

Findings: Among the 6407 respondents (weighted proportions: 49% male, 51% female; 28% aged 16-34 years, 46% aged 35-64 years, 26% aged ≥65 years; 95% white, 5% other ethnicity), 841 (13%) smoked and 2136 (33%) were ex-smokers. Four distinctive profiles of smokers were identified. The groups were of relatively comparable size and, when compared with each other, were characterised as (1) high-risk alcohol drinkers without children; (2) single, mostly in social housing, with poor health and mental health; (3) mostly single, younger, had tried e-cigarettes, with poor mental health; and (4) older couples with poor health. Cluster quality and validation statistics were considered fair: silhouette coefficient=0·09, Dunn index (Dunn2)=1·06. Generally, ex-smoker clusters differed from smoker clusters in themes related to increased sickness, greater affluence, employment, and older age (≥75 years).
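The two validation statistics reported in the Findings can be computed from a dissimilarity matrix and a set of cluster labels. The sketch below is a hedged illustration rather than the authors' code. It assumes that "Dunn2" means the average-based variant (smallest mean between-cluster dissimilarity divided by largest mean within-cluster dissimilarity), which the abstract does not define, and it uses unweighted means. The four points are hypothetical. The silhouette coefficient ranges from -1 to 1, with higher values indicating better-separated clusters.

```python
from itertools import combinations
from statistics import mean


def group_indices(labels):
    """Map each cluster label to the list of member indices."""
    groups = {}
    for i, g in enumerate(labels):
        groups.setdefault(g, []).append(i)
    return groups


def silhouette(dist, labels):
    """Mean silhouette width; dist[i][j] is a symmetric dissimilarity matrix."""
    groups = group_indices(labels)
    widths = []
    for i, g in enumerate(labels):
        own = groups[g]
        if len(own) == 1:
            widths.append(0.0)             # usual convention for singletons
            continue
        a = mean(dist[i][j] for j in own if j != i)          # cohesion
        b = min(mean(dist[i][j] for j in members)            # nearest other cluster
                for other, members in groups.items() if other != g)
        widths.append((b - a) / max(a, b))
    return mean(widths)


def dunn2(dist, labels):
    """Assumed definition: min mean between-cluster / max mean within-cluster."""
    groups = list(group_indices(labels).values())
    between = min(mean(dist[i][j] for i in A for j in B)
                  for A, B in combinations(groups, 2))
    within = max(mean(dist[i][j] for i, j in combinations(G, 2))
                 for G in groups if len(G) > 1)
    return between / within


# Hypothetical one-dimensional toy data with two well-separated clusters
points = [0.0, 1.0, 10.0, 11.0]
labels = [0, 0, 1, 1]
dist = [[abs(p - q) for q in points] for p in points]

sil = silhouette(dist, labels)
d2 = dunn2(dist, labels)
print(round(sil, 3), d2)
```

In practice the same matrix of pairwise dissimilarities that drives the clustering would be passed to both functions, so the validation reflects the metric actually used to form the clusters.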

Interpretation: This study suggests that not all smokers are the same and that they do not fall into one coherent group. Smoking cessation interventions to improve the health of ageing populations might need a different approach that considers the wider context or motivations of smokers to inform targeted support for quitting. It is acknowledged that smoking might be underreported because of perceived social unacceptability.

Funding: Public Health Wales.

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Cluster Analysis
  • Cross-Sectional Studies
  • Electronic Nicotine Delivery Systems*
  • Ex-Smokers
  • Female
  • Humans
  • Machine Learning
  • Male
  • Middle Aged
  • Smokers
  • Smoking Cessation*
  • Surveys and Questionnaires
  • Wales / epidemiology
  • Young Adult