Epicosm: a framework for linking online social media in epidemiological cohorts

Int J Epidemiol. 2023 Jun 6;52(3):952-957. doi: 10.1093/ije/dyad020.

Abstract

Motivation: Social media represent an unrivalled opportunity for epidemiological cohorts to collect large amounts of high-resolution time course data on mental health. Equally, the high-quality data held by epidemiological cohorts could greatly benefit social media research as a source of ground truth for validating digital phenotyping algorithms. However, there is currently a lack of software for linking social media data to cohort data in a secure and acceptable manner. We worked with cohort leaders and participants to co-design an open-source, robust and expandable software framework for gathering social media data in epidemiological cohorts.

Implementation: Epicosm is implemented as a Python framework that is straightforward to deploy and run inside a cohort's data safe haven.

General features: The software regularly gathers Tweets from a list of accounts and stores them in a database for linking to existing cohort data.
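
The gather-and-store loop described above can be sketched roughly as follows. This is a minimal illustration and not Epicosm's actual code or API: the SQLite backend, the table layout, the record shape, and the names `init_db`, `harvest` and `fake_fetch` are all assumptions made for the example, and the stand-in fetch function replaces any real Twitter client.

```python
import sqlite3
from typing import Callable, Iterable

# Hypothetical record shape: (tweet_id, account, created_at, text).
Tweet = tuple[str, str, str, str]


def init_db(conn: sqlite3.Connection) -> None:
    """Create the tweet table if it does not already exist."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tweets (
               tweet_id   TEXT PRIMARY KEY,
               account    TEXT NOT NULL,
               created_at TEXT NOT NULL,
               text       TEXT NOT NULL
           )"""
    )
    conn.commit()


def harvest(
    conn: sqlite3.Connection,
    accounts: Iterable[str],
    fetch: Callable[[str], Iterable[Tweet]],
) -> int:
    """Fetch tweets for each account and store those not seen before.

    Returns the number of newly stored tweets. The primary key on
    tweet_id plus INSERT OR IGNORE makes repeated runs idempotent.
    """
    before = conn.total_changes
    for account in accounts:
        for tweet in fetch(account):
            conn.execute(
                "INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?)", tweet
            )
    conn.commit()
    return conn.total_changes - before


def fake_fetch(account: str) -> list[Tweet]:
    """Stand-in for a real API client (assumption, for illustration only)."""
    return [(f"{account}-1", account, "2023-01-01T00:00:00Z", "hello")]


conn = sqlite3.connect(":memory:")
init_db(conn)
first = harvest(conn, ["alice", "bob"], fake_fetch)   # initial collection
second = harvest(conn, ["alice", "bob"], fake_fetch)  # repeat run adds nothing
stored = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
```

In a sketch like this, "regularly" would be handled by calling `harvest` on a schedule (for example from a cron job), and linkage to cohort data would be a later join between the `account` column and a participant identifier table held inside the safe haven.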

Availability: This open-source software is freely available at https://dynamicgenetics.github.io/Epicosm/.

Keywords: ALSPAC; Big Data; Social media; cohort studies; data linkage; data science; epidemiology; longitudinal studies; mental health; wellbeing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Data Accuracy
  • Databases, Factual
  • Humans
  • Social Media*
  • Software