GET WELL: an automated surveillance system for gaining new epidemiological knowledge

BMC Public Health. 2011 Apr 21;11:252. doi: 10.1186/1471-2458-11-252.

Abstract

Background: The assumption behind the presented work is that the information people search for on the internet reflects the disease status in society. With access to this source of information, epidemiologists gain a valuable complement to traditional surveillance and can potentially obtain new and timely epidemiological insights. For this purpose, the Swedish Institute for Infectious Disease Control collaborates with a medical web site in Sweden.

Methods: We built an application consisting of two conceptual parts. The first part allows trends, based on user-specified requests, to be extracted from anonymous web query data from a Swedish medical web site. The second part permits tailored analyses of particular diseases, in which more complex statistical methods are applied to the data. To evaluate the epidemiological relevance of the output, we compared Google search data with search data from the medical web site.
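
The trend-extraction part described above can be illustrated with a minimal sketch. It is not the authors' implementation; it assumes, purely for illustration, that each anonymous query is available as a (date, query text) pair and that a trend is the weekly share of queries containing a user-specified term. The function name `weekly_share` and the record layout are hypothetical.

```python
from collections import Counter
from datetime import date


def weekly_share(records, term):
    """Return {(iso_year, iso_week): share of queries containing `term`}.

    `records` is an iterable of (date, query_text) pairs. This layout is an
    assumption made for the sketch, not the format used by the real system.
    """
    totals = Counter()  # all queries per ISO week
    hits = Counter()    # queries matching the term per ISO week
    needle = term.lower()
    for day, query in records:
        iso = day.isocalendar()
        week = (iso[0], iso[1])  # (ISO year, ISO week number)
        totals[week] += 1
        if needle in query.lower():
            hits[week] += 1
    # Normalising by the weekly total removes changes in overall site traffic.
    return {week: hits[week] / totals[week] for week in sorted(totals)}
```

Dividing by the total number of queries per week, rather than reporting raw counts, is a design choice in this sketch: it keeps a general rise in site traffic from being mistaken for a rise in interest in one disease.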

Results: In the paper, we give concrete examples of the output from the web query-based system. We also present results from the comparison between data from the search engine Google and search data from the national medical web site.

Conclusions: The application is in regular use at the Swedish Institute for Infectious Disease Control. A system based on web queries is flexible in that it can be adapted to any disease; it provides information on individuals other than those who seek medical care; and the data do not suffer from reporting delays. Although Google data are based on a substantially larger search volume, search patterns obtained from the medical web site may still convey more information from an epidemiological perspective. Furthermore, we see advantages in having full access to the raw data.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Academies and Institutes / organization & administration
  • Communicable Disease Control / standards*
  • Epidemiology*
  • Health Promotion*
  • Humans
  • Influenza, Human / diagnosis
  • Influenza, Human / physiopathology
  • Information Storage and Retrieval / statistics & numerical data*
  • Information Storage and Retrieval / trends
  • Internet / statistics & numerical data*
  • Medical Informatics Applications*
  • Population Surveillance / methods*
  • Search Engine / statistics & numerical data*
  • Seasons
  • Software
  • Sweden
  • Terminology as Topic
  • Time and Motion Studies
  • Vomiting / diagnosis
  • Vomiting / physiopathology