Title, abstract, and keyword searching resulted in poor recovery of articles in systematic reviews of epidemiologic practice

J Clin Epidemiol. 2020 May:121:55-61. doi: 10.1016/j.jclinepi.2020.01.009. Epub 2020 Jan 23.

Abstract

Objective: Article full texts are often not searchable via the standard search engines of biomedical literature, such as PubMed and Embase, which are commonly used for systematic reviews. Excluding the full-text bodies from a literature search may result in a small or selective subset of articles being included in the review, because title, abstract, and keywords alone carry limited information. This article describes a comparison of search strategies based on a systematic literature review of all articles published in five top-ranked epidemiology journals between 2000 and 2017.

Study design and setting: Based on a text-mining approach, we studied how nine different methodological topics were mentioned across text fields (title, abstract, keywords, and text body). The following methodological topics were studied: propensity score methods, inverse probability weighting, marginal structural modeling, multiple imputation, Kaplan-Meier estimation, number needed to treat, measurement error, randomized controlled trial, and latent class analysis.
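The field-by-field comparison described above can be illustrated with a minimal sketch. It assumes each article has already been split into four text fields and that each topic is detected with a single regular expression. The patterns, the `mentions_by_field` helper, and the sample article are hypothetical; the abstract does not report the study's actual search terms or software.

```python
import re

# Hypothetical patterns for three of the nine topics; the study's actual
# search terms are not given in the abstract.
TOPIC_PATTERNS = {
    "propensity score methods": re.compile(r"propensity[\s-]+scor", re.IGNORECASE),
    "multiple imputation": re.compile(r"multipl[ey][\s-]+imput", re.IGNORECASE),
    "Kaplan-Meier estimation": re.compile(r"kaplan[\s-]+meier", re.IGNORECASE),
}

FIELDS = ("title", "abstract", "keywords", "body")


def mentions_by_field(article):
    """Map each topic to the set of text fields in which it is mentioned."""
    return {
        topic: {f for f in FIELDS if pattern.search(article.get(f, ""))}
        for topic, pattern in TOPIC_PATTERNS.items()
    }


# Made-up article used only to exercise the function.
article = {
    "title": "Statin use and hip fracture risk",
    "abstract": "We followed a cohort of older adults.",
    "keywords": "cohort studies; statins",
    "body": (
        "Confounding was addressed with propensity score matching. "
        "Survival curves were Kaplan-Meier estimates."
    ),
}

result = mentions_by_field(article)
```

In this made-up case both detected topics appear only in the text body, so a search restricted to title, abstract, and keywords would miss the article for either topic.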

Results: In total, 31,641 Hypertext Markup Language (HTML) files were downloaded from the journals' websites. For all methodological topics and journals, at most 50% of articles with a mention of a topic in the text body also mentioned the topic in the title, abstract, or keywords. For several topics, reporting in the title, abstract, or keywords decreased gradually over calendar time.
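The proportion reported above (articles mentioning a topic in the title, abstract, or keywords, among those mentioning it in the text body) can be computed as in the following sketch. The `tak_recovery` function name and the input data are illustrative assumptions, not the study's code or results.

```python
def tak_recovery(records):
    """Share of articles with a body mention of a topic that also mention
    it in the title, abstract, or keywords.

    Each record is the set of fields in which one article mentions the topic.
    Returns None when no article mentions the topic in its body.
    """
    tak = {"title", "abstract", "keywords"}
    body_hits = [r for r in records if "body" in r]
    if not body_hits:
        return None
    recovered = sum(1 for r in body_hits if r & tak)
    return recovered / len(body_hits)


# Made-up data for one topic: five articles, four with a body mention.
records = [
    {"body"},
    {"body", "abstract"},
    {"body"},
    {"title"},
    {"body", "keywords", "title"},
]

share = tak_recovery(records)  # 2 of the 4 body-mentioning articles: 0.5
```

A value of 0.5 here would correspond to the upper bound reported in the results: half of the articles that use a method in the body would be found by a title, abstract, and keyword search.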

Conclusion: Literature searches based on title, abstract, and keywords alone may not be sufficiently sensitive for studies of epidemiological research practice. This study also illustrates the potential value of full-text literature searches, provided that full-text bodies are accessible for searching.

Keywords: Bibliometrics; Epidemiological methods; Statistical methods; Systematic literature review; Text mining.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Abstracting and Indexing / methods*
  • Basic Reproduction Number
  • Data Mining / methods
  • Humans
  • Hypermedia
  • Information Storage and Retrieval / methods*
  • Information Storage and Retrieval / statistics & numerical data
  • Kaplan-Meier Estimate
  • Probability
  • Propensity Score
  • Randomized Controlled Trials as Topic
  • Systematic Reviews as Topic*