Supervised learning for infection risk inference using pathology data

BMC Med Inform Decis Mak. 2017 Dec 8;17(1):168. doi: 10.1186/s12911-017-0550-1.

Abstract

Background: Antimicrobial resistance is threatening our ability to treat common infectious diseases, and the overuse of antimicrobials to treat human infections in hospitals is accelerating this process. Clinical Decision Support Systems (CDSSs) have been proven to enhance quality of care by promoting change in prescription practices through antimicrobial selection advice. However, bypassing an initial assessment to determine the existence of an underlying disease that justifies the need for antimicrobial therapy might lead to indiscriminate and often unnecessary prescriptions.

Methods: From pathology laboratory tests, six biochemical markers were selected and combined with microbiology outcomes from susceptibility tests to create a unique dataset with over one and a half million daily profiles to perform infection risk inference. Outliers were discarded using the inter-quartile range rule, and several sampling techniques were studied to tackle the class imbalance problem. Model development proceeded in two phases. The first phase selected the most effective and robust model during training using ten-fold stratified cross-validation. The second phase evaluated the final model, after isotonic calibration, in scenarios with missing inputs and imbalanced class distributions.
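
As an illustration of two of the preprocessing steps named above, the following is a minimal sketch of the inter-quartile range rule and of a stratified fold assignment. It is not the authors' code. The fence multiplier k=1.5 is the conventional choice and is an assumption here, since the abstract does not state it; the function names and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences of the inter-quartile range rule.

    k=1.5 is the conventional multiplier (an assumption, not from the paper).
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the values that lie inside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if lower <= v <= upper]


def stratified_folds(labels, n_folds=10):
    """Assign sample indices to folds so each fold keeps a similar class ratio.

    Indices of each class are dealt round-robin across the folds.
    A real pipeline would usually shuffle within each class first.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


# Hypothetical marker readings with one extreme value.
readings = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
cleaned = drop_outliers(readings)

# Hypothetical imbalanced labels: 90 non-infected (0), 10 infected (1).
labels = [0] * 90 + [1] * 10
folds = stratified_folds(labels, n_folds=10)
```

With these sample inputs, the extreme reading falls outside the upper fence and is discarded, and every fold receives the same share of the minority class, which is the property that makes stratification useful under class imbalance.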

Results: More than 50% of infected profiles had laboratory tests requested daily for the six biochemical markers. Infection inference results were very promising: area under the receiver operating characteristic curve of 0.80-0.83, sensitivity of 0.64-0.75 and specificity of 0.92-0.97. Standardization consistently outperformed normalization, and sensitivity was enhanced by using the SMOTE sampling technique. Furthermore, models operated without noticeable loss in performance if at least four biomarkers were available.
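
For readers less familiar with the reported measures, the sketch below shows how sensitivity, specificity and the area under the ROC curve can be computed, and how standardization differs from normalization. It is a minimal illustration, not the study's evaluation code. It assumes that "normalization" means min-max scaling to [0, 1], which the abstract does not confirm; all function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = infected."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win. Quadratic in sample size, so this form
    is only suitable for small illustrative inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale to the [0, 1] range (assumed meaning of 'normalization')."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical labels, hard predictions and risk scores.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.3, 0.1, 0.2, 0.4, 0.35]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the decision threshold that turns scores into predictions, whereas the AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three.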

Conclusion: The selected biomarkers contain enough information to perform infection risk inference with a high degree of confidence, even in the presence of incomplete and imbalanced data. Since they are commonly available in hospitals, Clinical Decision Support Systems could benefit from these findings to assist clinicians in deciding whether or not to initiate antimicrobial therapy, and thereby improve prescription practices.

Keywords: Antimicrobial resistance; Behaviour change; Biochemical markers; Decision support; Infection; Machine learning; Predictive modelling; Supervised learning.

MeSH terms

  • Anti-Infective Agents*
  • Biomarkers*
  • Decision Support Systems, Clinical* / statistics & numerical data
  • Drug Resistance, Microbial*
  • Humans
  • Risk Assessment / methods*
  • Risk Assessment / statistics & numerical data
  • Support Vector Machine*

Substances

  • Anti-Infective Agents
  • Biomarkers