Risk controlled decision trees and random forests for precision medicine

Stat Med. 2022 Feb 20;41(4):719-735. doi: 10.1002/sim.9253. Epub 2021 Nov 16.

Abstract

Statistical methods generating individualized treatment rules (ITRs) often focus on maximizing expected benefit, but these rules may expose patients to excess risk. For instance, aggressive treatment of type 2 diabetes (T2D) with insulin therapies may result in an ITR that controls blood glucose levels but increases rates of hypoglycemia, diminishing the appeal of the ITR. This work proposes two methods to identify risk-controlled ITRs (rcITRs), a class of ITRs that maximize a benefit while controlling risk at a prespecified threshold. A novel penalized recursive partitioning algorithm is developed that optimizes an unconstrained, penalized value function. The final rule is a risk-controlled decision tree (rcDT) that is easily interpretable. A natural extension of the rcDT model, risk-controlled random forests (rcRF), is also proposed. Simulation studies demonstrate the robustness of rcRF modeling. Three variable importance measures are proposed to further guide clinical decision-making. Both rcDT and rcRF procedures can be applied to data from randomized controlled trials or observational studies. An extensive simulation study interrogates the performance of the proposed methods. An analysis of data from the DURABLE diabetes trial, which compared two therapeutics, is also presented. An R package implements the proposed methods (https://github.com/kdoub5ha/rcITR).
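
To make the idea of an unconstrained, penalized value function concrete, the following is a minimal Python sketch of how a candidate treatment rule might be scored. It is not the authors' implementation and does not use the rcITR package. The inverse-probability-weighted estimator, the Lagrangian-style penalty form, and the names `ipw_value`, `penalized_value`, `tau` (risk threshold), and `lam` (penalty weight) are all assumptions made for illustration.

```python
import numpy as np


def ipw_value(outcome, treatment, rule, propensity):
    """Inverse-probability-weighted mean outcome if everyone followed `rule`.

    Hypothetical helper: `treatment` and `rule` are 0/1 arrays, and
    `propensity` is the probability of receiving treatment 1.
    """
    outcome = np.asarray(outcome, dtype=float)
    treatment = np.asarray(treatment)
    rule = np.asarray(rule)
    propensity = np.asarray(propensity, dtype=float)
    # Probability of the treatment each patient actually received.
    p_received = np.where(treatment == 1, propensity, 1.0 - propensity)
    # Only patients whose received treatment agrees with the rule contribute.
    weights = (treatment == rule) / p_received
    return np.sum(weights * outcome) / np.sum(weights)


def penalized_value(benefit, risk, treatment, rule, propensity, tau, lam):
    """Assumed penalized objective: benefit value minus lam * (risk value - tau)."""
    v_benefit = ipw_value(benefit, treatment, rule, propensity)
    v_risk = ipw_value(risk, treatment, rule, propensity)
    return v_benefit - lam * (v_risk - tau)


# Toy data from a hypothetical 1:1 randomized trial (propensity 0.5).
benefit = [1.0, 2.0, 3.0, 4.0]
risk = [0.0, 1.0, 0.0, 1.0]
treatment = [1, 0, 1, 0]
propensity = [0.5, 0.5, 0.5, 0.5]
rule = [1, 1, 0, 0]  # candidate rule, e.g. the output of one tree split

score = penalized_value(benefit, risk, treatment, rule, propensity,
                        tau=0.25, lam=2.0)
```

Under these assumptions, a recursive partitioning procedure could compare candidate splits by this score, so that a rule with high expected benefit is penalized when its estimated risk exceeds `tau`. The sketch assumes at least one patient received the treatment the rule recommends; otherwise the weights sum to zero.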

Keywords: decision trees; precision medicine; random forests; risk control; variable importance.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Computer Simulation
  • Decision Trees
  • Diabetes Mellitus, Type 2* / drug therapy
  • Humans
  • Precision Medicine* / methods