Real-world data on diffuse large B-cell lymphoma in 2010-2019: usability of large data sets of Finnish hospital data lakes

Future Oncol. 2022 Mar;18(9):1103-1114. doi: 10.2217/fon-2021-0806. Epub 2022 Feb 3.

Abstract

Background: Real-world data on diffuse large B-cell lymphoma (DLBCL) have remained incomplete. In Finland, health record data originally recorded in different hospital data record systems are collectively available via data lake technology, enabling efficient extraction and analysis of large data sets. The usability of Finnish data lake data in the assessment of DLBCL was evaluated. Methods: Adult DLBCL patients diagnosed between 2010 and 2019, with a home municipality in the Hospital District of Southwest Finland and data available in the respective data lake, were included. Results: The algorithmic determination of treatment lines and respective survival was successful. Patient characterization was feasible, albeit partly incomplete because of limited data content/availability and coverage. Stage, International Prognostic Index and cell of origin were available for 63.0%, 68.3% and 28.4% of patients, respectively. Genetic aberrations were not structurally available or feasible to extract without a manual chart review. Conclusion: Finnish data lakes represent an efficient way to analyze large DLBCL data sets. The current study provides a tool for developing recording practices in routine care.

Keywords: data mining; diffuse large B-cell lymphoma; electronic health record data; hospital data lake; real-world data/evidence.

MeSH terms

  • Aged
  • Aged, 80 and over
  • Algorithms
  • Antineoplastic Agents / therapeutic use*
  • Electronic Health Records
  • Female
  • Finland / epidemiology
  • Hospitals
  • Humans
  • Lymphoma, Large B-Cell, Diffuse / drug therapy
  • Lymphoma, Large B-Cell, Diffuse / epidemiology*
  • Lymphoma, Large B-Cell, Diffuse / mortality
  • Male
  • Middle Aged
  • Registries*
  • Retrospective Studies
  • Survival Analysis

Substances

  • Antineoplastic Agents