Sharing behavioral data through a grid infrastructure using data standards

J Am Med Inform Assoc. 2014 Jul-Aug;21(4):642-9. doi: 10.1136/amiajnl-2013-001763. Epub 2013 Sep 27.

Abstract

Objective: In an effort to standardize behavioral measures and their data representation, the present study develops a methodology for incorporating measures found in the National Cancer Institute's (NCI) grid-enabled measures (GEM) portal, a repository for behavioral and social measures, into the cancer data standards registry and repository (caDSR).

Methods: The methodology consists of four parts for curating GEM measures into the caDSR: (1) develop unified modeling language (UML) models for behavioral measures; (2) create common data elements (CDE) for UML components; (3) bind CDE with concepts from the NCI thesaurus; and (4) register CDE in the caDSR.
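
The four parts above can be pictured as a small pipeline. The following is a minimal sketch only, not the caDSR or NCI thesaurus API: every class, function, measure name, and concept code (`UmlAttribute`, `CommonDataElement`, `Registry`, `curate`, `ExampleMeasure`, `PLACEHOLDER-A`) is hypothetical and introduced purely for illustration. The registry is an in-memory stand-in, and the rule that a CDE must be bound to a concept before registration is an assumption that mirrors the order of steps (3) and (4).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class UmlAttribute:
    """Step 1: one attribute of a UML class modeling a behavioral measure."""
    class_name: str
    name: str
    datatype: str


@dataclass
class CommonDataElement:
    """Step 2: a CDE derived from one UML attribute."""
    long_name: str
    datatype: str
    concept_codes: List[str] = field(default_factory=list)  # filled in step 3


class Registry:
    """Step 4: hypothetical in-memory stand-in for a metadata registry."""

    def __init__(self) -> None:
        self._items: Dict[int, CommonDataElement] = {}
        self._next_id = 1

    def register(self, cde: CommonDataElement) -> int:
        # Assumption of this sketch: binding (step 3) precedes registration.
        if not cde.concept_codes:
            raise ValueError("bind the CDE to a concept before registering it")
        public_id = self._next_id
        self._items[public_id] = cde
        self._next_id += 1
        return public_id

    def get(self, public_id: int) -> CommonDataElement:
        return self._items[public_id]


def create_cde(attr: UmlAttribute) -> CommonDataElement:
    """Step 2: derive a CDE from a UML attribute."""
    return CommonDataElement(f"{attr.class_name} {attr.name}", attr.datatype)


def bind_concepts(cde: CommonDataElement, codes: List[str]) -> CommonDataElement:
    """Step 3: attach vocabulary concept codes to the CDE."""
    cde.concept_codes.extend(codes)
    return cde


def curate(
    attrs: List[UmlAttribute],
    concept_lookup: Dict[Tuple[str, str], List[str]],
    registry: Registry,
) -> List[int]:
    """Run steps 2 to 4 for each modeled attribute and return the assigned IDs."""
    ids = []
    for attr in attrs:
        cde = create_cde(attr)
        bind_concepts(cde, concept_lookup[(attr.class_name, attr.name)])
        ids.append(registry.register(cde))
    return ids


# Placeholder measure and concept codes, not real GEM or NCI thesaurus entries.
attrs = [
    UmlAttribute("ExampleMeasure", "itemResponse", "integer"),
    UmlAttribute("ExampleMeasure", "totalScore", "integer"),
]
lookup = {
    ("ExampleMeasure", "itemResponse"): ["PLACEHOLDER-A"],
    ("ExampleMeasure", "totalScore"): ["PLACEHOLDER-B"],
}
registry = Registry()
ids = curate(attrs, lookup, registry)
```

Keeping the concept binding as a separate step in the sketch reflects the abstract's distinction between creating a CDE and linking it to a controlled vocabulary; a real curation workflow would involve review and tooling that this example does not attempt to model.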

Results: UML models have been developed for four GEM measures, which have been registered in the caDSR as CDE. New behavioral concepts related to these measures have been created and incorporated into the NCI thesaurus. Best practices for representing measures using UML models have been applied in practice (eg, in the caDSR). One dataset based on a GEM-curated measure is available for use by other systems and users connected to the grid.

Conclusions: Behavioral and population science data can be standardized by using and extending current standards. A new branch of CDE for behavioral science was developed for the caDSR. It expands the caDSR domain coverage beyond the clinical and biological areas. In addition, missing terms and concepts specific to the behavioral measures addressed in this paper were added to the NCI thesaurus. A methodology was developed and refined for curation of behavioral and population science data.

Keywords: Behavioral Measure; Common Data Element; Data Sharing; Grid Infrastructure; Ontology; Vocabulary.

MeSH terms

  • Behavioral Sciences / organization & administration*
  • Biomedical Research / organization & administration*
  • Computer Security
  • Databases, Factual / standards*
  • Health Behavior
  • Humans
  • Information Dissemination / methods*
  • Information Storage and Retrieval
  • Internet
  • Medical Informatics
  • National Cancer Institute (U.S.)
  • Registries*
  • United States