A Temporal Abstraction-based Extract, Transform and Load Process for Creating Registry Databases for Research

AMIA Jt Summits Transl Sci Proc. 2011:2011:46-50. Epub 2011 Mar 7.

Abstract

In the Clinical and Translational Science Award (CTSA) era, there is great interest in aggregating and comparing populations across institutions. These institutions likely represent data differently in their clinical data warehouses and other databases, and clinical data warehouses are frequently structured in a generalized way that supports many constituencies. For research, there is a need to transform these heterogeneous data into a shared representation, and to categorize and interpret the data so that their representation is optimized for investigators. We are addressing this need by extending an existing temporal abstraction-based clinical database query system, PROTEMPA. The extended system allows specifying data types of interest in federated databases, extracting the data into a shared representation, transforming them through categorization and interpretation, and loading them into a registry database that can be refreshed. Such a registry's access control, data representation and query tools can be tailored to the needs of research while keeping local databases as the source of truth.
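The extract, transform and load steps described above can be illustrated with a minimal sketch. It is not PROTEMPA's API: every name in it (`Observation`, `extract_site_a`, `extract_site_b`, `abstract_intervals`, `load_registry`, `run_etl`, the `glucose_state` table), the two sample site layouts, and the 180 mg/dL cut-off are hypothetical and chosen only for illustration. The sketch assumes two sites that store glucose results in different layouts and units, maps both into one shared representation, categorizes each value, merges consecutive same-category values per patient into intervals (a very simple form of temporal abstraction), and reloads a registry table from scratch on each run.

```python
import sqlite3
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Observation:
    """Shared representation used by every site (hypothetical)."""
    patient_id: str
    concept: str
    value: float  # glucose in mg/dL
    time: datetime


# Hypothetical local rows; each site uses its own layout and units.
SITE_A = [
    {"mrn": "A1", "glucose_mgdl": 110.0, "ts": "2011-01-01T08:00"},
    {"mrn": "A1", "glucose_mgdl": 200.0, "ts": "2011-01-01T12:00"},
    {"mrn": "A1", "glucose_mgdl": 220.0, "ts": "2011-01-01T16:00"},
]
SITE_B = [{"pid": "B7", "code": "GLU", "mmol_l": 11.0, "when": "2011-01-01T09:00"}]


def extract_site_a(rows):
    return [Observation(r["mrn"], "glucose", r["glucose_mgdl"],
                        datetime.fromisoformat(r["ts"])) for r in rows]


def extract_site_b(rows):
    # Glucose in mmol/L is converted to mg/dL with the approximate factor 18.
    return [Observation(r["pid"], "glucose", r["mmol_l"] * 18.0,
                        datetime.fromisoformat(r["when"]))
            for r in rows if r["code"] == "GLU"]


def categorize(obs, threshold=180.0):
    # Illustrative cut-off only, not a clinical recommendation.
    return "HIGH" if obs.value >= threshold else "NORMAL"


def abstract_intervals(observations):
    """Merge consecutive same-category observations per patient into intervals."""
    intervals = []
    for obs in sorted(observations, key=lambda o: (o.patient_id, o.time)):
        label = categorize(obs)
        stamp = obs.time.isoformat()
        last = intervals[-1] if intervals else None
        if last and last[0] == obs.patient_id and last[1] == label:
            intervals[-1] = (last[0], last[1], last[2], stamp)  # extend interval
        else:
            intervals.append((obs.patient_id, label, stamp, stamp))
    return intervals


def load_registry(conn, intervals):
    """Refresh the registry: rebuild the table contents from the sources."""
    conn.execute("CREATE TABLE IF NOT EXISTS glucose_state "
                 "(patient_id TEXT, label TEXT, start_time TEXT, end_time TEXT)")
    conn.execute("DELETE FROM glucose_state")
    conn.executemany("INSERT INTO glucose_state VALUES (?, ?, ?, ?)", intervals)
    conn.commit()


def run_etl(conn):
    observations = extract_site_a(SITE_A) + extract_site_b(SITE_B)
    intervals = abstract_intervals(observations)
    load_registry(conn, intervals)
    return intervals


if __name__ == "__main__":
    registry = sqlite3.connect(":memory:")
    for row in run_etl(registry):
        print(row)
```

Because the sketch rebuilds the registry table on every run, calling `run_etl` again after the local data change refreshes the registry without ever writing back to the source rows, which mirrors the idea of keeping local databases as the source of truth.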