Validation of ethnicity in cancer data: which Hispanics are we misclassifying?

J Registry Manag. 2009 Summer;36(2):42-6.

Abstract

The study of cancer in Hispanics in the United States has been hindered by misclassification of Hispanics as non-Hispanic and by the convenient practice of aggregating the diverse Hispanic subgroups into a single general Hispanic category. The Hispanic Origin Identification Algorithm (HOIA) was developed to improve the identification of both general Hispanic ethnicity and specific Hispanic subgroup in cancer incidence data. Using an independent study of prostate cancer cases from South Florida as the "gold standard" together with Florida incident cancer registry data, we validated this algorithm and studied the characteristics of Hispanics whose ethnicity was commonly missed in cancer registry records. Overall agreement between the gold standard information (derived from self-report) and HOIA-derived ethnicity was 97%. For Hispanic subgroup, among a subset of subjects with known birthplace, agreement was 98%. After HOIA was applied, age-adjusted Hispanic cancer rates increased by 8% in males and 10% in females. Hispanics born in the United States were 4.6 times more likely to be misclassified as non-Hispanic than foreign-born Hispanics; black Hispanics were 2.5 times more likely than white Hispanics; and women were 1.3 times more likely than men. HOIA is a valid and effective tool for improving the accuracy of both general Hispanic ethnicity and Hispanic subgroup data in cancer registries. Improved procedures for identifying and recording ethnicity in health facilities are recommended, with particular attention to the information gathered on Hispanics who were born in the United States, are black, or are female.
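The agreement figures reported in the abstract can be read as simple proportions over records that carry both a gold-standard label and an algorithm-derived label. The sketch below illustrates that arithmetic only. It is not the HOIA itself, the function names `percent_agreement` and `missed_hispanic_share` are hypothetical, and the labels are made-up toy values rather than data from the study.

```python
def percent_agreement(gold, predicted):
    """Percent of records where the two ethnicity labels match."""
    if len(gold) != len(predicted):
        raise ValueError("label lists must have the same length")
    if not gold:
        raise ValueError("no records to compare")
    matches = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * matches / len(gold)


def missed_hispanic_share(gold, predicted):
    """Percent of gold-standard Hispanics labeled as something else."""
    hispanic = [p for g, p in zip(gold, predicted) if g == "Hispanic"]
    if not hispanic:
        raise ValueError("no gold-standard Hispanic records")
    missed = sum(p != "Hispanic" for p in hispanic)
    return 100.0 * missed / len(hispanic)


# Made-up toy labels for illustration; NOT data from the study.
gold = ["Hispanic", "Hispanic", "Non-Hispanic", "Hispanic", "Non-Hispanic"]
algo = ["Hispanic", "Non-Hispanic", "Non-Hispanic", "Hispanic", "Non-Hispanic"]

print(percent_agreement(gold, algo))
print(missed_hispanic_share(gold, algo))
```

Computing the second quantity separately within strata such as birthplace, race, or sex would give the kind of group comparison the abstract summarizes, although the abstract does not state how its relative likelihoods were estimated.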

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Validation Study

MeSH terms

  • Adolescent
  • Adult
  • Age Factors
  • Aged
  • Algorithms
  • Case-Control Studies
  • Child
  • Child, Preschool
  • Data Collection
  • Epidemiologic Methods*
  • Ethnicity
  • Female
  • Florida / epidemiology
  • Hispanic or Latino / statistics & numerical data*
  • Humans
  • Incidence
  • Infant
  • Infant, Newborn
  • Male
  • Middle Aged
  • Neoplasms / epidemiology
  • Neoplasms / ethnology*
  • Registries
  • Statistics as Topic
  • Young Adult