What Categories of Race/Ethnicity to Use?

A critical issue in race and ethnicity data collection is how many categories of race and ethnicity to include. Having every possible racial and ethnic category available in a data collection tool may be quite cumbersome and require sophisticated information technology. On the other hand, collecting data using very broad categories may not be useful for organizations serving very diverse populations. For example, the Asian category includes individuals from India, China, Korea and other countries with significantly different cultures and beliefs.

Use of a Separate Ethnicity Question

One of the unresolved questions in the collection of race and ethnicity data is how to collect information on Latino ethnicity. To address this, the Office of Management and Budget standards call for collecting race and ethnicity with two separate questions. However, recent studies have found that many Latinos do not see themselves as having a race separate from their ethnicity.1 Indeed, a large proportion of individuals who respond to the ethnicity question leave the race question blank.2 The Health Research and Educational Trust recommends using a single race and ethnicity question that includes a Hispanic or Latino option.


Race and Ethnicity Categories

In its 2004 report, Eliminating Health Disparities: Measurement and Data Needs, the National Research Council of the National Academies recommended that health care organizations collect standardized data on race and ethnicity using the Office of Management and Budget (OMB) standards as a base minimum.3 The Health Research and Educational Trust (HRET) recommends that, when possible, organizations collect granular data on race and ethnicity that can be aggregated into the broader OMB categories. The U.S. Centers for Disease Control and Prevention (CDC) has prepared a hierarchical code set that supports this approach. The CDC code set is based on current federal standards for classifying data on race and ethnicity: the minimum race and ethnicity categories defined by the OMB and a more detailed set of race and ethnicity categories maintained by the U.S. Bureau of the Census.
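The aggregation step can be illustrated with a minimal Python sketch. The mapping below is an assumption made for illustration: it covers only the three granular groups named earlier in this section, and the names GRANULAR_TO_OMB and aggregate_to_omb are hypothetical, not part of the CDC code set.

```python
from collections import Counter

# Hypothetical mapping from granular categories to OMB minimum categories.
# Only a few entries are shown; a real mapping would cover the full code set.
GRANULAR_TO_OMB = {
    "Asian Indian": "Asian",
    "Chinese": "Asian",
    "Korean": "Asian",
}


def aggregate_to_omb(responses):
    """Count responses under their broader OMB category.

    Responses with no entry in the mapping are counted as "Unmapped"
    so that they are flagged rather than silently dropped.
    """
    counts = Counter()
    for response in responses:
        counts[GRANULAR_TO_OMB.get(response, "Unmapped")] += 1
    return dict(counts)


if __name__ == "__main__":
    print(aggregate_to_omb(["Chinese", "Korean", "Asian Indian", "Chinese"]))
```

Because the granular response is stored and the broader category is derived from it, an organization can report at either level without collecting the data again.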

The code set consists of two tables: (1) Race and (2) Ethnicity. Concepts in the Race and Ethnicity tables include the OMB minimum categories (five races and two ethnicities), a sixth race category (Other race), and a more detailed set of race and ethnicity categories used by the Census. Within each table, every race and ethnicity concept is assigned a unique identifier, which can be used in electronic interchange of race and ethnicity data. Each concept also has a hierarchical code, an alphanumeric code that places the concept in a hierarchical position with reference to other related concepts. For example, Costa Rican, Guatemalan and Honduran are all ethnicity concepts whose hierarchical codes place them at the same level beneath the concept Central American, which is in turn at the same hierarchical level as Spaniard within the broader concept Hispanic or Latino. In contrast to the unique identifier, the hierarchical code can change over time to accommodate the insertion of new concepts.
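The relationship between the two kinds of codes can be sketched in Python. This is a minimal illustration under stated assumptions: the identifiers and hierarchical codes below are placeholders invented for the example, not the values published by the CDC, and the dotted-code format and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    unique_id: str          # stable identifier, used in data interchange
    hierarchical_code: str  # position in the hierarchy; may change over time
    name: str


# Placeholder identifiers and codes for illustration only (not CDC values).
CONCEPTS = [
    Concept("ID-01", "E1", "Hispanic or Latino"),
    Concept("ID-02", "E1.01", "Spaniard"),
    Concept("ID-03", "E1.02", "Central American"),
    Concept("ID-04", "E1.02.001", "Costa Rican"),
    Concept("ID-05", "E1.02.002", "Guatemalan"),
    Concept("ID-06", "E1.02.003", "Honduran"),
]

BY_CODE = {c.hierarchical_code: c for c in CONCEPTS}
BY_NAME = {c.name: c for c in CONCEPTS}


def level(concept):
    """Depth in the hierarchy: 0 for a top-level concept."""
    return concept.hierarchical_code.count(".")


def parent(concept):
    """Return the concept one level up, or None for a top-level concept."""
    head, sep, _ = concept.hierarchical_code.rpartition(".")
    return BY_CODE.get(head) if sep else None


def roll_up(concept):
    """Walk up the hierarchy to the broadest containing concept."""
    while (up := parent(concept)) is not None:
        concept = up
    return concept


if __name__ == "__main__":
    honduran = BY_NAME["Honduran"]
    print(parent(honduran).name)   # one level up
    print(roll_up(honduran).name)  # broadest category
```

In this sketch, stored records would reference unique_id, so renumbering the hierarchical codes to insert a new concept would not invalidate existing data.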



  1. Baker DW, Cameron KA, Feinglass J, et al. "A System for Rapidly and Accurately Collecting Patients' Race and Ethnicity." American Journal of Public Health, 96(3): 532-537, 2006.
  2. Weinick R, Flaherty K, Bristol SJ. Creating Equity Reports: A Guide for Hospitals. Boston: The Disparities Solutions Center, Massachusetts General Hospital, 2008.
  3. Eliminating Health Disparities: Measurement and Data Needs. Washington: National Research Council of the National Academies, 2004.