Direct REL Data Collection Methods

Primary Sources of Race, Ethnicity and Language Data

Health Insurance Plan Enrollment

In 2003–2004, America's Health Insurance Plans collaborated with the Robert Wood Johnson Foundation to conduct a survey and follow-up research to assess whether health plans and insurers collect racial and ethnic data on their enrollees and how these data are used to improve patient care. Their findings indicate that the most common method used to collect race, ethnicity and primary language information is the enrollment process.1 Enrollees self-report these data, which have been found to be fairly accurate. Information collected at the time of enrollment has the advantage of being integrated into the health plan's central data system. The primary concern about collecting data during enrollment is that members may perceive that race and ethnicity data might be used to deny coverage. As mentioned previously, it may be important to include messages within the enrollment form that inform members about how the race, ethnicity and language data will be used.

Report: Collection of Race and Ethnicity and Primary Language to Address Health Care Disparities

Did You Know? The enrollment process is an easy way to collect race, ethnicity and primary language information.

Disease Management

Disease management programs are another avenue for health plans to collect race, ethnicity and primary language information from their members. Data can be collected not only during enrollment in these targeted programs, but also during any of the frequent contacts that the disease management entity has with the member. Because program participants self-report these data, they are likely to be accurate. This method may reach some of the plans' most vulnerable members. However, it will capture only those individuals who participate in disease management programs, and will not provide race, ethnicity and language data for the majority of the plan membership. Also, race, ethnicity and language data collected through disease management programs may reside within a contracted disease management organization and are not necessarily transmitted to the health plan.

Top Tips: Data collection through disease management programs captures only a fraction of members. It may be helpful to supplement this with other data.

Health Risk Assessment

Health plans use health risk assessments (HRAs) to identify the future care needs of their members and to determine those individuals who would benefit from specific disease management or other health promotion programs. HRAs typically collect members' demographic information, including data on race, ethnicity and primary language. As in the case of data collected via disease management programs, health risk assessments may realistically capture only a limited fraction of all members.


Many medical groups, physicians' offices, hospitals and clinics collect information on the patient during the intake process. This includes information on the member's demographic characteristics, initial health condition and symptoms, and services and treatments received. Health plans can potentially obtain race, ethnicity and language data collected by providers through a data transfer. This is the primary method of data collection used by plans that are part of an integrated delivery system (IDS), such as HealthPartners and Kaiser Permanente. In these cases, shared systems and data infrastructure allow for the easy transfer of data from providers to the health plan. However, plans that are not part of an IDS may need both to negotiate access to these data and to reconcile the data to ensure that the data categories used by providers match those used by the plan. An advantage of collecting data during an encounter is that members have the opportunity to ask questions about why data are being collected and how the data will be used. If staff are properly trained, this method can be quite effective in collecting data. However, without proper education, providers may be hesitant to ask these questions of members, fearing exposure to litigation. Furthermore, providers who do not ask members to self-identify may record members' data incorrectly.
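The reconciliation step described above, in which provider categories are matched to the plan's own categories, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the provider labels, the plan categories, the "Needs review" bucket and the function names are all hypothetical and are not drawn from any plan or standard named in this document.

```python
from collections import Counter

# Hypothetical crosswalk from provider-recorded labels to plan categories.
# Both the labels and the categories are illustrative assumptions.
PROVIDER_TO_PLAN = {
    "white": "White",
    "caucasian": "White",
    "black": "Black or African American",
    "african american": "Black or African American",
    "asian": "Asian",
    "declined": "Declined to answer",
}

# Assumed holding category for values that have no agreed mapping yet.
UNMAPPED = "Needs review"


def reconcile(provider_value):
    """Map one provider-recorded value to a plan category."""
    if provider_value is None:
        return UNMAPPED
    key = provider_value.strip().lower()
    return PROVIDER_TO_PLAN.get(key, UNMAPPED)


def reconcile_batch(provider_values):
    """Return the mapped values and a tally of labels that had no mapping."""
    mapped = [reconcile(v) for v in provider_values]
    unmapped = Counter(
        (v or "").strip().lower()
        for v, m in zip(provider_values, mapped)
        if m == UNMAPPED
    )
    return mapped, unmapped


if __name__ == "__main__":
    values = ["Caucasian", " black ", None, "Other/Mixed"]
    mapped, unmapped = reconcile_batch(values)
    print(mapped)
    print(dict(unmapped))
```

Routing unmatched labels to a review bucket, rather than guessing a category, keeps the tally of unmapped labels available for the plan and provider to resolve together.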

Did You Know? Data collection through encounters is the primary method of data collection used by plans that are part of an integrated delivery system (IDS), e.g., HealthPartners and Kaiser Permanente.

Top Tips: Staff education is essential for collecting data through the intake process. Without education and training, staff may incorrectly assess a member's race or ethnicity. It is also important to train staff to communicate why they are asking for race and ethnicity information.

Member Web Portal

Health plans are increasingly using Web portals to help members manage their health care. These Web portals offer information about enrollees' benefits, decision support tools and claims information. Health plans can use the member Web portal as a vehicle for collecting members' background information, such as race, ethnicity and preferred language. Aetna, HealthPartners and UnitedHealth Group are among the National Health Plan Collaborative plans employing this method.

Top Tips: Web portals can be effective in collecting data, but differences in Internet access across membership can lead to biased results.

Did You Know? Web portals allow data collection at a more granular level (e.g., more race, ethnicity and language options can be provided).

As mentioned earlier, the efficiency and appropriateness of specific methods may vary based on substantial differences in the rate of use of a plan's Web portal by enrollees in different markets and regions and/or those served by different plans in the same area. The member Web portal can be an efficient method of data capture for Internet users. For example, Aetna found the use of the Web portal to be very efficient and effective for its membership.

Other plans may not find the Web portal as effective, particularly those whose members are not likely to access the Internet. In addition, even if a relatively large proportion of enrollees uses the Web portal and is willing to provide race, ethnicity and language information, there still can be substantial differences within a given plan. For example, some groups of members may be more or less likely to use the Web portal than others. Health plans should be aware of these potential biases and attempt to use strategies that facilitate the collection of data from the widest range of their diverse membership.

Did You Know? The cost of adding race, ethnicity and language data collection to outreach is often minimal.

Case Study: Aetna: Voluntary Race, Ethnicity and Language Data Collection Program.

The most significant advantage of collecting race/ethnicity data through the member Web portal is that it allows health plans to collect the information at a more granular level. Plans can include more race, ethnicity and language options through drop-down menus, eliminating the space constraints associated with paper forms. The availability of this information could also populate other databases, removing the need for multiple data entry.

Health Plan Direct Outreach

Health plans frequently conduct outreach as part of efforts to educate members about existing programs, encourage preventive screenings or help members better understand their benefits. Race, ethnicity and language data collection can be incorporated into these outreach efforts. Harvard Pilgrim Health Care (HPHC) initiated the collection of race and ethnicity data through interactive voice response (IVR) outreach calls to educate members about colorectal cancer screening. A major advantage of this method is that since plans are already conducting outreach for other purposes, the cost of adding race, ethnicity and language data collection to the outreach is often minimal.

Case Study: Harvard Pilgrim Health Care: Pilot Test of IVR Outreach Calls as a Mechanism for Collecting Race and Ethnicity Data

Member Survey

Health plans can integrate race, ethnicity and language questions into member surveys that are intended for other topic areas. Alternatively, plans can conduct a survey for the specific purpose of collecting race, ethnicity and language information from members. The use of member surveys always raises the important concern of ensuring adequate response rates. Highmark Inc. developed a paper-based questionnaire asking members for their race and ethnicity, language spoken at home, language preference for communications with Highmark Inc. and whether a member or family member needs or wants an interpreter to communicate with a health care provider.

Case Study: Highmark Inc.: Obtaining Race, Ethnicity and Language Preference Through a Member Survey

In addition to internally developed member surveys, health plans may use existing standardized instruments. A common survey administered by plans is the Consumer Assessment of Healthcare Providers and Systems (CAHPS). This survey evaluates the quality of services provided to health plan enrollees and contains a question on member race and ethnicity. However, the CAHPS usually captures a relatively small sample of members. Although a plan can make inferences and estimates about the composition and extent of disparities throughout its entire membership, health plans need to be aware of the potential selection bias that arises because some members are more likely than others to respond to surveys.

Did You Know? Selection bias can be common in member surveys.

Member-Initiated Contact

Members initiate contact with their health plan for numerous reasons. These points of contact may include benefit questions, administrative or billing inquiries, and complaints or grievances. During these points of contact, health plans can ask members at the end of the call to "update" their information. Updated information could include the member's race, ethnicity or preferred language. For example, Molina Healthcare asks members who call its nurse advice line about their language preferences and includes this information in the member's records. As is the case with several of the other data collection methods, data would be collected only for the subset of members who contact the health plan. Additionally, members calling with grievances or complaints may be less likely to cooperate with requests for race, ethnicity and language information.

Secondary Sources of Race, Ethnicity and Language Data

Health plans that serve Medicare and Medicaid populations can link their enrollee data to race/ethnicity data collected in the course of program administration, and there are numerous examples of this practice. The accuracy of Medicare's race/ethnicity data has been steadily improving. The accuracy of race/ethnicity data in Medicaid programs varies both by state and by eligibility category. Those states and categories that rely on an enrollee-completed application form are likely to have the most accurate data.2

Centers for Medicare & Medicaid Services (CMS)

Medicare's databases provide a rich source of information about the program's 43 million beneficiaries. The program maintains beneficiary race and ethnicity data, which are derived from Social Security's administrative records. Plans that have a Medicare product can obtain these data from CMS, although the usefulness of these data may be limited. Specifically, most Medicare data on race and ethnicity only have four fields: white, black, other and unknown. In addition, the Social Security Administration does not maintain separate fields for race and ethnicity. As a result, the lack of specificity does not allow for accurate estimation of Asians, Hispanics and American Indians. To overcome this limitation, some health plans, such as Humana, are using surname analysis to estimate members' ethnicity and are combining this with Medicare data on race.

Top Tips: Due to the lack of specificity in Medicare data, some plans use surname analysis to estimate ethnicity and combine the result with Medicare data on race.

State Medicaid Agencies

Medicaid plans have an advantage over commercial plans in obtaining race, ethnicity and language information since this information is collected by states during eligibility determination or enrollment into a health plan. Since 2002, CMS has required state quality strategies to include "procedures that identify the race, ethnicity, and primary language of each Medicaid enrollee" for the managed care organization or prepaid inpatient health plan at the time of enrollment. However, although all state Medicaid agencies collect some form of data on race and ethnicity, data sources and frequency of collection vary significantly across states. For example, Molina Healthcare, which serves a significant number of Medicaid enrollees in California, receives information from the state that is accurate enough to use for strategic planning purposes. In contrast, Massachusetts-based Boston Medical Center HealthNet Plan receives data that are only about 30 percent complete. Some states implement additional processes to evaluate data accuracy, such as matching with other types of state data (e.g., vital statistics and immunization registries), matching with administrative or claims data, or comparing data with self-reported race/ethnicity from other sources such as CAHPS.
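A plan receiving state data may want to measure how complete each field is before relying on it. The short Python sketch below shows one possible way to do that. The field names, the record layout and the set of values treated as missing are assumptions made for illustration, not a description of any state's file format.

```python
# Assumed markers for a missing value; real files may use other codes.
MISSING_VALUES = {"", "unknown", "not reported"}


def completeness(records, field):
    """Return the share of records with a usable value in the given field."""
    if not records:
        return 0.0
    filled = sum(
        1
        for record in records
        if str(record.get(field) or "").strip().lower() not in MISSING_VALUES
    )
    return filled / len(records)


if __name__ == "__main__":
    # Hypothetical enrollee records, invented for this example only.
    sample = [
        {"race": "Asian", "language": "English"},
        {"race": "", "language": "Spanish"},
        {"race": None, "language": ""},
        {"race": "Unknown", "language": "English"},
        {"language": "English"},
    ]
    for field in ("race", "language"):
        print(field, completeness(sample, field))
```

Tracking this share by field and by data source makes it easier to decide where supplementary collection methods are most needed.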


As noted previously, employers are increasingly engaging with health plans on disparities issues, particularly as they relate to their employees. Indeed, plans report that more employers are asking about their efforts to address disparities and are doing so in a more systematic fashion through the use of the eValue8 Common Request for Information (RFI) developed by the National Business Coalition on Health (NBCH).

Did You Know? eValue8 is a tool used annually by health care purchasers to compare the quality and efficiency of America's health plans.

Many employers already collect race, ethnicity and language data for Equal Employment Opportunity purposes, which presents another opportunity for plans to obtain this information. As an example, CIGNA has partnered with one of its major employer accounts to examine health care utilization and quality for its employees, stratified by race and ethnicity. For this analysis, employers supplied information on members' race and ethnicity.

Case Study: CIGNA: Collecting Race and Ethnicity Data Through a Collaborative Clinical Initiative with a Major Employer




1. Collection and Use of Racial and Ethnic Data by Health Plans to Address Disparities: A Final Summary Report. America's Health Insurance Plans and the Robert Wood Johnson Foundation, 2004.
2. Lurie N and Fremont AM. "Looking Forward: Crosscutting Issues in Race/Ethnicity Data Collection." Health Services Research, 41(4 Pt 1):1519-33, 2006.