Patient-centered healthcare is proposed as a way to help improve individual and population health outcomes by ensuring that the perspectives of patients are reflected in the care paradigm as well as the systems that collect and record their care journey.1 To provide this whole-person care, it is critical to understand the sociocultural factors such as race and ethnicity that impact health. In addition, understanding the impact of the medical devices used to provide this care is critical. Medical devices used in the diagnosis, management, and treatment of medical conditions are evaluated by the Food and Drug Administration (FDA) for their safety and effectiveness. Typically, randomized, controlled, clinical trials are the “gold standard” for the evaluation of some medical devices with their selected populations and idealized, controlled conditions. However, clinical trials have been plagued with underrepresentation of diverse racial and ethnic groups. African-derived and Latinx people represent 30% of the US population but only account for 6% of all participants in federally funded clinical trials.2 In addition, data shows that participation of racial and ethnic minorities in trials is exponentially lower than the incidence of disease in these groups, with little improvement over time.3 While clinical trials are critical to understanding the performance of some medical devices, they may not always include diverse populations or address the practice challenges encountered daily in the delivery of healthcare. The inability to account for patient diversity may impact the generalizability of the results to the US public.4 Therefore, other sources of information may be useful to help inform public health decisions and patient care.
In the delivery of healthcare, data on clinical, economic, and patient-reported outcomes are generated, collected, curated, and analyzed. Increasingly, health care providers, payers, patients and health delivery systems have demonstrated an interest in using this data to inform clinical care. This data, called real-world data (RWD), is an attractive option to understand the patients’ daily lived experiences with their conditions. As defined in the FDA guidance, RWD are data relating to patient health status and/or the delivery of health care routinely collected from a variety of sources.5 Examples of RWD include product and disease registries, claims and billing data, patient-generated data, health surveys, electronic health records (EHRs), and medical chart reviews.5,6 Observational data collected in these data repositories are increasingly being used to guide clinical decision making, determine patient outcomes, and evaluate care paradigms.7 RWD may become valid scientific evidence for regulatory decision-making depending on the characteristics and quality of the data. RWD may help bridge the evidentiary gap between research and clinical practice while enhancing efficiency in generating evidence to improve health outcomes.
The 21st Century Cures Act emphasizes that FDA consider how best to use evidence from RWD for reasonable assurance of device safety and effectiveness while accelerating access to important new technologies.8 The Center for Devices and Radiological Health (CDRH) is committed to supporting the creation and analysis of RWD for medical device evaluation and surveillance. CDRH has evaluated RWD submitted as evidence in support of expanding the labeled indications of the population for which a device was previously used. For example, the initial approval of transcatheter aortic-valve replacement (TAVR) devices indicated that the procedure was to be done through the femoral artery or through a small incision in the chest. Using national registry data, the FDA expanded the labeled indication to include other routes of device placement. By not requiring a new clinical trial, the device was available to more patients in a shorter time frame.9
The full benefit of RWD can only be realized if the necessary data is collected in EHRs, registries, and claims datasets. Many efforts are underway to improve the seamless integration and application of the different types of RWD in regulatory decision-making, including increased use of patient-reported outcomes as part of routine clinical care, widespread use of medical device surveillance initiatives such as the Unique Device Identifier (UDI), and technological advances such as natural language processing. CDRH is collaborating with the Medical Device Innovation Consortium (MDIC), a public-private partnership, in building the National Evaluation System for Health Technology (NEST) to catalyze the timely, reliable, and cost-effective development of evidence using RWD to enhance regulatory and clinical decision-making.10 NEST creates strategic partnerships and linkages among data sources and the entities that manage them, including registries, electronic health records, payer claims data, patient-generated data, and other sources.11 These data sources historically suffer from inconsistent and incomplete data collection.12 While still early in development, NEST and other systems like it will require the use of appropriate data quality and methods standards such as ensuring that the identifying data for patients is consistently collected in all the data sources.13 It is critical that these data sources contain demographic data on race and ethnicity collected consistently and in a standardized manner to not only facilitate linkages to other data sources, but also to provide insights on device performance across different demographic groups. To assure that the data being collected can be most useful in informing health-related decisions, it is important that it reflect the composition of the patient population living with the condition.
In 1997, the Office of Management and Budget (OMB) issued revised recommendations for the collection and use of race and ethnicity data by US federal agencies (Policy Directive 15).14 The recommendation requires that respondents first be asked about ethnicity (i.e., “Hispanic or Latino”) and then asked to identify themselves by race categorized as follows: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White. Not only is there the option to select one or more racial designations, but respondents may also be presented with more granular ethnicity (e.g., Mexican, Puerto Rican) and racial categories (e.g., Chinese, Filipino). The OMB policy also recommends that the information be provided by self-report to minimize misclassification.
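To make the two-question format concrete, the following is a minimal sketch of how a data capture system might represent a response that follows the OMB structure described above. The class and field names (`DemographicResponse`, `self_reported`) are hypothetical and chosen for illustration only; the “Not Hispanic or Latino” option is the complementary ethnicity category in the 1997 standard and is not spelled out in the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Ethnicity(Enum):
    # Ethnicity is asked first, as a separate question from race.
    HISPANIC_OR_LATINO = "Hispanic or Latino"
    NOT_HISPANIC_OR_LATINO = "Not Hispanic or Latino"


class Race(Enum):
    # The five minimum race categories listed in the paragraph above.
    AMERICAN_INDIAN_OR_ALASKA_NATIVE = "American Indian or Alaska Native"
    ASIAN = "Asian"
    BLACK_OR_AFRICAN_AMERICAN = "Black or African American"
    NATIVE_HAWAIIAN_OR_OTHER_PACIFIC_ISLANDER = (
        "Native Hawaiian or Other Pacific Islander"
    )
    WHITE = "White"


@dataclass(frozen=True)
class DemographicResponse:
    """Hypothetical record of one patient's answers (illustrative only)."""

    ethnicity: Ethnicity
    races: FrozenSet[Race]      # one or more selections are allowed
    self_reported: bool = True  # False would mean staff observation

    def __post_init__(self):
        if not isinstance(self.ethnicity, Ethnicity):
            raise ValueError("ethnicity must be an Ethnicity value")
        if not self.races:
            raise ValueError("at least one race must be selected")
        if not all(isinstance(r, Race) for r in self.races):
            raise ValueError("races must contain only Race values")


# Example: a respondent who selects one ethnicity and two races.
response = DemographicResponse(
    ethnicity=Ethnicity.HISPANIC_OR_LATINO,
    races=frozenset({Race.WHITE, Race.BLACK_OR_AFRICAN_AMERICAN}),
)
```

Keeping ethnicity and race as separate fields, and storing race as a set, is one way to avoid forcing a single combined value; the more granular categories mentioned above could be added as optional fields.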
The FDA recommends that clinical trials for FDA-regulated medical products use standard terminology for age, sex, gender, race, and ethnicity to ensure that subpopulation data is collected consistently.15 In addition, the FDA recommends sponsors enroll participants who reflect the demographics of the clinically relevant populations with regard to age, gender, race, and ethnicity.15 Similar to clinical trials, demographic data recorded in RWD sources are most robust when collected in a consistent and standardized manner. This information may not only help organizations proactively identify and address health disparities, but it also could be used to improve overall public health for all groups. Laws and Heckscher16 observed that 11.8% of hospital systems studied in New England did not collect any race or ethnicity data. In addition, many systems that collected the information did not conform to the OMB standards. Another survey of US hospitals found that 22% of facilities did not collect information on race and ethnicity.17 Some investigators have explored approaches to improve the reporting of these important population characteristics. Bhalla et al18 reported a system-wide, standardized effort to improve the collection of demographic characteristics from a New York-based hospital system. Training registration staff to ask patients to report their demographic information (rather than relying on registrar observations) reduced the percentage of records with unknown race and ethnicity from 47.2% and 62.1%, respectively, to 21.3% and 9.7%.
In the absence of training patient intake staff or having clear data collection tools, it is common to see misclassification of race and ethnicity in RWD sources. For example, a study examining the correlation between race and ethnicity measures collected in a cancer registry compared to the EHR found that “Hispanic” was often recorded in the race field or that all patients recorded as being “Hispanic” were automatically coded to “white” race.19 In addition, they found that the “unknown” or missing race and ethnicity fields resulted in data quality that was worse than if the data had been generated at random.20
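The two patterns described above lend themselves to simple data quality checks. The following is a minimal sketch, assuming records are plain dictionaries with hypothetical `race` and `ethnicity` keys; it is not the method used in the cited study.

```python
# Terms that belong in the ethnicity field, not the race field
# (an illustrative list, not an exhaustive one).
ETHNICITY_TERMS = {"hispanic", "latino", "hispanic or latino"}


def _norm(value):
    """Lower-case and trim a field value, treating None as empty."""
    return (value or "").strip().lower()


def ethnicity_in_race_field(record):
    """Return True when the race field holds an ethnicity term."""
    return _norm(record.get("race")) in ETHNICITY_TERMS


def share_hispanic_coded_white(records):
    """Fraction of Hispanic or Latino records whose race is White.

    A value at or near 1.0 may indicate automatic coding rather than
    self-report. Returns None when there are no such records.
    """
    hispanic = [
        r for r in records
        if _norm(r.get("ethnicity")) == "hispanic or latino"
    ]
    if not hispanic:
        return None
    white = sum(1 for r in hispanic if _norm(r.get("race")) == "white")
    return white / len(hispanic)
```

A check like this only flags records for review; it cannot establish a patient's actual race or ethnicity, which still depends on self-report.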
Misclassification of race and ethnicity is also a concern for other sources of RWD such as registries. The absence of standardized operational procedures to capture race and ethnicity can plague the quality of data obtained from registries. An analysis of the Surveillance, Epidemiology, and End Results (SEER) registry showed that SEER data under-reported the number of cancer patients in specific demographic groups compared to self-identification, except for the white group, for which the counts were similar. This under-reporting was most marked in American Indians/Alaska Natives, who were mostly misclassified as white.21 Of note, the demographic data in SEER is subject to the EHR data which may be based on hospital personnel’s subjective appraisal of race and ethnicity instead of self-report.22 Lee et al19 also evaluated cancer registries and EHRs, finding that there was significant discordance in patient racial and ethnicity descriptions amongst the different health databases. These studies not only highlight the importance of training staff to avoid imputing race and ethnicity, but also the importance of encouraging patients to self-report their demographic characteristics.
Like misclassification, missing data can impede both the compilation of various data sources and the drawing of valid inferences from the data. In a comparison of three registries, Mendelsohn et al23 found an 18% missingness rate for race/ethnicity in one registry. One potential suggestion for improving the data capture for race and ethnicity is a forced-choice response field programmed during the development of the electronic data capture system. However, there are other potential barriers to collecting this information including concerns about patient privacy, potential resistance from patients, and the state laws around soliciting this information. Baker et al20 reported that 17.2% of the Californians surveyed were uncomfortable reporting their own race/ethnicity and 46.3% of respondents were worried that providing information could be used to discriminate against them. These findings emphasize the value of including patients in the development of race and ethnicity data collection tools.
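As an illustration of the two ideas above, the following minimal sketch computes a missingness rate for a field and validates a forced-choice response. The record layout, the set of values treated as missing, and the “Prefer not to answer” option are all assumptions made for this example; the last is included only as one possible way to accommodate the patient concerns noted above.

```python
# Values treated as missing (an assumption; real systems vary).
MISSING_VALUES = {"", "unknown", "not recorded"}

# Options offered by a hypothetical forced-choice race question.
ALLOWED_RACE_RESPONSES = {
    "American Indian or Alaska Native",
    "Asian",
    "Black or African American",
    "Native Hawaiian or Other Pacific Islander",
    "White",
    "Prefer not to answer",  # hypothetical opt-out option
}


def is_missing(value):
    """Return True when a field value carries no usable information."""
    return value is None or str(value).strip().lower() in MISSING_VALUES


def missingness_rate(records, field):
    """Fraction of records in which `field` is absent or missing."""
    if not records:
        raise ValueError("cannot compute a rate for zero records")
    missing = sum(1 for r in records if is_missing(r.get(field)))
    return missing / len(records)


def validate_forced_choice(selection):
    """Accept a submission only if it contains listed options.

    Raises ValueError for an empty selection or an unlisted value,
    which is how a forced-choice field blocks a blank entry.
    """
    chosen = set(selection)
    if not chosen:
        raise ValueError("a response is required")
    unlisted = chosen - ALLOWED_RACE_RESPONSES
    if unlisted:
        raise ValueError(f"unlisted response(s): {sorted(unlisted)}")
    return chosen
```

A forced-choice field lowers missingness only at the point of entry; whether the recorded value is accurate still depends on who supplies it.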
By accurately collecting information on race and ethnicity, healthcare systems will enable better understanding of the needs of the populations that they serve, identification of health disparities within their population, and programmatic efforts to improve quality of care, ultimately leading to patient-centered care. Adopting common terminology and standardized approaches to collecting race and ethnicity demographic data such as that described by OMB and FDA will help healthcare systems foster high quality evidence-generation systems that support an inclusive learning health system that benefits society more broadly. The increased adoption of RWD, improved interoperability, and greater patient involvement in evidence generation will concomitantly promote the use of RWD to help inform regulatory decision-making. We have the opportunity to include the perspectives of patients as we shape the methods used to collect information on their sociocultural experiences through race and ethnicity as well as the outcomes they experience in the healthcare system. By working collaboratively, we can assure that all patients are represented in these rich sources of information and we collectively have the evidence needed to make informed healthcare decisions.