- Lee, Terrence C;
- Saseendrakumar, Bharanidharan Radha;
- Nayak, Mahasweta;
- Chan, Alison X;
- McDermott, John J;
- Shahrvini, Bita;
- Ye, Gordon Y;
- Sitapati, Amy M;
- Nebeker, Camille;
- Baxter, Sally L
Purpose
To quantify and characterize social determinants of health (SDoH) data coverage using single-center electronic health records (EHRs) and the National Institutes of Health All of Us research program.
Design
Retrospective cohort study from June 2014 through June 2021.
Participants
Adults 18 years of age or older with a diagnosis of diabetic retinopathy, glaucoma, cataracts, or age-related macular degeneration.
Methods
For All of Us, research participants completed online survey forms as part of a nationwide prospective cohort study. In local EHRs, patients were selected based on diagnosis codes.
Main outcome measures
Social determinants of health data coverage, characterized by the proportion of each disease cohort with available data regarding demographics and socioeconomic factors.
Results
In All of Us, we identified 23 806 unique adult patients, of whom 2246 had a diagnosis of diabetic retinopathy, 13 448 had a diagnosis of glaucoma, 6634 had a diagnosis of cataracts, and 1478 had a diagnosis of age-related macular degeneration. Survey completion rates were high (99.5%-100%) across all cohorts for demographic information, overall health, income, education, and lifestyle. However, health care access (12.7%-29.4%), housing (0.7%-1.1%), social isolation (0.2%-0.3%), and food security (0%-0.1%) showed significantly lower response rates. In local EHRs, we identified 80 548 adult patients, of whom 6616 had a diagnosis of diabetic retinopathy, 26 793 had a diagnosis of glaucoma, 40 427 had a diagnosis of cataracts, and 6712 had a diagnosis of age-related macular degeneration. High data coverage was found across all cohorts for variables related to tobacco use (82.84%-89.07%), alcohol use (77.45%-83.66%), and intravenous drug use (84.76%-93.14%). However, low data coverage (<50% completion) was found for all other variables, including education, finances, social isolation, stress, physical activity, food insecurity, and transportation. We used chi-square testing to assess whether the data coverage varied across different disease cohorts and found that all fields varied significantly (P < 0.001).
Conclusions
The limited and highly variable data coverage in both local EHRs and All of Us highlights the need for researchers and providers to develop SDoH data collection strategies and to assemble complete datasets.
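
Illustrative sketch
The outcome measure and the chi-square comparison described in the Results can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the counts in `example` are made up for demonstration and are not study data, and the closed-form P value applies only to 3 degrees of freedom (four cohorts compared on one present/missing field). The sketch also assumes every expected cell count is nonzero.

```python
from math import erfc, exp, pi, sqrt


def coverage(n_with_data, n_total):
    """Proportion of a cohort with a non-missing value for one SDoH field."""
    return n_with_data / n_total


def chi_square_coverage(cohorts):
    """Pearson chi-square statistic for a cohort x (present, missing) table.

    `cohorts` maps a cohort name to (n_with_data, n_total).
    Returns (statistic, degrees_of_freedom).
    """
    total_present = sum(present for present, _ in cohorts.values())
    total_n = sum(n for _, n in cohorts.values())
    total_missing = total_n - total_present

    stat = 0.0
    for present, n in cohorts.values():
        missing = n - present
        # Expected counts if coverage were identical in every cohort.
        expected_present = n * total_present / total_n
        expected_missing = n * total_missing / total_n
        stat += (present - expected_present) ** 2 / expected_present
        stat += (missing - expected_missing) ** 2 / expected_missing

    # (rows - 1) * (columns - 1), with two columns: present and missing.
    dof = len(cohorts) - 1
    return stat, dof


def chi2_sf_dof3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 dof."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


# Hypothetical counts for a single field; NOT taken from the study.
example = {
    "diabetic_retinopathy": (400, 1000),
    "glaucoma": (450, 1000),
    "cataracts": (500, 1000),
    "amd": (350, 1000),
}

rates = {name: coverage(p, n) for name, (p, n) in example.items()}
stat, dof = chi_square_coverage(example)
p_value = chi2_sf_dof3(stat)
print(rates)
print(f"chi-square = {stat:.2f}, dof = {dof}, P < 0.001: {p_value < 0.001}")
```

In practice, a statistics library routine that handles arbitrary degrees of freedom would likely be preferable to the hand-written tail probability used here.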