This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on https://publichealth.jmir.org, as well as this copyright and license information must be included.
Face mask wearing has been identified as an effective strategy to prevent the transmission of SARS-CoV-2, yet mask mandates were never imposed nationally in the United States. This decision resulted in a patchwork of local policies and varying compliance, potentially generating heterogeneities in the local trajectories of COVID-19 in the United States. Although numerous studies have investigated the patterns and predictors of masking behavior nationally, most suffer from survey biases and none have been able to characterize mask wearing at fine spatial scales across the United States through different phases of the pandemic.
Urgently needed is a debiased spatiotemporal characterization of mask-wearing behavior in the United States. This information is critical to further assess the effectiveness of masking, evaluate the drivers of transmission at different time points during the pandemic, and guide future public health decisions through, for example, forecasting disease surges.
We analyzed spatiotemporal masking patterns in over 8 million behavioral survey responses from across the United States, starting in September 2020 through May 2021. We adjusted for sample size and representation using binomial regression models and survey raking, respectively, to produce county-level monthly estimates of masking behavior. We additionally debiased self-reported masking estimates using bias measures derived by comparing vaccination data from the same survey to official records at the county level. Lastly, we evaluated whether individuals’ perceptions of their social environment can serve as a less biased form of behavioral surveillance than self-reported data.
We found that county-level masking behavior was spatially heterogeneous along an urban-rural gradient, with mask wearing peaking in winter 2021 and declining sharply through May 2021. Our results identified regions where targeted public health efforts could have been most effective and suggest that individuals’ frequency of mask wearing may be influenced by national guidance and disease prevalence. We validated our bias correction approach by comparing debiased self-reported mask-wearing estimates with community-reported estimates, after addressing issues of a small sample size and representation. Self-reported behavior estimates were especially prone to social desirability and nonresponse biases, and our findings demonstrated that these biases can be reduced if individuals are asked to report on community rather than self behaviors.
Our work highlights the importance of characterizing public health behaviors at fine spatiotemporal scales to capture heterogeneities that may drive outbreak trajectories. Our findings also emphasize the need for a standardized approach to incorporating behavioral big data into public health response efforts. Even large surveys are prone to bias; thus, we advocate for a social sensing approach to behavioral surveillance to enable more accurate estimates of health behaviors. Finally, we invite the public health and behavioral research communities to use our publicly available estimates to consider how bias-corrected behavioral estimates may improve our understanding of protective behaviors during crises and their impact on disease dynamics.
Human behavior plays a key role in infectious disease transmission [
Mask wearing has been identified as an effective strategy to reduce the transmission of SARS-CoV-2. At the individual level, masks decrease both the amount of viral particles dispersed by an infectious wearer and the amount of those inhaled by an uninfected wearer [
To address this gap, researchers and organizations have implemented extensive surveys on human behavior, including mask wearing (eg, [
The value of surveys on public health behaviors can be further restricted when data collection is at the national or state level. Coarse-grained spatiotemporal information about human behavior is of limited utility, providing only sparse insight into local trends. Collecting responses at the national or state level ignores spatial heterogeneity at finer scales, preventing the identification of local effects that can drive disease dynamics. Spatial heterogeneity has been well documented across pathogens, not only in drivers of disease transmission, such as human behavior, but also in disease prevalence (eg, [
Here, we systematically characterize mask wearing across the United States at a fine spatiotemporal scale for 9 months using a national survey and account for the bias in this survey. By comparing survey demographics and vaccination statuses with accurate ground-truth data, we estimate and account for survey and response biases in our analysis of masking behavior. With these bias-corrected estimates, we characterize the spatiotemporal heterogeneity in masking behavior at the county-month level across the United States. Finally, we examine the differences between self-reported and community-reported estimates of masking using an additional survey question, seeking to understand whether these 2 measures are good predictors of one another. Our results are the most precise estimates of masking in the United States during the COVID-19 pandemic, providing insight into the local variation in behavior in response to public health messaging and changes in COVID-19 incidence.
In this study, we sought to characterize the spatiotemporal heterogeneity in self-reported masking behavior in the United States from the fall of 2020 to the spring of 2021. Because sample sizes were small in some counties, we used Bayesian binomial regression models to estimate mask-wearing proportions each month. Recognizing that surveys are subject to several types of bias, we used raking and resampling of responses to correct for unrepresentative samples, and we compared self-reported vaccination status with ground-truth vaccination data to quantify nonresponse and social desirability biases. With these estimates, we were able to identify spatiotemporal trends in bias-corrected masking behavior and compare these values to community levels of masking reported in a different survey question.
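The raking step mentioned above can be illustrated with a minimal sketch of iterative proportional fitting. This is not the anesrake implementation used in the study; the function name `rake`, the example respondents, and the target margins are all hypothetical, and the sketch omits refinements such as weight capping.

```python
from collections import defaultdict


def rake(respondents, targets, max_iter=100, tol=1e-8):
    """Adjust weights until weighted margins match target proportions.

    respondents: list of dicts, e.g. {"age": "18-34", "sex": "F"}
    targets: dict mapping variable -> {category: target proportion}
    Returns a list of weights (mean 1) aligned with respondents.
    """
    weights = [1.0] * len(respondents)
    for _ in range(max_iter):
        max_change = 0.0
        for var, margin in targets.items():
            total = sum(weights)
            # Current weighted share of each category of this variable.
            share = defaultdict(float)
            for w, r in zip(weights, respondents):
                share[r[var]] += w / total
            # Rescale each weight by target share / current share.
            for i, r in enumerate(respondents):
                factor = margin[r[var]] / share[r[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:
            break
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]


# Hypothetical respondents from one county-month (illustrative only).
sample = [
    {"age": "18-34", "sex": "F"},
    {"age": "18-34", "sex": "F"},
    {"age": "18-34", "sex": "M"},
    {"age": "35+", "sex": "F"},
    {"age": "35+", "sex": "M"},
    {"age": "35+", "sex": "M"},
]
# Hypothetical population margins for that county.
targets = {
    "age": {"18-34": 0.4, "35+": 0.6},
    "sex": {"F": 0.5, "M": 0.5},
}
weights = rake(sample, targets)


def weighted_share(var, category):
    """Weighted proportion of the sample falling in one category."""
    hit = sum(w for w, r in zip(weights, sample) if r[var] == category)
    return hit / sum(weights)
```

After raking, `weighted_share("age", "18-34")` and the other margins should agree with the targets to within the tolerance, even though the raw sample over-represents the younger group.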
We analyzed self-reported mask-wearing survey responses for all 50 US states and the District of Columbia using data from the US COVID-19 Trends and Impact Survey (CTIS) [
By dichotomizing masking responses, we also lost information about the frequency with which people mask, though we expect the effect of this choice to be minimal (see
Due to the small sample sizes in some US counties, we used Bayesian binomial regression models to develop reliable estimates of the proportion of individuals masking in a given county-month. Population density was used as a fixed effect; masking behavior has previously been linked to population density, and this variable was easily available at the county scale [
where $D_i = \log_{10}(\text{population density}_i)$ for county
We also explored more complex model specifications that included state- or county-level random effects. However, both models suffered from a lack of convergence or overfitting and produced functionally similar results. Thus, we opted for the more parsimonious model presented earlier for our main findings; details of these additional models can be found in
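For orientation, a binomial regression with log population density as the only fixed effect could be written as follows. This is a sketch rather than the authors' exact specification: the logit link, the symbols $y_i$, $n_i$, $p_i$, $\beta_0$, and $\beta_1$, and the absence of stated priors are assumptions; only $D_i$ is defined in the text.

```latex
\begin{aligned}
y_i &\sim \mathrm{Binomial}(n_i,\, p_i) \\
\operatorname{logit}(p_i) &= \beta_0 + \beta_1 D_i
\end{aligned}
```

Here $y_i$ would be the number of respondents in county $i$ who reported masking in a given month, $n_i$ the number of valid responses, and $p_i$ the modeled masking proportion.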
We were unable to use the provided weights for responses to the CTIS, due to spatial and temporal mismatch with the scales of our data analysis. Thus, we calculated county-month weights for each observation using the anesrake package [
Given the likelihood of sampling, nonresponse, and social desirability biases, we generated bias-corrected estimates of masking in the United States. In the absence of ground-truth masking data with which to calibrate these CTIS responses, we turned to a different survey question for which ground-truth data were available.
Beginning in late December 2020, the CTIS asked respondents whether they had received a COVID-19 vaccine. The response options were (1) “Yes,” (2) “No,” and (3) “I don’t know.” Meanwhile, ground-truth vaccination data were collected by combining state-reported and CDC data to estimate the percentage of people vaccinated in each county in the United States [
Like the masking data, the CTIS vaccination response data suffer from small and unrepresentative samples in some counties. Thus, we resampled the responses from April and May 2021 according to the survey weights we generated earlier and then used a (frequentist) binomial generalized linear mixed-effects model to estimate
We compared these modeled CTIS county-level vaccination proportions with the true vaccination data to calculate the expected bias in reported survey responses relative to ground-truth data in county
To increase the stability of our bias estimates, we used a linear mixed-effects model. This model used random intercepts, which shrink extreme estimates toward the overall mean, and assumed that the residual error in the estimates was normally distributed. It generated a penalized estimate of survey bias for each county from the difference between modeled reported vaccination and ground-truth vaccination:
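As an illustration of the shrinkage that random intercepts produce, the following minimal sketch pulls raw county bias estimates toward their overall mean. It is not the authors' model: it assumes one bias value per county and treats the between-county variance `tau2` and residual variance `sigma2` as known, whereas a fitted mixed-effects model estimates them from the data. The function name, county labels, and numbers are hypothetical.

```python
def shrink_biases(reported, truth, tau2, sigma2):
    """Partially pool county bias estimates toward the overall mean.

    reported, truth: dicts mapping county -> vaccination proportion.
    tau2: assumed between-county variance of the true bias.
    sigma2: assumed residual variance of each raw bias estimate.
    """
    # Raw bias: modeled survey-reported proportion minus ground truth.
    raw = {c: reported[c] - truth[c] for c in reported}
    mean_bias = sum(raw.values()) / len(raw)
    # Shrinkage weight: 1 keeps raw values, 0 collapses them to the mean.
    weight = tau2 / (tau2 + sigma2)
    return {c: mean_bias + weight * (b - mean_bias) for c, b in raw.items()}


# Hypothetical counties and proportions (illustrative only).
reported = {"A": 0.70, "B": 0.50, "C": 0.60}
truth = {"A": 0.40, "B": 0.45, "C": 0.45}
shrunk = shrink_biases(reported, truth, tau2=0.01, sigma2=0.01)
```

With equal variances the weight is 0.5, so each county's estimate moves halfway toward the mean bias: the largest raw bias (county A) is reduced and the smallest (county B) is increased, while the mean is unchanged.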
This model was implemented using
We then incorporated these estimates into a Bayesian binomial regression model with an offset for bias to estimate the bias-corrected probability of reporting masking in county
where $D_i = \log_{10}(\text{population density}_i)$ for county
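One plausible way to express a binomial regression with a bias offset is sketched below. The exact parameterization is not given here, so the logit-scale offset, its unit coefficient, and the symbols $\hat{b}_i$, $p_i^{\mathrm{rep}}$, and $p_i^{\mathrm{corr}}$ are assumptions made for illustration.

```latex
\begin{aligned}
y_i &\sim \mathrm{Binomial}\!\left(n_i,\, p_i^{\mathrm{rep}}\right) \\
\operatorname{logit}\!\left(p_i^{\mathrm{rep}}\right) &= \beta_0 + \beta_1 D_i + \hat{b}_i \\
p_i^{\mathrm{corr}} &= \operatorname{logit}^{-1}\!\left(\beta_0 + \beta_1 D_i\right)
\end{aligned}
```

Under this reading, $\hat{b}_i$ is the penalized bias estimate for county $i$, entered with its coefficient fixed at 1, so that the remaining terms describe the masking proportion with the estimated bias removed.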
We ran the model using
Beginning November 24, 2020, the CTIS asked a question about masking in one’s community: “In the past 7 days, when out in public places where social distancing is not possible, about how many people would you estimate wore masks?” The answer options were (1) “All of the people,” (2) “Most of the people,” (3) “Some of the people,” (4) “A few of the people,” (5) “None of the people,” and (6) “I have not been out in public places in the past 7 days.” We dichotomized these responses and aggregated them to the county-month the same way as the self-reported CTIS masking responses for December 2020 through May 2021. We then modeled these community masking estimates the same way we modeled the CTIS masking data using Bayesian binomial regression and resampling weighted by survey weights but without a bias offset.
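The dichotomize-and-aggregate step described above might look like the following sketch. The cutoff is an assumption: here "All" and "Most" count as community masking, and respondents who had not been out in public are dropped. The function name and example records are hypothetical.

```python
from collections import defaultdict

# Assumed cutoff (not confirmed by the text): "All" and "Most" count as
# masking; the remaining substantive answers count as not masking.
MASKING = {"All of the people", "Most of the people"}
NOT_MASKING = {
    "Some of the people",
    "A few of the people",
    "None of the people",
}


def aggregate_county_month(responses):
    """Count masking and valid responses per county-month.

    responses: iterable of (county_fips, "YYYY-MM", answer) tuples.
    Returns {(county_fips, month): (n_masking, n_valid)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for county, month, answer in responses:
        if answer in MASKING:
            counts[(county, month)][0] += 1
            counts[(county, month)][1] += 1
        elif answer in NOT_MASKING:
            counts[(county, month)][1] += 1
        # Any other answer (e.g. not out in public) is excluded.
    return {key: tuple(value) for key, value in counts.items()}


# Hypothetical survey records (illustrative only).
records = [
    ("11001", "2021-02", "All of the people"),
    ("11001", "2021-02", "Some of the people"),
    ("11001", "2021-02",
     "I have not been out in public places in the past 7 days"),
    ("24031", "2021-02", "Most of the people"),
]
county_month = aggregate_county_month(records)
```

The resulting pairs of counts are the kind of input a binomial regression would take as successes and trials for each county-month.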
All analyses were completed in R version 4.1.3 (R Core Team and the R Foundation for Statistical Computing), and maps were produced using
This study was reviewed by the Institutional Review Board of Georgetown University and was determined not to be human subject research.
To characterize trends in masking behavior in the United States during the COVID-19 pandemic, we used data from the CTIS conducted via Facebook from September 2020 through May 2021. Respondents self-reported how often they had worn a mask while in public in the past week (8,338,877 valid responses). We transformed these responses into a binary variable of masking or not masking and aggregated the responses to the county-month level to analyze spatiotemporal trends. To validate this data source, we analyzed a separate data set from Outbreaks Near Me and found consistent spatiotemporal patterns (
To demonstrate the spatially heterogeneous effects of our data-processing scheme,
Visualization of spatially heterogeneous data-processing effects. (A) Residuals following the binomial regression model. (B) Residuals following the binomial regression model with raking/sample rebalancing. (C) Residuals following the binomial regression model with raking/sample rebalancing and an offset for bias. Residuals are defined as the difference between the modeled and the observed masking estimates at each analysis stage, where negative values indicate model estimates were higher than observed values and positive residuals indicate model estimates were lower than observed values. All maps are shown for February 2021. N/A: not applicable. See
Using bias-corrected masking proportions from the CTIS, we found that masking behavior was spatially heterogeneous over all months (Moran's I between 0.68 and 0.70 for all months,
Bias-corrected masking behavior is spatially heterogeneous and higher in urban areas. (A) Map of bias-corrected masking behavior in October 2020 reveals high spatial heterogeneity. Masking proportions vary substantially even within a single state. Spatial heterogeneity does not notably vary over time (
Masking behavior not only varied geographically but also temporally. Peak masking behavior was observed in January 2021, while the lowest masking proportions were observed in May 2021 (
Bias-corrected masking behavior peaked in the winter of 2020-2021 and fell in the spring of 2021, mirroring new cases and increasing vaccinations. Top curves show the time series of the z-score of bias-corrected masking proportions for each county colored by the average masking proportion across the survey period. The inset plot shows z-scores of the 7-day rolling average of new cases (green), the proportion of individuals vaccinated nationally (orange), and the reported worry about severe illness from COVID-19 in CTIS respondents (purple). Z-scores are based on the mean and SD of each county’s masking estimates over the survey period. CTIS: COVID-19 Trends and Impact Survey. See
Bias-corrected masking proportions were well approximated by modeled estimates of community-reported masking (
Community-reported masking gives a good estimate of bias-corrected self-reported masking. Community-reported masking refers to the CTIS question where individuals report how many people in their community are masking, which may decrease nonresponse and social desirability biases, compared to asking individuals to self-report their masking behavior. Point color denotes urban-rural classes. Comparisons of individual- and community-reported estimates at different analysis stages are shown in
Despite the widespread adoption of face mask wearing at points during the COVID-19 pandemic in the United States, the true prevalence of this behavior across temporal and spatial scales is largely unknown. Data on mask wearing have been collected through surveys, at varying spatiotemporal resolutions and with potentially varying survey biases (eg, [
Our findings have several implications for the fields of infectious disease epidemiology and public health policy. We identified high spatial and moderate temporal heterogeneity in masking behavior at the county level, patterns that are obscured if data are aggregated to the state or HHS region level. Contrary to our expectations, this level of spatial variability around the mean is consistent over time. Consequently, disease models should account for spatial variability in masking behavior but may only need to consider changes in masking dynamics over longer temporal scales. The high spatial heterogeneity we found in masking behavior also highlights the need for diverse and targeted public health approaches across the country rather than a single national program. Guidance set at the state level without regard for differences in local conditions may miss early opportunities to control disease spread, prematurely enforce public health restrictions, and contribute to fatigue with public health restrictions. Thus, we advocate for local behavioral data collection and geographically targeted public health policy for optimized resource use and efficient disease suppression.
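The spatial structure discussed above was summarized in the results with Moran's I. A minimal sketch of that statistic follows, assuming binary adjacency weights that are not row-standardized; the weighting scheme used in the study is not stated here, and the function name, region labels, and values are hypothetical.

```python
def morans_i(values, neighbors):
    """Compute Moran's I with binary adjacency weights.

    values: dict mapping region -> numeric value.
    neighbors: dict mapping region -> iterable of adjacent regions
        (list each adjacency in both directions).
    """
    n = len(values)
    mean = sum(values.values()) / n
    dev = {region: v - mean for region, v in values.items()}
    cross = 0.0  # sum of w_ij * dev_i * dev_j over neighboring pairs
    s0 = 0.0     # sum of all weights
    for region, adjacent in neighbors.items():
        for other in adjacent:
            cross += dev[region] * dev[other]
            s0 += 1.0
    denom = sum(d * d for d in dev.values())
    return (n / s0) * cross / denom


# Four hypothetical regions arranged in a line: A - B - C - D.
line = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
clustered = {"A": 0.9, "B": 0.8, "C": 0.3, "D": 0.2}
alternating = {"A": 0.9, "B": 0.2, "C": 0.9, "D": 0.2}

i_clustered = morans_i(clustered, line)
i_alternating = morans_i(alternating, line)
```

Values near 1 indicate that neighboring regions resemble each other, as in the clustered example, while negative values indicate that neighbors tend to differ, as in the alternating example.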
Although county-level mask-wearing behavior varied across months, we observed little heterogeneity across counties in these temporal trends. The observed changes in masking behavior roughly correspond to national trends in new cases in the United States and self-reported worry about severe disease, as reported in the CTIS, though we did not determine causality or examine this relationship at the individual or county level. Because we modeled county-level averages, this observed correlation could be driven by a specific demographic group or subset of individuals modifying their masking behavior, rather than a uniform change in average mask uptake in a county’s population. The sharp decrease in masking in May 2021 is contemporaneous with many states lifting mask mandates (
Recent work has highlighted the potential for big data sources to provide a measurement of spatially disaggregated social phenomena (eg, [
Nevertheless, our approach has some limitations that are important to consider. We were unable to deal with all representation or response biases, including the exclusion of individuals under 18 years of age; a lack of representativeness due to factors other than age, sex, and education; recall bias; dishonest responses; and other characteristics that may be predictive of nonresponse or social desirability bias, such as political leanings or belief in COVID-19 conspiracies [
In summary, we produced the first accurate high-resolution spatiotemporal estimates of face mask wearing in the United States for the period from September 2020 through May 2021. Our work reveals that masking behavior is highly variable across the United States, suggesting that a one-size-fits-all approach to increasing mask-wearing behavior is likely to be ineffective. Instead, we identified regions of the country with higher and lower masking levels. These differences should be investigated going forward as public health organizations consider how to more effectively target these low-masking regions. For example, these communities may be more susceptible to mis- and disinformation regarding mitigation behaviors, which must be strategically confronted. Furthermore, this variability in behavior demonstrates the need to develop infectious disease dynamics models to analyze and predict how spatiotemporal trends in disease are affected by changes in human behavior, such as vaccination, contact patterns, and face mask wearing. Our analyses also address issues of survey bias, with the takeaway that, in the future, we should invest in a robust survey infrastructure that can recruit large representative samples with minimal bias, including using certain representative respondents as human social sensors to report on their communities.
Supplementary materials.
Bias-corrected masking behavior is spatially heterogeneous and higher in urban areas. (A) Map of bias-corrected masking behavior in October 2020 reveals high spatial heterogeneity. Masking proportions vary substantially even within a single state. Spatial heterogeneity does not notably vary over time (Multimedia Appendix 1, Figure S5). A selection of other months in the study period are shown in Multimedia Appendix 1 (Figures S6-S8). (B) Breakdown of county masking proportions over all survey months by the NCHS urban-rural classification. A direct relationship between the median masking proportion and population density is observed. N/A: not applicable; NCHS: National Center for Health Statistics.
Community-reported masking gives a good estimate of bias-corrected self-reported masking. Community-reported masking refers to the CTIS question where individuals report how many people in their community are masking, which may decrease nonresponse and social desirability biases, compared to asking individuals to self-report their masking behavior. Point color denotes urban-rural classes. Comparisons of individual- and community-reported estimates at different analysis stages are shown in Multimedia Appendix 1 (Figures S11 and S12). CTIS: COVID-19 Trends and Impact Survey.
CDC: Centers for Disease Control and Prevention
CTIS: COVID-19 Trends and Impact Survey
HHS: Health and Human Services
NCHS: National Center for Health Statistics
The authors thank the Carnegie Mellon University (CMU) Delphi team for sharing the US COVID-19 Trends and Impact Survey (CTIS) data openly and freely and Alex Reinhart for his feedback on this work. We also thank the Outbreaks Near Me team at Boston Children’s Hospital and Momentive for data sharing and Benjamin Rader and John Brownstein for their feedback.
The research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under award number R01GM123007. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Data aggregated to the county-month scale and all code to analyze these data will be made available on GitHub [
JCT performed analyses, interpreted results, and drafted and edited the manuscript. ZS designed and guided analyses and edited the manuscript. SB designed the study, guided the analysis, interpreted results, and edited the manuscript. All authors have read and approved the final manuscript.
None declared.