Background: Face mask wearing has been identified as an effective strategy to prevent the transmission of SARS-CoV-2, yet mask mandates were never imposed nationally in the United States. This decision resulted in a patchwork of local policies and varying compliance, potentially generating heterogeneities in the local trajectories of COVID-19 in the United States. Although numerous studies have investigated the patterns and predictors of masking behavior nationally, most suffer from survey biases and none have been able to characterize mask wearing at fine spatial scales across the United States through different phases of the pandemic.
Objective: Urgently needed is a debiased spatiotemporal characterization of mask-wearing behavior in the United States. This information is critical to further assess the effectiveness of masking, evaluate the drivers of transmission at different time points during the pandemic, and guide future public health decisions through, for example, forecasting disease surges.
Methods: We analyzed spatiotemporal masking patterns in over 8 million behavioral survey responses from across the United States, starting in September 2020 through May 2021. We adjusted for sample size and representation using binomial regression models and survey raking, respectively, to produce county-level monthly estimates of masking behavior. We additionally debiased self-reported masking estimates using bias measures derived by comparing vaccination data from the same survey to official records at the county level. Lastly, we evaluated whether individuals’ perceptions of their social environment can serve as a less biased form of behavioral surveillance than self-reported data.
Results: We found that county-level masking behavior was spatially heterogeneous along an urban-rural gradient, with mask wearing peaking in winter 2021 and declining sharply through May 2021. Our results identified regions where targeted public health efforts could have been most effective and suggest that individuals’ frequency of mask wearing may be influenced by national guidance and disease prevalence. We validated our bias correction approach by comparing debiased self-reported mask-wearing estimates with community-reported estimates, after addressing issues of a small sample size and representation. Self-reported behavior estimates were especially prone to social desirability and nonresponse biases, and our findings demonstrated that these biases can be reduced if individuals are asked to report on community rather than self behaviors.
Conclusions: Our work highlights the importance of characterizing public health behaviors at fine spatiotemporal scales to capture heterogeneities that may drive outbreak trajectories. Our findings also emphasize the need for a standardized approach to incorporating behavioral big data into public health response efforts. Even large surveys are prone to bias; thus, we advocate for a social sensing approach to behavioral surveillance to enable more accurate estimates of health behaviors. Finally, we invite the public health and behavioral research communities to use our publicly available estimates to consider how bias-corrected behavioral estimates may improve our understanding of protective behaviors during crises and their impact on disease dynamics.
Human behavior plays a key role in infectious disease transmission [, ]. Individuals’ decisions to get vaccinated, reduce their contacts, or wear a face mask, for example, can have a tremendous impact on disease dynamics [ - ]. The COVID-19 pandemic has highlighted that we are grossly limited in our ability to accurately measure and predict human behavior in the face of a novel pathogen. Yet, knowledge of how human behaviors vary over time and space is critical to assess the effectiveness of mitigation strategies, to forecast disease surges, and to parameterize coupled disease-behavior models [ ]. In particular, there is a paucity of data on how the frequency of face mask wearing varies across the United States over different phases of the pandemic. This lack of fine-scale spatiotemporal data has forced public health organizations to adopt an inefficient one-size-fits-all approach to encourage masking nationwide, rather than directing resources and messaging to areas with the lowest uptake. Here, we identify the spatiotemporal trends in self-reported data on mask-wearing behavior across the United States from a large survey distributed from September 2020 to May 2021.
Mask wearing has been identified as an effective strategy to reduce the transmission of SARS-CoV-2. At the individual level, masks decrease both the amount of viral particles dispersed by an infectious wearer and the amount of those inhaled by an uninfected wearer . Modeling studies at the population level (eg, [ , , ]) have suggested that mask wearing can limit SARS-CoV-2 transmission and COVID-19 deaths, including under scenarios where masks are not worn universally or are not completely effective at blocking transmission. Randomized controlled trials (eg, [ ]) have also demonstrated that mask wearing is an effective community-level intervention against COVID-19. Despite limited information at the time, the Centers for Disease Control and Prevention (CDC) initially recommended mask wearing on April 3, 2020 [ ]. Lack of a national mandate, though, resulted in a heterogeneous landscape of mask policies across states, counties, towns, and even individual businesses in the United States [ , ]. Compounded with this spatial heterogeneity in mandates is additional heterogeneity in compliance, documented by localized observational studies (eg, [ ]). A collection of systematic, accurate data on mask-wearing levels across the United States is therefore essential to informing our understanding of the role of mask wearing in the US COVID-19 pandemic trajectory.
To address this gap, researchers and organizations have implemented extensive surveys on human behavior, including mask wearing (eg, [- ]). These surveys hold exciting promise, yet they have contributed relatively little to our understanding of human behavior, due to significant sampling limitations. Larger surveys with sufficient power to detect trends at local geographic scales are often not designed to capture a representative sample of the population. Demographic biases arising from a nonrepresentative sample can be addressed with standard statistical tools, such as survey weights, but other forms of bias, particularly nonresponse and social desirability biases, are more challenging to correct. Surveys about salient public health issues are especially likely to suffer from response bias; COVID-19–cautious individuals may be overrepresented in a survey about COVID-19 behavior. However, without estimates of the proportion of individuals in a given region who are COVID-19 cautious, there is no way to use survey weights on this demographic. Likewise, respondents may be influenced by social desirability bias when self-reporting COVID-19–preventive behaviors, such as vaccination, social distancing, or mask wearing, so that they respond in a manner deemed favorable by society despite being inaccurate [ ]. Without observational or ground-truth data to validate survey responses, quantifying this social desirability bias is difficult. Furthermore, it is critical that ground-truth data to correct biases in health behavior be used at a fine spatial and temporal scale to avoid further exacerbation of biases (eg, [ ]).
The value of surveys on public health behaviors can be further restricted when data collection is at the national or state level. Coarse-grained spatiotemporal information about human behavior is of limited utility, providing only sparse insight into local trends. Collecting responses at the national or state level ignores spatial heterogeneity at these finer scales, preventing the identification of these local effects that can drive disease dynamics. Spatial heterogeneity in not only drivers of disease transmission, such as human behavior, but also disease prevalence has been well documented across pathogens (eg, [- ]). For example, differences in connectivity between counties or states can affect the timing and geographic scale of disease spread, while national scale mobility data elide these key patterns [ , ]. Likewise, aggregation of vaccination data to the state level can hide spatial clustering of unvaccinated individuals, which undermines herd immunity and could drive sustained measles outbreaks in the United States [ , ]. Despite the importance of detailed local estimates on the drivers of disease incidence, few studies have analyzed human behavior during the COVID-19 pandemic nationally at these fine spatial scales. Furthermore, most surveys are not conducted for long enough to capture human behavior changes over time, leaving scant opportunity to assess the effects of changing public health messaging/guidance or disease prevalence on human behavior.
Here, we systematically characterize mask wearing across the United States at a fine spatiotemporal scale for 9 months using a national survey and account for the bias in this survey. By comparing survey demographics and vaccination statuses with accurate ground-truth data, we estimate and account for survey and response biases in our analysis of masking behavior. With these bias-corrected estimates, we characterize the spatiotemporal heterogeneity in masking behavior at the county-month level across the United States. Finally, we examine the differences between self-reported and community-reported estimates of masking using an additional survey question, seeking to understand whether these 2 measures are good predictors of one another. Our results are the most precise estimates of masking in the United States during the COVID-19 pandemic, providing insight into the local variation in behavior in response to public health messaging and changes in COVID-19 incidence.
In this study, we sought to characterize the spatiotemporal heterogeneity in self-reported masking behavior in the United States from the fall of 2020 to the spring of 2021. Due to the small sample size in some counties, we used Bayesian binomial regression models to estimate mask-wearing proportions each month. Recognizing that surveys are subject to several types of bias, we used raking and resampling of responses to correct for unrepresentative samples and self-reported vaccination status compared to ground-truth vaccination data to quantify nonresponse and social desirability biases. With these estimates, we were able to identify spatiotemporal trends in bias-corrected masking behavior and compare these values to reported community levels of masking in a different survey question.
Survey Data and Processing
We analyzed self-reported mask-wearing survey responses for all 50 US states and the District of Columbia using data from the US COVID-19 Trends and Impact Survey (CTIS) . The CTIS was created by the Delphi Research Group at Carnegie Mellon University and distributed through a partnership with Facebook. Beginning in September 2020, a random state-stratified sample of active Facebook users was invited daily to take the survey about COVID-19 and report how often they wore a mask in the past 5-7 days (the number of days changed from 5 to 7 on February 8, 2021). The answer options were (1) “All of the time,” (2) “Most of the time,” (3) “Some of the time,” (4) “A little of the time,” (5) “None of the time,” and (6) “I have not been in public in the last 5-7 days” ( , Figure S13). To dichotomize these responses for an analysis of the proportion of respondents wearing masks, we dropped respondents who had not been in public recently or did not respond to the masking question, and considered responses of “all of the time” and “most of the time” as masking, and all other responses as not masking. This cut-off is reasonable, considering the raw proportions of responses in each category for September through May ( , Figure S14). Due to sample size constraints, we aggregated these responses to the county-month scale. We ignored potential heterogeneity at smaller temporal (weekly) and spatial (zip code) scales due to the limited sample size.
By dichotomizing masking responses, we also lost information about the frequency with which people mask, though we expect the effect of this choice to be minimal (seefor details).
Bayesian Binomial Regression Model
Due to the small sample sizes in some US counties, we used Bayesian binomial regression models to develop reliable estimates of the proportion of individuals masking in a given county-month. Population density was used as a fixed effect; masking behavior has previously been linked to population density, and this variable was easily available at the county scale [, ]. We fit separate models for each month, allowing for a temporal trend without explicitly modeling it by specifying a parametric form. We defined Mi as the number of respondents masking in county i (eg, respondents who masked most or all of the time in the past 5-7 days), Ni as the total number of respondents in county i (Mi ≤ Ni), and pi as the county-level probability of a response consistent with masking. We used the following model to estimate and :
where Di = log10(population densityi) for county i. We ran the model using brms  with the cmdstanR [ ] backend. We ran the sampler with 4 chains for 3000 iterations per chain. Sampler diagnostics indicated efficient exploration and that the model had converged, neff < 0.25, neff per iteration ≥ 0.25, ≤ 1.01, E-BFMI > 0.25, and no transitions hit the maximum tree depth. All Pareto-smoothed importance sampling leave-one-out (PSIS-LOO) k statistic values were below 0.71, indicating that the model was robust to the influence of individual observations, and the distribution of Pareto k statistics was uniform, indicating that the model captured essential features of the data [ ]. Posterior predictive checks indicated a good model fit ( , Figure S15) as did the plots of observed versus predicted and residual values ( , Figures S16 and S17). We note that our binomial regression approach compensated for the small sample sizes in some county-months, but the resulting estimates depend on the validity of our model structure.
We also explored more complex model specifications that included state- or county-level random effects. However, both models suffered from a lack of convergence or overfitting and produced functionally similar results. Thus, we opted for the more parsimonious model presented earlier for our main findings; details of these additional models can be found in(Figures S20 and S21).
Survey Raking and Resampling
We were unable to use the provided weights for responses to the CTIS, due to spatial and temporal mismatch with the scales of our data analysis. Thus, we calculated county-month weights for each observation using the anesrake package  and the US Census American Community Survey’s 2018 5-year data on county age, sex, and education distributions. Age, sex, and education distributions were based on each county’s population over the age of 18 years to match the survey sample. We did not use race or ethnicity data in the raking scheme, as their inclusion substantially reduced algorithm convergence, but note that race/ethnicity was moderately correlated with education (Cramer's V>0.10). We then resampled from these responses using the calculated weights to estimate a raked masking proportion, which was fed into the binomial models, as described before. We excluded observations with missing age, sex, or education responses from the raking process and assigned equal weights to observations from county-months that did not converge (additional details in ).
Estimation of CTIS Masking Bias
Given the likelihood of sampling, nonresponse, and social desirability biases, we generated bias-corrected estimates of masking in the United States. In the absence of ground-truth masking data with which to calibrate these CTIS responses, we turned to a different survey question for which ground-truth data were available.
Beginning in late December 2020, the CTIS asked respondents whether they had received a COVID-19 vaccine. The response options were (1) “Yes, “(2) “No,” and (3) “I don’t know.” Meanwhile, ground-truth vaccination data were collected by combining state-reported and CDC data to estimate the percentage of people vaccinated in each county in the United States [, ]. A comparison of CTIS responses and ground-truth vaccination data revealed that the estimates of COVID-19 vaccination based on CTIS responses were much higher than true vaccination rates at the US county scale ( , Figures S18 and S19, [ , ]). Assuming that masking survey responses suffer from the same bias issues (in magnitude and direction) as vaccination responses, this result would suggest that CTIS responses also substantially overestimate masking behavior. Thus, we approximated survey bias by comparing the CTIS vaccination responses to the ground-truth vaccination data at the county level and incorporating this bias into the model of CTIS masking behavior.
Like the masking data, the CTIS vaccination response data suffers from small and unrepresentative samples in some counties. Thus, we resampled the responses from April and May 2021 according to the survey weights we generated before and then used a (frequentist) binomial generalized linear mixed-effects model to estimate pi, the proportion of respondents who were vaccinated (assumed to be partial vaccination, with 1 of a 1-dose or 2-dose vaccine) at the county level each week (details in).
Given these modeled CTIS county-level vaccination proportions, we compared them with the true vaccination data to calculate the expected bias in reported survey responses relative to ground-truth data in county i:
biasi = logit(CTIS estimated vaccination proportioni) – logit(true vaccination proportioni)
To increase the stability of our bias estimates, we used a linear mixed-effects model. This mixed-effects model used random intercepts, which penalizes extreme coefficient estimates to the overall mean, and assumed that the residual error in the estimates was normally distributed. This model generated a penalized estimate of survey bias for each county from the difference in modeled reported vaccination and ground-truth vaccination:
This model was implemented using lmer in the lme4 package . If there were no responses in county i or a bias estimate could not be calculated, bias estimates for this county were imputed by taking the mean of surrounding county estimates.
We then incorporated these estimates into a Bayesian binomial regression model with an offset for bias to estimate the bias-corrected probability of reporting masking in county i. We defined Mi as the number of respondents masking in county i out of Ni total respondents and pi as the county-level probability of a response consistent with masking. We used the following model to estimate and :
where Di = log10(population densityi) for county i. The bias-corrected proportion of individuals masking, ci, was calculated as
We ran the model using brms  with the cmdstanR [ ] backend. We ran the sampler with 4 chains for 3000 iterations per chain. Sampler diagnostics indicated efficient exploration and that the model had converged: neff>1000, neff per iteration≥0.3, and ≤ 1.01.
Beginning November 24, 2020, the CTIS asked a question about masking in one’s community: “In the past 7 days, when out in public places where social distancing is not possible, about how many people would you estimate wore masks?” The answer options were (1) “All of the people,” (2) “Most of the people,” (3) “Some of the people,” (4) “A few of the people,” (5) “None of the people,” and (6) “I have not been out in public places in the past 7 days.” We dichotomized these responses and aggregated them to the county-month the same way as the self-reported CTIS masking responses for December 2020 through May 2021. We then modeled these community masking estimates the same way we modeled the CTIS masking data using Bayesian binomial regression and resampling weighted by survey weights but without a bias offset.
All analyses were completed in R version 4.1.3 (R Core Team and the R Foundation for Statistical Computing), and maps were produced using choroplethr . Urban-rural classes were from the National Center for Health Statistics (NCHS) Urban-Rural Classification Scheme [ ].
This study was reviewed by the Institutional Review Board of Georgetown University and was determined not to be human subject research.
To characterize the trends in the masking behavior in the United States during the COVID-19 pandemic, we used data from the CTIS conducted via Facebook from September 2020 through May 2021. Respondents self-reported how often they had worn a mask while in public in the past week (8,338,877 valid responses). We transformed these responses into a binary variable of masking or not masking and aggregated the responses to the county-month level to analyze spatiotemporal trends. To validate this data source, we analyzed a separate data set from Outbreaks Near Me and found consistent spatiotemporal patterns (, Figures S1-S3), though both data sources suffer from issues of bias and small sample size. We addressed these issues in the CTIS data using binomial regression models to inform estimates of masking in counties with a small sample size, and raking and sample rebalancing on age, sex, and education to adjust for unrepresentative samples. Recognizing that CTIS responses to a question about vaccination overestimated the true vaccination rates, we quantified this bias for each county and used it to correct the estimates of masking behavior, assuming that vaccination and masking behavior responses were equally as biased. (Vaccination and mask wearing are both prosocial public health behaviors, which are socially desirable to report and are likely correlated [ - ].) We analyzed the overall spatial and temporal trends as well as fine-scale heterogeneity in the bias-corrected masking behavior estimates. Finally, we validated the bias-corrected CTIS values by comparing them to the respondents’ estimates of the proportion of people masking in their community.
Spatially Heterogeneous Effects of the Binomial Regression Model, Survey Raking, and Debiasing
To demonstrate the spatially heterogeneous effects of our data-processing scheme,shows the difference between estimates from 3 separate models and the raw CTIS masking data. We refer to this difference as the residual, though it is only an indicator of model fit in A. In B and 1C, the residual values indicate where data corrections caused the largest changes in estimates compared to the original data. After modeling the data with binomial regression, estimates of masking proportions were higher than the observed values ( , Figure S4) in the central United States and slightly lower than the observed values in the Northeast, Northwest, and Southwest ( A). Adjusting for unrepresentative samples with raking and resampling and rerunning the binomial regression model had a minor effect on mask-wearing estimates, only exhibiting a slight decrease compared to the model without raking ( B). Correcting for survey biases using vaccination data in the binomial regression model run on raked survey responses systematically decreased masking proportions, as expected and denoted by increased residuals ( C). (A map showing the spatial distribution of these biases is given in , Figure S19.) We refer to estimates from the model in C as debiased or bias-corrected for the remainder of the paper. Our results reinforce that behavioral surveillance should be conducted carefully to limit bias initially.
Masking Behavior Exhibits Spatial and Temporal Heterogeneity and Is Positively Associated With Population Density
Using bias-corrected masking proportions from the CTIS, we found that masking behavior was spatially heterogeneous over all months (Moran's I between 0.68 and 0.70 for all months,A and S5 in ). Bias-corrected masking proportions ranged from 0.11 to 0.96 and varied substantially within states, emphasizing the importance of analyzing masking behavior at finer scales than the state or Health and Human Services (HHS) region level. Masking proportions were closely linked to population density over the survey period: urban counties tended to have higher masking proportions than rural counties ( B). Although masking proportions ranged quite a bit within NCHS urban-rural classifications, all differences between NCHS classes were significant (Kruskal-Wallis and pairwise Wilcox test, n=27,842, all P<.001). Over all counties and survey months, the median fitted masking proportion in urban counties exceeded 0.8, while the median fitted masking proportion in the most rural counties was below 0.6.
Masking behavior not only varied geographically but also temporally. Peak masking behavior was observed in January 2021, while the lowest masking proportions were observed in May 2021 (). Counties with higher mean masking proportions fluctuated less than counties with lower mean masking proportions from September to April but experienced the largest differences from their mean values in May 2021. For context, we highlight that this decrease coincides with increasing proportions of vaccinated individuals in the United States ( , [ ]), declining new infections [ ], and decreasing reported worry about severe illness due to COVID-19 from the CTIS [ ]. The policy context during this time was also shifting: On April 27, 2021, the CDC announced that fully vaccinated individuals no longer needed to wear masks outdoors [ ], and on May 13, 2021, it announced that fully vaccinated individuals no longer had to wear masks indoors either [ ]. Meanwhile, 49% of counties that ever had a mask mandate lifted it before May 1, 2021 ( , Figure S9). These announcements coincided with the observed decrease in masking in these months. Together, these analyses underscore the importance of tracking and analyzing mask wearing at fine spatial and across long temporal scales: further spatial and temporal aggregation of these data would have missed key heterogeneity previously not quantified and prevented future work from investigating the connection between policy and behavior change at appropriate granularity.
Community-Reported Masking Levels Are a Good Predictor of Bias-Corrected Self-Reported Estimates
Bias-corrected masking proportions were well approximated by modeled estimates of community-reported masking (). The difference between the 2 mask-wearing proportion estimates ranged from –6% to 5% and became more apparent in April and May 2021, particularly in rural areas. In May 2021, though, community estimates in urban areas tended to overestimate bias-corrected individual masking estimates. This result is quantitatively affected by influential observations but is qualitatively robust ( , Figure S10). These results suggest that surveying participants about community masking may give less biased responses than asking individuals to report their own masking behavior, potentially reducing social desirability bias and capturing parts of the population that may be otherwise less likely to respond to the survey.
Despite the widespread adoption of face mask-wearing at points during the COVID-19 pandemic in the United States, the true prevalence of this behavior across temporal and spatial scales is largely unknown. Data on mask wearing have been collected through surveys, at varying spatiotemporal resolutions and with potentially varying survey biases (eg, [- ]). Here, we characterized mask-wearing behavior across the United States using self-reported masking data from a large national online survey. We used Bayesian binomial regression models to remediate issues of small sample size, performed raking/sample balancing to address unrepresentative survey samples, and corrected for additional response biases using measurable bias in vaccination data. We observed substantial spatial heterogeneity in the masking behavior across urban versus rural counties, with some temporal changes in mean masking estimates at the county-month level, most notably a steep decline in masking in May 2021. We found that community-reported masking responses well approximated our bias-corrected masking estimates. Other work adds to this validation: Similar spatial heterogeneity was found in 2 other surveys, with some overlapping periods ( , Figures S1-S3, [ , ]), and our debiased estimates generally agree with those recorded in observational studies, including higher levels of masking in urban areas ( , Table S1, [ , ]). Our results reveal the landscape of the masking behavior across 3 distinct phases of the pandemic (presurge, surge during winter 2020-2021, and postsurge during the initial COVID-19 vaccination rollout). Our work also highlights the critical role that behavioral big data can play in the pandemic response, if such data are used with caution.
Our findings have several implications for the fields of infectious disease epidemiology and public health policy. We identified high spatial and moderate temporal heterogeneity in masking behavior at the county-level—patterns that are obscured if data are aggregated to the state or HHS region level. Contrary to our expectations, this level of spatial variability around the mean is consistent over time. Consequently, disease models should account for spatial variability in masking behavior but may only need to consider changes in masking dynamics over longer temporal scales. The high spatial heterogeneity we found in masking behavior also highlights the need for diverse and targeted public health approaches across the country rather than a single national program. Guidance set at the state level without regard for differences in local conditions may miss early opportunities to control disease spread, prematurely enforce public health restrictions, and contribute to fatigue with public health restrictions. Thus, we advocate for local behavioral data collection and geographically targeted public health policy for optimized resource use and efficient disease suppression.
Although county-level mask-wearing behavior varied across months, we observed little heterogeneity across counties in these temporal trends. The observed changes in masking behavior roughly correspond to national trends in new cases in the United States and self-reported worry about severe disease, as reported in the CTIS, though we did not determine causality or examine this relationship at the individual or county level. Because we modeled county-level averages, this observed correlation could be driven by a specific demographic group or subset of individuals modifying their masking behavior, rather than a uniform change in average mask uptake in a county’s population. The sharp decrease in masking in May 2021 is contemporaneous with many states lifting mask mandates (, Figure S9) and an announcement from the CDC that vaccinated individuals no longer have to wear masks outdoors (April 27, 2021 [ ]) or indoors (May 13, 2021 [ ]). It is plausible that these policy changes could have impacted masking behavior, both in vaccinated and in unvaccinated individuals, even though the change in CDC guidance did not apply to unvaccinated individuals [ , ]. More work is needed to explore the potential differences in mask wearing between vaccinated and unvaccinated individuals. Additional research could also focus on quantifying the impact of social norms on individuals’ masking behavior at fine spatiotemporal resolution in the United States.
Recent work has highlighted the potential for big data sources to provide a measurement of spatially disaggregated social phenomena (eg, ), while other research points to the challenges of inferring high-quality estimates of behavior from such high-volume data sources [ ]. In our work, we sought to steer away from “big data hubris” [ ] by applying rigorous statistical methods to manage concerns about representativeness and bias and by conducting an internal validation of our model-based estimates [ ]. In particular, we addressed representativeness by age, sex, and education to capture sociodemographic response bias. Motivated by an association between COVID-19–preventative behaviors [ ], we debiased our masking estimates based on vaccination data to address additional nonresponse bias and social desirability bias. Finally, we internally validated our population-scale masking estimates of self-reported behaviors with responses of community behaviors, which may be subject to less nonresponse and social desirability bias compared to self-reporting questions [ ]. We found that community-reported masking estimates agreed closely with bias-corrected self-reported masking behavior, highlighting that surveying participants about community behavior may be an avenue to reduce survey bias. We note, however, that this finding may not apply in all settings; self-reported masking behavior on a university campus closely matched observed masking levels, and questions about community masking were less accurate [ ]. Although these implications for analysis of surveys on human behavior may not apply universally, similar results have been found in other infectious disease applications, including disease surveillance (using the CTIS data [ , ]) and early outbreak detection in social networks [ , ]. Our results further emphasize the promise of human social sensing going forward to make big data sources more meaningful [ ].
Nevertheless, our approach has some limitations that are important to consider. We were unable to deal with all representation or response biases, including the exclusion of individuals under 18 years of age; a lack of representativeness due to factors other than age, sex, and education; recall bias; dishonest responses; and other characteristics that may be predictive of nonresponse or social desirability bias, such as political leanings or belief in COVID-19 conspiracies . Likewise, we could not account for how individuals with Facebook accounts may engage differently in COVID-19–preventive behaviors than non-Facebook users. It is unclear whether these biases that are unaccounted for would have a systematic or random effect on our results. However, we do expect their impact to be relatively small, particularly as our ground-truth–based debiasing approach adjusts masking estimates regardless of the source of nonresponse bias, and as supported by our community-reported masking analysis (since the community question captures some of the non-Facebook user population). We assumed that self-reported mask wearing is biased in the same magnitude and direction as the self-reported COVID-19 vaccination status—an assumption that should be tested in future research. Our approach also does not resolve issues of a small sample size; for example, the association we found between bias-corrected self-reported masking and community-reported masking is stronger between modeled estimates than between raw means. Although we attempted to correct for bias in our mask-wearing estimates, the point estimates for these values should be interpreted with caution. The goal of our work is not to produce point estimates of county-level mask-wearing behavior but instead to take advantage of the CTIS survey design and characterize the relative trends in mask-wearing behavior between and across counties. We advocate for additional observational studies with experimental designs that would allow for direct estimation of these quantities to improve behavioral surveillance estimates.
In summary, we produced the first accurate high-resolution spatiotemporal estimates of face mask wearing in the United States for the period from September 2020 through May 2021. Our work reveals that masking behavior is highly variable across the United States, suggesting that a one-size-fits-all approach to increasing mask-wearing behavior is likely to be ineffective. Instead, we identified regions of the country with higher and lower masking levels. These differences should be investigated going forward as public health organizations consider how to more effectively target these low-masking regions. For example, these communities may be more susceptible to mis- and disinformation regarding mitigation behaviors, which must be strategically confronted. Furthermore, this variability in behavior demonstrates the need to develop infectious disease dynamics models to analyze and predict how spatiotemporal trends in disease are affected by changes in human behavior, such as vaccination, contact patterns, and face mask wearing. Our analyses also address issues of survey bias, with the takeaway that, in the future, we should invest in a robust survey infrastructure that can recruit large representative samples with minimal bias, including using certain representative respondents as human social sensors to report on their communities.
The authors thank the Carnegie Mellon University (CMU) Delphi team for sharing the US COVID-19 Trends and Impact Survey (CTIS) data openly and freely and Alex Reinhart for his feedback on this work. We also thank the Outbreaks Near Me team at Boston Children’s Hospital and Momentive for data sharing and Benjamin Rader and John Brownstein for their feedback.
The research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under award number R01GM123007. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Data aggregated to the county-month scale and all code to analyze these data will be made available on GitHub . Individual survey responses cannot be shared by the authors, but researchers can refer to Ref. [ ] if they would like to enter an agreement for data usage with the Carnegie Mellon University (CMU) Delphi team.
JCT performed analyses, interpreted results, and drafted and edited the manuscript. ZS designed and guided analyses and edited the manuscript. SB designed the study, guided the analysis, interpreted results, and edited the manuscript. All authors have read and approved the final manuscript.
Conflicts of Interest
Supplementary materials.PDF File (Adobe PDF File), 29021 KB
Visualization of spatially heterogeneous data-processing effects. (A) Residuals following the binomial regression model. (B) Residuals following the binomial regression model with raking/sample rebalancing. (C) Residuals following the binomial regression model with raking/sample rebalancing and an offset for bias. Residuals are defined as the difference between the modeled and the observed masking estimates at each analysis stage, where negative values indicate model estimates were higher than observed values and positive residuals indicate model estimates were lower than observed values. All maps are shown for February 2021. N/A: not applicable.PNG File , 4292 KB
Bias-corrected masking behavior is spatially heterogeneous and higher in urban areas. (A) Map of bias-corrected masking behavior in October 2020 reveals high spatial heterogeneity. Masking proportions vary substantially even within a single state. Spatial heterogeneity does not notably vary over time (Multimedia Appendix 1, Figure S5). A selection of other months in the study period are shown in Multimedia Appendix 1 (Figures S6-S8). (B) Breakdown of county masking proportions over all survey months by the NCHS urban-rural classification. A direct relationship between the median masking proportion and population density is observed. N/A: not applicable; NCHS: National Center for Health Statistics.PNG File , 3213 KB
Bias-corrected masking behavior peaked in the winter of 2020-2021 and fell in the spring of 2021, mirroring new cases and increasing vaccinations. Top curves show the time series of the z-score of bias-corrected masking proportions for each county colored by the average masking proportion across the survey period. The inset plot shows z-scores of the 7-day rolling average of new cases (green), the proportion of individuals vaccinated nationally (orange), and the reported worry about severe illness from COVID-19 in CTIS respondents (purple). Z-scores are based on the mean and SD of each county’s masking estimates over the survey period. CTIS: COVID-19 Trends and Impact Survey.PNG File , 3253 KB
Community-reported masking gives a good estimate of bias-corrected self-reported masking. Community-reported masking refers to the CTIS question where individuals report how many people in their community are masking, which may decrease nonresponse and social desirability biases, compared to asking individuals to self-report their masking behavior. Point color denotes urban-rural classes. Comparisons of individual- and community-reported estimates at different analysis stages are shown in Multimedia Appendix 1 (Figures S11 and S12). CTIS: COVID-19 Trends and Impact Survey.PNG File , 3105 KB
- Funk S, Salathé M, Jansen VAA. Modelling the influence of human behaviour on the spread of infectious diseases: a review. J R Soc Interface 2010 Sep 06;7(50):1247-1256 [FREE Full text] [CrossRef] [Medline]
- Ferguson N. Capturing human behaviour. Nature 2007 Apr 12;446(7137):733. [CrossRef] [Medline]
- Del Valle S, Hethcote H, Hyman J, Castillo-Chavez C. Effects of behavioral changes in a smallpox attack model. Math Biosci 2005 Jun;195(2):228-251. [CrossRef] [Medline]
- Worby CJ, Chang H. Face mask use in the general population and optimal resource allocation during the COVID-19 pandemic. Nat Commun 2020 Aug 13;11(1):4049 [FREE Full text] [CrossRef] [Medline]
- Bayham J, Kuminoff NV, Gunn Q, Fenichel EP. Measured voluntary avoidance behaviour during the 2009 A/H1N1 epidemic. Proc Biol Sci 2015 Nov 07;282(1818):20150814 [FREE Full text] [CrossRef] [Medline]
- Verelst F, Willem L, Beutels P. Behavioural change models for infectious disease transmission: a systematic review (2010-2015). J R Soc Interface 2016 Dec;13(125):20160820 [FREE Full text] [CrossRef] [Medline]
- Ueki H, Furusawa Y, Iwatsuki-Horimoto K, Imai M, Kabata H, Nishimura H, et al. Effectiveness of face masks in preventing airborne transmission of SARS-CoV-2. mSphere 2020 Oct 28;5(5):e00637-20. [CrossRef]
- Eikenberry SE, Mancuso M, Iboi E, Phan T, Eikenberry K, Kuang Y, et al. To mask or not to mask: modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic. Infect Dis Model 2020;5:293-308 [FREE Full text] [CrossRef] [Medline]
- Stutt ROJH, Retkute R, Bradley M, Gilligan CA, Colvin J. A modelling framework to assess the likely effectiveness of facemasks in combination with 'lock-down' in managing the COVID-19 pandemic. Proc Math Phys Eng Sci 2020 Jun 10;476(2238):20200376 [FREE Full text] [CrossRef] [Medline]
- Abaluck J, Kwong LH, Styczynski A, Haque A, Kabir MA, Bates-Jefferys E, et al. Impact of community masking on COVID-19: a cluster-randomized trial in Bangladesh. Science 2022 Jan 14;375(6577):eabi9069 [FREE Full text] [CrossRef] [Medline]
- Fisher KA, Barile JP, Guerin RJ, Vanden Esschert KL, Jeffers A, Tian LH, et al. Factors associated with cloth face covering use among adults during the COVID-19 pandemic - United States, April and May 2020. MMWR Morb Mortal Wkly Rep 2020 Jul 17;69(28):933-937 [FREE Full text] [CrossRef] [Medline]
- Joo H, Miller GF, Sunshine G, Gakh M, Pike J, Havers FP, et al. Decline in COVID-19 hospitalization growth rates associated with statewide mask mandates - 10 states, March-October 2020. MMWR Morb Mortal Wkly Rep 2021 Feb 12;70(6):212-216 [FREE Full text] [CrossRef] [Medline]
- Guy GP, Lee FC, Sunshine G, McCord R, Howard-Williams M, Kompaniyets L, CDC COVID-19 Response Team‚ Mitigation Policy Analysis Unit, CDC Public Health Law Program. Association of state-issued mask mandates and allowing on-premises restaurant dining with county-level COVID-19 case and death growth rates - United States, March 1-December 31, 2020. MMWR Morb Mortal Wkly Rep 2021 Mar 12;70(10):350-354 [FREE Full text] [CrossRef] [Medline]
- Haischer MH, Beilfuss R, Hart MR, Opielinski L, Wrucke D, Zirgaitis G, et al. Who is wearing a mask? Gender-, age-, and location-related differences during the COVID-19 pandemic. PLoS One 2020 Oct 15;15(10):e0240785 [FREE Full text] [CrossRef] [Medline]
- Kapteyn A, Angrisani M, Bennett D, de BW, Darling J, Gutsche T. Tracking the effect of the COVID-19 pandemic on the lives of American households. Survey Res Methods 2020 Jun;14(2):186. [CrossRef]
- US Census Bureau. Measuring Household Experiences during the Coronavirus Pandemic. 2022. URL: https://www.census.gov/householdpulsedata [accessed 2023-01-06]
- Katz J, Sanger-Katz M, Quealy K. A Detailed Map of Who Is Wearing Masks in the U.S. 2020 Jul 17. URL: https://www.nytimes.com/interactive/2020/07/17/upshot/coronavirus-face-mask-map.html [accessed 2023-01-06]
- Jakubowski A, Egger D, Nekesa C, Lowe L, Walker M, Miguel E. Self-reported vs directly observed face mask use in Kenya. JAMA Netw Open 2021 Jul 01;4(7):e2118830 [FREE Full text] [CrossRef] [Medline]
- Bradley VC, Kuriwaki S, Isakov M, Sejdinovic D, Meng X, Flaxman S. Unrepresentative big surveys significantly overestimated US vaccine uptake. Nature 2021 Dec;600(7890):695-700 [FREE Full text] [CrossRef] [Medline]
- Favier C, Schmit D, Müller-Graf CDM, Cazelles B, Degallier N, Mondet B, et al. Influence of spatial heterogeneity on an emerging infectious disease: the case of dengue epidemics. Proc R Soc Biol Sci 2005 Jun 07;272(1568):1171-1177 [FREE Full text] [CrossRef] [Medline]
- Merler S, Ajelli M. The role of population heterogeneity and human mobility in the spread of pandemic influenza. Proc R Soc Biol Sci 2010 Feb 22;277(1681):557-565 [FREE Full text] [CrossRef] [Medline]
- Susswein Z, Valdano E, Brett T, Rohani P, Colizza V, Bansal S. Ignoring spatial heterogeneity in drivers of SARS-CoV-2 transmission in the US will impede sustained elimination. medRxiv Aug 10 [FREE Full text] [CrossRef] [Medline]
- Viboud C, Bjørnstad ON, Smith DL, Simonsen L, Miller MA, Grenfell BT. Synchrony, waves, and spatial hierarchies in the spread of influenza. Science 2006 Apr 21;312(5772):447-451. [CrossRef] [Medline]
- Cuadros DF, Xiao Y, Mukandavire Z, Correa-Agudelo E, Hernández A, Kim H, et al. Spatiotemporal transmission dynamics of the COVID-19 pandemic and its impact on critical healthcare capacity. Health Place 2020 Jul;64:102404 [FREE Full text] [CrossRef] [Medline]
- Truelove SA, Graham M, Moss WJ, Metcalf CJE, Ferrari MJ, Lessler J. Characterizing the impact of spatial clustering of susceptibility for measles elimination. Vaccine 2019 Jan 29;37(5):732-741 [FREE Full text] [CrossRef] [Medline]
- Masters NB, Eisenberg MC, Delamater PL, Kay M, Boulton ML, Zelner J. Fine-scale spatial clustering of measles nonvaccination that increases outbreak potential is obscured by aggregated reporting data. Proc Natl Acad Sci U S A 2020 Nov 10;117(45):28506-28514. [CrossRef] [Medline]
- Salomon JA, Reinhart A, Bilinski A, Chua EJ, La Motte-Kerr W, Rönn MM, et al. The US COVID-19 Trends and Impact Survey: continuous real-time measurement of COVID-19 symptoms, risks, protective behaviors, testing, and vaccination. Proc Natl Acad Sci U S A 2021 Dec 21;118(51):e2111454118 [FREE Full text] [CrossRef] [Medline]
- Callaghan T, Lueck JA, Trujillo KL, Ferdinand AO. Rural and urban differences in COVID-19 prevention behaviors. J Rural Health 2021 Mar;37(2):287-295 [FREE Full text] [CrossRef] [Medline]
- Bürkner P. brms: an R package for Bayesian multilevel models using Stan. J Stat Softw 2017;80(1):1-28. [CrossRef]
- Gabry J, Češnovar R. CmdStanR. URL: https://mc-stan.org/cmdstanr/index.html [accessed 2023-01-06]
- Vehtari A, Simpson D, Gelman A, Yao Y, Gabry J. Pareto smoothed importance sampling. arXivstat Feb.
- Pasek J. Anesrake: ANES Raking Implementation. 2018. URL: https://CRAN.R-project.org/package=anesrake [accessed 2023-01-06]
- Tiu A, Susswein Z, Merritt A, Bansal S. Characterizing the spatiotemporal heterogeneity of the COVID-19 vaccination landscape. Am J Epidemiol 2022 Sep 28;191(10):1792-1802 [FREE Full text] [CrossRef] [Medline]
- Tiu A, Bansal S. COVID-19 Vaccination Tracking. 2021. URL: https://github.com/bansallab/vaccinetracking [accessed 2023-01-06]
- Reinhart A, Tibshirani R. Big data, big problems: responding to "Are we there yet?". arXivstat Sep.
- Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using Lme4. J Stat Softw 2015;67(1):1-48. [CrossRef]
- Lamstein A, Johnson B, Trulia, Inc.. choroplethr: Simplify the Creation of Choropleth Maps in R. 2022. URL: https://CRAN.R-project.org/package=choroplethr [accessed 2023-01-06]
- Ingram DD, Franco SJ. 2013 NCHS Urban-Rural Classification Scheme for counties. Vital Health Stat 2 2014 Apr(166):1-73 [FREE Full text] [Medline]
- Latkin CA, Dayton L, Yi G, Colon B, Kong X. Mask usage, social distancing, racial, and gender correlates of COVID-19 vaccine intentions among adults in the US. PLoS One 2021 Feb 16;16(2):e0246970 [FREE Full text] [CrossRef] [Medline]
- Lam CN, Kaplan C, Saluja S. Relationship between mask wearing, testing, and vaccine willingness among Los Angeles County adults during the peak of the COVID-19 pandemic. Transl Behav Med 2022 Mar 17;12(3):480-485 [FREE Full text] [CrossRef] [Medline]
- Moore JS, Virk H, Summers JA. Correlation of population factors, compliance with masking and social distance, vaccination, and COVID-19 infection in Central Appalachia. South Med J 2022 Jul;115(7):420-421 [FREE Full text] [CrossRef] [Medline]
- Calamari LE, Tjaden AH, Edelstein SL, Weintraub WS, Santos R, Gibbs M, COVID-19 Community Research Partnership Study Group. Self-reported mask use among persons with or without SARS CoV-2 vaccination - United States, December 2020-August 2021. Prev Med Rep 2022 Aug;28:101857 [FREE Full text] [CrossRef] [Medline]
- Floyd CJ, Joachim GE, Boulton ML, Zelner J, Wagner AL. COVID-19 vaccination and mask wearing behaviors in the United States, August 2020 - June 2021. Expert Rev Vaccines 2022 Oct;21(10):1487-1493. [CrossRef] [Medline]
- The New York Times. Coronavirus (Covid-19) Data in the United States. 2021. URL: https://github.com/nytimes/covid-19-data [accessed 2023-01-06]
- Sun LH. CDC Says Fully Vaccinated Americans Can Go without Masks Outdoors, Except in Crowded Settings. 2021. URL: https://www.washingtonpost.com/health/2021/04/27/cdc-guidance-masks-outdoors/ [accessed 2023-01-06]
- Abutaleb Y, McGinley L. CDC Says Fully Vaccinated Americans No Longer Need Masks Indoors or Outdoors in Many Cases. 2021. URL: https://www.washingtonpost.com/health/2021/05/13/cdc-says-fully-vaccinated-americans-no-longer-need-masks-indoors-or-outdoors-most-cases/ [accessed 2023-01-06]
- Rader B, White LF, Burns MR, Chen J, Brilliant J, Cohen J, et al. Mask-wearing and control of SARS-CoV-2 transmission in the USA: a cross-sectional study. Lancet Digital Health 2021 Mar;3(3):e148-e157 [FREE Full text] [CrossRef] [Medline]
- Blake A. How Many Unvaccinated People Will Stop Wearing Masks Now?. 2021. URL: https://www.washingtonpost.com/politics/2021/05/14/how-many-unvaccinated-people-will-quietly-stop-wearing-masks-now/ [accessed 2023-01-06]
- Chetty R, Jackson MO, Kuchler T, Stroebel J, Hendren N, Fluegge RB, et al. Social capital I: measurement and associations with economic mobility. Nature 2022 Aug 01;608(7921):108-121 [FREE Full text] [CrossRef] [Medline]
- Lazer D, Kennedy R, King G, Vespignani A. The parable of Google Flu: traps in big data analysis. Science 2014 Mar 14;343(6176):1203-1205. [CrossRef] [Medline]
- Bansal S, Chowell G, Simonsen L, Vespignani A, Viboud C. Big data for infectious disease surveillance and modeling. J Infect Dis 2016 Dec 01;214(suppl_4):S375-S379 [FREE Full text] [CrossRef] [Medline]
- Bradburn NM, Sudman S, Wansink B. Asking Questions: The Definitive Guide to Questionnaire Design -- For Market Research, Political Polls, and Social and Health Questionnaires, 2nd, Revised Edition. San Francisco, CA: Jossey-Bass; 2004.
- Adaryukov J, Grunevski S, Reed DD, Pleskac TJ. I'm wearing a mask, but are they?: perceptions of self-other differences in COVID-19 health behaviors. PLoS One 2022;17(6):e0269625 [FREE Full text] [CrossRef] [Medline]
- Astley CM, Tuli G, Mc Cord KA, Cohn EL, Rader B, Varrelman TJ, et al. Global monitoring of the impact of the COVID-19 pandemic through online surveys sampled from the Facebook user base. Proc Natl Acad Sci U S A 2021 Dec 21;118(51):e2111455118 [FREE Full text] [CrossRef] [Medline]
- Christakis NA, Fowler JH. Social network sensors for early detection of contagious outbreaks. PLoS One 2010 Sep 15;5(9):e12948 [FREE Full text] [CrossRef] [Medline]
- Garcia-Herranz M, Moro E, Cebrian M, Christakis NA, Fowler JH. Using friends as sensors to detect global-scale contagious outbreaks. PLoS One 2014 Apr 9;9(4):e92413 [FREE Full text] [CrossRef] [Medline]
- Galesic M, Bruine de Bruin W, Dalege J, Feld SL, Kreuter F, Olsson H, et al. Human social sensing is an untapped resource for computational social science. Nature 2021 Jul 30;595(7866):214-222. [CrossRef] [Medline]
- Taube J, Bansal S. Characterizing Spatiotemporal Trends in Self-Reported Mask-Wearing Behavior in the United States. URL: https://github.com/bansallab/spatial_masking [accessed 2023-01-06]
- Delphi Research Group at Carnegie Mellon University. Delphi Epidata API. URL: https://cmu-delphi.github.io/delphi-epidata/symptom-survey/data-access.html [accessed 2023-01-06]
|CDC: Centers for Disease Control and Prevention|
|CTIS: COVID-19 Trends and Impact Survey|
|HHS: Health and Human Services|
|NCHS: National Center for Health Statistics|
Edited by T Sanchez, A Mavragani; submitted 23.08.22; peer-reviewed by S Fox, N Masters; comments to author 03.11.22; revised version received 22.11.22; accepted 16.12.22; published 06.03.23Copyright
©Juliana C Taube, Zachary Susswein, Shweta Bansal. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 06.03.2023.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on https://publichealth.jmir.org, as well as this copyright and license information must be included.