Exploring the Utility of Google Mobility Data During the COVID-19 Pandemic in India: Digital Epidemiological Analysis

Background: Association between human mobility and disease transmission has been established for COVID-19, but quantifying the levels of mobility over large geographical areas is difficult. Google has released Community Mobility Reports (CMRs) containing data about the movement of people, collated from mobile devices.
Objective: The aim of this study is to explore the use of CMRs to assess the role of mobility in spreading COVID-19 infection in India.
Methods: In this ecological study, we analyzed CMRs to determine human mobility between March and October 2020. The data were compared for the phases before the lockdown (between March 14 and 25, 2020), during lockdown (March 25-June 7, 2020), and after the lockdown (June 8-October 15, 2020) with the reference periods (ie, January 3-February 6, 2020). Another data set depicting the burden of COVID-19 as per various disease severity indicators was derived from a crowdsourced API. The relationship between the two data sets was investigated using the Kendall tau correlation to depict the correlation between mobility and disease severity.
Results: At the national level, mobility decreased from –38% to –77% for all areas but residential (which showed an increase of 24.6%) during the lockdown compared to the reference period. At the beginning of the unlock phase, the state of Sikkim (minimum cases: 7) with a –60% reduction in mobility depicted more mobility compared to –82% in Maharashtra (maximum cases: 1.59 million). Residential mobility was negatively correlated (–0.05 to –0.91) with all other measures of mobility. The magnitude of the correlations for intramobility indicators was comparatively low for the lockdown phase (correlation ≥0.5 for 12 indicators) compared to the other phases (correlation ≥0.5 for 45 and 18 indicators in the prelockdown and unlock phases, respectively). A high correlation coefficient between epidemiological and mobility indicators was observed for the lockdown and unlock phases compared to the prelockdown phase.
Conclusions: Mobile-based open-source mobility data can be used to assess the effectiveness of social distancing in mitigating disease spread. CMR data depicted an association between mobility and disease severity, and we suggest using this technique to supplement future COVID-19 surveillance.
(JMIR Public Health Surveill 2021;7(8):e29957) doi: 10.2196/29957


Introduction
Infectious diseases have caused profound disruptions throughout the history of humanity. Despite a decrease in the number of deaths attributed to contagious diseases, there has been a constant rise in the number of outbreaks over the past few years due to emerging and re-emerging infectious agents [1]. Influenza, dengue fever, and HIV/AIDS are the three leading contagious diseases that have infected millions of people globally [2]. In addition to these, approximately 215 different infectious agents have caused 12,102 outbreaks in 219 countries over the last 30 years [3]. In general, there have been significant advances in the treatment and curing of infectious diseases. However, infectious diseases pose a considerable challenge to the health system due to their frequency, infectivity, and mobility in today's extensively interconnected world. Therefore, early detection and prevention of infectious diseases continues to be a top priority among the global health community.
The current COVID-19 pandemic has disrupted and overwhelmed health systems worldwide. COVID-19 is an infectious disease that is caused by a newly discovered coronavirus, and the main route of transmission is thought to be through respiratory droplets [4]. The index case of COVID-19 was traced to December 1, 2019, in Wuhan, China [5]. The aggressive nature of the spread of COVID-19 led to its declaration as a "public health emergency of international concern" and then a pandemic by the World Health Organization (WHO) on January 30 and March 11, 2020, respectively. As per the WHO, 216 countries had reported more than 121 million cases and 2.6 million deaths due to COVID-19 as of March 17, 2021 [6]. There were 11.4 million confirmed cases in India alone, with 0.16 million deaths, and it is among the most severely affected countries to date [7]. Due to a lack of effective treatment strategy, nonpharmaceutical interventions (NPIs), such as restricted mobility, home quarantine, and lockdown measures, were enforced worldwide to halt interhuman transmission of the virus [8]. As India is the second most populous country in the world, with suboptimal investment, NPIs were seen as the most crucial part of pandemic mitigation. Hence, the Government of India also implemented a countrywide lockdown to halt disease progression on March 24, 2020 [9].
Research has demonstrated the association between mobility and disease transmission for various infectious diseases, such as cholera, dengue, influenza, Ebola, malaria, measles, and COVID-19 [10][11][12][13][14][15][16][17]. NPIs are intended to slow the rapid disease transmission and contain the disease burden until effective pharmacological management options become accessible [18,19]. Implementing NPIs in response to infectious disease outbreaks is not a new method to limit mobility; they have been used for centuries [20,21]. More recently, such measures were implemented during the containment of the severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) epidemics, which occurred in the last decades [22,23].
Although the connection between mobility and disease has been known for centuries, establishing this causal association is challenging, as measuring and quantifying the levels of mobility at the population level is difficult. This can be attributed to the challenges in obtaining access to mobility and disease data. However, numerous mathematical models have demonstrated such associations between mobility and infectious disease transmission dynamics [24][25][26]. Moreover, during the current pandemic, the digital ecosystem has supplemented traditional surveillance to provide data about disease severity and mobility in real time.
Given the highly infectious nature of COVID-19, the importance of digital epidemiology could be felt in disease containment [24][25][26][27]. Digital epidemiology is a branch of epidemiology that uses data generated outside the public health system [28]. Google Flu and Google Trends have been successfully used to study various communicable and noncommunicable diseases [29,30]. On similar lines, Google released Community Mobility Report (CMR) data collated from people who accessed its applications using mobile and handheld devices. The restriction in mobility by the Indian government and the data availability provides researchers with an opportunity to empirically study the relationship between social activity, mobility, and COVID-19 incidence. However, there is a shortage of scientific literature that documents the use of these data for surveillance purposes. Very few researchers have tried to correlate mobility trends with the aggressiveness of the disease [31][32][33][34]. Sulyok and Walker [31] depicted negative correlations between CMR data and case incidence for major industrialized countries of Western Europe and North America. Wang and Yamamoto [32] also depicted that a model using CMR data can describe the combined effects of mobility at the local level and human activities on the transmission of COVID-19. Cot et al [33] analyzed Google and Apple mobility data. They concluded that a substantial decrease in the infection rate occurred 2-5 weeks after the onset of mobility reduction [33]. None of these studies explored the association of mobility with any other epidemiological indicators except disease incidence; meanwhile, it has been established that disease incidence alone is not an ideal measure for making comparisons [35]. Therefore, in this study, we attempt to understand and explore the role of mobility in spreading COVID-19 infection in India using mobility data from Google. 
During the pandemic, the central government has issued various health advisories; however, because health is a state responsibility, the final implementation of those instructions depends on the state itself. Therefore, we hypothesized that the states with strict enforcement of lockdown would witness fewer cases and vice versa. Hence, we have also examined the states with the maximum and minimum numbers of cases for changes in mobility as per CMR data.

Study Design
In this ecological study, we analyzed secondary data available in the public domain between March 14 and October 16, 2020.

Study Period
Many interventions were implemented in India at the national and subnational levels during the lockdown period and were subsequently eased out in a phased manner. To begin, India issued travel advisories and restricted international travel between January and March 2020. By early March, when case numbers started to increase, states scaled up movement restrictions. On March 25, India entered a nationwide lockdown to ramp up preparedness [36]. The mobility data were assessed for three significant periods, based on the implementation of social mobility restrictions by the Indian government to mitigate the pandemic [37]. Robust data for COVID-19 disease burden were available in the public domain from March 14, 2020. Hence, the three phases were labeled as prelockdown (March 14-24, 2020), lockdown (March 25-June 7, 2020), and unlock (June 8-October 15, 2020).
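The three study phases can be expressed as inclusive date ranges. The following is a minimal sketch, not the authors' code: the names `PHASES` and `phase_of` are hypothetical, and only the dates come from the text above.

```python
from datetime import date

# Phase boundaries as stated in the text (inclusive on both ends).
# The names PHASES and phase_of are illustrative, not from the study.
PHASES = [
    ("prelockdown", date(2020, 3, 14), date(2020, 3, 24)),
    ("lockdown", date(2020, 3, 25), date(2020, 6, 7)),
    ("unlock", date(2020, 6, 8), date(2020, 10, 15)),
]


def phase_of(day):
    """Return the study phase for a date, or None if it is outside the study period."""
    for name, start, end in PHASES:
        if start <= day <= end:
            return name
    return None


# Length of the lockdown phase in days, counting both end dates.
lockdown_days = (PHASES[1][2] - PHASES[1][1]).days + 1
```

Labeling each daily record with `phase_of` would allow the mobility and disease indicators to be summarized separately for each phase.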

COVID-19 Data
The data sets for COVID-19 cases in India were crowdsourced and made freely available through an API by a volunteer group. The API maintains the records of confirmed, active, recovered, and deceased people for all the Indian states and union territories. The data in the API are gathered daily using state bulletins and official handles. After the data are validated, they are made available daily through Google Sheets [7].

Mobility Data
Google collects and stores individuals' commuting information through a GPS linked to Google Maps. These data are made available on the web in the public domain, after aggregating and anonymizing personally identifiable information, as "COVID-19 Community Mobility Reports" (Multimedia Appendix 1) [38]. A CMR compares the changes in activity and mobility during and after lockdown compared to before lockdown. At the start of the study, the mobility data for 135 countries were available from Google. The mobility data for India have been made available at the state and union territory levels since February 15, 2020. Multimedia Appendix 1 contains further details about this website. The CMR provides the percentage changes in activity for 6 key categories (groceries and pharmacies, parks, transit, retail and recreation, residential, and workplaces) compared to the baseline days before the advent of COVID-19 (5 weeks, from January 3 to February 6, 2020) [39]. Daily activity changes are compared to the corresponding baseline figure day. For example, data on Monday are compared to corresponding data from the baseline series for a Monday. Baseline day figures are calculated for each day of the week for each country and are calculated as the median value [38]. The values represent the relative changes in percentage compared to the baseline days, not the absolute number of visitors. For instance, a value of -50 in the workplaces data set on a Monday indicates a 50% drop compared to the Monday in the reference period. Similarly, a positive value indicates an increase in mobility compared to the reference period.
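The baseline comparison described above can be illustrated with a short sketch. It assumes access to raw daily visit counts, which Google does not publish (the CMR releases only the resulting percentage changes), so the `visits` data and the function names here are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median

# Baseline window stated in the text: 5 weeks, January 3 to February 6, 2020.
BASELINE_START = date(2020, 1, 3)
BASELINE_END = date(2020, 2, 6)


def weekday_baseline(visits):
    """Median visit count per weekday (0 = Monday) over the baseline window."""
    by_weekday = defaultdict(list)
    for day, count in visits.items():
        if BASELINE_START <= day <= BASELINE_END:
            by_weekday[day.weekday()].append(count)
    return {wd: median(counts) for wd, counts in by_weekday.items()}


def percent_change(visits, baseline):
    """Percentage change of each day relative to the baseline for the same weekday."""
    return {
        day: 100.0 * (count - baseline[day.weekday()]) / baseline[day.weekday()]
        for day, count in visits.items()
        if day.weekday() in baseline
    }


# Hypothetical counts: 200 visits on every baseline day, 100 on one Monday in lockdown.
visits = {BASELINE_START + timedelta(days=i): 200 for i in range(35)}
visits[date(2020, 3, 30)] = 100  # March 30, 2020, was a Monday
changes = percent_change(visits, weekday_baseline(visits))
```

In this toy example, the Monday in lockdown is compared only with the median of the baseline Mondays, mirroring the -50 workplace example given in the text.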

Primary Outcome Variables and Covariates
The daily frequencies of infected cases, deaths, and recovered cases were the primary variables of this study. The disease burden data for India by individual states and union territories were depicted in cases per million (CPM), case fatality rate (CFR), and doubling rate (DR), which were calculated using the standard formulae [40][41][42]. We used census population data from the different states of India as a reference [43]. The mobility indicators pointing toward disease spread were the covariates of interest. A CMR provides data for 6 mobility indicators, used as covariates, which give information on people's movement. It was important to assess the variability in people's mobility during the unlocking phase in response to the caseload of each state during the lockdown. For parsimony, we report the frequency of cases using the median and range values for the states with the maximum and minimum numbers of cases.
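The text refers to standard formulae without stating them. The sketch below shows commonly used definitions as an assumption: CPM as cumulative cases per million population, CFR as deaths per 100 confirmed cases, and DR interpreted as the doubling time in days under exponential growth. The function names are hypothetical, and the study's exact formulae may differ.

```python
import math


def cases_per_million(cumulative_cases, population):
    """Cumulative cases per million population."""
    return cumulative_cases * 1_000_000 / population


def case_fatality_rate(cumulative_deaths, cumulative_cases):
    """Deaths as a percentage of confirmed cases; None when there are no cases."""
    if cumulative_cases == 0:
        return None
    return 100.0 * cumulative_deaths / cumulative_cases


def doubling_time(cases_start, cases_end, days_elapsed):
    """Days needed for cumulative cases to double, assuming exponential growth.

    Returns None when growth cannot be estimated (no cases or no increase).
    """
    if cases_start <= 0 or cases_end <= cases_start:
        return None
    return days_elapsed * math.log(2) / math.log(cases_end / cases_start)
```

Under these assumed definitions, a hypothetical rise from 100 to 400 cumulative cases over 10 days corresponds to two doublings, so `doubling_time(100, 400, 10)` is about 5 days.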

Data Analysis
We downloaded the mobility and COVID-19 data in the .csv format on October 16, 2020, and we replaced the state codes for the India COVID-19 data with state names using metadata. The mobility data at the national and state levels were filtered and stored. We then merged the mobility and COVID-19 data for India and the respective states and union territories using the date variable and created a new spreadsheet. Finally, we arranged the data in separate spreadsheets for the national and state levels for further analysis. The relationship between mobility and COVID-19 spread in each study phase was then investigated using the Kendall tau correlation. This rank-based measure is proportional to the number of concordant pairs minus the number of discordant pairs. The value of tau ranges from +1 for identically ranked pairs to -1 for oppositely ranked pairs. Because this is an initial empirical investigation of the relationship between mobility and epidemiological indicators, the emphasis is on the magnitude of the correlation rather than the P value. Further, we calculated and reported the 95% CI with all the point estimates to provide readers with an idea of the estimate range.
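The merge-and-correlate step can be sketched as follows. The text does not state which variant of Kendall tau or which software was used; the tau-b variant (which adjusts for ties) is an assumption here, and the function names and sample values are hypothetical.

```python
import math
from itertools import combinations


def merge_on_date(mobility, cases):
    """Inner join of two {ISO date string: value} mappings, returned as paired lists."""
    shared = sorted(set(mobility) & set(cases))  # ISO dates sort chronologically
    return [mobility[d] for d in shared], [cases[d] for d in shared]


def kendall_tau_b(x, y):
    """Kendall tau-b: (concordant - discordant) pairs, scaled to account for ties."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from both denominator terms
        elif dx == 0:
            ties_x += 1  # tied on x only
        elif dy == 0:
            ties_y += 1  # tied on y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = math.sqrt(
        (concordant + discordant + ties_x) * (concordant + discordant + ties_y)
    )
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical daily values: percentage change in mobility and cumulative cases.
mobility = {"2020-03-25": -70.0, "2020-03-26": -75.0, "2020-03-27": -77.0, "2020-03-28": -72.0}
cases = {"2020-03-26": 700, "2020-03-27": 890, "2020-03-28": 990, "2020-03-29": 1100}
x, y = merge_on_date(mobility, cases)
tau = kendall_tau_b(x, y)
```

Only dates present in both data sets enter the correlation, which mirrors merging the two spreadsheets on the date variable. For larger data sets, an established statistics library would be preferable to this quadratic-time loop.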

Ethical Clearance
Ethical clearance for the study was obtained from the Institutional Review Board of the Postgraduate Institute of Medical Education and Research, Chandigarh, India, vide letter INT/IEC/2020/SPL-1594.

Disease Burden During Different Phases of Lockdown
The line graphs display the mobility trend and rise in the number of cases during these phases (Figure 1). At the end of phase 1, that is, just before the national lockdown, the numbers of cumulative cases, cumulative deaths, and cumulative recoveries throughout India were recorded to be 567, 40, and 10, respectively. The lockdown was enforced for 75 days, until June 8, 2020, but the surge in the cumulative caseloads continued (Table 1). This was followed by sequential unlocking, after which a further surge was witnessed. As of October 15, 2020 (second unlock phase), the reported numbers of cumulative cases, cumulative deaths, and cumulative recoveries in India surpassed 7.5 million, 0.1 million, and 6.6 million, respectively, with marked interstate variations.
Table 3 depicts the changes in the mobility patterns in all 6 categories reported in CMRs for India and for the states of Maharashtra (most cases) and Sikkim (fewest cases). At the national level, mobility in 5 of the 6 categories was reduced during the lockdown period compared to the reference period, with the exception being residential areas. During the lockdown, maximum restrictions were seen at retail and recreation areas, followed by transit, parks, and workplaces. The leading drop of -77.2% (95% CI -78.7% to -75.8%) at the national level occurred for the retail and recreation category during the lockdown. In contrast, residential mobility increased by 24.6% (95% CI 23.4% to 25.8%) during the lockdown. During unlock, the areas with the lowest to highest mobility were residential, groceries and pharmacies, workplaces, transit, parks, and retail and recreation. With the maximum number of cases, Maharashtra State displayed the highest restriction in movement, with a drop of -82.4% (95% CI -83.3% to -81.5%) in the lockdown phase for retail and recreation. Sikkim depicted higher mobility compared to Maharashtra for all 6 categories of places reported in CMRs.
The state of Sikkim displayed a drop of -65.4% (95% CI -67% to -63.9%) during lockdown for retail and recreation. The spiral bar charts in Figure 2 display the changes in mobility for the states of Sikkim and Maharashtra as well as at the national level across the different phases of the lockdown. Table 4 exhibits the intramobility correlation. In general, residential mobility was negatively correlated with all other measures of mobility. The magnitude of correlations for the intramobility indicators was comparatively low for the lockdown phase compared to the prelockdown and unlock stages.

Correlation Between Mobility and Epidemiological Indicators
A general trend of a high correlation coefficient between epidemiological and mobility indicators was observed for the lockdown and unlock phases compared to the prelockdown phase. With few exceptions, the correlation coefficients between epidemiological and mobility indicators for India and Maharashtra are similar. The highest correlation for India, Maharashtra, and Sikkim was observed in the unlock stage for retail and recreation and all epidemiological indicators. It was interesting to see a substantial increase in correlation between park visits and epidemiological indicators from the lockdown phase to the unlock phase. Only 7 cases were reported in Sikkim before unlock; therefore, intercorrelations for CFR and recovery are not available during the pre-unlock phases. Table 5 gives details of the correlation coefficients between mobility and epidemiological indicators. Initial exploration indicated that there are substantially high correlations between various epidemiological and Google mobility indicators. Figure 1 displays the cumulative rise in the frequency of cases with the mobility indicators. There was a rapid surge in the number of cases in the unlock phase after flat linear growth up to the lockdown stage.

Discussion
We used the CMRs provided by Google to assess the national and subnational patterns of mobility before, during, and after the COVID-19 pandemic lockdown enforced by the government of India and their correlations with disease severity. Our study has several key findings. First, there were marked interstate variations in the disease burden during the three phases of our study period. By the end of the lockdown phase, although the CPM and DR continued to increase, disease severity, as depicted by the CFR, started to decrease. The CMR data depicted that mobility decreased during the lockdown and then increased again during the unlock phase. We observed solid intramobility patterns among the 6 mobility indicators. Residential mobility was seen to be inversely associated with mobility in public places. A significant correlation was seen between mobility and epidemiological indicators.
Inter- and intramobility networks play significant roles in disease transmission dynamics in the modern era [44]. We observed wide subnational variations in the disease burden, as depicted by various epidemiological indicators used in the study. The state of Maharashtra was among the most severely affected states in the country. This can be attributed to its large population size, as it is the second most populous state in India after Uttar Pradesh. Moreover, COVID-19 was a relatively urban phenomenon during the study period, and Maharashtra is one of India's most urbanized states (more than 50% urbanized). More than half of the COVID-19 cases in Maharashtra were reported from four major cities: Mumbai, Thane, Pune, and Nagpur. In contrast, the proportion of urbanization in other populous states such as Uttar Pradesh is 22%. Also, Maharashtra attracts more people from other states for education and jobs; hence, it has a very high population density. This demographic profile has a significant impact on the COVID-19 transmission dynamics. However, if we consider total CPM, many other states, such as Goa, Delhi, and Andhra Pradesh, reported more cases than Maharashtra by the end of lockdown. Also, any state with an efficient and sensitive surveillance system in place will detect more cases during an epidemic. Maharashtra has always been among the top performers in India in this regard [45]. On the other hand, the state of Sikkim reported the fewest cases in the country. Sikkim is a remote state in a hilly area, with a small population size and lower population density, fewer migrations, more rural areas, and less interstate trade and transit; this may explain its lower number of cases during the study period.
In our study, residential mobility increased during the lockdown. It also correlated negatively with other measures of mobility. These findings are consistent with studies conducted by Saha et al [34,46], which found that people stayed at home during the lockdown. The mobility trends display that mobility started to decrease even before the government implemented the lockdown measure. Although legal enforcement was the prime reason for reducing mobility, people also restricted their movements voluntarily and avoided crowded places due to apprehensions regarding the disease [47][48][49]. Mobility in other places was reduced during the lockdown and then gradually rescaled during the unlock phase. This pattern is coherent with those in other studies, which found that people rescheduled or canceled travel and transport plans in the wake of public health emergencies [50,51].
We used the Kendall tau correlation to quantify the relationship between mobility and epidemiological indicators, as this correlation is more robust and consistent for nonnormal data. The mobility indicators depicted strong correlations with epidemiological characteristics for both the lockdown and unlock phases. This trend is consistent with many theoretical studies that have predicted the role of mobility for infectious diseases [10,52,53]. Previous studies have demonstrated the utility of mobility in the spread of COVID-19 data globally [31,54]. Our analysis indicates that mobility is a potential metric to monitor and predict disease outbreaks. However, the statistical analysis is exploratory and univariate and requires rigorous statistical evaluation before mobility can be adopted as an indicator of surveillance.
Moreover, our models depicted wide interstate variations in mobility patterns. We discussed variations only in the states of Maharashtra and Sikkim, as they reported the maximum and minimum numbers of cases, respectively, at the end of lockdown to understand the relationship between mobility and disease dynamics. Addressing diseases such as COVID-19 from a mathematical perspective can reveal the internal pattern and potential structure of pandemic control, and it can help provide insights into the transmission dynamics of such diseases and the potential role of different public health intervention strategies [55]. Lockdown interventions to prevent the spread of infection lead to different patterns of mobility. However, lockdown measures only serve their purpose when they are strictly enforced. Our data suggest that the growth trajectory for the rise in cases was linear during the lockdown, compared to the steep trajectory post lockdown. Many authors have previously discussed the impact of lockdown in controlling the spread of COVID-19 in India [56,57]. However, lockdown measures to save lives were recommended and championed by the WHO and other leading agencies. There are numerous discussions and debates in the literature regarding the appropriateness of total lockdown measures [58][59][60]. A group of medical researchers published the "Great Barrington Declaration," in which they emphasized the concept of "Focused Protection" as an alternative to lockdowns [61]. Simultaneously, other researchers disagreed and called for strict measures until a vaccine became available; they published the "John Snow Memorandum" [62]. However, it may take a long time to assess the overall strengths and shortcomings of the lockdown.
This study is the first pan-Indian empirical study quantifying the role of mobility in disease transmission. However, there are some obvious limitations to our study. The major limitation is the dynamic nature of COVID-19 and the mobility patterns. Therefore, it is challenging to obtain robust estimates unless disease transmission stabilizes. Moreover, an ecological study uses normative mobility data; this study may thus be impacted by ecological fallacy. The disease infection rate varies per gender, accessibility to health care, and literacy level; however, the data for the current study limit its generalization to these subgroups.
Similarly, no attempt can be made to examine the psychological and sociological issues affecting mobility. The data used by Google to generate the mobility estimates may have questionable concordance with the actual mobility rates. Mobile phones may not reflect the actual mobility in the community, especially in rural areas, where GPS-enabled smartphones are not used by many people. Similarly, apprehensions about data misuse may prevent many smartphone users from using maps, undermining the actual estimates. Less frequent GPS usage may be a reason why we could not find intermobility patterns for Sikkim in our analysis. Finally, per CMR, baseline dates do not account for the seasonality of movements. The lack of accounting of seasonality may also affect the accuracy and precision of estimates. Moreover, Google's CMR data do not directly equate to some specific COVID-19 control measures. We could not assess the reasons underlying the patterns observed in mobility.
To conclude, we can use mobile-based open-source mobility data to assess the effectiveness of social distancing. CMR data depicted an association between community mobility and disease severity indicators. We suggest that data related to community mobility can be of utility in future COVID-19 modeling studies. With the declaration of COVID-19 as a pandemic, mobility levels declined, which can be primarily attributed to legal enforcement or increased fear of disease leading to personal behavioral changes. Google's CMR depicts the effect of these measures on community movement. CMR can provide an effective tool for the authorities to evaluate the timing and impact of social distancing efforts, mainly related to movement restrictions. We recommend using these data whenever applicable to supplement the existing surveillance methods in any country. This approach does not involve any additional cost and can provide quick action points about the adherence to social distancing measures. This method can also be used to forecast mass movements during nonpandemic conditions, such as the famous gatherings during Kumbh Mela in India; such forecasts are needed to make informed decisions and can help us assess preparedness accordingly. With the increase in mobile internet usage, the accuracy of this real-time data method is expected to increase. Future studies should focus on establishing the cultural, social, and economic issues that are responsible for some of the differences in adherence to social distancing measures.