Previous epidemic management research demonstrates the importance of city-level information but also highlights limited expertise in urban data applications during a pandemic outbreak. In this paper, we provide an overview of the city-level information that, in combination with analytical and operational capacity, defines urban intelligence for supporting the response to disease outbreaks. We present five components (movement, facilities, people, information, and engagement) that have been previously investigated but remain too siloed to successfully orchestrate an integrated pandemic response. Reflecting on the coronavirus disease (COVID-19) outbreak that was first identified in Wuhan, China, we discuss the opportunities, technical challenges, and foreseeable controversies for deploying urban intelligence during a pandemic. Finally, we emphasize the urgency of building urban intelligence through cross-disciplinary research and collaborative practice on a global scale.
JMIR Public Health Surveill 2020;6(2):e18873
Cities have become the “locus of risks” due to increasing natural disasters, health pandemics, political protests, and organized crime. Megacities, which the United Nations defines as cities with more than 10 million inhabitants, face increasing risks in environmental and population health, despite their economic prosperity and status as hubs for cultural exchange and technological innovation. The United Nations predicts that there will be 43 megacities by 2030, with the majority of them in developing countries. By 2050, the world population will be nearly 10 billion, with an estimated 68% living in urban areas [ ]. Due to their population density and connectivity, megacities are particularly vulnerable to infectious diseases, as seen in the dengue, Zika, and severe acute respiratory syndrome epidemics. Meanwhile, the rapid development of information and communications technology (ICT), the Internet of Things, cloud computing, and smartphone apps has enabled near real time information sharing. The large volume, velocity, and variety of urban data enable a deeper and more holistic understanding of urban conditions and real time situations.
The unfolding of the current coronavirus disease (COVID-19) pandemic has drawn significant global attention. The outbreak was first identified in Wuhan, China, a megacity with more than 11 million people. The soaring number of confirmed cases and deaths immediately drew serious attention from the medical community to address the pandemic by employing different approaches. Although there is extensive scientific literature on the environmental, social, economic, and health aspects of urban epidemics, most studies focus on long-term planning and public policy research. Studies have revealed that city-level endogenous differences, including geography, population characteristics, spatial structure, regional connectivity, and microclimate, are associated with variations in epidemic dynamics (eg, transmission potential and infection patterns) across cities [ ]. However, few studies have addressed how urban data and data science methodologies can be applied to pandemic response. As an interdisciplinary field, data science provides tools for better and more timely information management and data use. In this article, we highlight the urgency of developing city data expertise and data science practice to better design information collection and integrate predictive analytics to implement real time responses in cities. Reflecting on the ongoing novel coronavirus pandemic, we explore two critical questions: What data are currently available in cities? What are the possible uses of urban data for the epidemic response?
First, we define urban intelligence as a capacity that analyzes city-level information using data science methods and explore its role in a pandemic response. In this context, we present five well-investigated urban research areas that are crucial components of urban intelligence in a disease outbreak. Second, reflecting on the current novel coronavirus outbreak, we summarize the opportunities and challenges in the preparation, containment, and recovery from a pandemic. Third, we discuss arguments and debates around uncertainty, privacy, information security, as well as the trade-off between timeliness and accuracy of data exchange during an outbreak.
Methods, Search Strategy, and Selection Criteria
Data for this viewpoint were identified by searches of PubMed, Social Sciences Citation Index, Science Citation Index, Scopus, and references from relevant articles using the search terms “urban intelligence,” “urban health,” and “pandemic response”. Reports, news articles, or websites were included only when they related directly to previously published work, or they were the only currently available information source at the moment of manuscript preparation. Only articles in English between 1965 and 2020 were included. One Chinese website was cited since it was the only available and most widely adopted media platform during the COVID-19 outbreak in Wuhan in February 2020.
Defining Urban Intelligence
Urban Intelligence Capacity
The concept of a “smart city” is arguably new, but considerable research on urban intelligence was done in the aftermath of World War II. In 1965, Webber proposed “intelligence centers that bring scientific morality into urban affairs” to address the increasing complexity of cities, with interactional consequences among transportation, communication, organization, and social behavior. Sternberg describes intelligence as “complex analytics, modeling, optimization, and visualization in the operational business processes to make better operational decisions” [ ]. For cities in particular, Kitchin [ ] describes urban intelligence as a capacity “to monitor, manage and regulate city flows and processes, often in real-time, and mobile computing, ..., and uses rich seams of data that can be used to better depict, model and predict urban processes and simulate the likely outcomes of future urban development”. Day and Schuler [ ] provide an extended sociotechnical view on civic intelligence with inclusiveness and engagement as “the capacity that organizations and society use to make sense of information and events and craft responses to environmental and other challenges collectively”. In summary, intelligence is a capability to collect urban contextual and situational data as digital representations of reality (input); perceive information from various sources of data (processing); generate knowledge (output); and direct responses, behaviors, or decisions within a specific environment (action).
The accompanying figure shows the core components of urban intelligence. We identified three fundamental capacities enabling urban intelligence: city information resources, data science skills, and executive power to operate. Urban intelligence derives from information, which requires in-depth knowledge of the different sources and types of data in cities, as well as the processes for their collection, management, and exchange. This transdisciplinary field is referred to as urban informatics, which encompasses the generation and application of data and related information technology in the context of cities, and lies at the intersection of people, places, and technology [ ]. A more specific definition describes it as the study of urban phenomena to address domain-specific urban challenges, such as a pandemic response, through a data science framework and computational techniques including sensing, data mining, information integration, modeling, analysis, and visualization [ ]. Experts in this field collect and analyze data by using a wide range of scientific, engineering, and computational methods, such as sensing (in situ, remote, or mobile sensing), imagery processing, natural language processing, statistical modeling, graph-based network analysis, machine learning, and geographic information systems.
The second component of urban intelligence is the analytical capacity provided by data science. One definition describes data science as a discipline of knowledge extraction from data using computer science, statistics, and domain expertise. Data science is distinguished from conventional statistics by its capacity for handling a much larger volume of heterogeneous and unstructured data [ ]. Most experts in big data computing, machine learning, and artificial intelligence are proficient in computer science and statistics, but few also have domain knowledge. Domain expertise is essential for identifying actionable insights (feasibility for deployment and measurable improvement of actual operation), validating meaningful predictions (including their accuracy, sensitivity, and relevance to decision making), and evaluating potential impact (eg, expected and unexpected social, economic, and political consequences). The third and arguably most crucial component of urban intelligence in the context of a pandemic is the emergency operational and executive power for critical event preparedness and response.
Urban Intelligence During a Pandemic
We present five components of urban intelligence (movement, facilities, people, information, and engagement) that are well researched but remain too siloed to successfully orchestrate an integrated pandemic response. The following table summarizes specific data sources, analytical tasks, and actions to take during a pandemic.
| Components | Data sources | Analytical tasks | Actions and operations |
| --- | --- | --- | --- |
| Movement | Air flights, ground transportation, GPS tracing, cellphone pings | Identify mobility hot spots and develop network algorithms to analyze spatial patterns and flows | Transportation control, checkpoints, identify quarantine zones, contact tracing |
| Facilities | Facility catalog, resource inventory, infrastructure performance | Model capacity and optimize medical staffing and resource triaging | Logistical distribution and human resources, capital planning |
| People | Population census, community survey | Quantify local population characteristics and neighborhood health baseline measures | Provide additional services to vulnerable population groups and communities |
| Information | User agreement and protocols for data exchange | Develop an information exchange and coordination pipeline during a pandemic | Integrate and manage data across various resources and agencies |
| Engagement | Digital platforms, news and social media, open data portal | Identify high influencers on social media and less active sectors or regions that require proactive outreach | Broadcast news and crowdsource local information |
Quantification of spatial connectivity and mapping of real time human mobility at the intraurban scale provide actionable insights for a pandemic response. The impact of geography on epidemic dynamics and pandemic transmission hubs at the regional scale has been reported [, ], but there are limited investigations of inter- or intraurban connectivity and human mobility. The study of human movement between cities needs better data sources and quantification methods. Conventional data such as population censuses or community surveys reveal regional connectivity and the spatial structure of human movement. Real time or near real time human mobility data during a pandemic are valuable since regional population movement may unmask abnormal behavior during a critical event. In a study on the COVID-19 outbreak in Wuhan, the research team measured intercity connectivity by using three data sources: global flight bookings from the Official Aviation Guide, the prefecture-level daily passenger volume (by transportation mode) based on location-based services provided by Tencent (one of the largest information technology companies and the operator of WeChat), and the historical estimation of Spring Festival travelers reported by the municipal transportation department [ ]. Such information may guide national pandemic forecasting and regional interventions (eg, rescheduling flights and high-speed railway operations); however, it does not capture intracity human mobility for modeling complex spatial-temporal patterns. Relevant data include human mobility trajectory mapping based on cell phone pings [ ], real time local population estimation using public Wi-Fi probes [ ], intracity mobility pattern detection using GPS loggers or General Transit Feed Specification data provided by bus, taxi, and bike-share operators [ , ], and spatial analysis of human-scale economic and social activities using geotagged social media feeds (eg, Twitter, Instagram, Foursquare, Yelp) [ - ]. During a pandemic, all these data provide more spatially specific indications for containment strategy and inform contact tracing.
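As a minimal sketch of the "identify mobility hot spots" task described above, the following Python example ranks zones by total trip flow in an origin-destination list. The zone names, trip counts, and function names are hypothetical and chosen only for illustration; a real analysis would draw on one of the mobility data sources listed above and would likely use richer network measures.

```python
from collections import defaultdict

# Hypothetical origin-destination trip counts between city zones.
# Zone names and counts are illustrative, not real data.
od_trips = [
    ("A", "B", 120),
    ("A", "C", 30),
    ("B", "C", 200),
    ("C", "A", 80),
    ("D", "C", 150),
]


def zone_flows(trips):
    """Return total inbound and outbound trips per zone."""
    inflow = defaultdict(int)
    outflow = defaultdict(int)
    for origin, dest, count in trips:
        outflow[origin] += count
        inflow[dest] += count
    return dict(inflow), dict(outflow)


def rank_hot_spots(trips, top_n=2):
    """Rank zones by total flow (inbound plus outbound), highest first."""
    inflow, outflow = zone_flows(trips)
    zones = set(inflow) | set(outflow)
    totals = {z: inflow.get(z, 0) + outflow.get(z, 0) for z in zones}
    # Sort by descending flow, breaking ties alphabetically for stable output.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


hot_spots = rank_hot_spots(od_trips)
print(hot_spots)
```

Total flow is the simplest possible hot-spot measure; it is used here only because it needs no external libraries, not because the studies cited above rely on it.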
City agencies collect and manage large data inventories of critical facilities and resources. A study from Johns Hopkins University evaluated hospital surge capacity for maintaining basic operation (eg, sanitation, food, communication, security) and standards of care during four catastrophic scenarios (pandemic influenza, radiation, explosive, and nerve gas attack). In a pandemic influenza scenario particularly, the top five critical facilities and resources are: isolation room or cohorting, respiratory therapist, face masks, antiviral agents for influenza, and dialysis. Besides public health systems, other facilities owned and operated by cities play supportive roles during a pandemic. The New York City Department of City Planning manages the City Planning Facilities Database with more than 35,000 city, state, or federal-owned facilities and program sites, including public schools, daycare service providers, public libraries, parks, sports stadiums, and recreational centers [ ]. The initial purpose of this data inventory was public budget allocation, neighborhood funding evaluation, and capital planning, but data reporting the location and capacity of facilities provide valuable information for disaster planning and response. Private mapping or navigation application program interfaces, including Google Maps, TomTom, and Foursquare, provide near real time information on geolocation and operations (business hours and peak hours). Studies on sensor data applications and computing in the urban environment have identified points of interest (POI) as one of the critical measures for estimating local human activity and related exposure risk for disaster management [ , ]. Besides publicly owned facilities, POI in cities, such as local clinics, drug stores, convenience stores, and grocery shops, become critical nodes and suppliers to ensure uninfected people’s well-being during the outbreak.
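To illustrate how facility location and capacity data could support response planning, the sketch below finds the nearest facility that still has spare capacity. The facility records, field names, and coordinates are hypothetical, and great-circle distance is an assumption that stands in for actual travel time.

```python
import math

# Hypothetical facility inventory; names, coordinates, and counts are illustrative.
facilities = [
    {"name": "Clinic North", "lat": 40.80, "lon": -73.95, "beds": 20, "occupied": 20},
    {"name": "Clinic South", "lat": 40.70, "lon": -74.00, "beds": 50, "occupied": 35},
    {"name": "Field Site East", "lat": 40.75, "lon": -73.85, "beds": 100, "occupied": 10},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_with_capacity(lat, lon, sites):
    """Return the closest site with at least one free bed, or None if all are full."""
    open_sites = [s for s in sites if s["beds"] - s["occupied"] > 0]
    if not open_sites:
        return None
    return min(open_sites, key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))


choice = nearest_with_capacity(40.79, -73.95, facilities)
print(choice["name"])
```

In this example the geographically closest site is full, so the search falls through to the next nearest site with free beds, which is the kind of triage decision that combined location and capacity data make possible.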
Neighborhood demographics, including socioeconomic profiles, provide essential baseline measures for identifying underlying infection risks based on population characteristics. One example is the New York City Community Health Profiles, a census of population health across 59 community districts, reporting more than 50 metrics on neighborhood environmental and population health along with social and behavioral indicators (eg, education, income, smoking, alcohol consumption). The initial purpose of the data collection was to quantify neighborhood health and quality-of-life metrics, but population data by specific age group (eg, infants, children, or older adults), health condition, and socioeconomic status also identify vulnerable communities. Recent research and practice have shown that urban data sources with high spatial resolution can support better operations that target specific population groups such as children, older adults, and the homeless population [ ]. Beyond estimating the vulnerable population at risk for infections, ICT can provide educational and other care services for older adults or “left-behind” children at the community scale [ - ]. Under data governance that protects information security and respects personal privacy, these additional data can inform city agencies, social institutions, and community-based organizations to provide local and targeted services during a pandemic.
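One simple way to turn such baseline measures into a ranking of communities is a composite index. The sketch below min-max scales each indicator and averages them with equal weights. The district names, indicator names, and values are hypothetical, and equal weighting is an assumption made for illustration rather than a method taken from the sources cited above.

```python
# Hypothetical neighborhood indicators (percent of residents); values are illustrative.
districts = {
    "District 1": {"pct_age_65_plus": 10.0, "pct_below_poverty": 30.0},
    "District 2": {"pct_age_65_plus": 20.0, "pct_below_poverty": 25.0},
    "District 3": {"pct_age_65_plus": 15.0, "pct_below_poverty": 10.0},
}


def min_max_scale(values):
    """Rescale a list of numbers to the range 0..1 (all zeros if they are identical)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def vulnerability_scores(data):
    """Average the min-max scaled indicators per district, with equal weights."""
    names = list(data)
    indicators = sorted({key for row in data.values() for key in row})
    scaled = {
        ind: min_max_scale([data[name][ind] for name in names]) for ind in indicators
    }
    return {
        name: sum(scaled[ind][i] for ind in indicators) / len(indicators)
        for i, name in enumerate(names)
    }


scores = vulnerability_scores(districts)
most_vulnerable = max(scores, key=scores.get)
print(scores, most_vulnerable)
```

Any operational index would need domain experts to choose and weight the indicators, which is the role of domain expertise emphasized earlier in this paper.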
During the Obama administration, the Open Government Partnership focused on the role of integrated data and urged city agencies to generate cross-cutting initiatives and data exchange protocols. As a result, interorganizational institutions for better information integration, such as the New York City Center for Innovation through Data Intelligence, were created [ ]. A citywide interagency data exchange protocol is crucial to inform public and private sectors on who collects what data and for what purposes. Better data exchange will reduce response time and information discrepancy when mobilizing resources and coordinating multiagency operations (eg, an outbreak in a public school may require both the health and education departments). A data exchange protocol also optimizes the delegation of duties at different levels of urgency during a pandemic. During the 2009 H1N1 pandemic, NYC 311, a citywide agency managing nonemergency service requests, received and triaged approximately 54,000 phone calls regarding possible influenza [ ]. In a pandemic situation, interagency coordination not only improves information exchange but also appropriately triages the health care service response for a more efficient operation.
Active and productive engagement between city agencies and the general public plays a vital role during a pandemic outbreak. Government-citizen communication and social media analytics can raise people’s awareness, monitor public sentiment, and identify false alarms or fake news. During the 2009 H1N1 pandemic in Mexico City, Telmex, a major telecommunication operator in Mexico, managed more than 5 million phone calls, 140 million text messages, and 18 million email messages containing official communications from the Ministry of Health. Besides the traditional telecommunication services, social media platforms play a critical role in broadcasting news and promoting preventive actions to the general public. Crowdsourced data collection provides a unique value for infectious disease surveillance [ ]. Crowdsourcing provides alternative information when no other data are available, improves the spatial-temporal resolution of disease analysis with geotagged high-frequency data, and increases public health awareness through the participatory process. During the COVID-19 outbreak, Ding Xiang Yuan (DXY), a leading digital health platform in China, provided a stage for broadcasting real time information and public engagement [ ]. The platform had three components: real time mapping of confirmed cases and deaths, dispelling rumors and fake news, and public education on prevention. By combining crowdsourced data on the DXY platform with data from news sources and national health agency websites, researchers were able to gather information that would otherwise be difficult to obtain from aggregate data released by health authorities, such as the delay between symptom onset and detection by the health care system, reporting delays, and travel histories [ ]. Engagement also mitigates the massive societal disruption during the outbreak. Multiple service companies have emerged that offer platforms to support virtual offices and online education. Such platform-based services are critical for reducing unnecessary travel demands and physiological stress during the quarantine.
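The delay between symptom onset and detection mentioned above can be estimated from a patient-level line list. The following sketch assumes a hypothetical record layout with `onset` and `confirmed` dates; the records are invented for illustration and do not reproduce any published dataset. Incomplete or inconsistent records are skipped, which reflects the data quality problems typical of crowdsourced collection.

```python
from datetime import date
from statistics import median

# Hypothetical crowdsourced line list; dates are illustrative, not real cases.
line_list = [
    {"onset": date(2020, 1, 10), "confirmed": date(2020, 1, 15)},
    {"onset": date(2020, 1, 12), "confirmed": date(2020, 1, 20)},
    {"onset": None, "confirmed": date(2020, 1, 21)},  # missing onset date
    {"onset": date(2020, 1, 18), "confirmed": date(2020, 1, 21)},
]


def detection_delays(records):
    """Return onset-to-confirmation delays in days, skipping unusable records."""
    delays = []
    for rec in records:
        onset, confirmed = rec.get("onset"), rec.get("confirmed")
        if onset is None or confirmed is None:
            continue  # incomplete record
        days = (confirmed - onset).days
        if days < 0:
            continue  # confirmation before onset suggests a data entry error
        delays.append(days)
    return delays


delays = detection_delays(line_list)
median_delay = median(delays)
print(delays, median_delay)
```

The median is used here because a few very long delays would distort a mean; this is a design choice for the sketch, not a description of how the cited study computed its estimates.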
Information transparency remains a major issue, but there is a parallel issue of the disconnect between information and data. In recent years, Chinese public information platforms and mobile phone apps have generated massive amounts of data. During the COVID-19 outbreak, the majority of information was released in the format of news, texts, infographics, tables, or map images for public disclosure purposes only, leaving limited data for computation. Chinese public officials regularly shared information on high-speed rail departures and aircraft flights with suspected cases, but only as infographics through social media (WeChat or Weibo). Even when information was released, no publicly available data were provided due to a lack of data standards or guidelines. Although crowdsourcing has become an accepted approach for collecting data on a large scale, the quality and consistency of data collection remain a challenge due to its participatory nature. Social media provides near real time information on newly confirmed cases, but it is necessary to consider the trade-offs between timeliness and accuracy. Machine learning and deep learning methods that train on retrospective data are promising for artificial intelligence-assisted diagnoses and other population health tasks, but they currently have limited real time practical value. As of early February 2020, almost 2 months after the start of the COVID-19 outbreak, the most comprehensive research resource had only 1334 patient-level records.
Even for data-savvy cities such as New York City or Singapore, heterogeneous data sources produce messy and typically biased data owing to a lack of representativeness. These limitations create so-called “signal problems” that skew the understanding of reality. As the former Director of Analytics of New York City described, a fire hose of information is of no use unless it points to a fire [ ]. During a pandemic, information initiatives require tremendous cross-disciplinary knowledge, resources, networks, and the political willingness to link data with the right people, who have domain expertise, for the right problems.
Insights do not guarantee actions. Data scientists are prone to be “paralyzed by analysis,” while ground operations in a real world urban environment must be responsive, proactive, and agile to act with incomplete data and missing information. Moral dilemmas around optimization criteria, the liability associated with uncertainties, and concerns around unexpected public reactions, engender social, technical, and political challenges for transforming insights into actions. Privacy threats, cybersecurity vulnerability, ethical controversies, and unanticipated societal impact further create risks for scaling urban intelligence.
The impact of a pandemic is far beyond public health and medical care. It brings large scale economic risks and social instability, especially for densely populated megacities. Expertise in the generation, collection, analytics, and application of urban data can bring tremendous value to support better response and prevention during a pandemic. Even when a situation gets effectively controlled, urban intelligence can provide continuous risk assessment and support economic recovery. As we continue to face an unprecedented pandemic, we need to identify and implement best practices in urban intelligence to define the critical role of cities in the global public health crisis.
LAC is funded by the National Institutes of Health through NIBIB R01 EB017205.
YL led the drafting of the report. LC coordinated contributions and redrafting. All authors provided content and comments that informed the drafting process.
Conflicts of Interest
- Borraz O, Le Galès P. Urban governance in Europe: the government of what? Pôle Sud 2010;32(1):137-151.
- United Nations. 2018 May 16. 68% of the world population projected to live in urban areas by 2050, says UN URL: https://tinyurl.com/y8opsvrs
- Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. The Lancet 2020 Feb;395(10223):497-506.
- Dalziel BD, Kissler S, Gog JR, Viboud C, Bjørnstad ON, Metcalf CJE, et al. Urbanization and humidity shape the intensity of influenza epidemics in U.S. cities. Science 2018 Oct 05;362(6410):75-79.
- Webber MM. The roles of intelligence systems in urban-systems planning. Journal of the American Institute of Planners 1965 Nov;31(4):289-296.
- Chourabi H, Nam T, Walker S, Gil-Garcia JR, Mellouli M, Nahon K, et al. Understanding smart cities: an integrative framework. 2012 Jan Presented at: 2012 45th Hawaii International Conference on System Sciences; 4-7 January 2012; Maui, HI.
- Kitchin R. The real-time city? Big data and smart urbanism. GeoJournal 2013 Nov 29;79(1):1-14.
- Day P, Schuler D. Community practice in the network society: pathways toward civic intelligence. In: Purcell P, editor. Networked Neighbourhoods. London: Springer; 2006:19-46.
- Foth M, Choi JHJ, Satchell C. Urban informatics. 2011 Mar Presented at: ACM Conference on Computer supported Cooperative Work; 2011; Hangzhou, China.
- Kontokosta CE. Urban informatics in the science and practice of planning. Journal of Planning Education and Research 2018 Aug 27:0739456X1879371.
- Zinoviev D. Data Science Essentials in Python: Collect-Organize-Explore-Predict-Value. Raleigh, NC: Pragmatic Bookshelf; 2016.
- Dhar V. Data science and prediction. Commun ACM 2013 Dec;56(12):64-73.
- Kissler SM, Gog JR, Viboud C, Charu V, Bjørnstad ON, Simonsen L, et al. Geographic transmission hubs of the 2009 influenza pandemic in the United States. Epidemics 2019 Mar;26:86-94.
- Wu JT, Leung K, Leung GM. Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study. Lancet 2020 Feb 29;395(10225):689-697.
- González MC, Hidalgo CA, Barabási A. Understanding individual human mobility patterns. Nature 2008 Jun;453(7196):779-782.
- Kontokosta CE, Johnson N. Urban phenology: toward a real-time census of the city using Wi-Fi data. Computers, Environment and Urban Systems 2017 Jul;64:144-153.
- Jiang B, Yin J, Zhao S. Characterizing the human mobility pattern in a large street network. Phys Rev E 2009 Aug 31;80(2).
- Motta G, Sacco D, Ma T, You L, Liu K. Personal mobility service system in urban areas: the IRMA Project. 2015 Jun 25 Presented at: 2015 IEEE Symposium on Service-Oriented System Engineering; 30 March-3 April 2015; San Francisco Bay, CA.
- Gibbons J, Nara A, Appleyard B. Exploring the imprint of social media networks on neighborhood community through the lens of gentrification. Environment and Planning B: Urban Analytics and City Science 2017 Sep 12;45(3):470-488.
- Shelton T, Poorthuis A, Zook M. Social media and the city: rethinking urban socio-spatial inequality using user-generated geographic information. Landscape and Urban Planning 2015 Oct;142:198-211.
- Glaeser EL, Kim H, Luca M. Nowcasting the local economy: using Yelp data to measure economic activity. National Bureau of Economic Research 2017 Nov.
- Bayram JD, Sauer LM, Catlett C, Levin S, Cole G, Kirsch TD, et al. Critical resources for hospital surge capacity: an expert consensus panel. PLoS Curr 2013 Oct 07;5.
- New York City Department of City Planning. 2020. NYC Facilities Explorer URL: https://capitalplanning.nyc.gov/facilities
- Ang L, Seng KP. Big sensor data applications in urban environments. Big Data Research 2016 Jun;4:1-12.
- Bell WC, Dallas CE. Vulnerability of populations and the urban health care systems to nuclear weapon attack – examples from four American cities. Int J Health Geogr 2007;6(1):5.
- The City of New York. New York City community health profiles URL: https://www1.nyc.gov/site/doh/data/data-publications/profiles.page
- Tatem AJ, Adamo S, Bharti N, Burgert CR, Castro M, Dorelien A, et al. Mapping populations at risk: improving spatial demographic data for infectious disease modeling and metric derivation. Popul Health Metr 2012 May 16;10(1):8.
- Asis MMB. Living with migration. Asian Population Studies 2006 Mar;2(1):45-67.
- Jingzhong Y. Left-behind children: the social price of China's economic boom. Journal of Peasant Studies 2011 Jul;38(3):613-650.
- Goulia P, Mantas C, Dimitroula D, Mantis D, Hyphantis T. General hospital staff worries, perceived sufficiency of information and associated psychological distress during the A/H1N1 influenza pandemic. BMC Infect Dis 2010 Nov 09;10:322.
- Fantuzzo J, Culhane DP. Actionable Intelligence: Using Integrated Data Systems to Achieve a More Effective, Efficient, and Ethical Government. London: Springer; 2015.
- The City of New York. 2020. Center for innovation through data intelligence URL: https://www1.nyc.gov/site/cidi/index.page
- Bell DM, Weisfuse IB, Hernandez-Avila M, Del Rio C, Bustamante X, Rodier G. Pandemic influenza as 21st century urban public health crisis. Emerg Infect Dis 2009 Dec;15(12):1963-1969.
- Chunara R, Smolinski MS, Brownstein JS. Why we need crowdsourced data in infectious disease surveillance. Curr Infect Dis Rep 2013 Aug;15(4):316-319.
- DXY. 2020. COVID-19 global pandemic real-time report URL: https://ncov.dxy.cn/ncovh5/view/pneumonia
- Sun K, Chen J, Viboud C. Early epidemiological analysis of the coronavirus disease 2019 outbreak based on crowdsourced data: a population-level observational study. Lancet Digit Health 2020;2(4):e201-e208.
- Guan W, Ni Z, Hu Y, Liang W, Ou C, He J, China Medical Treatment Expert Group for Covid-19. Clinical characteristics of coronavirus disease 2019 in China. N Engl J Med 2020 Feb 28.
- Crawford K. The hidden biases in big data. Harvard Business Review 2013.
- Flowers M. Beyond open data: the data-driven city. In: Beyond Transparency: Open Data and the Future of Civic Innovation. San Francisco, CA: Code for America Press; 2013:185-198.
|COVID-19: coronavirus disease|
|DXY: Ding Xiang Yuan|
|ICT: information and communications technology|
|POI: point of interest|
Edited by T Sanchez; submitted 24.03.20; peer-reviewed by A Benis; accepted 05.04.20; published 14.04.20
Copyright
©Yuan Lai, Wesley Yeung, Leo Anthony Celi. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 14.04.2020.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.