%0 Journal Article %@ 2369-2960 %I JMIR Publications %V 11 %N %P e56877 %T Assessing COVID-19 Mortality in Serbia’s Capital: Model-Based Analysis of Excess Deaths %A Cvijanovic,Dane %A Grubor,Nikola %A Rajovic,Nina %A Vucevic,Mira %A Miltenovic,Svetlana %A Laban,Marija %A Mostic,Tatjana %A Tasic,Radica %A Matejic,Bojana %A Milic,Natasa %K COVID-19 %K COVID-19 impact %K SARS-Cov-2 %K coronavirus %K respiratory %K infectious disease %K pulmonary %K pandemic %K excess mortality %K death rate %K death toll %K centralized health care %K urban %K Serbia %K dense population %K public health %K surveillance %D 2025 %7 17.4.2025 %9 %J JMIR Public Health Surveill %G English %X Background: Concerns have been raised about discrepancies in COVID-19 mortality data, particularly between preliminary and final datasets of vital statistics in Serbia. In the original preliminary dataset, released daily during the ongoing pandemic, there was an underestimation of deaths in contrast to those reported in the subsequently released yearly dataset of vital statistics. Objective: This study aimed to assess the accuracy of the final mortality dataset and justify its use in further analyses. In addition, we quantified the relative impact of COVID-19 on the death rate in the Serbian capital’s population. In the process, we aimed to explore whether any evidence of cause-of-death misattribution existed in the final published datasets. Methods: Data were sourced from the electronic databases of the Statistical Office of the Republic of Serbia. The dataset included yearly recorded deaths and the causes of death of all citizens currently living in the territory of Belgrade, the capital of the Republic of Serbia, from 2015 to 2021. Standardization and modeling techniques were utilized to quantify the direct impact of COVID-19 and to estimate excess deaths. 
To account for year-to-year trends, we used a mixed-effects hierarchical Poisson generalized linear regression model to predict mortality for 2020 and 2021. The model was fitted to the mortality data observed from 2015 to 2019 and used to generate mortality predictions for 2020 and 2021. Actual death rates were then compared to the obtained predictions and used to generate excess mortality estimates. Results: The total number of excess deaths, calculated from model estimates, was 3175 deaths (99% CI 1715-4094) for 2020 and 8321 deaths (99% CI 6975-9197) for 2021. The ratio of estimated excess deaths to reported COVID-19 deaths was 1.07. The estimated increase in mortality during 2020 and 2021 was 12.93% (99% CI 15.74%-17.33%) and 39.32% (99% CI 35.91%-39.32%) from the expected values, respectively. Those aged 0‐19 years experienced an average decrease in mortality of 22.43% and 23.71% during 2020 and 2021, respectively. For those aged up to 39 years, there was a slight increase in mortality (4.72%) during 2020. However, in 2021, even those aged 20‐39 years had an estimated increase in mortality of 32.95%. For people aged 60‐79 years, there was an estimated increase in mortality of 16.95% and 38.50% in 2020 and 2021, respectively. For those aged >80 years, the increase was estimated at 11.50% and 34.14% in 2020 and 2021, respectively. The model-predicted deaths matched the non-COVID-19 deaths recorded in the territory of Belgrade. This concordance between the predicted and recorded non-COVID-19 deaths provides evidence that the cause-of-death misattribution did not occur in the territory of Belgrade. Conclusions: The finalized mortality dataset for Belgrade can be safely used in COVID-19 impact analysis. Belgrade experienced a significant increase in mortality during 2020 and 2021, with most of the excess mortality attributable to SARS-CoV-2. 
Concerns about increased mortality from causes other than COVID-19 in Belgrade seem misplaced as their impact appears negligible. %R 10.2196/56877 %U https://publichealth.jmir.org/2025/1/e56877 %U https://doi.org/10.2196/56877 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 11 %N %P e69048 %T Balancing Human Mobility and Health Care Coverage in Sentinel Surveillance of Brazilian Indigenous Areas: Mathematical Optimization Approach %A Oliveira,Juliane Fonseca %A Vasconcelos,Adriano O %A Alencar,Andrêza L %A Cunha,Maria Célia S L %A Marcilio,Izabel %A Barral-Netto,Manoel %A P Ramos,Pablo Ivan %K representative sentinel surveillance %K early pathogen detection %K indigenous health %K human mobility %K surveillance network optimization %K infectious disease surveillance %K public health strategy %K Brazil %D 2025 %7 1.4.2025 %9 %J JMIR Public Health Surveill %G English %X Background: Optimizing sentinel surveillance site allocation for early pathogen detection remains a challenge, particularly in ensuring coverage of vulnerable and underserved populations. Objective: This study evaluates the current respiratory pathogen surveillance network in Brazil and proposes an optimized sentinel site distribution that balances Indigenous population coverage and national human mobility patterns. Methods: We compiled Indigenous Special Health District (Portuguese: Distrito Sanitário Especial Indígena [DSEI]) locations from the Brazilian Ministry of Health and estimated national mobility routes by using the Ford-Fulkerson algorithm, incorporating air, road, and water transportation data. To optimize sentinel site selection, we implemented a linear optimization algorithm that maximizes (1) Indigenous region representation and (2) human mobility coverage. 
We validated our approach by comparing results with Brazil’s current influenza sentinel network and analyzing the health attraction index from the Brazilian Institute of Geography and Statistics to assess the feasibility and potential benefits of our optimized surveillance network. Results: The current Brazilian network includes 199 municipalities, representing 3.6% (199/5570) of the country’s cities. The optimized sentinel site design, while keeping the same number of municipalities, ensures 100% coverage of all 34 DSEI regions while rearranging 108 (54.3%) of the 199 cities from the existing flu sentinel system. This would result in a more representative sentinel network, addressing gaps in 9 of 34 previously uncovered DSEI regions, which span 750,515 km² and have a population of 1.11 million. Mobility coverage would improve by 16.8 percentage points, from 52.4% (4,598,416 paths out of 8,780,046 total paths) to 69.2% (6,078,747 paths out of 8,780,046 total paths). Additionally, all newly selected cities serve as hubs for medium- or high-complexity health care, ensuring feasibility for pathogen surveillance. Conclusions: The proposed framework optimizes sentinel site allocation to enhance disease surveillance and early detection. By maximizing DSEI coverage and integrating human mobility patterns, this approach provides a more effective and equitable surveillance network, which would particularly benefit underserved Indigenous regions. 
%R 10.2196/69048 %U https://publichealth.jmir.org/2025/1/e69048 %U https://doi.org/10.2196/69048 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 11 %N %P e62828 %T Alternative Presentations of Overall and Statistical Uncertainty for Adults’ Understanding of the Results of a Randomized Trial of a Public Health Intervention: Parallel Web-Based Randomized Trials %A Holst,Christine %A Woloshin,Steven %A Oxman,Andrew D %A Rose,Christopher %A Rosenbaum,Sarah %A Munthe-Kaas,Heather Menzies %+ Centre for Epidemic Interventions Research, Norwegian Institute of Public Health, Myrens Verksted 3L, Oslo, 0213, Norway, 47 48234044, Christine.Holst@fhi.no %K communication %K Grading of Recommendations Assessment, Development, and Evaluation language %K GRADE language %K statistical uncertainty %K overall uncertainty %K randomized trial %D 2025 %7 18.3.2025 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Well-designed public health messages can help people make informed choices, while poorly designed messages or persuasive messages can confuse, lead to poorly informed decisions, and diminish trust in health authorities and research. Communicating uncertainties to the public about the results of health research is challenging, necessitating research on effective ways to disseminate this important aspect of randomized trials. Objective: This study aimed to evaluate people’s understanding of overall and statistical uncertainty when presented with alternative ways of expressing randomized trial results. Methods: Two parallel, web-based, individually randomized trials (3×2 factorial designs) were conducted in the United States and Norway. Participants were randomized to 1 of 6 versions of a text (summary) communicating results from a study examining the effects of wearing glasses to prevent COVID-19 infection. 
The summaries varied in how overall uncertainty (“Grading of Recommendations Assessment, Development and Evaluation [GRADE] language,” “plain language,” or “no explicit language”) and statistical uncertainty (whether a margin of error was shown or not) were presented. Participants completed a web-based questionnaire exploring 4 coprimary outcomes: 3 measuring understanding of overall uncertainty (benefits, harms, and sufficiency of evidence) and 1 measuring understanding of statistical uncertainty. Participants were adults who did not wear glasses, recruited from web-based research panels in the United States and Norway. Results of the trials were analyzed separately and combined in a meta-analysis. Results: In the US and Norwegian trials, 730 and 497 individuals were randomized, respectively; data for 543 (74.4%) and 452 (90.9%) were analyzed. More participants had a correct understanding of uncertainty when presented with plain language (United States: 37/99, 37% and Norway: 40/76, 53%) than no explicit language (United States: 18/86, 21% and Norway: 34/80, 42%). A similar positive effect was seen for the GRADE language in the United States (26/79, 33%) but not in Norway (30/71, 42%). There were only small differences between groups for understanding the uncertainty of harms. Plain language improved correct understanding of evidence sufficiency (odds ratio 2.05, 95% CI 1.17-3.57), compared to no explicit language. The effect of GRADE language was inconclusive (odds ratio 1.34, 95% CI 0.79-2.28). The understanding of statistical uncertainty was improved when the participants were shown the margin of error compared to not being shown (Norway: 16/75, 21% to 24/71, 34% vs 1/71, 1% to 2/76, 3% and the United States: 21/101, 21% to 32/90, 36% vs 0/86, 0% to 3/79, 4%). Conclusions: Plain language, but not GRADE language, was better than no explicit language in helping people understand overall uncertainty of benefits and harms. 
Reporting margin of error improved understanding of statistical uncertainty around the effect of wearing glasses, but only for a minority of participants. Trial Registration: ClinicalTrials.gov NCT05642754; https://tinyurl.com/4mhjsm7s %M 40101228 %R 10.2196/62828 %U https://publichealth.jmir.org/2025/1/e62828 %U https://doi.org/10.2196/62828 %U http://www.ncbi.nlm.nih.gov/pubmed/40101228 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 11 %N %P e52972 %T COVID-19 Testing Equity in New York City During the First 2 Years of the Pandemic: Demographic Analysis of Free Testing Data %A Rosenfeld,Daniel %A Brennan,Sean %A Wallach,Andrew %A Long,Theodore %A Keeley,Chris %A Kurien,Sarah Joseph %K COVID-19 testing %K health disparities %K equity in testing %K New York City %K socioeconomic factors %K testing accessibility %K health care inequalities %K demographic analysis %K COVID-19 mortality %K coronavirus %K SARS-CoV-2 %K pandemic %K equitable testing %K cost %K poor neighborhood %K resources %D 2025 %7 13.3.2025 %9 %J JMIR Public Health Surveill %G English %X Background: COVID-19 has caused over 46,000 deaths in New York City, with a disproportional impact on certain communities. As part of the COVID-19 response, the city has directly administered over 6 million COVID-19 tests (in addition to millions of indirectly administered tests not covered in this analysis) at no cost to individuals, resulting in nearly half a million positive results. Given that the prevalence of testing, throughout the pandemic, has tended to be higher in more affluent areas, these tests were targeted to areas with fewer resources. Objective: This study aimed to evaluate the impact of New York City’s COVID-19 testing program; specifically, we aimed to review its ability to provide equitable testing in economically, geographically, and demographically diverse populations. 
Of note, in addition to the brick-and-mortar testing sites evaluated herein, this program conducted 2.1 million tests through mobile units to further address testing inequity. Methods: Testing data were collected from the in-house Microsoft SQL Server Management Studio 18 Clarity database, representing 6,347,533 total tests and 449,721 positive test results. These tests were conducted at 48 hospital system locations. Per capita testing rates by zip code tabulation area (ZCTA) and COVID-19 positivity rates by ZCTA were used as dependent variables in separate regressions. Median income, median age, the percentage of English-speaking individuals, and the percentage of people of color were used as independent demographic variables to analyze testing patterns across several intersecting identities. Negative binomial regressions were run in a Jupyter Notebook using Python. Results: Per capita testing was inversely correlated with median income geographically. The overall pseudo-R² value was 0.1101 when comparing hospital system tests by ZCTA against the selected variables. The number of tests significantly increased as median income fell (SE 1.00000155; P<.001). No other variable was significantly correlated with the number of tests (all P values were >.05). When considering positive test results by ZCTA, the number of positive test results also significantly increased as median income fell (SE 1.57e–6; P<.001) and as the percentage of female residents fell (SE 0.957; P=.001). The number of positive test results by ZCTA rose significantly with the percentage of English-only speakers (SE 0.271; P=.03). Conclusions: New York City’s COVID-19 testing program was able to improve equity through the provision of no-cost testing, which focused on areas of the city that were disproportionately impacted by COVID-19 and had fewer resources. 
By detecting higher numbers of positive test results in resource-poor neighborhoods, New York City was able to deploy additional resources, such as those for contact tracing and isolation and quarantine support (eg, free food delivery and free hotel stays), early during the COVID-19 pandemic. Equitable deployment of testing is feasible and should be considered early in future epidemics or pandemics. %R 10.2196/52972 %U https://publichealth.jmir.org/2025/1/e52972 %U https://doi.org/10.2196/52972 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 11 %N %P e64914 %T Characterizing US Spatial Connectivity and Implications for Geographical Disease Dynamics and Metapopulation Modeling: Longitudinal Observational Study %A Pullano,Giulia %A Alvarez-Zuzek,Lucila Gisele %A Colizza,Vittoria %A Bansal,Shweta %K geographical disease dynamics %K spatial connectivity %K mobility data %K metapopulation modeling %K COVID-19 %K human mobility %K infectious diseases %K social distancing %K epidemic %K mobile apps %K SafeGraph %K SARS-CoV-2 %K coronavirus %K pandemic %K spatio-temporal %K US %K public health %K mobile health %K mHealth %K digital health %K health informatics %D 2025 %7 18.2.2025 %9 %J JMIR Public Health Surveill %G English %X Background: Human mobility is expected to be a critical factor in the geographic diffusion of infectious diseases, and this assumption led to the implementation of social distancing policies during the early fight against the COVID-19 emergency in the United States. Yet, because of substantial data gaps in the past, what still eludes our understanding are the following questions: (1) How does mobility contribute to the spread of infection within the United States at local, regional, and national scales? (2) How do seasonality and shifts in behavior affect mobility over time? (3) At what geographic level is mobility homogeneous across the United States? 
Objective: This study aimed to address the questions that are critical for developing accurate transmission models, predicting the spatial propagation of disease across scales, and understanding the optimal geographical and temporal scale for the implementation of control policies. Methods: We analyzed high-resolution mobility data from mobile app usage from SafeGraph Inc, mapping daily connectivity between the US counties to grasp spatial clustering and temporal stability. Integrating this into a spatially explicit transmission model, we replicated SARS-CoV-2’s first wave invasion, assessing mobility’s spatiotemporal impact on disease predictions. Results: Analysis from 2019 to 2021 showed that mobility patterns remained stable, except for a decline in April 2020 due to lockdowns, which reduced daily movements from 45 million to approximately 25 million nationwide. Despite this reduction, intercounty connectivity remained seasonally stable, largely unaffected during the early COVID-19 phase, with a median Spearman coefficient of 0.62 (SD 0.01) between daily connectivity and gravity networks. We identified 104 geographic clusters of US counties with strong internal mobility connectivity and weaker links to counties outside these clusters. These clusters were stable over time, largely overlapping state boundaries (normalized mutual information=0.82) and demonstrating high temporal stability (normalized mutual information=0.95). Our findings suggest that intercounty connectivity is relatively static and homogeneous at the substate level. Furthermore, while county-level, daily mobility data best captures disease invasion, static mobility data aggregated to the cluster level also effectively models spatial diffusion. Conclusions: Our work demonstrates that intercounty mobility was negligibly affected outside the lockdown period in April 2020, explaining the broad spatial distribution of COVID-19 outbreaks in the United States during the early phase of the pandemic. 
Such geographically dispersed outbreaks place a significant strain on national public health resources and necessitate complex metapopulation modeling approaches for predicting disease dynamics and control design. We thus inform the design of such metapopulation models to balance high disease predictability with low data requirements. %R 10.2196/64914 %U https://publichealth.jmir.org/2025/1/e64914 %U https://doi.org/10.2196/64914 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 11 %N %P e65439 %T Consistency of Daily Number of Reported COVID-19 Cases in 191 Countries From 2020 to 2022: Comparative Analysis of 2 Major Data Sources %A Liu,Han %A Zong,Huiying %A Yang,Yang %A Schwebel,David C %A Xie,Bin %A Ning,Peishan %A Rao,Zhenzhen %A Li,Li %A Hu,Guoqing %K COVID-19 %K pandemic %K data consistency %K World Health Organization %K data quality %D 2025 %7 6.2.2025 %9 %J JMIR Public Health Surveill %G English %X Background: The COVID-19 pandemic represents one of the most challenging public health emergencies in recent world history, causing about 7.07 million deaths globally by September 24, 2024. Accurate, timely, and consistent data are critical for early response to situations like the COVID-19 pandemic. Objective: This study aimed to evaluate consistency of daily reported COVID-19 cases in 191 countries from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) and the World Health Organization (WHO) dashboards during 2020‐2022. Methods: We retrieved data concerning new daily COVID-19 cases in 191 countries covered by both data sources from January 22, 2020, to December 31, 2022. The ratios of numbers of daily reported cases from the 2 sources were calculated to measure data consistency. We performed simple linear regression to examine significant changes in the ratio of numbers of daily reported cases during the study period. 
Results: Of 191 WHO member countries, only 60 displayed excellent data consistency in the number of daily reported COVID-19 cases between the WHO and JHU CSSE dashboards (mean ratio 0.9-1.1). Data consistency changed greatly across the 191 countries from 2020 to 2022 and differed across 4 types of countries, categorized by income. Data inconsistency between the 2 data sources generally decreased slightly over time, both for the 191 countries combined and within the 4 types of income-defined countries. The absolute relative difference between the 2 data sources increased in 84 countries, particularly for Malta (R2=0.25), Montenegro (R2=0.30), and the United States (R2=0.29), but it decreased significantly in 40 countries. Conclusions: The inconsistency between the 2 data sources warrants further research. Construction of public health surveillance and data collection systems for public health emergencies like the COVID-19 pandemic should be strengthened in the future. %R 10.2196/65439 %U https://publichealth.jmir.org/2025/1/e65439 %U https://doi.org/10.2196/65439 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 11 %N %P e56820 %T Mapping Key Populations to Develop Improved HIV and AIDS Interventions: Multiphase Cross-Sectional Observational Mapping Study Using a District and City Approach %A Januraga,Pande Putu %A Lukitosari,Endang %A Luhukay,Lanny %A Hasby,Rizky %A Sutrisna,Aang %+ Center for Public Health Innovation, Faculty of Medicine, Udayana University, Jl PB Sudirman, Denpasar, 80232, Indonesia, 62 81246180389, januraga@unud.ac.id %K Indonesia %K key population %K mapping %K pandemic %K HIV %K AIDS %K hotspot %D 2025 %7 30.1.2025 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Indonesia’s vast archipelago and substantial population size present unique challenges in addressing its multifaceted HIV epidemic, with 90% of its 514 districts and cities reporting cases. 
Identifying key populations (KPs) is essential for effectively targeting interventions and allocating resources to address the changing dynamics of the epidemic. Objective: We examine the 2022 mapping of Indonesia’s KPs to develop improved HIV and AIDS interventions. Methods: In 2022, a district-based mapping of KPs was conducted across 201 districts and cities chosen for their HIV program intensity. This multiphase process included participatory workshops for hotspot identification, followed by direct hotspot observation, then followed by a second direct observation in selected hotspots for quality control. Data from 49,346 informants (KPs) were collected and analyzed. The results from individual hotspots were aggregated at the district or city level, and a formula was used to estimate the population size. Results: The mapping initiative identified 18,339 hotspots across 201 districts and cities, revealing substantial disparities in hotspot distribution. Of the 18,339 hotspots, 16,964 (92.5%) were observed, of which 1822 (10.74%) underwent a second review to enhance data accuracy. The findings mostly aligned with local stakeholders’ estimates, but showed a lower median. Interviews indicated a shift in KP dynamics, with a median decline in hotspot attendance since the pandemic, and there was notable variation in mapping results across district categories. In “comprehensive” areas, the average results for men who have sex with men (MSM), people who inject drugs, transgender women, and female sex workers (FSWs) were 1008 (median 694, IQR 317-1367), 224 (median 114, IQR 59-202), 196 (median 167, IQR 81-265), and 775 (median 573, IQR 352-1131), respectively. “Medium” areas had lower averages: MSM at 381 (median 199, IQR 91-454), people who inject drugs at 51 (median 54, IQR 15-63), transgender women at 101 (median 55, IQR 29-127), and FSWs at 304 (median 231, IQR 118-425). 
“Basic” areas showed the lowest averages: MSM at 161 (median 73, IQR 49-285), people who inject drugs at 7 (median 7, IQR 7-7), transgender women at 59 (median 26, IQR 12-60), and FSWs at 161 (median 131, IQR 59-188). Comparisons with ongoing outreach programs revealed substantial differences: the mapped MSM population was >50% lower than program coverage; the estimates for people who inject drugs were twice as high as the program coverage. Conclusions: The mapping results highlight significant variations in hotspots and KPs across districts and cities and underscore the necessity of adaptive HIV prevention strategies. The findings informed programmatic decisions, such as reallocating resources to underserved districts and recalibrating outreach strategies to better match KP dynamics. Developing strategies beyond identified hotspots, integrating mapping data into planning, and adopting a longitudinal approach to understand KP behavior over time are critical for effective HIV and AIDS prevention and control. 
%M 39883483 %R 10.2196/56820 %U https://publichealth.jmir.org/2025/1/e56820 %U https://doi.org/10.2196/56820 %U http://www.ncbi.nlm.nih.gov/pubmed/39883483 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 13 %N %P e59452 %T The Social Construction of Categorical Data: Mixed Methods Approach to Assessing Data Features in Publicly Available Datasets %A Willem,Theresa %A Wollek,Alessandro %A Cheslerean-Boghiu,Theodor %A Kenney,Martha %A Buyx,Alena %+ Institute of History and Ethics in Medicine, School of Medicine and Health, Technical University of Munich, Ismaningerstraße 22, Munich, 81675, Germany, 49 89 4140 4041, theresa.willem@tum.de %K machine learning %K categorical data %K social context dependency %K mixed methods %K dermatology %K dataset analysis %D 2025 %7 28.1.2025 %9 Original Paper %J JMIR Med Inform %G English %X Background: In data-sparse areas such as health care, computer scientists aim to leverage as much available information as possible to increase the accuracy of their machine learning models’ outputs. As a standard, categorical data, such as patients’ gender, socioeconomic status, or skin color, are used to train models in fusion with other data types, such as medical images and text-based medical information. However, the effects of including categorical data features for model training in such data-scarce areas are underexamined, particularly regarding models intended to serve individuals equitably in a diverse population. Objective: This study aimed to explore categorical data’s effects on machine learning model outputs, rooted the effects in the data collection and dataset publication processes, and proposed a mixed methods approach to examining datasets’ data categories before using them for machine learning training. Methods: Against the theoretical background of the social construction of categories, we suggest a mixed methods approach to assess categorical data’s utility for machine learning model training. 
As an example, we applied our approach to a Brazilian dermatological dataset (Dermatological and Surgical Assistance Program at the Federal University of Espírito Santo [PAD-UFES] 20). We first present an exploratory, quantitative study that assesses the effects when including or excluding each of the unique categorical data features of the PAD-UFES 20 dataset for training a transformer-based model using a data fusion algorithm. We then pair our quantitative analysis with a qualitative examination of the data categories based on interviews with the dataset authors. Results: Our quantitative study suggests scattered effects of including categorical data for machine learning model training across predictive classes. Our qualitative analysis gives insights into how the categorical data were collected and why they were published, explaining some of the quantitative effects that we observed. Our findings highlight the social constructedness of categorical data in publicly available datasets, meaning that the data in a category heavily depend on both how these categories are defined by the dataset creators and the sociomedico context in which the data are collected. This reveals relevant limitations of using publicly available datasets in contexts different from those of the collection of their data. Conclusions: We caution against using data features of publicly available datasets without reflection on the social construction and context dependency of their categorical data features, particularly in data-sparse areas. We conclude that social scientific, context-dependent analysis of available data features using both quantitative and qualitative methods is helpful in judging the utility of categorical data for the population for which a model is intended. 
%M 39874567 %R 10.2196/59452 %U https://medinform.jmir.org/2025/1/e59452 %U https://doi.org/10.2196/59452 %U http://www.ncbi.nlm.nih.gov/pubmed/39874567 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 9 %N %P e58981 %T Methods to Adjust for Confounding in Test-Negative Design COVID-19 Effectiveness Studies: Simulation Study %A Rowley,Elizabeth AK %A Mitchell,Patrick K %A Yang,Duck-Hye %A Lewis,Ned %A Dixon,Brian E %A Vazquez-Benitez,Gabriela %A Fadel,William F %A Essien,Inih J %A Naleway,Allison L %A Stenehjem,Edward %A Ong,Toan C %A Gaglani,Manjusha %A Natarajan,Karthik %A Embi,Peter %A Wiegand,Ryan E %A Link-Gelles,Ruth %A Tenforde,Mark W %A Fireman,Bruce %+ Westat, 1600 Research Blvd, Rockville, MD, 20850, United States, 1 301 251 1500, ELIZABETHROWLEY@WESTAT.COM %K disease risk score %K propensity score %K vaccine effectiveness %K COVID-19 %K simulation study %K usefulness %K comorbidity %K assessment %D 2025 %7 27.1.2025 %9 Original Paper %J JMIR Form Res %G English %X Background: Real-world COVID-19 vaccine effectiveness (VE) studies are investigating exposures of increasing complexity accounting for time since vaccination. These studies require methods that adjust for the confounding that arises when morbidities and demographics are associated with vaccination and the risk of outcome events. Methods based on propensity scores (PS) are well-suited to this when the exposure is dichotomous, but present challenges when the exposure is multinomial. Objective: This simulation study aimed to investigate alternative methods to adjust for confounding in VE studies that have a test-negative design. Methods: Adjustment for a disease risk score (DRS) is compared with multivariable logistic regression. Both stratification on the DRS and direct covariate adjustment of the DRS are examined. Multivariable logistic regression with all the covariates and with a limited subset of key covariates is considered. 
The performance of VE estimators is evaluated across a multinomial vaccination exposure in simulated datasets. Results: Bias in VE estimates from multivariable models ranged from –5.3% to 6.1% across 4 levels of vaccination. Standard errors of VE estimates were unbiased, and 95% coverage probabilities were attained in most scenarios. The lowest coverage in the multivariable scenarios was 93.7% (95% CI 92.2%-95.2%) and occurred in the multivariable model with key covariates, while the highest coverage in the multivariable scenarios was 95.3% (95% CI 94.0%-96.6%) and occurred in the multivariable model with all covariates. Bias in VE estimates from DRS-adjusted models was low, ranging from –2.2% to 4.2%. However, the DRS-adjusted models underestimated the standard errors of VE estimates, with coverage sometimes below the 95% level. The lowest coverage in the DRS scenarios was 87.8% (95% CI 85.8%-89.8%) and occurred in the direct adjustment for the DRS model. The highest coverage in the DRS scenarios was 94.8% (95% CI 93.4%-96.2%) and occurred in the model that stratified on DRS. Although variation in the performance of VE estimates occurred across modeling strategies, variation in performance was also present across exposure groups. Conclusions: Overall, models using a DRS to adjust for confounding performed adequately but not as well as the multivariable models that adjusted for covariates individually. %M 39869907 %R 10.2196/58981 %U https://formative.jmir.org/2025/1/e58981 %U https://doi.org/10.2196/58981 %U http://www.ncbi.nlm.nih.gov/pubmed/39869907 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 14 %N %P e64316 %T Interventions to Maintain HIV/AIDS, Tuberculosis, and Malaria Service Delivery During Public Health Emergencies in Low- and Middle-Income Countries: Protocol for a Systematic Review %A Kabwama,Steven Ndugwa %A Wanyenze,Rhoda K. 
%A Lindgren,Helena %A Razaz,Neda %A Ssenkusu,John M %A Alfvén,Tobias %+ Department of Global Public Health, Karolinska Institutet, Tomtebodavägen 18A, Solna, Stockholm, 17177, Sweden, 46 707578093, steven.ndugwa.kabwama@ki.se %K service availability %K emergencies %K tuberculosis %K malaria %K systematic reviews %K health services %K HIV %K AIDS %K public health emergency %K low- and middle-income countries %K qualitative reviews %K qualitative %K policies %K communities %K health facilities %K emergency %K implement %K implementation %D 2025 %7 15.1.2025 %9 Protocol %J JMIR Res Protoc %G English %X Background: Although existing disease preparedness and response frameworks provide guidance about strengthening emergency response capacity, little attention is paid to health service continuity during emergency responses. During the 2014 Ebola outbreak, there were 11,325 reported deaths due to the Ebola virus and yet disruption in access to care caused more than 10,000 additional deaths due to measles, HIV/AIDS, tuberculosis, and malaria. Low- and middle-income countries account for the largest disease burden due to HIV, tuberculosis, and malaria and yet previous responses to health emergencies showed that HIV, tuberculosis, and malaria service delivery can be significantly disrupted. To date, there has not been a systematic synthesis of interventions implemented to maintain the delivery of these services during emergencies. Objective: This study aimed to synthesize the interventions implemented to maintain HIV/AIDS, tuberculosis, and malaria services during public health emergencies in low- and middle-income countries. Methods: The systematic review was registered in the international register for prospective systematic reviews.
It will include activities undertaken to improve human health either through preventing the occurrence of HIV, tuberculosis, or malaria, reducing the severity among patients, or promoting the restoration of functioning lost as a result of experiencing HIV, tuberculosis, or malaria during health emergencies. These will include policy-level (eg, development of guidelines), health facility–level (eg, service rescheduling), and community-level interventions (eg, community drug distribution). Service delivery will be in terms of improving access, availability, use, and coverage. We will report on any interventions to maintain services along the care cascade for HIV, tuberculosis, or malaria. Peer-reviewed study databases including MEDLINE, Web of Science, Embase, Cochrane, and Global Index Medicus will be searched. Reference lists from global reports on HIV/AIDS, tuberculosis, or malaria will also be searched. We will use the GRADE-CERQual (Grading of Recommendations Assessment, Development, and Evaluation—Confidence in Evidence from Reviews of Qualitative Research) approach to report on the quality of evidence in each paper. The information from the studies will be synthesized at the disease or condition level (HIV/AIDS, tuberculosis, and malaria), implementation level (policy, health facility, and community), and outcomes (improving access, availability, use, or coverage). We will use the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist to report findings and discuss implications for strengthening preparedness and response, as well as strengthening health systems in low- and middle-income countries. Results: The initial search for published literature was conducted between January 2023 and March 2023 and yielded 8119 studies. At the time of publication, synthesis and interpretation of results were being concluded. Final results will be published in 2025. 
Conclusions: The findings will inform the development of national and global guidance to minimize disruption of services for patients with HIV/AIDS, tuberculosis, and malaria during public health emergencies. Trial Registration: PROSPERO CRD42023408967; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=408967 International Registered Report Identifier (IRRID): PRR1-10.2196/64316 %M 39813677 %R 10.2196/64316 %U https://www.researchprotocols.org/2025/1/e64316 %U https://doi.org/10.2196/64316 %U http://www.ncbi.nlm.nih.gov/pubmed/39813677 %0 Journal Article %@ 1947-2579 %I JMIR Publications %V 17 %N %P e56495 %T Nowcasting to Monitor Real-Time Mpox Trends During the 2022 Outbreak in New York City: Evaluation Using Reportable Disease Data Stratified by Race or Ethnicity %A Rohrer,Rebecca %A Wilson,Allegra %A Baumgartner,Jennifer %A Burton,Nicole %A Ortiz,Ray R %A Dorsinville,Alan %A Jones,Lucretia E %A Greene,Sharon K %K data quality %K epidemiology %K forecasting %K infectious disease %K morbidity and mortality trends %K mpox %K nowcasting %K public health practice %K surveillance %D 2025 %7 14.1.2025 %9 %J Online J Public Health Inform %G English %X Background: Applying nowcasting methods to partially accrued reportable disease data can help policymakers interpret recent epidemic trends despite data lags and quickly identify and remediate health inequities. During the 2022 mpox outbreak in New York City, we applied Nowcasting by Bayesian Smoothing (NobBS) to estimate recent cases, citywide and stratified by race or ethnicity (Black or African American, Hispanic or Latino, and White). However, in real time, it was unclear if the estimates were accurate. Objective: We evaluated the accuracy of estimated mpox case counts across a range of NobBS implementation options. 
Methods: We evaluated NobBS performance for New York City residents with a confirmed or probable mpox diagnosis or illness onset from July 8 through September 30, 2022, as compared with fully accrued cases. We used the exponentiated average log score (average score) to compare moving window lengths, stratifying or not by race or ethnicity, diagnosis and onset dates, and daily and weekly aggregation. Results: During the study period, 3305 New York City residents were diagnosed with mpox (median 4, IQR 3-5 days from diagnosis to diagnosis report). Of these, 812 (25%) had missing onset dates, and of these, 230 (28%) had unknown race or ethnicity. The median lag in days from onset to onset report was 10 (IQR 7-14). For daily hindcasts by diagnosis date, the average score was 0.27 for the 14-day moving window used in real time. Average scores improved (increased) with longer moving windows (maximum: 0.47 for 49-day window). Stratifying by race or ethnicity improved performance, with an overall average score of 0.38 for the 14-day moving window (maximum: 0.57 for 49-day window). Hindcasts for White patients performed best, with average scores of 0.45 for the 14-day window and 0.75 for the 49-day window. For unstratified, daily hindcasts by onset date, the average score ranged from 0.16 for the 42-day window to 0.30 for the 14-day window. Performance was not improved by weekly aggregation. Hindcasts underestimated diagnoses in early August after the epidemic peaked, then overestimated diagnoses in late August as the epidemic waned. Estimates were most accurate during September when cases were low and stable. Conclusions: Performance was better when hindcasting by diagnosis date than by onset date, consistent with shorter lags and higher completeness for diagnoses. For daily hindcasts by diagnosis date, longer moving windows performed better, but direct comparisons are limited because longer windows could only be assessed after case counts in this outbreak had stabilized.
Stratification by race or ethnicity improved performance and identified differences in epidemic trends across patient groups. Contributors to differences in performance across strata might include differences in case volume, epidemic trends, delay distributions, and interview success rates. Health departments need reliable nowcasting and rapid evaluation tools, particularly to promote health equity by ensuring accurate estimates within all strata. %R 10.2196/56495 %U https://ojphi.jmir.org/2025/1/e56495 %U https://doi.org/10.2196/56495 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 9 %N %P e59230 %T Quantifying the Regional Disproportionality of COVID-19 Spread: Modeling Study %A Sasaki,Kenji %A Ikeda,Yoichi %A Nakano,Takashi %K infectious disease %K COVID-19 %K epidemiology %K public health %K SARS-CoV-2 %K pandemic %K inequality measure %K information theory %K Kullback-Leibler divergence %D 2025 %7 3.1.2025 %9 %J JMIR Form Res %G English %X Background: The COVID-19 pandemic has caused serious health, economic, and social consequences worldwide. Understanding how infectious diseases spread can help mitigate these impacts. The Theil index, a measure of inequality rooted in information theory, is useful for identifying geographic disproportionality in COVID-19 incidence across regions. Objective: This study focused on capturing the degrees of regional disproportionality in incidence rates of infectious diseases over time. Using the Theil index, we aim to assess regional disproportionality in the spread of COVID-19 and detect epicenters where the number of infected individuals was disproportionately concentrated. Methods: To quantify the degree of disproportionality in the incidence rates, we applied the Theil index to the publicly available data of daily confirmed COVID-19 cases in the United States over a 1100-day period. 
This index measures relative disproportionality by comparing daily regional case distributions with population proportions, thereby identifying regions where infections are disproportionately concentrated. Results: Our analysis revealed a dynamic pattern of regional disproportionality in the confirmed cases by monitoring variations in regional contributions to the Theil index as the pandemic progressed. Over time, the index reflected a transition from localized outbreaks to widespread transmission, with high values corresponding to concentrated cases in some regions. We also found that the peaks in the Theil index often preceded surges in confirmed cases, suggesting its potential utility as an early warning signal. Conclusions: This study demonstrated that the Theil index is one of the effective indices for quantifying regional disproportionality in COVID-19 incidence rates. Although the Theil index alone cannot fully capture all aspects of pandemic dynamics, it serves as a valuable tool when used alongside other indicators such as infection and hospitalization rates. This approach allows policy makers to monitor regional disproportionality efficiently, offering insights for early intervention and targeted resource allocation. 
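As an illustration of the index described in this abstract, the following minimal Python sketch computes a Theil-type measure as the Kullback-Leibler divergence between regional case shares and regional population shares. This particular formula, the function name `theil_index`, and the example numbers are assumptions made for illustration; the abstract does not give the authors' exact definition.

```python
import math

def theil_index(cases, population):
    """Return (index, per-region contributions).

    Assumed form: T = sum_i s_i * ln(s_i / p_i), where s_i is region i's
    share of cases and p_i its share of the population. T is 0 when cases
    are spread in proportion to population and grows as cases concentrate.
    """
    if len(cases) != len(population):
        raise ValueError("cases and population must have the same length")
    total_cases = sum(cases)
    total_pop = sum(population)
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    if total_cases == 0:
        # No cases anywhere: treat disproportionality as zero.
        return 0.0, [0.0] * len(cases)
    contributions = []
    for c, n in zip(cases, population):
        s = c / total_cases
        p = n / total_pop
        # A region with no cases contributes 0 (limit of s * ln(s) as s -> 0).
        contributions.append(s * math.log(s / p) if s > 0 else 0.0)
    return sum(contributions), contributions

# Hypothetical data: three regions of equal size, all cases in one of them.
index, parts = theil_index([60, 0, 0], [100, 100, 100])
```

Under this assumed form, the per-region contributions sum to the index, so a region whose contribution is large and positive is a candidate epicenter on that day.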
%R 10.2196/59230 %U https://formative.jmir.org/2025/1/e59230 %U https://doi.org/10.2196/59230 %0 Journal Article %@ 1929-073X %I JMIR Publications %V 13 %N %P e54240 %T Close-Up on Ambulance Service Estimation in Indonesia: Monte Carlo Simulation Study %A Brice,Syaribah N %A Boutilier,Justin J %A Palmer,Geraint %A Harper,Paul R %A Knight,Vincent %A Tuson,Mark %A Gartner,Daniel %+ School of Mathematics, Cardiff University, Senghennydd Road, Cardiff, CF24 4AG, United Kingdom, 44 (0)29 2087 4811, BriceSN@cardiff.ac.uk %K emergency medical services %K ambulance services %K hospital emergency services %K Southeast Asian countries %K low-and-middle-income countries %K EMS %K survey %D 2024 %7 13.12.2024 %9 Original Paper %J Interact J Med Res %G English %X Background: Emergency medical services have a pivotal role in giving timely and appropriate responses to emergency events caused by medical, natural, or human-caused disasters. To provide adequate resources for the emergency services, such as ambulances, it is necessary to understand the demand for such services. In Indonesia, estimates of demand for emergency services cannot be obtained easily due to a lack of published literature or official reports concerning the matter. Objective: This study aimed to ascertain an estimate of the annual volume of hospital emergency visits and the corresponding demand for ambulance services in the city of Jakarta. Methods: In this study, we addressed the problem of emergency services demand estimation when aggregated detailed data are not available or are not part of the routine data collection. We used survey data together with the local Office of National Statistics reports and sample data from hospital emergency departments to establish parameter estimation. 
This involved estimating 4 parameters: the population of each area per period (day and night), the annual per capita hospital emergency visits, the probability of an emergency taking place in each period, and the rate of ambulance need per area. Monte Carlo simulation and naïve methods were used to generate an estimation for the mean ambulance needs per area in Jakarta. Results: The results estimated that the total annual ambulance need in Jakarta is between 83,000 and 241,000. Assuming the rate of ambulance usage in Jakarta at 9.3%, we estimated the total annual hospital emergency visits in Jakarta at around 0.9-2.6 million. The study also found that the estimation from using the simulation method was smaller than the average (naïve) methods (P<.001). Conclusions: The results provide an estimation of the annual emergency services needed for the city of Jakarta. In the absence of aggregated routinely collected data on emergency medical service usage in Jakarta, our results provide insights into whether the current emergency services, such as ambulances, have been adequately provided. 
%M 39671572 %R 10.2196/54240 %U https://www.i-jmr.org/2024/1/e54240 %U https://doi.org/10.2196/54240 %U http://www.ncbi.nlm.nih.gov/pubmed/39671572 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e53218 %T Updated Surveillance Metrics and History of the COVID-19 Pandemic (2020-2023) in Canada: Longitudinal Trend Analysis %A Wu,Scott A %A Soetikno,Alan G %A Ozer,Egon A %A Welch,Sarah B %A Liu,Yingxuan %A Havey,Robert J %A Murphy,Robert L %A Hawkins,Claudia %A Mason,Maryann %A Post,Lori A %A Achenbach,Chad J %A Lundberg,Alexander L %+ Buehler Center for Health Policy and Economics, Robert J Havey, MD Institute for Global Health, Northwestern University, 420 E Superior, Chicago, IL, 60611, United States, 1 3125031706, lori.post@northwestern.edu %K SARS-CoV-2 %K COVID-19 %K Canada %K pandemic %K surveillance %K transmission %K acceleration %K deceleration %K dynamic panel %K generalized method of moments %K GMM %K Arellano-Bond %K 7-day lag %K k %K metrics %K epidemiology %K dynamic %K genomic %K historical context %K outbreak threshold %D 2024 %7 5.12.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: This study provides an update on the status of the COVID-19 pandemic in Canada, building upon our initial analysis conducted in 2020 by incorporating an additional 2 years of data. Objective: This study aims to (1) summarize the status of the pandemic in Canada when the World Health Organization (WHO) declared the end of the public health emergency for the COVID-19 pandemic on May 5, 2023; (2) use dynamic and genomic surveillance methods to describe the history of the pandemic in Canada and situate the window of the WHO declaration within the broader history; and (3) provide historical context for the course of the pandemic in Canada. Methods: This longitudinal study analyzed trends in traditional surveillance data and dynamic panel estimates for COVID-19 transmissions and deaths in Canada from June 2020 to May 2023. 
We also used sequenced SARS-CoV-2 variants from the Global Initiative on Sharing All Influenza Data (GISAID) to identify the appearance and duration of variants of concern. For these sequences, we used Nextclade nomenclature to collect clade designations and Pangolin nomenclature for lineage designations of SARS-CoV-2. We used 1-sided t tests of dynamic panel regression coefficients to measure the persistence of COVID-19 transmissions around the WHO declaration. Finally, we conducted a 1-sided t test for whether provincial and territorial weekly speed was greater than an outbreak threshold of 10. We ran the test iteratively with 6 months of data across the sample period. Results: Canada’s speed remained below the outbreak threshold for 8 months by the time of the WHO declaration ending the COVID-19 emergency of international concern. Acceleration and jerk were also low and stable. While the 1-day persistence coefficient remained statistically significant and positive (1.074; P<.001), the 7-day coefficient was negative and small in magnitude (–0.080; P=.02). Furthermore, shift parameters for either of the 2 most recent weeks around May 5, 2023, were negligible (0.003 and 0.018, respectively, with P values of .75 and .31), meaning the clustering effect of new COVID-19 cases had remained stable in the 2 weeks around the WHO declaration. From December 2021 onward, Omicron was the predominant variant of concern in sequenced viral samples. The rolling 1-sided t test of speed equal to 10 became entirely insignificant from mid-October 2022 onward. Conclusions: While COVID-19 continues to circulate in Canada, the rate of transmission remained well below the threshold of an outbreak for 8 months ahead of the WHO declaration. Both standard and enhanced surveillance metrics confirm that the pandemic had largely ended in Canada by the time of the WHO declaration. 
These results can inform future public health interventions and strategies in Canada, as well as contribute to the global understanding of the trajectory of the COVID-19 pandemic. %M 39471286 %R 10.2196/53218 %U https://publichealth.jmir.org/2024/1/e53218 %U https://doi.org/10.2196/53218 %U http://www.ncbi.nlm.nih.gov/pubmed/39471286 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 12 %N %P e53622 %T Distributed Statistical Analyses: A Scoping Review and Examples of Operational Frameworks Adapted to Health Analytics %A Camirand Lemyre,Félix %A Lévesque,Simon %A Domingue,Marie-Pier %A Herrmann,Klaus %A Ethier,Jean-François %K distributed algorithms %K generalized linear models %K horizontally partitioned data %K GLMs %K learning health systems %K distributed analysis %K federated analysis %K data science %K data custodians %K algorithms %K statistics %K synthesis %K review methods %K searches %K scoping %D 2024 %7 14.11.2024 %9 %J JMIR Med Inform %G English %X Background: Data from multiple organizations are crucial for advancing learning health systems. However, ethical, legal, and social concerns may restrict the use of standard statistical methods that rely on pooling data. Although distributed algorithms offer alternatives, they may not always be suitable for health frameworks. Objective: This study aims to support researchers and data custodians in three ways: (1) providing a concise overview of the literature on statistical inference methods for horizontally partitioned data, (2) describing the methods applicable to generalized linear models (GLMs) and assessing their underlying distributional assumptions, and (3) adapting existing methods to make them fully usable in health settings. Methods: A scoping review methodology was used for the literature mapping, from which methods presenting a methodological framework for GLM analyses with horizontally partitioned data were identified and assessed from the perspective of applicability in health settings. 
Statistical theory was used to adapt methods and derive the properties of the resulting estimators. Results: From the review, 41 articles were selected and 6 approaches were extracted to conduct standard GLM-based statistical analysis. However, these approaches assumed evenly and identically distributed data across nodes. Consequently, statistical procedures were derived to accommodate uneven node sample sizes and heterogeneous data distributions across nodes. Workflows and detailed algorithms were developed to highlight information sharing requirements and operational complexity. Conclusions: This study contributes to the field of health analytics by providing an overview of the methods that can be used with horizontally partitioned data by adapting these methods to the context of heterogeneous health data and clarifying the workflows and quantities exchanged by the methods discussed. Further analysis of the confidentiality preserved by these methods is needed to fully understand the risk associated with the sharing of summary statistics. 
%R 10.2196/53622 %U https://medinform.jmir.org/2024/1/e53622 %U https://doi.org/10.2196/53622 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e54503 %T Assessment of the Effective Sensitivity of SARS-CoV-2 Sample Pooling Based on a Large-Scale Screening Experience: Retrospective Analysis %A Cabrera Alvargonzalez,Jorge J %A Larrañaga,Ana %A Martinez,Javier %A Pérez Castro,Sonia %A Rey Cao,Sonia %A Daviña Nuñez,Carlos %A Del Campo Pérez,Víctor %A Duran Parrondo,Carmen %A Suarez Luque,Silvia %A González Alonso,Elena %A Silva Tojo,Alfredo José %A Porteiro,Jacobo %A Regueiro,Benito %+ Microbiology Department, Complexo Hospitalario Universitario de Vigo, Servicio Galego de Saude, Estrada de Clara Campoamor, 341, Vigo, 36312, Spain, 34 986811111, jorge.julio.cabrera.alvargonzalez@sergas.es %K pooling %K sensitivity %K SARS-CoV-2 %K PCR %K saliva %K screening %K surveillance %K COVID-19 %K nonsymptomatic %K transmission control %D 2024 %7 24.9.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The development of new large-scale saliva pooling detection strategies can significantly enhance testing capacity and frequency for asymptomatic individuals, which is crucial for containing SARS-CoV-2. Objective: This study aims to implement and scale-up a SARS-CoV-2 screening method using pooled saliva samples to control the virus in critical areas and assess its effectiveness in detecting asymptomatic infections. Methods: Between August 2020 and February 2022, our laboratory received a total of 928,357 samples. Participants collected at least 1 mL of saliva using a self-sampling kit and registered their samples via a smartphone app. All samples were directly processed using AutoMate 2550 for preanalytical steps and then transferred to Microlab STAR, managed with the HAMILTON Pooling software for pooling. The standard pool preset size was 20 samples but was adjusted to 5 when the prevalence exceeded 2% in any group. 
Real-time polymerase chain reaction (RT-PCR) was conducted using the Allplex SARS-CoV-2 Assay until July 2021, followed by the Allplex SARS-CoV-2 FluA/FluB/RSV assay for the remainder of the study period. Results: Of the 928,357 samples received, 887,926 (95.64%) were fully processed into 56,126 pools. Of these pools, 4863 tested positive, detecting 5720 asymptomatic infections. This allowed for a comprehensive analysis of pooling’s impact on RT-PCR sensitivity and false-negative rate (FNR), including data on positive samples per pool (PPP). We defined Ctref as the minimum cycle threshold (Ct) of each data set from a sample or pool and compared these Ctref results from pooled samples with those of the individual tests (ΔCtP). We then examined their deviation from the expected offset due to dilution [ΔΔCtP = ΔCtP – log2]. In this work, the ΔCtP and ΔΔCtP were 2.23 versus 3.33 and –0.89 versus 0.23, respectively, comparing global results with results for pools with 1 positive sample per pool. Therefore, depending on the number of genes used in the test and the size of the pool, we can evaluate the FNR and effective sensitivity (1 – FNR) of the test configuration. In our scenario, with a maximum of 20 samples per pool and 3 target genes, statistical observations indicated an effective sensitivity exceeding 99%. From an economic perspective, the focus is on pooling efficiency, measured by the effective number of persons that can be tested with 1 test, referred to as persons per test (PPT). In this study, the global PPT was 8.66, reflecting savings of over 20 million euros (US $22 million) based on our reagent prices. Conclusions: Our results demonstrate that, as expected, pooling reduces the sensitivity of RT-PCR. However, with the appropriate pool size and the use of multiple target genes, effective sensitivity can remain above 99%. 
Saliva pooling may be a valuable tool for screening and surveillance in asymptomatic individuals and can aid in controlling SARS-CoV-2 transmission. Further studies are needed to assess the effectiveness of these strategies for SARS-CoV-2 and their application to other microorganisms or biomarkers detected by PCR. %M 39316785 %R 10.2196/54503 %U https://publichealth.jmir.org/2024/1/e54503 %U https://doi.org/10.2196/54503 %U http://www.ncbi.nlm.nih.gov/pubmed/39316785 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e48289 %T Further Exploring the Public Health Implications of the Network Scale-Up Method: Cross-Sectional Survey Study %A Jing,Liwei %A Yu,Hongmei %A Lu,Qing %K network scale-up method %K public health implications %K people who inject drugs %K popularity ratio %K information transmission rate %K PWID %D 2024 %7 23.8.2024 %9 %J JMIR Public Health Surveill %G English %X Background: The decline in the number of new HIV infections among adults has slowed down, gradually becoming the biggest obstacle to achieving the 2030 target of ending the HIV/AIDS epidemic. Thus, a political declaration to ensure that 90% of people at high risk of HIV infection can access comprehensive prevention services was proposed by the United Nations General Assembly. Therefore, obtaining an accurate estimated size of high-risk populations is required as a prior condition to plan and implement HIV prevention services. The network scale-up method (NSUM) was recommended by the United Nations Programme on HIV/AIDS and the World Health Organization to estimate the sizes of populations at high risk of HIV infection; however, we found that the NSUM also revealed underlying population characteristics of female sex workers in addition to being used to estimate the population size. Such information on underlying population characteristics is very useful in improving the planning and implementation of HIV prevention services. 
This is especially relevant for people who inject drugs, where in addition to stigma and discrimination, criminalization further hinders access to HIV prevention services. Objective: We aimed to conduct a further exploration of the public health implications of the NSUM by using it to estimate the population size, popularity ratio, and information transmission rate among people who inject drugs. Methods: A stratified 2-stage cluster survey of the general population and a respondent-driven sampling survey of people who inject drugs were conducted in the urban district of Taiyuan, China, in 2021. Results: The estimated size of the population of people who inject drugs in Taiyuan was 1241.9 (95% CI 1009.2‐1474.9), corresponding to 4.4×10⁻²% (95% CI 3.6×10⁻²% to 5.2×10⁻²%) of the adult population aged 15‐64 years. The estimated popularity ratio of people who inject drugs was 53.6% (95% CI 47.2%‐60.1%), and the estimated information transmission rate was 87.9% (95% CI 86.5%‐89.3%). Conclusions: In addition to being used to estimate the size of the population of people who inject drugs, the NSUM revealed that they have smaller-sized personal social networks while concealing their drug use, and these underlying population characteristics are extremely useful for planning appropriate service delivery approaches with the fewest barriers for people who inject drugs to access HIV prevention services. This added cost-effectiveness gives the NSUM new public health implications and makes its application even more promising.
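For readers unfamiliar with the network scale-up method, the sketch below shows the basic ratio estimator commonly associated with it, with an optional correction that divides by the popularity ratio and the information transmission rate. The function name `nsum_estimate`, the choice of correction, and all numbers are illustrative assumptions; the abstract does not state which estimator or adjustments the authors applied.

```python
def nsum_estimate(known_members, network_sizes, total_population,
                  popularity_ratio=1.0, transmission_rate=1.0):
    """Estimate the size of a hidden population (illustrative sketch).

    known_members[i]  : hidden-population members respondent i reports knowing
    network_sizes[i]  : estimated personal network size of respondent i
    total_population  : size of the general population surveyed
    popularity_ratio  : assumed ratio of hidden-population network size to
                        general-population network size (1.0 = no correction)
    transmission_rate : assumed share of contacts aware of the member's
                        status (1.0 = no correction)
    """
    if len(known_members) != len(network_sizes):
        raise ValueError("inputs must have the same length")
    total_degree = sum(network_sizes)
    if total_degree <= 0:
        raise ValueError("total network size must be positive")
    if not (0 < popularity_ratio and 0 < transmission_rate <= 1):
        raise ValueError("adjustment factors out of range")
    # Basic scale-up: share of all reported ties that lead to the hidden group.
    basic = sum(known_members) / total_degree * total_population
    # Both corrections inflate the basic estimate when they are below 1.
    return basic / (popularity_ratio * transmission_rate)

# Hypothetical survey of four respondents in a population of 100,000.
unadjusted = nsum_estimate([1, 0, 2, 1], [100, 150, 250, 300], 100_000)
adjusted = nsum_estimate([1, 0, 2, 1], [100, 150, 250, 300], 100_000,
                         popularity_ratio=0.5, transmission_rate=0.8)
```

The correction factors show why the two quantities reported in the abstract matter: when the hidden population has smaller networks or conceals its status, an unadjusted estimate would be too low.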
%R 10.2196/48289 %U https://publichealth.jmir.org/2024/1/e48289 %U https://doi.org/10.2196/48289 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e57742 %T Risk Index of Regional Infection Expansion of COVID-19: Moving Direction Entropy Study Using Mobility Data and Its Application to Tokyo %A Ohsawa,Yukio %A Sun,Yi %A Sekiguchi,Kaira %A Kondo,Sae %A Maekawa,Tomohide %A Takita,Morihito %A Tanimoto,Tetsuya %A Kami,Masahiro %+ School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan, 81 358417012, ohsawa@sys.t.u-tokyo.ac.jp %K suppressing the spread of infection %K index for risk assessment %K local regions %K diversity of mobility %K mobility data %K moving direction entropy %K MDE %K social network model %K COVID-19 %K influenza %K sexually transmitted diseases %D 2024 %7 21.8.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Policies, such as stay home, bubbling, and stay with your community, recommending that individuals reduce contact with diverse communities, including families and schools, have been introduced to mitigate the spread of the COVID-19 pandemic. However, these policies are violated if individuals from various communities gather, which is a latent risk in a real society where people move among various unreported communities. Objective: We aimed to create a physical index to assess the possibility of contact between individuals from diverse communities, which serves as an indicator of the potential risk of SARS-CoV-2 spread when considered and combined with existing indices. Methods: Moving direction entropy (MDE), which quantifies the diversity of moving directions of individuals in each local region, is proposed as an index to evaluate a region’s risk of contact of individuals from diverse communities. MDE was computed for each inland municipality in Tokyo using mobility data collected from smartphones before and during the COVID-19 pandemic. 
To validate the hypothesis that the impact of intercommunity contact on infection expansion becomes larger for a virus with larger infectivity, we compared the correlations of the expansion of infectious diseases with indices, including MDE and the densities of supermarkets, restaurants, etc. In addition, we analyzed the temporal changes in MDE in municipalities. Results: This study had 4 important findings. First, the MDE values for local regions showed significant invariance between different periods according to the Spearman rank correlation coefficient (>0.9). Second, MDE was found to correlate with the rate of infection cases of COVID-19 among local populations in 53 inland regions (average of 0.76 during the period of expansion). The density of restaurants had a similar correlation with COVID-19. The correlation between MDE and the rate of infection was smaller for influenza than for COVID-19, and tended to be even smaller for sexually transmitted diseases (order of infectivity). These findings support the hypothesis. Third, the spread of COVID-19 was accelerated in regions with high-rank MDE values compared to those with high-rank restaurant densities during and after the period of the governmental declaration of emergency (P<.001). Fourth, the MDE values tended to be high and increased during the pandemic period in regions where influx or daytime movement was present. A possible explanation for the third and fourth findings is that policymakers and living people have been overlooking MDE. Conclusions: We recommend monitoring the regional values of MDE to reduce the risk of infection spread. To aid in this monitoring, we present a method to create a heatmap of MDE values, thereby drawing public attention to behaviors that facilitate contact between communities during a highly infectious disease pandemic. 
%M 39037745 %R 10.2196/57742 %U https://publichealth.jmir.org/2024/1/e57742 %U https://doi.org/10.2196/57742 %U http://www.ncbi.nlm.nih.gov/pubmed/39037745 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e55183 %T The Mediating Role of Human Mobility in Temporal-Lagged Relationships Between Risk Perception and COVID-19 Dynamics in Taiwan: Statistical Modeling for Comparing the Pre-Omicron and Omicron Eras %A Chang,Min-Chien %A Wen,Tzai-Hung %K human mobility %K risk perception %K COVID-19 %K Omicron %K Taiwan %K pandemic %K disease transmission %K pandemic dynamics %K global threats %K infectious disease %K behavioural health %K public health %K surveillance %D 2024 %7 20.8.2024 %9 %J JMIR Public Health Surveill %G English %X Background: The COVID-19 pandemic has profoundly impacted all aspects of human life for over 3 years. Understanding the evolution of public risk perception during these periods is crucial. Few studies explore the mechanisms for reducing disease transmission due to risk perception. Thus, we hypothesize that changes in human mobility play a mediating role between risk perception and the progression of the pandemic. Objective: The study aims to explore how various forms of human mobility, including essential, nonessential, and job-related behaviors, mediate the temporal relationships between risk perception and pandemic dynamics. Methods: We used distributed-lag linear structural equation models to compare the mediating impact of human mobility across different virus variant periods. These models examined the temporal dynamics and time-lagged effects among risk perception, changes in mobility, and virus transmission in Taiwan, focusing on two distinct periods: (1) April-August 2021 (pre-Omicron era) and (2) February-September 2022 (Omicron era). 
Results: In the pre-Omicron era, our findings showed that an increase in public risk perception correlated with significant reductions in COVID-19 cases across various types of mobility within specific time frames. Specifically, we observed a decrease of 5.59 (95% CI −4.35 to −6.83) COVID-19 cases per million individuals after 7 weeks in nonessential mobility, while essential mobility demonstrated a reduction of 10.73 (95% CI −9.6030 to −11.8615) cases after 8 weeks. Additionally, job-related mobility resulted in a decrease of 3.96 (95% CI −3.5039 to −4.4254) cases after 11 weeks. However, during the Omicron era, these effects notably diminished. A reduction of 0.85 (95% CI −1.0046 to −0.6953) cases through nonessential mobility after 10 weeks and a decrease of 0.69 (95% CI −0.7827 to −0.6054) cases through essential mobility after 12 weeks were observed. Conclusions: This study confirms that changes in mobility serve as a mediating factor between heightened risk perception and pandemic mitigation in both pre-Omicron and Omicron periods. This suggests that elevating risk perception is notably effective in impeding virus progression, especially when vaccines are unavailable or their coverage remains limited. Our findings provide significant value for health authorities in devising policies to address the global threats posed by emerging infectious diseases. 
%R 10.2196/55183 %U https://publichealth.jmir.org/2024/1/e55183 %U https://doi.org/10.2196/55183 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e53719 %T Handling Missing Data in COVID-19 Incidence Estimation: Secondary Data Analysis %A Pham,Hai-Thanh %A Do,Toan %A Baek,Jonggyu %A Nguyen,Cong-Khanh %A Pham,Quang-Thai %A Nguyen,Hoa L %A Goldberg,Robert %A Pham,Quang Loc %A Giang,Le Minh %K imputation method %K COVID-19 incidence rate %K crude bias %K crude RMSE %K root mean square error %K percentage change %K pandemic %K Vietnam %K surveillance %K population health %K analytical method %D 2024 %7 20.8.2024 %9 %J JMIR Public Health Surveill %G English %X Background: The COVID-19 pandemic has revealed significant challenges in disease forecasting and in developing a public health response, emphasizing the need to manage missing data from various sources in making accurate forecasts. Objective: We aimed to show how handling missing data can affect estimates of the COVID-19 incidence rate (CIR) in different pandemic situations. Methods: This study used data from the COVID-19/SARS-CoV-2 surveillance system at the National Institute of Hygiene and Epidemiology, Vietnam. We separated the available data set into 3 distinct periods: zero COVID-19, transition, and new normal. We randomly removed 5% to 30% of data that were missing completely at random, with a break of 5% at each time point in the variable daily caseload of COVID-19. We selected 7 analytical methods to assess the effects of handling missing data and calculated statistical and epidemiological indices to measure the effectiveness of each method. Results: Our study examined missing data imputation performance across 3 study time periods: zero COVID-19 (n=3149), transition (n=1290), and new normal (n=9288). Imputation analyses showed that K-nearest neighbor (KNN) had the lowest mean absolute percentage change (APC) in CIR across the range (5% to 30%) of missing data. 
For instance, with 15% missing data, KNN resulted in 10.6%, 10.6%, and 9.7% average bias across the zero COVID-19, transition, and new normal periods, compared to 39.9%, 51.9%, and 289.7% with the maximum likelihood method. The autoregressive integrated moving average model showed the greatest mean APC in the mean number of confirmed cases of COVID-19 during each COVID-19 containment cycle (CCC) when we imputed the missing data in the zero COVID-19 period, rising from 226.3% at the 5% missing level to 6955.7% at the 30% missing level. Imputing missing data with median imputation methods had the lowest bias in the average number of confirmed cases in each CCC at all levels of missing data. In detail, in the 20% missing scenario, while median imputation had an average bias of 16.3% for confirmed cases in each CCC, which was lower than the KNN figure, maximum likelihood imputation showed a bias on average of 92.4% for confirmed cases in each CCC, which was the highest figure. During the new normal period in the 25% and 30% missing data scenarios, KNN imputation had average biases for CIR and confirmed cases in each CCC ranging from 21% to 32% for both, while maximum likelihood and moving average imputation showed biases on average above 250% for both CIR and confirmed cases in each CCC. Conclusions: Our study emphasizes the importance of understanding that the specific imputation method used by investigators should be tailored to the specific epidemiological context and data collection environment to ensure reliable estimates of the CIR. 
%R 10.2196/53719 %U https://publichealth.jmir.org/2024/1/e53719 %U https://doi.org/10.2196/53719 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e53371 %T Social Determinants of Health Phenotypes and Cardiometabolic Condition Prevalence Among Patients in a Large Academic Health System: Latent Class Analysis %A Howell,Carrie R %A Zhang,Li %A Clay,Olivio J %A Dutton,Gareth %A Horton,Trudi %A Mugavero,Michael J %A Cherrington,Andrea L %K social determinants of health %K electronic medical record %K phenotypes %K diabetes %K obesity %K cardiovascular disease %K obese %K social determinants %K social determinant %K cardiometabolic %K risk factors %K risk factor %K latent class analysis %K cardiometabolic disease %K EMR %K EHR %K electronic medical record %K electronic health record %D 2024 %7 7.8.2024 %9 %J JMIR Public Health Surveill %G English %X Background: Adverse social determinants of health (SDoH) have been associated with cardiometabolic disease; however, disparities in cardiometabolic outcomes are rarely the result of a single risk factor. Objective: This study aimed to identify and characterize SDoH phenotypes based on patient-reported and neighborhood-level data from the institutional electronic medical record and evaluate the prevalence of diabetes, obesity, and other cardiometabolic diseases by phenotype status. Methods: Patient-reported SDoH were collected (January to December 2020) and neighborhood-level social vulnerability, neighborhood socioeconomic status, and rurality were linked via census tract to geocoded patient addresses. Diabetes status was coded in the electronic medical record using International Classification of Diseases codes; obesity was defined using measured BMI ≥30 kg/m2. Latent class analysis was used to identify clusters of SDoH (eg, phenotypes); we then examined differences in the prevalence of cardiometabolic conditions based on phenotype status using prevalence ratios (PRs). 
Results: Complete data were available for analysis for 2380 patients (mean age 53, SD 16 years; n=1405, 59% female; n=1198, 50% non-White). Roughly 8% (n=179) reported housing insecurity, 30% (n=710) reported resource needs (food, health care, or utilities), and 49% (n=1158) lived in a high-vulnerability census tract. We identified 3 patient SDoH phenotypes: (1) high social risk, defined largely by self-reported SDoH (n=217, 9%); (2) adverse neighborhood SDoH (n=1353, 56%), defined largely by adverse neighborhood-level measures; and (3) low social risk (n=810, 34%), defined as low individual- and neighborhood-level risks. Patients with an adverse neighborhood SDoH phenotype had higher prevalence of diagnosed type 2 diabetes (PR 1.19, 95% CI 1.06‐1.33), hypertension (PR 1.14, 95% CI 1.02‐1.27), peripheral vascular disease (PR 1.46, 95% CI 1.09‐1.97), and heart failure (PR 1.46, 95% CI 1.20‐1.79). Conclusions: Patients with the adverse neighborhood SDoH phenotype had higher prevalence of poor cardiometabolic conditions compared to phenotypes determined by individual-level characteristics, suggesting that neighborhood environment plays a role, even if individual measures of socioeconomic status are not suboptimal. %R 10.2196/53371 %U https://publichealth.jmir.org/2024/1/e53371 %U https://doi.org/10.2196/53371 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e59604 %T Molecular Evolutionary Dynamics of Coxsackievirus A6 Causing Hand, Foot, and Mouth Disease From 2021 to 2023 in China: Genomic Epidemiology Study %A Chen,Yu %A Chen,Shouhang %A Shen,Yuanfang %A Li,Zhi %A Li,Xiaolong %A Zhang,Yaodong %A Zhang,Xiaolong %A Wang,Fang %A Jin,Yuefei %K coxsackievirus A6 %K hand, foot, and mouth disease %K evolution %K molecular epidemiology %K China %K CV-A6 %K HFMD %D 2024 %7 31.7.2024 %9 %J JMIR Public Health Surveill %G English %X Background: Hand, foot, and mouth disease (HFMD) is a global public health concern, notably within the Asia-Pacific region. 
Recently, the primary pathogen causing HFMD outbreaks across numerous countries, including China, is coxsackievirus (CV) A6, one of the most prevalent enteroviruses in the world. It is a new variant that has undergone genetic recombination and evolution, which might not only induce modifications in the clinical manifestations of HFMD but also heighten its pathogenicity because of nucleotide mutation accumulation. Objective: The study assessed the epidemiological characteristics of HFMD in China and characterized the molecular epidemiology of the major pathogen (CV-A6) causing HFMD. We attempted to establish the association between disease progression and viral genetic evolution through a molecular epidemiological study. Methods: Surveillance data from the Chinese Center for Disease Control and Prevention from 2021 to 2023 were used to analyze the epidemiological seasons and peaks of HFMD in Henan, China, and capture the results of HFMD pathogen typing. We analyzed the evolutionary characteristics of all full-length CV-A6 sequences in the NCBI database and the isolated sequences in Henan. To characterize the molecular evolution of CV-A6, time-scaled tree and historical population dynamics regarding CV-A6 sequences were estimated. Additionally, we analyzed the isolated strains for mutated or missing amino acid sites compared to the prototype CV-A6 strain. Results: The 2021-2023 epidemic seasons for HFMD in Henan usually lasted from June to August, with peaks around June and July. The monthly case reporting rate during the peak period ranged from 20.7% (4854/23,440) to 35% (12,135/34,706) of the total annual number of cases. Analysis of the pathogen composition of 2850 laboratory-confirmed cases identified 8 enterovirus serotypes, among which CV-A6 accounted for the highest proportion (652/2850, 22.88%). CV-A6 emerged as the major pathogen for HFMD in 2022 (203/732, 27.73%) and 2023 (262/708, 37.01%). 
We analyzed all CV-A6 full-length sequences in the NCBI database and the evolutionary features of viruses isolated in Henan. In China, the D3 subtype gradually appeared from 2011, and by 2019, all CV-A6 virus strains belonged to the D3 subtype. The VP1 sequences analyzed in Henan showed that its subtypes were consistent with the national subtypes. Furthermore, we analyzed the molecular evolutionary features of CV-A6 using Bayesian phylogeny and found that the most recent common ancestor of CV-A6 D3 dates back to 2006 in China, earlier than the 2011 HFMD outbreak. Moreover, the strains isolated in 2023 had mutations at several amino acid sites compared to the original strain. Conclusions: The CV-A6 virus may have been introduced and circulating covertly within China prior to the large-scale HFMD outbreak. Our laboratory testing data confirmed the fluctuation and periodic patterns of CV-A6 prevalence. Our study provides valuable insights into understanding the evolutionary dynamics of CV-A6. %R 10.2196/59604 %U https://publichealth.jmir.org/2024/1/e59604 %U https://doi.org/10.2196/59604 %0 Journal Article %@ 2369-2960 %I %V 10 %N %P e55011 %T Socioeconomic Disparities in Six Common Cancer Survival Rates in South Korea: Population-Wide Retrospective Cohort Study %A Lee,JinWook %A Park,JuWon %A Kim,Nayeon %A Nari,Fatima %A Bae,Seowoo %A Lee,Hyeon Ji %A Lee,Mingyu %A Jun,Jae Kwan %A Choi,Kui Son %A Suh,Mina %K cancer survival %K income level %K socioeconomic status %K deprivation index %K inequality %K nationwide analysis %K cancer %K South Korea %K public health %D 2024 %7 22.7.2024 %9 %J JMIR Public Health Surveill %G English %X Background: In South Korea, the cancer incidence rate has increased by 56.5% from 2001 to 2021. Nevertheless, the 5-year cancer survival rate from 2017 to 2021 increased by 17.9% compared with that from 2001 to 2005. 
Cancer survival rates tend to decline with lower socioeconomic status, and variations exist in the survival rates among different cancer types. Analyzing socioeconomic patterns in the survival of patients with cancer can help identify high-risk groups and ensure that they benefit from interventions. Objective: The aim of this study was to analyze differences in survival rates among patients diagnosed with six types of cancer—stomach, colorectal, liver, breast, cervical, and lung cancers—based on socioeconomic status using Korean nationwide data. Methods: This study used the Korea Central Cancer Registry database linked to the National Health Information Database to follow up with patients diagnosed with cancer between 2014 and 2018 until December 31, 2021. Kaplan-Meier curves stratified by income status were generated, and log-rank tests were conducted for each cancer type to assess statistical significance. Hazard ratios with 95% CIs for any cause of overall survival were calculated using Cox proportional hazards regression models with the time since diagnosis. Results: The survival rates for the six different types of cancer were as follows: stomach cancer, 69.6% (96,404/138,462); colorectal cancer, 66.6% (83,406/125,156); liver cancer, 33.7% (23,860/70,712); lung cancer, 30.4% (33,203/109,116); breast cancer, 91.5% (90,730/99,159); and cervical cancer, 78% (12,930/16,580). When comparing the medical aid group to the highest income group, the hazard ratios were 1.72 (95% CI 1.66‐1.79) for stomach cancer, 1.60 (95% CI 1.54‐1.56) for colorectal cancer, 1.51 (95% CI 1.45‐1.56) for liver cancer, 1.56 (95% CI 1.51‐1.59) for lung cancer, 2.19 (95% CI 2.01‐2.38) for breast cancer, and 1.65 (95% CI 1.46‐1.87) for cervical cancer. A higher deprivation index and advanced diagnostic stage were associated with an increased risk of mortality. Conclusions: Socioeconomic status significantly mediates disparities in cancer survival in several cancer types. 
This effect is particularly pronounced in less fatal cancers such as breast cancer. Therefore, considering the type of cancer and socioeconomic factors, social and medical interventions such as early cancer detection and appropriate treatment are necessary for vulnerable populations. %R 10.2196/55011 %U https://publichealth.jmir.org/2024/1/e55011 %U https://doi.org/10.2196/55011 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e54309 %T Association Between Socioeconomic Inequalities in Pain and All-Cause Mortality in the China Health and Retirement Longitudinal Study: Longitudinal Cohort Study %A Zhang,Zhuo %A Xue,Dongmei %A Bian,Ying %+ Institute of Chinese Medical Sciences, State Key Laboratory of Quality Research in Chinese Medicine, University of Macau, Room 1048, Building E12, Avenida da Universidade, Taipa, 999078, Macao, 853 88228537, bianyingum@163.com %K pain %K equality %K all-cause mortality %K concentration index %K decomposition %D 2024 %7 12.7.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Few studies focus on the equality of pain, and the relationship between pain and death is inconclusive. Investigating the distribution of pain and potential mortality risks is crucial for ameliorating painful conditions and devising targeted intervention measures. Objective: Our study aimed to investigate the association between inequalities in pain and all-cause mortality in China. Methods: Longitudinal cohort data from waves 1 and 2 of the China Health and Retirement Longitudinal Study (2011-2013) were used in this study. Pain was self-reported at baseline, and death information was obtained from the 2013 follow-up survey. The concentration index and its decomposition were used to explain the inequality of pain, and the association between pain and death was analyzed with a Cox proportional risk model. Results: A total of 16,747 participants were included, with an average age of 59.57 (SD 9.82) years. 
The prevalence of pain was 32.54% (8196/16,747). Among participants with pain, the main pain type was moderate pain (1973/5426, 36.36%), and the common pain locations were the waist (3232/16,747, 19.3%), legs (2476/16,747, 14.78%) and head (2250/16,747, 13.44%). We found that the prevalence of pain was concentrated in participants with low economic status (concentration index –0.066, 95% CI –0.078 to –0.054). Educational level (36.49%), location (36.87%), and economic status (25.05%) contributed significantly to the inequality of pain. In addition, Cox regression showed that pain was associated with an increased risk of all-cause mortality (hazard ratio 1.30, 95% CI 1.06-1.61). Conclusions: The prevalence of pain in Chinese adults is concentrated among participants with low economic status, and pain increases the risk of all-cause death. Our results highlight the importance of socioeconomic factors in reducing deaths due to pain inequalities by implementing targeted interventions. %M 38872381 %R 10.2196/54309 %U https://publichealth.jmir.org/2024/1/e54309 %U https://doi.org/10.2196/54309 %U http://www.ncbi.nlm.nih.gov/pubmed/38872381 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 8 %N %P e55013 %T Nonrepresentativeness of Human Mobility Data and its Impact on Modeling Dynamics of the COVID-19 Pandemic: Systematic Evaluation %A Liu,Chuchu %A Holme,Petter %A Lehmann,Sune %A Yang,Wenchuan %A Lu,Xin %+ College of Systems Engineering, National University of Defense Technology, No 137 Yanwachi Street, Changsha, 410073, China, 86 18627561577, xin.lu.lab@outlook.com %K human mobility %K data representativeness %K population composition %K COVID-19 %K epidemiological modeling %D 2024 %7 28.6.2024 %9 Original Paper %J JMIR Form Res %G English %X Background: In recent years, a range of novel smartphone-derived data streams about human mobility have become available on a near–real-time basis. 
These data have been used, for example, to perform traffic forecasting and epidemic modeling. During the COVID-19 pandemic in particular, human travel behavior has been considered a key component of epidemiological modeling to provide more reliable estimates about the volumes of the pandemic’s importation and transmission routes, or to identify hot spots. However, nearly universally in the literature, the representativeness of these data, that is, how they relate to the underlying real-world human mobility, has been overlooked. This disconnect between data and reality is especially relevant in the case of socially disadvantaged minorities. Objective: The objective of this study is to illustrate the nonrepresentativeness of data on human mobility and the impact of this nonrepresentativeness on modeling dynamics of the epidemic. This study systematically evaluates how real-world travel flows differ from census-based estimations, especially in the case of socially disadvantaged minorities, such as older adults and women, and further measures biases introduced by this difference in epidemiological studies. Methods: To understand the demographic composition of population movements, a nationwide mobility data set from 318 million mobile phone users in China from January 1 to February 29, 2020, was curated. Specifically, we quantified the disparity in the population composition between actual migrations and resident composition according to census data, and showed how this nonrepresentativeness impacts epidemiological modeling by constructing an age-structured SEIR (Susceptible-Exposed-Infected-Recovered) model of COVID-19 transmission. Results: We found a significant difference in the demographic composition between those who travel and the overall population. 
In the population flows, 59% (n=20,067,526) of travelers are young and 36% (n=12,210,565) of them are middle-aged (P<.001), which is completely different from the overall adult population composition of China (where 36% of individuals are young and 40% of them are middle-aged). This difference would introduce a striking bias in epidemiological studies: the estimated maximum daily infections differ nearly 3-fold, and the estimated peak times are 46 days apart. Conclusions: The difference between actual migrations and resident composition strongly impacts outcomes of epidemiological forecasts, which typically assume that flows represent underlying demographics. Our findings imply that it is necessary to measure and quantify the inherent biases related to nonrepresentativeness for accurate epidemiological surveillance and forecasting. %M 38941609 %R 10.2196/55013 %U https://formative.jmir.org/2024/1/e55013 %U https://doi.org/10.2196/55013 %U http://www.ncbi.nlm.nih.gov/pubmed/38941609 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e57807 %T Spatiotemporal Epidemiological Trends of Mpox in Mainland China: Spatiotemporal Ecological Comparison Study %A Ma,Shuli %A Ge,Jie %A Qin,Lei %A Chen,Xiaoting %A Du,Linlin %A Qi,Yanbo %A Bai,Li %A Han,Yunfeng %A Xie,Zhiping %A Chen,Jiaxin %A Jia,Yuehui %+ School of Public Health, Qiqihar Medical University, 333 Bukui Street, Jianhua District, Qiqihar, 161000, China, 86 0452 2663409, superyuehui@163.com %K mpox %K spatiotemporal analysis %K emergencies %K prevention and control %K public health %D 2024 %7 19.6.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The World Health Organization declared mpox an international public health emergency. Since January 1, 2022, China has been ranked among the top 10 countries most affected by the mpox outbreak globally. 
However, there is a lack of spatial epidemiological studies on mpox, which are crucial for accurately mapping the spatial distribution and clustering of the disease. Objective: This study aims to provide geographically accurate visual evidence to determine priority areas for mpox prevention and control. Methods: Locally confirmed mpox cases were collected between June and November 2023 from 31 provinces of mainland China excluding Taiwan, Macao, and Hong Kong. Spatiotemporal epidemiological analyses, including spatial autocorrelation and regression analyses, were conducted to identify the spatiotemporal characteristics and clustering patterns of mpox attack rate and its spatial relationship with sociodemographic and socioeconomic factors. Results: From June to November 2023, a total of 1610 locally confirmed mpox cases were reported in 30 provinces in mainland China, resulting in an attack rate of 11.40 per 10 million people. Global spatial autocorrelation analysis showed that in July (Moran I=0.0938; P=.08), August (Moran I=0.1276; P=.08), and September (Moran I=0.0934; P=.07), the attack rates of mpox exhibited a clustered pattern and positive spatial autocorrelation. The Getis-Ord Gi* statistics identified hot spots of mpox attack rates in Beijing, Tianjin, Shanghai, Jiangsu, and Hainan. Beijing and Tianjin were consistent hot spots from June to October. No cold spots with low mpox attack rates were detected by the Getis-Ord Gi* statistics. Local Moran I statistics identified a high-high (HH) clustering of mpox attack rates in Guangdong, Beijing, and Tianjin. Guangdong province consistently exhibited HH clustering from June to November, while Beijing and Tianjin were identified as HH clusters from July to September. Low-low clusters were mainly located in Inner Mongolia, Xinjiang, Xizang, Qinghai, and Gansu. 
Ordinary least squares regression models showed that the cumulative mpox attack rates were significantly and positively associated with the proportion of the urban population (t0.05/2,1=2.4041; P=.02), per capita gross domestic product (t0.05/2,1=2.6955; P=.01), per capita disposable income (t0.05/2,1=2.8303; P=.008), per capita consumption expenditure (PCCE; t0.05/2,1=2.7452; P=.01), and PCCE for health care (t0.05/2,1=2.5924; P=.01). The geographically weighted regression models indicated a positive association and spatial heterogeneity between cumulative mpox attack rates and the proportion of the urban population, per capita gross domestic product, per capita disposable income, and PCCE, with high R² values in north and northeast China. Conclusions: Hot spots and HH clustering of mpox attack rates identified by local spatial autocorrelation analysis should be considered key areas for precision prevention and control of mpox. Specifically, Guangdong, Beijing, and Tianjin provinces should be prioritized for mpox prevention and control. These findings provide geographically precise and visualized evidence to assist in identifying key areas for targeted prevention and control. 
%M 38896444 %R 10.2196/57807 %U https://publichealth.jmir.org/2024/1/e57807 %U https://doi.org/10.2196/57807 %U http://www.ncbi.nlm.nih.gov/pubmed/38896444 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e51498 %T Population Behavior Changes Underlying Phasic Shifts of SARS-CoV-2 Exposure Settings Across 3 Omicron Epidemic Waves in Hong Kong: Prospective Cohort Study %A Chan,Chin Pok %A Lee,Shui Shan %A Kwan,Tsz Ho %A Wong,Samuel Yeung Shan %A Yeoh,Eng-Kiong %A Wong,Ngai Sze %+ JC School of Public Health and Primary Care, Faculty of Medicine, The Chinese University of Hong Kong, Rm 204, School of Public Health Building, Prince of Wales Hospital, New Territories, Hong Kong, China, 86 22528862, candy_wong@cuhk.edu.hk %K exposure risk %K contact setting %K social distancing %K epidemic control %K participatory surveillance %K SARS-CoV-2 %K COVID-19 %D 2024 %7 19.6.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Exposure risk was shown to have affected individual susceptibility and the epidemic spread of COVID-19. The dynamics of risk by and across exposure settings alongside the variations following the implementation of social distancing interventions are understudied. Objective: This study aims to examine the population’s trajectory of exposure risk in different settings and its association with SARS-CoV-2 infection across 3 consecutive Omicron epidemic waves in Hong Kong. Methods: From March to June 2022, invitation letters were posted to 41,132 randomly selected residential addresses for the recruitment of households into a prospective population cohort. Through web-based monthly surveys coupled with email reminders, a representative from each enrolled household self-reported incidents of SARS-CoV-2 infections, COVID-19 vaccination uptake, their activity pattern in the workplace, and daily and social settings in the preceding month. 
As a proxy of their exposure risk, the reported activity trend in each setting was differentiated into trajectories based on latent class growth analyses. The associations of different trajectories of SARS-CoV-2 infection overall and by Omicron wave (wave 1: February-April; wave 2: May-September; wave 3: October-December) in 2022 were evaluated by using Cox proportional hazards models and Kaplan-Meier analysis. Results: In total, 33,501 monthly responses in the observation period of February-December 2022 were collected from 5321 individuals, with 41.7% (2221/5321) being male and a median age of 46 (IQR 34-57) years. Against an expanding COVID-19 vaccination coverage from 81.9% to 95.9% for 2 doses and 20% to 77.7% for 3 doses, the cumulative incidence of SARS-CoV-2 infection escalated from <0.2% to 25.3%, 32.4%, and 43.8% by the end of waves 1, 2, and 3, respectively. Throughout February-December 2022, 52.2% (647/1240) of participants had worked regularly on-site, 28.7% (356/1240) worked remotely, and 19.1% (237/1240) showed an assorted pattern. For daily and social settings, 4 and 5 trajectories were identified, respectively, with 11.5% (142/1240) and 14.6% (181/1240) of the participants gauged to have a high exposure risk. Compared to remote working, working regularly on-site (adjusted hazard ratio [aHR] 1.47, 95% CI 1.19-1.80) and living in a larger household (aHR 1.12, 95% CI 1.06-1.18) were associated with a higher risk of SARS-CoV-2 infection in wave 1. Those from the highest daily exposure risk trajectory (aHR 1.46, 95% CI 1.07-2.00) and the second highest social exposure risk trajectory (aHR 1.52, 95% CI 1.18-1.97) were also at an increased risk of infection in waves 2 and 3, respectively, relative to the lowest risk trajectory. 
Conclusions: In an infection-naive population, SARS-CoV-2 transmission was predominantly initiated at the workplace, accelerated in the household, and perpetuated in the daily and social environments, as stringent restrictions were scaled down. These patterns highlight the phasic shift of exposure settings, which is important for informing the effective calibration of targeted social distancing measures as an alternative to lockdown. %M 38896447 %R 10.2196/51498 %U https://publichealth.jmir.org/2024/1/e51498 %U https://doi.org/10.2196/51498 %U http://www.ncbi.nlm.nih.gov/pubmed/38896447 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e51585 %T Sleep Health Analysis Through Sleep Symptoms in 35,808 Individuals Across Age and Sex Differences: Comparative Symptom Network Study %A Gauld,Christophe %A Hartley,Sarah %A Micoulaud-Franchi,Jean-Arthur %A Royant-Parola,Sylvie %+ Hospices Civils de Lyon, 59 Bd Pinel, Lyon, 69000, France, 33 671675095, gauldchristophe@gmail.com %K symptom %K epidemiology %K age %K sex %K diagnosis %K network approach %K sleep %K sleep health %D 2024 %7 11.6.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Sleep health is a multidimensional construct that includes objective and subjective parameters and is influenced by individual sleep-related behaviors and sleep disorders. Symptom network analysis allows modeling of the interactions between variables, enabling both the visualization of relationships between different factors and the identification of the strength of those relationships. Given the known influence of sex and age on sleep health, network analysis can help explore sets of mutually interacting symptoms relative to these demographic variables. Objective: This study aimed to examine the centrality of symptoms and compare age and sex differences regarding sleep health using a symptom network approach in a large French population that feels concerned about their sleep. 
Methods: Data were extracted from a questionnaire provided by the Réseau Morphée health network. A network analysis was conducted on 39 clinical variables related to sleep disorders and sleep health. After network estimation, statistical analyses consisted of calculating inferences of centrality, robustness (ie, testifying to a sufficient effect size), predictability, and network comparison. Sleep clinical variable centralities within the networks were analyzed by both sex and age using 4 age groups (18-30, 31-45, 46-55, and >55 years), and local symptom-by-symptom correlations were determined. Results: Data of 35,808 participants were obtained. The mean age was 42.7 (SD 15.7) years, and 24,964 (69.7%) were women. Overall, there were no significant differences in the structure of the symptom networks between sexes or age groups. The most central symptoms across all groups were nonrestorative sleep and excessive daytime sleepiness. In the youngest group, additional central symptoms were chronic circadian misalignment and chronic sleep deprivation (related to sleep behaviors), particularly among women. In the oldest group, leg sensory discomfort and breath abnormality complaint were among the top 4 central symptoms. Symptoms of sleep disorders thus became more central with age than sleep behaviors. The high predictability of central nodes in one of the networks underlined its importance in influencing other nodes. Conclusions: The absence of structural difference between networks is an important finding, given the known differences in sleep between sexes and across age groups. These similarities suggest comparable interactions between clinical sleep variables across sexes and age groups and highlight the implication of common sleep and wake neural circuits and circadian rhythms in understanding sleep health. More precisely, nonrestorative sleep and excessive daytime sleepiness are central symptoms in all groups. 
The behavioral component is particularly central in young people and women. Sleep-related respiratory and motor symptoms are prominent in older people. These results underscore the importance of comprehensive sleep promotion and screening strategies tailored to sex and age to impact sleep health. %M 38861716 %R 10.2196/51585 %U https://publichealth.jmir.org/2024/1/e51585 %U https://doi.org/10.2196/51585 %U http://www.ncbi.nlm.nih.gov/pubmed/38861716 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e56643 %T Population Percentage and Population Size of Men Who Have Sex With Men in the United States, 2017-2021: Meta-Analysis of 5 Population-Based Surveys %A Bennett,Brady W %A DuBose,Stephanie %A Huang,Ya-Lin A %A Johnson,Christopher H %A Hoover,Karen W %A Wiener,Jeffrey %A Purcell,David W %A Sullivan,Patrick S %+ Department of Epidemiology, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA, 30322, United States, 1 706 766 1510, brady.bennett@emory.edu %K sexual behavior %K sexual identity %K sexual attraction %K men who have sex with men %K population estimates %K MSM %K men who have sex with other men %K national surveys %K census %K United States %D 2024 %7 11.6.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Male-to-male sexual transmission continues to account for the greatest proportion of new HIV diagnoses in the United States. However, calculating population-specific surveillance metrics for HIV and other sexually transmitted infections requires regularly updated estimates of the number and proportion of men who have sex with men (MSM) in the United States, which are not collected by census surveys. Objective: The purpose of this analysis was to estimate the number and percentage of MSM in the United States from population-based surveys. 
Methods: We used data from 5 population-based surveys to calculate weighted estimates of the proportion of MSM in the United States and pooled these estimates using meta-analytic procedures. We estimated the proportion of MSM using sexual behavior–based questions (encompassing anal or oral sex) for 3 recall periods—past 12 months, past 5 years, and lifetime. In addition, we estimated the proportion of MSM using self-reported identity and attraction survey responses. The total number of MSM and non-MSM in the United States were calculated from estimates of the percentage of MSM who reported sex with another man in the past 12 months. Results: The percentage of MSM varied by recall period: 3.3% (95% CI 1.7%-4.9%) indicated sex with another male in the past 12 months, 4.7% (95% CI 0.0%-33.8%) in the past 5 years, and 6.2% (95% CI 2.9%-9.5%) in their lifetime. There were comparable percentages of men who identified as gay or bisexual (3.4%, 95% CI 2.2%-4.6%) or who indicated that they are attracted to other men (4.9%, 95% CI 3.1%-6.7%) based on pooled estimates. Our estimate of the total number of MSM in the United States is 4,230,000 (95% CI 2,179,000-6,281,000) based on the history of recent sexual behavior (sex with another man in the past 12 months). Conclusions: We calculated the pooled percentage and number of MSM in the United States from a meta-analysis of population-based surveys collected from 2017 to 2021. These estimates update and expand upon those derived from the Centers for Disease Control and Prevention in 2012 by including estimates of the percentage of MSM based on sexual identity and sexual attraction. The percentage and number of MSM in the United States is an important indicator for calculating population-specific disease rates and eligibility for preventive interventions such as pre-exposure prophylaxis. 
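As a quick consistency check on the figures above, the point estimate and the CI bounds for the number of MSM can each be divided by the corresponding 12-month percentage to recover the male population denominator they imply. The sketch below does only that arithmetic. The denominator is not stated in the abstract, so the values printed are back-calculated for illustration and are not reported results.

```python
# Back-calculate the male population denominator implied by the abstract's
# 12-month estimates. All inputs are copied from the Results; the
# denominator itself is NOT reported there and is derived here only as a
# sanity check.

pairs = {
    "point": (4_230_000, 0.033),  # estimated MSM, 12-month percentage
    "lower": (2_179_000, 0.017),  # lower 95% CI bound for both
    "upper": (6_281_000, 0.049),  # upper 95% CI bound for both
}


def implied_denominator(count, proportion):
    """Population size N such that count = proportion * N."""
    return count / proportion


implied = {label: implied_denominator(c, p) for label, (c, p) in pairs.items()}

for label, value in implied.items():
    print(f"{label}: about {value / 1e6:.1f} million men")
```

All three ratios land on roughly the same denominator, which suggests the count and its interval were obtained by scaling the pooled percentage by a single fixed population figure.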
%M 38861303 %R 10.2196/56643 %U https://publichealth.jmir.org/2024/1/e56643 %U https://doi.org/10.2196/56643 %U http://www.ncbi.nlm.nih.gov/pubmed/38861303 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e55014 %T Multimorbidity and its Associated Factors in Korean Shift Workers: Population-Based Cross-Sectional Study %A Hong,Hye Chong %A Kim,Young Man %+ College of Nursing, Jeonbuk National University, 567 Baekje-daero, Deokjin-gu, Jeonju, 54896, Republic of Korea, 82 10 3498 5078, ymk@jbnu.ac.kr %K chronic disease %K multimorbidity %K shift work schedule %K shift workers %K population-based study %K Korea %K network analysis %K logistic regression %K cross-sectional study %K public health %D 2024 %7 10.6.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Multimorbidity is a crucial factor that influences premature death rates, poor health, depression, quality of life, and use of health care. Approximately one-fifth of the global workforce is involved in shift work, which is associated with increased risk for several chronic diseases and multimorbidity. About 12% to 14% of wage workers in Korea are shift workers. However, the prevalence of multimorbidity and its associated factors in Korean shift workers are rarely reported. Objective: This study aimed to assess multimorbidity prevalence, examine the factors associated with multimorbidity, and identify multimorbidity patterns among shift workers in Korea. Methods: This study is a population-based cross-sectional study using Korea National Health and Nutrition Examination Survey data from 2016 to 2020. The study included 1704 (weighted n=2,697,228) Korean shift workers aged 19 years and older. Multimorbidity was defined as participants having 2 or more chronic diseases. 
Demographic and job-related variables, including regular work status, average working hours per week, and shift work type, as well as health behaviors, including BMI, smoking status, alcohol use, physical activity, and sleep duration, were included in the analysis. A survey-corrected logistic regression analysis was performed to identify factors influencing multimorbidity among the workers, and multimorbidity patterns were identified with a network analysis. Results: The overall prevalence of multimorbidity was 13.7% (302/1704). Logistic regression indicated that age, income, regular work, and obesity were significant factors influencing multimorbidity. Network analysis results revealed that chronic diseases clustered into three groups: (1) cardiometabolic multimorbidity (hypertension, dyslipidemia, diabetes, coronary heart disease, and stroke), (2) musculoskeletal multimorbidity (arthritis and osteoporosis), and (3) unclassified diseases (depression, chronic liver disease, thyroid disease, asthma, cancer, and chronic kidney disease). Conclusions: The findings revealed that several socioeconomic and behavioral factors were associated with multimorbidity among shift workers, indicating the need for policy development related to work schedule modification. Further organization-level screening and intervention programs are needed to prevent and manage multimorbidity among shift workers. We also recommend longitudinal studies to confirm the effects of job-related factors and health behaviors on multimorbidity among shift workers in the future. 
%M 38857074 %R 10.2196/55014 %U https://publichealth.jmir.org/2024/1/e55014 %U https://doi.org/10.2196/55014 %U http://www.ncbi.nlm.nih.gov/pubmed/38857074 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e53860 %T Trend of Mortality Due to Congenital Anomalies in Children Younger Than 5 Years in Eastern China, 2012-2021: Surveillance Data Analysis %A Dong,Wen-Hong %A Guo,Jun-Xia %A Wang,Lei %A Zheng,Shuang-Shuang %A Zhu,Bing-Quan %A Shao,Jie %+ Department of Child Health Care, Children’s Hospital, Zhejiang University School of Medicine, No. 3333 Binsheng Road, Binjiang District, Hangzhou, 310052, China, 86 571 88873597, ispring2003@163.com %K under-five years %K congenital anomalies %K mortality %K death cause %K rank %D 2024 %7 3.6.2024 %9 Rapid Surveillance Report %J JMIR Public Health Surveill %G English %X Background: As one of the leading causes of child mortality, deaths due to congenital anomalies (CAs) have been a prominent obstacle to meet Sustainable Development Goal 3.2. Objective: We conducted this study to understand the death burden and trend of under-5 CA mortality (CAMR) in Zhejiang, one of the provinces with the best medical services and public health foundations in Eastern China. Methods: We used data retrieved from the under-5 mortality surveillance system in Zhejiang from 2012 to 2021. CAMR by sex, residence, and age group for each year was calculated and standardized according to 2020 National Population Census sex- and residence-specific live birth data in China. Poisson regression models were used to estimate the annual average change rate (AACR) of CAMR and to obtain the rate ratio between subgroups after adjusting for sex, residence, and age group when appropriate. Results: From 2012 to 2021, a total of 1753 children died from CAs, and the standardized CAMR declined from 121.2 to 62.6 per 100,000 live births with an AACR of –9% (95% CI –10.7% to –7.2%; P<.001). 
The declining trend was also observed in female and male children, urban and rural children, and neonates and older infants, and the AACRs were –9.7%, –8.5%, –8.5%, –9.2%, –12%, and –6.3%, respectively (all P<.001). However, no significant reduction was observed in children aged 1-4 years (P=.22). Generally, the CAMR rate ratios for male versus female children, rural versus urban children, older infants versus neonates, and older children versus neonates were 1.18 (95% CI 1.08-1.30; P<.001), 1.20 (95% CI 1.08-1.32; P=.001), 0.66 (95% CI 0.59-0.73; P<.001), and 0.20 (95% CI 0.17-0.24; P<.001), respectively. Among all broad CA groups, circulatory system malformations, mainly deaths caused by congenital heart diseases, accounted for 49.4% (866/1753) of deaths and ranked first across all years, although it declined yearly with an AACR of –9.8% (P<.001). Deaths due to chromosomal abnormalities tended to grow in recent years, although the AACR was not significant (P=.90). Conclusions: CAMR reduced annually, with cardiovascular malformations ranking first across all years in Zhejiang, China. Future research and practices should focus more on the prevention, early detection, long-term management of CAs and comprehensive support for families with children with CAs to improve their survival chances. 
%M 38829691 %R 10.2196/53860 %U https://publichealth.jmir.org/2024/1/e53860 %U https://doi.org/10.2196/53860 %U http://www.ncbi.nlm.nih.gov/pubmed/38829691 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e46737 %T Predicting Lung Cancer Survival to the Future: Population-Based Cancer Survival Modeling Study %A Meng,Fan-Tsui %A Jhuang,Jing-Rong %A Peng,Yan-Teng %A Chiang,Chun-Ju %A Yang,Ya-Wen %A Huang,Chi-Yen %A Huang,Kuo-Ping %A Lee,Wen-Chung %+ Institute of Health Data Analytics and Statistics, College of Public Health, National Taiwan University, Room 536, No 17, Xuzhou Road, Taipei, 100, Taiwan, 886 233668036, wenchung@ntu.edu.tw %K lung cancer %K survival %K survivorship-period-cohort model %K prediction %K prognosis %K early diagnosis %K lung cancer screening %K survival trend %K population-based %K population health %K public health %K surveillance %K low-dose computed tomography %D 2024 %7 31.5.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Lung cancer remains the leading cause of cancer-related mortality globally, with late diagnoses often resulting in poor prognosis. In response, the Lung Ambition Alliance aims to double the 5-year survival rate by 2025. Objective: Using the Taiwan Cancer Registry, this study uses the survivorship-period-cohort model to assess the feasibility of achieving this goal by predicting future survival rates of patients with lung cancer in Taiwan. Methods: This retrospective study analyzed data from 205,104 patients with lung cancer registered between 1997 and 2018. Survival rates were calculated using the survivorship-period-cohort model, focusing on 1-year interval survival rates and extrapolating to predict 5-year outcomes for diagnoses up to 2020, as viewed from 2025. Model validation involved comparing predicted rates with actual data using symmetric mean absolute percentage error. 
Results: The study identified notable improvements in survival rates beginning in 2004, with the predicted 5-year survival rate for 2020 reaching 38.7%, marking a considerable increase from the most recent available data of 23.8% for patients diagnosed in 2013. Subgroup analysis revealed varied survival improvements across different demographics and histological types. Predictions based on current trends indicate that achieving the Lung Ambition Alliance’s goal could be within reach. Conclusions: The analysis demonstrates notable improvements in lung cancer survival rates in Taiwan, driven by the adoption of low-dose computed tomography screening, alongside advances in diagnostic technologies and treatment strategies. While the ambitious target set by the Lung Ambition Alliance appears achievable, ongoing advancements in medical technology and health policies will be crucial. The study underscores the potential impact of continued enhancements in lung cancer management and the importance of strategic health interventions to further improve survival outcomes. 
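The validation metric named in the Methods, symmetric mean absolute percentage error (sMAPE), can be sketched as follows. This is a minimal illustration of one common sMAPE variant (absolute error divided by the mean of the absolute actual and predicted values); the abstract does not say which variant the authors used, and the survival rates in the usage lines are hypothetical placeholders, not study data.

```python
# One common definition of symmetric mean absolute percentage error.
# Assumption: the variant below (denominator = mean of |actual| and
# |predicted|) is illustrative; the source does not specify the formula.


def smape(actual, predicted):
    """Return sMAPE in percent for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        # Treat a pair of exact zeros as a perfect prediction.
        total += 0.0 if denom == 0 else abs(p - a) / denom
    return 100 * total / len(actual)


# Hypothetical observed vs predicted 5-year survival rates (percent).
observed = [20.0, 22.0, 24.0]
forecast = [21.0, 22.0, 23.0]
print(f"sMAPE = {smape(observed, forecast):.2f}%")
```

Because the error is scaled by the average of the two values, over- and under-prediction of the same absolute size are penalized equally, which is the usual reason for preferring this form over the plain mean absolute percentage error.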
%M 38819904 %R 10.2196/46737 %U https://publichealth.jmir.org/2024/1/e46737 %U https://doi.org/10.2196/46737 %U http://www.ncbi.nlm.nih.gov/pubmed/38819904 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e40796 %T Attrition Rates in HIV Viral Load Monitoring and Factors Associated With Overdue Testing Among Children Within South Africa’s Antiretroviral Treatment Program: Retrospective Descriptive Analysis %A Haeri Mazanderani,Ahmad %A Radebe,Lebohang %A Sherman,Gayle G %+ Centre for HIV & STIs, National Institute for Communicable Diseases, National Health Laboratory Service, 1 Modderfontein Road, Sandringham, Johannesburg, 2031, South Africa, 27 826428609, ahmadh@nicd.ac.za %K HIV %K monitoring %K viral load %K suppression %K overdue %K retention %K VL test %K attrition %K child %K youth %K pediatric %K paediatric %K sexually transmitted %K sexual transmission %K virological failure %K South Africa %K infant %K adolescent %K big data %K descriptive analysis %K laboratory data %D 2024 %7 14.5.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Numerous studies in South Africa have reported low HIV viral load (VL) suppression and high attrition rates within the pediatric HIV treatment program. Objective: Using routine laboratory data, we evaluated HIV VL monitoring, including mobility and overdue VL (OVL) testing, within 5 priority districts in South Africa. Methods: We performed a retrospective descriptive analysis of National Health Laboratory Service (NHLS) data for children and adolescents aged 1-15 years having undergone HIV VL testing between May 1, 2019, and April 30, 2020, from 152 facilities within the City of Johannesburg, City of Tshwane, eThekwini, uMgungundlovu, and Zululand. HIV VL test–level data were deduplicated to patient-level data using the NHLS CDW (Corporate Data Warehouse) probabilistic record-linking algorithm and then further manually deduplicated. 
An OVL was defined as no subsequent VL determined within 18 months of the last test. Variables associated with the last VL test, including age, sex, VL findings, district type, and facility type, are described. A multivariate logistic regression analysis was performed to identify variables associated with an OVL test. Results: Among 21,338 children and adolescents aged 1-15 years who had an HIV VL test, 72.70% (n=15,512) had a follow-up VL test within 18 months. Furthermore, 13.33% (n=2194) of them were followed up at a different facility, of whom 3.79% (n=624) were in a different district and 1.71% (n=281) were in a different province. Among patients with a VL of ≥1000 RNA copies/mL of plasma, the median time to subsequent testing was 6 (IQR 4-10) months. The younger the age of the patient, the greater the proportion with an OVL, ranging from a peak of 52% among 1-year-olds to a trough of 21% among 14-year-olds. On multivariate analysis, 2 consecutive HIV VL findings of ≥1000 RNA copies/mL of plasma were associated with an increased adjusted odds ratio (AOR) of having an OVL (AOR 2.07, 95% CI 1.71-2.51). Conversely, patients examined at a hospital (AOR 0.86, 95% CI 0.77-0.96), those with ≥2 previous tests (AOR 0.78, 95% CI 0.70-0.86), those examined in a rural district (AOR 0.63, 95% CI 0.54-0.73), and older age groups of 5-9 years (AOR 0.56, 95% CI 0.47-0.65) and 10-14 years (AOR 0.51, 95% CI 0.44-0.59) compared to 1-4 years were associated with a significantly decreased odds of having an OVL test. Conclusions: Considerable attrition occurs within South Africa’s pediatric HIV treatment program, with over one-fourth of children having an OVL test 18 months subsequent to their previous test. In particular, younger children and those with virological failure were found to be at increased risk of having an OVL test. Improved HIV VL monitoring is essential for improving outcomes within South Africa’s pediatric antiretroviral treatment program. 
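The overdue viral load (OVL) definition above can be expressed as a small date check. The sketch below is an illustration under stated assumptions: the function names are hypothetical, "within 18 months" is treated as inclusive of the 18-month anniversary, and month-end dates are clamped to the last valid day. It does not handle censoring at the data-extraction date, which a real analysis would need.

```python
# Flag an overdue viral load (OVL): no subsequent test within 18 months
# of the last one. Names and boundary handling are assumptions made for
# illustration, not the authors' implementation.
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping to the month's last day."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_overdue(last_test, next_test, window_months=18):
    """True if there is no follow-up test on or before the deadline."""
    deadline = add_months(last_test, window_months)
    return next_test is None or next_test > deadline


print(is_overdue(date(2019, 5, 1), None))               # no follow-up at all
print(is_overdue(date(2019, 5, 1), date(2020, 10, 30)))  # follow-up in window
print(is_overdue(date(2019, 5, 1), date(2020, 11, 2)))   # follow-up too late
```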
%M 38743934 %R 10.2196/40796 %U https://publichealth.jmir.org/2024/1/e40796 %U https://doi.org/10.2196/40796 %U http://www.ncbi.nlm.nih.gov/pubmed/38743934 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e49841 %T Defining the Subtypes of Long COVID and Risk Factors for Prolonged Disease: Population-Based Case-Crossover Study %A Resendez,Skyler %A Brown,Steven H %A Ruiz Ayala,Hugo Sebastian %A Rangan,Prahalad %A Nebeker,Jonathan %A Montella,Diane %A Elkin,Peter L %+ Department of Biomedical Informatics, University at Buffalo, State University of New York, 77 Goodell Street, Suite 540, Buffalo, NY, 14203, United States, 1 5073581341, elkinp@buffalo.edu %K long COVID %K PASC %K postacute sequelae of COVID-19 %K public health %K policy initiatives %K pandemic %K diagnosis %K COVID-19 treatment %K long COVID cause %K health care support %K public safety %K COVID-19 %K Veterans Affairs %K United States %K COVID-19 testing %K clinician %K mobile phone %D 2024 %7 30.4.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: There have been over 772 million confirmed cases of COVID-19 worldwide. A significant portion of these infections will lead to long COVID (post–COVID-19 condition) and its attendant morbidities and costs. Numerous life-altering complications have already been associated with the development of long COVID, including chronic fatigue, brain fog, and dangerous heart rhythms. Objective: We aim to derive an actionable long COVID case definition consisting of significantly increased signs, symptoms, and diagnoses to support pandemic-related clinical, public health, research, and policy initiatives. Methods: This research employs a case-crossover population-based study using International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) data generated at Veterans Affairs medical centers nationwide between January 1, 2020, and August 18, 2022. 
In total, 367,148 individuals with ICD-10-CM data both before and after a positive COVID-19 test were selected for analysis. We compared ICD-10-CM codes assigned 1 to 7 months following each patient’s positive test with those assigned up to 6 months prior. Further, 350,315 patients had novel codes assigned during this window of time. We defined signs, symptoms, and diagnoses as being associated with long COVID if they had a novel case frequency of ≥1:1000, and they significantly increased in our entire cohort after a positive test. We present odds ratios with CIs for long COVID signs, symptoms, and diagnoses, organized by ICD-10-CM functional groups and medical specialty. We used our definition to assess long COVID risk based on a patient’s demographics, Elixhauser score, vaccination status, and COVID-19 disease severity. Results: We developed a long COVID definition consisting of 323 ICD-10-CM diagnosis codes grouped into 143 ICD-10-CM functional groups that were significantly increased in our 367,148 patient post–COVID-19 population. We defined 17 medical-specialty long COVID subtypes such as cardiology long COVID. Patients who were COVID-19–positive developed signs, symptoms, or diagnoses included in our long COVID definition at a proportion of at least 59.7% (268,320/449,450, based on a denominator of all patients who were COVID-19–positive). The long COVID cohort was 8 years older with more comorbidities (2-year Elixhauser score 7.97 in the patients with long COVID vs 4.21 in the patients with non–long COVID). Patients who had a more severe bout of COVID-19, as judged by their minimum oxygen saturation level, were also more likely to develop long COVID. Conclusions: An actionable, data-driven definition of long COVID can help clinicians screen for and diagnose long COVID, allowing identified patients to be admitted into appropriate monitoring and treatment programs. This long COVID definition can also support public health, research, and policy initiatives. 
Patients with COVID-19 who are older or have low oxygen saturation levels during their bout of COVID-19, or those who have multiple comorbidities should be preferentially watched for the development of long COVID. %M 38687984 %R 10.2196/49841 %U https://publichealth.jmir.org/2024/1/e49841 %U https://doi.org/10.2196/49841 %U http://www.ncbi.nlm.nih.gov/pubmed/38687984 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 13 %N %P e53837 %T Investigating SARS-CoV-2 Incidence and Morbidity in Ponce, Puerto Rico: Protocol and Baseline Results From a Community Cohort Study %A Major,Chelsea G %A Rodríguez,Dania M %A Sánchez-González,Liliana %A Rodríguez-Estrada,Vanessa %A Morales-Ortíz,Tatiana %A Torres,Carolina %A Pérez-Rodríguez,Nicole M %A Medina-Lópes,Nicole A %A Alexander,Neal %A Mabey,David %A Ryff,Kyle %A Tosado-Acevedo,Rafael %A Muñoz-Jordán,Jorge %A Adams,Laura E %A Rivera-Amill,Vanessa %A Rolfes,Melissa %A Paz-Bailey,Gabriela %+ Division of Vector Borne Diseases, Centers for Disease Control and Prevention, 1324 Calle Cañada, San Juan, 00920, Puerto Rico, 1 787 706 2254, lhi5@cdc.gov %K cohort studies %K COVID-19 %K epidemiologic studies %K Hispanic or Latino %K incidence %K prospective studies %K research methodology %K SARS-CoV-2 %K seroprevalence %D 2024 %7 19.4.2024 %9 Protocol %J JMIR Res Protoc %G English %X Background: A better understanding of SARS-CoV-2 infection risk among Hispanic and Latino populations and in low-resource settings in the United States is needed to inform control efforts and strategies to improve health equity. Puerto Rico has a high poverty rate and other population characteristics associated with increased vulnerability to COVID-19, and there are limited data to date to determine community incidence. 
Objective: This study describes the protocol and baseline seroprevalence of SARS-CoV-2 in a prospective community-based cohort study (COPA COVID-19 [COCOVID] study) to investigate SARS-CoV-2 infection incidence and morbidity in Ponce, Puerto Rico. Methods: In June 2020, we implemented the COCOVID study within the Communities Organized to Prevent Arboviruses project platform among residents of 15 communities in Ponce, Puerto Rico, aged 1 year or older. Weekly, participants answered questionnaires on acute symptoms and preventive behaviors and provided anterior nasal swab samples for SARS-CoV-2 polymerase chain reaction testing; additional anterior nasal swabs were collected for expedited polymerase chain reaction testing from participants that reported 1 or more COVID-19–like symptoms. At enrollment and every 6 months during follow-up, participants answered more comprehensive questionnaires and provided venous blood samples for multiantigen SARS-CoV-2 immunoglobulin G antibody testing (an indicator of seroprevalence). Weekly follow-up activities concluded in April 2022 and 6-month follow-up visits concluded in August 2022. Primary study outcome measures include SARS-CoV-2 infection incidence and seroprevalence, relative risk of SARS-CoV-2 infection by participant characteristics, SARS-CoV-2 household attack rate, and COVID-19 illness characteristics and outcomes. In this study, we describe the characteristics of COCOVID participants overall and by SARS-CoV-2 seroprevalence status at baseline. Results: We enrolled a total of 1030 participants from 388 households. Relative to the general populations of Ponce and Puerto Rico, our cohort overrepresented middle-income households, employed and middle-aged adults, and older children (P<.001). Almost all participants (1021/1025, 99.61%) identified as Latino/a, 17.07% (175/1025) had annual household incomes less than US $10,000, and 45.66% (463/1014) reported 1 or more chronic medical conditions. 
Baseline SARS-CoV-2 seroprevalence was low (16/1030, 1.55%) overall and increased significantly with later study enrollment time (P=.003). Conclusions: The COCOVID study will provide a valuable opportunity to better estimate the burden of SARS-CoV-2 and associated risk factors in a primarily Hispanic or Latino population, assess the limitations of surveillance, and inform mitigation measures in Puerto Rico and other similar populations. International Registered Report Identifier (IRRID): RR1-10.2196/53837 %M 38640475 %R 10.2196/53837 %U https://www.researchprotocols.org/2024/1/e53837 %U https://doi.org/10.2196/53837 %U http://www.ncbi.nlm.nih.gov/pubmed/38640475 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e46360 %T Projected Time for the Elimination of Cervical Cancer Under Various Intervention Scenarios: Age-Period-Cohort Macrosimulation Study %A Chen,Yi-Chu %A Chen,Yun-Yuan %A Su,Shih-Yung %A Jhuang,Jing-Rong %A Chiang,Chun-Ju %A Yang,Ya-Wen %A Lin,Li-Ju %A Wu,Chao-Chun %A Lee,Wen-Chung %+ Institute of Health Data Analytics, College of Public Health, National Taiwan University, Room 536, No 17, Xuzhou Road, Taipei, 100, Taiwan, 886 223511955, wenchung@ntu.edu.tw %K age-period-cohort model %K population attributable fraction %K macrosimulation %K cancer screening %K human papillomavirus %K HPV %K cervical cancer %K intervention %K women %K cervical screening %K public health intervention %D 2024 %7 18.4.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The World Health Organization aims for the global elimination of cervical cancer, necessitating modeling studies to forecast long-term outcomes. Objective: This paper introduces a macrosimulation framework using age-period-cohort modeling and population attributable fractions to predict the timeline for eliminating cervical cancer in Taiwan. Methods: Data for cervical cancer cases from 1997 to 2016 were obtained from the Taiwan Cancer Registry. 
Future incidence rates under the current approach and various intervention strategies, such as scaled-up screening (cytology based or human papillomavirus [HPV] based) and HPV vaccination, were projected. Results: Our projections indicate that Taiwan could eliminate cervical cancer by 2050 with either 70% compliance in cytology-based or HPV-based screening or 90% HPV vaccination coverage. The years projected for elimination are 2047 and 2035 for cytology-based and HPV-based screening, respectively; 2050 for vaccination alone; and 2038 and 2033 for combined screening and vaccination approaches. Conclusions: The age-period-cohort macrosimulation framework offers a valuable policy analysis tool for cervical cancer control. Our findings can inform strategies in other high-incidence countries, serving as a benchmark for global efforts to eliminate the disease. %M 38635315 %R 10.2196/46360 %U https://publichealth.jmir.org/2024/1/e46360 %U https://doi.org/10.2196/46360 %U http://www.ncbi.nlm.nih.gov/pubmed/38635315 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 26 %N %P e38170 %T Comparing Contact Tracing Through Bluetooth and GPS Surveillance Data: Simulation-Driven Approach %A Qian,Weicheng %A Cooke,Aranock %A Stanley,Kevin Gordon %A Osgood,Nathaniel David %+ Department of Computer Science, University of Saskatchewan, 110 Science Place, Saskatoon, SK, S7N 5C9, Canada, 1 3069661947, weicheng.qian@usask.ca %K smartphone-based sensing %K proximity contact data %K transmission models %K agent-based simulation %K health informatics %K mobile phone %D 2024 %7 17.4.2024 %9 Original Paper %J J Med Internet Res %G English %X Background: Accurate and responsive epidemiological simulations of epidemic outbreaks inform decision-making to mitigate the impact of pandemics. These simulations must be grounded in quantities derived from measurements, among which the parameters associated with contacts between individuals are notoriously difficult to estimate. 
Digital contact tracing data, such as those provided by Bluetooth beaconing or GPS colocating, can provide more precise measures of contact than traditional methods based on direct observation or self-reporting. Both measurement modalities have shortcomings and are prone to false positives or negatives, as unmeasured environmental influences bias the data. Objective: We aim to compare GPS colocated versus Bluetooth beacon–derived proximity contact data for their impacts on transmission models’ results across different communities and types of disease. Methods: We examined the contact patterns derived from 3 data sets collected in 2016, with participants comprising students and staff from the University of Saskatchewan in Canada. Each of these 3 data sets used both Bluetooth beaconing and GPS localization on smartphones running the Ethica Data (Avicenna Research) app to collect sensor data about every 5 minutes over a month. We compared the structure of contact networks inferred from proximity contact data collected with the modalities of GPS colocating and Bluetooth beaconing. We assessed the impact of sensing modalities on the simulation results of transmission models informed by proximate contacts derived from sensing data. Specifically, we compared the incidence number, attack rate, and individual infection risks across simulation results of agent-based susceptible-exposed-infectious-removed transmission models of 4 different contagious diseases. We illustrated the differences using violin plots, 2-tailed t tests, and Kullback-Leibler divergence. Results: Both network structure analyses show visually salient differences in proximity contact data collected between GPS colocating and Bluetooth beaconing, regardless of the underlying population. Significant differences were found for the estimated attack rate based on distance threshold, measurement modality, and simulated disease. 
This finding demonstrates that the sensor modality used to trace contact can have a significant impact on the expected propagation of a disease through a population. The violin plots of attack rate and Kullback-Leibler divergence of individual infection risks demonstrated discernible differences for different sensing modalities, regardless of the underlying population and diseases. The results of the t tests on attack rate between different sensing modalities were mostly significant (P<.001). Conclusions: We show that the contact networks generated from these 2 measurement modalities are different and generate significantly different attack rates across multiple data sets and pathogens. While both modalities offer higher-resolution portraits of contact behavior than is possible with most traditional contact measures, the differential impact of measurement modality on the simulation outcome cannot be ignored and must be addressed in future studies that use only a single measure of contact. 
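The Kullback-Leibler divergence used above to compare individual infection risks between sensing modalities can be sketched for two discrete distributions. This is a minimal illustration, not the authors' pipeline: the binning of risks into histograms, the function name, and the example probabilities are all assumptions.

```python
# Kullback-Leibler divergence D(P || Q) between two discrete distributions,
# in nats. The example histograms are hypothetical stand-ins for binned
# infection-risk distributions from two sensing modalities.
import math


def kl_divergence(p, q):
    """Return sum_i p_i * ln(p_i / q_i); terms with p_i == 0 contribute 0."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf  # P has mass where Q has none
        total += pi * math.log(pi / qi)
    return total


risk_bluetooth = [0.5, 0.5]    # hypothetical binned risk distribution
risk_gps = [0.25, 0.75]        # hypothetical binned risk distribution
print(f"D(BT || GPS) = {kl_divergence(risk_bluetooth, risk_gps):.4f}")
print(f"D(GPS || BT) = {kl_divergence(risk_gps, risk_bluetooth):.4f}")
```

The divergence is asymmetric, so the direction of comparison (which modality is treated as the reference) should be stated whenever a value is reported.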
%M 38422493 %R 10.2196/38170 %U https://www.jmir.org/2024/1/e38170 %U https://doi.org/10.2196/38170 %U http://www.ncbi.nlm.nih.gov/pubmed/38422493 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e49433 %T Chronic Disease Patterns and Their Relationship With Health-Related Quality of Life in South Korean Older Adults With the 2021 Korean National Health and Nutrition Examination Survey: Latent Class Analysis %A Lee,Mi-Sun %A Lee,Hooyeon %+ Department of Preventive Medicine, College of Medicine, The Catholic University of Korea, 222, Banpo-daero, Seocho-gu, Seoul, 06591, Republic of Korea, 82 2 3147 8381, hylee@catholic.ac.kr %K chronic disease %K latent class analysis %K multimorbidity %K older adults %K quality of life %D 2024 %7 10.4.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Improved life expectancy has increased the prevalence of older adults living with multimorbidities, which likely deteriorates their health-related quality of life (HRQoL). Understanding which chronic conditions frequently co-occur can facilitate person-centered care tailored to the needs of individuals with specific multimorbidity profiles. Objective: The study objectives were to (1) examine the prevalence of multimorbidity among Korean older adults (ie, those aged 65 years and older), (2) investigate chronic disease patterns using latent class analysis, and (3) assess which chronic disease patterns are more strongly associated with HRQoL. Methods: A sample of 1806 individuals aged 65 years and older from the 2021 Korean National Health and Nutrition Examination Survey was analyzed. Latent class analysis was conducted to identify the clustering pattern of chronic diseases. HRQoL was assessed by an 8-item health-related quality of life scale (HINT-8). Multiple linear regression was used to analyze the association with the total score of the HINT-8. 
Logistic regression analysis was performed to evaluate the odds ratio of having problems according to the HINT-8 items. Results: The prevalence of multimorbidity in the sample was 54.8%. Three chronic disease patterns were identified: (1) relatively healthy; (2) cardiometabolic condition; and (3) arthritis, allergy, or asthma. The total HINT-8 scores were highest in the arthritis, allergy, or asthma group, indicating the lowest quality of life. Conclusions: Current health care models are disease-oriented, meaning that the management of chronic conditions applies to a single condition and may not be relevant to those with multimorbidities. Identifying chronic disease patterns and their impact on overall health and well-being is critical for guiding integrated care. %M 38598275 %R 10.2196/49433 %U https://publichealth.jmir.org/2024/1/e49433 %U https://doi.org/10.2196/49433 %U http://www.ncbi.nlm.nih.gov/pubmed/38598275 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e49527 %T Estimated Number of Injection-Involved Overdose Deaths in US States From 2000 to 2020: Secondary Analysis of Surveillance Data %A Hall,Eric William %A Sullivan,Patrick Sean %A Bradley,Heather %+ OHSU-PSU School of Public Health, Oregon Health and Science University, 1810 SW 5th Avenue, Suite 510, Portland, OR, 97201, United States, 1 503 494 4966, halleri@ohsu.edu %K death rate %K death %K drug abuse %K drugs %K injection drug use %K injection %K mortality %K National Vital Statistics System %K overdose death rate %K overdose %K state %K substance abuse %K Treatment Episode Dataset-Admission %K treatment %D 2024 %7 5.4.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: In the United States, both drug overdose mortality and injection-involved drug overdose mortality have increased nationally over the past 25 years. 
Despite documented geographic differences in overdose mortality and substances implicated in overdose mortality trends, injection-involved overdose mortality has not been summarized at a subnational level. Objective: We aimed to estimate the annual number of injection-involved overdose deaths in each US state from 2000 to 2020. Methods: We conducted a stratified analysis that used data from drug treatment admissions (Treatment Episodes Data Set–Admissions; TEDS-A) and the National Vital Statistics System (NVSS) to estimate state-specific percentages of reported drug overdose deaths that were injection-involved from 2000 to 2020. TEDS-A collects data on the route of administration and the type of substance used upon treatment admission. We used these data to calculate the percentage of reported injections for each drug type by demographic group (race or ethnicity, sex, and age group), year, and state. Additionally, using NVSS mortality data, the annual number of overdose deaths involving selected drug types was identified by the following specific multiple-cause-of-death codes: heroin or synthetic opioids other than methadone (T40.1, T40.4), natural or semisynthetic opioids and methadone (T40.2, T40.3), cocaine (T40.5), psychostimulants with abuse potential (T43.6), sedatives (T42.3, T42.4), and others (T36-T59.0). We used the probabilities of injection with the annual number of overdose deaths, by year, primary substance, and demographic groups to estimate the number of overdose deaths that were injection-involved. Results: In 2020, there were 91,071 overdose deaths among adults recorded in the United States, and 93.1% (84,753/91,071) occurred in the 46 jurisdictions that reported data to TEDS-A. Slightly less than half (38,253/84,753, 45.1%; 95% CI 41.1%-49.8%) of those overdose deaths were estimated to be injection-involved, translating to 38,253 (95% CI 34,839-42,181) injection-involved overdose deaths in 2020. 
There was large variation among states in the estimated injection-involved overdose death rate (median 14.72, range 5.45-31.77 per 100,000 people). The national injection-involved overdose death rate increased by 323% (95% CI 255%-391%) from 2010 (3.78, 95% CI 3.33-4.31) to 2020 (15.97, 95% CI 14.55-17.61). States in which the estimated injection-involved overdose death rate increased faster than the national average were disproportionately concentrated in the Northeast region. Conclusions: Although overdose mortality and injection-involved overdose mortality have increased dramatically across the country, these trends have been more pronounced in some regions. A better understanding of state-level trends in injection-involved mortality can inform the prioritization of public health strategies that aim to reduce overdose mortality and prevent downstream consequences of injection drug use. %M 38578676 %R 10.2196/49527 %U https://publichealth.jmir.org/2024/1/e49527 %U https://doi.org/10.2196/49527 %U http://www.ncbi.nlm.nih.gov/pubmed/38578676 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e47422 %T Timely Pulmonary Tuberculosis Diagnosis Based on the Epidemiological Disease Spectrum: Population-Based Prospective Cohort Study in the Republic of Korea %A Ko,Yousang %A Park,Jae Seuk %A Min,Jinsoo %A Kim,Hyung Woo %A Koo,Hyeon-Kyoung %A Oh,Jee Youn %A Jeong,Yun-Jeong %A Lee,Eunhye %A Yang,Bumhee %A Kim,Ju Sang %A Lee,Sung-Soon %A Kwon,Yunhyung %A Yang,Jiyeon %A Han,Ji yeon %A Jang,You Jin %A Kim,Jinseob %+ Division of Pulmonary, Allergy and Critical Care Medicine, Department of Internal Medicine, Hallym University Kangdong Sacred Heart Hospital, Sung-an ro 150, Kangdonggu, Seoul, 05355/82, Republic of Korea, 82 2224 2561, koyus@naver.com %K pulmonary tuberculosis %K disease spectrum %K timely diagnosis %K patient delay %K health care delay %K risk factor %K epidemiological disease %K tuberculosis %K treatment %K TB %K PTB disease spectrum %K mortality 
%K early diagnosis %D 2024 %7 1.4.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Timely pulmonary tuberculosis (PTB) diagnosis is a global health priority for interrupting transmission and optimizing treatment outcomes. The traditional dichotomous time-divided approach for addressing time delays in diagnosis has limited clinical application because the time delay significantly varies depending on each community in question. Objective: We aimed to reevaluate the diagnosis time delay based on the PTB disease spectrum using a novel scoring system that was applied at the national level in the Republic of Korea. Methods: The Pulmonary Tuberculosis Spectrum Score (PTBSS) was developed based on previously published proposals related to the disease spectrum, and its validity was assessed by examining both all-cause and PTB-related mortality. In our analysis, we integrated the PTBSS into the Korea Tuberculosis Cohort Registry. We evaluated various time delays, including patient, health care, and overall delays, and their system-associated variables in line with each PTBSS. Furthermore, we reclassified the scores into distinct categories of mild (PTBSS=0-1), moderate (PTBSS=2-3), and severe (PTBSS=4-6) using a multivariate regression approach. Results: Among the 14,031 Korean patients with active PTB whose data were analyzed from 2018 to 2020, 37% (n=5191), 38% (n=5328), and 25% (n=3512) were classified as having a mild, moderate, and severe disease status, respectively, according to the PTBSS. This classification can therefore reflect the disease spectrum of PTB by considering the correlation of the score with mortality. The time delay patterns differed according to the PTBSS. For health care delays, greater PTB disease progression was associated with a shorter diagnosis period, since the condition is microbiologically easy to diagnose. 
However, with respect to patient delays, the change in elapsed time showed a U-shaped pattern as PTB progressed. This means that a remarkable patient delay in the real-world setting might occur at both apical ends of the spectrum (ie, in both mild and severe cases of PTB). Independent risk factors for a severe PTB pattern were age (adjusted odds ratio 1.014) and male sex (adjusted odds ratio 1.422), whereas no significant risk factor was found for mild PTB. Conclusions: Timely PTB diagnosis should be accomplished. This can be improved with use of the PTBSS, a simple and intuitive scoring system, which can be more helpful in clinical and public health applications compared to the traditional dichotomous time-only approach. %M 38557939 %R 10.2196/47422 %U https://publichealth.jmir.org/2024/1/e47422 %U https://doi.org/10.2196/47422 %U http://www.ncbi.nlm.nih.gov/pubmed/38557939 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e48738 %T Triangulating Truth and Reaching Consensus on Population Size, Prevalence, and More: Modeling Study %A Fellows,Ian E %A Corcoran,Carl %A McIntyre,Anne F %+ Division of Global HIV & TB, Global Health Center, Centers for Disease Control and Prevention, 1600 Clifton Rd NE, Atlanta, GA, 30333, United States, 1 (404) 713 3545, ruu2@cdc.gov %K HIV %K epidemiology %K population size estimation %K key populations %K Bayesian models %K consensus estimation %K statistical tool %K prevalence %K Bayesian model %K population %K estimate %K consensus %K population size %D 2024 %7 19.3.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Population size, prevalence, and incidence are essential metrics that influence public health programming and policy. 
However, stakeholders are frequently tasked with setting performance targets, reporting global indicators, and designing policies based on multiple (often incongruous) estimates of these variables, and they often do so in the absence of a formal, transparent framework for reaching a consensus estimate. Objective: This study aims to describe a model to synthesize multiple study estimates while incorporating stakeholder knowledge, introduce an R Shiny app to implement the model, and demonstrate the model and app using real data. Methods: In this study, we developed a Bayesian hierarchical model to synthesize multiple study estimates that allow the user to incorporate the quality of each estimate as a confidence score. The model was implemented as a user-friendly R Shiny app aimed at practitioners of population size estimation. The underlying Bayesian model was programmed in Stan for efficient sampling and computation. Results: The app was demonstrated using biobehavioral survey-based population size estimates (and accompanying confidence scores) of female sex workers and men who have sex with men from 3 survey locations in a country in sub-Saharan Africa. The consensus results incorporating confidence scores are compared with the case where they are absent, and the results with confidence scores are shown to perform better according to an app-supplied metric for unaccounted-for variation. Conclusions: The utility of the triangulator model, including the incorporation of confidence scores, as a user-friendly app is demonstrated using a use case example. Our results offer empirical evidence of the model’s effectiveness in producing an accurate consensus estimate and emphasize the significant impact that the accessible model and app offer for public health. It offers a solution to the long-standing problem of synthesizing multiple estimates, potentially leading to more informed and evidence-based decision-making processes. 
The Triangulator has broad utility and flexibility to be adapted and used in various other contexts and regions to address similar challenges. %M 38502183 %R 10.2196/48738 %U https://publichealth.jmir.org/2024/1/e48738 %U https://doi.org/10.2196/48738 %U http://www.ncbi.nlm.nih.gov/pubmed/38502183 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e50743 %T Estimation of the Population Size of Street- and Venue-Based Female Sex Workers and Sexually Exploited Minors in Rwanda in 2022: 3-Source Capture-Recapture %A Tuyishime,Elysee %A Remera,Eric %A Kayitesi,Catherine %A Malamba,Samuel %A Sangwayire,Beata %A Habimana Kabano,Ignace %A Ruisenor-Escudero,Horacio %A Oluoch,Tom %A Unna Chukwu,Angela %+ Division of Global HIV and TB, Global Health Center, US Centers for Disease Control and Prevention, 337H+F58, 30 KG 7 Avenue, Kigali, P.O. Box 28, Rwanda, 250 788381437, obx5@cdc.gov %K population size %K female sex workers %K capture-recapture %K 3-source %K Rwanda %K HIV %K surveillance %K population %K epidemiology %K prevention %K AIDS %K sexually transmitted disease %K STD %K minor %K young adult %K sexually exploited minor %K children %D 2024 %7 15.3.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: HIV surveillance among key populations is a priority in all epidemic settings. Female sex workers (FSWs) globally as well as in Rwanda are disproportionately affected by the HIV epidemic; hence, the Rwanda HIV and AIDS National Strategic Plan (2018-2024) has adopted regular surveillance of population size estimation (PSE) of FSWs every 2-3 years. Objective: We aimed at estimating, for the fourth time, the population size of street- and venue-based FSWs and sexually exploited minors aged ≥15 years in Rwanda. Methods: In August 2022, the 3-source capture-recapture method was used to estimate the population size of FSWs and sexually exploited minors in Rwanda. 
The field work took 3 weeks to complete, with each capture occasion lasting for a week. The sample size for each capture was calculated using shinyrecap with inputs drawn from previously conducted estimation exercises. In each capture round, a stratified multistage sampling process was used, with administrative provinces as strata and FSW hotspots as the primary sampling unit. Different unique objects were distributed to FSWs in each capture round; acceptance of the unique object was marked as successful capture. Sampled FSWs for the subsequent capture occasions were asked if they had received the previously distributed unique object in order to determine recaptures. Statistical analysis was performed in R (version 4.0.5), and Bayesian Model Averaging was performed to produce the final PSE with a 95% credibility set (CS). Results: We sampled 1766, 1848, and 1865 FSWs and sexually exploited minors in each capture round. There were 169 recaptures strictly between captures 1 and 2, 210 recaptures exclusively between captures 2 and 3, and 65 recaptures between captures 1 and 3 only. In all 3 captures, 61 FSWs were captured. The median PSE of street- and venue-based FSWs and sexually exploited minors in Rwanda was 37,647 (95% CS 31,873-43,354), corresponding to 1.1% (95% CI 0.9%-1.3%) of the total adult females in the general population. Relative to the adult females in the general population, the western and northern provinces ranked first and second with a higher concentration of FSWs, respectively. The City of Kigali and the eastern province ranked third and fourth, respectively. The southern province was identified as having a low concentration of FSWs. Conclusions: We provide, for the first time, both the national and provincial level population size estimate of street- and venue-based FSWs in Rwanda. Compared with the previous 2 rounds of FSW PSEs at the national level, we observed differences in the street- and venue-based FSW population size in Rwanda. 
Our study might not have considered FSWs who do not want anyone to know they are FSWs due to several reasons, leading to a possible underestimation of the true PSE. %M 38488847 %R 10.2196/50743 %U https://publichealth.jmir.org/2024/1/e50743 %U https://doi.org/10.2196/50743 %U http://www.ncbi.nlm.nih.gov/pubmed/38488847 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e48186 %T Improving the Efficiency of Inferences From Hybrid Samples for Effective Health Surveillance Surveys: Comprehensive Review of Quantitative Methods %A Fahimi,Mansour %A Hair,Elizabeth C %A Do,Elizabeth K %A Kreslake,Jennifer M %A Yan,Xiaolu %A Chan,Elisa %A Barlas,Frances M %A Giles,Abigail %A Osborn,Larry %+ Marketing Systems Group, 755 Business Center Drive, Suite 200, Horsham, PA, 19044, United States, 1 2156202880, mfahimi@m-s-g.com %K hybrid samples %K composite estimation %K optimal composition factor %K unequal weighting effect %K composite weighting %K weighting %K surveillance %K sample survey %K data collection %K risk factor %D 2024 %7 7.3.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Increasingly, survey researchers rely on hybrid samples to improve coverage and increase the number of respondents by combining independent samples. For instance, it is possible to combine 2 probability samples with one relying on telephone and another on mail. More commonly, however, researchers are now supplementing probability samples with those from online panels that are less costly. Setting aside ad hoc approaches that are void of rigor, traditionally, the method of composite estimation has been used to blend results from different sample surveys. This means individual point estimates from different surveys are pooled together, 1 estimate at a time. 
Given that for a typical study many estimates must be produced, this piecemeal approach is computationally burdensome and subject to the inferential limitations of the individual surveys that are used in this process. Objective: In this paper, we will provide a comprehensive review of the traditional method of composite estimation. Subsequently, the method of composite weighting is introduced, which is significantly more efficient, both computationally and inferentially when pooling data from multiple surveys. With the growing interest in hybrid sampling alternatives, we hope to offer an accessible methodology for improving the efficiency of inferences from such sample surveys without sacrificing rigor. Methods: Specifically, we will illustrate why the many ad hoc procedures for blending survey data from multiple surveys are void of scientific integrity and subject to misleading inferences. Moreover, we will demonstrate how the traditional approach of composite estimation fails to offer a pragmatic and scalable solution in practice. By relying on theoretical and empirical justifications, in contrast, we will show how our proposed methodology of composite weighting is both scientifically sound and inferentially and computationally superior to the old method of composite estimation. Results: Using data from 3 large surveys that have relied on hybrid samples composed of probability-based and supplemental sample components from online panels, we illustrate that our proposed method of composite weighting is superior to the traditional method of composite estimation in 2 distinct ways. Computationally, it is vastly less demanding and hence more accessible for practitioners. Inferentially, it produces more efficient estimates with higher levels of external validity when pooling data from multiple surveys. 
Conclusions: The new realities of the digital age have brought about a number of resilient challenges for survey researchers, which in turn have exposed some of the inefficiencies associated with the traditional methods this community has relied upon for decades. The resilience of such challenges suggests that piecemeal approaches that may have limited applicability or restricted accessibility will prove to be inadequate and transient. It is from this perspective that our proposed method of composite weighting has aimed to introduce a durable and accessible solution for hybrid sample surveys. %M 38451620 %R 10.2196/48186 %U https://publichealth.jmir.org/2024/1/e48186 %U https://doi.org/10.2196/48186 %U http://www.ncbi.nlm.nih.gov/pubmed/38451620 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e44648 %T Association Between Nitrogen Dioxide Pollution and Cause-Specific Mortality in China: Cross-Sectional Time Series Study %A Zeng,Jie %A Lin,Guozhen %A Dong,Hang %A Li,Mengmeng %A Ruan,Honglian %A Yang,Jun %+ School of Public Health, Guangzhou Medical University, No. 1 Xinzao Road, Xinzao Town, Panyu District, Guangzhou, 511436, China, 86 020 37103532, yangjun_eci@jnu.edu.cn %K nitrogen dioxide %K cause-specific mortality %K stratification effect %K vulnerable subpopulations %K China %D 2024 %7 5.2.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Nitrogen dioxide (NO2) has been frequently linked to a range of diseases and associated with high rates of mortality and morbidity worldwide. However, there is limited evidence regarding the risk of NO2 on a spectrum of causes of mortality. Moreover, adjustment for potential confounders in NO2 analysis has been insufficient, and the spatial resolution of exposure assessment has been limited. 
Objective: This study aimed to quantitatively assess the relationship between short-term NO2 exposure and death from a range of causes by adjusting for potential confounders in Guangzhou, China, and determine the modifying effect of gender and age. Methods: A time series study was conducted on 413,703 deaths that occurred in Guangzhou during the period of 2010 to 2018. The causes of death were classified into 10 categories and 26 subcategories. We utilized a generalized additive model with quasi-Poisson regression analysis using a natural cubic splines function with lag structure of 0 to 4 days to estimate the potential lag effect of NO2 on cause-specific mortality. We estimated the percentage change in cause-specific mortality rates per 10 μg/m3 increase in NO2 levels. We stratified meteorological factors such as temperature, humidity, wind speed, and air pressure into high and low levels with the median as the critical value and analyzed the effects of NO2 on various death-causing diseases at those high and low levels. To further identify potentially vulnerable subpopulations, we analyzed groups stratified by gender and age. Results: A significant association existed between NO2 exposure and deaths from multiple causes. Each 10 μg/m3 increment in NO2 density at a lag of 0 to 4 days increased the risks of all-cause mortality by 1.73% (95% CI 1.36%-2.09%) and mortality due to nonaccidental causes, cardiovascular disease, respiratory disease, endocrine disease, and neoplasms by 1.75% (95% CI 1.38%-2.12%), 2.06% (95% CI 1.54%-2.59%), 2.32% (95% CI 1.51%-3.13%), 2.40% (95% CI 0.84%-3.98%), and 1.18% (95% CI 0.59%-1.78%), respectively. Among the 26 subcategories, mortality risk was associated with 16, including intentional self-harm, hypertensive disease, and ischemic stroke disease. 
Relatively higher effect estimates of NO2 on mortality existed for low levels of temperature, relative humidity, wind speed, and air pressure than with high levels, except a relatively higher effect estimate was present for endocrine disease at a high air pressure level. Most of the differences between subgroups were not statistically significant. The effect estimates for NO2 were similar by gender. There were significant differences between the age groups for mortality due to all causes, nonaccidental causes, and cardiovascular disease. Conclusions: Short-term NO2 exposure may increase the risk of mortality due to a spectrum of causes, especially in potentially vulnerable populations. These findings may be important for predicting and modifying guidelines for NO2 exposure in China. %M 38315528 %R 10.2196/44648 %U https://publichealth.jmir.org/2024/1/e44648 %U https://doi.org/10.2196/44648 %U http://www.ncbi.nlm.nih.gov/pubmed/38315528 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 10 %N %P e50020 %T Latent Heterogeneity of Online Sexual Experiences and Associations With Sexual Risk Behaviors and Behavioral Health Outcomes in Chinese Young Adults: Cross-Sectional Study %A Fong,Ted C T %A Cheung,Derek Yee Tak %A Choi,Edmond Pui Hang %A Fong,Daniel Y T %A Ho,Rainbow T H %A Ip,Patrick %A Kung,Man Chun %A Lam,Mona Wai Cheung %A Lee,Antoinette Marie %A Wong,William Chi Wai %A Lam,Tai Hing %A Yip,Paul S F %+ Centre for Suicide Research and Prevention, Faculty of Social Sciences, The University of Hong Kong, 2/F, The HKJC Building for Interdisciplinary Research, 5 Sassoon Road, Pokfulam, Hong Kong, 852, China (Hong Kong), 852 2831 5232, sfpyip@hku.hk %K Hong Kong %K latent class analysis %K mediation %K mental health %K sex knowledge %K sexual risk behaviors %K sexually transmitted infections %K structural equation modeling %K youth sexuality %D 2024 %7 26.1.2024 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Online sexual experiences 
(OSEs) are becoming increasingly common in young adults, but existing papers have reported only on specific types of OSEs and have not shown the heterogeneous nature of the repertoire of OSEs. The use patterns of OSEs remain unclear, and the relationships of OSEs with sexual risk behaviors and behavioral health outcomes have not been evaluated. Objective: This study aimed to examine the latent heterogeneity of OSEs in young adults and the associations with sexual risk behaviors and behavioral health outcomes. Methods: The 2021 Youth Sexuality Study of the Hong Kong Family Planning Association phone interviewed a random sample of 1205 young adults in Hong Kong in 2022 (male sex: 613/1205, 50.9%; mean age 23.0 years, SD 2.86 years) on lifetime OSEs, demographic and family characteristics, Patient Health Questionnaire-4 (PHQ-4) scores, sex-related factors (sexual orientation, sex knowledge, and sexual risk behaviors), and behavioral health outcomes (sexually transmitted infections [STIs], drug use, and suicidal ideation) in the past year. Sample heterogeneity of OSEs was analyzed via latent class analysis with substantive checking of the class profiles. Structural equation modeling was used to examine the direct and indirect associations between the OSE class and behavioral health outcomes via sexual risk behaviors and PHQ-4 scores. Results: The data supported 3 latent classes of OSEs with measurement invariance by sex. In this study, 33.1% (398/1205), 56.0% (675/1205), and 10.9% (132/1205) of the sample were in the abstinent class (minimal OSEs), normative class (occasional OSEs), and active class (substantive OSEs), respectively. Male participants showed a lower prevalence of the abstinent class (131/613, 21.4% versus 263/592, 44.4%) and a higher prevalence of the active class (104/613, 17.0% versus 28/592, 4.7%) than female participants. The normative class showed significantly higher sex knowledge than the other 2 classes. 
The active class was associated with male sex, nonheterosexual status, higher sex desire and PHQ-4 scores, and more sexual risk behaviors than the other 2 classes. Compared with the nonactive (abstinent and normative) classes, the active class was indirectly associated with higher rates of STIs (absolute difference in percentage points [Δ]=4.8%; P=.03) and drug use (Δ=7.6%; P=.001) via sexual risk behaviors, and with higher rates of suicidal ideation (Δ=2.5%; P=.007) via PHQ-4 scores. Conclusions: This study provided the first results on the 3 (abstinent, normative, and active) latent classes of OSEs with distinct profiles in OSEs, demographic and family characteristics, PHQ-4 scores, sex-related factors, and behavioral health outcomes. The active class showed indirect associations with higher rates of STIs and drug use via sexual risk behaviors and higher rates of suicidal ideation via PHQ-4 scores than the other 2 classes. These results have implications for the formulation and evaluation of targeted interventions to help young adults. 
%M 38277190 %R 10.2196/50020 %U https://publichealth.jmir.org/2024/1/e50020 %U https://doi.org/10.2196/50020 %U http://www.ncbi.nlm.nih.gov/pubmed/38277190 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e47981 %T Evaluation of a Targeted COVID-19 Community Outreach Intervention: Case Report for Precision Public Health %A De La Cerda,Isela %A Bauer,Cici X %A Zhang,Kehe %A Lee,Miryoung %A Jones,Michelle %A Rodriguez,Arturo %A McCormick,Joseph B %A Fisher-Hoch,Susan P %+ Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health Brownsville Campus, University of Texas Health Science Center at Houston, 780 Ringgold Road, Brownsville, TX, 78520, United States, 1 956 755 0600, Susan.P.Fisher-Hoch@uth.tmc.edu %K community interventions %K emergency preparedness %K health disparities %K intervention evaluation %K precision public health %K public health informatics %K public health intervention %K public health %K spatial epidemiology %K surveillance %D 2023 %7 20.12.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Cameron County, a low-income south Texas-Mexico border county marked by severe health disparities, was consistently among the top counties with the highest COVID-19 mortality in Texas at the onset of the pandemic. The disparity in COVID-19 burden within Texas counties revealed the need for effective interventions to address the specific needs of local health departments and their communities. Publicly available COVID-19 surveillance data were not sufficiently timely or granular to deliver such targeted interventions. An agency-academic collaboration in Cameron used novel geographic information science methods to produce granular COVID-19 surveillance data. These data were used to strategically target an educational outreach intervention named “Boots on the Ground” (BOG) in the City of Brownsville (COB). 
Objective: This study aimed to evaluate the impact of a spatially targeted community intervention on daily COVID-19 test counts. Methods: The agency-academic collaboration between the COB and UTHealth Houston led to the creation of weekly COVID-19 epidemiological reports at the census tract level. These reports guided the selection of census tracts to deliver targeted BOG between April 21 and June 8, 2020. Recordkeeping of the targeted BOG tracts and the intervention dates, along with COVID-19 daily testing counts per census tract, provided data for intervention evaluation. An interrupted time series design was used to evaluate the impact on COVID-19 test counts 2 weeks before and after targeted BOG. A piecewise Poisson regression analysis was used to quantify the slope (sustained) and intercept (immediate) change between pre- and post-BOG COVID-19 daily test count trends. Additional analysis of COB tracts that did not receive targeted BOG was conducted for comparison purposes. Results: During the intervention period, 18 of the 48 COB census tracts received targeted BOG. Among these, a significant change in the slope between pre- and post-BOG daily test counts was observed in 5 tracts, 80% (n=4) of which had a positive slope change. A positive slope change implied a significant increase in daily COVID-19 test counts 2 weeks after targeted BOG compared to the testing trend observed 2 weeks before intervention. In an additional analysis of the 30 census tracts that did not receive targeted BOG, significant slope changes were observed in 10 tracts, of which positive slope changes were only observed in 20% (n=2). In summary, we found that BOG-targeted tracts had mostly positive daily COVID-19 test count slope changes, whereas untargeted tracts had mostly negative daily COVID-19 test count slope changes. 
Conclusions: Evaluation of spatially targeted community interventions is necessary to strengthen the evidence base of this important approach for local emergency preparedness. This report highlights how an academic-agency collaboration established and evaluated the impact of a real-time, targeted intervention delivering precision public health to a small community. %M 38117549 %R 10.2196/47981 %U https://publichealth.jmir.org/2023/1/e47981 %U https://doi.org/10.2196/47981 %U http://www.ncbi.nlm.nih.gov/pubmed/38117549 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e46898 %T Application of Machine Learning Prediction of Individual SARS-CoV-2 Vaccination and Infection Status to the French Serosurveillance Survey From March 2020 to 2022: Cross-Sectional Study %A Bougeard,Stéphanie %A Huneau-Salaun,Adeline %A Attia,Mikael %A Richard,Jean-Baptiste %A Demeret,Caroline %A Platon,Johnny %A Allain,Virginie %A Le Vu,Stéphane %A Goyard,Sophie %A Gillon,Véronique %A Bernard-Stoecklin,Sibylle %A Crescenzo-Chaigne,Bernadette %A Jones,Gabrielle %A Rose,Nicolas %A van der Werf,Sylvie %A Lantz,Olivier %A Rose,Thierry %A Noël,Harold %+ Epidemiology, Health and Welfare, Laboratory of Ploufragan-Plouzané-Niort, French Agency for Food, Environmental, Occupational Health & Safety, BP 53 - Technopole Saint Brieuc Armor, Ploufragan, 22440, France, 33 296010150, stephanie.bougeard@anses.fr %K SARS-CoV-2 %K serological surveillance %K infection %K vaccination %K machine learning %K seroprevalence %K blood testing %K immunity %K survey %K vaccine response %K French population %K prediction %D 2023 %7 28.11.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The seroprevalence of SARS-CoV-2 infection in the French population was estimated with a representative, repeated cross-sectional survey based on residual sera from routine blood testing. 
These data contained no information on infection or vaccination status, thus limiting the ability to detail changes observed in the immunity level of the population over time. Objective: Our aim is to predict the infected or vaccinated status of individuals in the French serosurveillance survey based only on the results of serological assays. Reference data on longitudinal serological profiles of seronegative, infected, and vaccinated individuals from another French cohort were used to build the predictive model. Methods: A model of individual vaccination or infection status with respect to SARS-CoV-2 obtained from a machine learning procedure was proposed based on 3 complementary serological assays. This model was applied to the French nationwide serosurveillance survey from March 2020 to March 2022 to estimate the proportions of the population that were negative, infected, vaccinated, or infected and vaccinated. Results: From February 2021 to March 2022, the estimated percentage of infected and unvaccinated individuals in France increased from 7.5% to 16.8%. During this period, the estimated percentage increased from 3.6% to 45.2% for vaccinated and uninfected individuals and from 2.1% to 29.1% for vaccinated and infected individuals. The decrease in the seronegative population can be largely attributed to vaccination. Conclusions: Combining results from the serosurveillance survey with more complete data from another longitudinal cohort completes the information retrieved from serosurveillance while keeping its protocol simple and easy to implement. 
%M 38015594 %R 10.2196/46898 %U https://publichealth.jmir.org/2023/1/e46898 %U https://doi.org/10.2196/46898 %U http://www.ncbi.nlm.nih.gov/pubmed/38015594 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e46708 %T Optimal Look-Back Period to Identify True Incident Cases of Diabetes in Medical Insurance Data in the Chinese Population: Retrospective Analysis Study %A Yang,Wenyi %A Wang,Baohua %A Ma,Shaobo %A Wang,Jingxin %A Ai,Limei %A Li,Zhengyu %A Wan,Xia %+ Institute of Basic Medical Sciences, Chinese Academy of Medical Science, Dongdan Street, 5th, Beijing, 100052, China, 86 01065233870, xiawan@ibms.pumc.edu.cn %K diabetes %K incident cases %K administrative data %K look-back period %K retrograde survival function %D 2023 %7 6.11.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Accurate estimation of incidence and prevalence is vital for preventing and controlling diabetes. Administrative data (including insurance data) could be a good source to estimate the incidence of diabetes. However, how to determine the look-back period (LP) to remove cases with preceding records remains a problem for administrative data. A short LP will cause overestimation of incidence, whereas a long LP will limit the usefulness of a database. Therefore, it is necessary to determine the optimal LP length for identifying incident cases in administrative data. Objective: This study aims to offer different methods to identify the optimal LP for diabetes by using medical insurance data from the Chinese population with reference to other diseases in the administrative data. Methods: Data from the insurance database of the city of Weifang, China, from January 2016 to December 2020 were used. To identify the incident cases in 2020, we removed prevalent patients with preceding records of diabetes between 2016 and 2019 (ie, a 4-year LP). 
Using this 4-year LP as a reference, consistency examination indexes (CEIs), including positive predictive values, the κ coefficient, and overestimation rate, were calculated to determine the level of agreement between different LPs and an LP of 4 years (the longest LP). Moreover, we constructed a retrograde survival function, in which survival (ie, incident cases) means not having a preceding record at the given time and the survival time is the difference between the date of the last record in 2020 and the most recent previous record in the LP. Based on the survival outcome and survival time, we established the survival function and survival hazard function. When the survival probability, S(t), remains stable, and survival hazard converges to zero, we obtain the optimal LP. Combined with the results of these two methods, we determined the optimal LP for Chinese diabetes patients. Results: The κ agreement was excellent (0.950), with a high positive predictive value (92.2%) and a low overestimation rate (8.4%) after a 2-year LP. As for the retrograde survival function, S(t) dropped rapidly during the first 1-year LP (from 1.00 to 0.11). At a 417-day LP, the hazard function reached approximately zero (h(t)=0.000459), S(t) remained at 0.10, and at 480 days, the frequency of S(t) did not increase. Combining the two methods, we found that the optimal LP is 2 years for Chinese diabetes patients. Conclusions: The retrograde survival method and CEIs both showed effectiveness. A 2-year LP should be considered when identifying incident cases of diabetes using insurance data in the Chinese population. 
%M 37930785 %R 10.2196/46708 %U https://publichealth.jmir.org/2023/1/e46708 %U https://doi.org/10.2196/46708 %U http://www.ncbi.nlm.nih.gov/pubmed/37930785 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e41862 %T Mortality Risk and Burden From a Spectrum of Causes in Relation to Size-Fractionated Particulate Matters: Time Series Analysis %A Yang,Jun %A Dong,Hang %A Yu,Chao %A Li,Bixia %A Lin,Guozhen %A Chen,Sujuan %A Cai,Dongjie %A Huang,Lin %A Wang,Boguang %A Li,Mengmeng %+ State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, No 651 Dongfeng East Road, Guangzhou 510060, Guangzhou, 510060, China, 86 020 87345679, limm@sysucc.org.cn %K size-fractionated particulate matter %K cause-specific mortality %K cardiovascular disease %K respiratory disease %K neoplasm %K attributable burden %D 2023 %7 9.10.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: There is limited evidence regarding the adverse impact of particulate matters (PMs) on multiple body systems from both epidemiological and mechanistic studies. The association between size-fractionated PMs and mortality risk, as well as the burden of a whole spectrum of causes of death, remains poorly characterized. Objective: We aimed to examine the wide range of susceptible diseases affected by different sizes of PMs. We also assessed the association between PMs with an aerodynamic diameter less than 1 µm (PM1), 2.5 µm (PM2.5), and 10 µm (PM10) and deaths from 36 causes in Guangzhou, China. Methods: Daily data were obtained on cause-specific mortality, PMs, and meteorology from 2014 to 2016. A time-stratified case-crossover approach was applied to estimate the risk and burden of cause-specific mortality attributable to PMs after adjusting for potential confounding variables, such as long-term trend and seasonality, relative humidity, temperature, air pressure, and public holidays. 
Stratification analyses were further conducted to explore the potential modification effects of season and demographic characteristics (eg, gender and age). We also assessed the reduction in mortality achieved by meeting the new air quality guidelines set by the World Health Organization (WHO). Results: Positive and monotonic associations were generally observed between PMs and mortality. For every 10 μg/m3 increase in 4-day moving average concentrations of PM1, PM2.5, and PM10, the risk of all-cause mortality increased by 2.00% (95% CI 1.08%-2.92%), 1.54% (95% CI 0.93%-2.16%), and 1.38% (95% CI 0.95%-1.82%), respectively. Significant effects of size-fractionated PMs were observed for deaths attributed to nonaccidental causes, cardiovascular disease, respiratory disease, neoplasms, chronic rheumatic heart diseases, hypertensive diseases, cerebrovascular diseases, stroke, influenza, and pneumonia. If daily concentrations of PM1, PM2.5, and PM10 reached the WHO target levels of 10, 15, and 45 μg/m3, 7921 (95% empirical CI [eCI] 4454-11,206), 8303 (95% eCI 5063-11,248), and 8326 (95% eCI 5980-10,690) deaths could be prevented, respectively. The effect estimates of PMs were relatively higher during hot months, among female individuals, and among those aged 85 years and older, although the differences between subgroups were not statistically significant. Conclusions: We observed positive and monotonic exposure-response curves between PMs and deaths from several diseases. The effect of PM1 was stronger on mortality than that of PM2.5 and PM10. A substantial number of premature deaths could be prevented by adhering to the WHO’s new guidelines for PMs. Our findings highlight the importance of a size-based strategy in controlling PMs and managing their health impact. 
%M 37812487 %R 10.2196/41862 %U https://publichealth.jmir.org/2023/1/e41862 %U https://doi.org/10.2196/41862 %U http://www.ncbi.nlm.nih.gov/pubmed/37812487 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e44211 %T Firearm Possession Rates in Home Countries and Firearm Suicide Rates Among US- and Foreign-Born Suicide Decedents in the United States: Analysis of Combined Data from the National Violent Death Reporting System and the Small Arms Survey %A Song,In Han %A Lee,Jin Hyuk %A Shin,Jee Soo %+ ICONS Convergence Academy, Yonsei University, Appenzeller Hall #205, Seoul, 03722, Republic of Korea, 82 221236217, isong@yonsei.ac.kr %K firearm suicide %K US born %K foreign born %K means of suicide %K firearm possession rate %K suicide decedents %D 2023 %7 29.9.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Suicide by firearms is a serious public health issue in the United States. However, little research has been conducted on the relationship between cultural backgrounds and suicide by firearms, specifically in those born and raised in the United States compared to those who have immigrated to the United States. Objective: To better understand the relationship between cultural backgrounds and suicide, this study aimed to examine firearm suicide rates among US- and foreign-born suicide decedents based on the firearm possession rate in the decedent’s home country. Methods: Multivariate logistic regression was performed to analyze data of 28,895 suicide decedents from 37 states obtained from the 2017 National Violent Death Reporting System data set. The firearm possession rate in the home countries of foreign-born suicide decedents was obtained from the 2017 Small Arms Survey. Results: The firearm suicide rate was about twice as high among US-born suicide decedents compared to their foreign-born counterparts. Meanwhile, suicide by hanging was about 75% higher among foreign-born compared to US-born suicide decedents. 
Those from countries with a low-to-medium firearm possession rate were significantly less likely to use firearms compared to US-born suicide decedents (adjusted odds ratio [AOR]=0.45, 95% CI 0.31-0.65, and AOR=0.46, 95% CI 0.39-0.53, respectively). Meanwhile, firearm suicide rates were not different between US- and foreign-born suicide decedents from countries with a similarly high firearm possession rate. Conclusions: The results suggest that there is an association between using firearms as a means of suicide and the firearm possession rate in the decedent’s home country. Suicide by firearms in the United States needs to be understood in the sociocultural context related to firearm possession. %M 37773604 %R 10.2196/44211 %U https://publichealth.jmir.org/2023/1/e44211 %U https://doi.org/10.2196/44211 %U http://www.ncbi.nlm.nih.gov/pubmed/37773604 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e49968 %T Nasopharyngeal Cancer Incidence and Mortality in 185 Countries in 2020 and the Projected Burden in 2040: Population-Based Global Epidemiological Profiling %A Zhang,Yanting %A Rumgay,Harriet %A Li,Mengmeng %A Cao,Sumei %A Chen,Wanqing %+ Department of Epidemiology and Health Statistics, School of Public Health, Guangdong Medical University, No.1 Xincheng Road, Dongguan, 523808, China, 86 076922896050, zhangyt@gdmu.edu.cn %K nasopharyngeal cancer %K incidence %K mortality %K epidemiology %K worldwide %D 2023 %7 20.9.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Nasopharyngeal cancer (NPC) is one of the most common head and neck cancers. Objective: This study describes the global epidemiological profiles of NPC incidence and mortality in 185 countries in 2020 and the projected burden in 2040. Methods: The estimated numbers of NPC cases and deaths were retrieved from the GLOBOCAN 2020 data set. 
Age-standardized incidence rates (ASIRs) and age-standardized mortality rates (ASMRs) were calculated using the world standard. The future numbers of NPC cases and deaths by 2040 were estimated based on global demographic projections. Results: Globally, approximately 133,354 cases and 80,008 deaths from NPC were estimated in 2020, corresponding to ASIRs and ASMRs of 1.5 and 0.9 per 100,000 person-years, respectively. The largest numbers of both global cases and deaths from NPC occurred in Eastern Asia (65,866/133,354, 49.39% and 36,453/80,008, 45.56%, respectively), in which China contributed most to this burden (62,444/133,354, 46.82% and 34,810/80,008, 43.50%, respectively). The ASIRs and ASMRs in men were approximately 3-fold higher than those in women. Incidence rates varied across world regions, with the highest ASIRs for both men and women detected in South-Eastern Asia (7.7 and 2.5 per 100,000 person-years, respectively) and Eastern Asia (3.9 and 1.5 per 100,000 person-years, respectively). The highest ASMRs for both men and women were found in South-Eastern Asia (5.4 and 1.5 per 100,000 person-years, respectively). By 2040, the annual number of cases and deaths will increase to 179,476 (46,122/133,354, a 34.58% increase from the year 2020) and 113,851 (33,843/80,008, a 42.29% increase), respectively. Conclusions: Disparities in NPC incidence and mortality persist worldwide. Our study highlights the urgent need to develop and accelerate NPC control initiatives to tackle the NPC burden in certain regions and countries (eg, South-Eastern Asia, China). 
%M 37728964 %R 10.2196/49968 %U https://publichealth.jmir.org/2023/1/e49968 %U https://doi.org/10.2196/49968 %U http://www.ncbi.nlm.nih.gov/pubmed/37728964 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e47902 %T Trends in Cause-Specific Injury Mortality in China in 2005-2019: Longitudinal Observational Study %A Ji,Zixiang %A Wu,Hengjing %A Zhu,Rongyu %A Wang,Lu %A Wang,Yuzhu %A Zhang,Lijuan %+ Clinical Center for Intelligent Rehabilitation Research, Shanghai YangZhi Rehabilitation Hospital, Tongji University School of Medicine, Tongji University, 50 Chifeng Road, Yangpu, Shanghai, 201619, China, 86 13817934887, zhangxiaoyi@tongji.edu.cn %K reverse %K age-standardized mortality rate %K injury %K suicide %K trend %K potential years of life lost %K average years of life lost %K crude mortality rate %K falls %K older adults %K young adults %D 2023 %7 15.9.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Over the last few decades, although the age-standardized mortality rate (ASMR) of injury has shown a significant declining trend in China, this pattern has dramatically reversed recently. Objective: We aimed to elucidate the geographical, demographic, and temporal trends of cause-specific injuries, the reversal phenomenon of these trends, and the fluctuations of injury burden from 2005 to 2019 in China. Methods: A longitudinal observational study was performed using the raw data of injury deaths in the National Cause-of-Death surveillance data provided by the disease surveillance points system in 2005-2019. The cause-specific injuries were divided into disparate subgroups by sex, age, urban/rural region, and eastern/central/western areas of China. The burden of injury was assessed using potential years of life lost (PYLL), average years of life lost (AYLL), and PYLL rate (PYLLR). Temporal trends of mortality rates and burden were evaluated using best-fitting joinpoint models. 
Results: Injury deaths accounted for 7.51% (1,156,504/15,403,835) of all-cause deaths in China in 2005-2019. The crude mortality rate of all-cause injury was 47.74 per 100,000 persons. The top 3 injury types (traffic accident, falls, and suicide) accounted for 70.57% (816,145/1,156,504) of all injury-related deaths. The ASMR of all-cause injury decreased (P=.003), while the crude mortality rate remained unchanged (P=.52) during 2005-2019. A significant reverse trend in ASMR of all-cause injury was observed in urban older adults since 2013, mainly due to the inverted trend in injuries from falls. A reverse trend in ASMR of suicide was observed among individuals aged 10-24 years, with notable increases by 35.18% (annual percentage change 15.4%, 95% CI 4.1%-28.0%) in men since 2017. The AYLL and PYLLR of all-cause injury among older adults showed consistent ascending trends from 2005 to 2019 (average annual percentage change [AAPC] 6.1%, 95% CI 5.4%-6.9%, 129.04% increase for AYLL; AAPC 5.4%, 95% CI 2.4%-8.4%, 105.52% increase for PYLLR). The AYLL due to suicide for individuals aged 10-24 years showed a considerable upswing tendency (AAPC 0.5%, 95% CI 0.4%-0.7%, 8.02% increase). Conclusions: Although the ASMR of all-cause injury decreased in China from 2005 to 2019, the trend in suicide among adolescents and young adults and falls among older adults has been on the rise in recent years. Interventions should be encouraged to mitigate the cause-specific burdens of injury death. 
%M 37713250 %R 10.2196/47902 %U https://publichealth.jmir.org/2023/1/e47902 %U https://doi.org/10.2196/47902 %U http://www.ncbi.nlm.nih.gov/pubmed/37713250 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e45943 %T Global, Regional, and National Prevalence of Gout From 1990 to 2019: Age-Period-Cohort Analysis With Future Burden Prediction %A He,Qiyu %A Mok,Tsz-Ngai %A Sin,Tat-Hang %A Yin,Jiaying %A Li,Sicun %A Yin,Yiyue %A Ming,Wai-Kit %A Feng,Bin %+ Department of Orthopedic Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, No 1, Shuaifuyuan Street, Beijing, China, 86 69155200, fengbin@pumch.cn %K gout %K prevalence %K age-period-cohort analysis %K Global Burden of Disease Study 2019 %K prediction %K Bayesian age-period-cohort analysis %K Nordpred age-period-cohort analysis %D 2023 %7 7.6.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Gout is a common and debilitating condition that is associated with significant morbidity and mortality. Despite advances in medical treatment, the global burden of gout continues to increase, particularly in high–sociodemographic index (SDI) regions. Objective: To address the aforementioned issue, we used age-period-cohort (APC) modeling to analyze global trends in gout incidence and prevalence from 1990 to 2019. Methods: Data were extracted from the Global Burden of Disease Study 2019 to assess all-age prevalence and age-standardized prevalence rates, as well as years lived with disability rates, for 204 countries and territories. APC effects were also examined in relation to gout prevalence. Future burden prediction was carried out using the Nordpred APC prediction of future incidence cases and the Bayesian APC model. Results: The global gout incidence has increased by 63.44% over the past 2 decades, with a corresponding increase of 51.12% in global years lived with disability. 
The sex ratio remained consistent at 3:1 (male to female), but the global gout incidence increased in both sexes over time. Notably, the prevalence and incidence of gout were the highest in high-SDI regions (95% uncertainty interval 14.19-20.62), with a growth rate of 94.3%. Gout prevalence increases steadily with age, and the prevalence increases rapidly in high-SDI quantiles for the period effect. Finally, the cohort effect showed that gout prevalence increases steadily, with the risk of morbidity increasing in younger birth cohorts. The prediction model suggests that the gout incidence rate will continue to increase globally. Conclusions: Our study provides important insights into the global burden of gout and highlights the need for effective management and prophylaxis of this condition. The APC model used in our analysis provides a novel approach to understanding the complex trends in gout prevalence and incidence, and our findings can inform the development of targeted interventions to address this growing health issue. 
%M 37285198 %R 10.2196/45943 %U https://publichealth.jmir.org/2023/1/e45943 %U https://doi.org/10.2196/45943 %U http://www.ncbi.nlm.nih.gov/pubmed/37285198 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e44647 %T Prediction of Multimorbidity in Brazil: Latest Fifth of a Century Population Study %A Li,Xi-liang %A Huang,Hang %A Lu,Ying %A Stafford,Randall S %A Lima,Simone Maria %A Mota,Caroline %A Shi,Xin %+ School of Mathematics and Information Science, Shandong Technology and Business University, Binhai Road, Yantai, 264005, China, 86 18059892450, jasonshi510@hotmail.com %K Brazil %K demographic factors %K logistic regression analysis %K multimorbidity %K nomogram prediction %K prevalence %D 2023 %7 30.5.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Multimorbidity is characterized by the co-occurrence of 2 or more chronic diseases and has been a focus of the health care sector and health policy makers due to its severe adverse effects. Objective: This paper aims to use the latest 2 decades of national health data in Brazil to analyze the effects of demographic factors and predict the impact of various risk factors on multimorbidity. Methods: Data analysis methods include descriptive analysis, logistic regression, and nomogram prediction. The study makes use of a set of national cross-sectional data with a sample size of 877,032. The study used data from 1998, 2003, and 2008 from the Brazilian National Household Sample Survey, and from 2013 and 2019 from the Brazilian National Health Survey. We developed a logistic regression model to assess the influence of risk factors on multimorbidity and predict the influence of the key risk factors in the future, based on the prevalence of multimorbidity in Brazil. Results: Overall, females were 1.7 times more likely to experience multimorbidity than males (odds ratio [OR] 1.72, 95% CI 1.69-1.74). 
The prevalence of multimorbidity among unemployed individuals was 1.5 times that of employed individuals (OR 1.51, 95% CI 1.49-1.53). Multimorbidity prevalence increased significantly with age. People over 60 years of age were about 20 times more likely to have multiple chronic diseases than those between 18 and 29 years of age (OR 19.6, 95% CI 19.15-20.07). The prevalence of multimorbidity in illiterate individuals was 1.2 times that in literate ones (OR 1.26, 95% CI 1.24-1.28). The subjective well-being of seniors without multimorbidity was 15 times that among people with multimorbidity (OR 15.29, 95% CI 14.97-15.63). Adults with multimorbidity were more than 1.5 times more likely to be hospitalized than those without (OR 1.53, 95% CI 1.50-1.56) and 1.9 times more likely to need medical care (OR 1.94, 95% CI 1.91-1.97). These patterns were similar in all 5 cohort studies and remained stable for over 21 years. A nomogram model was used to predict multimorbidity prevalence under the influence of various risk factors. The prediction results were consistent with the effects of logistic regression; older age and poorer participant well-being had the strongest correlation with multimorbidity. Conclusions: Our study shows that multimorbidity prevalence varied little in the past 2 decades but varies widely across social groups. Identifying populations with higher rates of multimorbidity prevalence may improve policy making around multimorbidity prevention and management. The Brazilian government can create public health policies targeting these groups, and provide more medical treatment and health services to support and protect the multimorbidity population. 
%M 37252771 %R 10.2196/44647 %U https://publichealth.jmir.org/2023/1/e44647 %U https://doi.org/10.2196/44647 %U http://www.ncbi.nlm.nih.gov/pubmed/37252771 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e44517 %T Participatory Surveillance for COVID-19 Trend Detection in Brazil: Cross-sectional Study %A Wittwer,Salome %A Paolotti,Daniela %A Lichand,Guilherme %A Leal Neto,Onicio %+ Institute for Information Security, Department of Computer Science, ETH Zürich, Universitätstrasse 6, Zurich, 8092, Switzerland, 41 44 632 50 94, onicio.batistalealneto@inf.ethz.ch %K participatory surveillance %K COVID-19 %K digital epidemiology %K coronavirus %K infectious disease %K epidemic %K pandemic %K SARS-CoV-2 %K forecast %K trend %K reporting %K self-report %K surveillance %D 2023 %7 26.4.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The ongoing COVID-19 pandemic has emphasized the necessity of a well-functioning surveillance system to detect and mitigate disease outbreaks. Traditional surveillance (TS) usually relies on health care providers and generally suffers from reporting lags that prevent immediate response plans. Participatory surveillance (PS), an innovative digital approach whereby individuals voluntarily monitor and report on their own health status via web-based surveys, has emerged in the past decade to complement traditional data collection approaches. Objective: This study compared novel PS data on COVID-19 infection rates across 9 Brazilian cities with official TS data to examine the opportunities and challenges of using PS data, and the potential advantages of combining the 2 approaches. Methods: The TS data for Brazil are publicly accessible on GitHub. The PS data were collected through the Brazil Sem Corona platform, a Colab platform. To gather information on an individual’s health status, each participant was asked to fill out a daily questionnaire on symptoms and exposure in the Colab app. 
Results: We found that high participation rates are key for PS data to adequately mirror TS infection rates. Where participation was high, we documented a significant trend correlation between lagged PS data and TS infection rates, suggesting that PS data could be used for early detection. In our data, forecasting models integrating both approaches increased accuracy up to 3% relative to a 14-day forecast model based exclusively on TS data. Furthermore, we showed that PS data captured a population that significantly differed from a traditional observation. Conclusions: In the traditional system, the new recorded COVID-19 cases per day are aggregated based on positive laboratory-confirmed tests. In contrast, PS data show a significant share of reports categorized as potential COVID-19 cases that are not laboratory confirmed. Quantifying the economic value of PS system implementation remains difficult. However, scarce public funds and persisting constraints to the TS system provide motivation for a PS system, making it an important avenue for future research. The decision to set up a PS system requires careful evaluation of its expected benefits, relative to the costs of setting up platforms and incentivizing engagement to increase both coverage and consistent reporting over time. The ability to compute such economic tradeoffs might be key to have PS become a more integral part of policy toolkits moving forward. These results corroborate previous studies when it comes to the benefits of an integrated and comprehensive surveillance system, and shed light on its limitations and on the need for additional research to improve future implementations of PS platforms. 
%M 36888908 %R 10.2196/44517 %U https://publichealth.jmir.org/2023/1/e44517 %U https://doi.org/10.2196/44517 %U http://www.ncbi.nlm.nih.gov/pubmed/36888908 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e43723 %T Changes in the Demographic Distribution of Chicago Gun-Homicide Decedents From 2015-2021: Violent Death Surveillance Cross-sectional Study %A Mason,Maryann %A Khazanchi,Rushmin %A Brewer,Audrey %A Sheehan,Karen %A Liu,Yingxuan %A Post,Lori %+ Department of Emergency Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States, 1 312 503 5142, maryann-mason@northwestern.edu %K gun-homicide surveillance %K gun-homicide decedents %K demographics %K age, gun violence %K firearm %D 2023 %7 7.4.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Homicide is one of the 5 leading causes of death in the United States for persons aged 1 to 44 years. In 2019, 75% of US homicides were by gun. Chicago has a gun-homicide rate 4 times the national average, and 90% of all homicides are by gun. The public health approach to violence prevention calls for a 4-step process, beginning with defining and monitoring the problem. Insight into the characteristics of gun-homicide decedents can help frame next steps, including identifying risk and protective factors, developing prevention and intervention strategies, and scaling effective responses. Although much is known about gun homicide because it is a long-standing, entrenched public health problem, it is useful to monitor trends to update ongoing prevention efforts. Objective: This study aimed to use public health surveillance data and methods to describe changes in the race/ethnicity, sex, and age of Chicago gun-homicide decedents from 2015-2021, in the context of year-to-year variation and an overall increase in the city’s gun-homicide rate. 
Methods: We calculated the distribution of gun-related homicide deaths by 6 race/ethnicity and sex groups (non-Hispanic Black female, non-Hispanic White female, Hispanic female, non-Hispanic Black male, non-Hispanic White male, and Hispanic male), age in years, and age by age group. We used counts, percentages, and rates per 100,000 persons to describe the distribution of deaths among these demographic groups. Comparisons of means and column proportions with tests of significance set at P≤.05 were used to describe changes in the distribution of gun-homicide decedents over time by race-ethnicity-sex and age groups. The comparison of mean age by race-ethnicity-sex group was done using 1-way ANOVA with significance set at P≤.05. Results: The distribution of gun-homicide decedents in Chicago by race/ethnicity and sex groups had been relatively stable from 2015 to 2021 with 2 notable exceptions: a more than doubling of the proportion of gun-homicide decedents who were non-Hispanic Black female (3.6% in 2015 to 8.2% in 2021) and an increase of 3.27 years in the mean age of gun-homicide decedents. The increase in mean age coincided with a decrease in the proportion of non-Hispanic Black male gun-homicide decedents between the ages of 15-19 and 20-24 years and, conversely, an increase in the proportion of non-Hispanic Black male gun-homicide decedents aged 25-34 years. Conclusions: The annual gun-homicide rate in Chicago had been increasing since 2015 with year-to-year variation. Continued monitoring of trends in the demographic makeup of gun-homicide decedents is necessary to provide the most relevant and timely information to help shape violence prevention efforts. We detected several changes that suggest a need for increased outreach and engagement marketed toward non-Hispanic Black female and non-Hispanic Black male individuals between the ages of 25-34 years. 
%M 37027193 %R 10.2196/43723 %U https://publichealth.jmir.org/2023/1/e43723 %U https://doi.org/10.2196/43723 %U http://www.ncbi.nlm.nih.gov/pubmed/37027193 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e43836 %T COVID-19 Contact Tracing as an Indicator for Evaluating a Pandemic Situation: Simulation Study %A Marques-Cruz,Manuel %A Nogueira-Leite,Diogo %A Alves,João Miguel %A Fernandes,Francisco %A Fernandes,José Miguel %A Almeida,Miguel Ângelo %A Cunha Correia,Patrícia %A Perestrelo,Paula %A Cruz-Correia,Ricardo %A Pita Barros,Pedro %+ Department of Community Medicine, Information and Decision in Health, Faculty of Medicine, University of Porto, Faculdade de Medicina da Universidade do Porto, Rua Dr. Plácido da Costa, Porto, 4200-450, Portugal, 351 225513622, up201000048@up.pt %K COVID-19 %K public health %K public health surveillance %K quarantine %K infection transmission %K epidemiological models %D 2023 %7 6.4.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Contact tracing is a fundamental intervention in public health. When systematically applied, it enables the breaking of chains of transmission, which is important for controlling COVID-19 transmission. In theoretically perfect contact tracing, all new cases should occur among quarantined individuals, and an epidemic should vanish. However, the availability of resources influences the capacity to perform contact tracing. Therefore, it is necessary to estimate its effectiveness threshold. We propose that this effectiveness threshold may be indirectly estimated using the ratio of COVID-19 cases arising from quarantined high-risk contacts, where higher ratios indicate better control and, under a threshold, contact tracing may fail and other restrictions become necessary. Objective: This study assessed the ratio of COVID-19 cases in high-risk contacts quarantined through contact tracing and its potential use as an ancillary pandemic control indicator. 
Methods: We built a 6-compartment epidemiological model to emulate COVID-19 infection flow according to publicly available data from Portuguese authorities. Our model extended the usual susceptible-exposed-infected-recovered model by adding a compartment Q with individuals in mandated quarantine who could develop infection or return to the susceptible pool and a compartment P with individuals protected from infection because of vaccination. To model infection dynamics, data on SARS-CoV-2 infection risk (IR), time until infection, and vaccine efficacy were collected. Estimation was needed for vaccine data to reflect the timing of inoculation and booster efficacy. In total, 2 simulations were built: one adjusting for the presence and absence of variants or vaccination and another maximizing IR in quarantined individuals. Both simulations were based on a set of 100 unique parameterizations. The daily ratio of infected cases arising from high-risk contacts (q estimate) was calculated. A theoretical effectiveness threshold of contact tracing was defined for 14-day average q estimates based on the classification of COVID-19 daily cases according to the pandemic phases and was compared with the timing of population lockdowns in Portugal. A sensitivity analysis was performed to understand the relationship between different parameter values and the threshold obtained. Results: An inverse relationship was found between the q estimate and daily cases in both simulations (correlations >0.70). The theoretical effectiveness thresholds for both simulations attained an alert phase positive predictive value of >70% and could have anticipated the need for additional measures by at least 4 days for the second and fourth lockdowns. Sensitivity analysis showed that only the IR and booster dose efficacy at inoculation significantly affected the q estimates. Conclusions: We demonstrated the impact of applying an effectiveness threshold for contact tracing on decision-making. 
Although only theoretical thresholds could be provided, their relationship with the number of confirmed cases and the prediction of pandemic phases shows their role as an indirect indicator of the efficacy of contact tracing. %M 36877958 %R 10.2196/43836 %U https://publichealth.jmir.org/2023/1/e43836 %U https://doi.org/10.2196/43836 %U http://www.ncbi.nlm.nih.gov/pubmed/36877958 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 7 %N %P e42895 %T Multidimensional Machine Learning for Assessing Parameters Associated With COVID-19 in Vietnam: Validation Study %A Nguyen,Trong Tue %A Ho,Cam Tu %A Bui,Huong Thi Thu %A Ho,Lam Khanh %A Ta,Van Thanh %+ Medical Laboratory Department, Hanoi Medical University, 1 Ton That Tung Dong Da, Hanoi, Vietnam, 84 902185488, trongtue@hmu.edu.vn %K COVID-19 %K multidimensional analysis %K hierarchical cluster analysis %K regression analysis %K mild %K moderate %K severe %K age %K scoring index of chest x-ray %K percentage and quantity of neutrophils %K albumin %K C-reactive protein %K ratio of lymphocytes %D 2023 %7 16.2.2023 %9 Original Paper %J JMIR Form Res %G English %X Background: Machine learning (ML) is a type of artificial intelligence strategy. Its algorithms are used on big data sets to see patterns, learn from their results, and perform tasks autonomously without being instructed on how to address problems. New diseases like COVID-19 provide important data for ML. Therefore, all relevant parameters should be explicitly quantified and modeled. Objective: The purpose of this study was to determine (1) the overall preclinical characteristics, (2) the cumulative cutoff values and risk ratios (RRs), and (3) the factors associated with COVID-19 severity in unidimensional and multidimensional analyses involving 2173 SARS-CoV-2 patients. 
Methods: The study population consisted of 2173 patients (1587 mild status [mild group] and asymptomatic patients, 377 moderate status patients [moderate group], and 209 severe status patients [severe group]). The status of the patients was recorded from September 2021 to March 2022. Two correlation tests, relative risk, and RR were used to eliminate unbalanced parameters and select the most remarkable parameters. The independent methods of hierarchical cluster analysis and k-means were used to classify parameters according to their r values. Finally, network analysis provided a 3-dimensional view of the results. Results: COVID-19 severity was significantly correlated with age (mild-moderate group: RR 4.19, 95% CI 3.58-4.95; P<.001), scoring index of chest x-ray (mild-moderate group: RR 3.29, 95% CI 2.76-3.92; P<.001; moderate-severe group: RR 3.03, 95% CI 2.4023-3.8314; P<.001), percentage of neutrophils (mild-moderate group: RR 3.18, 95% CI 2.73-3.70; P<.001; moderate-severe group: RR 3.32, 95% CI 2.6480-4.1529; P<.001), quantity of neutrophils (moderate-severe group: RR 3.15, 95% CI 2.6153-3.8025; P<.001), albumin (moderate-severe group: RR 0.46, 95% CI 0.3650-0.5752; P<.001), C-reactive protein (mild-moderate group: RR 3.4, 95% CI 2.91-3.97; P<.001), and ratio of lymphocytes (moderate-severe group: RR 0.34, 95% CI 0.2743-0.4210; P<.001). Significant inversion of correlations among the severity groups is important. Alanine transaminase and leucocytes showed a significant negative correlation (r=−1; P<.001) in the mild group and a significant positive correlation in the moderate group (r=1; P<.001). Transferrin and anion Cl showed a significant positive correlation (r=1; P<.001) in the mild group and a significant negative correlation in the moderate group (r=−0.59; P<.001). The clustering and network analysis showed that in the mild-moderate group, the closest neighbors of COVID-19 severity were ferritin and age. 
C-reactive protein, scoring index of chest x-ray, albumin, and lactate dehydrogenase were the next closest neighbors of these 3 factors. In the moderate-severe group, the closest neighbors of COVID-19 severity were ferritin, fibrinogen, albumin, quantity of lymphocytes, scoring index of chest x-ray, white blood cell count, lactate dehydrogenase, and quantity of neutrophils. Conclusions: This multidimensional study in Vietnam showed possible correlations between several elements and COVID-19 severity to provide clinical reference markers for surveillance and diagnostic management. %M 36668902 %R 10.2196/42895 %U https://formative.jmir.org/2023/1/e42895 %U https://doi.org/10.2196/42895 %U http://www.ncbi.nlm.nih.gov/pubmed/36668902 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 9 %N %P e41450 %T Small Area Forecasting of Opioid-Related Mortality: Bayesian Spatiotemporal Dynamic Modeling Approach %A Bauer,Cici %A Zhang,Kehe %A Li,Wenjun %A Bernson,Dana %A Dammann,Olaf %A LaRochelle,Marc R %A Stopka,Thomas J %+ Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center at Houston, 1200 Pressler Street, Room E819, Houston, TX, 77030, United States, 1 713 500 9581, cici.x.bauer@uth.tmc.edu %K opioid-related mortality %K small area estimation %K spatiotemporal models %K Bayesian %K forecasting %D 2023 %7 10.2.2023 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Opioid-related overdose mortality has remained at crisis levels across the United States, increasing 5-fold and worsening during the COVID-19 pandemic. The ability to provide forecasts of opioid-related mortality at granular geographical and temporal scales may help guide preemptive public health responses. 
Current forecasting models focus on prediction on a large geographical scale, such as states or counties, lacking the spatial granularity that local public health officials desire to guide policy decisions and resource allocation. Objective: The overarching objective of our study was to develop Bayesian spatiotemporal dynamic models to predict opioid-related mortality counts and rates at temporally and geographically granular scales (ie, ZIP Code Tabulation Areas [ZCTAs]) for Massachusetts. Methods: We obtained decedent data from the Massachusetts Registry of Vital Records and Statistics for 2005 through 2019. We developed Bayesian spatiotemporal dynamic models to predict opioid-related mortality across Massachusetts’ 537 ZCTAs. We evaluated the prediction performance of our models using the one-year ahead approach. We investigated the potential improvement of prediction accuracy by incorporating ZCTA-level demographic and socioeconomic determinants. We identified ZCTAs with the highest predicted opioid-related mortality in terms of rates and counts and stratified them by rural and urban areas. Results: Bayesian dynamic models with the full spatial and temporal dependency performed best. Inclusion of the ZCTA-level demographic and socioeconomic variables as predictors improved the prediction accuracy, but only in the model that did not account for the neighborhood-level spatial dependency of the ZCTAs. Predictions were better for urban areas than for rural areas, which were more sparsely populated. Using the best performing model and the Massachusetts opioid-related mortality data from 2005 through 2019, our models suggested a stabilizing pattern in opioid-related overdose mortality in 2020 and 2021 if there were no disruptive changes to the trends observed for 2005-2019. 
Conclusions: Our Bayesian spatiotemporal models focused on opioid-related overdose mortality data facilitated prediction approaches that can inform preemptive public health decision-making and resource allocation. While sparse data from rural and less populated locales typically pose special challenges in small area predictions, our dynamic Bayesian models, which maximized information borrowing across geographic areas and time points, were used to provide more accurate predictions for small areas. Such approaches can be replicated in other jurisdictions and at varying temporal and geographical levels. We encourage the formation of a modeling consortium for fatal opioid-related overdose predictions, where different modeling techniques could be ensembled to inform public health policy. %M 36763450 %R 10.2196/41450 %U https://publichealth.jmir.org/2023/1/e41450 %U https://doi.org/10.2196/41450 %U http://www.ncbi.nlm.nih.gov/pubmed/36763450 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 11 %P e34908 %T Incidence and Prevalence of Peripheral Arterial Disease in South Korea: Retrospective Analysis of National Claims Data %A Ryu,Gi Wook %A Park,Young Shin %A Kim,Jeewuan %A Yang,Yong Sook %A Ko,Young-Guk %A Choi,Mona %+ Mo-Im Kim Nursing Research Institute, College of Nursing, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea, 82 2 2228 3341, monachoi@yuhs.ac %K peripheral arterial disease %K insurance claims %K incidence %K prevalence %K endovascular revascularization %K amputation %K population-based study %K blood flow %K intermittent claudication %K age %K sex %D 2022 %7 18.11.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Peripheral arterial disease (PAD) causes blood vessel narrowing that decreases blood flow to the lower extremities, with symptoms such as leg pain, discomfort, and intermittent claudication. 
PAD increases risks for amputation, poor health-related quality of life, and mortality. It is estimated that more than 200 million people worldwide have PAD, although the paucity of PAD research in the East detracts from knowledge on global PAD epidemiology. There are few national data–based analyses or health care utilization investigations. Thus, a national data analysis of PAD incidence and prevalence would provide baseline data to enable health promotion strategies for patients with PAD. Objective: This study aims to identify South Korean trends in the incidence and prevalence of PAD and PAD treatment, in-hospital deaths, and health care utilization. Methods: This was a retrospective analysis of South Korean national claims data from 2009 to 2018. The incidence of PAD was determined by setting the years 2009 and 2010 as a washout period to exclude previously diagnosed patients with PAD. The study included adults aged ≥20 and <90 years who received a primary diagnosis of PAD between 2011 and 2018; patients were stratified according to age, sex, and insurance status for the incidence and prevalence analyses. Descriptive statistics were used to assess incidence, prevalence, endovascular revascularization (EVR) events, amputations, in-hospital deaths, and the health care utilization characteristics of patients with PAD. Results: Based on data from 2011 to 2018, there were an average of 124,682 and 993,048 incident and prevalent PAD cases, respectively, in 2018. PAD incidence (per 1000 persons) ranged from 2.68 to 3.09 during the study period. From 2012 to 2018, the incidence rate in both sexes showed an increasing trend. PAD incidence continued to increase with age. PAD prevalence (per 1000 persons) increased steadily, from 3.93 in 2011 to 23.55 in 2018. The number of EVR events varied between 933 and 1422 during the study period, and both major and minor amputations showed a decreasing trend. 
Health care utilization characteristics showed that women visited clinics more frequently than men, whereas men used tertiary and general hospitals more often than women. Conclusions: The number of incident and prevalent PAD cases generally showed an increasing trend. Visits to tertiary and general hospitals were higher among men than women. These results indicate the need for attention not only to Western and male patients, but also to Eastern and female patients with PAD. The results are generalizable, as they are based on national claims data from the entire South Korean population, and they can promote preventive care and management strategies for patients with PAD in clinical and public health settings. %M 36399371 %R 10.2196/34908 %U https://publichealth.jmir.org/2022/11/e34908 %U https://doi.org/10.2196/34908 %U http://www.ncbi.nlm.nih.gov/pubmed/36399371 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 11 %P e38037 %T Modeling the Potential Impact of Missing Race and Ethnicity Data in Infectious Disease Surveillance Systems on Disparity Measures: Scenario Analysis of Different Imputation Strategies %A Ansari,Bahareh %A Hart-Malloy,Rachel %A Rosenberg,Eli S %A Trigg,Monica %A Martin,Erika G %+ Department of Public Administration and Policy, Rockefeller College of Public Affairs and Policy, University at Albany, 300 Milne Hall, 135 Western Ave, Albany, NY, 12203, United States, 1 518 442 5243, emartin@albany.edu %K missing data %K sexually transmitted diseases %K imputation %K surveillance %K health equity %D 2022 %7 9.11.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Monitoring progress toward population health equity goals requires developing robust disparity indicators. However, surveillance data gaps that result in undercounting racial and ethnic minority groups might influence the observed disparity measures. 
Objective: This study aimed to assess the impact of missing race and ethnicity data in surveillance systems on disparity measures. Methods: We explored variations in missing race and ethnicity information in reported annual chlamydia and gonorrhea diagnoses in the United States from 2007 to 2018 by state, year, reported sex, and infection. For diagnoses with incomplete demographic information in 2018, we estimated disparity measures (relative rate ratio and rate difference) with 5 imputation scenarios compared with the base case (no adjustments). The 5 scenarios used the racial and ethnic distribution of chlamydia or gonorrhea diagnoses in the same state, chlamydia or gonorrhea diagnoses in neighboring states, chlamydia or gonorrhea diagnoses within the geographic region, HIV diagnoses, and syphilis diagnoses. Results: In 2018, a total of 31.93% (560,551/1,755,510) of chlamydia and 22.11% (128,790/582,475) of gonorrhea diagnoses had missing race and ethnicity information. Missingness differed by infection type but not by reported sex. Missing race and ethnicity information varied widely across states and times (range across state-years: from 0.0% to 96.2%). The rate ratio remained similar in the imputation scenarios, although the rate difference differed nationally and in some states. Conclusions: We found that missing race and ethnicity information affects measured disparities, which is important to consider when interpreting disparity metrics. Addressing missing information in surveillance systems requires system-level solutions, such as collecting more complete laboratory data, improving the linkage of data systems, and designing more efficient data collection procedures. As a short-term solution, local public health agencies can adapt these imputation scenarios to their aggregate data to adjust surveillance data for use in population indicators of health equity. 
%M 36350701 %R 10.2196/38037 %U https://publichealth.jmir.org/2022/11/e38037 %U https://doi.org/10.2196/38037 %U http://www.ncbi.nlm.nih.gov/pubmed/36350701 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 11 %P e36424 %T The Relationship Between Population-Level SARS-CoV-2 Cycle Threshold Values and Trend of COVID-19 Infection: Longitudinal Study %A Dehesh,Paria %A Baradaran,Hamid Reza %A Eshrati,Babak %A Motevalian,Seyed Abbas %A Salehi,Masoud %A Donyavi,Tahereh %+ Department of Epidemiology, School of Public Health, Iran University of Medical Sciences, Hemmat Highway, Tehran, 1449614535, Iran, 98 9183616737, babak.eshrati@gmail.com %K cycle threshold value %K COVID-19 %K trend %K surveillance %K epidemiology %K disease surveillance %K surveillance %K digital surveillance %K prediction model %K epidemic modeling %K health system %K infectious disease %D 2022 %7 8.11.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The distribution of population-level real-time reverse transcription-polymerase chain reaction (RT-PCR) cycle threshold (Ct) values as a proxy of viral load may be a useful indicator for predicting COVID-19 dynamics. Objective: The aim of this study was to determine the relationship between the daily trend of average Ct values and COVID-19 dynamics, calculated as the daily number of hospitalized patients with COVID-19, daily number of new positive tests, daily number of COVID-19 deaths, and number of hospitalized patients with COVID-19 by age. We further sought to determine the lag between these data series. Methods: The samples included in this study were collected from March 21, 2021, to December 1, 2021. Daily Ct values of all patients who were referred to the Molecular Diagnostic Laboratory of Iran University of Medical Sciences in Tehran, Iran, for RT-PCR tests were recorded. 
The daily number of positive tests and the number of hospitalized patients by age group were extracted from the COVID-19 patient information registration system in Tehran province, Iran. An autoregressive integrated moving average (ARIMA) model was constructed for the time series of variables. Cross-correlation analysis was then performed to determine the best lag and correlations between the average daily Ct value and other COVID-19 dynamics–related variables. Finally, the best-selected lag of Ct identified through cross-correlation was incorporated as a covariate into the autoregressive integrated moving average with exogenous variables (ARIMAX) model to calculate the coefficients. Results: Daily average Ct values showed a significant negative correlation with the daily number of newly hospitalized patients (23-day time delay; P=.02), the daily number of new positive tests (30-day time delay; P=.02), and the daily number of COVID-19 deaths (30-day time delay; P=.02). The daily average Ct value with a 30-day delay could impact the daily number of positive tests for COVID-19 (β=–16.87, P<.001) and the daily number of deaths from COVID-19 (β=–1.52, P=.03). There was a significant association between Ct lag (23 days) and the number of COVID-19 hospitalizations (β=–24.12, P=.005). Cross-correlation analysis showed significant time delays between the average Ct values and the daily number of hospitalized patients aged 18-59 years (23-day time delay, P=.02) and patients over 60 years old (23-day time delay, P<.001). No statistically significant relation was detected for the number of daily hospitalized patients under 5 years old (9-day time delay, P=.27) or aged 5-17 years (13-day time delay, P=.39). Conclusions: It is important for surveillance of COVID-19 to find a good indicator that can predict epidemic surges in the community. 
Our results suggest that the average daily Ct value with a 30-day delay can predict increases in the number of positive confirmed COVID-19 cases, which may be a useful indicator for the health system. %M 36240022 %R 10.2196/36424 %U https://publichealth.jmir.org/2022/11/e36424 %U https://doi.org/10.2196/36424 %U http://www.ncbi.nlm.nih.gov/pubmed/36240022 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 10 %P e34555 %T Key Population Size Estimation to Guide HIV Epidemic Responses in Nigeria: Bayesian Analysis of 3-Source Capture-Recapture Data %A McIntyre,Anne F %A Mitchell,Andrew %A Stafford,Kristen A %A Nwafor,Samuel Uchenna %A Lo,Julia %A Sebastian,Victor %A Schwitters,Amee %A Swaminathan,Mahesh %A Dalhatu,Ibrahim %A Charurat,Man %+ Division of Global HIV and TB, Center for Global Health, Centers for Disease Control and Prevention, 1600 Clifton Road NE MS E-30, Atlanta, GA, 30329, United States, 1 404 639 8284, zat4@cdc.gov %K sex workers %K men who have sex with men %K people who inject drugs %K HIV %K population size %K population %K data %K female %K men %K drugs %K drug injection %K epidemic %K Nigeria %D 2022 %7 26.10.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Nigeria has the fourth largest burden of HIV globally. Key populations, including female sex workers, men who have sex with men, and people who inject drugs, are more vulnerable to HIV than the general population due to stigmatized and criminalized behaviors. Reliable key population size estimates are needed to guide HIV epidemic response efforts. 
Objective: The objective of our study was to use empirical methods for sampling and analysis to improve the quality of population size estimates of female sex workers, men who have sex with men, and people who inject drugs in 7 states (Akwa Ibom, Benue, Cross River, Lagos, Nasarawa, Rivers, and the Federal Capital Territory) of Nigeria for program planning and to demonstrate improved statistical estimation methods. Methods: From October to December 2018, we used 3-source capture-recapture to produce population size estimates in 7 states in Nigeria. Hotspots were mapped before 3-source capture-recapture started. We sampled female sex workers, men who have sex with men, and people who inject drugs during 3 independent captures about one week apart. During hotspot encounters, key population members were offered inexpensive, memorable objects unique to each capture round. In subsequent rounds, key population members were offered an object and asked to identify objects received during previous rounds (if any). Correct responses were tallied and recorded on tablets. Data were aggregated by key population and state for analysis. Median population size estimates were derived using Bayesian nonparametric latent-class models with 80% highest density intervals. Results: Overall, we sampled approximately 310,000 persons at 9015 hotspots during 3 independent captures. Population size estimates for female sex workers ranged from 14,500 to 64,300; population size estimates for men who have sex with men ranged from 3200 to 41,400; and population size estimates for people who inject drugs ranged from 3400 to 30,400. Conclusions: This was the first implementation of these 3-source capture-recapture methods in Nigeria. Our population size estimates were larger than previously documented for each key population in all states. 
The Bayesian models account for factors, such as social visibility, that influence heterogeneous capture probabilities, resulting in more reliable population size estimates. The larger population size estimates suggest a need for programmatic scale-up to reach these populations, which are at highest risk for HIV. %M 36287587 %R 10.2196/34555 %U https://publichealth.jmir.org/2022/10/e34555 %U https://doi.org/10.2196/34555 %U http://www.ncbi.nlm.nih.gov/pubmed/36287587 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 6 %N 10 %P e39373 %T Mental Illness Concordance Between Hospital Clinical Records and Mentions in Domestic Violence Police Narratives: Data Linkage Study %A Karystianis,George %A Cabral,Rina Carines %A Adily,Armita %A Lukmanjaya,Wilson %A Schofield,Peter %A Buchan,Iain %A Nenadic,Goran %A Butler,Tony %+ School of Population Health, University of New South Wales, Level 3, Samuels Building, Gate 11, Botany Street, UNSW Kensington Campus, Sydney, 2052, Australia, 61 93852517, g.karystianis@unsw.edu.au %K data linkage %K mental health %K domestic violence %K police records %K hospital records %K text mining %D 2022 %7 20.10.2022 %9 Original Paper %J JMIR Form Res %G English %X Background: To better understand domestic violence, data sources from multiple sectors such as police, justice, health, and welfare are needed. Linking police data to data collections from other agencies could provide unique insights and promote an all-of-government response to domestic violence. The New South Wales Police Force attends domestic violence events and records information in the form of both structured data and a free-text narrative, with the latter shown to be a rich source of information on the mental health status of persons of interest (POIs) and victims, abuse types, and sustained injuries. 
Objective: This study aims to examine the concordance (ie, matching) between mental illness mentions extracted from the police’s event narratives and mental health diagnoses from hospital and emergency department records. Methods: We applied a rule-based text mining method on 416,441 domestic violence police event narratives between December 2005 and January 2016 to identify mental illness mentions for POIs and victims. Using different window periods (1, 3, 6, and 12 months) before and after a domestic violence event, we linked the extracted mental illness mentions of victims and POIs to clinical records from the Emergency Department Data Collection and the Admitted Patient Data Collection in New South Wales, Australia using a unique identifier for each individual in the same cohort. Results: Using a 2-year window period (ie, 12 months before and after the domestic violence event), less than 1% (3020/416,441, 0.73%) of events had a mental illness mention and also a corresponding hospital record. About 16% of domestic violence events for both POIs (382/2395, 15.95%) and victims (101/631, 16.01%) had an agreement between hospital records and police narrative mentions of mental illness. A total of 51,025/416,441 (12.25%) events for POIs and 14,802/416,441 (3.55%) events for victims had mental illness mentions in their narratives but no hospital record. Only 841 events for POIs and 919 events for victims had a documented hospital record within 48 hours of the domestic violence event. Conclusions: Our findings suggest that current surveillance systems used to report on domestic violence may be enhanced by accessing rich information (ie, mental illness) contained in police text narratives, made available for both POIs and victims through the application of text mining. Additional insights can be gained by linkage to other health and welfare data collections. 
%M 36264613 %R 10.2196/39373 %U https://formative.jmir.org/2022/10/e39373 %U https://doi.org/10.2196/39373 %U http://www.ncbi.nlm.nih.gov/pubmed/36264613 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 9 %P e37922 %T Spatiotemporal Analysis of Online Purchase of HIV Self-testing Kits in China, 2015-2017: Longitudinal Observational Study %A Lv,Yi %A Zhu,Qiyu %A Xu,Chengdong %A Zhang,Guanbin %A Jiang,Yan %A Han,Mengjie %A Jin,Cong %+ National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, 155 Changbai Road, Changping District, Beijing, 102206, China, 86 10 58900995, jinc@chinaaids.cn %K spatiotemporal %K characteristics %K online %K purchase %K HIV %K self-testing %K e-commerce %K economic status %K HIV epidemic %K China %D 2022 %7 27.9.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Since the introduction of HIV self-testing by UNAIDS in 2014, the practice has been extensively implemented around the world. HIV self-testing (HIVST) was developed in China around 2015, and the online purchase of HIVST kits through e-commerce platforms has since become the most important delivery method for self-testing, with advantages such as user-friendliness, speed, and better privacy protection. Objective: Understanding the spatiotemporal characteristics of online HIVST kit purchasing behavior and identifying potential impacting factors will help promote the HIV self-testing strategy. Methods: The online retail data of HIVST kits from the 2 largest e-commerce platforms in China from 2015 to 2017 were collected for this study. The Bayesian spatiotemporal hierarchical model was used to investigate the spatiotemporal characteristics of online purchased HIVST kits. 
Ordinary least squares regression was used to identify potential factors associated with online purchase, including GDP per capita, population density, road density, HIV screening laboratory density, and newly diagnosed HIV/AIDS cases per 100,000 persons. The q statistics calculated by Geodetector were used to determine the interactive effect of every 2 factors on the online purchase. Results: The online purchase of HIVST kits increased rapidly in China from 2015 to 2017, with annual peak sales in May and December. Five economically superior regions in China, Pearl River Delta, Yangtze River Delta, Chengdu and surrounding areas, Beijing and Tianjin areas, and Shandong Peninsula, showed a comparatively higher spatial preference for online purchased HIVST kits. The GDP per capita (P<.001) and the rate of newly diagnosed HIV/AIDS cases per 100,000 persons (P<.001) were identified as 2 factors positively associated with online purchase. Among the factors we investigated in this study, 2 factors associated with online purchase, GDP per capita and the rate of newly diagnosed HIV/AIDS cases per 100,000 persons, also displayed the strongest interactive effect, with a q value of 0.66. Conclusions: Individuals in better-off areas are more inclined to purchase HIVST kits online. In addition to economic status, the severity of the HIV epidemic is also a factor influencing the online purchase of HIVST kits. 
%M 35918844 %R 10.2196/37922 %U https://publichealth.jmir.org/2022/9/e37922 %U https://doi.org/10.2196/37922 %U http://www.ncbi.nlm.nih.gov/pubmed/35918844 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 9 %P e37887 %T The Impact of Nonrandom Missingness in Surveillance Data for Population-Level Summaries: Simulation Study %A Weiss,Paul Samuel %A Waller,Lance Allyn %+ Rollins School of Public Health, Emory University, 1518 Clifton Rd NE, Room 308, Atlanta, GA, 30322-4201, United States, 1 404 712 9641, paul.weiss@emory.edu %K surveillance %K estimation %K missing data %K population-level estimates %K health policy %K public health policy %K estimates %K data %K policy decision %K bias %K response rate %D 2022 %7 9.9.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Surveillance data are essential public health resources for guiding policy and allocation of human and capital resources. These data often consist of large collections of information based on nonrandom sample designs. Population estimates based on such data may be impacted by the underlying sample distribution compared to the true population of interest. In this study, we simulate a population of interest and allow response rates to vary in nonrandom ways to illustrate and measure the effect this has on population-based estimates of an important public health policy outcome. Objective: The aim of this study was to illustrate the effect of nonrandom missingness on population-based survey sample estimation. Methods: We simulated a population of respondents answering a survey question about their satisfaction with their community’s policy regarding vaccination mandates for government personnel. We allowed response rates to differ between the generally satisfied and dissatisfied and considered the effect of common efforts to control for potential bias such as sampling weights, sample size inflation, and hypothesis tests for determining missingness at random. 
We compared these conditions via mean squared errors and sampling variability to characterize the bias in estimation arising under these different approaches. Results: Sample estimates present clear and quantifiable bias, even in the most favorable response profile. On a 5-point Likert scale, nonrandom missingness resulted in errors averaging almost a full point away from the truth. Efforts to mitigate bias through sample size inflation and sampling weights have negligible effects on the overall results. Additionally, hypothesis testing for departures from random missingness rarely detects the nonrandom missingness across the widest range of response profiles considered. Conclusions: Our results suggest that assuming surveillance data are missing at random during analysis could provide estimates that are widely different from what we might see in the whole population. Policy decisions based on such potentially biased estimates could be devastating in terms of community disengagement and health disparities. Alternative approaches to analysis that move away from broad generalization of a mismeasured population at risk are necessary to identify the marginalized groups, where overall response may be very different from those observed in measured respondents. 
%M 36083618 %R 10.2196/37887 %U https://publichealth.jmir.org/2022/9/e37887 %U https://doi.org/10.2196/37887 %U http://www.ncbi.nlm.nih.gov/pubmed/36083618 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 9 %P e34472 %T Privacy of Study Participants in Open-access Health and Demographic Surveillance System Data: Requirements Analysis for Data Anonymization %A Templ,Matthias %A Kanjala,Chifundo %A Siems,Inken %+ Institute of Data Analysis and Process Design, Zurich University of Applied Sciences, Rosenstrasse 3, Winterthur, 8404, Switzerland, 41 793221578, matthias.templ@zhaw.ch %K longitudinal data and event history data %K low- and middle-income countries %K LMIC %K anonymization %K health and demographic surveillance system %D 2022 %7 2.9.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Data anonymization and sharing have become popular topics for individuals, organizations, and countries worldwide. Open-access sharing of anonymized data containing sensitive information about individuals makes the most sense whenever the utility of the data can be preserved and the risk of disclosure can be kept below acceptable levels. In this case, researchers can use the data without access restrictions and limitations. Objective: This study aimed to highlight the requirements and possible solutions for sharing health surveillance event history data. The challenges lie in the anonymization of multiple event dates and time-varying variables. Methods: A sequential approach that adds noise to event dates is proposed. This approach maintains the event order and preserves the average time between events. In addition, a nosy neighbor distance-based matching approach to estimate the risk is proposed. Regarding the key variables that change over time, such as educational level or occupation, we make 2 proposals: one based on limiting the intermediate statuses of the individual and the other to achieve k-anonymity in subsets of the data. 
The proposed approaches were applied to the Karonga health and demographic surveillance system (HDSS) core residency data set, which contains longitudinal data from 1995 to the end of 2016 and includes 280,381 events with time-varying socioeconomic variables and demographic information. Results: An anonymized version of the event history data, including longitudinal information on individuals over time, with high data utility, was created. Conclusions: The proposed anonymization of event history data comprising static and time-varying variables, applied to HDSS data, led to acceptable disclosure risk and preserved utility, making the data sharable as public use data. It was found that high utility was achieved, even with the highest level of noise added to the core event dates. The details are important to ensure consistency or credibility. Importantly, the sequential noise addition approach presented in this study not only maintains the event order recorded in the original data but also maintains the time between events. We proposed an approach that preserves the data utility well but limits the number of response categories for the time-varying variables. Furthermore, using distance-based neighborhood matching, we simulated an attack under a nosy neighbor situation and by using a worst-case scenario where attackers have full information on the original data. We showed that the disclosure risk is very low, even when assuming that the attacker’s database and information are optimal. The HDSS and medical science research communities in low- and middle-income country settings will be the primary beneficiaries of the results and methods presented in this paper; however, the results will be useful for anyone working on anonymizing longitudinal event history data with time-varying variables for the purposes of sharing. 
%M 36053573 %R 10.2196/34472 %U https://publichealth.jmir.org/2022/9/e34472 %U https://doi.org/10.2196/34472 %U http://www.ncbi.nlm.nih.gov/pubmed/36053573 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 8 %P e36861 %T Development and Validation of Indicators for Population Injury Surveillance in Hong Kong: Development and Usability Study %A Tung,Keith T S %A Wong,Rosa S %A Ho,Frederick K %A Chan,Ko Ling %A Wong,Wilfred H S %A Leung,Hugo %A Leung,Ming %A Leung,Gilberto K K %A Chow,Chun Bong %A Ip,Patrick %+ Department of Paediatrics and Adolescent Medicine, The University of Hong Kong, NCB 123, Queen Mary Hospital, Hong Kong, Hong Kong, 852 22554090, patricip@hku.hk %K injury %K indicators %K modified Delphi research design %K surveillance %D 2022 %7 18.8.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Injury is an increasingly pressing global health issue. An effective surveillance system is required to monitor the trends and burden of injuries. Objective: This study aimed to identify a set of valid and context-specific injury indicators to facilitate the establishment of an injury surveillance program in Hong Kong. Methods: This development of indicators adopted a multiphased modified Delphi research design. A literature search was conducted on academic databases using injury-related search terms in various combinations. A list of potential indicators was sent to a panel of experts from various backgrounds to rate the validity and context-specificity of these indicators. Local hospital data on the selected core indicators were used to examine their applicability in the context of Hong Kong. Results: We reviewed 142 articles and identified 55 indicators, which were classified into 4 domains. On the basis of the ratings by the expert panel, 13 indicators were selected as core indicators because of their good validity and high relevance to the local context. 
Among these indicators, 10 were from the construct of health care service use, and 3 were from the construct of postdischarge outcomes. Regression analyses of local hospitalization data showed that the Hong Kong Safe Community certification status had no association with 5 core indicators (admission to intensive care unit, mortality rate, length of intensive care unit stay, need for a rehabilitation facility, and long-term behavioral and emotional outcomes), negative associations with 4 core indicators (operative intervention, infection rate, length of hospitalization, and disability-adjusted life years), and positive associations with the remaining 4 core indicators (attendance to accident and emergency department, discharge rate, suicide rate, and hospitalization rate after attending the accident and emergency department). These results confirmed the validity of the selected core indicators for the quantification of injury burden and evaluation of injury-related services, although some indicators may better measure the consequences of severe injuries. Conclusions: This study developed a set of injury outcome indicators that would be useful for monitoring injury trends and burdens in Hong Kong. 
%M 35980728 %R 10.2196/36861 %U https://publichealth.jmir.org/2022/8/e36861 %U https://doi.org/10.2196/36861 %U http://www.ncbi.nlm.nih.gov/pubmed/35980728 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 8 %P e34589 %T Colorectal Cancer Incidence, Inequalities, and Prevention Priorities in Urban Texas: Surveillance Study With the “surveil” Software Package %A Donegan,Connor %A Hughes,Amy E %A Lee,Simon J Craddock %+ Department of Population and Data Sciences, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, 75390, United States, 1 214 648 9400, Connor.Donegan@UTSouthwestern.edu %K Bayesian analysis %K cancer prevention %K colorectal cancer %K health equity %K open source software %K public health monitoring %K time-series analysis %D 2022 %7 16.8.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Monitoring disease incidence rates over time with population surveillance data is fundamental to public health research and practice. Bayesian disease monitoring methods provide advantages over conventional methods including greater flexibility in model specification and the ability to conduct formal inference on model-derived quantities of interest. However, software platforms for Bayesian inference are often inaccessible to nonspecialists. Objective: To increase the accessibility of Bayesian methods among health surveillance researchers, we introduce a Bayesian methodology and open source software package, surveil, for time-series modeling of disease incidence and mortality. Given case count and population-at-risk data, the software enables health researchers to draw inferences about underlying risk and derivative quantities including age-standardized rates, annual and cumulative percent change, and measures of inequality. Methods: We specify a Poisson likelihood for case counts and model trends in log-risk using the first-difference (random-walk) prior. 
Models in the surveil R package were built using the Stan modeling language. We demonstrate the methodology and software by analyzing age-standardized colorectal cancer (CRC) incidence rates by race and ethnicity for non-Latino Black (Black), non-Latino White (White), and Hispanic/Latino (of any race) adults aged 50-79 years in Texas’s 4 largest metropolitan statistical areas between 1999 and 2018. Results: Our analysis revealed a cumulative decline of 31% (95% CI –37% to –25%) in CRC risk among Black adults, 17% (95% CI –23% to –11%) for Latino adults, and 35% (95% CI –38% to –31%) for White adults from 1999 to 2018. None of the 3 observed groups experienced significant incidence reduction in the final 4 years of the study (2015-2018). The Black-White rate difference (per 100,000) was 44 (95% CI 30-57) in 1999 and 35 (95% CI 28-43) in 2018. Cumulatively, the Black-White gap accounts for 3983 CRC cases (95% CI 3746-4219) or 31% (95% CI 29%-32%) of total CRC incidence among Black adults in this period. Conclusions: Stalled progress on CRC prevention and excess CRC risk among Black residents warrant special attention as cancer prevention and control priorities in urban Texas. Our methodology and software can help the public and health agencies monitor health inequalities and evaluate progress toward disease prevention goals. Advantages of the methodology over current common practice include the following: (1) the absence of piecewise linearity constraints on the model space, and (2) formal inference can be undertaken on any model-derived quantities of interest using Bayesian methods. 
%M 35972778 %R 10.2196/34589 %U https://publichealth.jmir.org/2022/8/e34589 %U https://doi.org/10.2196/34589 %U http://www.ncbi.nlm.nih.gov/pubmed/35972778 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 8 %P e35840 %T Investigating Linkages Between Spatiotemporal Patterns of the COVID-19 Delta Variant and Public Health Interventions in Southeast Asia: Prospective Space-Time Scan Statistical Analysis Method %A Luo,Wei %A Liu,Zhaoyin %A Zhou,Yuxuan %A Zhao,Yumin %A Li,Yunyue Elita %A Masrur,Arif %A Yu,Manzhu %+ Department of Geography, National University of Singapore, 1 Arts Link, #04-32 Block AS2, Singapore, 117570, Singapore, 65 65163851, geowl@nus.edu.sg %K COVID-19 %K Delta variant %K space-time scan %K intervention %K Southeast Asia %D 2022 %7 9.8.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The COVID-19 Delta variant has presented an unprecedented challenge to countries in Southeast Asia (SEA). Its transmission has shown spatial heterogeneity in SEA as countries have adopted different public health interventions during the process. Hence, it is crucial for public health authorities to discover potential linkages between epidemic progression and corresponding interventions such that collective and coordinated control measures can be designed to increase their effectiveness at reducing transmission in SEA. Objective: The purpose of this study is to explore potential linkages between the spatiotemporal progression of the COVID-19 Delta variant and nonpharmaceutical intervention (NPI) measures in SEA. We detected the space-time clusters of outbreaks of COVID-19 and analyzed how the NPI measures relate to the propagation of COVID-19. Methods: We collected district-level daily new cases of COVID-19 from June 1 to October 31, 2021, and district-level population data in SEA. We adopted prospective space-time scan statistics to identify the space-time clusters. 
Using cumulative prospective space-time scan statistics, we further identified variations of relative risk (RR) across each district at a half-month interval and their potential public health intervention linkages. Results: We found 7 high-risk clusters (clusters 1-7) of COVID-19 transmission in Malaysia, the Philippines, Thailand, Vietnam, and Indonesia between June and August, 2021, with an RR of 5.45 (P<.001), 3.50 (P<.001), 2.30 (P<.001), 1.36 (P<.001), 5.62 (P<.001), 2.38 (P<.001), 3.45 (P<.001), respectively. There were 34 provinces in Indonesia that have successfully mitigated the risk of COVID-19, with a decreasing range between –0.05 and –1.46 due to the assistance of continuous restrictions. However, 58.6% of districts in Malaysia, Singapore, Thailand, and the Philippines saw an increase in the infection risk, which is aligned with their loosened restrictions. Continuous strict interventions were effective in mitigating COVID-19, while relaxing restrictions may exacerbate the propagation risk of this epidemic. Conclusions: The analyses of space-time clusters and RRs of districts benefit public health authorities with continuous surveillance of COVID-19 dynamics using real-time data. International coordination with more synchronized interventions amidst all SEA countries may play a key role in mitigating the progression of COVID-19. 
%M 35861674 %R 10.2196/35840 %U https://publichealth.jmir.org/2022/8/e35840 %U https://doi.org/10.2196/35840 %U http://www.ncbi.nlm.nih.gov/pubmed/35861674 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 7 %P e31306 %T Causal Modeling to Mitigate Selection Bias and Unmeasured Confounding in Internet-Based Epidemiology of COVID-19: Model Development and Validation %A Stockham,Nathaniel %A Washington,Peter %A Chrisman,Brianna %A Paskov,Kelley %A Jung,Jae-Yoon %A Wall,Dennis Paul %+ Neurosciences Interdepartmental Program, Stanford University, 3145 Porter Dr, Palo Alto, CA, 94304, United States, 1 2056021832, stockham@stanford.edu %K selection bias %K COVID-19 %K epidemiology %K causality %K sensitivity analysis %K public health %K surveillance %K method %K epidemiologic research design %K model %K bias %K development %K validation %K utility %K implementation %K sensitivity %K design %K research %K epidemiology %D 2022 %7 21.7.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Selection bias and unmeasured confounding are fundamental problems in epidemiology that threaten study internal and external validity. These phenomena are particularly dangerous in internet-based public health surveillance, where traditional mitigation and adjustment methods are inapplicable, unavailable, or out of date. Recent theoretical advances in causal modeling can mitigate these threats, but these innovations have not been widely deployed in the epidemiological community. Objective: The purpose of our paper is to demonstrate the practical utility of causal modeling to both detect unmeasured confounding and selection bias and guide model selection to minimize bias. We implemented this approach in an applied epidemiological study of the COVID-19 cumulative infection rate in the New York City (NYC) spring 2020 epidemic. 
Methods: We collected primary data from Qualtrics surveys of Amazon Mechanical Turk (MTurk) crowd workers residing in New Jersey and New York State across 2 sampling periods: April 11-14 and May 8-11, 2020. The surveys queried the subjects on household health status and demographic characteristics. We constructed a set of possible causal models of household infection and survey selection mechanisms and ranked them by compatibility with the collected survey data. The most compatible causal model was then used to estimate the cumulative infection rate in each survey period. Results: There were 527 and 513 responses collected for the 2 periods, respectively. Response demographics were highly skewed toward a younger age in both survey periods. Despite the extremely strong relationship between age and COVID-19 symptoms, we recovered minimally biased estimates of the cumulative infection rate using only primary data and the most compatible causal model, with a relative bias of +3.8% and –1.9% from the reported cumulative infection rate for the first and second survey periods, respectively. Conclusions: We successfully recovered accurate estimates of the cumulative infection rate from an internet-based crowdsourced sample despite considerable selection bias and unmeasured confounding in the primary data. This implementation demonstrates how simple applications of structural causal modeling can be effectively used to determine falsifiable model conditions, detect selection bias and confounding factors, and minimize estimate bias through model selection in a novel epidemiological context. As the disease and social dynamics of COVID-19 continue to evolve, public health surveillance protocols must continue to adapt; the emergence of Omicron variants and the shift to at-home testing are recent challenges. Rigorous and transparent methods to develop, deploy, and diagnose adapted surveillance protocols will be critical to their success. 
%M 35605128 %R 10.2196/31306 %U https://publichealth.jmir.org/2022/7/e31306 %U https://doi.org/10.2196/31306 %U http://www.ncbi.nlm.nih.gov/pubmed/35605128 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 6 %P e35266 %T Enhancing COVID-19 Epidemic Forecasting Accuracy by Combining Real-time and Historical Data From Multiple Internet-Based Sources: Analysis of Social Media Data, Online News Articles, and Search Queries %A Li,Jingwei %A Huang,Wei %A Sia,Choon Ling %A Chen,Zhuo %A Wu,Tailai %A Wang,Qingnan %+ National Center for Applied Mathematics Shenzhen, No. 1088, Xueyuan Avenue, Nanshan District, Shenzhen, 518055, China, 86 15129077179, waynehuangwei@163.com %K SARS-CoV-2 %K COVID 19 %K epidemic forecasting %K disease surveillance %K infectious disease epidemiology %K social media %K online news %K search query %K autoregression model %D 2022 %7 16.6.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The SARS-CoV-2 virus and its variants pose extraordinary challenges for public health worldwide. Timely and accurate forecasting of the COVID-19 epidemic is key to sustaining interventions and policies and efficient resource allocation. Internet-based data sources have shown great potential to supplement traditional infectious disease surveillance, and the combination of different Internet-based data sources has shown greater power to enhance epidemic forecasting accuracy than using a single Internet-based data source. However, existing methods incorporating multiple Internet-based data sources only used real-time data from these sources as exogenous inputs but did not take all the historical data into account. Moreover, the predictive power of different Internet-based data sources in providing early warning for COVID-19 outbreaks has not been fully explored. 
Objective: The main aim of our study is to explore whether combining real-time and historical data from multiple Internet-based sources could improve the COVID-19 forecasting accuracy over the existing baseline models. A secondary aim is to explore the COVID-19 forecasting timeliness based on different Internet-based data sources. Methods: We first used core terms and symptom-related keyword-based methods to extract COVID-19–related Internet-based data from December 21, 2019, to February 29, 2020. The Internet-based data we explored included 90,493,912 online news articles, 37,401,900 microblogs, and all the Baidu search query data during that period. We then proposed an autoregressive model with exogenous inputs, incorporating real-time and historical data from multiple Internet-based sources. Our proposed model was compared with baseline models, and all the models were tested during the first wave of COVID-19 epidemics in Hubei province and the rest of mainland China separately. We also used lagged Pearson correlations for COVID-19 forecasting timeliness analysis. Results: Our proposed model achieved the highest accuracy in all 5 accuracy measures, compared with all the baseline models of both Hubei province and the rest of mainland China. In mainland China, except for Hubei, the COVID-19 epidemic forecasting accuracy differences between our proposed model (model i) and all the other baseline models were statistically significant (model 1, t198=–8.722, P<.001; model 2, t198=–5.000, P<.001, model 3, t198=–1.882, P=.06; model 4, t198=–4.644, P<.001; model 5, t198=–4.488, P<.001). In Hubei province, our proposed model's forecasting accuracy improved significantly compared with the baseline model using historical new confirmed COVID-19 case counts only (model 1, t198=–1.732, P=.09). Our results also showed that Internet-based sources could provide a 2- to 6-day earlier warning for COVID-19 outbreaks. 
Conclusions: Our approach incorporating real-time and historical data from multiple Internet-based sources could improve forecasting accuracy for epidemics of COVID-19 and its variants, which may help improve public health agencies' interventions and resource allocation in mitigating and controlling new waves of COVID-19 or other relevant epidemics. %M 35507921 %R 10.2196/35266 %U https://publichealth.jmir.org/2022/6/e35266 %U https://doi.org/10.2196/35266 %U http://www.ncbi.nlm.nih.gov/pubmed/35507921 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 6 %P e37491 %T The Distribution of HIV and AIDS Cases in Luzhou, China, From 2011 to 2020: Bayesian Spatiotemporal Analysis %A Ren,Ningjun %A Li,Yuansheng %A Wang,Ruolan %A Zhang,Wenxin %A Chen,Run %A Xiao,Ticheng %A Chen,Hang %A Li,Ailing %A Fan,Song %+ School of Public Health, Southwest Medical University, No.1, Section 1, Xianglin Road, Longmatan District, Luzhou, 646000, China, 86 8303175813, fansong@swmu.edu.cn %K HIV and AIDS %K reported incidence %K Bayesian model %K spatio-temporal distribution %D 2022 %7 14.6.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The vastly increasing number of reported HIV and AIDS cases in Luzhou, China, in recent years, coupled with the city’s unique geographical location at the intersection of 4 provinces, makes it particularly important to conduct a spatiotemporal analysis of HIV and AIDS cases. Objective: The aim of this study is to understand the spatiotemporal distribution of HIV and the factors influencing this distribution in Luzhou, China, from 2011 to 2020. Methods: Data on the incidence of HIV and AIDS in Luzhou from 2011 to 2020 were obtained from the AIDS Information Management System of the Luzhou Center for Disease Control and Prevention. ArcGIS was used to visualize the spatiotemporal distribution of HIV and AIDS cases. 
The Bayesian spatiotemporal model was used to investigate factors affecting the spatiotemporal distribution of HIV and AIDS, including the gross domestic product (GDP) per capita, urbanization rate, number of hospital beds, population density, and road mileage. Results: The reported incidence of HIV and AIDS rose from 8.50 cases per 100,000 population in 2011 to 49.25 cases per 100,000 population in 2020—an increase of 578.87%. In the first 5 years, hotspots were concentrated in Jiangyang district, Longmatan district, and Luxian county. After 2016, Luzhou’s high HIV incidence areas gradually shifted eastward, with Hejiang county having the highest average prevalence rate (41.68 cases per 100,000 population) from 2011 to 2020, being 2.28 times higher than that in Gulin county (18.30 cases per 100,000), where cold spots were concentrated. The risk for the incidence of HIV and AIDS was associated with the urbanization rate, population density, and GDP per capita. For every 1% increase in the urbanization rate, the relative risk (RR) increases by 1.3%, while an increase of 100 people per square kilometer would increase the RR by 8.7%; for every 1000 Yuan (US $148.12) increase in GDP per capita, the RR decreases by 1.5%. Conclusions: In Luzhou, current HIV and AIDS prevention and control efforts must be focused on the location of each district or county government; we suggest the region balance urban development and HIV and AIDS prevention. Moreover, more attention should be paid to economically disadvantaged areas. 
%M 35700022 %R 10.2196/37491 %U https://publichealth.jmir.org/2022/6/e37491 %U https://doi.org/10.2196/37491 %U http://www.ncbi.nlm.nih.gov/pubmed/35700022 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 6 %P e37377 %T Overlapping Delta and Omicron Outbreaks During the COVID-19 Pandemic: Dynamic Panel Data Estimates %A Lundberg,Alexander L %A Lorenzo-Redondo,Ramon %A Hultquist,Judd F %A Hawkins,Claudia A %A Ozer,Egon A %A Welch,Sarah B %A Prasad,P V Vara %A Achenbach,Chad J %A White,Janine I %A Oehmke,James F %A Murphy,Robert L %A Havey,Robert J %A Post,Lori A %+ Buehler Center for Health Policy and Economics, Robert J Havey, MD Institute for Global Health, Northwestern University, 750 N. Lake Shore Drive, Chicago, IL, 60611, United States, 1 3125031706, lori.post@northwestern.edu %K Omicron variant of concern %K Delta %K COVID-19 %K SARS-CoV-2 %K B.1.1.529 %K outbreak %K Arellano-Bond estimator %K dynamic panel data %K stringency index %K surveillance %K disease transmission metrics %D 2022 %7 3.6.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The Omicron variant of SARS-CoV-2 is more transmissible than prior variants of concern (VOCs). It has caused the largest outbreaks in the pandemic, with increases in mortality and hospitalizations. Early data on the spread of Omicron were captured in countries with relatively low case counts, so it was unclear how the arrival of Omicron would impact the trajectory of the pandemic in countries already experiencing high levels of community transmission of Delta. Objective: The objective of this study is to quantify and explain the impact of Omicron on pandemic trajectories and how they differ between countries that were or were not in a Delta outbreak at the time Omicron occurred. 
Methods: We used SARS-CoV-2 surveillance and genetic sequence data to classify countries into 2 groups: those that were in a Delta outbreak (defined by at least 10 novel daily transmissions per 100,000 population) when Omicron was first sequenced in the country and those that were not. We used trend analysis, survival curves, and dynamic panel regression models to compare outbreaks in the 2 groups over the period from November 1, 2021, to February 11, 2022. We summarized the outbreaks in terms of their peak rate of SARS-CoV-2 infections and the duration of time the outbreaks took to reach the peak rate. Results: Countries that were already in an outbreak with predominantly Delta lineages when Omicron arrived took longer to reach their peak rate and saw greater than a twofold increase (2.04) in the average apex of the Omicron outbreak compared to countries that were not yet in an outbreak. Conclusions: These results suggest that high community transmission of Delta at the time of the first detection of Omicron was not protective, but rather preceded larger outbreaks in those countries. Outbreak status may reflect a generally susceptible population, due to overlapping factors, including climate, policy, and individual behavior. In the absence of strong mitigation measures, arrival of a new, more transmissible variant in these countries is therefore more likely to lead to larger outbreaks. Alternatively, countries with enhanced surveillance programs and incentives may be more likely to both be in outbreak status and detect more cases during an outbreak, resulting in a spurious relationship. Either way, these data argue against herd immunity mitigating future outbreaks with variants that have undergone significant antigenic shifts. 
%M 35500140 %R 10.2196/37377 %U https://publichealth.jmir.org/2022/6/e37377 %U https://doi.org/10.2196/37377 %U http://www.ncbi.nlm.nih.gov/pubmed/35500140 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 6 %P e34296 %T Estimating COVID-19 Hospitalizations in the United States With Surveillance Data Using a Bayesian Hierarchical Model: Modeling Study %A Couture,Alexia %A Iuliano,A Danielle %A Chang,Howard H %A Patel,Neha N %A Gilmer,Matthew %A Steele,Molly %A Havers,Fiona P %A Whitaker,Michael %A Reed,Carrie %+ Centers for Disease Control and Prevention, 1600 Clifton Road, Atlanta, GA, 30329, United States, 1 4044985984, njh6@cdc.gov %K COVID-19 %K SARS-CoV-2 %K hospitalization %K Bayesian %K COVID-NET %K extrapolation %K hospital %K estimation %K prediction %K United States %K surveillance %K data %K model %K modeling %K hierarchical %K rate %K novel %K framework %K monitoring %D 2022 %7 2.6.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: In the United States, COVID-19 is a nationally notifiable disease, meaning cases and hospitalizations are reported by states to the Centers for Disease Control and Prevention (CDC). Identifying and reporting every case from every facility in the United States may not be feasible in the long term. Creating sustainable methods for estimating the burden of COVID-19 from established sentinel surveillance systems is becoming more important. Objective: We aimed to provide a method leveraging surveillance data to create a long-term solution to estimate monthly rates of hospitalizations for COVID-19. Methods: We estimated monthly hospitalization rates for COVID-19 from May 2020 through April 2021 for the 50 states using surveillance data from the COVID-19-Associated Hospitalization Surveillance Network (COVID-NET) and a Bayesian hierarchical model for extrapolation. 
Hospitalization rates were calculated from patients hospitalized with a lab-confirmed SARS-CoV-2 test during or within 14 days before admission. We created a model for 6 age groups (0-17, 18-49, 50-64, 65-74, 75-84, and ≥85 years) separately. We identified covariates from multiple data sources that varied by age, state, and month and performed covariate selection for each age group based on 2 methods, Least Absolute Shrinkage and Selection Operator (LASSO) and spike and slab selection methods. We validated our method by checking the sensitivity of model estimates to covariate selection and model extrapolation as well as comparing our results to external data. Results: We estimated 3,583,100 (90% credible interval [CrI] 3,250,500-3,945,400) hospitalizations for a cumulative incidence of 1093.9 (992.4-1204.6) hospitalizations per 100,000 population with COVID-19 in the United States from May 2020 through April 2021. Cumulative incidence varied from 359 to 1856 per 100,000 between states. The age group with the highest cumulative incidence was those aged ≥85 years (5575.6; 90% CrI 5066.4-6133.7). The monthly hospitalization rate was highest in December (183.7; 90% CrI 154.3-217.4). Our monthly estimates by state showed variations in magnitudes of peak rates, number of peaks, and timing of peaks between states. Conclusions: Our novel approach to estimate hospitalizations for COVID-19 has potential to provide sustainable estimates for monitoring COVID-19 burden as well as a flexible framework leveraging surveillance data. 
%M 35452402 %R 10.2196/34296 %U https://publichealth.jmir.org/2022/6/e34296 %U https://doi.org/10.2196/34296 %U http://www.ncbi.nlm.nih.gov/pubmed/35452402 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 11 %N 5 %P e36261 %T Lessening Organ Dysfunction With Vitamin C (LOVIT) Trial: Statistical Analysis Plan %A Adhikari,Neill KJ %A Pinto,Ruxandra %A Day,Andrew G %A Masse,Marie-Hélène %A Ménard,Julie %A Sprague,Sheila %A Annane,Djillali %A Arabi,Yaseen M %A Battista,Marie-Claude %A Cohen,Dian %A Cook,Deborah J %A Guyatt,Gordon H %A Heyland,Daren K %A Kanji,Salmaan %A McGuinness,Shay P %A Parke,Rachael L %A Tirupakuzhi Vijayaraghavan,Bharath Kumar %A Charbonney,Emmanuel %A Chassé,Michaël %A Del Sorbo,Lorenzo %A Kutsogiannis,Demetrios James %A Lauzier,François %A Leblanc,Rémi %A Maslove,David M %A Mehta,Sangeeta %A Mekontso Dessap,Armand %A Mele,Tina S %A Rochwerg,Bram %A Rewa,Oleksa G %A Shahin,Jason %A Twardowski,Pawel %A Young,Paul Jeffrey %A Lamontagne,François %+ Department of Critical Care Medicine, Sunnybrook Health Sciences Centre, 2075 Bayview Avenue, Room D1.08, Toronto, ON, M4N 3M5, Canada, 1 4164804522, neill.adhikari@utoronto.ca %K sepsis %K vitamin C %K statistical analysis %K organ %K ascorbic acid %K critical care %K organ dysfunction %K intensive care unit %K intensive care %K patient %K vasopressor %K infection %K intravenous %K health data %K trial database %K patient outcome %K mortality %K statistical framework %K binomial distribution %D 2022 %7 20.5.2022 %9 Protocol %J JMIR Res Protoc %G English %X Background: The LOVIT (Lessening Organ Dysfunction with Vitamin C) trial is a blinded multicenter randomized clinical trial comparing high-dose intravenous vitamin C to placebo in patients admitted to the intensive care unit with proven or suspected infection as the main diagnosis and receiving a vasopressor. 
Objective: We aim to describe a prespecified statistical analysis plan (SAP) for the LOVIT trial prior to unblinding and locking of the trial database. Methods: The SAP was designed by the LOVIT principal investigators and statisticians, and approved by the steering committee and coinvestigators. The SAP defines the primary and secondary outcomes, and describes the planned primary, secondary, and subgroup analyses. Results: The SAP includes a draft participant flow diagram, tables, and planned figures. The primary outcome is a composite of mortality and persistent organ dysfunction (receipt of mechanical ventilation, vasopressors, or new renal replacement therapy) at 28 days, where day 1 is the day of randomization. All analyses will use a frequentist statistical framework. The analysis of the primary outcome will estimate the risk ratio and 95% CI in a generalized linear mixed model with binomial distribution and log link, with site as a random effect. We will perform a secondary analysis adjusting for prespecified baseline clinical variables. Subgroup analyses will include age, sex, frailty, severity of illness, Sepsis-3 definition of septic shock, baseline ascorbic acid level, and COVID-19 status. Conclusions: We have developed an SAP for the LOVIT trial and will adhere to it in the analysis phase. 
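Illustrative sketch: the primary analysis described above estimates a risk ratio with a 95% CI from a binomial model with a log link. The Python below shows only the unadjusted analogue of that quantity, using the standard log-scale Wald interval. It omits the site random effect and any covariates, so it is not the SAP's model. The function name and the counts in the usage line are hypothetical.

```python
import math


def risk_ratio(events_trt, n_trt, events_ctl, n_ctl, z=1.959964):
    """Unadjusted risk ratio with a Wald CI computed on the log scale.

    Simplification (assumption): no random effect for site and no
    covariate adjustment, unlike the mixed model named in the abstract.
    """
    risk_trt = events_trt / n_trt
    risk_ctl = events_ctl / n_ctl
    rr = risk_trt / risk_ctl
    # Standard error of log(RR) for two independent binomial samples.
    se_log = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts, for illustration only (not trial data).
rr, lower, upper = risk_ratio(50, 100, 40, 100)
print(f"RR={rr:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

A log link is what makes the exponentiated coefficient a risk ratio rather than an odds ratio, which is why the interval here is built on the log scale and then exponentiated.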
International Registered Report Identifier (IRRID): DERR1-10.2196/36261 %M 35420994 %R 10.2196/36261 %U https://www.researchprotocols.org/2022/5/e36261 %U https://doi.org/10.2196/36261 %U http://www.ncbi.nlm.nih.gov/pubmed/35420994 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 5 %P e29343 %T Factors Associated With COVID-19 Death in the United States: Cohort Study %A Chen,Uan-I %A Xu,Hua %A Krause,Trudy Millard %A Greenberg,Raymond %A Dong,Xiao %A Jiang,Xiaoqian %+ School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St., Houston, TX, 77030, United States, 1 713 500 3930, xiaoqian.jiang@uth.tmc.edu %K COVID-19 %K risk factors %K survival analysis %K cohort studies %K EHR data %D 2022 %7 12.5.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Since the initial COVID-19 cases were identified in the United States in February 2020, the United States has experienced a high incidence of the disease. Understanding the risk factors for severe outcomes identifies the most vulnerable populations and helps in decision-making. Objective: This study aims to assess the factors associated with COVID-19–related deaths from a large, national, individual-level data set. Methods: A cohort study was conducted using data from the Optum de-identified COVID-19 electronic health record (EHR) data set; 1,271,033 adult participants were observed from February 1, 2020, to August 31, 2020, until their deaths due to COVID-19, deaths due to other reasons, or the end of the study. Cox proportional hazards models were constructed to evaluate the risks for each patient characteristic. Results: A total of 1,271,033 participants (age: mean 52.6, SD 17.9 years; male: 507,574/1,271,033, 39.93%) were included in the study, and 3315 (0.26%) deaths were attributed to COVID-19. 
Factors associated with COVID-19–related death included older age (≥80 vs 50-59 years old: hazard ratio [HR] 13.28, 95% CI 11.46-15.39), male sex (HR 1.68, 95% CI 1.57-1.80), obesity (BMI ≥40 vs <30 kg/m2: HR 1.71, 95% CI 1.50-1.96), race (Hispanic White, African American, Asian vs non-Hispanic White: HR 2.46, 95% CI 2.01-3.02; HR 2.27, 95% CI 2.06-2.50; HR 2.06, 95% CI 1.65-2.57), region (South, Northeast, Midwest vs West: HR 1.62, 95% CI 1.33-1.98; HR 2.50, 95% CI 2.06-3.03; HR 1.35, 95% CI 1.11-1.64), chronic respiratory disease (HR 1.21, 95% CI 1.12-1.32), cardiac disease (HR 1.10, 95% CI 1.01-1.19), diabetes (HR 1.92, 95% CI 1.75-2.10), recent diagnosis of lung cancer (HR 1.70, 95% CI 1.14-2.55), severely reduced kidney function (HR 1.92, 95% CI 1.69-2.19), stroke or dementia (HR 1.25, 95% CI 1.15-1.36), other neurological diseases (HR 1.77, 95% CI 1.59-1.98), organ transplant (HR 1.35, 95% CI 1.09-1.67), and other immunosuppressive conditions (HR 1.21, 95% CI 1.01-1.46). Conclusions: This is one of the largest national cohort studies in the United States; we identified several patient characteristics associated with COVID-19–related deaths, and the results can serve as the basis for policy making. The study also offered directions for future studies, including the effect of other socioeconomic factors on the increased risk for minority groups. 
%M 35377319 %R 10.2196/29343 %U https://publichealth.jmir.org/2022/5/e29343 %U https://doi.org/10.2196/29343 %U http://www.ncbi.nlm.nih.gov/pubmed/35377319 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 10 %N 5 %P e35422 %T Clustering Diagnoses From 58 Million Patient Visits in Finland Between 2015 and 2018 %A Fränti,Pasi %A Sieranoja,Sami %A Wikström,Katja %A Laatikainen,Tiina %+ Machine Learning Group, School of Computing, University of Eastern Finland, Box 111, Joensuu, 80101, Finland, 358 405929966, samisi@cs.uef.fi %K multimorbidity %K cluster analysis %K disease co-occurrence %K multimorbidity network %K health care data analysis %K graph clustering %K k-means %K data analysis %K cluster %K machine learning %K comorbidity %K register %K big data %K Finland %K Europe %K health record %D 2022 %7 4.5.2022 %9 Original Paper %J JMIR Med Inform %G English %X Background: Multiple chronic diseases in patients are a major burden on the health service system. Currently, diseases are mostly treated separately without paying sufficient attention to their relationships, which results in the fragmentation of the care process. The better integration of services can lead to the more effective organization of the overall health care system. Objective: This study aimed to analyze the connections between diseases based on their co-occurrences to support decision-makers in better organizing health care services. Methods: We performed a cluster analysis of diagnoses by using data from the Finnish Health Care Registers for primary and specialized health care visits and inpatient care. The target population of this study comprised those 3.8 million individuals (3,835,531/5,487,308, 69.90% of the whole population) aged ≥18 years who used health care services from the years 2015 to 2018. They had a total of 58 million visits. Clustering was performed based on the co-occurrence of diagnoses. 
The more the same pair of diagnoses appeared in the records of the same patients, the more the diagnoses correlated with each other. On the basis of the co-occurrences, we calculated the relative risk of each pair of diagnoses and clustered the data by using a graph-based clustering algorithm called the M-algorithm—a variant of k-means. Results: The results revealed multimorbidity clusters, of which some were expected (eg, one representing hypertensive and cardiovascular diseases). Other clusters were more unexpected, such as the cluster containing lower respiratory tract diseases and systemic connective tissue disorders. The annual cost of all clusters was €10.0 billion, and the costliest cluster was cardiovascular and metabolic problems, costing €2.3 billion. Conclusions: The method and the achieved results provide new insights into identifying key multimorbidity groups, especially those resulting in burden and costs in health care services. %M 35507390 %R 10.2196/35422 %U https://medinform.jmir.org/2022/5/e35422 %U https://doi.org/10.2196/35422 %U http://www.ncbi.nlm.nih.gov/pubmed/35507390 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 4 %P e36022 %T The Impact of COVID-19 on Mortality in Italy: Retrospective Analysis of Epidemiological Trends %A Rovetta,Alessandro %A Bhagavathula,Akshaya Srikanth %+ R & C Research, via Brede T2, Bovezzo (Brescia), 25073, Italy, 39 3927112808, rovetta.mresearch@gmail.com %K COVID-19 %K deniers %K excess deaths %K epidemiology %K infodemic %K infodemiology %K Italy %K longitudinal analysis %K mortality %K time series %K pandemic %K public health %D 2022 %7 7.4.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Despite the available evidence on its severity, COVID-19 has often been compared with seasonal flu by some conspirators and even scientists. 
Various public discussions arose about the noncausal correlation between COVID-19 and the observed deaths during the pandemic period in Italy. Objective: This paper aimed to search for endogenous reasons for the mortality increase recorded in Italy during 2020 to test this controversial hypothesis. Furthermore, we provide a framework for epidemiological analyses of time series. Methods: We analyzed deaths by age, sex, region, and cause of death in Italy from 2011 to 2019. Ordinary least squares (OLS) linear regression analyses and autoregressive integrated moving average (ARIMA) were used to predict the best value for 2020. A Grubbs 1-sided test was used to assess the significance of the difference between predicted and observed 2020 deaths/mortality. Finally, a 1-sample t test was used to compare the population of regional excess deaths to a null mean. The relationship between mortality and predictive variables was assessed using OLS multiple regression models. Since there is no uniform opinion on multicomparison adjustment and false negatives imply great epidemiological risk, the less-conservative Siegel approach and more-conservative Holm-Bonferroni approach were employed. By doing so, we provided the reader with the means to carry out an independent analysis. Results: Both ARIMA and OLS linear regression models predicted the number of deaths in Italy during 2020 to be between 640,000 and 660,000 (range of 95% CIs: 620,000-695,000) against the observed value of above 750,000. We found strong evidence supporting that the death increase in all regions (average excess=12.2%) was not due to chance (t21=7.2; adjusted P<.001). Male and female national mortality excesses were 18.4% (P<.001; adjusted P=.006) and 14.1% (P=.005; adjusted P=.12), respectively. However, we found limited significance when comparing male and female mortality residuals using the Mann-Whitney U test (P=.27; adjusted P=.99). 
Finally, mortality was strongly and positively correlated with latitude (R=0.82; adjusted P<.001). In this regard, the significance of the mortality increases during 2020 varied greatly from region to region. Lombardy recorded the highest mortality increase (38% for men, adjusted P<.001; 31% for women, P<.001; adjusted P=.006). Conclusions: Our findings support the absence of historical endogenous reasons capable of justifying the mortality increase observed in Italy during 2020. Together with the current knowledge on SARS-CoV-2, these results provide decisive evidence on the devastating impact of COVID-19. We suggest that this research be leveraged by government, health, and information authorities to furnish proof against conspiracy hypotheses that minimize COVID-19–related risks. Finally, given the marked concordance between ARIMA and OLS regression, we suggest that these models be exploited for public health surveillance. Specifically, meaningful information can be deduced by comparing predicted and observed epidemiological trends. 
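Illustrative sketch: the comparison of predicted and observed deaths described above can be reduced to a few lines. The Python below fits an OLS linear trend to annual death counts, extrapolates one year, and reports the excess. It covers only the OLS part (no ARIMA, no significance test). The function names and the annual counts are hypothetical placeholders, not the Italian data.

```python
def ols_line(xs, ys):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def excess_deaths(years, deaths, target_year, observed):
    """Extrapolate the fitted trend to target_year and compare with observed.

    Returns (expected, excess, excess as a percentage of expected).
    """
    slope, intercept = ols_line(years, deaths)
    expected = intercept + slope * target_year
    excess = observed - expected
    return expected, excess, 100 * excess / expected


# Hypothetical series with a perfectly linear trend (illustration only).
years = list(range(2011, 2020))
deaths = [600_000 + 5_000 * (y - 2011) for y in years]
expected, excess, pct = excess_deaths(years, deaths, 2020, 750_000)
print(f"expected={expected:.0f}, excess={excess:.0f} ({pct:.1f}%)")
```

Expressing the excess relative to the expected value, not the observed one, keeps the percentage comparable across regions with different baseline mortality.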
%M 35238784 %R 10.2196/36022 %U https://publichealth.jmir.org/2022/4/e36022 %U https://doi.org/10.2196/36022 %U http://www.ncbi.nlm.nih.gov/pubmed/35238784 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 3 %P e30032 %T Subphenotyping of Mexican Patients With COVID-19 at Preadmission To Anticipate Severity Stratification: Age-Sex Unbiased Meta-Clustering Technique %A Zhou,Lexin %A Romero-García,Nekane %A Martínez-Miranda,Juan %A Conejero,J Alberto %A García-Gómez,Juan M %A Sáez,Carlos %+ Biomedical Data Science Lab, Instituto Universitario de Tecnologías de la Información y Comunicaciones, Universitat Politècnica de València, Camino de Vera s/n, Valencia, 46022, Spain, 34 963877000 ext 75278, carsaesi@upv.es %K COVID-19 %K subphenotypes %K clustering %K characterization %K observational %K epidemiology %K Mexico %D 2022 %7 30.3.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The COVID-19 pandemic has led to an unprecedented global health care challenge for both medical institutions and researchers. Recognizing different COVID-19 subphenotypes—the division of populations of patients into more meaningful subgroups driven by clinical features—and their severity characterization may assist clinicians during the clinical course, the vaccination process, research efforts, the surveillance system, and the allocation of limited resources. Objective: We aimed to discover age-sex unbiased COVID-19 patient subphenotypes based on easily available phenotypical data before admission, such as pre-existing comorbidities, lifestyle habits, and demographic features, to study the potential early severity stratification capabilities of the discovered subgroups through characterizing their severity patterns, including prognostic, intensive care unit (ICU), and morbimortality outcomes. Methods: We used the Mexican Government COVID-19 open data, including 778,692 SARS-CoV-2 population-based patient-level data as of September 2020. 
We applied a meta-clustering technique that consists of a 2-stage clustering approach combining dimensionality reduction (ie, principal components analysis and multiple correspondence analysis) and hierarchical clustering using the Ward minimum variance method with Euclidean squared distance. Results: In the independent age-sex clustering analyses, 56 clusters supported 11 clinically distinguishable meta-clusters (MCs). MCs 1-3 showed high recovery rates (90.27%-95.22%), including healthy patients of all ages, children with comorbidities and priority in receiving medical resources (ie, higher rates of hospitalization, intubation, and ICU admission) compared with other adult subgroups that have similar conditions, and young obese smokers. MCs 4-5 showed moderate recovery rates (81.30%-82.81%), including patients with hypertension or diabetes of all ages and obese patients with pneumonia, hypertension, and diabetes. MCs 6-11 showed low recovery rates (53.96%-66.94%), including immunosuppressed patients with high comorbidity rates, patients with chronic kidney disease with a poor survival length and probability of recovery, older smokers with chronic obstructive pulmonary disease, older adults with severe diabetes and hypertension, and the oldest obese smokers with chronic obstructive pulmonary disease and mild cardiovascular disease. Group outcomes conformed to the recent literature on dedicated age-sex groups. Mexican states and several types of clinical institutions showed relevant heterogeneity regarding severity, potentially linked to socioeconomic or health inequalities. Conclusions: The proposed 2-stage cluster analysis methodology produced a discriminative characterization of the sample and explainability over age and sex. 
These results can potentially help in understanding the clinical patient and their stratification for automated early triage before further tests and laboratory results are available and even in locations where additional tests are not available or to help decide resource allocation among vulnerable subgroups such as to prioritize vaccination or treatments. %M 35144239 %R 10.2196/30032 %U https://publichealth.jmir.org/2022/3/e30032 %U https://doi.org/10.2196/30032 %U http://www.ncbi.nlm.nih.gov/pubmed/35144239 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 3 %P e33191 %T The Disease and Economic Burdens of Esophageal Cancer in China from 2013 to 2030: Dynamic Cohort Modeling Study %A Li,Yuanyuan %A Xu,Junfang %A Gu,Yuxuan %A Sun,Xueshan %A Dong,Hengjin %A Chen,Changgui %+ General Practice, Hangzhou Ninth People’s Hospital, Yipeng Rd 98, Qiantang District, Hangzhou, 311225, China, 86 13757119185, generalpractice@163.com %K esophageal cancer %K disease burden %K disability-adjusted life year %K economic burden %D 2022 %7 2.3.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Esophageal cancer (EC) is the sixth leading cause of tumor-related deaths worldwide. Estimates of the EC burden are necessary and could offer evidence-based suggestions for local cancer control. Objective: The aim of this study was to predict the disease burden of EC in China through the estimation of disability-adjusted life years (DALYs) and direct medical expenditure by sex from 2013 to 2030. Methods: A dynamic cohort Markov model was developed to simulate EC prevalence, DALYs, and direct medical expenditure by sex. Input data were collected from the China Statistical Yearbooks, Statistical Report of China Children’s Development, World Population Prospects 2019, and published papers. 
The JoinPoint Regression Program was used to calculate the average annual percentage change (AAPC) of DALY rates, whereas the average annual growth rate (AAGR) was applied to analyze the changing direct medical expenditure trend over time. Results: From 2013 to 2030, EC prevalence is projected to increase from 61.0 to 64.5 per 100,000 people, with annual EC cases increasing by 11.5% (from 835,600 to 931,800). The DALYs will increase by 21.3% (from 30,034,000 to 36,444,000), and the years of life lost (YLL) will account for over 90% of the DALYs. The DALY rates per 100,000 people will increase from 219.2 to 252.3; however, there was a difference between sexes, with an increase from 302.9 to 384.3 in males and a decline from 131.2 to 115.9 in females. The AAPC was 0.8% (95% CI 0.8% to 0.9%), 1.4% (95% CI 1.3% to 1.5%), and –0.7% (95% CI –0.8% to –0.7%) for both sexes, males, and females, respectively. The direct medical expenditure will increase by 128.7% (from US $33.4 to US $76.4 billion), with an AAGR of 5.0%. The direct medical expenditure is 2-3 times higher in males than in females. Conclusions: EC still causes severe disease and economic burdens. YLL are responsible for the majority of DALYs, which highlights an urgent need to establish a beneficial policy to reduce the EC burden. 
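Illustrative sketch: the percentage increases reported above follow from the endpoint values, and the Python below recomputes them. The growth-rate helper uses a constant compound annual rate over the 17 years from 2013 to 2030. This is an assumption made for a consistency check, since the abstract does not say how the AAGR was derived and it may instead be a mean of year-on-year rates. Function names are hypothetical.

```python
def percent_change(start, end):
    """Total change from start to end, as a percentage of start."""
    return 100 * (end - start) / start


def compound_annual_rate(start, end, years):
    """Constant yearly growth rate (percent) that takes start to end.

    Assumption: compounding at a fixed rate, used only as a sanity check.
    """
    return 100 * ((end / start) ** (1 / years) - 1)


# Endpoint values quoted in the abstract (2013 and 2030).
print(round(percent_change(835_600, 931_800), 1))        # annual EC cases
print(round(percent_change(30_034_000, 36_444_000), 1))  # DALYs
print(round(percent_change(33.4, 76.4), 1))              # expenditure, US $ billion
print(round(compound_annual_rate(33.4, 76.4, 2030 - 2013), 1))
```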
%M 34963658 %R 10.2196/33191 %U https://publichealth.jmir.org/2022/3/e33191 %U https://doi.org/10.2196/33191 %U http://www.ncbi.nlm.nih.gov/pubmed/34963658 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 24 %N 2 %P e27146 %T Age- and Sex-Specific Differences in Multimorbidity Patterns and Temporal Trends on Assessing Hospital Discharge Records in Southwest China: Network-Based Study %A Wang,Liya %A Qiu,Hang %A Luo,Li %A Zhou,Li %+ School of Computer Science and Engineering, University of Electronic Science and Technology of China, No.2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu, 611731, China, 86 28 61830278, qiuhang@uestc.edu.cn %K multimorbidity pattern %K temporal trend %K network analysis %K multimorbidity prevalence %K administrative data %K longitudinal study %K regional research %D 2022 %7 25.2.2022 %9 Original Paper %J J Med Internet Res %G English %X Background: Multimorbidity represents a global health challenge, which requires a more global understanding of multimorbidity patterns and trends. However, the majority of studies completed to date have often relied on self-reported conditions, and a simultaneous assessment of the entire spectrum of chronic disease co-occurrence, especially in developing regions, has not yet been performed. Objective: We attempted to provide a multidimensional approach to understand the full spectrum of chronic disease co-occurrence among general inpatients in southwest China, in order to investigate multimorbidity patterns and temporal trends, and assess their age and sex differences. Methods: We conducted a retrospective cohort analysis based on 8.8 million hospital discharge records of about 5.0 million individuals of all ages from 2015 to 2019 in a megacity in southwest China. 
We examined all chronic diagnoses using the ICD-10 (International Classification of Diseases, 10th revision) codes at 3 digits and focused on chronic diseases with ≥1% prevalence for each of the age and sex strata, which resulted in a total of 149 and 145 chronic diseases in males and females, respectively. We constructed multimorbidity networks in the general population based on sex and age, and used the cosine index to measure the co-occurrence of chronic diseases. Then, we divided the networks into communities and assessed their temporal trends. Results: The results showed complex interactions among chronic diseases, with more intensive connections among males and inpatients ≥40 years old. A total of 9 chronic diseases were simultaneously classified as central diseases, hubs, and bursts in the multimorbidity networks. Among them, 5 diseases were common to both males and females, including hypertension, chronic ischemic heart disease, cerebral infarction, other cerebrovascular diseases, and atherosclerosis. The earliest leaps (degree leaps ≥6) appeared at a disorder of glycoprotein metabolism that happened at 25-29 years in males, about 15 years earlier than in females. The number of chronic diseases in the community increased over time, but the new entrants did not replace the root of the community. Conclusions: Our multimorbidity network analysis identified specific differences in the co-occurrence of chronic diagnoses by sex and age, which could help in the design of clinical interventions for inpatient multimorbidity. 
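Illustrative sketch: the co-occurrence measure named above can be computed directly from per-patient diagnosis sets. The Python below assumes the cosine index takes the usual Salton form, c_ij / sqrt(n_i * n_j), where c_ij is the number of patients with both diseases and n_i, n_j are the numbers with each disease. The abstract does not spell out the formula, so this form is an assumption. The function name and the toy records are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_cooccurrence(patients):
    """Map each pair of codes to its cosine index.

    patients: iterable of collections of 3-character ICD-10 codes,
    one collection per patient.
    """
    prevalence = Counter()   # n_i: patients carrying code i
    pair_counts = Counter()  # c_ij: patients carrying both i and j
    for codes in patients:
        unique = sorted(set(codes))  # de-duplicate, fix the pair order
        prevalence.update(unique)
        pair_counts.update(combinations(unique, 2))
    return {
        (a, b): count / math.sqrt(prevalence[a] * prevalence[b])
        for (a, b), count in pair_counts.items()
    }


# Toy records (hypothetical): I10 hypertension, I25 chronic ischemic
# heart disease, I63 cerebral infarction.
records = [{"I10", "I25"}, {"I10", "I25"}, {"I10", "I63"}, {"I10"}]
for pair, value in sorted(cosine_cooccurrence(records).items()):
    print(pair, round(value, 3))
```

The resulting values can serve as edge weights of a multimorbidity network, with diseases as nodes.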
%M 35212632 %R 10.2196/27146 %U https://www.jmir.org/2022/2/e27146 %U https://doi.org/10.2196/27146 %U http://www.ncbi.nlm.nih.gov/pubmed/35212632 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 2 %P e28737 %T COVID-19 Surveillance Updates in US Metropolitan Areas: Dynamic Panel Data Modeling %A Oehmke,Theresa B %A Moss,Charles B %A Oehmke,James F %+ Department of Civil and Environmental Engineering, University of California, Berkeley, 202 O'Brien Hall, Berkeley, CA, 94720, United States, 1 5108986406, toehmke@berkeley.edu %K surveillance system %K COVID-19 %K coronavirus %K Sars-CoV-2 %K Houston %K dynamic panel data model %K speed %K jerk %K acceleration %K 7-Day persistence %K modeling %K data %K surveillance %K monitoring %K public health %K United States %K transmission %K response %D 2022 %7 24.2.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Despite the availability of vaccines, the US incidence of new COVID-19 cases per day nearly doubled from the beginning of July to the end of August 2021, fueled largely by the rapid spread of the Delta variant. While the “Delta wave” appears to have peaked nationally, some states and municipalities continue to see elevated numbers of new cases. Vigilant surveillance including at a metropolitan level can help identify any reignition and validate continued and strong public health policy responses in problem localities. Objective: This surveillance report aimed to provide up-to-date information for the 25 largest US metropolitan areas about the rapidity of descent in the number of new cases following the Delta wave peak, as well as any potential reignition of the pandemic associated with declining vaccine effectiveness over time, new variants, or other factors. 
Methods: COVID-19 pandemic dynamics for the 25 largest US metropolitan areas were analyzed through September 19, 2021, using novel metrics of speed, acceleration, jerk, and 7-day persistence, calculated from the observed data on the cumulative number of cases as reported by USAFacts. Statistical analysis was conducted using dynamic panel data models estimated with the Arellano-Bond regression techniques. The results are presented in tabular and graphic forms for visual interpretation. Results: On average, speed in the 25 largest US metropolitan areas declined from 34 new cases per day per 100,000 population, during the week ending August 15, 2021, to 29 new cases per day per 100,000 population, during the week ending September 19, 2021. This average masks important differences across metropolitan areas. For example, Miami’s speed decreased from 105 for the week ending August 15, 2021, to 40 for the week ending September 19, 2021. Los Angeles, San Francisco, Riverside, and San Diego had decreasing speed over the sample period and ended with single-digit speeds for the week ending September 19, 2021. However, Boston, Washington DC, Detroit, Minneapolis, Denver, and Charlotte all had their highest speed of the sample during the week ending September 19, 2021. These cities, as well as Houston and Baltimore, had positive acceleration for the week ending September 19, 2021. Conclusions: There is great variation in epidemiological curves across US metropolitan areas, including increasing numbers of new cases in 8 of the largest 25 metropolitan areas for the week ending September 19, 2021. These trends, including the possibility of waning vaccine effectiveness and the emergence of resistant variants, strongly indicate the need for continued surveillance and perhaps a return to more restrictive public health guidelines for some areas. 
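Illustrative sketch: speed, acceleration, and jerk can be derived from a cumulative case series by successive differencing. The Python below assumes simplified definitions: speed is the weekly average of new cases per day per 100,000 population, acceleration is the week-over-week change in speed, and jerk is the week-over-week change in acceleration. The published metrics and the Arellano-Bond persistence estimate may be defined differently. Function names and data are hypothetical.

```python
def weekly_speed(cumulative, population):
    """Average new cases per day per 100,000, one value per full week.

    cumulative: daily cumulative case counts, oldest first.
    """
    daily_new = [b - a for a, b in zip(cumulative, cumulative[1:])]
    speeds = []
    # Step through complete 7-day windows only; a trailing partial
    # week is ignored.
    for start in range(0, len(daily_new) - 6, 7):
        week = daily_new[start:start + 7]
        speeds.append(sum(week) / 7 / population * 100_000)
    return speeds


def differences(series):
    """First differences between consecutive values."""
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical metro area of 1,000,000 people over three weeks.
cumulative = [0]
for new_cases in [100] * 7 + [200] * 7 + [400] * 7:
    cumulative.append(cumulative[-1] + new_cases)

speed = weekly_speed(cumulative, 1_000_000)
acceleration = differences(speed)
jerk = differences(acceleration)
print(speed, acceleration, jerk)
```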
%M 34882569 %R 10.2196/28737 %U https://publichealth.jmir.org/2022/2/e28737 %U https://doi.org/10.2196/28737 %U http://www.ncbi.nlm.nih.gov/pubmed/34882569 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 2 %P e32426 %T The Use of Cremation Data for Timely Mortality Surveillance During the COVID-19 Pandemic in Ontario, Canada: Validation Study %A Postill,Gemma %A Murray,Regan %A Wilton,Andrew S %A Wells,Richard A %A Sirbu,Renee %A Daley,Mark J %A Rosella,Laura %+ Epidemiology Division, Dalla Lana School of Public Health, 155 College Street, Suite 600, Toronto, ON, M5T 3M7, Canada, 1 416 978 0901, laura.rosella@utoronto.ca %K excess deaths %K real-time mortality %K cremation %K COVID-19 %K SARS-CoV-2 %K mortality %K estimate %K impact %K public health %K validation %K pattern %K trend %K utility %K Canada %K mortality data %K pandemic %K death %K cremation data %K cause of death %K vital statistics %K excess mortality %D 2022 %7 21.2.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Early estimates of excess mortality are crucial for understanding the impact of COVID-19. However, there is a lag of several months in the reporting of vital statistics mortality data for many jurisdictions, including across Canada. In Ontario, a Canadian province, certification by a coroner is required before cremation can occur, creating real-time mortality data that encompasses the majority of deaths within the province. Objective: This study aimed to validate the use of cremation data as a timely surveillance tool for all-cause mortality during a public health emergency in a jurisdiction with delays in vital statistics data. Specifically, this study aimed to validate this surveillance tool by determining the stability, timeliness, and robustness of its real-time estimation of all-cause mortality. 
Methods: Cremation records from January 2020 until April 2021 were compared to the historical records from 2017 to 2019, grouped according to week, age, sex, and whether COVID-19 was the cause of death. Cremation data were compared to Ontario’s provisional vital statistics mortality data released by Statistics Canada. The 2020 and 2021 records were then compared to previous years (2017-2019) to determine whether there was excess mortality within various age groups and whether deaths attributed to COVID-19 accounted for the entirety of the excess mortality. Results: Between 2017 and 2019, cremations were performed for 67.4% (95% CI 67.3%-67.5%) of deaths. The proportion of cremated deaths remained stable throughout 2020, even within age and sex categories. Cremation records are 99% complete within 3 weeks of the date of death, which precedes the compilation of vital statistics data by several months. Consequently, during the first wave (from April to June 2020), cremation records detected a 16.9% increase (95% CI 14.6%-19.3%) in all-cause mortality, a finding that was confirmed several months later with vital statistics data. Conclusions: The percentage of Ontarians cremated and the completion of cremation data several months before vital statistics did not change meaningfully during the COVID-19 pandemic period, establishing that the pandemic did not significantly alter cremation practices. Cremation data can be used to accurately estimate all-cause mortality in near real-time, particularly when real-time mortality estimates are needed to inform policy decisions for public health measures. The accuracy of this excess mortality estimation was confirmed by comparing it with official vital statistics data. These findings demonstrate the utility of cremation data as a complementary data source for timely mortality information during public health emergencies. 
%M 35038302 %R 10.2196/32426 %U https://publichealth.jmir.org/2022/2/e32426 %U https://doi.org/10.2196/32426 %U http://www.ncbi.nlm.nih.gov/pubmed/35038302 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 1 %P e35763 %T Has Omicron Changed the Evolution of the Pandemic? %A Lundberg,Alexander L %A Lorenzo-Redondo,Ramon %A Ozer,Egon A %A Hawkins,Claudia A %A Hultquist,Judd F %A Welch,Sarah B %A Prasad,PV Vara %A Oehmke,James F %A Achenbach,Chad J %A Murphy,Robert L %A White,Janine I %A Havey,Robert J %A Post,Lori Ann %+ Buehler Center for Health Policy and Economics, Robert J. Havey, MD Institute for Global Health, Northwestern University, 750 N. Lake Shore Drive, Chicago, IL, 60611, United States, 1 312 503 5659, lori.post@northwestern.edu %K Omicron %K SARS-CoV-2 %K public health surveillance %K VOC %K variant of concern %K Delta %K Beta %K COVID-19 %K sub-Saharan Africa %K public health %K pandemic %K epidemiology %D 2022 %7 31.1.2022 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Variants of the SARS-CoV-2 virus carry differential risks to public health. The Omicron (B.1.1.529) variant, first identified in Botswana on November 11, 2021, has spread globally faster than any previous variant of concern. Understanding the transmissibility of Omicron is vital in the development of public health policy. Objective: The aim of this study is to compare SARS-CoV-2 outbreaks driven by Omicron to those driven by prior variants of concern in terms of both the speed and magnitude of an outbreak. Methods: We analyzed trends in outbreaks by variant of concern with validated surveillance metrics in several southern African countries. The region offers an ideal setting for a natural experiment given that most outbreaks thus far have been driven primarily by a single variant at a time. 
With a daily longitudinal data set of new infections, total vaccinations, and cumulative infections in countries in sub-Saharan Africa, we estimated how the emergence of Omicron has altered the trajectory of SARS-CoV-2 outbreaks. We used the Arellano-Bond method to estimate regression coefficients from a dynamic panel model, in which new infections are a function of infections yesterday and last week. We controlled for vaccinations and prior infections in the population. To test whether Omicron has changed the average trajectory of a SARS-CoV-2 outbreak, we included an interaction between an indicator variable for the emergence of Omicron and lagged infections. Results: The observed Omicron outbreaks in this study reach the outbreak threshold within 5-10 days after first detection, whereas other variants of concern have taken at least 14 days and up to as many as 35 days. The Omicron outbreaks also reach peak rates of new cases that are roughly 1.5-2 times those of prior variants of concern. Dynamic panel regression estimates confirm Omicron has created a statistically significant shift in viral spread. Conclusions: The transmissibility of Omicron is markedly higher than prior variants of concern. At the population level, the Omicron outbreaks occurred more quickly and with larger magnitude, despite substantial increases in vaccinations and prior infections, which should have otherwise reduced susceptibility to new infections. Unless public health policies are substantially altered, Omicron outbreaks in other countries are likely to occur with little warning. 
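One way to write down the kind of dynamic panel specification described in the Methods is sketched below. The symbols are assumptions introduced for illustration: $y_{it}$ is new infections in country $i$ on day $t$, $O_{it}$ is an indicator for the emergence of Omicron, $V_{it}$ is vaccinations, $C_{it}$ is cumulative prior infections, and $\alpha_i$ is a country effect. The exact specification estimated in the paper may differ.

```latex
\begin{equation}
y_{it} = \beta_1\, y_{i,t-1} + \beta_2\, y_{i,t-7}
       + \gamma_1 \left(O_{it} \times y_{i,t-1}\right)
       + \gamma_2 \left(O_{it} \times y_{i,t-7}\right)
       + \delta_1 V_{it} + \delta_2 C_{it}
       + \alpha_i + \varepsilon_{it}
\end{equation}
```

Under this reading, nonzero $\gamma_1$ or $\gamma_2$ would indicate that the persistence of infections from yesterday or last week changed once Omicron emerged, which is the shift the abstract tests for.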
%M 35072638 %R 10.2196/35763 %U https://publichealth.jmir.org/2022/1/e35763 %U https://doi.org/10.2196/35763 %U http://www.ncbi.nlm.nih.gov/pubmed/35072638 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 12 %P e32846 %T Patterns of SARS-CoV-2 Testing Preferences in a National Cohort in the United States: Latent Class Analysis of a Discrete Choice Experiment %A Zimba,Rebecca %A Romo,Matthew L %A Kulkarni,Sarah G %A Berry,Amanda %A You,William %A Mirzayi,Chloe %A Westmoreland,Drew A %A Parcesepe,Angela M %A Waldron,Levi %A Rane,Madhura S %A Kochhar,Shivani %A Robertson,McKaylee M %A Maroko,Andrew R %A Grov,Christian %A Nash,Denis %+ Institute for Implementation Science in Population Health, CUNY Graduate School of Public Health & Health Policy, 55 W 125th St, 6th Floor, New York, NY, 10027, United States, 1 646 364 9618, rebecca.zimba@sph.cuny.edu %K SARS-CoV-2 %K testing %K discrete choice experiment %K latent class analysis %K COVID-19 %K pattern %K trend %K preference %K cohort %K United States %K discrete choice %K diagnostic %K transmission %K vaccine %K uptake %K public health %D 2021 %7 30.12.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Inadequate screening and diagnostic testing in the United States throughout the first several months of the COVID-19 pandemic led to undetected cases transmitting disease in the community and an underestimation of cases. Though testing supply has increased, maintaining testing uptake remains a public health priority in the efforts to control community transmission considering the availability of vaccinations and threats from variants. Objective: This study aimed to identify patterns of preferences for SARS-CoV-2 screening and diagnostic testing prior to widespread vaccine availability and uptake. 
Methods: We conducted a discrete choice experiment (DCE) among participants in the national, prospective CHASING COVID (Communities, Households, and SARS-CoV-2 Epidemiology) Cohort Study from July 30 to September 8, 2020. The DCE elicited preferences for SARS-CoV-2 test type, specimen type, testing venue, and result turnaround time. We used latent class multinomial logit to identify distinct patterns of preferences related to testing as measured by attribute-level part-worth utilities and conducted a simulation based on the utility estimates to predict testing uptake if additional testing scenarios were offered. Results: Of the 5098 invited cohort participants, 4793 (94.0%) completed the DCE. Five distinct patterns of SARS-CoV-2 testing emerged. Noninvasive home testers (n=920, 19.2% of participants) were most influenced by specimen type and favored less invasive specimen collection methods, with saliva being most preferred; this group was the least likely to opt out of testing. Fast-track testers (n=1235, 25.8%) were most influenced by result turnaround time and favored immediate and same-day turnaround time. Among dual testers (n=889, 18.5%), test type was the most important attribute, and preference was given to both antibody and viral tests. Noninvasive dual testers (n=1578, 32.9%) were most strongly influenced by specimen type and test type, preferring saliva and cheek swab specimens and both antibody and viral tests. Among hesitant home testers (n=171, 3.6%), the venue was the most important attribute; notably, this group was the most likely to opt out of testing. In addition to variability in preferences for testing features, heterogeneity was observed in the distribution of certain demographic characteristics (age, race/ethnicity, education, and employment), history of SARS-CoV-2 testing, COVID-19 diagnosis, and concern about the pandemic. 
Simulation models predicted that testing uptake would increase from 81.6% (with a status quo scenario of polymerase chain reaction by nasal swab in a provider’s office and a turnaround time of several days) to 98.1% by offering additional scenarios using less invasive specimens, both viral and antibody tests from a single specimen, faster turnaround time, and at-home testing. Conclusions: We identified substantial differences in preferences for SARS-CoV-2 testing and found that offering additional testing options would likely increase testing uptake in line with public health goals. Additional studies may be warranted to understand if preferences for testing have changed since the availability and widespread uptake of vaccines. %M 34793320 %R 10.2196/32846 %U https://publichealth.jmir.org/2021/12/e32846 %U https://doi.org/10.2196/32846 %U http://www.ncbi.nlm.nih.gov/pubmed/34793320 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 12 %P e34178 %T Predicting New Daily COVID-19 Cases and Deaths Using Search Engine Query Data in South Korea From 2020 to 2021: Infodemiology Study %A Husnayain,Atina %A Shim,Eunha %A Fuad,Anis %A Su,Emily Chia-Yu %+ Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, 172-1 Keelung Rd, Sec 2, Taipei, 106, Taiwan, 886 266382736 ext 1515, emilysu@tmu.edu.tw %K prediction %K internet search %K COVID-19 %K South Korea %K infodemiology %D 2021 %7 22.12.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Given the ongoing COVID-19 pandemic situation, accurate predictions could greatly help in the health resource management for future waves. However, as a new entity, COVID-19’s disease dynamics seemed difficult to predict. External factors, such as internet search data, need to be included in the models to increase their accuracy. 
However, it remains unclear whether incorporating online search volumes into models leads to better predictive performances for long-term prediction. Objective: The aim of this study was to analyze whether search engine query data are important variables that should be included in the models predicting new daily COVID-19 cases and deaths in short- and long-term periods. Methods: We used country-level case-related data, NAVER search volumes, and mobility data obtained from Google and Apple for the period of January 20, 2020, to July 31, 2021, in South Korea. Data were aggregated into four subsets: 3, 6, 12, and 18 months after the first case was reported. The first 80% of the data in all subsets were used as the training set, and the remaining data served as the test set. Generalized linear models (GLMs) with normal, Poisson, and negative binomial distribution were developed, along with linear regression (LR) models with lasso, adaptive lasso, and elastic net regularization. Root mean square error values were defined as a loss function and were used to assess the performance of the models. All analyses and visualizations were conducted in SAS Studio, which is part of the SAS OnDemand for Academics. Results: GLMs with different types of distribution functions may have been beneficial in predicting new daily COVID-19 cases and deaths in the early stages of the outbreak. Over longer periods, as the distribution of cases and deaths became more normally distributed, LR models with regularization may have outperformed the GLMs. This study also found that models performed better when predicting new daily deaths compared to new daily cases. In addition, an evaluation of feature effects in the models showed that NAVER search volumes were useful variables in predicting new daily COVID-19 cases, particularly in the first 6 months of the outbreak. Searches related to logistical needs, particularly for “thermometer” and “mask strap,” showed higher feature effects in that period. 
For longer prediction periods, NAVER search volumes were still found to constitute an important variable, although with a lower feature effect. This finding suggests that search term use should be considered to maintain the predictive performance of models. Conclusions: NAVER search volumes were important variables in short- and long-term prediction, with higher feature effects for predicting new daily COVID-19 cases in the first 6 months of the outbreak. Similar results were also found for death predictions. %M 34762064 %R 10.2196/34178 %U https://www.jmir.org/2021/12/e34178 %U https://doi.org/10.2196/34178 %U http://www.ncbi.nlm.nih.gov/pubmed/34762064 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 12 %P e34016 %T Predicting the Number of Suicides in Japan Using Internet Search Queries: Vector Autoregression Time Series Model %A Taira,Kazuya %A Hosokawa,Rikuya %A Itatani,Tomoya %A Fujita,Sumio %+ Department of Human Health Sciences, Graduate School of Medicine, Kyoto University, 53, Shogoinkawara-cho, Sakyo-ku, Kyoto, 606-8507, Japan, 81 75 751 3927, taira.kazuya.5m@kyoto-u.ac.jp %K suicide %K internet search engine %K infoveillance %K query %K time series analysis %K vector autoregression model %K COVID-19 %K suicide-related terms %K internet %K information seeking %K time series %K model %K loneliness %K mental health %K prediction %K Japan %K behavior %K trend %D 2021 %7 3.12.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The number of suicides in Japan increased during the COVID-19 pandemic. Predicting the number of suicides is important to take timely preventive measures. Objective: This study aims to clarify whether the number of suicides can be predicted by suicide-related search queries used before searching for the keyword “suicide.” Methods: This study uses the infoveillance approach for suicide in Japan by search trends in search engines. 
The monthly number of suicides by gender, collected and published by the National Police Agency, was used as an outcome variable. The number of searches by gender with queries associated with “suicide” on “Yahoo! JAPAN Search” from January 2016 to December 2020 was used as a predictive variable. The following five phrases highly relevant to suicide were used as search terms before searching for the keyword “suicide” and extracted and used for analyses: “abuse”; “work, don’t want to go”; “company, want to quit”; “divorce”; and “no money.” The augmented Dickey-Fuller and Johansen tests were performed for the original series and to verify the existence of unit roots and cointegration for each variable, respectively. The vector autoregression model was applied to predict the number of suicides. The Breusch-Godfrey Lagrangian multiplier (BG-LM) test, autoregressive conditional heteroskedasticity Lagrangian multiplier (ARCH-LM) test, and Jarque-Bera (JB) test were used to confirm model convergence. In addition, a Granger causality test was performed for each predictive variable. Results: In the original series, unit roots were found in the trend model, whereas in the first-order difference series, both men (minimum tau 3: −9.24; max tau 3: −5.38) and women (minimum tau 3: −9.24; max tau 3: −5.38) had no unit roots for all variables. In the Johansen test, a cointegration relationship was observed among several variables. The queries used in the converged models were “divorce” for men (BG-LM test: P=.55; ARCH-LM test: P=.63; JB test: P=.66) and “no money” for women (BG-LM test: P=.17; ARCH-LM test: P=.15; JB test: P=.10). In the Granger causality test for each variable, “divorce” was significant for both men (F104=3.29; P=.04) and women (F104=3.23; P=.04). Conclusions: The number of suicides can be predicted by search queries related to the keyword “suicide.” Previous studies have reported that financial poverty and divorce are associated with suicide. 
The results of this study, in which search queries on “no money” and “divorce” predicted suicide, support the findings of previous studies. Further research on the economic poverty of women and those with complex problems is necessary. %M 34823225 %R 10.2196/34016 %U https://publichealth.jmir.org/2021/12/e34016 %U https://doi.org/10.2196/34016 %U http://www.ncbi.nlm.nih.gov/pubmed/34823225 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 12 %P e30648 %T Predicting COVID-19 Transmission to Inform the Management of Mass Events: Model-Based Approach %A Donnat,Claire %A Bunbury,Freddy %A Kreindler,Jack %A Liu,David %A Filippidis,Filippos T %A Esko,Tonu %A El-Osta,Austen %A Harris,Matthew %+ Department of Statistics, University of Chicago, 5747 South Ellis Avenue, Chicago, IL, 60637, United States, 1 773 702 9890, cdonnat@uchicago.edu %K COVID-19 %K transmission dynamics %K live event management %K Monte Carlo simulation %D 2021 %7 1.12.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Modelling COVID-19 transmission at live events and public gatherings is essential to controlling the probability of subsequent outbreaks and communicating to participants their personalized risk. Yet, despite the fast-growing body of literature on COVID-19 transmission dynamics, current risk models either neglect contextual information including vaccination rates or disease prevalence or do not attempt to quantitatively model transmission. Objective: This paper attempted to bridge this gap by providing informative risk metrics for live public events, along with a measure of their uncertainty. Methods: Building upon existing models, our approach ties together 3 main components: (1) reliable modelling of the number of infectious cases at the time of the event, (2) evaluation of the efficiency of pre-event screening, and (3) modelling of the event’s transmission dynamics and their uncertainty using Monte Carlo simulations. 
Results: We illustrated the application of our pipeline for a concert at the Royal Albert Hall and highlighted the risk’s dependency on factors such as prevalence, mask wearing, and event duration. We demonstrated how this event held on 3 different dates (August 20, 2020; January 20, 2021; and March 20, 2021) would likely lead to transmission events that are similar to community transmission rates (0.06 vs 0.07, 2.38 vs 2.39, and 0.67 vs 0.60, respectively). However, differences between event and background transmissions substantially widened in the upper tails of the distribution of the number of infections (as denoted by their respective 99th quantiles: 1 vs 1, 19 vs 8, and 6 vs 3, respectively, for our 3 dates), further demonstrating that sole reliance on vaccination and antigen testing to gain entry would likely significantly underestimate the tail risk of the event. Conclusions: Despite the unknowns surrounding COVID-19 transmission, our estimation pipeline opens the discussion on contextualized risk assessment by combining the best tools at hand to assess the order of magnitude of the risk. Our model can be applied to any future event and is presented in a user-friendly RShiny interface. Finally, we discussed our model’s limitations as well as avenues for model evaluation and improvement. 
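The three components named in the Methods (infectious attendees, pre-event screening, and transmission at the event) can be strung together in a toy Monte Carlo simulation such as the one below. This is a minimal sketch, not the authors' pipeline: every parameter value, the fixed number of contacts per infectious attendee, and the function names are assumptions made for illustration.

```python
import random

def simulate_event(attendees, prevalence, test_sensitivity,
                   contacts_per_person, transmission_prob, rng):
    """One Monte Carlo draw of the number of new infections at an event."""
    # An attendee is infectious AND missed by pre-event screening with this probability
    p_enter = prevalence * (1.0 - test_sensitivity)
    infectious = sum(rng.random() < p_enter for _ in range(attendees))
    # Each infectious attendee exposes a fixed number of contacts (simplifying assumption)
    new_infections = 0
    for _ in range(infectious * contacts_per_person):
        if rng.random() < transmission_prob:
            new_infections += 1
    return new_infections

def summarize(draws):
    """Return the mean and the empirical 99th percentile of the draws."""
    ordered = sorted(draws)
    idx = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return sum(ordered) / len(ordered), ordered[idx]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = [
    simulate_event(attendees=1000, prevalence=0.005, test_sensitivity=0.7,
                   contacts_per_person=10, transmission_prob=0.05, rng=rng)
    for _ in range(1000)
]
mean_infections, q99 = summarize(draws)
```

Reporting an upper quantile alongside the mean reflects the abstract's point that event and background transmission can look similar on average while differing in the tail.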
%M 34583317 %R 10.2196/30648 %U https://publichealth.jmir.org/2021/12/e30648 %U https://doi.org/10.2196/30648 %U http://www.ncbi.nlm.nih.gov/pubmed/34583317 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 11 %P e29693 %T Prevalence of Multimorbidity of Chronic Noncommunicable Diseases in Brazil: Population-Based Study %A Shi,Xin %A Lima,Simone Maria da Silva %A Mota,Caroline Maria de Miranda %A Lu,Ying %A Stafford,Randall S %A Pereira,Corintho Viana %+ Management Engineering Department, Universidade Federal de Pernambuco, Av Prof Moraes Rego, 1235, Cidade Universitaria, Recife, 50670-901, Brazil, 55 8138795574, caroline.mota@ufpe.br %K multimorbidity %K prevalence %K health care %K public health %K Brazil %K logistic regression %D 2021 %7 25.11.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Multimorbidity is the co-occurrence of two or more chronic diseases. Objective: This study, based on self-reported medical diagnosis, aims to investigate the dynamic distribution of multimorbidity across sociodemographic levels and its impacts on health-related issues over 15 years in Brazil using national data. Methods: Data were analyzed using descriptive statistics, hypothesis tests, and logistic regression. The study sample comprised 679,572 adults (18-59 years of age) and 115,699 elderly people (≥60 years of age) from the two latest cross-sectional, multiple-cohort, national-based studies: the National Sample Household Survey (PNAD) of 1998, 2003, and 2008, and the Brazilian National Health Survey (PNS) of 2013. Results: Overall, the risk of multimorbidity in adults was 1.7 times higher in women (odds ratio [OR] 1.73, 95% CI 1.67-1.79) and 1.3 times higher among people without education (OR 1.34, 95% CI 1.28-1.41). 
Multiple chronic diseases considerably increased with age in Brazil, and people between 50 and 59 years old were about 12 times more likely to have multimorbidity than adults between 18 and 29 years of age (OR 11.89, 95% CI 11.27-12.55). Seniors with multimorbidity had more than twice the likelihood of receiving health assistance in community services or clinics (OR 2.16, 95% CI 2.02-2.31) and of being hospitalized (OR 2.37, 95% CI 2.21-2.56). The subjective well-being of adults with multimorbidity was often worse than that of people without multiple chronic diseases (OR 12.85, 95% CI 12.07-13.68). These patterns were similar across all 4 cohorts analyzed and were relatively stable over 15 years. Conclusions: Our study shows little variation in the prevalence of the multimorbidity of chronic diseases in Brazil over time, but there are differences in the prevalence of multimorbidity across different social groups. It is hoped that the analysis of multimorbidity from the two latest Brazil national surveys will support policy making on epidemic prevention and management. 
%M 34842558 %R 10.2196/29693 %U https://publichealth.jmir.org/2021/11/e29693 %U https://doi.org/10.2196/29693 %U http://www.ncbi.nlm.nih.gov/pubmed/34842558 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 11 %P e25976 %T Long-Term Survival Among Histological Subtypes in Advanced Epithelial Ovarian Cancer: Population-Based Study Using the Surveillance, Epidemiology, and End Results Database %A Yang,Shi-Ping %A Su,Hui-Luan %A Chen,Xiu-Bei %A Hua,Li %A Chen,Jian-Xian %A Hu,Min %A Lei,Jian %A Wu,San-Gang %A Zhou,Juan %+ Department of Obstetrics and Gynecology, The First Affiliated Hospital of Xiamen University, 55 Zhenhai Road, Xiamen, 361003, China, 86 5922139531, zhoujuan@xmu.edu.cn %K ovarian epithelial carcinoma %K survivors %K histology %K survival rate %K survival %K ovarian %K cancer %K surveillance %K epidemiology %K women’s health %K reproductive health %K Surveillance, Epidemiology, and End Results %K ovary %K oncology %K survivorship %K long-term outcome %K epithelial %D 2021 %7 17.11.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Actual long-term survival rates for advanced epithelial ovarian cancer (EOC) are rarely reported. Objective: This study aimed to assess the role of histological subtypes in predicting the prognosis among long-term survivors (≥5 years) of advanced EOC. Methods: We performed a retrospective analysis of data among patients with stage III-IV EOC diagnosed from 2000 to 2014 using the Surveillance, Epidemiology, and End Results cancer data of the United States. We used the chi-square test, Kaplan–Meier analysis, and multivariate Cox proportional hazards model for the analyses. Results: We included 8050 patients in this study, including 6929 (86.1%), 743 (9.2%), 237 (2.9%), and 141 (1.8%) patients with serous, endometrioid, clear cell, and mucinous tumors, respectively. 
With a median follow-up of 91 months, the most common cause of death was primary ovarian cancer (80.3%), followed by other cancers (8.1%), other causes of death (7.3%), cardiac-related death (3.2%), and nonmalignant pulmonary disease (1.1%). Patients with the serous subtype were more likely to die from primary ovarian cancer, and patients with the mucinous subtype were more likely to die from other cancers and cardiac-related disease. Multivariate Cox analysis showed that patients with endometrioid (hazard ratio [HR] 0.534, P<.001), mucinous (HR 0.454, P<.001), and clear cell (HR 0.563, P<.001) subtypes showed better ovarian cancer–specific survival than those with the serous subtype. Similar results were found regarding overall survival. However, ovarian cancer–specific survival and overall survival were comparable among those with endometrioid, clear cell, and mucinous tumors. Conclusions: Ovarian cancer remains the primary cause of death in long-term ovarian cancer survivors. Moreover, the probability of death was significantly different among those with different histological subtypes. It is important for clinicians to individualize the surveillance program for long-term ovarian cancer survivors. 
%M 34787583 %R 10.2196/25976 %U https://publichealth.jmir.org/2021/11/e25976 %U https://doi.org/10.2196/25976 %U http://www.ncbi.nlm.nih.gov/pubmed/34787583 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 10 %P e29820 %T Emotional Tone, Analytical Thinking, and Somatosensory Processes of a Sample of Italian Tweets During the First Phases of the COVID-19 Pandemic: Observational Study %A Monzani,Dario %A Vergani,Laura %A Pizzoli,Silvia Francesca Maria %A Marton,Giulia %A Pravettoni,Gabriella %+ Department of Oncology and Hemato-Oncology, University of Milan, Via Festa del Perdono 7, Milan, 20122, Italy, 39 029 4372099, laura.vergani@ieo.it %K internet %K mHealth %K infodemiology %K infoveillance %K pandemic %K public health %K COVID-19 %K Twitter %K psycholinguistic analysis %K trauma %D 2021 %7 27.10.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: The COVID-19 pandemic is a traumatic individual and collective chronic experience, with tremendous consequences on mental and psychological health that can also be reflected in people’s use of words. Psycholinguistic analysis of tweets from Twitter allows obtaining information about people’s emotional expression, analytical thinking, and somatosensory processes, which are particularly important in traumatic events contexts. Objective: We aimed to analyze the influence of official Italian COVID-19 daily data (new cases, deaths, and hospital discharges) and the phase of managing the pandemic on how people expressed emotions and their analytical thinking and somatosensory processes in Italian tweets written during the first phases of the COVID-19 pandemic in Italy. Methods: We retrieved 1,697,490 Italian COVID-19–related tweets written from February 24, 2020 to June 14, 2020 and analyzed them using LIWC2015 to calculate 3 summary psycholinguistic variables: emotional tone, analytical thinking, and somatosensory processes. 
Official daily data about new COVID-19 cases, deaths, and hospital discharges were retrieved from the Italian Prime Minister's Office and Civil Protection Department GitHub page. We considered 3 phases of managing the COVID-19 pandemic in Italy. We fitted 3 general linear models, 1 for each summary variable as the dependent variable and with daily data and phase of managing the pandemic as independent variables. Results: General linear models to assess differences in daily scores of emotional tone, analytical thinking, and somatosensory processes were significant (F6,104=21.53, P<.001, R2=.55; F5,105=9.20, P<.001, R2=.30; F6,104=6.15, P<.001, R2=.26, respectively). Conclusions: The COVID-19 pandemic affects how people express emotions, analytical thinking, and somatosensory processes in tweets. Our study contributes to the investigation of pandemic psychological consequences through psycholinguistic analysis of social media textual data. %M 34516386 %R 10.2196/29820 %U https://www.jmir.org/2021/10/e29820 %U https://doi.org/10.2196/29820 %U http://www.ncbi.nlm.nih.gov/pubmed/34516386 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 10 %P e30824 %T Self–Training With Quantile Errors for Multivariate Missing Data Imputation for Regression Problems in Electronic Medical Records: Algorithm Development Study %A Gwon,Hansle %A Ahn,Imjin %A Kim,Yunha %A Kang,Hee Jun %A Seo,Hyeram %A Cho,Ha Na %A Choi,Heejung %A Jun,Tae Joon %A Kim,Young-Hak %+ Division of Cardiology, Department of Internal Medicine, Asan Medical Center, University of Ulsan College of Medicine, 88, Olympic-ro 43-gil, Songpa-gu, Seoul, 05505, Republic of Korea, 82 2 3010 0955, mdyhkim@amc.seoul.kr %K self-training %K artificial intelligence %K electronic medical records %K imputation %D 2021 %7 13.10.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: When using machine learning in the real world, the missing value problem is the first problem encountered. 
Methods to impute this missing value include statistical methods such as mean, expectation-maximization, and multiple imputation by chained equations (MICE) as well as machine learning methods such as multilayer perceptron, k-nearest neighbor, and decision tree. Objective: The objective of this study was to impute numeric medical data such as physical data and laboratory data. We aimed to effectively impute data using a progressive method called self-training in the medical field where training data are scarce. Methods: In this paper, we propose a self-training method that gradually increases the available data. Models trained with complete data predict the missing values in incomplete data. Among the incomplete data, the data in which the missing value is validly predicted are incorporated into the complete data. Using the predicted value as the actual value is called pseudolabeling. This process is repeated until a stopping condition is satisfied. The most important part of this process is how to evaluate the accuracy of pseudolabels. They can be evaluated by observing the effect of the pseudolabeled data on the performance of the model. Results: In self-training using random forest (RF), mean squared error was up to 12% lower than pure RF, and the Pearson correlation coefficient was 0.1% higher. This difference was confirmed statistically. In the Friedman test performed on MICE and RF, self-training showed a P value between .003 and .02. A Wilcoxon signed-rank test performed on the mean imputation showed the lowest possible P value, 3.05e-5, in all situations. Conclusions: Self-training showed significant results in comparing the predicted values and actual values, but it needs to be verified in an actual machine learning system. In addition, self-training has the potential to improve performance according to the pseudolabel evaluation method, which will be the main subject of our future research. 
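The self-training loop described in the Methods can be sketched in a few lines. This is only an illustration under stated assumptions: a tiny 1-D k-nearest-neighbor regressor stands in for the random forest used in the study, and a pseudolabel is kept only if the error on a held-out validation set does not increase, which is a simple stand-in rather than the quantile-error criterion named in the title. All function names and data values are hypothetical.

```python
def knn_predict(train, x, k=2):
    """Predict y at x as the mean y of the k nearest training points (1-D)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def val_mse(train, val, k=2):
    """Mean squared error of the k-NN model on held-out (x, y) pairs."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in val) / len(val)

def self_train(complete, val, missing_x, k=2):
    """Pseudolabel rows with a missing target, one at a time.

    A pseudolabeled row joins the training data only if the validation
    error does not get worse (assumed acceptance rule for this sketch).
    """
    train = list(complete)
    best = val_mse(train, val, k)
    imputed = {}
    for x in missing_x:
        y_hat = knn_predict(train, x, k)   # pseudolabel from the current model
        imputed[x] = y_hat
        candidate = train + [(x, y_hat)]
        score = val_mse(candidate, val, k)
        if score <= best:                  # keep only helpful pseudolabels
            train, best = candidate, score
    return train, imputed

# Hypothetical toy data following y = 2x
complete = [(0, 0), (1, 2), (2, 4), (3, 6)]
val = [(0.5, 1), (2.5, 5)]
train, imputed = self_train(complete, val, missing_x=[1.5, 4.0])
```

In this toy run both pseudolabels are accepted, so later predictions are made from a model that already includes earlier pseudolabeled rows; that gradual growth of the usable data is the point of self-training.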
%M 34643539 %R 10.2196/30824 %U https://publichealth.jmir.org/2021/10/e30824 %U https://doi.org/10.2196/30824 %U http://www.ncbi.nlm.nih.gov/pubmed/34643539 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 10 %P e32559 %T Excess Mortality During the COVID-19 Pandemic in Jordan: Secondary Data Analysis %A Khader,Yousef %A Al Nsour,Mohannad %+ Department of Public Health, Faculty of Medicine, Jordan University of Science and Technology, Alramtha-Amman Street, Irbid, 22110, Jordan, 962 796802040, yskhader@just.edu.jo %K COVID-19 %K excess mortality %K pandemic %D 2021 %7 7.10.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: All-cause mortality and estimates of excess deaths are commonly used in different countries to estimate the burden of COVID-19 and assess its direct and indirect effects. Objective: This study aimed to analyze the excess mortality during the COVID-19 pandemic in Jordan in April-December 2020. Methods: Official data on deaths in Jordan for 2020 and previous years (2016-2019) were obtained from the Department of Civil Status. We contrasted mortality rates in 2020 with those in each year and the pooled period 2016-2019 using a standardized mortality ratio (SMR) measure. Expected deaths for 2020 were estimated by fitting the overdispersed Poisson generalized linear models to the monthly death counts for the period of 2016-2019. Results: Overall, a 21% increase in standardized mortality (SMR 1.21, 95% CI 1.19-1.22) occurred in April-December 2020 compared with the April-December months in the pooled period 2016-2019. The SMR was more pronounced for men than for women (SMR 1.26, 95% CI 1.24-1.29 vs SMR 1.12, 95% CI 1.10-1.14), and it was statistically significant for both genders (P<.05). Using overdispersed Poisson generalized linear models, the number of expected deaths in April-December 2020 was 12,845 (7957 for women and 4888 for men). 
The total number of excess deaths during this period was estimated at 4583 (95% CI 4451-4716), with higher excess deaths in men (3112, 95% CI 3003-3221) than in women (1503, 95% CI 1427-1579). Almost 83.66% of excess deaths were attributed to COVID-19 in the Ministry of Health database. The vast majority of excess deaths occurred in people aged 60 years or older. Conclusions: The reported COVID-19 death counts underestimated mortality attributable to COVID-19. Excess deaths could reflect the increased deaths secondary to the pandemic and its containment measures. The majority of excess deaths occurred among old age groups. It is, therefore, important to maintain essential services for the elderly during pandemics. %M 34617910 %R 10.2196/32559 %U https://publichealth.jmir.org/2021/10/e32559 %U https://doi.org/10.2196/32559 %U http://www.ncbi.nlm.nih.gov/pubmed/34617910 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 9 %P e29544 %T Uncovering Clinical Risk Factors and Predicting Severe COVID-19 Cases Using UK Biobank Data: Machine Learning Approach %A Wong,Kenneth Chi-Yin %A Xiang,Yong %A Yin,Liangying %A So,Hon-Cheong %+ School of Biomedical Sciences, The Chinese University of Hong Kong, RM 520A, Lo Kwee Seong Biomedical Sciences Building, Chinese University of Hong Kong, Hong Kong, China, 86 39439255, hcso@cuhk.edu.hk %K prediction %K COVID-19 %K risk factors %K machine learning %K pandemic %K biobank %K public health %K prediction models %K medical informatics %D 2021 %7 30.9.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: COVID-19 is a major public health concern. Given the extent of the pandemic, it is urgent to identify risk factors associated with disease severity. More accurate prediction of those at risk of developing severe infections is of high clinical importance. 
Objective: Based on the UK Biobank (UKBB), we aimed to build machine learning models to predict the risk of developing severe or fatal infections, and uncover major risk factors involved. Methods: We first restricted the analysis to infected individuals (n=7846), then performed analysis at a population level, considering those with no known infection as controls (ncontrols=465,728). Hospitalization was used as a proxy for severity. A total of 97 clinical variables (collected prior to the COVID-19 outbreak) covering demographic variables, comorbidities, blood measurements (eg, hematological/liver/renal function/metabolic parameters), anthropometric measures, and other risk factors (eg, smoking/drinking) were included as predictors. We also constructed a simplified (lite) prediction model using 27 covariates that can be more easily obtained (demographic and comorbidity data). XGBoost (gradient-boosted trees) was used for prediction and predictive performance was assessed by cross-validation. Variable importance was quantified by Shapley values (ShapVal), permutation importance (PermImp), and accuracy gain. Shapley dependency and interaction plots were used to evaluate the pattern of relationships between risk factors and outcomes. Results: A total of 2386 severe and 477 fatal cases were identified. For analyses within infected individuals (n=7846), our prediction model achieved area under the receiver operating characteristic curve (AUC–ROC) of 0.723 (95% CI 0.711-0.736) and 0.814 (95% CI 0.791-0.838) for severe and fatal infections, respectively. The top 5 contributing factors (sorted by ShapVal) for severity were age, number of drugs taken (cnt_tx), cystatin C (reflecting renal function), waist-to-hip ratio (WHR), and Townsend deprivation index (TDI). For mortality, the top features were age, testosterone, cnt_tx, waist circumference (WC), and red cell distribution width. 
For analyses involving the whole UKBB population, AUCs for severity and fatality were 0.696 (95% CI 0.684-0.708) and 0.825 (95% CI 0.802-0.848), respectively. The same top 5 risk factors were identified for both outcomes, namely, age, cnt_tx, WC, WHR, and TDI. Apart from the above, age, cystatin C, TDI, and cnt_tx were among the top 10 across all 4 analyses. Other diseases top ranked by ShapVal or PermImp were type 2 diabetes mellitus (T2DM), coronary artery disease, atrial fibrillation, and dementia, among others. For the “lite” models, predictive performances were broadly similar, with estimated AUCs of 0.716, 0.818, 0.696, and 0.830, respectively. The top ranked variables were similar to above, including age, cnt_tx, WC, sex (male), and T2DM. Conclusions: We identified numerous baseline clinical risk factors for severe/fatal infection by XGboost. For example, age, central obesity, impaired renal function, multiple comorbidities, and cardiometabolic abnormalities may predispose to poorer outcomes. The prediction models may be useful at a population level to identify those susceptible to developing severe/fatal infections, facilitating targeted prevention strategies. A risk-prediction tool is also available online. Further replications in independent cohorts are required to verify our findings. 
%M 34591027 %R 10.2196/29544 %U https://publichealth.jmir.org/2021/9/e29544 %U https://doi.org/10.2196/29544 %U http://www.ncbi.nlm.nih.gov/pubmed/34591027 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 9 %P e30854 %T Revealing Public Opinion Towards COVID-19 Vaccines With Twitter Data in the United States: Spatiotemporal Perspective %A Hu,Tao %A Wang,Siqin %A Luo,Wei %A Zhang,Mengxi %A Huang,Xiao %A Yan,Yingwei %A Liu,Regina %A Ly,Kelly %A Kacker,Viraj %A She,Bing %A Li,Zhenlong %+ Department of Geography, National University of Singapore, 1 Arts Link, #04-32 Block AS2, Singapore, 117570, Singapore, 65 65163851, geowl@nus.edu.sg %K Twitter %K public opinion %K COVID-19 vaccines %K sentiment analysis %K emotion analysis %K topic modeling %K COVID-19 %D 2021 %7 10.9.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: The COVID-19 pandemic has imposed a large, initially uncontrollable, public health crisis both in the United States and across the world, with experts looking to vaccines as the ultimate mechanism of defense. The development and deployment of COVID-19 vaccines have been rapidly advancing via global efforts. Hence, it is crucial for governments, public health officials, and policy makers to understand public attitudes and opinions towards vaccines, such that effective interventions and educational campaigns can be designed to promote vaccine acceptance. Objective: The aim of this study was to investigate public opinion and perception on COVID-19 vaccines in the United States. We investigated the spatiotemporal trends of public sentiment and emotion towards COVID-19 vaccines and analyzed how such trends relate to popular topics found on Twitter. Methods: We collected over 300,000 geotagged tweets in the United States from March 1, 2020 to February 28, 2021. 
We examined the spatiotemporal patterns of public sentiment and emotion over time at both national and state scales and identified 3 phases along the pandemic timeline with sharp changes in public sentiment and emotion. Using sentiment analysis, emotion analysis (with cloud mapping of keywords), and topic modeling, we further identified 11 key events and major topics as the potential drivers to such changes. Results: An increasing trend in positive sentiment in conjunction with a decrease in negative sentiment were generally observed in most states, reflecting the rising confidence and anticipation of the public towards vaccines. The overall tendency of the 8 types of emotion implies that the public trusts and anticipates the vaccine. This is accompanied by a mixture of fear, sadness, and anger. Critical social or international events or announcements by political leaders and authorities may have potential impacts on public opinion towards vaccines. These factors help identify underlying themes and validate insights from the analysis. Conclusions: The analyses of near real-time social media big data benefit public health authorities by enabling them to monitor public attitudes and opinions towards vaccine-related information in a geo-aware manner, address the concerns of vaccine skeptics, and promote the confidence that individuals within a certain region or community have towards vaccines. 
%M 34346888 %R 10.2196/30854 %U https://www.jmir.org/2021/9/e30854 %U https://doi.org/10.2196/30854 %U http://www.ncbi.nlm.nih.gov/pubmed/34346888 %0 Journal Article %@ 2563-6316 %I JMIR Publications %V 2 %N 3 %P e24630 %T A Full-Scale Agent-Based Model to Hypothetically Explore the Impact of Lockdown, Social Distancing, and Vaccination During the COVID-19 Pandemic in Lombardy, Italy: Model Development %A Giacopelli,Giuseppe %+ Department of Mathematics and Informatics, University of Palermo, Via Archirafi, 34, Palermo, 90123, Italy, 39 09123891111, giuseppeg94@gmail.com %K epidemiology %K computational %K model %K COVID-19 %K modeling %K outbreak %K virus %K infectious disease %K simulation %K impact %K vaccine %K agent-based model %D 2021 %7 10.9.2021 %9 Original Paper %J JMIRx Med %G English %X Background: The COVID-19 outbreak, an event of global concern, has provided scientists the opportunity to use mathematical modeling to run simulations and test theories about the pandemic. Objective: The aim of this study was to propose a full-scale individual-based model of the COVID-19 outbreak in Lombardy, Italy, to test various scenarios pertaining to the pandemic and achieve novel performance metrics. Methods: The model was designed to simulate all 10 million inhabitants of Lombardy person by person via a simple agent-based approach using a commercial computer. In order to obtain performance data, a collision detection model was developed to enable cluster nodes in small cells that can be processed fully in parallel. Within this collision detection model, an epidemic model based mostly on experimental findings about COVID-19 was developed. Results: The model was used to explain the behavior of the COVID-19 outbreak in Lombardy. Different parameters were used to simulate various scenarios relating to social distancing and lockdown. According to the model, these simple actions were enough to control the virus. 
The model also explained the decline in cases in the spring and simulated a hypothetical vaccination scenario, confirming, for example, the herd immunity threshold computed in previous works. Conclusions: The model made it possible to test the impact of people’s daily actions (eg, maintaining social distance) on the epidemic and to investigate interactions among agents within a social network. It also provided insight on the impact of a hypothetical vaccine. %M 34606524 %R 10.2196/24630 %U https://med.jmirx.org/2021/3/e24630 %U https://doi.org/10.2196/24630 %U http://www.ncbi.nlm.nih.gov/pubmed/34606524 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 9 %P e26409 %T Estimation of COVID-19 Period Prevalence and the Undiagnosed Population in Canadian Provinces: Model-Based Analysis %A Hamadeh,Abdullah %A Feng,Zeny %A Niergarth,Jessmyn %A Wong,William WL %+ School of Pharmacy, University of Waterloo, 10A Victoria Street S, Kitchener, ON, N2G1C5, Canada, 1 519 888 4567 ext 21323, wwlwong@uwaterloo.ca %K COVID-19 %K prevalence %K undiagnosed proportion %K mathematical modeling %K estimate %K Canada %K diagnosis %K control %K distribution %K infectious disease %K model %K framework %K progression %K transmission %D 2021 %7 9.9.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The development of a successful COVID-19 control strategy requires a thorough understanding of the trends in geographic and demographic distributions of disease burden. In terms of the estimation of the population prevalence, this includes the crucial process of unravelling the number of patients who remain undiagnosed. Objective: This study estimates the period prevalence of COVID-19 between March 1, 2020, and November 30, 2020, and the proportion of the infected population that remained undiagnosed in the Canadian provinces of Quebec, Ontario, Alberta, and British Columbia. 
Methods: A mathematical framework based on a disease progression and transmission model was developed to estimate the historical prevalence of COVID-19 using provincial-level statistics reporting seroprevalence, diagnoses, and deaths resulting from COVID-19. The framework was applied to three different age cohorts (<30; 30-69; and ≥70 years) in each of the provinces studied. Results: The estimates of COVID-19 period prevalence between March 1, 2020, and November 30, 2020, were 4.73% (95% CI 4.42%-4.99%) for Quebec, 2.88% (95% CI 2.75%-3.02%) for Ontario, 3.27% (95% CI 2.72%-3.70%) for Alberta, and 2.95% (95% CI 2.77%-3.15%) for British Columbia. Among the cohorts considered in this study, the estimated total number of infections ranged from 2-fold the number of diagnoses (among Quebecers aged ≥70 years: 26,476/53,549, 49.44%) to 6-fold the number of diagnoses (among British Columbians aged ≥70 years: 3108/18,147, 17.12%). Conclusions: Our estimates indicate that a high proportion of the population infected between March 1 and November 30, 2020, remained undiagnosed. Knowledge of COVID-19 period prevalence and the undiagnosed population can provide vital evidence that policy makers can consider when planning COVID-19 control interventions and vaccination programs. 
%M 34228626 %R 10.2196/26409 %U https://publichealth.jmir.org/2021/9/e26409 %U https://doi.org/10.2196/26409 %U http://www.ncbi.nlm.nih.gov/pubmed/34228626 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 8 %P e29957 %T Exploring the Utility of Google Mobility Data During the COVID-19 Pandemic in India: Digital Epidemiological Analysis %A Kishore,Kamal %A Jaswal,Vidushi %A Verma,Madhur %A Koushal,Vipin %+ All India Institute of Medical Sciences, Jodhpur Romana Road, Bathinda, 151001, India, 91 9466445513, drmadhurverma@gmail.com %K COVID-19 %K lockdown %K nonpharmaceutical Interventions %K social distancing %K digital surveillance %K Google Community Mobility Reports %K community mobility %D 2021 %7 30.8.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Association between human mobility and disease transmission has been established for COVID-19, but quantifying the levels of mobility over large geographical areas is difficult. Google has released Community Mobility Reports (CMRs) containing data about the movement of people, collated from mobile devices. Objective: The aim of this study is to explore the use of CMRs to assess the role of mobility in spreading COVID-19 infection in India. Methods: In this ecological study, we analyzed CMRs to determine human mobility between March and October 2020. The data were compared for the phases before the lockdown (between March 14 and 25, 2020), during lockdown (March 25-June 7, 2020), and after the lockdown (June 8-October 15, 2020) with the reference periods (ie, January 3-February 6, 2020). Another data set depicting the burden of COVID-19 as per various disease severity indicators was derived from a crowdsourced API. The relationship between the two data sets was investigated using the Kendall tau correlation to depict the correlation between mobility and disease severity. 
Results: At the national level, mobility decreased from –38% to –77% for all areas but residential (which showed an increase of 24.6%) during the lockdown compared to the reference period. At the beginning of the unlock phase, the state of Sikkim (minimum cases: 7) with a –60% reduction in mobility depicted more mobility compared to –82% in Maharashtra (maximum cases: 1.59 million). Residential mobility was negatively correlated (–0.05 to –0.91) with all other measures of mobility. The magnitude of the correlations for intramobility indicators was comparatively low for the lockdown phase (correlation ≥0.5 for 12 indicators) compared to the other phases (correlation ≥0.5 for 45 and 18 indicators in the prelockdown and unlock phases, respectively). A high correlation coefficient between epidemiological and mobility indicators was observed for the lockdown and unlock phases compared to the prelockdown phase. Conclusions: Mobile-based open-source mobility data can be used to assess the effectiveness of social distancing in mitigating disease spread. CMR data depicted an association between mobility and disease severity, and we suggest using this technique to supplement future COVID-19 surveillance. 
%M 34174780 %R 10.2196/29957 %U https://publichealth.jmir.org/2021/8/e29957 %U https://doi.org/10.2196/29957 %U http://www.ncbi.nlm.nih.gov/pubmed/34174780 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 10 %N 8 %P e15864 %T Health Impacts of Perchlorate and Pesticide Exposure: Protocol for Community-Engaged Research to Evaluate Environmental Toxicants in a US Border Community %A Trotter II,Robert %A Baldwin,Julie %A Buck,Charles Loren %A Remiker,Mark %A Aguirre,Amanda %A Milner,Trudie %A Torres,Emma %A von Hippel,Frank Arthur %+ Department of Anthropology, Northern Arizona University, 1395 Knoles Drive, Flagstaff, AZ, 86011, United States, 1 9283808684, robert.trotter@nau.edu %K community-engaged research %K endocrine disruption %K environmental contaminants %K health disparities %K toxic metal contamination %K perchlorates %K pesticides %K population health %K thyroid disease %D 2021 %7 11.8.2021 %9 Protocol %J JMIR Res Protoc %G English %X Background: The Northern Arizona University (NAU) Center for Health Equity Research (CHER) is conducting community-engaged health research involving “environmental scans” in Yuma County in collaboration with community health stakeholders, including the Yuma Regional Medical Center (YRMC), Regional Center for Border Health, Inc. (RCBH), Campesinos Sin Fronteras (CSF), Yuma County Public Health District, and government agencies and nongovernmental organizations (NGOs) working on border health issues. The purpose of these efforts is to address community-generated environmental health hazards identified through ongoing coalitions among NAU, and local health care and research institutions. Objective: We are undertaking joint community/university efforts to examine human exposures to perchlorate and agricultural pesticides. This project also includes the parallel development of a new animal model for investigating the mechanisms of toxicity following a “one health” approach. 
The ultimate goal of this community-engaged effort is to develop interventions to reduce exposures and health impacts of contaminants in Yuma populations. Methods: All participants completed the informed consent process, which included information on the purpose of the study, a request for access to health histories and medical records, and interviews. The interview included questions related to (1) demographics, (2) social determinants of health, (3) health screening, (4) occupational and environmental exposures to perchlorate and pesticides, and (5) access to health services. Each participant provided a hair sample for quantifying the metals used in pesticides, urine sample for perchlorate quantification, and blood sample for endocrine assays. Modeling will examine the relationships between the concentrations of contaminants and hormones, demographics and social determinants of health, and health status of the study population, including health markers known to be impacted by perchlorate and pesticides. Results: We recruited 323 adults residing in Yuma County during a 1-year pilot/feasibility study. Among these, 147 residents were patients from either YRMC or RCBH with a primary diagnosis of thyroid disease, including hyperthyroidism, hypothyroidism, thyroid cancer, or goiter. The remaining 176 participants were from the general population but with no history of thyroid disorder. The pilot study confirmed the feasibility of using the identified community-engaged protocol to recruit, consent, and collect data from a difficult-to-access, vulnerable population. The demographics of the pilot study population and positive feedback on the success of the community-engaged approach indicate that the project can be scaled up to a broader study with replicable population health findings. 
Conclusions: Using a community-engaged approach, the research protocol provided substantial evidence regarding the effectiveness of designing and implementing culturally relevant recruitment and dissemination processes that combine laboratory findings and public health information. Future findings will elucidate the mechanisms of toxicity and the population health effects of the contaminants of concern, as well as provide a new animal model to develop precision medicine capabilities for the population. International Registered Report Identifier (IRRID): DERR1-10.2196/15864 %M 34383679 %R 10.2196/15864 %U https://www.researchprotocols.org/2021/8/e15864 %U https://doi.org/10.2196/15864 %U http://www.ncbi.nlm.nih.gov/pubmed/34383679 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 8 %P e28195 %T Forecasting COVID-19 Hospital Census: A Multivariate Time-Series Model Based on Local Infection Incidence %A Nguyen,Hieu M %A Turk,Philip J %A McWilliams,Andrew D %+ Center for Outcomes Research and Evaluation, Atrium Health, 1300 Scott Ave, Charlotte, NC, 28204, United States, 1 9706914892, hieu.nguyen@atriumhealth.org %K COVID-19 %K forecasting %K time-series model %K vector error correction model %K hospital census %K hospital resource utilization %K infection incidence %D 2021 %7 4.8.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: COVID-19 has been one of the most serious global health crises in world history. During the pandemic, health care systems require accurate forecasts for key resources to guide preparation for patient surges. Forecasting the COVID-19 hospital census is among the most important planning decisions to ensure adequate staffing, number of beds, intensive care units, and vital equipment. Objective: The goal of this study was to explore the potential utility of local COVID-19 infection incidence data in developing a forecasting model for the COVID-19 hospital census. 
Methods: The study data comprised aggregated daily COVID-19 hospital census data across 11 Atrium Health hospitals plus a virtual hospital in the greater Charlotte metropolitan area of North Carolina, as well as the total daily infection incidence across the same region during the May 15 to December 5, 2020, period. Cross-correlations between hospital census and local infection incidence lagging up to 21 days were computed. A multivariate time-series framework, called the vector error correction model (VECM), was used to simultaneously incorporate both time series and account for their possible long-run relationship. Hypothesis tests and model diagnostics were performed to test for the long-run relationship and examine model goodness of fit. The 7-days-ahead forecast performance was measured by mean absolute percentage error (MAPE), with time-series cross-validation. The forecast performance was also compared with an autoregressive integrated moving average (ARIMA) model in the same cross-validation time frame. Based on different scenarios of the pandemic, the fitted model was leveraged to produce 60-days-ahead forecasts. Results: The cross-correlations were uniformly high, falling between 0.7 and 0.8. There was sufficient evidence that the two time series have a stable long-run relationship at the .01 significance level. The model had very good fit to the data. The out-of-sample MAPE had a median of 5.9% and a 95th percentile of 13.4%. In comparison, the MAPE of the ARIMA had a median of 6.6% and a 95th percentile of 14.3%. Scenario-based 60-days-ahead forecasts exhibited concave trajectories with peaks lagging 2 to 3 weeks later than the peak infection incidence. In the worst-case scenario, the COVID-19 hospital census can reach a peak over 3 times greater than the peak observed during the second wave. Conclusions: When used in the VECM framework, the local COVID-19 infection incidence can be an effective leading indicator to predict the COVID-19 hospital census. 
The VECM model had a very good 7-days-ahead forecast performance and outperformed the traditional ARIMA model. Leveraging the relationship between the two time series, the model can produce realistic 60-days-ahead scenario-based projections, which can inform health care systems about the peak timing and volume of the hospital census for long-term planning purposes. %M 34346897 %R 10.2196/28195 %U https://publichealth.jmir.org/2021/8/e28195 %U https://doi.org/10.2196/28195 %U http://www.ncbi.nlm.nih.gov/pubmed/34346897 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 7 %P e28812 %T Evaluation of a Parsimonious COVID-19 Outbreak Prediction Model: Heuristic Modeling Approach Using Publicly Available Data Sets %A Gupta,Agrayan K %A Grannis,Shaun J %A Kasthurirathne,Suranga N %+ Indiana University, 107 S Indiana Ave, Bloomington, IN, 47405, United States, 1 812 855 4848, aggupta@iu.edu %K coronavirus %K COVID-19 %K emerging outbreak %K modeling disease outbreak %K precision public health %K predictive modeling %D 2021 %7 26.7.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: The COVID-19 pandemic has changed public health policies and human and community behaviors through lockdowns and mandates. Governments are rapidly evolving policies to increase hospital capacity and supply personal protective equipment and other equipment to mitigate disease spread in affected regions. Current models that predict COVID-19 case counts and spread are complex by nature and offer limited explainability and generalizability. This has highlighted the need for accurate and robust outbreak prediction models that balance model parsimony and performance. Objective: We sought to leverage readily accessible data sets extracted from multiple states to train and evaluate a parsimonious predictive model capable of identifying county-level risk of COVID-19 outbreaks on a day-to-day basis. 
Methods: Our modeling approach leveraged the following data inputs: COVID-19 case counts per county per day and county populations. We developed an outbreak gold standard across California, Indiana, and Iowa. The model utilized a per capita running 7-day sum of the case counts per county per day and the mean cumulative case count to develop baseline values. The model was trained with data recorded between March 1 and August 31, 2020, and tested on data recorded between September 1 and October 31, 2020. Results: The model reported sensitivities of 81%, 92%, and 90% for California, Indiana, and Iowa, respectively. The precision in each state was above 85% while specificity and accuracy scores were generally >95%. Conclusions: Our parsimonious model provides a generalizable and simple alternative approach to outbreak prediction. This methodology can be applied to diverse regions to help state officials and hospitals with resource allocation and to guide risk management, community education, and mitigation strategies. 
%M 34156964 %R 10.2196/28812 %U https://www.jmir.org/2021/7/e28812 %U https://doi.org/10.2196/28812 %U http://www.ncbi.nlm.nih.gov/pubmed/34156964 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 6 %P e27888 %T How New Mexico Leveraged a COVID-19 Case Forecasting Model to Preemptively Address the Health Care Needs of the State: Quantitative Analysis %A Castro,Lauren A %A Shelley,Courtney D %A Osthus,Dave %A Michaud,Isaac %A Mitchell,Jason %A Manore,Carrie A %A Del Valle,Sara Y %+ Information Systems & Modeling Group, Analytics, Intelligence and Technology Division, Los Alamos National Laboratory, PO Box 1663, Los Alamos, NM, 87545, United States, 1 505 667 7544, lcastro@lanl.gov %K COVID-19 %K forecasting %K health care %K prediction %K forecast %K model %K quantitative %K hospital %K ICU %K ventilator %K intensive care unit %K probability %K trend %K plan %D 2021 %7 9.6.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Prior to the COVID-19 pandemic, US hospitals relied on static projections of future trends for long-term planning and were only beginning to consider forecasting methods for short-term planning of staffing and other resources. With the overwhelming burden imposed by COVID-19 on the health care system, an emergent need exists to accurately forecast hospitalization needs within an actionable timeframe. Objective: Our goal was to leverage an existing COVID-19 case and death forecasting tool to generate the expected number of concurrent hospitalizations, occupied intensive care unit (ICU) beds, and in-use ventilators 1 day to 4 weeks in the future for New Mexico and each of its five health regions. 
Methods: We developed a probabilistic model that took as input the number of new COVID-19 cases for New Mexico from Los Alamos National Laboratory’s COVID-19 Forecasts Using Fast Evaluations and Estimation tool, and we used the model to estimate the number of new daily hospital admissions 4 weeks into the future based on current statewide hospitalization rates. The model estimated the number of new admissions that would require an ICU bed or use of a ventilator and then projected the individual lengths of hospital stays based on the resource need. By tracking the lengths of stay through time, we captured the projected simultaneous need for inpatient beds, ICU beds, and ventilators. We used a postprocessing method to adjust the forecasts based on the differences between prior forecasts and the subsequent observed data. Thus, we ensured that our forecasts could reflect a dynamically changing situation on the ground. Results: Forecasts made between September 1 and December 9, 2020, showed variable accuracy across time, health care resource needs, and forecast horizon. Forecasts made in October, when new COVID-19 cases were steadily increasing, had an average accuracy error of 20.0%, while the error in forecasts made in September, a month with low COVID-19 activity, was 39.7%. Across health care use categories, state-level forecasts were more accurate than those at the regional level. Although the accuracy declined as the forecast was projected further into the future, the stated uncertainty of the prediction improved. Forecasts were within 5% of their stated uncertainty at the 50% and 90% prediction intervals at the 3- to 4-week forecast horizon for state-level inpatient and ICU needs. However, uncertainty intervals were too narrow for forecasts of state-level ventilator need and all regional health care resource needs. 
Conclusions: Real-time forecasting of the burden imposed by a spreading infectious disease is a crucial component of decision support during a public health emergency. Our proposed methodology demonstrated utility in providing near-term forecasts, particularly at the state level. This tool can aid other stakeholders as they face COVID-19 population impacts now and in the future. %M 34003763 %R 10.2196/27888 %U https://publichealth.jmir.org/2021/6/e27888 %U https://doi.org/10.2196/27888 %U http://www.ncbi.nlm.nih.gov/pubmed/34003763 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 6 %P e28265 %T Correlation of Population SARS-CoV-2 Cycle Threshold Values to Local Disease Dynamics: Exploratory Observational Study %A Tso,Chak Foon %A Garikipati,Anurag %A Green-Saxena,Abigail %A Mao,Qingqing %A Das,Ritankar %+ Dascena, Inc, 12333 Sowden Rd, Ste B, Private Mailbox 65148, Houston, TX, 77080-2059, United States, 1 826 9508, qmao@dascena.com %K reverse transcription polymerase chain reaction %K testing %K cycle threshold %K COVID-19 %K epidemiology %K Rt %K exploratory %K correlation %K population %K threshold %K disease dynamic %K distribution %K transmission %D 2021 %7 3.6.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Despite the limitations in the use of cycle threshold (CT) values for individual patient care, population distributions of CT values may be useful indicators of local outbreaks. Objective: We aimed to conduct an exploratory analysis of potential correlations between the population distribution of cycle threshold (CT) values and COVID-19 dynamics, which were operationalized as percent positivity, transmission rate (Rt), and COVID-19 hospitalization count. Methods: In total, 148,410 specimens collected between September 15, 2020, and January 11, 2021, from the greater El Paso area were processed in the Dascena COVID-19 Laboratory. 
The daily median CT value, daily Rt, daily count of COVID-19 hospitalizations, daily change in percent positivity, and rolling averages of these features were plotted over time. Two-way scatterplots and linear regression were used to evaluate possible associations between daily median CT values and outbreak measures. Cross-correlation plots were used to determine whether a time delay existed between changes in daily median CT values and measures of community disease dynamics. Results: Daily median CT values negatively correlated with the daily Rt values (P<.001), the daily COVID-19 hospitalization counts (with a 33-day time delay; P<.001), and the daily changes in percent positivity among testing samples (P<.001). Despite visual trends suggesting time delays in the plots for median CT values and outbreak measures, a statistically significant delay was only detected between changes in median CT values and COVID-19 hospitalization counts (P<.001). Conclusions: This study adds to the literature by analyzing samples collected from an entire geographical area and contextualizing the results with other research investigating population CT values. 
%M 33999831 %R 10.2196/28265 %U https://publichealth.jmir.org/2021/6/e28265 %U https://doi.org/10.2196/28265 %U http://www.ncbi.nlm.nih.gov/pubmed/33999831 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 6 %P e27189 %T The Effect of Test Timing on the Probability of Positive SARS-CoV-2 Swab Test Results: Mixed Model Approach %A Benoni,Roberto %A Panunzi,Silvia %A Campagna,Irene %A Moretti,Francesca %A Lo Cascio,Giuliana %A Spiteri,Gianluca %A Porru,Stefano %A Tardivo,Stefano %+ Postgraduate School of Hygiene and Preventive Medicine, University of Verona, Strada Le Grazie, 8, Verona, 37134, Italy, 39 0458027659, roberto.benoni90@gmail.com %K close contact %K COVID-19 %K health care workers %K health surveillance %K swab test timing %D 2021 %7 3.6.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: During the COVID-19 pandemic, swab tests proved to be effective in containing the infection and served as a means for early diagnosis and contact tracing. However, little evidence exists regarding the correct timing for the execution of the swab test, especially for asymptomatic individuals and health care workers. Objective: The objective of this study was to analyze changes in the positive findings over time in individual SARS-CoV-2 swab tests during a health surveillance program. Methods: The study was conducted with 2071 health care workers at the University Hospital of Verona, with a known date of close contact with a patient with COVID-19, between February 29 and April 17, 2020. The health care workers underwent a health surveillance program with repeated swab tests to track their virological status. 
A generalized additive mixed model was used to investigate how the probability of a positive test result changes over time since the last known date of close contact, in an overall sample of individuals who tested positive for COVID-19 and in a subset of individuals with an initial negative swab test finding before being proven positive, to assess different surveillance time intervals. Results: Among the 2071 health care workers in this study, 191 (9.2%) tested positive for COVID-19, and 103 (54%) were asymptomatic with no differences based on sex or age. Among 49 (25.7%) cases, the initial swab test yielded negative findings after close contact with a patient with COVID-19. Sex, age, symptoms, and the time of sampling were not different between individuals with an initial negative swab test finding and those who initially tested positive after close contact. In the overall sample, the estimated probability of testing positive was 0.74 on day 1 after close contact, which increased to 0.77 between days 5 and 8. In the 3 different scenarios for scheduled repeated testing intervals (3, 5, and 7 days) in the subgroup of individuals with an initially negative swab test finding, the probability peaked on the sixth, ninth and tenth, and 13th and 14th days, respectively. Conclusions: Swab tests can initially yield false-negative outcomes. The probability of testing positive increases from day 1, peaking between days 5 and 8 after close contact with a patient with COVID-19. Early testing, especially in this final time window, is recommended together with a health surveillance program scheduled in close intervals. 
%M 34003761 %R 10.2196/27189 %U https://publichealth.jmir.org/2021/6/e27189 %U https://doi.org/10.2196/27189 %U http://www.ncbi.nlm.nih.gov/pubmed/34003761 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 6 %P e27590 %T Epidemiology of Diphtheria in Yemen, 2017-2018: Surveillance Data Analysis %A Moghalles,Suaad Ameen %A Aboasba,Basher Ahmed %A Alamad,Mohammed Abdullah %A Khader,Yousef Saleh %+ Yemen Field Epidemiology Training Programme, Ministry of Public Health and Population, Hadh Street, Sana'a, 00967, Yemen, 967 735 800 572, smughalles@gmail.com %K diphtheria %K epidemiology %K incidence %K case fatality rate %D 2021 %7 2.6.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: As a consequence of war and the collapse of the health system in Yemen, which prevented many people from accessing health facilities to obtain primary health care, vaccination coverage was affected, leading to a deadly diphtheria epidemic at the end of 2017. Objective: This study aimed to describe the epidemiology of diphtheria in Yemen and determine its incidence and case fatality rate. Methods: Data were obtained from the diphtheria surveillance program 2017-2018, using case definitions of the World Health Organization. A probable case was defined as a case involving a person having laryngitis, pharyngitis, or tonsillitis and an adherent membrane of the tonsils, pharynx, and/or nose. A confirmed case was defined as a probable case that was laboratory confirmed or linked epidemiologically to a laboratory-confirmed case. Data from the Central Statistical Organization was used to calculate the incidence per 100,000 population. A P value <.05 was considered significant. Results: A total of 2243 cases were reported during the period between July 2017 and August 2018. About 49% (1090/2243, 48.6%) of the cases were males. About 44% (978/2243, 43.6%) of the cases involved children aged 5 to 15 years. 
Respiratory tract infection was the predominant symptom (2044/2243, 91.1%), followed by pseudomembrane (1822/2243, 81.2%). Based on the vaccination status, the percentages of partially vaccinated, vaccinated, unvaccinated, and unknown status patients were 6.6% (148/2243), 30.8% (690/2243), 48.6% (1090/2243), and 14.0% (315/2243), respectively. The overall incidence of diphtheria was 8 per 100,000 population. The highest incidence was among the age group <15 years (11 per 100,000 population), and the lowest incidence was among the age group ≥15 years (5 per 100,000 population). The overall case fatality rate among all age groups was 5%, and it was higher (10%) in the age group <5 years. Five governorates that were difficult to access (Raymah, Abyan, Sa'ada, Lahj, and Al Jawf) had a very high case fatality rate (22%). Conclusions: Diphtheria affected a large number of people in Yemen in 2017-2018. The majority of patients were partially or not vaccinated. Children aged ≤15 years were more affected, with higher fatality among children aged <5 years. Five governorates that were difficult to access had a case fatality rate twice that of the World Health Organization estimate (5%-10%). To control the diphtheria epidemic in Yemen, it is recommended to increase routine vaccination coverage and booster immunizations, increase public health awareness toward diphtheria, and strengthen the surveillance system for early detection and immediate response. 
%M 34076583 %R 10.2196/27590 %U https://publichealth.jmir.org/2021/6/e27590 %U https://doi.org/10.2196/27590 %U http://www.ncbi.nlm.nih.gov/pubmed/34076583 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 6 %P e26784 %T Risk Assessment of Importation and Local Transmission of COVID-19 in South Korea: Statistical Modeling Approach %A Lee,Hyojung %A Kim,Yeahwon %A Kim,Eunsu %A Lee,Sunmi %+ Kyung Hee University, 1732 Deogyeong-daero, Giheung-gu, Yongin-si, 17104, Republic of Korea, 82 312012409, sunmilee@khu.ac.kr %K COVID-19 %K transmission dynamics %K South Korea %K international travels %K imported and local transmission %K basic reproduction number %K effective reproduction number %K mitigation intervention strategies %K risk %K assessment %K transmission %K mitigation %K strategy %K travel %K mobility %K spread %K intervention %K diagnosis %K monitoring %K testing %D 2021 %7 1.6.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Despite recent achievements in vaccines, antiviral drugs, and medical infrastructure, the emergence of COVID-19 has posed a serious threat to humans worldwide. Most countries are well connected on a global scale, making it nearly impossible to implement perfect and prompt mitigation strategies for infectious disease outbreaks. In particular, due to the explosive growth of international travel, the complex network of human mobility enabled the rapid spread of COVID-19 globally. Objective: South Korea was one of the earliest countries to be affected by COVID-19. In the absence of vaccines and treatments, South Korea has implemented and maintained stringent interventions, such as large-scale epidemiological investigations, rapid diagnosis, social distancing, and prompt clinical classification of severely ill patients with appropriate medical measures. In particular, South Korea has implemented effective airport screenings and quarantine measures. 
In this study, we aimed to assess the country-specific importation risk of COVID-19 and investigate its impact on the local transmission of COVID-19. Methods: The country-specific importation risk of COVID-19 in South Korea was assessed. We investigated the relationships between country-specific imported cases, passenger numbers, and the severity of country-specific COVID-19 prevalence from January to October 2020. We assessed the country-specific risk by incorporating country-specific information. A renewal mathematical model was employed, considering both imported and local cases of COVID-19 in South Korea. Furthermore, we estimated the basic and effective reproduction numbers. Results: The risk of importation from China was highest between January and February 2020, while that from North America (the United States and Canada) was high from April to October 2020. The R0 was estimated at 1.87 (95% CI 1.47-2.34), using the rate of α=0.07 for secondary transmission caused by imported cases. The Rt was estimated in South Korea and in both Seoul and Gyeonggi. Conclusions: A statistical model accounting for imported and locally transmitted cases was employed to estimate R0 and Rt. Our results indicated that the prompt implementation of airport screening measures (contact tracing with case isolation and quarantine) successfully reduced local transmission caused by imported cases despite passengers arriving from high-risk countries throughout the year. Moreover, various mitigation interventions, including social distancing and travel restrictions within South Korea, have been effectively implemented to reduce the spread of local cases in South Korea. 
%M 33819165 %R 10.2196/26784 %U https://publichealth.jmir.org/2021/6/e26784 %U https://doi.org/10.2196/26784 %U http://www.ncbi.nlm.nih.gov/pubmed/33819165 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 5 %P e19544 %T Age-Stratified Infection Probabilities Combined With a Quarantine-Modified Model for COVID-19 Needs Assessments: Model Development Study %A Bongolan,Vena Pearl %A Minoza,Jose Marie Antonio %A de Castro,Romulo %A Sevilleja,Jesus Emmanuel %+ Department of Computer Science, University of the Philippines Diliman, UP AECH Bldg, Velasquez St, Quezon City, 1101, Philippines, 63 915 877 2298, bongolan@up.edu.ph %K COVID-19 %K epidemic modeling %K age stratification theory %K infection probability %K SEIR %K mathematical modelling %D 2021 %7 31.5.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Classic compartmental models such as the susceptible-exposed-infectious-removed (SEIR) model all have the weakness of assuming a homogenous population, where everyone has an equal chance of getting infected and dying. Since it was identified in Hubei, China, in December 2019, COVID-19 has rapidly spread around the world and been declared a pandemic. Based on data from Hubei, infection and death distributions vary with age. To control the spread of the disease, various preventive and control measures such as community quarantine and social distancing have been widely used. Objective: Our aim is to develop a model where age is a factor, considering the study area’s age stratification. Additionally, we want to account for the effects of quarantine on the SEIR model. Methods: We use the age-stratified COVID-19 infection and death distributions from Hubei, China (more than 44,672 infections as of February 11, 2020) as an estimate or proxy for a study area’s infection and mortality probabilities for each age group. 
We then apply these probabilities to the actual age-stratified population of Quezon City, Philippines, to predict infectious individuals and deaths at peak. Testing with different countries shows the predicted number of infectious individuals skewing with the country’s median age and age stratification, as expected. We added a Q parameter to the SEIR model to include the effects of quarantine (Q-SEIR). Results: The projections from the age-stratified probabilities give much lower predicted incidences of infection than the Q-SEIR model. As expected, quarantine tends to delay the peaks for both the exposed and infectious groups, and to “flatten” the curve or lower the predicted values for each compartment. These two estimates were used as a range to inform the local government’s planning and response to the COVID-19 threat. Conclusions: Age stratification combined with a quarantine-modified model has good qualitative agreement with observations on infections and death rates. That younger populations will have lower death rates due to COVID-19 is a fair expectation for a disease where most fatalities are among older adults. 
%M 33900929 %R 10.2196/19544 %U https://www.jmir.org/2021/5/e19544 %U https://doi.org/10.2196/19544 %U http://www.ncbi.nlm.nih.gov/pubmed/33900929 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 5 %P e23305 %T Effective Training Data Extraction Method to Improve Influenza Outbreak Prediction from Online News Articles: Deep Learning Model Study %A Jang,Beakcheol %A Kim,Inhwan %A Kim,Jong Wook %+ Department of Computer Science, Sangmyung University, 20, Hongjimun 2-gil, Jongno-gu, Seoul, 03016, Republic of Korea, 82 027817590, jkim@smu.ac.kr %K influenza %K training data extraction %K keyword %K sorting %K word embedding %K Pearson correlation coefficient %K long short-term memory %K surveillance %K infodemiology %K infoveillance %K model %D 2021 %7 25.5.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: Each year, influenza affects 3 to 5 million people and causes 290,000 to 650,000 fatalities worldwide. To reduce the fatalities caused by influenza, several countries have established influenza surveillance systems to collect early warning data. However, proper and timely warnings are hindered by a 1- to 2-week delay between the actual disease outbreaks and the publication of surveillance data. To address the issue, novel methods for influenza surveillance and prediction using real-time internet data (such as search queries, microblogging, and news) have been proposed. Some of the currently popular approaches extract online data and use machine learning to predict influenza occurrences in a classification mode. However, many of these methods extract training data subjectively, and it is difficult to capture the latent characteristics of the data correctly. There is a critical need to devise new approaches that focus on extracting training data by reflecting the latent characteristics of the data. 
Objective: In this paper, we propose an effective method to extract training data in a manner that reflects the hidden features and improves the performance by filtering and selecting only the keywords related to influenza before the prediction. Methods: Although word embedding provides a distributed representation of words by encoding the hidden relationships between various tokens, we enhanced the word embeddings by selecting keywords related to the influenza outbreak and sorting the extracted keywords using the Pearson correlation coefficient in order to solely keep the tokens with high correlation with the actual influenza outbreak. The keyword extraction process was followed by a predictive model based on long short-term memory that predicts the influenza outbreak. To assess the performance of the proposed predictive model, we used and compared a variety of word embedding techniques. Results: Word embedding without our proposed sorting process showed 0.8705 prediction accuracy when 50.2 keywords were selected on average. Conversely, word embedding using our proposed sorting process showed 0.8868 prediction accuracy and an improvement in prediction accuracy of 12.6%, although smaller amounts of training data were selected, with only 20.6 keywords on average. Conclusions: The sorting stage empowers the embedding process, which improves the feature extraction process because it acts as a knowledge base for the prediction component. The model outperformed other current approaches that use flat extraction before prediction. 
%M 34032577 %R 10.2196/23305 %U https://medinform.jmir.org/2021/5/e23305 %U https://doi.org/10.2196/23305 %U http://www.ncbi.nlm.nih.gov/pubmed/34032577 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 5 %P e22431 %T SARS-CoV-2: The Second Wave in Europe %A Fokas,Athanassios S %A Kastis,George A %+ Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Wilberforce Rd, Cambridge, CB3 0WA, United Kingdom, 44 1223 339, tf227@cam.ac.uk %K mathematical modelling of epidemics %K COVID-19 %K SARS CoV-2 %K pandemic %K lockdown in Europe %D 2021 %7 18.5.2021 %9 Viewpoint %J J Med Internet Res %G English %X Although the SARS-CoV-2 virus has already undergone several mutations, the impact of these mutations on its infectivity and virulence remains controversial. In this viewpoint, we present arguments suggesting that SARS-CoV-2 mutants responsible for the second wave have less virulence but much higher infectivity. This suggestion is based on the results of the forecasting and mechanistic models developed by our study group. In particular, in May 2020, the analysis of our mechanistic model predicted that the easing of lockdown measures will lead to a dramatic second wave of the COVID-19 outbreak. However, after the lockdown was lifted in many European countries, the resulting number of reported infected cases and especially the number of deaths remained low for approximately two months. This raised the false hope that a substantial second wave will be avoided and that the COVID-19 epidemic in these European countries was nearing an end. Unfortunately, since the first week of August 2020, the number of reported infected cases increased dramatically. Furthermore, this was accompanied by an increasingly large number of deaths. The rate of reported infected cases in the second wave was much higher than that in the first wave, whereas the rate of deaths was lower. This trend is consistent with higher infectivity and lower virulence. 
Even if the mutated form of SARS-CoV-2 is less virulent, the very high number of reported infected cases implies that a large number of people will perish. %M 33939621 %R 10.2196/22431 %U https://www.jmir.org/2021/5/e22431 %U https://doi.org/10.2196/22431 %U http://www.ncbi.nlm.nih.gov/pubmed/33939621 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 5 %P e25753 %T SARS-CoV-2 Surveillance System in Canada: Longitudinal Trend Analysis %A Post,Lori %A Boctor,Michael J %A Issa,Tariq Z %A Moss,Charles B %A Murphy,Robert Leo %A Achenbach,Chad J %A Ison,Michael G %A Resnick,Danielle %A Singh,Lauren %A White,Janine %A Welch,Sarah B %A Oehmke,James F %+ Buehler Center for Health Policy and Economics, Feinberg School of Medicine, Northwestern University, 420 E Superior, Chicago, IL, 60611, United States, 1 203 980 7107, lori.post@northwestern.edu %K global COVID surveillance %K COVID-19 %K COVID-21 %K new COVID strains %K Canada Public Health Surveillance %K Great COVID Shutdown %K Canadian COVID-19 %K surveillance metrics %K wave 2 Canada COVID-19 %K dynamic panel data %K generalized method of the moments %K Canadian econometrics %K Canada SARS-CoV-2 %K Canadian COVID-19 surveillance system %K Canadian COVID transmission speed %K Canadian COVID transmission acceleration %K COVID transmission deceleration %K COVID transmission jerk %K COVID 7-day lag %K Alberta %K British Columbia %K Manitoba %K New Brunswick %K Newfoundland and Labrador %K Northwest Territories %K Nova Scotia %K Nunavut %K Ontario %K Prince Edward Island %K Quebec %K Saskatchewan %K Yukon %D 2021 %7 10.5.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The COVID-19 global pandemic has disrupted structures and communities across the globe. Numerous regions of the world have had varying responses in their attempts to contain the spread of the virus. 
Factors such as public health policies, governance, and sociopolitical climate have led to differential levels of success at controlling the spread of SARS-CoV-2. Ultimately, a more advanced surveillance metric for COVID-19 transmission is necessary to help government systems and national leaders understand which responses have been effective and gauge where outbreaks occur. Objective: The goal of this study is to provide advanced COVID-19 surveillance metrics for Canada at the country, province, and territory level that account for shifts in the pandemic including speed, acceleration, jerk, and persistence. Enhanced surveillance identifies risks for explosive growth and regions that have controlled outbreaks successfully. Methods: Using a longitudinal trend analysis study design, we extracted 62 days of COVID-19 data from Canadian public health registries for 13 provinces and territories. We used an empirical difference equation to measure the daily number of cases in Canada as a function of the prior number of cases, the level of testing, and weekly shift variables based on a dynamic panel model that was estimated using the generalized method of moments approach by implementing the Arellano-Bond estimator in R. Results: We compare the week of February 7-13, 2021, with the week of February 14-20, 2021. Canada, as a whole, had a decrease in speed from 8.4 daily new cases per 100,000 population to 7.5 daily new cases per 100,000 population. The persistence of new cases during the week of February 14-20 reported 7.5 cases that are a result of COVID-19 transmissions 7 days earlier. The two most populous provinces of Ontario and Quebec both experienced decreases in speed from 7.9 and 11.5 daily new cases per 100,000 population for the week of February 7-13 to speeds of 6.9 and 9.3 for the week of February 14-20, respectively. 
Nunavut experienced a significant increase in speed during this time, from 3.3 daily new cases per 100,000 population to 10.9 daily new cases per 100,000 population. Conclusions: Canada excelled at COVID-19 control early on in the pandemic, especially during the first COVID-19 shutdown. The second wave at the end of 2020 resulted in a resurgence of the outbreak, which has since been controlled. Enhanced surveillance identifies outbreaks and where there is the potential for explosive growth, which informs proactive health policy. %M 33852410 %R 10.2196/25753 %U https://publichealth.jmir.org/2021/5/e25753 %U https://doi.org/10.2196/25753 %U http://www.ncbi.nlm.nih.gov/pubmed/33852410 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 5 %N 5 %P e23251 %T Survival Analysis of Patients With COVID-19 in India by Demographic Factors: Quantitative Study %A Kundu,Sampurna %A Chauhan,Kirti %A Mandal,Debarghya %+ Department of Mathematical Demography and Statistics, International Institute for Population Sciences, Govandi Station Road, Mumbai, 400088, India, 91 9073111858, sampurna34@gmail.com %K survival analysis %K COVID-19 %K patient data %K Kaplan-Meier %K hazard model %K modeling %K survival %K mortality %K demographic %K India %K transmission %D 2021 %7 6.5.2021 %9 Original Paper %J JMIR Form Res %G English %X Background: Studies of the transmission dynamics of COVID-19 have depicted the rate, patterns, and predictions of cases of this pandemic disease. To combat transmission of the disease in India, the government declared a lockdown on March 25, 2020. Even after this strict lockdown was enacted nationwide, the number of COVID-19 cases increased and surpassed 450,000. A positive point to note is that the number of recovered cases began to slowly exceed that of active cases. The survival of patients, taking death as the event that varies by age group and sex, is noteworthy. 
Objective: The aim of this study was to conduct a survival analysis to establish the variability in survivorship of patients with COVID-19 in India by age group and sex at different levels, that is, the national, state, and district levels. Methods: The study period was taken from the date of the first reported case of COVID-19 in India, which was January 30, 2020, up to June 30, 2020. Due to the amount of underreported data and removal of missing columns, a total sample of 26,815 patients was considered. Kaplan-Meier survival estimation, the Cox proportional hazard model, and the multilevel survival model were used to perform the survival analysis. Results: The Kaplan-Meier survival function showed that the probability of survival of patients with COVID-19 declined during the study period of 5 months, which was supplemented by the log rank test (P<.001) and Wilcoxon test (P<.001) to compare the survival functions. Significant variability was observed in the age groups, as evident from all the survival estimates; with increasing age, the risk of dying of COVID-19 increased. The Cox proportional hazard model reiterated that male patients with COVID-19 had a 1.14 times higher risk of dying than female patients (hazard ratio 1.14; SE 0.11; 95% CI 0.93-1.38). Western and Central India showed decreasing survival rates in the framed time period, while Eastern, North Eastern, and Southern India showed slightly better results in terms of survival. Conclusions: This study depicts a grave scenario of decreasing survival rates in various regions of India and shows variability in these rates by age and sex. In essence, we can safely conclude that the critical appraisal of the survival rate and thorough analysis of patient data in this study equipped us to identify risk groups and perform comparative studies of various segments in India. 
International Registered Report Identifier (IRRID): RR2-10.1101/2020.08.01.20162115 %M 33882017 %R 10.2196/23251 %U https://formative.jmir.org/2021/5/e23251 %U https://doi.org/10.2196/23251 %U http://www.ncbi.nlm.nih.gov/pubmed/33882017 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 4 %P e27419 %T Returning to a Normal Life via COVID-19 Vaccines in the United States: A Large-scale Agent-Based Simulation Study %A Li,Junjiang %A Giabbanelli,Philippe %+ Department of Computer Science & Software Engineering, Miami University, 205 Benton Hall, Oxford, OH, 45056, United States, 1 513 529 0147, aqualonne@free.fr %K agent-based model %K cloud-based simulations %K COVID-19 %K large-scale simulations %K vaccine %K model %K simulation %K United States %K agent-based %K effective %K willingness %K capacity %K plan %K strategy %K outcome %K interaction %K intervention %K scenario %K impact %D 2021 %7 29.4.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: In 2020, COVID-19 has claimed more than 300,000 deaths in the United States alone. Although nonpharmaceutical interventions were implemented by federal and state governments in the United States, these efforts have failed to contain the virus. Following the Food and Drug Administration's approval of two COVID-19 vaccines, however, the hope for the return to normalcy has been renewed. This hope rests on an unprecedented nationwide vaccine campaign, which faces many logistical challenges and is also contingent on several factors whose values are currently unknown. Objective: We study the effectiveness of a nationwide vaccine campaign in response to different vaccine efficacies, the willingness of the population to be vaccinated, and the daily vaccine capacity under two different federal plans. 
To characterize the possible outcomes most accurately, we also account for the interactions between nonpharmaceutical interventions and vaccines through 6 scenarios that capture a range of possible impacts from nonpharmaceutical interventions. Methods: We used large-scale, cloud-based, agent-based simulations by implementing the vaccination campaign using COVASIM, an open-source agent-based model for COVID-19 that has been used in several peer-reviewed studies and accounts for individual heterogeneity and a multiplicity of contact networks. Several modifications to the parameters and simulation logic were made to better align the model with current evidence. We chose 6 nonpharmaceutical intervention scenarios and applied the vaccination intervention following both the plan proposed by Operation Warp Speed (former Trump administration) and the plan of one million vaccines per day, proposed by the Biden administration. We accounted for unknowns in vaccine efficacies and levels of population compliance by varying both parameters. For each experiment, the cumulative infection growth was fitted to a logistic growth model, and the carrying capacities and the growth rates were recorded. Results: For both vaccination plans and all nonpharmaceutical intervention scenarios, the presence of the vaccine intervention considerably lowers the total number of infections when life returns to normal, even when the population compliance to vaccines is as low as 20%. We noted an unintended consequence; given the vaccine availability estimates under both federal plans and the focus on vaccinating individuals by age categories, a significant reduction in nonpharmaceutical interventions results in a counterintuitive situation in which higher vaccine compliance then leads to more total infections. Conclusions: Although potent, vaccines alone cannot effectively end the pandemic given the current availability estimates and the adopted vaccination strategy. 
Nonpharmaceutical interventions need to continue and be enforced to ensure high compliance so that the rate of immunity established by vaccination outpaces that induced by infections. %M 33872188 %R 10.2196/27419 %U https://medinform.jmir.org/2021/4/e27419 %U https://doi.org/10.2196/27419 %U http://www.ncbi.nlm.nih.gov/pubmed/33872188 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 4 %P e25695 %T Surveillance of the Second Wave of COVID-19 in Europe: Longitudinal Trend Analyses %A Post,Lori %A Culler,Kasen %A Moss,Charles B %A Murphy,Robert L %A Achenbach,Chad J %A Ison,Michael G %A Resnick,Danielle %A Singh,Lauren Nadya %A White,Janine %A Boctor,Michael J %A Welch,Sarah B %A Oehmke,James Francis %+ Buehler Center for Health Policy and Economics, Feinberg School of Medicine, Northwestern University, 420 E Superior, Chicago, IL, 60611, United States, 1 203 980 7107, lori.post@northwestern.edu %K SARS-CoV-2 surveillance %K wave two %K second wave %K global COVID surveillance %K Europe Public Health Surveillance %K Europe COVID %K Europe surveillance metrics %K dynamic panel data %K generalized method of the moments %K Europe econometrics %K Europe SARS-CoV-2 %K Europe COVID surveillance system %K European COVID transmission speed %K European COVID transmission acceleration %K COVID transmission deceleration %K COVID transmission jerk %K COVID 7-day lag %K SARS-CoV-2 %K Arellano-Bond estimator %K GMM %K Albania %K Andorra %K Austria %K Belarus %K Belgium %K Bosnia and Herzegovina %K Bulgaria %K Croatia %K Czech Republic %K Denmark %K Estonia %K Finland %K France %K Germany %K Greece %K Greenland %K Hungary %K Iceland %K Ireland %K Isle of Man %K Italy %K Latvia %K Liechtenstein %K Lithuania %K Luxembourg %K Moldova %K Monaco %K Montenegro %K Netherlands %K Norway %K Poland %K Portugal %K Romania %K San Marino %K Serbia %K Slovakia %K Slovenia %K Spain %K Sweden %K Switzerland %K Ukraine %K United Kingdom %K Vatican City %D 2021 %7 28.4.2021 %9 Original 
Paper %J JMIR Public Health Surveill %G English %X Background: The COVID-19 pandemic has severely impacted Europe, resulting in a high caseload and deaths that varied by country. The second wave of the COVID-19 pandemic has breached the borders of Europe. Public health surveillance is necessary to inform policy and guide leaders. Objective: This study aimed to provide advanced surveillance metrics for COVID-19 transmission that account for weekly shifts in the pandemic, speed, acceleration, jerk, and persistence, to better understand countries at risk for explosive growth and those that are managing the pandemic effectively. Methods: We performed a longitudinal trend analysis and extracted 62 days of COVID-19 data from public health registries. We used an empirical difference equation to measure the daily number of cases in Europe as a function of the prior number of cases, the level of testing, and weekly shift variables based on a dynamic panel model that was estimated using the generalized method of moments approach by implementing the Arellano-Bond estimator in R. Results: New COVID-19 cases slightly decreased from 158,741 (week 1, January 4-10, 2021) to 152,064 (week 2, January 11-17, 2021), and cumulative cases increased from 22,507,271 (week 1) to 23,890,761 (week 2), with a weekly increase of 1,383,490 between January 10 and January 17. France, Germany, Italy, Spain, and the United Kingdom had the largest 7-day moving averages for new cases during week 1. During week 2, the 7-day moving average for France and Spain increased. From week 1 to week 2, the speed decreased (37.72 to 33.02 per 100,000), acceleration decreased (0.39 to –0.16 per 100,000), and jerk increased (–1.30 to 1.37 per 100,000). Conclusions: The United Kingdom, Spain, and Portugal, in particular, are at risk for a rapid expansion in COVID-19 transmission. An examination of the European region suggests that there was a decrease in the COVID-19 caseload between January 4 and January 17, 2021. 
Unfortunately, the rates of jerk, which were negative for Europe at the beginning of the month, reversed course and became positive, despite decreases in speed and acceleration. Finally, the 7-day persistence rate was higher during week 2 than during week 1. These measures indicate that the second wave of the pandemic may be subsiding, but some countries remain at risk for new outbreaks and increased transmission in the absence of rapid policy responses. %M 33818391 %R 10.2196/25695 %U https://publichealth.jmir.org/2021/4/e25695 %U https://doi.org/10.2196/25695 %U http://www.ncbi.nlm.nih.gov/pubmed/33818391 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 4 %P e25728 %T Latin America and the Caribbean SARS-CoV-2 Surveillance: Longitudinal Trend Analysis %A Post,Lori %A Ohiomoba,Ramael O %A Maras,Ashley %A Watts,Sean J %A Moss,Charles B %A Murphy,Robert Leo %A Ison,Michael G %A Achenbach,Chad J %A Resnick,Danielle %A Singh,Lauren Nadya %A White,Janine %A Chaudhury,Azraa S %A Boctor,Michael J %A Welch,Sarah B %A Oehmke,James Francis %+ Buehler Center for Health Policy and Economics, Feinberg School of Medicine, Northwestern University, 420 E Superior, Chicago, IL, 60611, United States, 1 203 980 7107, lori.post@northwestern.edu %K 7-day persistence %K acceleration %K Arellano–Bond estimator %K COVID-19 surveillance system %K COVID-19 %K dynamic panel data %K econometrics %K economic %K generalized method of moments %K global COVID-19 surveillance %K Latin America and the Caribbean %K longitudinal %K metric %K persistence %K policy %K public health surveillance %K SARS-CoV-2 %K second wave %K surveillance metrics %K transmission deceleration %K transmission jerk %K transmission speed %K trend analysis %D 2021 %7 27.4.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The COVID-19 pandemic has placed unprecedented stress on economies, food systems, and health care resources in Latin America and the Caribbean (LAC). 
Existing surveillance provides a proxy of the COVID-19 caseload and mortalities; however, these measures make it difficult to identify the dynamics of the pandemic and places where outbreaks are likely to occur. Moreover, existing surveillance techniques have failed to measure the dynamics of the pandemic. Objective: This study aimed to provide additional surveillance metrics for COVID-19 transmission to track changes in the speed, acceleration, jerk, and persistence in the transmission of the pandemic more accurately than existing metrics. Methods: Through a longitudinal trend analysis, we extracted COVID-19 data over 45 days from public health registries. We used an empirical difference equation to monitor the daily number of cases in the LAC as a function of the prior number of cases, the level of testing, and weekly shift variables based on a dynamic panel model that was estimated using the generalized method of moments approach by implementing the Arellano–Bond estimator in R. COVID-19 transmission rates were tracked for the LAC between September 30 and October 6, 2020, and between October 7 and 13, 2020. Results: The LAC saw a reduction in the speed, acceleration, and jerk for the week of October 13, 2020, compared to the week of October 6, 2020, accompanied by reductions in new cases and the 7-day moving average. For the week of October 6, 2020, Belize reported the highest acceleration and jerk, at 1.7 and 1.8, respectively, which is particularly concerning, given its high mortality rate. The Bahamas also had a high acceleration at 1.5. In total, 11 countries had a positive acceleration during the week of October 6, 2020, whereas only 6 countries had a positive acceleration for the week of October 13, 2020. The LAC displayed an overall positive trend, with a speed of 10.40, acceleration of 0.27, and jerk of –0.31, all of which decreased in the subsequent week to 9.04, –0.81, and –0.03, respectively. 
Conclusions: Metrics such as new cases, cumulative cases, deaths, and 7-day moving averages provide a static view of the pandemic but fail to identify where and the speed at which SARS-CoV-2 infects new individuals, the rate of acceleration or deceleration of the pandemic, and weekly comparison of the rate of acceleration of the pandemic indicate impending explosive growth or control of the pandemic. Enhanced surveillance will inform policymakers and leaders in the LAC about COVID-19 outbreaks. %M 33852413 %R 10.2196/25728 %U https://publichealth.jmir.org/2021/4/e25728 %U https://doi.org/10.2196/25728 %U http://www.ncbi.nlm.nih.gov/pubmed/33852413 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 4 %P e26628 %T Machine Learning–Based Prediction of Growth in Confirmed COVID-19 Infection Cases in 114 Countries Using Metrics of Nonpharmaceutical Interventions and Cultural Dimensions: Model Development and Validation %A Yeung,Arnold YS %A Roewer-Despres,Francois %A Rosella,Laura %A Rudzicz,Frank %+ Department of Computer Science, University of Toronto, 27 King's College Cir, Toronto, ON, M5S 3H7, Canada, 1 416 978 2011, arnoldyeung@cs.toronto.edu %K COVID-19 %K machine learning %K nonpharmaceutical interventions %K cultural dimensions %K random forest %K AdaBoost %K forecast %K informatics %K epidemiology %K artificial intelligence %D 2021 %7 23.4.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: National governments worldwide have implemented nonpharmaceutical interventions to control the COVID-19 pandemic and mitigate its effects. Objective: The aim of this study was to investigate the prediction of future daily national confirmed COVID-19 infection growth—the percentage change in total cumulative cases—across 14 days for 114 countries using nonpharmaceutical intervention metrics and cultural dimension metrics, which are indicative of specific national sociocultural norms. 
Methods: We combined the Oxford COVID-19 Government Response Tracker data set, Hofstede cultural dimensions, and daily reported COVID-19 infection case numbers to train and evaluate five non–time series machine learning models in predicting confirmed infection growth. We used three validation methods—in-distribution, out-of-distribution, and country-based cross-validation—for the evaluation, each of which was applicable to a different use case of the models. Results: Our results demonstrate high R2 values between the labels and predictions for the in-distribution method (0.959) and moderate R2 values for the out-of-distribution and country-based cross-validation methods (0.513 and 0.574, respectively) using random forest and adaptive boosting (AdaBoost) regression. Although these models may be used to predict confirmed infection growth, the differing accuracies obtained from the three tasks suggest a strong influence of the use case. Conclusions: This work provides new considerations in using machine learning techniques with nonpharmaceutical interventions and cultural dimensions as metrics to predict the national growth of confirmed COVID-19 infections. 
%M 33844636 %R 10.2196/26628 %U https://www.jmir.org/2021/4/e26628 %U https://doi.org/10.2196/26628 %U http://www.ncbi.nlm.nih.gov/pubmed/33844636 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 4 %P e21459 %T Automatable Distributed Regression Analysis of Vertically Partitioned Data Facilitated by PopMedNet: Feasibility and Enhancement Study %A Her,Qoua %A Kent,Thomas %A Samizo,Yuji %A Slavkovic,Aleksandra %A Vilk,Yury %A Toh,Sengwee %+ Department of Population Medicine, Harvard Medical School, 401 Park Drive, 4th Floor East, Boston, MA, 02215, United States, 1 6178674885, qouaher@gmail.com %K distributed regression analysis %K distributed data networks %K privacy-protecting analytics %K vertically partitioned data %K informatics %K data networks %K data %D 2021 %7 23.4.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: In clinical research, important variables may be collected from multiple data sources. Physical pooling of patient-level data from multiple sources often raises several challenges, including proper protection of patient privacy and proprietary interests. We previously developed an SAS-based package to perform distributed regression—a suite of privacy-protecting methods that perform multivariable-adjusted regression analysis using only summary-level information—with horizontally partitioned data, a setting where distinct cohorts of patients are available from different data sources. We integrated the package with PopMedNet, an open-source file transfer software, to facilitate secure file transfer between the analysis center and the data-contributing sites. The feasibility of using PopMedNet to facilitate distributed regression analysis (DRA) with vertically partitioned data, a setting where the data attributes from a cohort of patients are available from different data sources, was unknown. 
Objective: The objective of the study was to describe the feasibility of using PopMedNet and enhancements to PopMedNet to facilitate automatable vertical DRA (vDRA) in real-world settings. Methods: We gathered the statistical and informatic requirements of using PopMedNet to facilitate automatable vDRA. We enhanced PopMedNet based on these requirements to improve its technical capability to support vDRA. Results: PopMedNet can enable automatable vDRA. We identified and implemented two enhancements to PopMedNet that improved its technical capability to perform automatable vDRA in real-world settings. The first was the ability to simultaneously upload and download multiple files, and the second was the ability to directly transfer summary-level information between the data-contributing sites without a third-party analysis center. Conclusions: PopMedNet can be used to facilitate automatable vDRA to protect patient privacy and support clinical research in real-world settings. %M 33890866 %R 10.2196/21459 %U https://medinform.jmir.org/2021/4/e21459 %U https://doi.org/10.2196/21459 %U http://www.ncbi.nlm.nih.gov/pubmed/33890866 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 4 %P e26042 %T Impact of Firearm Surveillance on Gun Control Policy: Regression Discontinuity Analysis %A Post,Lori %A Mason,Maryann %A Singh,Lauren Nadya %A Wleklinski,Nicholas P %A Moss,Charles B %A Mohammad,Hassan %A Issa,Tariq Z %A Akhetuamhen,Adesuwa I %A Brandt,Cynthia A %A Welch,Sarah B %A Oehmke,James Francis %+ Buehler Center for Health Policy and Economics, Feinberg School of Medicine, Northwestern University, 420 E Superior, Chicago, IL, 60611, United States, 1 203 980 7107, lori.post@northwestern.edu %K firearm surveillance %K assault weapons ban %K large-capacity magazines %K guns control policy %K mass shootings %K regression lines of discontinuity %D 2021 %7 22.4.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Public mass shootings are a 
significant public health problem that require ongoing systematic surveillance to test and inform policies that combat gun injuries. Although there is widespread agreement that something needs to be done to stop public mass shootings, opinions on exactly which policies that entails vary, such as the prohibition of assault weapons and large-capacity magazines. Objective: The aim of this study was to determine if the Federal Assault Weapons Ban (FAWB) (1994-2004) reduced the number of public mass shootings while it was in place. Methods: We extracted public mass shooting surveillance data from the Violence Project that matched our inclusion criteria of 4 or more fatalities in a public space during a single event. We performed regression discontinuity analysis, taking advantage of the imposition of the FAWB, which included a prohibition on large-capacity magazines in addition to assault weapons. We estimated a regression model of the 5-year moving average number of public mass shootings per year for the period of 1966 to 2019 controlling for population growth and homicides in general, introduced regression discontinuities in the intercept and a time trend for years coincident with the federal legislation (ie, 1994-2004), and also allowed for a differential effect of the homicide rate during this period. We introduced a second set of trend and intercept discontinuities for post-FAWB years to capture the effects of termination of the policy. We used the regression results to predict what would have happened from 1995 to 2019 had there been no FAWB and also to project what would have happened from 2005 onward had it remained in place. Results: The FAWB resulted in a significant decrease in public mass shootings, number of gun deaths, and number of gun injuries. We estimate that the FAWB prevented 11 public mass shootings during the decade the ban was in place. 
A continuation of the FAWB would have prevented 30 public mass shootings that killed 339 people and injured an additional 1139 people. Conclusions: This study demonstrates the utility of public health surveillance on gun violence. Surveillance informs policy on whether a ban on assault weapons and large-capacity magazines reduces public mass shootings. As society searches for effective policies to prevent the next mass shooting, we must consider the overwhelming evidence that bans on assault weapons and/or large-capacity magazines work. %M 33783360 %R 10.2196/26042 %U https://publichealth.jmir.org/2021/4/e26042 %U https://doi.org/10.2196/26042 %U http://www.ncbi.nlm.nih.gov/pubmed/33783360 %0 Journal Article %@ 2563-6316 %I JMIR Publications %V 2 %N 2 %P e21269 %T Impact of COVID-19 Testing Strategies and Lockdowns on Disease Management Across Europe, South America, and the United States: Analysis Using Skew-Normal Distributions %A De Leo,Stefano %+ Department of Applied Mathematics, State University of Campinas, Rua Sérgio Buarque de Holanda, 651, Campinas, 13083-859, Brazil, 55 1935215958, deleo@ime.unicamp.br %K COVID-19 %K testing strategy %K skew-normal distributions %K lockdown %K forecast %K modeling %K outbreak %K infectious disease %K prediction %D 2021 %7 21.4.2021 %9 Original Paper %J JMIRx Med %G English %X Background: As COVID-19 infections worldwide exceed 6 million confirmed cases, the data reveal that the first wave of the outbreak is coming to an end in many European countries. There is variation in the testing strategies (eg, massive testing vs testing only those displaying symptoms) and the strictness of lockdowns imposed by countries around the world. For example, Brazil’s mitigation measures lie between the strict lockdowns imposed by many European countries and the more liberal approach taken by Sweden. This can influence COVID-19 metrics (eg, total deaths, confirmed cases) in unexpected ways. 
Objective: This study aimed to evaluate the effectiveness of local authorities’ strategies in managing the COVID-19 pandemic in Europe, South America, and the United States. Methods: The early stage of the COVID-19 outbreak in Brazil was compared to Europe using the weekly transmission rate. Using the European data as a basis for our analysis, we examined the spread of COVID-19 and modeled curves pertaining to daily confirmed cases and deaths per million using skew-normal probability density functions. For Sweden, the United Kingdom, and the United States, we forecasted the end of the pandemic, and for Brazil, we predicted the peak value for daily deaths per million. We also discussed additional factors that could play an important role in the fight against COVID-19, such as the fast response of local authorities, testing strategies, number of beds in the intensive care unit, and isolation strategies adopted. Results: The European data analysis demonstrated that the transmission rate of COVID-19 increased similarly for all countries in the initial stage of the pandemic but changed as the total confirmed cases per million in each country grew. This was caused by the variation in timely action by local authorities in adopting isolation measures and/or massive testing strategies. The behavior of daily confirmed cases for the United States and Brazil during the early stage of the outbreak was similar to that of Italy and Sweden, respectively. For daily deaths per million, transmission in the United States was similar to that of Switzerland, whereas for Brazil, it was greater than the counts for Portugal, Germany, and Austria (which had, in terms of total deaths per million, the best results in Europe) but lower than other European countries. 
Conclusions: The fitting skew parameters used to model the curves for daily confirmed cases per million and daily deaths per million allow for a more realistic prediction of the end of the pandemic and permit us to compare the mitigation measures adopted by local authorities by analyzing their respective skew-normal parameters. The massive testing strategy adopted in the early stage of the pandemic by German authorities made a positive difference compared to other countries like Italy where an effective testing strategy was adopted too late. This explains why, despite a strictly indiscriminate lockdown, Italy’s mortality rate was one of the highest in the world. %M 34032814 %R 10.2196/21269 %U https://xmed.jmir.org/2021/2/e21269 %U https://doi.org/10.2196/21269 %U http://www.ncbi.nlm.nih.gov/pubmed/34032814 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 4 %P e21468 %T A Recursive Model of the Spread of COVID-19: Modelling Study %A Ilyin,Sergey O %+ AV Topchiev Institute of Petrochemical Synthesis, Russian Academy of Sciences, 29 Leninsky prospekt, Moscow, 119991, Russian Federation, 7 9168276852, s.o.ilyin@gmail.com %K epidemiology %K COVID-19 %K model %K modelling %K prediction %K spread %K infection %K effective %K contagious %K transmission %D 2021 %7 19.4.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The major medical and social challenge of the 21st century is COVID-19, caused by the novel coronavirus SARS-CoV-2. Critical issues include the rate at which the coronavirus spreads and the effect of quarantine measures and population vaccination on this rate. Knowledge of the laws of the spread of COVID-19 will enable assessment of the effectiveness and reasonableness of the quarantine measures used, as well as determination of the necessary level of vaccination needed to overcome this crisis. 
Objective: This study aims to establish the laws of the spread of COVID-19 and to use them to develop a mathematical model to predict changes in the number of active cases over time, possible human losses, and the rate of recovery of patients, to make informed decisions about the number of necessary beds in hospitals, the introduction and type of quarantine measures, and the required threshold of vaccination of the population. Methods: This study analyzed the onset of COVID-19 spread in countries such as China, Italy, Spain, the United States, the United Kingdom, Japan, France, and Germany based on publicly available statistical data. The change in the number of COVID-19 cases, deaths, and recovered persons over time was examined, considering the possible introduction of quarantine measures and isolation of infected people in these countries. Based on the data, the virus transmissibility and the average duration of the disease at different stages were evaluated, and a model based on the principle of recursion was developed. Its key features are the separation of active (nonisolated) infected persons into a distinct category and the prediction of their number based on the average duration of the disease in the inactive phase and the concentration of these persons in the population in the preceding days. Results: Specific values for SARS-CoV-2 transmissibility and COVID-19 duration were estimated for different countries. In China, the viral transmissibility was 3.12 before quarantine measures were implemented and 0.36 after these measures were lifted. For the other countries, the viral transmissibility was 2.28-2.76 initially, and it then decreased to 0.87-1.29 as a result of quarantine measures. Therefore, it can be expected that the spread of SARS-CoV-2 will be suppressed if 56%-64% of the total population becomes vaccinated or survives COVID-19. Conclusions: The quarantine measures adopted in most countries are too weak compared to those previously used in China. 
Therefore, it is not expected that the spread of COVID-19 will stop and the disease will cease to exist naturally or owing to quarantine measures. Active vaccination of the population is needed to prevent the spread of COVID-19. Furthermore, the required specific percentage of vaccinated individuals depends on the magnitude of viral transmissibility, which can be evaluated using the proposed model and statistical data for the country of interest. %M 33871381 %R 10.2196/21468 %U https://publichealth.jmir.org/2021/4/e21468 %U https://doi.org/10.2196/21468 %U http://www.ncbi.nlm.nih.gov/pubmed/33871381 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 4 %P e24389 %T Adaptive Susceptible-Infectious-Removed Model for Continuous Estimation of the COVID-19 Infection Rate and Reproduction Number in the United States: Modeling Study %A Shapiro,Mark B %A Karim,Fazle %A Muscioni,Guido %A Augustine,Abel Saju %+ Anthem, Inc, 220 Virginia Avenue, Indianapolis, IN, 46204, United States, 1 708 295 8150, mark.shapiro@anthem.com %K compartmental models %K COVID-19 %K decision-making %K estimate %K infection rate %K infectious disease %K modeling %K pandemic %K prediction %K reproduction number %K SARS-CoV-2 %K United States %D 2021 %7 7.4.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: The dynamics of the COVID-19 pandemic vary owing to local population density and policy measures. During decision-making, policymakers consider an estimate of the effective reproduction number Rt, which is the expected number of secondary infections spread by a single infected individual. Objective: We propose a simple method for estimating the time-varying infection rate and the Rt. Methods: We used a sliding window approach with a Susceptible-Infectious-Removed (SIR) model. We estimated the infection rate from the reported cases over a 7-day window to obtain a continuous estimation of Rt. 
A proposed adaptive SIR (aSIR) model was applied to analyze the data at the state and county levels. Results: The aSIR model showed an excellent fit for the number of reported COVID-19 cases, and the 1-day forecast mean absolute prediction error was <2.6% across all states. However, the 7-day forecast mean absolute prediction error approached 16.2% and strongly overestimated the number of cases when the Rt was rapidly decreasing. The maximal Rt displayed a wide range of 2.0 to 4.5 across all states, with the highest values for New York (4.4) and Michigan (4.5). We found that the aSIR model can rapidly adapt to an increase in the number of tests and an associated increase in the reported cases of infection. Our results also suggest that intensive testing may be an effective method of reducing Rt. Conclusions: The aSIR model provides a simple and accurate computational tool for continuous Rt estimation and evaluation of the efficacy of mitigation measures. %M 33755577 %R 10.2196/24389 %U https://www.jmir.org/2021/4/e24389 %U https://doi.org/10.2196/24389 %U http://www.ncbi.nlm.nih.gov/pubmed/33755577 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 4 %P e24292 %T Community and Campus COVID-19 Risk Uncertainty Under University Reopening Scenarios: Model-Based Analysis %A Benneyan,James %A Gehrke,Christopher %A Ilies,Iulian %A Nehls,Nicole %+ Healthcare Systems Engineering Institute, Northeastern University, 360 Huntington Avenue 177H, Boston, MA, 02115, United States, 1 617 373 6450, j.benneyan@northeastern.edu %K COVID-19 %K university reopening %K community impact %K epidemic model %K model %K community %K university %K safety %K strategy %K risk %K infectious disease %D 2021 %7 7.4.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Significant uncertainty has existed about the safety of reopening college and university campuses before the COVID-19 pandemic is better controlled. 
Moreover, little is known about the effects that on-campus students may have on local higher-risk communities. Objective: We aimed to estimate the range of potential community and campus COVID-19 exposures, infections, and mortality under various university reopening plans and uncertainties. Methods: We developed campus-only, community-only, and campus × community epidemic differential equations and agent-based models, with inputs estimated via published and grey literature, expert opinion, and parameter search algorithms. Campus opening plans (spanning fully open, hybrid, and fully virtual approaches) were identified from websites and publications. Additional student and community exposures, infections, and mortality over 16-week semesters were estimated under each scenario, with 10% trimmed medians, standard deviations, and probability intervals computed to omit extreme outliers. Sensitivity analyses were conducted to inform potential effective interventions. Results: Predicted 16-week campus and additional community exposures, infections, and mortality for the base case with no precautions (or negligible compliance) varied significantly from their medians (4- to 10-fold). Over 5% of on-campus students were infected after a mean of 76 (SD 17) days, with the greatest increase (first inflection point) occurring on average on day 84 (SD 10.2 days) of the semester and with total additional community exposures, infections, and mortality ranging from 1-187, 13-820, and 1-21 per 10,000 residents, respectively. Reopening precautions reduced infections by 24%-26% and mortality by 36%-50% in both populations. Beyond campus and community reproductive numbers, sensitivity analysis indicated no dominant factors that interventions could primarily target to reduce the magnitude and variability in outcomes, suggesting the importance of comprehensive public health measures and surveillance. 
Conclusions: Community and campus COVID-19 exposures, infections, and mortality resulting from reopening campuses are highly unpredictable regardless of precautions. Public health implications include the need for effective surveillance and flexible campus operations. %M 33667173 %R 10.2196/24292 %U https://publichealth.jmir.org/2021/4/e24292 %U https://doi.org/10.2196/24292 %U http://www.ncbi.nlm.nih.gov/pubmed/33667173 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 4 %P e24192 %T An Agent-Based Model of the Local Spread of SARS-CoV-2: Modeling Study %A Staffini,Alessio %A Svensson,Akiko Kishi %A Chung,Ung-Il %A Svensson,Thomas %+ Precision Health, Department of Bioengineering, Graduate School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan, 81 080 7058 1309, alessio.staffini@bocconialumni.it %K computational epidemiology %K COVID-19 %K SARS-CoV-2 %K agent-based modeling %K public health %K computational models %K modeling %K agent %K spread %K computation %K epidemiology %K policy %D 2021 %7 6.4.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: The spread of SARS-CoV-2, originating in Wuhan, China, was classified as a pandemic by the World Health Organization on March 11, 2020. The governments of affected countries have implemented various measures to limit the spread of the virus. The starting point of this paper is the different government approaches, in terms of promulgating new legislative regulations to limit the virus diffusion and to contain negative effects on the populations. Objective: This paper aims to study how the spread of SARS-CoV-2 is linked to government policies and to analyze how different policies have produced different results on public health. 
Methods: Considering the official data provided by 4 countries (Italy, Germany, Sweden, and Brazil) and from the measures implemented by each government, we built an agent-based model to study the effects that these measures will have over time on different variables such as the total number of COVID-19 cases, intensive care unit (ICU) bed occupancy rates, and recovery and case-fatality rates. The model we implemented provides the possibility of modifying some starting variables, and it was thus possible to study the effects that some policies (eg, keeping the national borders closed or increasing the ICU beds) would have had on the spread of the infection. Results: The 4 considered countries have adopted different containment measures for COVID-19, and the forecasts provided by the model for the considered variables have given different results. Italy and Germany seem to be able to limit the spread of the infection and any eventual second wave, while Sweden and Brazil do not seem to have the situation under control. This situation is also reflected in the forecasts of pressure on the National Health Services, which see Sweden and Brazil with a high occupancy rate of ICU beds in the coming months, with a consequent high number of deaths. Conclusions: In line with what we expected, the obtained results showed that the countries that have taken restrictive measures in terms of limiting the population mobility have managed more successfully than others to contain the spread of COVID-19. Moreover, the model demonstrated that herd immunity cannot be reached even in countries that have relied on a strategy without strict containment measures. 
%M 33750735 %R 10.2196/24192 %U https://medinform.jmir.org/2021/4/e24192 %U https://doi.org/10.2196/24192 %U http://www.ncbi.nlm.nih.gov/pubmed/33750735 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 3 %P e27317 %T Analyzing Cross-country Pandemic Connectedness During COVID-19 Using a Spatial-Temporal Database: Network Analysis %A Chu,Amanda MY %A Chan,Jacky NL %A Tsang,Jenny TY %A Tiwari,Agnes %A So,Mike KP %+ Department of Information Systems, Business Statistics and Operations Management, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China (Hong Kong), 852 23587726, immkpso@ust.hk %K air traffic %K coronavirus %K COVID-19 %K human mobility %K network analysis %K travel restrictions %D 2021 %7 29.3.2021 %9 Rapid Surveillance Report %J JMIR Public Health Surveill %G English %X Communicable diseases including COVID-19 pose a major threat to public health worldwide. To curb the spread of communicable diseases effectively, timely surveillance and prediction of the risk of pandemics are essential. The aim of this study is to analyze free and publicly available data to construct useful travel data records for network statistics other than common descriptive statistics. This study describes analytical findings of time-series plots and spatial-temporal maps to illustrate or visualize pandemic connectedness. We analyzed data retrieved from the web-based Collaborative Arrangement for the Prevention and Management of Public Health Events in Civil Aviation dashboard, which contains up-to-date and comprehensive meta-information on civil flights from 193 national governments in accordance with the airport, country, city, latitude, and the longitude of flight origin and the destination. We used the database to visualize pandemic connectedness through the workflow of travel data collection, network construction, data aggregation, travel statistics calculation, and visualization with time-series plots and spatial-temporal maps. 
We observed similar patterns in the time-series plots of worldwide daily flights from January to early-March of 2019 and 2020. A sharp reduction in the number of daily flights recorded in mid-March 2020 was likely related to large-scale air travel restrictions owing to the COVID-19 pandemic. The levels of connectedness between places are strong indicators of the risk of a pandemic. Since the initial reports of COVID-19 cases worldwide, a high network density and reciprocity in early-March 2020 served as early signals of the COVID-19 pandemic and were associated with the rapid increase in COVID-19 cases in mid-March 2020. The spatial-temporal map of connectedness in Europe on March 13, 2020, shows the highest level of connectedness among European countries, which reflected severe outbreaks of COVID-19 in late March and early April of 2020. As a quality control measure, we used the aggregated numbers of international flights from April to October 2020 to compare the number of international flights officially reported by the International Civil Aviation Organization with the data collected from the Collaborative Arrangement for the Prevention and Management of Public Health Events in Civil Aviation dashboard, and we observed high consistency between the 2 data sets. The flexible design of the database provides users access to network connectedness at different periods, places, and spatial levels through various network statistics calculation methods in accordance with their needs. The analysis can facilitate early recognition of the risk of a current communicable disease pandemic and newly emerging communicable diseases in the future. 
%M 33711799 %R 10.2196/27317 %U https://publichealth.jmir.org/2021/3/e27317 %U https://doi.org/10.2196/27317 %U http://www.ncbi.nlm.nih.gov/pubmed/33711799 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 3 %P e25696 %T Modeling Predictive Age-Dependent and Age-Independent Symptoms and Comorbidities of Patients Seeking Treatment for COVID-19: Model Development and Validation Study %A Huang,Yingxiang %A Radenkovic,Dina %A Perez,Kevin %A Nadeau,Kari %A Verdin,Eric %A Furman,David %+ Buck Institute for Research on Aging, 8001 Redwood Blvd, Novato, CA, 94945, United States, 1 (415) 209 2000, DFurman@buckinstitute.org %K clinical informatics %K predictive modeling %K COVID-19 %K app %K model %K prediction %K symptom %K informatics %K age %K morbidity %K hospital %D 2021 %7 25.3.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: The COVID-19 pandemic continues to ravage and burden hospitals around the world. The epidemic started in Wuhan, China, and was subsequently recognized by the World Health Organization as an international public health emergency and declared a pandemic in March 2020. Since then, the disruptions caused by the COVID-19 pandemic have had an unparalleled effect on all aspects of life. Objective: With increasing total hospitalization and intensive care unit admissions, a better understanding of features related to patients with COVID-19 could help health care workers stratify patients based on the risk of developing a more severe case of COVID-19. Using predictive models, we strive to select the features that are most associated with more severe cases of COVID-19. Methods: Over 3 million participants reported their potential symptoms of COVID-19, along with their comorbidities and demographic information, on a smartphone-based app. 
Using data from the >10,000 individuals who indicated that they had tested positive for COVID-19 in the United Kingdom, we leveraged the Elastic Net regularized binary classifier to derive the predictors that are most correlated with users having a severe enough case of COVID-19 to seek treatment in a hospital setting. We then analyzed such features in relation to age and other demographics and their longitudinal trend. Results: The most predictive features found include fever, use of immunosuppressant medication, use of a mobility aid, shortness of breath, and severe fatigue. Such features are age-related, and some are disproportionally high in minority populations. Conclusions: Predictors selected from the predictive models can be used to stratify patients into groups based on how much medical attention they are expected to require. This could help health care workers devote valuable resources to prevent the escalation of the disease in vulnerable populations. %M 33621185 %R 10.2196/25696 %U https://www.jmir.org/2021/3/e25696 %U https://doi.org/10.2196/25696 %U http://www.ncbi.nlm.nih.gov/pubmed/33621185 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 3 %P e24925 %T Short-Range Forecasting of COVID-19 During Early Onset at County, Health District, and State Geographic Levels Using Seven Methods: Comparative Forecasting Study %A Lynch,Christopher J %A Gore,Ross %+ Virginia Modeling, Analysis, and Simulation Center, Old Dominion University, 1030 University Blvd, Suffolk, VA, 23435, United States, 1 7576866248, cjlynch@odu.edu %K coronavirus disease 2019 %K COVID-19 %K infectious disease %K emerging outbreak %K forecasting %K modeling and simulation %K public health %K modeling disease outbreaks %D 2021 %7 23.3.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Forecasting methods rely on trends and averages of prior observations to forecast COVID-19 case counts. 
COVID-19 forecasts have received much media attention, and numerous platforms have been created to inform the public. However, forecasting effectiveness varies by geographic scope and is affected by changing assumptions in behaviors and preventative measures in response to the pandemic. Due to time requirements for developing a COVID-19 vaccine, evidence is needed to inform short-term forecasting method selection at county, health district, and state levels. Objective: COVID-19 forecasts keep the public informed and contribute to public policy. As such, proper understanding of forecasting purposes and outcomes is needed to advance knowledge of health statistics for policy makers and the public. Using publicly available real-time data provided online, we aimed to evaluate the performance of seven forecasting methods utilized to forecast cumulative COVID-19 case counts. Forecasts were evaluated based on how well they forecast 1, 3, and 7 days forward when utilizing 1-, 3-, 7-, or all prior–day cumulative case counts during early virus onset. This study provides an objective evaluation of the forecasting methods to identify forecasting model assumptions that contribute to lower error in forecasting COVID-19 cumulative case growth. This information benefits professionals, decision makers, and the public relying on the data provided by short-term case count estimates at varied geographic levels. Methods: We created 1-, 3-, and 7-day forecasts at the county, health district, and state levels using (1) a naïve approach, (2) Holt-Winters (HW) exponential smoothing, (3) a growth rate approach, (4) a moving average (MA) approach, (5) an autoregressive (AR) approach, (6) an autoregressive moving average (ARMA) approach, and (7) an autoregressive integrated moving average (ARIMA) approach. Forecasts relied on Virginia’s 3464 historical county-level cumulative case counts from March 7 to April 22, 2020, as reported by The New York Times. 
Statistically significant results were identified using 95% CIs of median absolute error (MdAE) and median absolute percentage error (MdAPE) metrics of the resulting 216,698 forecasts. Results: The next-day MA forecast with 3-day look-back length obtained the lowest MdAE (median 0.67, 95% CI 0.49-0.84, P<.001) and statistically significantly differed from 39 out of 59 alternatives (66%) to 53 out of 59 alternatives (90%) at each geographic level at a significance level of .01. For short-range forecasting, methods assuming stationary means of prior days’ counts outperformed methods with assumptions of weak stationarity or nonstationarity means. MdAPE results revealed statistically significant differences across geographic levels. Conclusions: For short-range COVID-19 cumulative case count forecasting at the county, health district, and state levels during early onset, the following were found: (1) the MA method was effective for forecasting 1-, 3-, and 7-day cumulative case counts; (2) exponential growth was not the best representation of case growth during early virus onset when the public was aware of the virus; and (3) geographic resolution was a factor in the selection of forecasting methods. 
%M 33621186 %R 10.2196/24925 %U https://www.jmir.org/2021/3/e24925 %U https://doi.org/10.2196/24925 %U http://www.ncbi.nlm.nih.gov/pubmed/33621186 %0 Journal Article %@ 2563-6316 %I JMIR Publications %V 2 %N 1 %P e22617 %T A Framework for a Statistical Characterization of Epidemic Cycles: COVID-19 Case Study %A De Carvalho,Eduardo Atem %A De Carvalho,Rogerio Atem %+ Innovation Hub, Instituto Federal Fluminense, R Cel Walter Kramer, 357, Campos, 28080-565, Brazil, 55 22 27375692, ratem@iff.edu.br %K COVID-19 %K SARS-CoV-2 %K pandemics %K infection control %K models %K experimental %K longitudinal studies %K statistical modeling %K epidemic cycles %D 2021 %7 18.3.2021 %9 Original Paper %J JMIRx Med %G English %X Background: Since the beginning of the COVID-19 pandemic, researchers and health authorities have sought to identify the different parameters that drive its local transmission cycles to make better decisions regarding prevention and control measures. Different modeling approaches have been proposed in an attempt to predict the behavior of these local cycles. Objective: This paper presents a framework to characterize the different variables that drive the local, or epidemic, cycles of the COVID-19 pandemic, in order to provide a set of relatively simple, yet efficient, statistical tools to be used by local health authorities to support decision making. Methods: Virtually closed cycles were compared to cycles in progress from different locations that present similar patterns in the figures that describe them. With the aim to compare populations of different sizes at different periods of time and locations, the cycles were normalized, allowing an analysis based on the core behavior of the numerical series. A model for the reproduction number was derived from the experimental data, and its performance was presented, including the effect of subnotification (ie, underreporting). 
A variation of the logistic model was used together with an innovative inventory model to calculate the actual number of infected persons, analyze the incubation period, and determine the actual onset of local epidemic cycles. Results: The similarities among cycles were demonstrated. A pattern between the cycles studied, which took on a triangular shape, was identified and used to make predictions about the duration of future cycles. Analyses on effective reproduction number (Rt) and subnotification effects for Germany, Italy, and Sweden were presented to show the performance of the framework introduced here. After comparing data from the three countries, it was possible to determine the probable dates of the actual onset of the epidemic cycles for each country, the typical duration of the incubation period for the disease, and the total number of infected persons during each cycle. In general terms, a probable average incubation time of 5 days was found, and the method used here was able to estimate the end of the cycles up to 34 days in advance, while demonstrating that the impact of the subnotification level (ie, error) on the effective reproduction number was <5%. Conclusions: It was demonstrated that, with relatively simple mathematical tools, it is possible to obtain a reliable understanding of the behavior of COVID-19 local epidemic cycles, by introducing an integrated framework for identifying cycle patterns and calculating the variables that drive it, namely: the Rt, the subnotification effects on estimations, the most probable actual cycles start dates, the total number of infected, and the most likely incubation period for SARS-CoV-2. 
%M 34077489 %R 10.2196/22617 %U https://xmed.jmir.org/2021/1/e22617 %U https://doi.org/10.2196/22617 %U http://www.ncbi.nlm.nih.gov/pubmed/34077489 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 3 %P e21606 %T The Influence of Social Distancing on COVID-19 Mortality in US Counties: Cross-sectional Study %A Tran,Phoebe %A Tran,Lam %A Tran,Liem %+ Department of Geography, University of Tennessee, 1000 Phillip Fulmer Way, BGB 306, Knoxville, TN, United States, 1 865 974 6034, ltran1@utk.edu %K COVID-19 %K marginal effects %K mortality %K negative binomial model %K social distancing %D 2021 %7 18.3.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Previous studies on the impact of social distancing on COVID-19 mortality in the United States have predominantly examined this relationship at the national level and have not separated COVID-19 deaths in nursing homes from total COVID-19 deaths. This approach may obscure differences in social distancing behaviors by county in addition to the actual effectiveness of social distancing in preventing COVID-19 deaths. Objective: This study aimed to determine the influence of county-level social distancing behavior on COVID-19 mortality (deaths per 100,000 people) across US counties over the period of the implementation of stay-at-home orders in most US states (March-May 2020). 
Methods: Using social distancing data from tracked mobile phones in all US counties, we estimated the relationship between social distancing (average proportion of mobile phone usage outside of home between March and May 2020) and COVID-19 mortality (when the state in which the county is located reported its first confirmed case of COVID-19 and up to May 31, 2020) with a mixed-effects negative binomial model while distinguishing COVID-19 deaths in nursing homes from total COVID-19 deaths and accounting for social distancing– and COVID-19–related factors (including the period between the report of the first confirmed case of COVID-19 and May 31, 2020; population density; social vulnerability; and hospital resource availability). Results from the mixed-effects negative binomial model were then used to generate marginal effects at the mean, which helped separate the influence of social distancing on COVID-19 deaths from other covariates while calculating COVID-19 deaths per 100,000 people. Results: We observed that a 1% increase in average mobile phone usage outside of home between March and May 2020 led to a significant increase in COVID-19 mortality by a factor of 1.18 (P<.001), while every 1% increase in the average proportion of mobile phone usage outside of home in February 2020 was found to significantly decrease COVID-19 mortality by a factor of 0.90 (P<.001). Conclusions: As stay-at-home orders have been lifted in many US states, continued adherence to other social distancing measures, such as avoiding large gatherings and maintaining physical distance in public, is key to preventing additional COVID-19 deaths in counties across the country. 
%M 33497348 %R 10.2196/21606 %U https://publichealth.jmir.org/2021/3/e21606 %U https://doi.org/10.2196/21606 %U http://www.ncbi.nlm.nih.gov/pubmed/33497348 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 2 %P e20335 %T Evaluating Apple Inc Mobility Trend Data Related to the COVID-19 Outbreak in Japan: Statistical Analysis %A Kurita,Junko %A Sugishita,Yoshiyuki %A Sugawara,Tamie %A Ohkusa,Yasushi %+ Department of Nursing, Tokiwa University, 1-430-1 Miwa, Mito, Ibraki, 3108585, Japan, 81 29 232 2511, kuritaj@tokiwa.ac.jp %K peak %K COVID-19 %K effective reproduction number %K mobility trend data %K Apple %K countermeasure %D 2021 %7 15.2.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: In Japan, as a countermeasure against the COVID-19 outbreak, both the national and local governments issued voluntary restrictions against going out from residences at the end of March 2020 in preference to the lockdowns instituted in European and North American countries. The effect of such measures can be studied with mobility data, such as the data generated by counting the number of requests made to Apple Maps for directions in select countries/regions, sub-regions, and cities. Objective: We investigate the association between mobility data provided by Apple Inc and an estimate of the effective reproduction number R(t). Methods: We regressed R(t) on a polynomial function of daily Apple data, estimated using the whole period, and analyzed subperiods delimited by March 10, 2020. Results: In the estimation results, R(t) was 1.72 when voluntary restrictions against going out ceased and mobility reverted to a normal level. However, the critical level of reducing R(t) to <1 was obtained at 89.3% of normal mobility. Conclusions: We demonstrated that Apple mobility data are useful for short-term prediction of R(t). 
The results indicate that the number of trips should decrease by 10% until herd immunity is achieved and that higher voluntary restrictions against going out might not be necessary for avoiding a re-emergence of the outbreak. %M 33481755 %R 10.2196/20335 %U http://publichealth.jmir.org/2021/2/e20335/ %U https://doi.org/10.2196/20335 %U http://www.ncbi.nlm.nih.gov/pubmed/33481755 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 1 %P e23034 %T Using the Novel Mortality-Prevalence Ratio to Evaluate Potentially Undocumented SARS-CoV-2 Infection: Correlational Study %A Lin,Sheng-Hsuan %A Fu,Shih-Chen %A Kao,Chu-Lan Michael %+ Institute of Statistics, National Chiao Tung University, Assembly Building I, 4th Floor, 1001 University Road, Hsinchu, 30010, Taiwan, 886 35712121 ext 56822, chulankao@gmail.com %K COVID-19 %K prevalence %K mortality %K undocumented infection %K mortality-prevalence ratio %K China %D 2021 %7 27.1.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The high prevalence of COVID-19 has resulted in 200,000 deaths as of early 2020. The corresponding mortality rate among different countries and times varies. Objective: This study aims to investigate the relationship between the mortality rate and prevalence of COVID-19 within a country. Methods: We collected data from the Johns Hopkins Coronavirus Resource Center. These data included the daily cumulative death count, recovered count, and confirmed count for each country. This study focused on a total of 36 countries with over 10,000 confirmed COVID-19 cases. Mortality was the main outcome and dependent variable, and it was computed by dividing the number of COVID-19 deaths by the number of confirmed cases. Results: The results of our global panel regression analysis showed that there was a highly significant correlation between prevalence and mortality (ρ=0.8304; P<.001). 
We found that every increment of 1 confirmed COVID-19 case per 1000 individuals led to a 1.29268% increase in mortality, after controlling for country-specific baseline mortality and time-fixed effects. Over 70% of excess mortality could be attributed to prevalence, and the heterogeneity among countries’ mortality-prevalence ratio was significant (P<.001). Further, our results showed that China had an abnormally high and significant mortality-prevalence ratio compared to other countries (P<.001). This unusual deviation in the mortality-prevalence ratio disappeared with the removal of the data that was collected from China after February 17, 2020. It is worth noting that the prevalence of a disease relies on accurate diagnoses and comprehensive surveillance, which can be difficult to achieve due to practical or political concerns. Conclusions: The association between COVID-19 mortality and prevalence was observed and quantified as the mortality-prevalence ratio. Our results highlight the importance of constraining disease transmission to decrease mortality rates. The comparison of mortality-prevalence ratios between countries can be a powerful method for detecting, or even quantifying, the proportion of individuals with undocumented SARS-CoV-2 infection. 
%M 33332282 %R 10.2196/23034 %U http://publichealth.jmir.org/2021/1/e23034/ %U https://doi.org/10.2196/23034 %U http://www.ncbi.nlm.nih.gov/pubmed/33332282 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 4 %P e23902 %T County-Level Social Distancing and Policy Impact in the United States: A Dynamical Systems Model %A McKee,Kevin L %A Crandell,Ian C %A Hanlon,Alexandra L %+ Center for Biostatistics and Health Data Science, Virginia Tech, One Riverside Circle, Suite 104, Roanoke, VA, 24016, United States, 1 703 593 1690, klmckee@vt.edu %K pandemic %K SARS-CoV-2 %K infection control %K COVID-19 %K social distancing %K lockdown %K nonpharmaceutical interventions %K public health %K intervention %K model %K infectious disease %K policy %D 2020 %7 23.12.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Social distancing and public policy have been crucial for minimizing the spread of SARS-CoV-2 in the United States. Publicly available, county-level time series data on mobility are derived from individual devices with global positioning systems, providing a variety of indices of social distancing behavior per day. Such indices allow a fine-grained approach to modeling public behavior during the pandemic. Previous studies of social distancing and policy have not accounted for the occurrence of pre-policy social distancing and other dynamics reflected in the long-term trajectories of public mobility data. Objective: We propose a differential equation state-space model of county-level social distancing that accounts for distancing behavior leading up to the first official policies, equilibrium dynamics reflected in the long-term trajectories of mobility, and the specific impacts of four kinds of policy. The model is fit to each US county individually, producing a nationwide data set of novel estimated mobility indices. 
Methods: A differential equation model was fit to three indicators of mobility for each of 3054 counties, with T=100 occasions per county of the following: distance traveled, visitations to key sites, and the log number of interpersonal encounters. The indicators were highly correlated and assumed to share common underlying latent trajectory, dynamics, and responses to policy. Maximum likelihood estimation with the Kalman-Bucy filter was used to estimate the model parameters. Bivariate distributional plots and descriptive statistics were used to examine the resulting county-level parameter estimates. The association of chronology with policy impact was also considered. Results: Mobility dynamics show moderate correlations with two census covariates: population density (Spearman r ranging from 0.11 to 0.31) and median household income (Spearman r ranging from –0.03 to 0.39). Stay-at-home order effects were negatively correlated with both (r=–0.37 and r=–0.38, respectively), while the effects of the ban on all gatherings were positively correlated with both (r=0.51, r=0.39). Chronological ordering of policies was a moderate to strong determinant of their effect per county (Spearman r ranging from –0.12 to –0.56), with earlier policies accounting for most of the change in mobility, and later policies having little or no additional effect. Conclusions: Chronological ordering, population density, and median household income were all associated with policy impact. The stay-at-home order and the ban on gatherings had the largest impacts on mobility on average. The model is implemented in a graphical online app for exploring county-level statistics and running counterfactual simulations. Future studies can incorporate the model-derived indices of social distancing and policy impacts as important social determinants of COVID-19 health outcomes. 
%M 33296866 %R 10.2196/23902 %U http://publichealth.jmir.org/2020/4/e23902/ %U https://doi.org/10.2196/23902 %U http://www.ncbi.nlm.nih.gov/pubmed/33296866 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 12 %P e24614 %T Reduction of COVID-19 Incidence and Nonpharmacologic Interventions: Analysis Using a US County–Level Policy Data Set %A Ebrahim,Senan %A Ashworth,Henry %A Noah,Cray %A Kadambi,Adesh %A Toumi,Asmae %A Chhatwal,Jagpreet %+ Harvard Medical School, 25 Shattuck St, Boston, MA, 02115, United States, 1 617 432 1000, senan@hikmahealth.org %K communicable diseases %K COVID-19 %K data set %K pandemic %K policy %K public health %K data %K intervention %K effectiveness %K incidence %K time series %D 2020 %7 21.12.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Worldwide, nonpharmacologic interventions (NPIs) have been the main tool used to mitigate the COVID-19 pandemic. This includes social distancing measures (closing businesses, closing schools, and quarantining symptomatic persons) and contact tracing (tracking and following exposed individuals). While preliminary research across the globe has shown these policies to be effective, there is currently a lack of information on the effectiveness of NPIs in the United States. Objective: The purpose of this study was to create a granular NPI data set at the county level and then analyze the relationship between NPI policies and changes in reported COVID-19 cases. Methods: Using a standardized crowdsourcing methodology, we collected time-series data on 7 key NPIs for 1320 US counties. Results: This open-source data set is the largest and most comprehensive collection of county NPI policy data and meets the need for higher-resolution COVID-19 policy data. Our analysis revealed a wide variation in county-level policies both within and among states (P<.001). We identified a correlation between workplace closures and lower growth rates of COVID-19 cases (P=.004). 
We found weak correlations between shelter-in-place enforcement and measures of Democratic local voter proportion (R=0.21) and elected leadership (R=0.22). Conclusions: This study is the first large-scale NPI analysis at the county level demonstrating a correlation between NPIs and decreased rates of COVID-19. Future work using this data set will explore the relationship between county-level policies and COVID-19 transmission to optimize real-time policy formulation. %M 33302253 %R 10.2196/24614 %U http://www.jmir.org/2020/12/e24614/ %U https://doi.org/10.2196/24614 %U http://www.ncbi.nlm.nih.gov/pubmed/33302253 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 12 %P e20597 %T Missing-Data Handling Methods for Lifelogs-Based Wellness Index Estimation: Comparative Analysis With Panel Data %A Kim,Ki-Hun %A Kim,Kwang-Jae %+ Faculty of Industrial Design Engineering, Delft University of Technology, Landbergstraat 15, Delft, 2628 CE, Netherlands, 31 625244785, K.Kim-1@tudelft.nl %K lifelogs-based wellness index %K missing-data handling %K health behavior lifelogs %K panel data %K smart wellness service %D 2020 %7 17.12.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: A lifelogs-based wellness index (LWI) is a function for calculating wellness scores based on health behavior lifelogs (eg, daily walking steps and sleep times collected via a smartwatch). A wellness score intuitively shows the users of smart wellness services the overall condition of their health behaviors. LWI development includes estimation (ie, estimating coefficients in LWI with data). A panel data set comprising health behavior lifelogs allows LWI estimation to control for unobserved variables, thereby resulting in less bias. However, these data sets typically have missing data due to events that occur in daily life (eg, smart devices stop collecting data when batteries are depleted), which can introduce biases into LWI coefficients. 
Thus, the appropriate choice of method to handle missing data is important for reducing biases in LWI estimations with panel data. However, there is a lack of research in this area. Objective: This study aims to identify a suitable missing-data handling method for LWI estimation with panel data. Methods: Listwise deletion, mean imputation, expectation maximization–based multiple imputation, predictive-mean matching–based multiple imputation, k-nearest neighbors–based imputation, and low-rank approximation–based imputation were comparatively evaluated by simulating an existing case of LWI development. A panel data set comprising health behavior lifelogs of 41 college students over 4 weeks was transformed into a reference data set without any missing data. Then, 200 simulated data sets were generated by randomly introducing missing data at proportions from 1% to 80%. The missing-data handling methods were each applied to transform the simulated data sets into complete data sets, and coefficients in a linear LWI were estimated for each complete data set. For each proportion for each method, a bias measure was calculated by comparing the estimated coefficient values with values estimated from the reference data set. Results: Methods performed differently depending on the proportion of missing data. For 1% to 30% proportions, low-rank approximation–based imputation, predictive-mean matching–based multiple imputation, and expectation maximization–based multiple imputation were superior. For 31% to 60% proportions, low-rank approximation–based imputation and predictive-mean matching–based multiple imputation performed best. For over 60% proportions, only low-rank approximation–based imputation performed acceptably. Conclusions: Low-rank approximation–based imputation was the best of the 6 data-handling methods regardless of the proportion of missing data. 
This superiority is generalizable to other panel data sets comprising health behavior lifelogs given their verified low-rank nature, for which low-rank approximation–based imputation is known to perform effectively. This result will guide missing-data handling in reducing coefficient biases in new development cases of linear LWIs with panel data. %M 33331831 %R 10.2196/20597 %U http://medinform.jmir.org/2020/12/e20597/ %U https://doi.org/10.2196/20597 %U http://www.ncbi.nlm.nih.gov/pubmed/33331831 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 4 %P e23624 %T An Epidemiological Model Considering Isolation to Predict COVID-19 Trends in Tokyo, Japan: Numerical Analysis %A Utamura,Motoaki %A Koizumi,Makoto %A Kirikami,Seiichi %+ Research Laboratory for Nuclear Reactors, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo, 1528550, Japan, 81 3 5477 3464, titech02715@gmail.com %K coronavirus %K COVID-19 %K epidemiological model %K prediction %K Tokyo %K delay differential equation %K SIR model %K model %K epidemiology %K isolation %K trend %D 2020 %7 16.12.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: COVID-19 currently poses a global public health threat. Although Tokyo, Japan, is no exception to this, it was initially affected by only a small-level epidemic. Nevertheless, medical collapse nearly happened since no predictive methods were available to assess infection counts. A standard susceptible-infectious-removed (SIR) epidemiological model has been widely used, but its applicability is limited often to the early phase of an epidemic in the case of a large collective population. A full numerical simulation of the entire period from beginning until end would be helpful for understanding COVID-19 trends in (separate) counts of inpatient and infectious cases and can also aid the preparation of hospital beds and development of quarantine strategies. 
Objective: This study aimed to develop an epidemiological model that considers the isolation period to simulate a comprehensive trend of the initial epidemic in Tokyo that yields separate counts of inpatient and infectious cases. It was also intended to derive important corollaries of the governing equations (ie, effective reproductive number) and equations for the final count. Methods: Time-series data related to SARS-CoV-2 from February 28 to May 23, 2020, from Tokyo and antibody testing conducted by the Japanese government were adopted for this study. A novel epidemiological model based on a discrete delay differential equation (apparent time-lag model [ATLM]) was introduced. The model can predict trends in inpatient and infectious cases in the field. Various data such as daily new confirmed cases, cumulative infections, inpatients, and PCR (polymerase chain reaction) test positivity ratios were used to verify the model. This approach also derived an alternative formulation equivalent to the standard SIR model. Results: In a typical parameter setting, the present ATLM provided 20% fewer infectious cases in the field compared to the standard SIR model prediction owing to isolation. The basic reproductive number was inferred as 2.30 under the condition that the time lag T from infection to detection and isolation is 14 days. Based on this, an adequate vaccine ratio to avoid an outbreak was evaluated at 57% of the population. We assessed the date (May 23) that the government declared a rescission of the state of emergency. Taking into consideration the number of infectious cases in the field, a date of 1 week later (May 30) would have been most effective. Furthermore, simulation results with a shorter time lag of T=7 and a larger transmission rate of α=1.43α0 suggest that infections at large should reduce by half and inpatient numbers should be similar to those of the first wave of COVID-19. 
Conclusions: A novel mathematical model was proposed and examined using SARS-CoV-2 data for Tokyo. The simulation agreed with data from the beginning of the pandemic. Shortening the period from infection to hospitalization is effective against outbreaks without rigorous public health interventions and control. %M 33259325 %R 10.2196/23624 %U http://publichealth.jmir.org/2020/4/e23624/ %U https://doi.org/10.2196/23624 %U http://www.ncbi.nlm.nih.gov/pubmed/33259325 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 12 %P e24286 %T Dynamic Public Health Surveillance to Track and Mitigate the US COVID-19 Epidemic: Longitudinal Trend Analysis Study %A Post,Lori Ann %A Issa,Tariq Ziad %A Boctor,Michael J %A Moss,Charles B %A Murphy,Robert L %A Ison,Michael G %A Achenbach,Chad J %A Resnick,Danielle %A Singh,Lauren Nadya %A White,Janine %A Faber,Joshua Marco Mitchell %A Culler,Kasen %A Brandt,Cynthia A %A Oehmke,James Francis %+ Buehler Center for Health Policy & Economics and Departments of Emergency Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Dr, 9th Floor, Suite 9-9035 Rubloff Building, Chicago, IL, 60611, United States, 1 203 980 7107, lori.post@northwestern.edu %K global COVID-19 surveillance %K United States public health surveillance %K US COVID-19 %K surveillance metrics %K dynamic panel data %K generalized method of the moments %K United States econometrics %K US SARS-CoV-2 %K US COVID-19 surveillance system %K US COVID-19 transmission speed %K COVID-19 transmission acceleration %K COVID-19 speed %K COVID-19 acceleration %K COVID-19 jerk %K COVID-19 persistence %K Arellano-Bond estimator %K COVID-19 %D 2020 %7 3.12.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The emergence of SARS-CoV-2, the virus that causes COVID-19, has led to a global pandemic. The United States has been severely affected, accounting for the most COVID-19 cases and deaths worldwide. 
Without a coordinated national public health plan informed by surveillance with actionable metrics, the United States has been ineffective at preventing and mitigating the escalating COVID-19 pandemic. Existing surveillance has incomplete ascertainment and is limited by the use of standard surveillance metrics. Although many COVID-19 data sources track infection rates, informing prevention requires capturing the relevant dynamics of the pandemic. Objective: The aim of this study is to develop dynamic metrics for public health surveillance that can inform worldwide COVID-19 prevention efforts. Advanced surveillance techniques are essential to inform public health decision making and to identify where and when corrective action is required to prevent outbreaks. Methods: Using a longitudinal trend analysis study design, we extracted COVID-19 data from global public health registries. We used an empirical difference equation to measure daily case numbers for our use case in 50 US states and the District of Columbia as a function of the prior number of cases, the level of testing, and weekly shift variables based on a dynamic panel model that was estimated using the generalized method of moments approach by implementing the Arellano-Bond estimator in R. Results: Examination of the United States and state data demonstrated that most US states are experiencing outbreaks as measured by these new metrics of speed, acceleration, jerk, and persistence. Larger US states have high COVID-19 caseloads as a function of population size, density, and deficits in adherence to public health guidelines early in the epidemic, and other states have alarming rates of speed, acceleration, jerk, and 7-day persistence in novel infections. North and South Dakota have had the highest rates of COVID-19 transmission combined with positive acceleration, jerk, and 7-day persistence. Wisconsin and Illinois also have alarming indicators and already lead the nation in daily new COVID-19 infections. 
As the United States enters its third wave of COVID-19, all 50 states and the District of Columbia have positive rates of speed between 7.58 (Hawaii) and 175.01 (North Dakota), and persistence, ranging from 4.44 (Vermont) to 195.35 (North Dakota) new infections per 100,000 people. Conclusions: Standard surveillance techniques such as daily and cumulative infections and deaths are helpful but only provide a static view of what has already occurred in the pandemic and are less helpful in prevention. Public health policy that is informed by dynamic surveillance can shift the country from reacting to COVID-19 transmissions to being proactive and taking corrective action when indicators of speed, acceleration, jerk, and persistence remain positive week over week. Implicit within our dynamic surveillance is an early warning system that indicates when there is problematic growth in COVID-19 transmissions as well as signals when growth will become explosive without action. A public health approach that focuses on prevention can prevent major outbreaks in addition to endorsing effective public health policies. Moreover, subnational analyses on the dynamics of the pandemic allow us to zero in on where transmissions are increasing, meaning corrective action can be applied with precision in problematic areas. Dynamic public health surveillance can inform specific geographies where quarantines are necessary while preserving the economy in other US areas. 
%M 33216726 %R 10.2196/24286 %U https://www.jmir.org/2020/12/e24286 %U https://doi.org/10.2196/24286 %U http://www.ncbi.nlm.nih.gov/pubmed/33216726 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 11 %P e20144 %T Novel Indicator to Ascertain the Status and Trend of COVID-19 Spread: Modeling Study %A Nakano,Takashi %A Ikeda,Yoichi %+ Research Center for Nuclear Physics, Osaka University, 10-1 Mihogaoka, Ibaraki, Osaka, 567-0047, Japan, 81 6 6879 8900, nakano@rcnp.osaka-u.ac.jp %K communicable diseases %K COVID-19 %K SARS-CoV-2 %K model %K modeling %K virus %K infectious disease %K spread %D 2020 %7 30.11.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: In the fight against the pandemic of COVID-19, it is important to ascertain the status and trend of the infection spread quickly and accurately. Objective: The purpose of our study is to formulate a new and simple indicator that represents the COVID-19 spread rate by using publicly available data. Methods: The new indicator K is a backward difference approximation of the logarithmic derivative of the cumulative number of cases with a time interval of 7 days. It is calculated as a ratio of the number of newly confirmed cases in a week to the total number of cases. Results: The analysis of the current status of COVID-19 spreading over countries showed an approximate linear decrease in the time evolution of the K value. The slope of the linear decrease differed from country to country. In addition, it was steeper for East and Southeast Asian countries than for European countries. The regional difference in the slope seems to reflect both social and immunological circumstances for each country. Conclusions: The approximate linear decrease of the K value indicates that the COVID-19 spread does not grow exponentially but starts to attenuate from the early stage. 
The K trajectory in a wide range was successfully reproduced by a phenomenological model with the constant attenuation assumption, indicating that the total number of the infected people follows the Gompertz curve. Focusing on the change in the value of K will help to improve and refine epidemiological models of COVID-19. %M 33180742 %R 10.2196/20144 %U http://www.jmir.org/2020/11/e20144/ %U https://doi.org/10.2196/20144 %U http://www.ncbi.nlm.nih.gov/pubmed/33180742 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 11 %P e24248 %T A SARS-CoV-2 Surveillance System in Sub-Saharan Africa: Modeling Study for Persistence and Transmission to Inform Policy %A Post,Lori Ann %A Argaw,Salem T %A Jones,Cameron %A Moss,Charles B %A Resnick,Danielle %A Singh,Lauren Nadya %A Murphy,Robert Leo %A Achenbach,Chad J %A White,Janine %A Issa,Tariq Ziad %A Boctor,Michael J %A Oehmke,James Francis %+ Buehler Center for Health Policy & Economics and Departments of Emergency Medicine, Feinberg School of Medicine, Northwestern University, 750 N Lake Shore Dr, 9th Floor, Suite 9-9035 Rubloff Building, Chicago, IL, 60611, United States, 1 203 980 7107, lori.post@northwestern.edu %K global COVID-19 surveillance %K African public health surveillance %K sub-Saharan African COVID-19 %K African surveillance metrics %K dynamic panel data %K generalized method of the moments %K African econometrics %K African SARS-CoV-2 %K African COVID-19 surveillance system %K African COVID-19 transmission speed %K African COVID-19 transmission acceleration %K COVID-19 transmission deceleration %K COVID-19 transmission jerk %K COVID-19 7-day persistence %K Sao Tome and Principe %K Senegal %K Seychelles %K Sierra Leone %K Somalia %K South Africa %K South Sudan %K Sudan %K Suriname %K Swaziland %K Tanzania %K Togo %K Uganda %K Zambia %K Zimbabwe %K Gambia %K Ghana %K Guinea %K Guinea-Bissau %K Kenya %K Lesotho %K Liberia %K Madagascar %K Malawi %K Mali %K Mauritania %K Mauritius %K Mozambique %K 
Namibia %K Niger %K Nigeria %K Rwanda %K Angola %K Benin %K Botswana %K Burkina Faso %K Burundi %K Cameroon %K Central African Republic %K Chad %K Comoros %K Congo %K Cote d'Ivoire %K Democratic Republic of Congo %K Equatorial Guinea %K Eritrea %K Ethiopia %K Gabon %D 2020 %7 19.11.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Since the novel coronavirus emerged in late 2019, the scientific and public health community around the world have sought to better understand, surveil, treat, and prevent the disease, COVID-19. In sub-Saharan Africa (SSA), many countries responded aggressively and decisively with lockdown measures and border closures. Such actions may have helped prevent large outbreaks throughout much of the region, though there is substantial variation in caseloads and mortality between nations. Additionally, the health system infrastructure remains a concern throughout much of SSA, and the lockdown measures threaten to increase poverty and food insecurity for the subcontinent’s poorest residents. The lack of sufficient testing, asymptomatic infections, and poor reporting practices in many countries limit our understanding of the virus’s impact, creating a need for better and more accurate surveillance metrics that account for underreporting and data contamination. Objective: The goal of this study is to improve infectious disease surveillance by complementing standardized metrics with new and decomposable surveillance metrics of COVID-19 that overcome data limitations and contamination inherent in public health surveillance systems. 
In addition to prevalence of observed daily and cumulative testing, testing positivity rates, morbidity, and mortality, we derived COVID-19 transmission in terms of speed, acceleration or deceleration, change in acceleration or deceleration (jerk), and 7-day transmission rate persistence, which explains where and how rapidly COVID-19 is transmitting and quantifies shifts in the rate of acceleration or deceleration to inform policies to mitigate and prevent COVID-19 and food insecurity in SSA. Methods: We extracted 60 days of COVID-19 data from public health registries and employed an empirical difference equation to measure daily case numbers in 47 sub-Saharan countries as a function of the prior number of cases, the level of testing, and weekly shift variables based on a dynamic panel model that was estimated using the generalized method of moments approach by implementing the Arellano-Bond estimator in R. Results: Kenya, Ghana, Nigeria, Ethiopia, and South Africa have the most observed cases of COVID-19, and the Seychelles, Eritrea, Mauritius, Comoros, and Burundi have the fewest. In contrast, the speed, acceleration, jerk, and 7-day persistence indicate rates of COVID-19 transmissions differ from observed cases. In September 2020, Cape Verde, Namibia, Eswatini, and South Africa had the highest speed of COVID-19 transmissions at 13.1, 7.1, 3.6, and 3 infections per 100,000, respectively; Zimbabwe had an acceleration rate of transmission, while Zambia had the largest rate of deceleration this week compared to last week, referred to as a jerk. Finally, the 7-day persistence rate indicates the number of cases on September 15, 2020, which are a function of new infections from September 8, 2020, decreased in South Africa from 216.7 to 173.2 and Ethiopia from 136.7 to 106.3 per 100,000. 
The statistical approach was validated based on the regression results; they determined recent changes in the pattern of infection, and during the weeks of September 1-8 and September 9-15, there were substantial country differences in the evolution of the SSA pandemic. This change represents a decrease in the transmission model R value for that week and is consistent with a de-escalation in the pandemic for the sub-Saharan African continent in general. Conclusions: Standard surveillance metrics such as daily observed new COVID-19 cases or deaths are necessary but insufficient to mitigate and prevent COVID-19 transmission. Public health leaders also need to know where COVID-19 transmission rates are accelerating or decelerating, whether those rates increase or decrease over short time frames because the pandemic can quickly escalate, and how many cases today are a function of new infections 7 days ago. Even though SSA is home to some of the poorest countries in the world, development and population size are not necessarily predictive of COVID-19 transmission, meaning higher income countries like the United States can learn from African countries on how best to implement mitigation and prevention efforts. 
International Registered Report Identifier (IRRID): RR2-10.2196/21955 %M 33211026 %R 10.2196/24248 %U https://www.jmir.org/2020/11/e24248 %U https://doi.org/10.2196/24248 %U http://www.ncbi.nlm.nih.gov/pubmed/33211026 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 4 %P e21688 %T Availability and Quality of Surveillance and Survey Data on HIV Prevalence Among Sex Workers, Men Who Have Sex With Men, People Who Inject Drugs, and Transgender Women in Low- and Middle-Income Countries: Review of Available Data (2001-2017) %A Arias Garcia,Sonia %A Chen,Jia %A Calleja,Jesus Garcia %A Sabin,Keith %A Ogbuanu,Chinelo %A Lowrance,David %A Zhao,Jinkou %+ The Global Fund to Fight AIDS, Tuberculosis and Malaria, Chemin du Pommier 40, Grand-Saconnex, Geneva, 1218, Switzerland, 41 794117847, jinkou.zhao@theglobalfund.org %K Key populations %K HIV prevalence %K men who have sex with men %K people who inject drugs %K sex workers %K transgender women %K low- and middle-income countries %D 2020 %7 17.11.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: In 2019, 62% of new HIV infections occurred among key populations (KPs) and their sexual partners. The World Health Organization (WHO) recommends implementation of bio-behavioral surveys every 2-3 years to obtain HIV prevalence data for all KPs. However, the collection of these data is often less frequent and geographically limited. Objective: This study intended to assess the availability and quality of HIV prevalence data among sex workers (SWs), men who have sex with men (MSM), people who inject drugs, and transgender women (transwomen) in low- and middle-income countries. Methods: Data were obtained from survey reports, national reports, journal articles, and other grey literature available to the Global Fund, Joint United Nations Programme on HIV/AIDS, and WHO or from other open sources. Elements reviewed included names of subnational units, HIV prevalence, sampling method, and size. 
Based on geographical coverage, availability of trends over time, and recency of estimates, data were categorized by country and grouped as follows: nationally adequate, locally adequate but nationally inadequate, no recent data, no trends available, and no data. Results: Among the 123 countries assessed, 91.9% (113/123) presented at least 1 HIV prevalence data point for any KP; 78.0% (96/123) presented data for at least 2 groups; and 51.2% (63/123), for at least 3 groups. Data on all 4 groups were available for only 14.6% (18/123) of the countries. HIV prevalence data for SWs, MSM, people who inject drugs, and transwomen were available in 86.2% (106/123), 80.5% (99/123), 45.5% (56/123), and 23.6% (29/123) of the countries, respectively. Only 10.6% (13/123) of the countries presented nationally adequate data for any KP between 2001 and 2017; 6 for SWs; 2 for MSM; and 5 for people who inject drugs. Moreover, 26.8% (33/123) of the countries were categorized as locally adequate but nationally inadequate, mostly for SWs and MSM. No trend data on SWs and MSM were available for 38.2% (47/123) and 43.9% (54/123) of the countries, respectively, while no data on people who inject drugs and transwomen were available for 76.4% (94/123) and 54.5% (67/123) of the countries, respectively. An increase in the number of data points was observed for MSM and transwomen. Overall increases were noted in the number and proportions of data points, especially for MSM, people who inject drugs, and transwomen, with sample sizes exceeding 100. Conclusions: Despite general improvements in health data availability and quality, the availability of HIV prevalence data among the most vulnerable populations in low- and middle-income countries remains insufficient. Data collection should be expanded to include behavioral, clinical, and epidemiologic data through context-specific differentiated survey approaches while emphasizing data use for program improvements. 
Ending the HIV epidemic by 2030 is possible only if the epidemic is controlled among KPs. %M 33200996 %R 10.2196/21688 %U http://publichealth.jmir.org/2020/4/e21688/ %U https://doi.org/10.2196/21688 %U http://www.ncbi.nlm.nih.gov/pubmed/33200996 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 11 %P e23139 %T Evaluating Identity Disclosure Risk in Fully Synthetic Health Data: Model Development and Validation %A El Emam,Khaled %A Mosquera,Lucy %A Bass,Jason %+ School of Epidemiology and Public Health, Faculty of Medicine, University of Ottawa, 401 Smyth Road, Ottawa, ON, K1H 8L1, Canada, 1 6137975412, kelemam@ehealthinformation.ca %K synthetic data %K privacy %K data sharing %K data access %K de-identification %K open data %D 2020 %7 16.11.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: There has been growing interest in data synthesis for enabling the sharing of data for secondary analysis; however, there is a need for a comprehensive privacy risk model for fully synthetic data: If the generative models have been overfit, then it is possible to identify individuals from synthetic data and learn something new about them. Objective: The purpose of this study is to develop and apply a methodology for evaluating the identity disclosure risks of fully synthetic data. Methods: A full risk model is presented, which evaluates both identity disclosure and the ability of an adversary to learn something new if there is a match between a synthetic record and a real person. We term this “meaningful identity disclosure risk.” The model is applied on samples from the Washington State Hospital discharge database (2007) and the Canadian COVID-19 cases database. Both of these datasets were synthesized using a sequential decision tree process commonly used to synthesize health and social science data. 
Results: The meaningful identity disclosure risk for both of these synthesized samples was below the commonly used 0.09 risk threshold (0.0198 and 0.0086, respectively), and 4 times and 5 times lower than the risk values for the original datasets, respectively. Conclusions: We have presented a comprehensive identity disclosure risk model for fully synthetic data. The results for this synthesis method on 2 datasets demonstrate that synthesis can reduce meaningful identity disclosure risks considerably. The risk model can be applied in the future to evaluate the privacy of fully synthetic data. %M 33196453 %R 10.2196/23139 %U http://www.jmir.org/2020/11/e23139/ %U https://doi.org/10.2196/23139 %U http://www.ncbi.nlm.nih.gov/pubmed/33196453 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 4 %P e24291 %T Analysis of the COVID-19 Epidemic Transmission Network in Mainland China: K-Core Decomposition Study %A Qin,Lei %A Wang,Yidan %A Sun,Qiang %A Zhang,Xiaomei %A Shia,Ben-Chang %A Liu,Chengcheng %+ School of Statistics, Capital University of Economics and Business, No 121 Huaxiang Zhangjia Road, Fengtai District, Beijing, 100070, China, 86 188 1152 1258, ccliu@cueb.edu.cn %K COVID-19 %K epidemic network %K prevention and control %K k-core decomposition %D 2020 %7 13.11.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Since the outbreak of COVID-19 in December 2019 in Wuhan, Hubei Province, China, frequent interregional contacts and the high rate of infection spread have catalyzed the formation of an epidemic network. Objective: The aim of this study was to identify influential nodes and highlight the hidden structural properties of the COVID-19 epidemic network, which we believe is central to prevention and control of the epidemic. 
Methods: We first constructed a network of the COVID-19 epidemic among 31 provinces in mainland China; after some basic characteristics were revealed by the degree distribution, the k-core decomposition method was employed to provide static and dynamic evidence to determine the influential nodes and hierarchical structure. We then exhibited the influence power of the above nodes and the evolution of this power. Results: Only a small fraction of the provinces studied showed relatively strong outward or inward epidemic transmission effects. The three provinces of Hubei, Beijing, and Guangzhou showed the highest out-degrees, and the three highest in-degrees were observed for the provinces of Beijing, Henan, and Liaoning. In terms of the hierarchical structure of the COVID-19 epidemic network over the whole period, more than half of the 31 provinces were located in the innermost core. Considering the correlation of the characteristics and coreness of each province, we identified some significant negative and positive factors. Specific to the dynamic transmission process of the COVID-19 epidemic, three provinces of Anhui, Beijing, and Guangdong always showed the highest coreness from the third to the sixth week; meanwhile, Hubei Province maintained the highest coreness until the fifth week and then suddenly dropped to the lowest in the sixth week. We also found that the out-strengths of the innermost nodes were greater than their in-strengths before January 27, 2020, at which point a reversal occurred. Conclusions: Increasing our understanding of how epidemic networks form and function may help reduce the damaging effects of COVID-19 in China as well as in other countries and territories worldwide. 
%M 33108309 %R 10.2196/24291 %U http://publichealth.jmir.org/2020/4/e24291/ %U https://doi.org/10.2196/24291 %U http://www.ncbi.nlm.nih.gov/pubmed/33108309 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 4 %P e22678 %T Transmission Dynamics of the COVID-19 Epidemic at the District Level in India: Prospective Observational Study %A Saurabh,Suman %A Verma,Mahendra Kumar %A Gautam,Vaishali %A Kumar,Nitesh %A Goel,Akhil Dhanesh %A Gupta,Manoj Kumar %A Bhardwaj,Pankaj %A Misra,Sanjeev %+ Department of Community Medicine and Family Medicine, All India Institute of Medical Sciences, 2nd Floor, Academic Building, 342005, Jodhpur, India, 91 7766906623, drsumansaurabh@gmail.com %K Epidemiology %K SARS-CoV-2 %K COVID-19 %K serial interval %K basic reproduction number %K projection %K outbreak response %K India %K mathematical modeling %K infectious disease %D 2020 %7 15.10.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: On March 9, 2020, the first COVID-19 case was reported in Jodhpur, Rajasthan, in the northwestern part of India. Understanding the epidemiology of COVID-19 at a local level is becoming increasingly important to guide measures to control the pandemic. Objective: The aim of this study was to estimate the serial interval and basic reproduction number (R0) to understand the transmission dynamics of the COVID-19 outbreak at a district level. We used standard mathematical modeling approaches to assess the utility of these factors in determining the effectiveness of COVID-19 responses and projecting the size of the epidemic. Methods: Contact tracing of individuals infected with SARS-CoV-2 was performed to obtain the serial intervals. The median and 95th percentile values of the SARS-CoV-2 serial interval were obtained from the best fits with the Weibull, log-normal, log-logistic, gamma, and generalized gamma distributions. 
Aggregate and instantaneous R0 values were derived with different methods using the EarlyR and EpiEstim packages in R software. Results: The median and 95th percentile values of the serial interval were 5.23 days (95% CI 4.72-5.79) and 13.20 days (95% CI 10.90-18.18), respectively. R0 during the first 30 days of the outbreak was 1.62 (95% CI 1.07-2.17), which subsequently decreased to 1.15 (95% CI 1.09-1.21). The peak instantaneous R0 values obtained using a Poisson process developed by Jombart et al were 6.53 (95% CI 2.12-13.38) and 3.43 (95% CI 1.71-5.74) for sliding time windows of 7 and 14 days, respectively. The peak R0 values obtained using the method by Wallinga and Teunis were 2.96 (95% CI 2.52-3.36) and 2.92 (95% CI 2.65-3.22) for sliding time windows of 7 and 14 days, respectively. R0 values of 1.21 (95% CI 1.09-1.34) and 1.12 (95% CI 1.03-1.21) for the 7- and 14-day sliding time windows, respectively, were obtained on July 6, 2020, using the method by Jombart et al. Using the method by Wallinga and Teunis, values of 0.32 (95% CI 0.27-0.36) and 0.61 (95% CI 0.58-0.63) were obtained for the 7- and 14-day sliding time windows, respectively. The projection of cases over the next month was 2131 (95% CI 1799-2462). Reductions of transmission by 25% and 50% corresponding to reasonable and aggressive control measures could lead to 58.7% and 84.0% reductions in epidemic size, respectively. Conclusions: The projected transmission reductions indicate that strengthening control measures could lead to proportionate reductions of the size of the COVID-19 epidemic. Time-dependent instantaneous R0 estimation based on the process by Jombart et al was found to be better suited for guiding COVID-19 response at the district level than overall R0 or instantaneous R0 estimation by the Wallinga and Teunis method. A data-driven approach at the local level is proposed to be useful in guiding public health strategy and surge capacity planning. 
%M 33001839 %R 10.2196/22678 %U http://publichealth.jmir.org/2020/4/e22678/ %U https://doi.org/10.2196/22678 %U http://www.ncbi.nlm.nih.gov/pubmed/33001839 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 10 %P e21081 %T How Data Analytics and Big Data Can Help Scientists in Managing COVID-19 Diffusion: Modeling Study to Predict the COVID-19 Diffusion in Italy and the Lombardy Region %A Tosi,Davide %A Campi,Alessandro %+ Politecnico di Milano, Piazza Leonardo da Vinci 32, Milano, Italy, 39 0223993644, alessandro.campi@polimi.it %K COVID-19 %K SARS-CoV-2 %K big data %K data analytics %K predictive models %K prediction %K modeling %K Italy %K diffusion %D 2020 %7 14.10.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: COVID-19 is the most widely discussed topic worldwide in 2020, and at the beginning of the Italian epidemic, scientists tried to understand the virus diffusion and the epidemic curve of positive cases with controversial findings and numbers. Objective: In this paper, a data analytics study on the diffusion of COVID-19 in Italy and the Lombardy Region is developed to define a predictive model tailored to forecast the evolution of the diffusion over time. Methods: Starting with all available official data collected worldwide about the diffusion of COVID-19, we defined a predictive model at the beginning of March 2020 for the Italian country. Results: This paper aims at showing how this predictive model was able to forecast the behavior of the COVID-19 diffusion and how it predicted the total number of positive cases in Italy over time. The predictive model forecasted, for the Italian country, the end of the COVID-19 first wave by the beginning of June. 
Conclusions: This paper shows that big data and data analytics can help medical experts and epidemiologists in promptly designing accurate and generalized models to predict the different COVID-19 evolutionary phases in other countries and regions, and for second and third possible epidemic waves. %M 33027038 %R 10.2196/21081 %U http://www.jmir.org/2020/10/e21081/ %U https://doi.org/10.2196/21081 %U http://www.ncbi.nlm.nih.gov/pubmed/33027038 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 10 %P e21955 %T Dynamic Panel Surveillance of COVID-19 Transmission in the United States to Inform Health Policy: Observational Statistical Study %A Oehmke,James Francis %A Moss,Charles B %A Singh,Lauren Nadya %A Oehmke,Theresa Bristol %A Post,Lori Ann %+ Buehler Center for Health Policy and Economics, Feinberg School of Medicine, Northwestern University, 420 E Superior St, Chicago, IL, , United States, 1 203 980 7107, lori.post@northwestern.edu %K COVID-19 %K models %K surveillance %K reopening America %K contagion %K metrics %K surveillance %K health policy %K public health %D 2020 %7 5.10.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The Great COVID-19 Shutdown aimed to eliminate or slow the spread of SARS-CoV-2, the virus that causes COVID-19. The United States has no national policy, leaving states to independently implement public health guidelines that are predicated on a sustained decline in COVID-19 cases. Operationalization of “sustained decline” varies by state and county. Existing models of COVID-19 transmission rely on parameters such as case estimates or R0 and are dependent on intensive data collection efforts. Static statistical models do not capture all of the relevant dynamics required to measure sustained declines. Moreover, existing COVID-19 models use data that are subject to significant measurement error and contamination. 
Objective: This study will generate novel metrics of speed, acceleration, jerk, and 7-day lag in the speed of COVID-19 transmission using state government tallies of SARS-CoV-2 infections, including state-level dynamics of SARS-CoV-2 infections. This study provides the prototype for a global surveillance system to inform public health practice, including novel standardized metrics of COVID-19 transmission, for use in combination with traditional surveillance tools. Methods: Dynamic panel data models were estimated with the Arellano-Bond estimator using the generalized method of moments. This statistical technique allows for the control of a variety of deficiencies in the existing data. Tests of the validity of the model and statistical techniques were applied. Results: The statistical approach was validated based on the regression results, which determined recent changes in the pattern of infection. During the weeks of August 17-23 and August 24-30, 2020, there were substantial regional differences in the evolution of the US pandemic. Census regions 1 and 2 were relatively quiet with a small but significant persistence effect that remained relatively unchanged from the prior 2 weeks. Census region 3 was sensitive to the number of tests administered, with a high constant rate of cases. A weekly special analysis showed that these results were driven by states with a high number of positive test reports from universities. Census region 4 had a high constant number of cases and a significantly increased persistence effect during the week of August 24-30. This change represents an increase in the transmission model R value for that week and is consistent with a re-emergence of the pandemic. Conclusions: Reopening the United States comes with three certainties: (1) the “social” end of the pandemic and reopening are going to occur before the “medical” end even while the pandemic is growing. 
We need improved standardized surveillance techniques to inform leaders when it is safe to open sections of the country; (2) varying public health policies and guidelines unnecessarily result in varying degrees of transmission and outbreaks; and (3) even those states most successful in containing the pandemic continue to see a small but constant stream of new cases daily. %M 32924962 %R 10.2196/21955 %U https://www.jmir.org/2020/10/e21955 %U https://doi.org/10.2196/21955 %U http://www.ncbi.nlm.nih.gov/pubmed/32924962 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e20924 %T Dynamic Panel Estimate–Based Health Surveillance of SARS-CoV-2 Infection Rates to Inform Public Health Policy: Model Development and Validation %A Oehmke,James Francis %A Oehmke,Theresa B %A Singh,Lauren Nadya %A Post,Lori Ann %+ Department of Emergency Medicine, Feinberg School of Medicine, Northwestern University, 420 E. Superior St, Chicago, IL, 60611, United States, 1 203 980 7107, lori.post@northwestern.edu %K COVID-19 %K models %K surveillance %K COVID-19 surveillance system %K dynamic panel data %K infectious disease modeling %K reopening America %K COVID-19 guidelines %K COVID-19 health policy %D 2020 %7 22.9.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: SARS-CoV-2, the novel coronavirus that causes COVID-19, is a global pandemic with higher mortality and morbidity than any other virus in the last 100 years. Without public health surveillance, policy makers cannot know where and how the disease is accelerating, decelerating, and shifting. Unfortunately, existing models of COVID-19 contagion rely on parameters such as the basic reproduction number and use static statistical methods that do not capture all the relevant dynamics needed for surveillance. Existing surveillance methods use data that are subject to significant measurement error and other contaminants. 
Objective: The aim of this study is to provide a proof of concept of the creation of surveillance metrics that correct for measurement error and data contamination to determine when it is safe to ease pandemic restrictions. We applied state-of-the-art statistical modeling to existing internet data to derive the best available estimates of the state-level dynamics of COVID-19 infection in the United States. Methods: Dynamic panel data (DPD) models were estimated with the Arellano-Bond estimator using the generalized method of moments. This statistical technique enables control of various deficiencies in a data set. The validity of the model and statistical technique was tested. Results: A Wald chi-square test of the explanatory power of the statistical approach indicated that it is valid (χ²(10)=1489.84, P<.001), and a Sargan chi-square test indicated that the model identification is valid (χ²(946)=935.52, P=.59). The 7-day persistence rate for the week of June 27 to July 3 was 0.5188 (P<.001), meaning that every 10,000 new cases in the prior week were associated with 5188 cases 7 days later. For the week of July 4 to 10, the 7-day persistence rate increased by 0.2691 (P=.003), indicating that every 10,000 new cases in the prior week were associated with 7879 new cases 7 days later. Applied to the reported number of cases, these results indicate an increase of almost 100 additional new cases per day per state for the week of July 4-10. This signifies an increase in the reproduction parameter in the contagion models and corroborates the hypothesis that economic reopening without applying best public health practices is associated with a resurgence of the pandemic. Conclusions: DPD models successfully correct for measurement error and data contamination and are useful to derive surveillance metrics. 
The opening of America involves two certainties: the country will be COVID-19–free only when there is an effective vaccine, and the “social” end of the pandemic will occur before the “medical” end. Therefore, improved surveillance metrics are needed to inform leaders of how to open sections of the United States more safely. DPD models can inform this reopening in combination with the extraction of COVID-19 data from existing websites. %M 32915762 %R 10.2196/20924 %U http://www.jmir.org/2020/9/e20924/ %U https://doi.org/10.2196/20924 %U http://www.ncbi.nlm.nih.gov/pubmed/32915762 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 3 %P e21152 %T Prediction of the Transition From Subexponential to the Exponential Transmission of SARS-CoV-2 in Chennai, India: Epidemic Nowcasting %A Krishnamurthy,Kamalanand %A Ambikapathy,Bakiya %A Kumar,Ashwani %A Britto,Lourduraj De %+ Vector Control Research Centre, Indian Council for Medical Research, Indra Nagar, Puducherry, 605006, India, 91 4132272841, rljbritto@gmail.com %K COVID-19 %K epidemic %K mathematical modeling %K probabilistic models %K public transport %K exponential transmission %D 2020 %7 18.9.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Several countries adopted lockdown to slow down the exponential transmission of the coronavirus disease (COVID-19) epidemic. Disease transmission models and the epidemic forecasts at the national level steer the policy to implement appropriate intervention strategies and budgeting. However, it is critical to design a data-driven reliable model for nowcasting for smaller populations, in particular metro cities. Objective: The aim of this study is to analyze the transition of the epidemic from subexponential to exponential transmission in the Chennai metro zone and to analyze the probability of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) secondary infections while availing the public transport systems in the city. 
Methods: A single geographical zone “Chennai-Metro-Merge” was constructed by combining Chennai District with three bordering districts. Subexponential and exponential models were developed to analyze and predict the progression of the COVID-19 epidemic. Probabilistic models were applied to assess the probability of secondary infections while availing public transport after the release of the lockdown. Results: The model predicted that transition from subexponential to exponential transmission occurs around the eighth week after the reporting of a cluster of cases. The probability of secondary infections with a single index case in an enclosure of the city bus, the suburban train general coach, and the ladies coach was found to be 0.192, 0.074, and 0.114, respectively. Conclusions: Nowcasting at the early stage of the epidemic predicts the probable time point of the exponential transmission and alerts the public health system. After the lockdown release, public transportation will be the major source of SARS-CoV-2 transmission in metro cities, and appropriate strategies based on nowcasting are needed. 
%M 32609621 %R 10.2196/21152 %U https://publichealth.jmir.org/2020/3/e21152 %U https://doi.org/10.2196/21152 %U http://www.ncbi.nlm.nih.gov/pubmed/32609621 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 3 %P e18965 %T Flexible, Freely Available Stochastic Individual Contact Model for Exploring COVID-19 Intervention and Control Strategies: Development and Simulation %A Churches,Timothy %A Jorm,Louisa %+ Ingham Institute for Applied Medical Research, South Western Sydney Clinical School, Faculty of Medicine, University of New South Wales Sydney, 1 Campbell St, Liverpool, 2071, Australia, 61 468819609, timothy.churches@unsw.edu.au %K COVID-19 %K epidemic curve %K infection dynamics %K public health interventions %D 2020 %7 18.9.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Throughout March 2020, leaders in countries across the world were making crucial decisions about how and when to implement public health interventions to combat the coronavirus disease (COVID-19). They urgently needed tools to help them to explore what will work best in their specific circumstances of epidemic size and spread, and feasible intervention scenarios. Objective: We sought to rapidly develop a flexible, freely available simulation model for use by modelers and researchers to allow investigation of how various public health interventions implemented at various time points might change the shape of the COVID-19 epidemic curve. Methods: “COVOID” (COVID-19 Open-Source Infection Dynamics) is a stochastic individual contact model (ICM), which extends the ICMs provided by the open-source EpiModel package for the R statistical computing environment. To demonstrate its use and inform urgent decisions on March 30, 2020, we modeled similar intervention scenarios to those reported by other investigators using various model types, as well as novel scenarios. 
The scenarios involved isolation of cases, moderate social distancing, and stricter population “lockdowns” enacted over varying time periods in a hypothetical population of 100,000 people. On April 30, 2020, we simulated the epidemic curve for the three contiguous local areas (population 287,344) in eastern Sydney, Australia that recorded 5.3% of Australian cases of COVID-19 through to April 30, 2020, under five different intervention scenarios and compared the modeled predictions with the observed epidemic curve for these areas. Results: COVOID allocates each member of a population to one of seven compartments. The number of times individuals in the various compartments interact with each other and their probability of transmitting infection at each interaction can be varied to simulate the effects of interventions. Using COVOID on March 30, 2020, we were able to replicate the epidemic response patterns to specific social distancing intervention scenarios reported by others. The simulated curve for three local areas of Sydney from March 1 to April 30, 2020, was similar to the observed epidemic curve in terms of peak numbers of cases, total numbers of cases, and duration under a scenario representing the public health measures that were actually enacted, including case isolation and ramp-up of testing and social distancing measures. Conclusions: COVOID allows rapid modeling of many potential intervention scenarios, can be tailored to diverse settings, and requires only standard computing infrastructure. It replicates the epidemic curves produced by other models that require highly detailed population-level data, and its predicted epidemic curve, using parameters simulating the public health measures that were enacted, was similar in form to that actually observed in Sydney, Australia. 
Our team and collaborators are currently developing an extended open-source COVOID package comprising a suite of tools to explore intervention scenarios using several categories of models. %M 32568729 %R 10.2196/18965 %U https://publichealth.jmir.org/2020/3/e18965 %U https://doi.org/10.2196/18965 %U http://www.ncbi.nlm.nih.gov/pubmed/32568729 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 3 %P e18281 %T Potential Early Identification of a Large Campylobacter Outbreak Using Alternative Surveillance Data Sources: Autoregressive Modelling and Spatiotemporal Clustering %A Adnan,Mehnaz %A Gao,Xiaoying %A Bai,Xiaohan %A Newbern,Elizabeth %A Sherwood,Jill %A Jones,Nicholas %A Baker,Michael %A Wood,Tim %A Gao,Wei %+ Institute of Environmental Science and Research, Kenepuru Science Centre, Porirua, 5022, New Zealand, 64 274044941, mehnaz.adnan@esr.cri.nz %K Campylobacter %K disease outbreaks %K forecasting %K spatio-temporal analysis %D 2020 %7 17.9.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Over one-third of the population of Havelock North, New Zealand, approximately 5500 people, were estimated to have been affected by campylobacteriosis in a large waterborne outbreak. Cases reported through the notifiable disease surveillance system (notified case reports) are inevitably delayed by several days, resulting in slowed outbreak recognition and delayed control measures. Early outbreak detection and magnitude prediction are critical to outbreak control. It is therefore important to consider alternative surveillance data sources and evaluate their potential for recognizing outbreaks at the earliest possible time. 
Objective: The first objective of this study is to compare and validate the selection of alternative data sources (general practice consultations, consumer helpline, Google Trends, Twitter microblogs, and school absenteeism) for their temporal predictive strength for Campylobacter cases during the Havelock North outbreak. The second objective is to examine spatiotemporal clustering of data from alternative sources to assess the size and geographic extent of the outbreak and to support efforts to attribute its source. Methods: We combined measures derived from alternative data sources during the 2016 Havelock North campylobacteriosis outbreak with notified case report counts to predict suspected daily Campylobacter case counts up to 5 days before cases reported in the disease surveillance system. Spatiotemporal clustering of the data was analyzed using Local Moran’s I statistics to investigate the extent of the outbreak in both space and time within the affected area. Results: Models that combined consumer helpline data with autoregressive notified case counts had the best out-of-sample predictive accuracy for 1 and 2 days ahead of notified case reports. Models using Google Trends and Twitter typically performed the best 3 and 4 days before case notifications. Spatiotemporal clusters showed spikes in school absenteeism and consumer helpline inquiries that preceded the notified cases in the city primarily affected by the outbreak. Conclusions: Alternative data sources can provide earlier indications of a large gastroenteritis outbreak compared with conventional case notifications. Spatiotemporal analysis can assist in refining the geographical focus of an outbreak and can potentially support public health source attribution efforts. Further work is required to assess the location of such surveillance data sources and methods in routine public health practice. 
%M 32940617 %R 10.2196/18281 %U http://publichealth.jmir.org/2020/3/e18281/ %U https://doi.org/10.2196/18281 %U http://www.ncbi.nlm.nih.gov/pubmed/32940617 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 8 %N 9 %P e17977 %T Data Imputation and Body Weight Variability Calculation Using Linear and Nonlinear Methods in Data Collected From Digital Smart Scales: Simulation and Validation Study %A Turicchi,Jake %A O'Driscoll,Ruairi %A Finlayson,Graham %A Duarte,Cristiana %A Palmeira,A L %A Larsen,Sofus C %A Heitmann,Berit L %A Stubbs,R James %+ School of Psychology, The University of Leeds, 2 Lifton Place, Leeds, LS2 9JS, United Kingdom, 44 7718300764, psjt@leeds.ac.uk %K weight variability %K weight fluctuation %K weight cycling %K weight instability %K imputation %K validation %K digital tracking %K smart scales %K body weight %K energy balance %D 2020 %7 11.9.2020 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: Body weight variability (BWV) is common in the general population and may act as a risk factor for obesity or diseases. The correct identification of these patterns may have prognostic or predictive value in clinical and research settings. With advancements in technology allowing for the frequent collection of body weight data from electronic smart scales, new opportunities to analyze and identify patterns in body weight data are available. Objective: This study aims to compare multiple methods of data imputation and BWV calculation using linear and nonlinear approaches. Methods: In total, 50 participants from an ongoing weight loss maintenance study (the NoHoW study) were selected to develop the procedure. We addressed the following aspects of data analysis: cleaning, imputation, detrending, and calculation of total and local BWV. To test imputation, missing data were simulated at random and using real patterns of missingness. A total of 10 imputation strategies were tested. 
Next, BWV was calculated using linear and nonlinear approaches, and the effects of missing data and data imputation on these estimates were investigated. Results: Body weight imputation using structural modeling with Kalman smoothing or an exponentially weighted moving average provided the best agreement with observed values (root mean square error range 0.62%-0.64%). Imputation performance decreased with missingness and was similar between random and nonrandom simulations. Errors in BWV estimations from missing simulated data sets were low (2%-7% with 80% missing data or a mean of 67, SD 40.1 available body weights) compared with that of imputation strategies where errors were significantly greater, varying by imputation method. Conclusions: The decision to impute body weight data depends on the purpose of the analysis. Directions for the best performing imputation methods are provided. For the purpose of estimating BWV, data imputation should not be conducted. Linear and nonlinear methods of estimating BWV provide reasonably accurate estimates under high proportions (80%) of missing data. 
%M 32915155 %R 10.2196/17977 %U http://mhealth.jmir.org/2020/9/e17977/ %U https://doi.org/10.2196/17977 %U http://www.ncbi.nlm.nih.gov/pubmed/32915155 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 3 %P e19446 %T Early Stage Machine Learning–Based Prediction of US County Vulnerability to the COVID-19 Pandemic: Machine Learning Approach %A Mehta,Mihir %A Julaiti,Juxihong %A Griffin,Paul %A Kumara,Soundar %+ Purdue University, Regenstrief Center for Healthcare Engineering, West Lafayette, IN, 47907, United States, 1 765 496 7395, paulgriffin@purdue.edu %K COVID-19 %K coronavirus %K prediction model %K county-level vulnerability %K machine learning %K XGBoost %D 2020 %7 11.9.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The rapid spread of COVID-19 means that government and health services providers have little time to plan and design effective response policies. It is therefore important to quickly provide accurate predictions of how vulnerable geographic regions such as counties are to the spread of this virus. Objective: The aim of this study is to develop county-level prediction around near future disease movement for COVID-19 occurrences using publicly available data. Methods: We estimated county-level COVID-19 occurrences for the period March 14 to 31, 2020, based on data fused from multiple publicly available sources inclusive of health statistics, demographics, and geographical features. We developed a three-stage model using XGBoost, a machine learning algorithm, to quantify the probability of COVID-19 occurrence and estimate the number of potential occurrences for unaffected counties. Finally, these results were combined to predict the county-level risk. This risk was then used as an estimated after-five-day-vulnerability of the county. Results: The model predictions showed a sensitivity over 71% and specificity over 94% for models built using data from March 14 to 31, 2020. 
We found that population, population density, percentage of people aged >70 years, and prevalence of comorbidities play an important role in predicting COVID-19 occurrences. We observed a positive association at the county level between urbanicity and vulnerability to COVID-19. Conclusions: The developed model can be used for identification of vulnerable counties and potential data discrepancies. Limited testing facilities and delayed results introduce significant variation in reported cases, which produces a bias in the model. %M 32784193 %R 10.2196/19446 %U http://publichealth.jmir.org/2020/3/e19446/ %U https://doi.org/10.2196/19446 %U http://www.ncbi.nlm.nih.gov/pubmed/32784193 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e19907 %T Real-World Implications of a Rapidly Responsive COVID-19 Spread Model with Time-Dependent Parameters via Deep Learning: Model Development and Validation %A Jung,Se Young %A Jo,Hyeontae %A Son,Hwijae %A Hwang,Hyung Ju %+ Department of Mathematics, Pohang University of Science and Technology, 77, Cheongam-ro, Nam-gu, Pohang-si, Gyeongsangbuk-do, Pohang, 37673, Republic of Korea, 82 542792056, hjhwang@postech.ac.kr %K epidemic models %K SIR models %K time-dependent parameters %K neural networks %K deep learning %K COVID-19 %K modeling %K spread %K outbreak %D 2020 %7 9.9.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The COVID-19 pandemic has caused major disruptions worldwide since March 2020. The experience of the 1918 influenza pandemic demonstrated that decreases in the infection rates of COVID-19 do not guarantee continuity of the trend. Objective: The aim of this study was to develop a precise spread model of COVID-19 with time-dependent parameters via deep learning to respond promptly to the dynamic situation of the outbreak and proactively minimize damage. 
Methods: In this study, we investigated a mathematical model with time-dependent parameters via deep learning based on forward-inverse problems. We used data from the Korea Centers for Disease Control and Prevention (KCDC) and the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University for Korea and the other countries, respectively. Because the data consist of confirmed, recovered, and deceased cases, we selected the susceptible-infected-recovered (SIR) model and found approximated solutions as well as model parameters. Specifically, we applied fully connected neural networks to the solutions and parameters and designed suitable loss functions. Results: We developed an entirely new SIR model with time-dependent parameters via deep learning methods. Furthermore, we validated the model with the conventional Runge-Kutta fourth order model to confirm its convergent nature. In addition, we evaluated our model based on the real-world situation reported from the KCDC, the Korean government, and news media. We also cross-validated our model using data from the CSSE for Italy, Sweden, and the United States. Conclusions: The methodology and new model of this study could be employed for short-term prediction of COVID-19, which could help the government prepare for a new outbreak. In addition, from the perspective of measuring medical resources, our model has powerful strength because it assumes all the parameters as time-dependent, which reflects the exact status of viral spread. 
%M 32877350 %R 10.2196/19907 %U http://www.jmir.org/2020/9/e19907/ %U https://doi.org/10.2196/19907 %U http://www.ncbi.nlm.nih.gov/pubmed/32877350 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 3 %P e12842 %T An Automated Approach for Finding Spatio-Temporal Patterns of Seasonal Influenza in the United States: Algorithm Validation Study %A Sambaturu,Prathyush %A Bhattacharya,Parantapa %A Chen,Jiangzhuo %A Lewis,Bryan %A Marathe,Madhav %A Venkatramanan,Srinivasan %A Vullikanti,Anil %+ University of Virginia, Biocomplexity Institute and Initiative, 995 Research Park Boulevard, Charlottesville, VA, 22911, United States, 1 540 577 3102, vsakumar@virginia.edu %K epidemic data analysis %K summarization %K spatio-temporal patterns %K transactional data mining %D 2020 %7 4.9.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Agencies such as the Centers for Disease Control and Prevention (CDC) currently release influenza-like illness incidence data, along with descriptive summaries of simple spatio-temporal patterns and trends. However, public health researchers, government agencies, as well as the general public, are often interested in deeper patterns and insights into how the disease is spreading, with additional context. Analysis by domain experts is needed for deriving such insights from incidence data. Objective: Our goal was to develop an automated approach for finding interesting spatio-temporal patterns in the spread of a disease over a large region, such as regions which have specific characteristics (eg, high incidence in a particular week, those which showed a sudden change in incidence) or regions which have significantly different incidence compared to earlier seasons. Methods: We developed techniques from the area of transactional data mining for characterizing and finding interesting spatio-temporal patterns in disease spread in an automated manner. 
A key part of our approach involved using the principle of minimum description length for representing a given target set in terms of combinations of attributes (referred to as clauses); we considered both positive and negative clauses, relaxed descriptions which approximately represent the set, and used integer programming to find such descriptions. Finally, we designed an automated approach, which examines a large space of sets corresponding to different spatio-temporal patterns, and ranks them based on the ratio of their size to their description length (referred to as the compression ratio). Results: We applied our methods using minimum description length to find spatio-temporal patterns in the spread of seasonal influenza in the United States using state level influenza-like illness activity indicator data from the CDC. We observed that the compression ratios were over 2.5 for 50% of the chosen sets, when approximate descriptions and negative clauses were allowed. Sets with high compression ratios (eg, over 2.5) corresponded to interesting patterns in the spatio-temporal dynamics of influenza-like illness. Our approach also outperformed description by solution in terms of the compression ratio. Conclusions: Our approach, which is an unsupervised machine learning method, can provide new insights into patterns and trends in the disease spread in an automated manner. Our results show that the description complexity is an effective approach for characterizing sets of interest, which can be easily extended to other diseases and regions beyond influenza in the US. Our approach can also be easily adapted for automated generation of narratives. %M 32701458 %R 10.2196/12842 %U http://publichealth.jmir.org/2020/3/e12842/ %U https://doi.org/10.2196/12842 %U http://www.ncbi.nlm.nih.gov/pubmed/32701458 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e21345 %T The P Value Line Dance: When Does the Music Stop? 
%A Bendtsen,Marcus %+ Department of Health, Medicine and Caring Sciences, Division of Society and Health, Linköping University, Linköping, 581 83, Sweden, 46 13 28 10 00, marcus.bendtsen@liu.se %K sample size %K randomized controlled trial %K Bayesian analysis %K P value %K dichotomization %K dichotomy %K error %K uncertainty %D 2020 %7 27.8.2020 %9 Viewpoint %J J Med Internet Res %G English %X When should a trial stop? Such a seemingly innocent question evokes concerns of type I and II errors among those who believe that certainty can be the product of uncertainty and among researchers who have been told that they need to carefully calculate sample sizes, consider multiplicity, and not spend P values on interim analyses. However, the endeavor to dichotomize evidence into significant and nonsignificant has led to the basic driving force of science, namely uncertainty, to take a back seat. In this viewpoint we discuss that if testing the null hypothesis is the ultimate goal of science, then we need not worry about writing protocols, consider ethics, apply for funding, or run any experiments at all—all null hypotheses will be rejected at some point—everything has an effect. The job of science should be to unearth the uncertainties of the effects of treatments, not to test their difference from zero. We also show the fickleness of P values, how they may one day point to statistically significant results; and after a few more participants have been recruited, the once statistically significant effect suddenly disappears. We show plots which we hope would intuitively highlight that all assessments of evidence will fluctuate over time. Finally, we discuss the remedy in the form of Bayesian methods, where uncertainty leads; and which allows for continuous decision making to stop or continue recruitment, as new data from a trial is accumulated. 
%M 32852275 %R 10.2196/21345 %U http://www.jmir.org/2020/8/e21345/ %U https://doi.org/10.2196/21345 %U http://www.ncbi.nlm.nih.gov/pubmed/32852275 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e20259 %T Prognostic Modeling of COVID-19 Using Artificial Intelligence in the United Kingdom: Model Development and Validation %A Abdulaal,Ahmed %A Patel,Aatish %A Charani,Esmita %A Denny,Sarah %A Mughal,Nabeela %A Moore,Luke %+ NIHR Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, Exhibition Rd, South Kensington, London, SW7 2AZ, United Kingdom, 44 783 436 6302, l.moore@imperial.ac.uk %K COVID-19 %K coronavirus %K machine learning %K deep learning %K modeling %K artificial intelligence %K neural network %K prediction %D 2020 %7 25.8.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The current severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak is a public health emergency and the case fatality rate in the United Kingdom is significant. Although there appear to be several early predictors of outcome, there are no currently validated prognostic models or scoring systems applicable specifically to patients with confirmed SARS-CoV-2. Objective: We aim to create a point-of-admission mortality risk scoring system using an artificial neural network (ANN). Methods: We present an ANN that can provide a patient-specific, point-of-admission mortality risk prediction to inform clinical management decisions at the earliest opportunity. The ANN analyzes a set of patient features including demographics, comorbidities, smoking history, and presenting symptoms and predicts patient-specific mortality risk during the current hospital admission. The model was trained and validated on data extracted from 398 patients admitted to hospital with a positive real-time reverse transcription polymerase chain reaction (RT-PCR) test for SARS-CoV-2. 
Results: Patient-specific mortality was predicted with 86.25% accuracy, with a sensitivity of 87.50% (95% CI 61.65%-98.45%) and specificity of 85.94% (95% CI 74.98%-93.36%). The positive predictive value was 60.87% (95% CI 45.23%-74.56%), and the negative predictive value was 96.49% (95% CI 88.23%-99.02%). The area under the receiver operating characteristic curve was 90.12%. Conclusions: This analysis demonstrates an adaptive ANN trained on data at a single site, which demonstrates the early utility of deep learning approaches in a rapidly evolving pandemic with no established or validated prognostic scoring systems. %M 32735549 %R 10.2196/20259 %U http://www.jmir.org/2020/8/e20259/ %U https://doi.org/10.2196/20259 %U http://www.ncbi.nlm.nih.gov/pubmed/32735549 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e20285 %T Real-Time Forecasting of the COVID-19 Outbreak in Chinese Provinces: Machine Learning Approach Using Novel Digital Data and Estimates From Mechanistic Models %A Liu,Dianbo %A Clemente,Leonardo %A Poirier,Canelle %A Ding,Xiyu %A Chinazzi,Matteo %A Davis,Jessica %A Vespignani,Alessandro %A Santillana,Mauricio %+ Computational Health Informatics Program, Boston Children’s Hospital, 300 Longwood Avenue, Landmark 5th Floor East, Boston, MA, 02215, United States, 1 (617) 919 1795, msantill@g.harvard.edu %K COVID-19 %K coronavirus %K digital epidemiology %K modeling %K modeling disease outbreaks %K emerging outbreak %K machine learning %K precision public health %K machine learning in public health %K forecasting %K digital data %K mechanistic model %K hybrid simulation %K hybrid model %K simulation %D 2020 %7 17.8.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The inherent difficulty of identifying and monitoring emerging outbreaks caused by novel pathogens can lead to their rapid spread; and if left unchecked, they may become major public health threats to the planet. 
The ongoing coronavirus disease (COVID-19) outbreak, which has infected over 2,300,000 individuals and caused over 150,000 deaths, is an example of one of these catastrophic events. Objective: We present a timely and novel methodology that combines disease estimates from mechanistic models and digital traces, via interpretable machine learning methodologies, to reliably forecast COVID-19 activity in Chinese provinces in real time. Methods: Our method uses the following as inputs: (a) official health reports, (b) COVID-19–related internet search activity, (c) news media activity, and (d) daily forecasts of COVID-19 activity from a metapopulation mechanistic model. Our machine learning methodology uses a clustering technique that enables the exploitation of geospatial synchronicities of COVID-19 activity across Chinese provinces and a data augmentation technique to deal with the small number of historical disease observations characteristic of emerging outbreaks. Results: Our model is able to produce stable and accurate forecasts 2 days ahead of the current time and outperforms a collection of baseline models in 27 out of 32 Chinese provinces. Conclusions: Our methodology could be easily extended to other geographies currently affected by COVID-19 to aid decision makers with monitoring and possibly prevention. 
%M 32730217 %R 10.2196/20285 %U http://www.jmir.org/2020/8/e20285/ %U https://doi.org/10.2196/20285 %U http://www.ncbi.nlm.nih.gov/pubmed/32730217 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e19615 %T Global Research on Coronaviruses: An R Package %A Warin,Thierry %+ HEC Montreal, 3000, chemin de la Côte-Sainte-Catherine, Montreal, QC, H3T 2A7, Canada, 1 514 340 6185, thierry.warin@hec.ca %K COVID-19 %K SARS-CoV-2 %K coronavirus %K R package %K bibliometric %K virus %K infectious disease %K reference %K informatics %D 2020 %7 11.8.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: In these trying times, we developed an R package about bibliographic references on coronaviruses. Working with reproducible research principles based on open science, disseminating scientific information, providing easy access to scientific production on this particular issue, and offering a rapid integration in researchers’ workflows may help save time in this race against the virus, notably in terms of public health. Objective: The goal is to simplify the workflow of interested researchers, with multidisciplinary research in mind. With more than 60,500 medical bibliographic references at the time of publication, this package is among the largest about coronaviruses. Methods: This package could be of interest to epidemiologists, researchers in scientometrics, biostatisticians, as well as data scientists broadly defined. This package collects references from PubMed and organizes the data in a data frame. We then built functions to sort through this collection of references. Researchers can also integrate the data into their pipeline and implement them in R within their code libraries. Results: We provide a short use case in this paper based on a bibliometric analysis of the references made available by this package. 
Classification techniques can also be used to go through the large volume of references and allow researchers to save time on this part of their research. Network analysis can be used to filter the data set. Text mining techniques can also help researchers calculate similarity indices and help them focus on the parts of the literature that are relevant for their research. Conclusions: This package aims at accelerating research on coronaviruses. Epidemiologists can integrate this package into their workflow. It is also possible to add a machine learning layer on top of this package to model the latest advances in research about coronaviruses, as we update this package daily. It is also the only one of this size, to the best of our knowledge, to be built in the R language. %M 32730218 %R 10.2196/19615 %U http://www.jmir.org/2020/8/e19615/ %U https://doi.org/10.2196/19615 %U http://www.ncbi.nlm.nih.gov/pubmed/32730218 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 3 %P e18880 %T General Model for COVID-19 Spreading With Consideration of Intercity Migration, Insufficient Testing, and Active Intervention: Modeling Study of Pandemic Progression in Japan and the United States %A Zhan,Choujun %A Tse,Chi Kong %A Lai,Zhikang %A Chen,Xiaoyun %A Mo,Mingshen %+ City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, 0000, Hong Kong, 852 92701816, cktse@ieee.org %K pandemic spreading %K SEICR model %K COVID-19 %K prediction %K effect of intervention %D 2020 %7 3.7.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The coronavirus disease (COVID-19) began to spread in mid-December 2019 from Wuhan, China, to most provinces in China and over 200 other countries through an active travel network. Limited by the ability of the country or city to perform tests, the officially reported number of confirmed cases is expected to be much smaller than the true number of infected cases. 
Objective: This study aims to develop a new susceptible-exposed-infected-confirmed-removed (SEICR) model for predicting the spreading progression of COVID-19 with consideration of intercity travel and the difference between the number of confirmed cases and actual infected cases, and to apply the model to provide a realistic prediction for the United States and Japan under different scenarios of active intervention. Methods: The model introduces a new state variable corresponding to the actual number of infected cases, integrates intercity travel data to track the movement of exposed and infected individuals among cities, and allows different levels of active intervention to be considered so that a realistic prediction of the number of infected individuals can be performed. Moreover, the model generates future progression profiles for different levels of intervention by setting the parameters relative to the values found from the data fitting. Results: By fitting the model with the data of the COVID-19 infection cases and the intercity travel data for Japan (January 15 to March 20, 2020) and the United States (February 20 to March 20, 2020), model parameters were found and then used to predict the pandemic progression in 47 regions of Japan and 50 states (plus a federal district) in the United States. The model revealed that, as of March 19, 2020, the number of infected individuals in Japan and the United States could be 20-fold and 5-fold as many as the number of confirmed cases, respectively. The results showed that, without tightening the implementation of active intervention, Japan and the United States will see about 6.55% and 18.2% of the population eventually infected, respectively, and with a drastic 10-fold elevated active intervention, the number of people eventually infected can be reduced by up to 95% in Japan and 70% in the United States. 
Conclusions: The new SEICR model has revealed the effectiveness of active intervention for controlling the spread of COVID-19. Stepping up active intervention would be more effective for Japan, and raising the level of public vigilance in maintaining personal hygiene and social distancing is comparatively more important for the United States. %M 32589145 %R 10.2196/18880 %U https://publichealth.jmir.org/2020/3/e18880 %U https://doi.org/10.2196/18880 %U http://www.ncbi.nlm.nih.gov/pubmed/32589145 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 2 %P e19353 %T Modeling COVID-19 Latent Prevalence to Assess a Public Health Intervention at a State and Regional Scale: Retrospective Cohort Study %A Turk,Philip J %A Chou,Shih-Hsiung %A Kowalkowski,Marc A %A Palmer,Pooja P %A Priem,Jennifer S %A Spencer,Melanie D %A Taylor,Yhenneko J %A McWilliams,Andrew D %+ Center for Outcomes Research and Evaluation, Atrium Health, 1300 Scott Ave, Office 124, Charlotte, NC, 28203, United States, 1 304 376 5377, Philip.Turk@atriumhealth.org %K COVID-19 %K public health surveillance %K novel coronavirus 2019 %K pandemic %K forecasting %K SIR model %K detection probability %K latent prevalence %D 2020 %7 19.6.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Emergence of the coronavirus disease (COVID-19) caught the world off guard and unprepared, initiating a global pandemic. In the absence of evidence, individual communities had to take timely action to reduce the rate of disease spread and avoid overburdening their health care systems. Although a few predictive models have been published to guide these decisions, most have not taken into account spatial differences and have included assumptions that do not match the local realities. Access to reliable information that is adapted to local context is critical for policy makers to make informed decisions during a rapidly evolving pandemic. 
Objective: The goal of this study was to develop an adapted susceptible-infected-removed (SIR) model to predict the trajectory of the COVID-19 pandemic in North Carolina and the Charlotte Metropolitan Region, and to incorporate the effect of a public health intervention to reduce disease spread while accounting for unique regional features and imperfect detection. Methods: Three SIR models were fit to infection prevalence data from North Carolina and the greater Charlotte Region and then rigorously compared. One of these models (SIR-int) accounted for a stay-at-home intervention and imperfect detection of COVID-19 cases. We computed longitudinal total estimates of the susceptible, infected, and removed compartments of both populations, along with other pandemic characteristics such as the basic reproduction number. Results: Prior to March 26, disease spread was rapid at the pandemic onset with the Charlotte Region doubling time of 2.56 days (95% CI 2.11-3.25) and in North Carolina 2.94 days (95% CI 2.33-4.00). Subsequently, disease spread significantly slowed with doubling times increased in the Charlotte Region to 4.70 days (95% CI 3.77-6.22) and in North Carolina to 4.01 days (95% CI 3.43-4.83). Reflecting spatial differences, this deceleration favored the greater Charlotte Region compared to North Carolina as a whole. A comparison of the efficacy of intervention, defined as 1 – the hazard ratio of infection, gave 0.25 for North Carolina and 0.43 for the Charlotte Region. In addition, early in the pandemic, the initial basic SIR model had good fit to the data; however, as the pandemic and local conditions evolved, the SIR-int model emerged as the model with better fit. Conclusions: Using local data and continuous attention to model adaptation, our findings have enabled policy makers, public health officials, and health systems to proactively plan capacity and evaluate the impact of a public health intervention. 
Our SIR-int model for estimated latent prevalence was reasonably flexible, highly accurate, and demonstrated efficacy of a stay-at-home order at both the state and regional level. Our results highlight the importance of incorporating local context into pandemic forecast modeling, as well as the need to remain vigilant and informed by the data as we enter into a critical period of the outbreak. %M 32427104 %R 10.2196/19353 %U http://publichealth.jmir.org/2020/2/e19353/ %U https://doi.org/10.2196/19353 %U http://www.ncbi.nlm.nih.gov/pubmed/32427104 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 2 %P e15044 %T Assessing Bias in Population Size Estimates Among Hidden Populations When Using the Service Multiplier Method Combined With Respondent-Driven Sampling Surveys: Survey Study %A Chabata,Sungai T %A Fearon,Elizabeth %A Webb,Emily L %A Weiss,Helen A %A Hargreaves,James R %A Cowan,Frances M %+ Centre for Sexual Health and HIV/AIDS Research, 4 Bath Road, Belgravia, Harare, Zimbabwe, 263 773577686, sungaichabata@gmail.com %K service multiplier method %K respondent-driven sampling %K population size estimation %K female sex workers %K key populations %K HIV %K Zimbabwe %D 2020 %7 15.6.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Population size estimates (PSEs) for hidden populations at increased risk of HIV, including female sex workers (FSWs), are important to inform public health policy and resource allocation. The service multiplier method (SMM) is commonly used to estimate the sizes of hidden populations. We used this method to obtain PSEs for FSWs at 9 sites in Zimbabwe and explored methods for assessing potential biases that could arise in using this approach. Objective: This study aimed to guide the assessment of biases that arise when estimating the population sizes of hidden populations using the SMM combined with respondent-driven sampling (RDS) surveys. 
Methods: We conducted RDS surveys at 9 sites in late 2013, where the Sisters with a Voice program (the program), which collects program visit data of FSWs, was also present. Using the SMM, we obtained PSEs for FSWs at each site by dividing the number of FSWs who attended the program, based on program records, by the RDS-II weighted proportion of FSWs who reported attending this program in the previous 6 months in the RDS surveys. Both the RDS weighting and SMM make a number of assumptions, potentially leading to biases if the assumptions are not met. To test these assumptions, we used convergence and bottleneck plots to assess seed dependence of RDS-II proportion estimates, chi-square tests to assess if there was an association between the characteristics of FSWs and their knowledge of program existence, and logistic regression to compare the characteristics of FSWs attending the program with those recruited to RDS surveys. Results: The PSEs ranged from 194 (95% CI 62-325) to 805 (95% CI 456-1142) across 9 sites from May to November 2013. The 95% CIs for the majority of sites were wide. In some sites, the RDS-II proportion of women who reported program use in the RDS surveys may have been influenced by the characteristics of selected seeds, and we also observed bottlenecks in some sites. There was no evidence of association between characteristics of FSWs and knowledge of program existence, and in the majority of sites, there was no evidence that the characteristics of the populations differed between RDS and program data. Conclusions: We used a series of rigorous methods to explore potential biases in our PSEs. We were able to identify the biases and their potential direction, but we could not determine the ultimate direction of these biases in our PSEs. We have evidence that the PSEs in most sites may be biased and a suggestion that the bias is toward underestimation, and this should be considered if the PSEs are to be used. 
These tests for bias should be included when undertaking population size estimation using the SMM combined with RDS surveys. %M 32459645 %R 10.2196/15044 %U http://publichealth.jmir.org/2020/2/e15044/ %U https://doi.org/10.2196/15044 %U http://www.ncbi.nlm.nih.gov/pubmed/32459645 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 6 %P e15073 %T Distributed Regression Analysis Application in Large Distributed Data Networks: Analysis of Precision and Operational Performance %A Her,Qoua %A Malenfant,Jessica %A Zhang,Zilu %A Vilk,Yury %A Young,Jessica %A Tabano,David %A Hamilton,Jack %A Johnson,Ron %A Raebel,Marsha %A Boudreau,Denise %A Toh,Sengwee %+ Harvard Medical School, Harvard Pilgrim Health Care Institute, 401 Park Drive, 4th Floor East, Boston, MA, 02215, United States, 1 617 867 4885, qouaher@gmail.com %K distributed regression analysis %K distributed data networks %K privacy-protecting analytics %K pharmacoepidemiology %K PopMedNet %D 2020 %7 4.6.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: A distributed data network approach combined with distributed regression analysis (DRA) can reduce the risk of disclosing sensitive individual and institutional information in multicenter studies. However, software that facilitates large-scale and efficient implementation of DRA is limited. Objective: This study aimed to assess the precision and operational performance of a DRA application comprising a SAS-based DRA package and a file transfer workflow developed within the open-source distributed networking software PopMedNet in a horizontally partitioned distributed data network. Methods: We executed the SAS-based DRA package to perform distributed linear, logistic, and Cox proportional hazards regression analysis on a real-world test case with 3 data partners. We used PopMedNet to iteratively and automatically transfer highly summarized information between the data partners and the analysis center. 
We compared the DRA results with the results from standard SAS procedures executed on the pooled individual-level dataset to evaluate the precision of the SAS-based DRA package. We computed the execution time of each step in the workflow to evaluate the operational performance of the PopMedNet-driven file transfer workflow. Results: All DRA results were precise (<10^-12), and DRA model fit curves were identical or similar to those obtained from the corresponding pooled individual-level data analyses. All regression models required less than 20 min for full end-to-end execution. Conclusions: We integrated a SAS-based DRA package with PopMedNet and successfully tested the new capability within an active distributed data network. The study demonstrated the validity and feasibility of using DRA to enable more privacy-protecting analysis in multicenter studies. %M 32496200 %R 10.2196/15073 %U https://medinform.jmir.org/2020/6/e15073 %U https://doi.org/10.2196/15073 %U http://www.ncbi.nlm.nih.gov/pubmed/32496200 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 5 %P e18664 %T Optimization of Precontrol Methods and Analysis of a Dynamic Model for Brucellosis: Model Development and Validation %A Huang,Yihao %A Li,Mingtao %+ College of Mathematics, Shanxi University of Technology, 79, Yingze West St, Taiyuan, 030024, China, 86 13403459876, mingtaoli@sohu.com %K brucellosis %K dynamic model %K protective measures %K precontrol methods %D 2020 %7 27.5.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Brucella is a gram-negative, nonmotile bacterium without a capsule. The infection scope of Brucella is wide. The major source of infection is mammals such as cattle, sheep, goats, pigs, and dogs. Currently, human beings do not transmit Brucella to each other. When humans eat Brucella-contaminated food or contact animals or animal secretions and excretions infected with Brucella, they may develop brucellosis. 
Although brucellosis does not originate in humans, its diagnosis and cure are very difficult; thus, it has a huge impact on humans. Even with the rapid development of medical science, brucellosis is still a major problem for Chinese people. Currently, the number of patients with brucellosis in China is 100,000 per year. In addition, due to the ongoing improvement in the living standards of Chinese people, the demand for meat products has gradually increased, and increased meat transactions have greatly promoted the spread of brucellosis. Therefore, many researchers are concerned with investigating the transmission of Brucella as well as the diagnosis and treatment of brucellosis. Mathematical models have become an important tool for the study of infectious diseases. Mathematical models can reflect the spread of infectious diseases and be used to study the effect of different inhibition methods on infectious diseases. The effect of control measures to obtain effective suppression can provide theoretical support for the suppression of infectious diseases. Therefore, it is the objective of this study to build a suitable mathematical model for brucellosis infection. Objective: We aimed to study the optimized precontrol methods of brucellosis using a dynamic threshold–based microcomputer model and to provide critical theoretical support for the prevention and control of brucellosis. Methods: By studying the transmission characteristics of Brucella and building a Brucella transmission model, the precontrol methods were designed and presented to the key populations (Brucella-susceptible populations). We investigated the utilization of protective tools by the key populations before and after precontrol methods. Results: An improvement in the amount of glove-wearing was evident and significant (P<.001), increasing from 51.01% before the precontrol methods to 66.22% after the precontrol methods, an increase of 15.21%. 
However, the amount of hat-wearing did not improve significantly (P=.95). Hat-wearing among the key populations increased from 57.3% before the precontrol methods to 58.6% after the precontrol methods, an increase of 1.3%. Conclusions: By demonstrating the optimized precontrol methods for a brucellosis model built on a dynamic threshold–based microcomputer model, this study provides theoretical support for the suppression of Brucella and the improved usage of protective measures by key populations. %M 32459180 %R 10.2196/18664 %U https://medinform.jmir.org/2020/5/e18664 %U https://doi.org/10.2196/18664 %U http://www.ncbi.nlm.nih.gov/pubmed/32459180 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 5 %P e18627 %T Application of a Mathematical Model in Determining the Spread of the Rabies Virus: Simulation Study %A Huang,Yihao %A Li,Mingtao %+ School of Computer and Information Technology, Shanxi University, 92 Wucheng Road, Taiyuan, 030006, China, 86 15834136789, 297535248@qq.com %K rabies %K computer model %K suppression measures %K basic reproductive number %D 2020 %7 27.5.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Rabies is an acute infectious disease of the central nervous system caused by the rabies virus. The mortality rate of rabies is almost 100%. For some countries with poor sanitation, the spread of rabies among dogs is very serious. Objective: The objective of this paper was to study the ecological transmission mode of rabies to make theoretical contributions to the suppression of rabies in China. Methods: A mathematical model of the transmission mode of rabies was constructed using relevant data from the literature and officially published figures in China. Using this model, we fitted the data of the number of patients with rabies and predicted the future number of patients with rabies. In addition, we studied the effectiveness of different rabies suppression measures. 
Results: The results of the study indicated that the number of people infected with rabies will rise in the first stage, and then decrease. The model forecasted that in about 10 years, the number of rabies cases will be controlled within a relatively stable range. According to the prediction results of the model reported in this paper, the number of rabies cases will eventually plateau at approximately 500 people every year. Relatively effective rabies suppression measures include controlling the birth rate of domestic and wild dogs as well as increasing the level of rabies immunity in domestic dogs. Conclusions: The basic reproductive number of rabies in China is still greater than 1. That is, China currently has insufficient measures to control rabies. The research on the transmission mode of rabies and control measures in this paper can provide theoretical support for rabies control in China. %M 32459185 %R 10.2196/18627 %U http://medinform.jmir.org/2020/5/e18627/ %U https://doi.org/10.2196/18627 %U http://www.ncbi.nlm.nih.gov/pubmed/32459185 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 2 %P e11512 %T Cluster Detection Mechanisms for Syndromic Surveillance Systems: Systematic Review and Framework Development %A Yeng,Prosper Kandabongee %A Woldaregay,Ashenafi Zebene %A Solvoll,Terje %A Hartvigsen,Gunnar %+ Department of Computer Science, University of Tromsø, The Arctic University of Norway, NTNU Gjøvik Teknologiveien 22, Gjøvik, 2815, Norway, 47 96992743, prosper.yeng@gmail.com %K sentinel surveillance %K space-time clustering %K aberration detection %D 2020 %7 26.5.2020 %9 Review %J JMIR Public Health Surveill %G English %X Background: The time lag in detecting disease outbreaks remains a threat to global health security. The advancement of technology has made health-related data and other indicator activities easily accessible for syndromic surveillance of various datasets. 
At the heart of disease surveillance lies the clustering algorithm, which groups data with similar characteristics (spatial, temporal, or both) to uncover significant disease outbreaks. Despite these developments, there is a lack of updated reviews of trends and modelling options in cluster detection algorithms. Objective: Our purpose was to systematically review practically implemented disease surveillance clustering algorithms relating to temporal, spatial, and spatiotemporal clustering mechanisms for their usage and performance efficacies, and to develop an efficient cluster detection mechanism framework. Methods: We conducted a systematic review exploring Google Scholar, ScienceDirect, PubMed, IEEE Xplore, ACM Digital Library, and Scopus. Between January and March 2018, we conducted the literature search for articles published to date in English in peer-reviewed journals. The main eligibility criteria were studies that (1) examined a practically implemented syndromic surveillance system with cluster detection mechanisms, including over-the-counter medication, school and work absenteeism, and disease surveillance relating to the presymptomatic stage; and (2) focused on surveillance of infectious diseases. We identified relevant articles using the title, keywords, and abstracts as a preliminary filter with the inclusion criteria, and then conducted a full-text review of the relevant articles. We then developed a framework for cluster detection mechanisms for various syndromic surveillance systems based on the review. Results: The search identified a total of 5936 articles. Removal of duplicates resulted in 5839 articles. After an initial review of the titles, we excluded 4165 articles, with 1674 remaining. Reading of abstracts and keywords eliminated 1549 further records. An in-depth assessment of the remaining 125 articles resulted in a total of 27 articles for inclusion in the review. 
The result indicated that various clustering and aberration detection algorithms have been empirically implemented or assessed with real data and tested. Based on the findings of the review, we subsequently developed a framework to include data processing, clustering and aberration detection, visualization, and alerts and alarms. Conclusions: The review identified various algorithms that have been practically implemented and tested. These results might foster the development of effective and efficient cluster detection mechanisms in empirical syndromic surveillance systems relating to a broad spectrum of space, time, or space-time. %M 32357126 %R 10.2196/11512 %U http://publichealth.jmir.org/2020/2/e11512/ %U https://doi.org/10.2196/11512 %U http://www.ncbi.nlm.nih.gov/pubmed/32357126 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 2 %P e18638 %T Mathematical Modeling of COVID-19 Control and Prevention Based on Immigration Population Data in China: Model Development and Validation %A Huang,Qiangsheng %A Kang,Yu Sunny %+ Ping An Technology (Shenzhen) Co, Ltd, Ping An Wealth Building, 1088 Yuanshen Road, Pudong New District, Shanghai, 200135, China, 86 13761879218, hqsh@live.cn %K COVID-19 %K 2019-ncov %K epidemic control and prevention %K epidemic risk time series model %K incoming immigration population %K new diagnoses per day %D 2020 %7 25.5.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: At the end of February 2020, the spread of coronavirus disease (COVID-19) in China had drastically slowed and appeared to be under control compared to the peak data in early February of that year. However, the outcomes of COVID-19 control and prevention measures varied between regions (ie, provinces and municipalities) in China; moreover, COVID-19 has become a global pandemic, and the spread of the disease has accelerated in countries outside China. 
Objective: This study aimed to establish valid models to evaluate the effectiveness of COVID-19 control and prevention among various regions in China. These models also targeted regions with control and prevention problems by issuing immediate warnings. Methods: We built a mathematical model, the Epidemic Risk Time Series Model, and used it to analyze two sets of data, including the daily COVID-19 incidence (ie, newly diagnosed cases) as well as the daily immigration population size. Results: Based on the results of the model evaluation, some regions, such as Shanghai and Zhejiang, were successful in COVID-19 control and prevention, whereas other regions, such as Heilongjiang, yielded poor performance. The evaluation result was highly correlated with the basic reproduction number (R0) value, and the result was evaluated in a timely manner at the beginning of the disease outbreak. Conclusions: The Epidemic Risk Time Series Model was designed to evaluate the effectiveness of COVID-19 control and prevention in different regions in China based on analysis of immigration population data. Compared to other methods, such as R0, this model enabled more prompt issue of early warnings. This model can be generalized and applied to other countries to evaluate their COVID-19 control and prevention. 
%M 32396132 %R 10.2196/18638 %U http://publichealth.jmir.org/2020/2/e18638/ %U https://doi.org/10.2196/18638 %U http://www.ncbi.nlm.nih.gov/pubmed/32396132 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 2 %P e15917 %T Comparing Methods for Record Linkage for Public Health Action: Matching Algorithm Validation Study %A Avoundjian,Tigran %A Dombrowski,Julia C %A Golden,Matthew R %A Hughes,James P %A Guthrie,Brandon L %A Baseman,Janet %A Sadinle,Mauricio %+ Department of Epidemiology, School of Public Health, University of Washington, 1959 NE Pacific Street, Seattle, WA, 98195, United States, 1 5431065, tavoun@uw.edu %K medical record linkage %K public health surveillance %K public health practice %K data management %D 2020 %7 30.4.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Many public health departments use record linkage between surveillance data and external data sources to inform public health interventions. However, little guidance is available to inform these activities, and many health departments rely on deterministic algorithms that may miss many true matches. In the context of public health action, these missed matches lead to missed opportunities to deliver interventions and may exacerbate existing health inequities. Objective: This study aimed to compare the performance of record linkage algorithms commonly used in public health practice. Methods: We compared five deterministic (exact, Stenger, Ocampo 1, Ocampo 2, and Bosh) and two probabilistic record linkage algorithms (fastLink and beta record linkage [BRL]) using simulations and a real-world scenario. We simulated pairs of datasets with varying numbers of errors per record and the number of matching records between the two datasets (ie, overlap). 
We matched the datasets using each algorithm and calculated their recall (ie, sensitivity, the proportion of true matches identified by the algorithm) and precision (ie, positive predictive value, the proportion of matches identified by the algorithm that were true matches). We estimated the average computation time by performing a match with each algorithm 20 times while varying the size of the datasets being matched. In a real-world scenario, HIV and sexually transmitted disease surveillance data from King County, Washington, were matched to identify people living with HIV who had a syphilis diagnosis in 2017. We calculated the recall and precision of each algorithm compared with a composite standard based on the agreement in matching decisions across all the algorithms and manual review. Results: In simulations, BRL and fastLink maintained a high recall at nearly all data quality levels, while being comparable with deterministic algorithms in terms of precision. Deterministic algorithms typically failed to identify matches in scenarios with low data quality. All the deterministic algorithms had a shorter average computation time than the probabilistic algorithms. BRL had the slowest overall computation time (14 min when both datasets contained 2000 records). In the real-world scenario, BRL had the lowest trade-off between recall (309/309, 100.0%) and precision (309/312, 99.0%). Conclusions: Probabilistic record linkage algorithms maximize the number of true matches identified, reducing gaps in the coverage of interventions and maximizing the reach of public health action. 
%M 32352389 %R 10.2196/15917 %U http://publichealth.jmir.org/2020/2/e15917/ %U https://doi.org/10.2196/15917 %U http://www.ncbi.nlm.nih.gov/pubmed/32352389 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 2 %P e18606 %T Emergence of a Novel Coronavirus (COVID-19): Protocol for Extending Surveillance Used by the Royal College of General Practitioners Research and Surveillance Centre and Public Health England %A de Lusignan,Simon %A Lopez Bernal,Jamie %A Zambon,Maria %A Akinyemi,Oluwafunmi %A Amirthalingam,Gayatri %A Andrews,Nick %A Borrow,Ray %A Byford,Rachel %A Charlett,André %A Dabrera,Gavin %A Ellis,Joanna %A Elliot,Alex J %A Feher,Michael %A Ferreira,Filipa %A Krajenbrink,Else %A Leach,Jonathan %A Linley,Ezra %A Liyanage,Harshana %A Okusi,Cecilia %A Ramsay,Mary %A Smith,Gillian %A Sherlock,Julian %A Thomas,Nicholas %A Tripathy,Manasa %A Williams,John %A Howsam,Gary %A Joy,Mark %A Hobbs,Richard %+ Nuffield Department of Primary Care Health Sciences, University of Oxford, Eagle House, Walton Well Road, Oxford, OX2 6ED, United Kingdom, 44 01865289344, simon.delusignan@phc.ox.ac.uk %K general practice %K medical record systems %K computerized %K sentinel surveillance %K coronavirus %K COVID-19 %K SARS-CoV-2 %K surveillance %K infections %K pandemic %K records as topic %K serology %D 2020 %7 2.4.2020 %9 Protocol %J JMIR Public Health Surveill %G English %X Background: The Royal College of General Practitioners (RCGP) Research and Surveillance Centre (RSC) and Public Health England (PHE) have successfully worked together on the surveillance of influenza and other infectious diseases for over 50 years, including three previous pandemics. With the emergence of the international outbreak of the coronavirus infection (COVID-19), a UK national approach to containment has been established to test people suspected of exposure to COVID-19. 
At the same time and separately, the RCGP RSC’s surveillance has been extended to monitor the temporal and geographical distribution of COVID-19 infection in the community as well as assess the effectiveness of the containment strategy. Objectives: The aims of this study are to surveil COVID-19 in both asymptomatic populations and ambulatory cases with respiratory infections, ascertain both the rate and pattern of COVID-19 spread, and assess the effectiveness of the containment policy. Methods: The RCGP RSC, a network of over 500 general practices in England, extracts pseudonymized data weekly. This extended surveillance comprises five components: (1) Recording in medical records of anyone suspected to have or who has been exposed to COVID-19. Computerized medical records suppliers have, within a week of request, created new codes to support this. (2) Extension of current virological surveillance and testing people with influenza-like illness or lower respiratory tract infections (LRTI)—with the caveat that people suspected to have or who have been exposed to COVID-19 should be referred to the national containment pathway and not seen in primary care. (3) Serology sample collection across all age groups. This will be an extra blood sample taken from people who are attending their general practice for a scheduled blood test. The 100 general practices currently undertaking annual influenza virology surveillance will be involved in the extended virological and serological surveillance. (4) Collecting convalescent serum samples. (5) Data curation. We have the opportunity to escalate the data extraction to twice weekly if needed. Swabs and sera will be analyzed in PHE reference laboratories. Results: General practice clinical system providers have introduced an emergency new set of clinical codes to support COVID-19 surveillance. 
Additionally, practices participating in current virology surveillance are now taking samples for COVID-19 surveillance from low-risk patients presenting with LRTIs. Within the first 2 weeks of setup of this surveillance, we have identified 3 cases: 1 through the new coding system and the other 2 through the extended virology sampling. Conclusions: We have rapidly converted the established national RCGP RSC influenza surveillance system into one that can test the effectiveness of the COVID-19 containment policy. The extended surveillance has already seen the use of new codes with 3 cases reported. Rapid sharing of this protocol should enable scientific critique and shared learning. International Registered Report Identifier (IRRID): DERR1-10.2196/18606 %M 32240095 %R 10.2196/18606 %U https://publichealth.jmir.org/2020/2/e18606 %U https://doi.org/10.2196/18606 %U http://www.ncbi.nlm.nih.gov/pubmed/32240095 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 5 %N 1 %P e10906 %T Population Size Estimation of Venue-Based Female Sex Workers in Ho Chi Minh City, Vietnam: Capture-Recapture Exercise %A Le,Giang %A Khuu,Nghia %A Tieu,Van Thi Thu %A Nguyen,Phuc Duy %A Luong,Hoa Thi Yen %A Pham,Quang Duy %A Tran,Hau Phuc %A Nguyen,Thuong Vu %A Morgan,Meade %A Abdul-Quader,Abu S %+ Program Development Office, United States Agency for International Development, Vietnam, 2 Ngo Quyen Street, Hoan Kiem District, Hanoi, Viet Nam, 84 0962032206, letonggiang@gmail.com %K population size estimation %K venue-based %K female sex workers %K Ho Chi Minh City %K capture-recapture %D 2019 %7 29.01.2019 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Few population size estimates exist for female sex workers (FSWs) in Ho Chi Minh City (HCMC), the largest city in Vietnam. Only 1 population size estimation among venue-based female sex workers (VFSWs) was conducted in HCMC, in 2012. 
Appropriate estimates of the sizes of key populations are critical for resource allocation to prevent HIV infection. Objective: The aim of this study was to estimate the population size of the VFSWs from December 2016 to January 2017 in HCMC, Vietnam. Methods: A multistage capture-recapture study was conducted in HCMC. The capture procedures included selection of districts using stratified probability proportional to size, mapping to identify venues, approaching all VFSWs to screen their eligibility, and then distribution of a unique object (a small pink makeup bag) to all eligible VFSWs in all identified venues. The recapture exercise included equal probability random selection of a sample of venues from the initial mapping and then approaching FSWs in those venues to determine the number and proportion of women who received the unique object. The number of objects distributed was then divided by this proportion and its associated confidence bounds, calculated using sampling weights and accounting for study design, to calculate the number of VFSWs in the selected districts. This was then multiplied by the inverse of the proportion of districts selected to calculate the number of VFSWs in HCMC as a whole. Results: Out of 24 districts, 6 were selected for the study. Mapping identified 573 venues across which 2317 unique objects were distributed in the first capture. During the recapture round, 103 venues were selected and 645 VFSWs were approached and interviewed. Of those, 570 VFSWs reported receiving the unique object during the capture round. The estimated total number of VFSWs in the 6 selected districts was 2616 (95% CI 2445-3014); accounting for the fact that only 25% (6/24) of all districts were selected gives an overall estimate of 10,465 (95% CI 9782-12,055) VFSWs in HCMC. Conclusions: The capture-recapture exercise provided an estimated number of VFSWs in HCMC. 
However, for planning HIV prevention and care service needs among all FSWs, studies are needed to assess the number of sex workers who are not venue-based, including those who use social media platforms to sell services. %M 30694204 %R 10.2196/10906 %U http://publichealth.jmir.org/2019/1/e10906/ %U https://doi.org/10.2196/10906 %U http://www.ncbi.nlm.nih.gov/pubmed/30694204 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 5 %N 1 %P e11357 %T Period of Measurement in Time-Series Predictions of Disease Counts from 2007 to 2017 in Northern Nevada: Analytics Experiment %A Talaei-Khoei,Amir %A Wilson,James M %A Kazemi,Seyed-Farzan %+ Department of Information Systems, University of Nevada Reno, Ansari Business Building, 1664 N Virginia Street, Room 314F, Reno, NV, 89557, United States, 1 7754407005, atalaeikhoei@unr.edu %K autocorrelation %K disease counts %K prediction %K public health surveillance %K time-series analysis %D 2019 %7 15.01.2019 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The literature in statistics presents methods by which autocorrelation can identify the best period of measurement to improve the performance of a time-series prediction. The period of measurement plays an important role in improving the performance of disease-count predictions. However, from the operational perspective in public health surveillance, there is a limitation to the length of the measurement period that can offer meaningful and valuable predictions. Objective: This study aimed to establish a method that identifies the shortest period of measurement without significantly decreasing the prediction performance for time-series analysis of disease counts. Methods: The data used in this evaluation include disease counts from 2007 to 2017 in northern Nevada. The disease counts for chlamydia, salmonella, respiratory syncytial virus, gonorrhea, viral meningitis, and influenza A were predicted. 
Results: Our results showed that autocorrelation could not guarantee the best performance for prediction of disease counts. However, the proposed method with the change-point analysis suggests a period of measurement that is operationally acceptable and performance that is not significantly different from the best prediction. Conclusions: The use of change-point analysis with autocorrelation provides the best and most practical period of measurement. %M 30664479 %R 10.2196/11357 %U http://publichealth.jmir.org/2019/1/e11357/ %U https://doi.org/10.2196/11357 %U http://www.ncbi.nlm.nih.gov/pubmed/30664479 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 20 %N 6 %P e215 %T Detecting Suicidal Ideation on Forums: Proof-of-Concept Study %A Aladağ,Ahmet Emre %A Muderrisoglu,Serra %A Akbas,Naz Berfu %A Zahmacioglu,Oguzhan %A Bingol,Haluk O %+ Department of Computer Engineering, Bogazici University, Bebek, Istanbul, 34342, Turkey, 90 2123594523, emre.aladag@boun.edu.tr %K suicide %K suicidal ideation %K suicidality %K detection %K prevention %K classification model %K text mining %K machine learning %K artificial intelligence %K suicidal surveillance %D 2018 %7 21.06.2018 %9 Original Paper %J J Med Internet Res %G English %X Background: In 2016, 44,965 people in the United States died by suicide. It is common to see people with suicidal ideation seek help or leave suicide notes on social media before attempting suicide. Many prefer to express their feelings with longer passages on forums such as Reddit and blogs. Because these expressive posts follow regular language patterns, potential suicide attempts can be prevented by detecting suicidal posts as they are written. Objective: This study aims to build a classifier that differentiates suicidal and nonsuicidal forum posts via text mining methods applied on post titles and bodies. 
Methods: A total of 508,398 Reddit posts longer than 100 characters and posted between 2008 and 2016 on SuicideWatch, Depression, Anxiety, and ShowerThoughts subreddits were downloaded from the publicly available Reddit dataset. Of these, 10,785 posts were randomly selected and 785 were manually annotated as suicidal or nonsuicidal. Features were extracted using term frequency-inverse document frequency, linguistic inquiry and word count, and sentiment analysis on post titles and bodies. Logistic regression, random forest, and support vector machine (SVM) classification algorithms were applied to the resulting corpus, and prediction performance was evaluated. Results: The logistic regression and SVM classifiers correctly identified the suicidality of posts with 80% to 92% accuracy and F1 score, respectively, depending on the data composition, closely followed by random forest, compared with the baseline ZeroR algorithm, which achieved 50% accuracy and 66% F1 score. Conclusions: This study demonstrated that it is possible to detect people with suicidal ideation on online forums with high accuracy. The logistic regression classifier in this study can potentially be embedded in blogs and forums to decide whether to offer real-time online counseling when a suicidal post is being written. 
%M 29929945 %R 10.2196/jmir.9840 %U http://www.jmir.org/2018/6/e215/ %U https://doi.org/10.2196/jmir.9840 %U http://www.ncbi.nlm.nih.gov/pubmed/29929945 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 3 %N 3 %P e59 %T Sample Size Calculations for Population Size Estimation Studies Using Multiplier Methods With Respondent-Driven Sampling Surveys %A Fearon,Elizabeth %A Chabata,Sungai T %A Thompson,Jennifer A %A Cowan,Frances M %A Hargreaves,James R %+ Department of Social and Environmental Health Research, London School of Hygiene and Tropical Medicine, 15-17 Tavistock Place, London, WC1H 9SH, United Kingdom, 44 20 7927 2877, Elizabeth.Fearon@lshtm.ac.uk %K population surveillance %K sample size %K sampling studies %K surveys and questionnaires %K research design %K data collection %K sex workers %K HIV %D 2017 %7 14.09.2017 %9 Short Paper %J JMIR Public Health Surveill %G English %X Background: While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions. Objective: To guide the design of multiplier method population size estimation studies using respondent-driven sampling surveys to reduce the random error around the estimate obtained. Methods: The population size estimate is obtained by dividing the number of individuals receiving a service or the number of unique objects distributed (M) by the proportion of individuals in a representative survey who report receipt of the service or object (P). We have developed an approach to sample size calculation, interpreting methods to estimate the variance around estimates obtained using multiplier methods in conjunction with research into design effects and respondent-driven sampling. We describe an application to estimate the number of female sex workers in Harare, Zimbabwe. Results: There is high variance in estimates. 
Random error around the size estimate reflects uncertainty from M and P, particularly when the estimate of P in the respondent-driven sampling survey is low. As expected, sample size requirements are higher when the design effect of the survey is assumed to be greater. Conclusions: We suggest a method for investigating the effects of sample size on the precision of a population size estimate obtained using multiplier methods and respondent-driven sampling. Uncertainty in the size estimate is high, particularly when P is small, so balancing against other potential sources of bias, we advise researchers to consider longer service attendance reference periods and to distribute more unique objects, which is likely to result in a higher estimate of P in the respondent-driven sampling survey. %M 28912117 %R 10.2196/publichealth.7909 %U http://publichealth.jmir.org/2017/3/e59/ %U https://doi.org/10.2196/publichealth.7909 %U http://www.ncbi.nlm.nih.gov/pubmed/28912117 %0 Journal Article %@ 1947-2579 %I JMIR Publications %V 3 %N 2 %P e3602 %T Roles of Health Literacy in Relation to Social Determinants of Health and Recommendations for Informatics-Based Interventions: Systematic Review %D 2011 %7 ..2011 %9 %J Online J Public Health Inform %G English %X Objectives: To elucidate current issues related to health statistics dissemination efforts on the Internet in Indonesia and to propose a new dissemination website as a solution. Methods: A cross-sectional survey was conducted. Sources of statistics were identified using link relationship and Google™ search. The menus used to locate statistics, the mode of presentation and means of access to statistics, and the available statistics were assessed for each site. Assessment results were used to derive a design specification; a prototype system was developed and evaluated with a usability test. Results: 49 sources were identified on 18 governmental, 8 international and 5 non-government websites. 
Of 49 menus identified, 33% used non-intuitive titles and led to inefficient searches. 69% of them were on government websites. Of 31 websites, only 39% and 23% used graph/chart and map for presentation, respectively. Further, only 32%, 39% and 19% provided query, export and print features, respectively. While >50% of sources reported morbidity, risk factor and service provision statistics, <40% of sources reported health resource and mortality statistics. A statistics portal website was developed using the Joomla!™ content management system. A usability test demonstrated its potential to improve data accessibility. Discussion and conclusion: In this study, government’s efforts to disseminate statistics in Indonesia are supported by non-governmental and international organizations, and their existing information may not be very useful because it is: a) not widely distributed, b) difficult to locate, and c) not effectively communicated. Actions are needed to ensure information usability, and one such action is the development of a statistics portal website. %M 23569612 %R 10.5210/ojphi.v3i2.3602 %U %U https://doi.org/10.5210/ojphi.v3i2.3602 %U http://www.ncbi.nlm.nih.gov/pubmed/23569612