TY - JOUR AU - Cvijanovic, Dane AU - Grubor, Nikola AU - Rajovic, Nina AU - Vucevic, Mira AU - Miltenovic, Svetlana AU - Laban, Marija AU - Mostic, Tatjana AU - Tasic, Radica AU - Matejic, Bojana AU - Milic, Natasa PY - 2025/4/17 TI - Assessing COVID-19 Mortality in Serbia’s Capital: Model-Based Analysis of Excess Deaths JO - JMIR Public Health Surveill SP - e56877 VL - 11 KW - COVID-19 KW - COVID-19 impact KW - SARS-Cov-2 KW - coronavirus KW - respiratory KW - infectious disease KW - pulmonary KW - pandemic KW - excess mortality KW - death rate KW - death toll KW - centralized health care KW - urban KW - Serbia KW - dense population KW - public health KW - surveillance N2 - Background: Concerns have been raised about discrepancies in COVID-19 mortality data, particularly between preliminary and final datasets of vital statistics in Serbia. In the original preliminary dataset, released daily during the ongoing pandemic, there was an underestimation of deaths in contrast to those reported in the subsequently released yearly dataset of vital statistics. Objective: This study aimed to assess the accuracy of the final mortality dataset and justify its use in further analyses. In addition, we quantified the relative impact of COVID-19 on the death rate in the Serbian capital’s population. In the process, we aimed to explore whether any evidence of cause-of-death misattribution existed in the final published datasets. Methods: Data were sourced from the electronic databases of the Statistical Office of the Republic of Serbia. The dataset included yearly recorded deaths and the causes of death of all citizens currently living in the territory of Belgrade, the capital of the Republic of Serbia, from 2015 to 2021. Standardization and modeling techniques were utilized to quantify the direct impact of COVID-19 and to estimate excess deaths.
To account for year-to-year trends, we used a mixed-effects hierarchical Poisson generalized linear regression model to predict mortality for 2020 and 2021. The model was fitted to the mortality data observed from 2015 to 2019 and used to generate mortality predictions for 2020 and 2021. Actual death rates were then compared to the obtained predictions and used to generate excess mortality estimates. Results: The total number of excess deaths, calculated from model estimates, was 3175 deaths (99% CI 1715-4094) for 2020 and 8321 deaths (99% CI 6975-9197) for 2021. The ratio of estimated excess deaths to reported COVID-19 deaths was 1.07. The estimated increase in mortality during 2020 and 2021 was 12.93% (99% CI 15.74%-17.33%) and 39.32% (99% CI 35.91%-39.32%) from the expected values, respectively. Those aged 0–19 years experienced an average decrease in mortality of 22.43% and 23.71% during 2020 and 2021, respectively. For those aged up to 39 years, there was a slight increase in mortality (4.72%) during 2020. However, in 2021, even those aged 20–39 years had an estimated increase in mortality of 32.95%. For people aged 60–79 years, there was an estimated increase in mortality of 16.95% and 38.50% in 2020 and 2021, respectively. For those aged >80 years, the increase was estimated at 11.50% and 34.14% in 2020 and 2021, respectively. The model-predicted deaths matched the non-COVID-19 deaths recorded in the territory of Belgrade. This concordance between the predicted and recorded non-COVID-19 deaths provides evidence that the cause-of-death misattribution did not occur in the territory of Belgrade. Conclusions: The finalized mortality dataset for Belgrade can be safely used in COVID-19 impact analysis. Belgrade experienced a significant increase in mortality during 2020 and 2021, with most of the excess mortality attributable to SARS-CoV-2.
Concerns about increased mortality from causes other than COVID-19 in Belgrade seem misplaced as their impact appears negligible. UR - https://publichealth.jmir.org/2025/1/e56877 UR - http://dx.doi.org/10.2196/56877 ID - info:doi/10.2196/56877 ER - TY - JOUR AU - Oliveira, Fonseca Juliane AU - Vasconcelos, O. Adriano AU - Alencar, L. Andrêza AU - Cunha, L. Maria Célia S. AU - Marcilio, Izabel AU - Barral-Netto, Manoel AU - P Ramos, Ivan Pablo PY - 2025/4/1 TI - Balancing Human Mobility and Health Care Coverage in Sentinel Surveillance of Brazilian Indigenous Areas: Mathematical Optimization Approach JO - JMIR Public Health Surveill SP - e69048 VL - 11 KW - representative sentinel surveillance KW - early pathogen detection KW - indigenous health KW - human mobility KW - surveillance network optimization KW - infectious disease surveillance KW - public health strategy KW - Brazil N2 - Background: Optimizing sentinel surveillance site allocation for early pathogen detection remains a challenge, particularly in ensuring coverage of vulnerable and underserved populations. Objective: This study evaluates the current respiratory pathogen surveillance network in Brazil and proposes an optimized sentinel site distribution that balances Indigenous population coverage and national human mobility patterns. Methods: We compiled Indigenous Special Health District (Portuguese: Distrito Sanitário Especial Indígena [DSEI]) locations from the Brazilian Ministry of Health and estimated national mobility routes by using the Ford-Fulkerson algorithm, incorporating air, road, and water transportation data. To optimize sentinel site selection, we implemented a linear optimization algorithm that maximizes (1) Indigenous region representation and (2) human mobility coverage. 
We validated our approach by comparing results with Brazil’s current influenza sentinel network and analyzing the health attraction index from the Brazilian Institute of Geography and Statistics to assess the feasibility and potential benefits of our optimized surveillance network. Results: The current Brazilian network includes 199 municipalities, representing 3.6% (199/5570) of the country’s cities. The optimized sentinel site design, while keeping the same number of municipalities, ensures 100% coverage of all 34 DSEI regions while rearranging 108 (54.3%) of the 199 cities from the existing flu sentinel system. This would result in a more representative sentinel network, addressing gaps in 9 of 34 previously uncovered DSEI regions, which span 750,515 km² and have a population of 1.11 million. Mobility coverage would improve by 16.8 percentage points, from 52.4% (4,598,416 paths out of 8,780,046 total paths) to 69.2% (6,078,747 paths out of 8,780,046 total paths). Additionally, all newly selected cities serve as hubs for medium- or high-complexity health care, ensuring feasibility for pathogen surveillance. Conclusions: The proposed framework optimizes sentinel site allocation to enhance disease surveillance and early detection. By maximizing DSEI coverage and integrating human mobility patterns, this approach provides a more effective and equitable surveillance network, which would particularly benefit underserved Indigenous regions. UR - https://publichealth.jmir.org/2025/1/e69048 UR - http://dx.doi.org/10.2196/69048 ID - info:doi/10.2196/69048 ER - TY - JOUR AU - Holst, Christine AU - Woloshin, Steven AU - Oxman, D. Andrew AU - Rose, Christopher AU - Rosenbaum, Sarah AU - Munthe-Kaas, Menzies Heather PY - 2025/3/18 TI - Alternative Presentations of Overall and Statistical Uncertainty for Adults’
Understanding of the Results of a Randomized Trial of a Public Health Intervention: Parallel Web-Based Randomized Trials JO - JMIR Public Health Surveill SP - e62828 VL - 11 KW - communication KW - Grading of Recommendations Assessment, Development, and Evaluation language KW - GRADE language KW - statistical uncertainty KW - overall uncertainty KW - randomized trial N2 - Background: Well-designed public health messages can help people make informed choices, while poorly designed messages or persuasive messages can confuse, lead to poorly informed decisions, and diminish trust in health authorities and research. Communicating uncertainties to the public about the results of health research is challenging, necessitating research on effective ways to disseminate this important aspect of randomized trials. Objective: This study aimed to evaluate people’s understanding of overall and statistical uncertainty when presented with alternative ways of expressing randomized trial results. Methods: Two parallel, web-based, individually randomized trials (3×2 factorial designs) were conducted in the United States and Norway. Participants were randomized to 1 of 6 versions of a text (summary) communicating results from a study examining the effects of wearing glasses to prevent COVID-19 infection. The summaries varied in how overall uncertainty (“Grading of Recommendations Assessment, Development and Evaluation [GRADE] language,” “plain language,” or “no explicit language”) and statistical uncertainty (whether a margin of error was shown or not) were presented. Participants completed a web-based questionnaire exploring 4 coprimary outcomes: 3 to measure understanding of overall uncertainty (benefits, harms, and sufficiency of evidence), and one to measure statistical uncertainty. Participants were adults who do not wear glasses recruited from web-based research panels in the United States and Norway.
Results of the trials were analyzed separately and combined in a meta-analysis. Results: In the US and Norwegian trials, 730 and 497 individuals were randomized, respectively; data for 543 (74.4%) and 452 (90.9%) were analyzed. More participants had a correct understanding of uncertainty when presented with plain language (United States: 37/99, 37% and Norway: 40/76, 53%) than no explicit language (United States: 18/86, 21% and Norway: 34/80, 42%). A similar positive effect was seen for the GRADE language in the United States (26/79, 33%) but not in Norway (30/71, 42%). There were only small differences between groups for understanding the uncertainty of harms. Plain language improved correct understanding of evidence sufficiency (odds ratio 2.05, 95% CI 1.17-3.57), compared to no explicit language. The effect of GRADE language was inconclusive (odds ratio 1.34, 95% CI 0.79-2.28). The understanding of statistical uncertainty was improved when the participants were shown the margin of error compared to not being shown (Norway: 16/75, 21% to 24/71, 34% vs 1/71, 1% to 2/76, 3% and the United States: 21/101, 21% to 32/90, 36% vs 0/86, 0% to 3/79, 4%). Conclusions: Plain language, but not GRADE language, was better than no explicit language in helping people understand overall uncertainty of benefits and harms. Reporting margin of error improved understanding of statistical uncertainty around the effect of wearing glasses, but only for a minority of participants.
Trial Registration: ClinicalTrials.gov NCT05642754; https://tinyurl.com/4mhjsm7s UR - https://publichealth.jmir.org/2025/1/e62828 UR - http://dx.doi.org/10.2196/62828 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/62828 ER - TY - JOUR AU - Rosenfeld, Daniel AU - Brennan, Sean AU - Wallach, Andrew AU - Long, Theodore AU - Keeley, Chris AU - Kurien, Joseph Sarah PY - 2025/3/13 TI - COVID-19 Testing Equity in New York City During the First 2 Years of the Pandemic: Demographic Analysis of Free Testing Data JO - JMIR Public Health Surveill SP - e52972 VL - 11 KW - COVID-19 testing KW - health disparities KW - equity in testing KW - New York City KW - socioeconomic factors KW - testing accessibility KW - health care inequalities KW - demographic analysis KW - COVID-19 mortality KW - coronavirus KW - SARS-CoV-2 KW - pandemic KW - equitable testing KW - cost KW - poor neighborhood KW - resources N2 - Background: COVID-19 has caused over 46,000 deaths in New York City, with a disproportional impact on certain communities. As part of the COVID-19 response, the city has directly administered over 6 million COVID-19 tests (in addition to millions of indirectly administered tests not covered in this analysis) at no cost to individuals, resulting in nearly half a million positive results. Given that the prevalence of testing, throughout the pandemic, has tended to be higher in more affluent areas, these tests were targeted to areas with fewer resources. Objective: This study aimed to evaluate the impact of New York City’s COVID-19 testing program; specifically, we aimed to review its ability to provide equitable testing in economically, geographically, and demographically diverse populations. Of note, in addition to the brick-and-mortar testing sites evaluated herein, this program conducted 2.1 million tests through mobile units to further address testing inequity.
Methods: Testing data were collected from the in-house Microsoft SQL Server Management Studio 18 Clarity database, representing 6,347,533 total tests and 449,721 positive test results. These tests were conducted at 48 hospital system locations. Per capita testing rates by zip code tabulation area (ZCTA) and COVID-19 positivity rates by ZCTA were used as dependent variables in separate regressions. Median income, median age, the percentage of English-speaking individuals, and the percentage of people of color were used as independent demographic variables to analyze testing patterns across several intersecting identities. Negative binomial regressions were run in a Jupyter Notebook using Python. Results: Per capita testing inversely correlated with median income geographically. The overall pseudo r² value was 0.1101 when comparing hospital system tests by ZCTA against the selected variables. The number of tests significantly increased as median income fell (SE 1.00000155; P<.001). No other variables correlated at a significant level with the number of tests (all P values were >.05). When considering positive test results by ZCTA, the number of positive test results also significantly increased as median income fell (SE 1.57e-6; P<.001) and as the percentage of female residents fell (SE 0.957; P=.001). The number of positive test results by ZCTA rose at a significant level alongside the percentage of English-only speakers (SE 0.271; P=.03). Conclusions: New York City’s COVID-19 testing program was able to improve equity through the provision of no-cost testing, which focused on areas of the city that were disproportionately impacted by COVID-19 and had fewer resources. By detecting higher numbers of positive test results in resource-poor neighborhoods, New York City was able to deploy additional resources, such as those for contact tracing and isolation and quarantine support (eg, free food delivery and free hotel stays), early during the COVID-19 pandemic.
Equitable deployment of testing is feasible and should be considered early in future epidemics or pandemics. UR - https://publichealth.jmir.org/2025/1/e52972 UR - http://dx.doi.org/10.2196/52972 ID - info:doi/10.2196/52972 ER - TY - JOUR AU - Pullano, Giulia AU - Alvarez-Zuzek, Gisele Lucila AU - Colizza, Vittoria AU - Bansal, Shweta PY - 2025/2/18 TI - Characterizing US Spatial Connectivity and Implications for Geographical Disease Dynamics and Metapopulation Modeling: Longitudinal Observational Study JO - JMIR Public Health Surveill SP - e64914 VL - 11 KW - geographical disease dynamics KW - spatial connectivity KW - mobility data KW - metapopulation modeling KW - COVID-19 KW - human mobility KW - infectious diseases KW - social distancing KW - epidemic KW - mobile apps KW - SafeGraph KW - SARS-CoV-2 KW - coronavirus KW - pandemic KW - spatio-temporal KW - US KW - public health KW - mobile health KW - mHealth KW - digital health KW - health informatics N2 - Background: Human mobility is expected to be a critical factor in the geographic diffusion of infectious diseases, and this assumption led to the implementation of social distancing policies during the early fight against the COVID-19 emergency in the United States. Yet, because of substantial data gaps in the past, what still eludes our understanding are the following questions: (1) How does mobility contribute to the spread of infection within the United States at local, regional, and national scales? (2) How do seasonality and shifts in behavior affect mobility over time? (3) At what geographic level is mobility homogeneous across the United States? Objective: This study aimed to address the questions that are critical for developing accurate transmission models, predicting the spatial propagation of disease across scales, and understanding the optimal geographical and temporal scale for the implementation of control policies. 
Methods: We analyzed high-resolution mobility data from mobile app usage from SafeGraph Inc, mapping daily connectivity between the US counties to grasp spatial clustering and temporal stability. Integrating this into a spatially explicit transmission model, we replicated SARS-CoV-2’s first wave invasion, assessing mobility’s spatiotemporal impact on disease predictions. Results: Analysis from 2019 to 2021 showed that mobility patterns remained stable, except for a decline in April 2020 due to lockdowns, which reduced daily movements from 45 million to approximately 25 million nationwide. Despite this reduction, intercounty connectivity remained seasonally stable, largely unaffected during the early COVID-19 phase, with a median Spearman coefficient of 0.62 (SD 0.01) between daily connectivity and gravity networks. We identified 104 geographic clusters of US counties with strong internal mobility connectivity and weaker links to counties outside these clusters. These clusters were stable over time, largely overlapping state boundaries (normalized mutual information=0.82) and demonstrating high temporal stability (normalized mutual information=0.95). Our findings suggest that intercounty connectivity is relatively static and homogeneous at the substate level. Furthermore, while county-level, daily mobility data best captures disease invasion, static mobility data aggregated to the cluster level also effectively models spatial diffusion. Conclusions: Our work demonstrates that intercounty mobility was negligibly affected outside the lockdown period in April 2020, explaining the broad spatial distribution of COVID-19 outbreaks in the United States during the early phase of the pandemic. Such geographically dispersed outbreaks place a significant strain on national public health resources and necessitate complex metapopulation modeling approaches for predicting disease dynamics and control design.
We thus inform the design of such metapopulation models to balance high disease predictability with low data requirements. UR - https://publichealth.jmir.org/2025/1/e64914 UR - http://dx.doi.org/10.2196/64914 ID - info:doi/10.2196/64914 ER - TY - JOUR AU - Liu, Han AU - Zong, Huiying AU - Yang, Yang AU - Schwebel, C. David AU - Xie, Bin AU - Ning, Peishan AU - Rao, Zhenzhen AU - Li, Li AU - Hu, Guoqing PY - 2025/2/6 TI - Consistency of Daily Number of Reported COVID-19 Cases in 191 Countries From 2020 to 2022: Comparative Analysis of 2 Major Data Sources JO - JMIR Public Health Surveill SP - e65439 VL - 11 KW - COVID-19 KW - pandemic KW - data consistency KW - World Health Organization KW - data quality N2 - Background: The COVID-19 pandemic represents one of the most challenging public health emergencies in recent world history, causing about 7.07 million deaths globally by September 24, 2024. Accurate, timely, and consistent data are critical for early response to situations like the COVID-19 pandemic. Objective: This study aimed to evaluate consistency of daily reported COVID-19 cases in 191 countries from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) and the World Health Organization (WHO) dashboards during 2020–2022. Methods: We retrieved data concerning new daily COVID-19 cases in 191 countries covered by both data sources from January 22, 2020, to December 31, 2022. The ratios of numbers of daily reported cases from the 2 sources were calculated to measure data consistency. We performed simple linear regression to examine significant changes in the ratio of numbers of daily reported cases during the study period. Results: Of 191 WHO member countries, only 60 displayed excellent data consistency in the number of daily reported COVID-19 cases between the WHO and JHU CSSE dashboards (mean ratio 0.9-1.1).
Data consistency changed greatly across the 191 countries from 2020 to 2022 and differed across 4 types of countries, categorized by income. Data inconsistency between the 2 data sources generally decreased slightly over time, both for the 191 countries combined and within the 4 types of income-defined countries. The absolute relative difference between the 2 data sources increased in 84 countries, particularly for Malta (R²=0.25), Montenegro (R²=0.30), and the United States (R²=0.29), but it decreased significantly in 40 countries. Conclusions: The inconsistency between the 2 data sources warrants further research. Construction of public health surveillance and data collection systems for public health emergencies like the COVID-19 pandemic should be strengthened in the future. UR - https://publichealth.jmir.org/2025/1/e65439 UR - http://dx.doi.org/10.2196/65439 ID - info:doi/10.2196/65439 ER - TY - JOUR AU - Januraga, Putu Pande AU - Lukitosari, Endang AU - Luhukay, Lanny AU - Hasby, Rizky AU - Sutrisna, Aang PY - 2025/1/30 TI - Mapping Key Populations to Develop Improved HIV and AIDS Interventions: Multiphase Cross-Sectional Observational Mapping Study Using a District and City Approach JO - JMIR Public Health Surveill SP - e56820 VL - 11 KW - Indonesia KW - key population KW - mapping KW - pandemic KW - HIV KW - AIDS KW - hotspot N2 - Background: Indonesia’s vast archipelago and substantial population size present unique challenges in addressing its multifaceted HIV epidemic, with 90% of its 514 districts and cities reporting cases. Identifying key populations (KPs) is essential for effectively targeting interventions and allocating resources to address the changing dynamics of the epidemic. Objective: We examine the 2022 mapping of Indonesia’s KPs to develop improved HIV and AIDS interventions. Methods: In 2022, a district-based mapping of KPs was conducted across 201 districts and cities chosen for their HIV program intensity.
This multiphase process included participatory workshops for hotspot identification, followed by direct hotspot observation, then followed by a second direct observation in selected hotspots for quality control. Data from 49,346 informants (KPs) were collected and analyzed. The results from individual hotspots were aggregated at the district or city level, and a formula was used to estimate the population size. Results: The mapping initiative identified 18,339 hotspots across 201 districts and cities, revealing substantial disparities in hotspot distribution. Of the 18,339 hotspots, 16,964 (92.5%) were observed, of which 1822 (10.74%) underwent a second review to enhance data accuracy. The findings mostly aligned with local stakeholders’ estimates, but showed a lower median. Interviews indicated a shift in KP dynamics, with a median decline in hotspot attendance since the pandemic, and there was notable variation in mapping results across district categories. In “comprehensive” areas, the average results for men who have sex with men (MSM), people who inject drugs, transgender women, and female sex workers (FSWs) were 1008 (median 694, IQR 317-1367), 224 (median 114, IQR 59-202), 196 (median 167, IQR 81-265), and 775 (median 573, IQR 352-1131), respectively. “Medium” areas had lower averages: MSM at 381 (median 199, IQR 91-454), people who inject drugs at 51 (median 54, IQR 15-63), transgender women at 101 (median 55, IQR 29-127), and FSWs at 304 (median 231, IQR 118-425). “Basic” areas showed the lowest averages: MSM at 161 (median 73, IQR 49-285), people who inject drugs at 7 (median 7, IQR 7-7), transgender women at 59 (median 26, IQR 12-60), and FSWs at 161 (median 131, IQR 59-188). Comparisons with ongoing outreach programs revealed substantial differences: the mapped MSM population was >50% lower than program coverage; the estimates for people who inject drugs were twice as high as the program coverage.
Conclusions: The mapping results highlight significant variations in hotspots and KPs across districts and cities and underscore the necessity of adaptive HIV prevention strategies. The findings informed programmatic decisions, such as reallocating resources to underserved districts and recalibrating outreach strategies to better match KP dynamics. Developing strategies beyond identified hotspots, integrating mapping data into planning, and adopting a longitudinal approach to understand KP behavior over time are critical for effective HIV and AIDS prevention and control. UR - https://publichealth.jmir.org/2025/1/e56820 UR - http://dx.doi.org/10.2196/56820 UR - http://www.ncbi.nlm.nih.gov/pubmed/39883483 ID - info:doi/10.2196/56820 ER - TY - JOUR AU - Willem, Theresa AU - Wollek, Alessandro AU - Cheslerean-Boghiu, Theodor AU - Kenney, Martha AU - Buyx, Alena PY - 2025/1/28 TI - The Social Construction of Categorical Data: Mixed Methods Approach to Assessing Data Features in Publicly Available Datasets JO - JMIR Med Inform SP - e59452 VL - 13 KW - machine learning KW - categorical data KW - social context dependency KW - mixed methods KW - dermatology KW - dataset analysis N2 - Background: In data-sparse areas such as health care, computer scientists aim to leverage as much available information as possible to increase the accuracy of their machine learning models’ outputs. As a standard, categorical data, such as patients’ gender, socioeconomic status, or skin color, are used to train models in fusion with other data types, such as medical images and text-based medical information. However, the effects of including categorical data features for model training in such data-scarce areas are underexamined, particularly regarding models intended to serve individuals equitably in a diverse population.
Objective: This study aimed to explore categorical data’s effects on machine learning model outputs, rooted the effects in the data collection and dataset publication processes, and proposed a mixed methods approach to examining datasets’ data categories before using them for machine learning training. Methods: Against the theoretical background of the social construction of categories, we suggest a mixed methods approach to assess categorical data’s utility for machine learning model training. As an example, we applied our approach to a Brazilian dermatological dataset (Dermatological and Surgical Assistance Program at the Federal University of Espírito Santo [PAD-UFES] 20). We first present an exploratory, quantitative study that assesses the effects when including or excluding each of the unique categorical data features of the PAD-UFES 20 dataset for training a transformer-based model using a data fusion algorithm. We then pair our quantitative analysis with a qualitative examination of the data categories based on interviews with the dataset authors. Results: Our quantitative study suggests scattered effects of including categorical data for machine learning model training across predictive classes. Our qualitative analysis gives insights into how the categorical data were collected and why they were published, explaining some of the quantitative effects that we observed. Our findings highlight the social constructedness of categorical data in publicly available datasets, meaning that the data in a category heavily depend on both how these categories are defined by the dataset creators and the sociomedico context in which the data are collected. This reveals relevant limitations of using publicly available datasets in contexts different from those of the collection of their data.
Conclusions: We caution against using data features of publicly available datasets without reflection on the social construction and context dependency of their categorical data features, particularly in data-sparse areas. We conclude that social scientific, context-dependent analysis of available data features using both quantitative and qualitative methods is helpful in judging the utility of categorical data for the population for which a model is intended. UR - https://medinform.jmir.org/2025/1/e59452 UR - http://dx.doi.org/10.2196/59452 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/59452 ER - TY - JOUR AU - Rowley, AK Elizabeth AU - Mitchell, K. Patrick AU - Yang, Duck-Hye AU - Lewis, Ned AU - Dixon, E. Brian AU - Vazquez-Benitez, Gabriela AU - Fadel, F. William AU - Essien, J. Inih AU - Naleway, L. Allison AU - Stenehjem, Edward AU - Ong, C. Toan AU - Gaglani, Manjusha AU - Natarajan, Karthik AU - Embi, Peter AU - Wiegand, E. Ryan AU - Link-Gelles, Ruth AU - Tenforde, W. Mark AU - Fireman, Bruce PY - 2025/1/27 TI - Methods to Adjust for Confounding in Test-Negative Design COVID-19 Effectiveness Studies: Simulation Study JO - JMIR Form Res SP - e58981 VL - 9 KW - disease risk score KW - propensity score KW - vaccine effectiveness KW - COVID-19 KW - simulation study KW - usefulness KW - comorbidity KW - assessment N2 - Background: Real-world COVID-19 vaccine effectiveness (VE) studies are investigating exposures of increasing complexity accounting for time since vaccination. These studies require methods that adjust for the confounding that arises when morbidities and demographics are associated with vaccination and the risk of outcome events. Methods based on propensity scores (PS) are well-suited to this when the exposure is dichotomous, but present challenges when the exposure is multinomial. Objective: This simulation study aimed to investigate alternative methods to adjust for confounding in VE studies that have a test-negative design. 
Methods: Adjustment for a disease risk score (DRS) is compared with multivariable logistic regression. Both stratification on the DRS and direct covariate adjustment of the DRS are examined. Multivariable logistic regression with all the covariates and with a limited subset of key covariates is considered. The performance of VE estimators is evaluated across a multinomial vaccination exposure in simulated datasets. Results: Bias in VE estimates from multivariable models ranged from -5.3% to 6.1% across 4 levels of vaccination. Standard errors of VE estimates were unbiased, and 95% coverage probabilities were attained in most scenarios. The lowest coverage in the multivariable scenarios was 93.7% (95% CI 92.2%-95.2%) and occurred in the multivariable model with key covariates, while the highest coverage in the multivariable scenarios was 95.3% (95% CI 94.0%-96.6%) and occurred in the multivariable model with all covariates. Bias in VE estimates from DRS-adjusted models was low, ranging from -2.2% to 4.2%. However, the DRS-adjusted models underestimated the standard errors of VE estimates, with coverage sometimes below the 95% level. The lowest coverage in the DRS scenarios was 87.8% (95% CI 85.8%-89.8%) and occurred in the direct adjustment for the DRS model. The highest coverage in the DRS scenarios was 94.8% (95% CI 93.4%-96.2%) and occurred in the model that stratified on DRS. Although variation in the performance of VE estimates occurred across modeling strategies, variation in performance was also present across exposure groups. Conclusions: Overall, models using a DRS to adjust for confounding performed adequately but not as well as the multivariable models that adjusted for covariates individually. UR - https://formative.jmir.org/2025/1/e58981 UR - http://dx.doi.org/10.2196/58981 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/58981 ER - TY - JOUR AU - Kabwama, Ndugwa Steven AU - Wanyenze, K.
Rhoda AU - Lindgren, Helena AU - Razaz, Neda AU - Ssenkusu, M. John AU - Alfvén, Tobias PY - 2025/1/15 TI - Interventions to Maintain HIV/AIDS, Tuberculosis, and Malaria Service Delivery During Public Health Emergencies in Low- and Middle-Income Countries: Protocol for a Systematic Review JO - JMIR Res Protoc SP - e64316 VL - 14 KW - service availability KW - emergencies KW - tuberculosis KW - malaria KW - systematic reviews KW - health services KW - HIV KW - AIDS KW - public health emergency KW - low- and middle-income countries KW - qualitative reviews KW - qualitative KW - policies KW - communities KW - health facilities KW - emergency KW - implement KW - implementation N2 - Background: Although existing disease preparedness and response frameworks provide guidance about strengthening emergency response capacity, little attention is paid to health service continuity during emergency responses. During the 2014 Ebola outbreak, there were 11,325 reported deaths due to the Ebola virus and yet disruption in access to care caused more than 10,000 additional deaths due to measles, HIV/AIDS, tuberculosis, and malaria. Low- and middle-income countries account for the largest disease burden due to HIV, tuberculosis, and malaria and yet previous responses to health emergencies showed that HIV, tuberculosis, and malaria service delivery can be significantly disrupted. To date, there has not been a systematic synthesis of interventions implemented to maintain the delivery of these services during emergencies. Objective: This study aimed to synthesize the interventions implemented to maintain HIV/AIDS, tuberculosis, and malaria services during public health emergencies in low- and middle-income countries. Methods: The systematic review was registered in the international register for prospective systematic reviews. 
It will include activities undertaken to improve human health either through preventing the occurrence of HIV, tuberculosis, or malaria, reducing the severity among patients, or promoting the restoration of functioning lost as a result of experiencing HIV, tuberculosis, or malaria during health emergencies. These will include policy-level (eg, development of guidelines), health facility–level (eg, service rescheduling), and community-level interventions (eg, community drug distribution). Service delivery will be in terms of improving access, availability, use, and coverage. We will report on any interventions to maintain services along the care cascade for HIV, tuberculosis, or malaria. Peer-reviewed study databases including MEDLINE, Web of Science, Embase, Cochrane, and Global Index Medicus will be searched. Reference lists from global reports on HIV/AIDS, tuberculosis, or malaria will also be searched. We will use the GRADE-CERQual (Grading of Recommendations Assessment, Development, and Evaluation–Confidence in Evidence from Reviews of Qualitative Research) approach to report on the quality of evidence in each paper. The information from the studies will be synthesized at the disease or condition level (HIV/AIDS, tuberculosis, and malaria), implementation level (policy, health facility, and community), and outcomes (improving access, availability, use, or coverage). We will use the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist to report findings and discuss implications for strengthening preparedness and response, as well as strengthening health systems in low- and middle-income countries. Results: The initial search for published literature was conducted between January 2023 and March 2023 and yielded 8119 studies. At the time of publication, synthesis and interpretation of results were being concluded. Final results will be published in 2025. 
Conclusions: The findings will inform the development of national and global guidance to minimize disruption of services for patients with HIV/AIDS, tuberculosis, and malaria during public health emergencies. Trial Registration: PROSPERO CRD42023408967; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=408967 International Registered Report Identifier (IRRID): PRR1-10.2196/64316 UR - https://www.researchprotocols.org/2025/1/e64316 UR - http://dx.doi.org/10.2196/64316 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/64316 ER - TY - JOUR AU - Rohrer, Rebecca AU - Wilson, Allegra AU - Baumgartner, Jennifer AU - Burton, Nicole AU - Ortiz, R. Ray AU - Dorsinville, Alan AU - Jones, E. Lucretia AU - Greene, K. Sharon PY - 2025/1/14 TI - Nowcasting to Monitor Real-Time Mpox Trends During the 2022 Outbreak in New York City: Evaluation Using Reportable Disease Data Stratified by Race or Ethnicity JO - Online J Public Health Inform SP - e56495 VL - 17 KW - data quality KW - epidemiology KW - forecasting KW - infectious disease KW - morbidity and mortality trends KW - mpox KW - nowcasting KW - public health practice KW - surveillance N2 - Background: Applying nowcasting methods to partially accrued reportable disease data can help policymakers interpret recent epidemic trends despite data lags and quickly identify and remediate health inequities. During the 2022 mpox outbreak in New York City, we applied Nowcasting by Bayesian Smoothing (NobBS) to estimate recent cases, citywide and stratified by race or ethnicity (Black or African American, Hispanic or Latino, and White). However, in real time, it was unclear if the estimates were accurate. Objective: We evaluated the accuracy of estimated mpox case counts across a range of NobBS implementation options. 
Methods: We evaluated NobBS performance for New York City residents with a confirmed or probable mpox diagnosis or illness onset from July 8 through September 30, 2022, as compared with fully accrued cases. We used the exponentiated average log score (average score) to compare moving window lengths, stratifying or not by race or ethnicity, diagnosis and onset dates, and daily and weekly aggregation. Results: During the study period, 3305 New York City residents were diagnosed with mpox (median 4, IQR 3-5 days from diagnosis to diagnosis report). Of these, 812 (25%) had missing onset dates, and of these, 230 (28%) had unknown race or ethnicity. The median lag in days from onset to onset report was 10 (IQR 7-14). For daily hindcasts by diagnosis date, the average score was 0.27 for the 14-day moving window used in real time. Average scores improved (increased) with longer moving windows (maximum: 0.47 for 49-day window). Stratifying by race or ethnicity improved performance, with an overall average score of 0.38 for the 14-day moving window (maximum: 0.57 for 49-day window). Hindcasts for White patients performed best, with average scores of 0.45 for the 14-day window and 0.75 for the 49-day window. For unstratified, daily hindcasts by onset date, the average score ranged from 0.16 for the 42-day window to 0.30 for the 14-day window. Performance was not improved by weekly aggregation. Hindcasts underestimated diagnoses in early August after the epidemic peaked, then overestimated diagnoses in late August as the epidemic waned. Estimates were most accurate during September when cases were low and stable. Conclusions: Performance was better when hindcasting by diagnosis date than by onset date, consistent with shorter lags and higher completeness for diagnoses. For daily hindcasts by diagnosis date, longer moving windows performed better, but direct comparisons are limited because longer windows could only be assessed after case counts in this outbreak had stabilized. 
Stratification by race or ethnicity improved performance and identified differences in epidemic trends across patient groups. Contributors to differences in performance across strata might include differences in case volume, epidemic trends, delay distributions, and interview success rates. Health departments need reliable nowcasting and rapid evaluation tools, particularly to promote health equity by ensuring accurate estimates within all strata. UR - https://ojphi.jmir.org/2025/1/e56495 UR - http://dx.doi.org/10.2196/56495 ID - info:doi/10.2196/56495 ER - TY - JOUR AU - Sasaki, Kenji AU - Ikeda, Yoichi AU - Nakano, Takashi PY - 2025/1/3 TI - Quantifying the Regional Disproportionality of COVID-19 Spread: Modeling Study JO - JMIR Form Res SP - e59230 VL - 9 KW - infectious disease KW - COVID-19 KW - epidemiology KW - public health KW - SARS-CoV-2 KW - pandemic KW - inequality measure KW - information theory KW - Kullback-Leibler divergence N2 - Background: The COVID-19 pandemic has caused serious health, economic, and social consequences worldwide. Understanding how infectious diseases spread can help mitigate these impacts. The Theil index, a measure of inequality rooted in information theory, is useful for identifying geographic disproportionality in COVID-19 incidence across regions. Objective: This study focused on capturing the degrees of regional disproportionality in incidence rates of infectious diseases over time. Using the Theil index, we aim to assess regional disproportionality in the spread of COVID-19 and detect epicenters where the number of infected individuals was disproportionately concentrated. Methods: To quantify the degree of disproportionality in the incidence rates, we applied the Theil index to the publicly available data of daily confirmed COVID-19 cases in the United States over a 1100-day period. 
This index measures relative disproportionality by comparing daily regional case distributions with population proportions, thereby identifying regions where infections are disproportionately concentrated. Results: Our analysis revealed a dynamic pattern of regional disproportionality in the confirmed cases by monitoring variations in regional contributions to the Theil index as the pandemic progressed. Over time, the index reflected a transition from localized outbreaks to widespread transmission, with high values corresponding to concentrated cases in some regions. We also found that the peaks in the Theil index often preceded surges in confirmed cases, suggesting its potential utility as an early warning signal. Conclusions: This study demonstrated that the Theil index is one of the effective indices for quantifying regional disproportionality in COVID-19 incidence rates. Although the Theil index alone cannot fully capture all aspects of pandemic dynamics, it serves as a valuable tool when used alongside other indicators such as infection and hospitalization rates. This approach allows policy makers to monitor regional disproportionality efficiently, offering insights for early intervention and targeted resource allocation. UR - https://formative.jmir.org/2025/1/e59230 UR - http://dx.doi.org/10.2196/59230 ID - info:doi/10.2196/59230 ER - TY - JOUR AU - Brice, N. Syaribah AU - Boutilier, J. Justin AU - Palmer, Geraint AU - Harper, R. 
Paul AU - Knight, Vincent AU - Tuson, Mark AU - Gartner, Daniel PY - 2024/12/13 TI - Close-Up on Ambulance Service Estimation in Indonesia: Monte Carlo Simulation Study JO - Interact J Med Res SP - e54240 VL - 13 KW - emergency medical services KW - ambulance services KW - hospital emergency services KW - Southeast Asian countries KW - low-and-middle-income countries KW - EMS KW - survey N2 - Background: Emergency medical services have a pivotal role in giving timely and appropriate responses to emergency events caused by medical, natural, or human-caused disasters. To provide adequate resources for the emergency services, such as ambulances, it is necessary to understand the demand for such services. In Indonesia, estimates of demand for emergency services cannot be obtained easily due to a lack of published literature or official reports concerning the matter. Objective: This study aimed to ascertain an estimate of the annual volume of hospital emergency visits and the corresponding demand for ambulance services in the city of Jakarta. Methods: In this study, we addressed the problem of emergency services demand estimation when aggregated detailed data are not available or are not part of the routine data collection. We used survey data together with the local Office of National Statistics reports and sample data from hospital emergency departments to establish parameter estimation. This involved estimating 4 parameters: the population of each area per period (day and night), the annual per capita hospital emergency visits, the probability of an emergency taking place in each period, and the rate of ambulance need per area. Monte Carlo simulation and naïve methods were used to generate an estimation for the mean ambulance needs per area in Jakarta. Results: The results estimated that the total annual ambulance need in Jakarta is between 83,000 and 241,000. 
Assuming the rate of ambulance usage in Jakarta at 9.3%, we estimated the total annual hospital emergency visits in Jakarta at around 0.9-2.6 million. The study also found that the estimation from using the simulation method was smaller than the average (naïve) methods (P<.001). Conclusions: The results provide an estimation of the annual emergency services needed for the city of Jakarta. In the absence of aggregated routinely collected data on emergency medical service usage in Jakarta, our results provide insights into whether the current emergency services, such as ambulances, have been adequately provided. UR - https://www.i-jmr.org/2024/1/e54240 UR - http://dx.doi.org/10.2196/54240 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/54240 ER - TY - JOUR AU - Wu, A. Scott AU - Soetikno, G. Alan AU - Ozer, A. Egon AU - Welch, B. Sarah AU - Liu, Yingxuan AU - Havey, J. Robert AU - Murphy, L. Robert AU - Hawkins, Claudia AU - Mason, Maryann AU - Post, A. Lori AU - Achenbach, J. Chad AU - Lundberg, L. Alexander PY - 2024/12/5 TI - Updated Surveillance Metrics and History of the COVID-19 Pandemic (2020-2023) in Canada: Longitudinal Trend Analysis JO - JMIR Public Health Surveill SP - e53218 VL - 10 KW - SARS-CoV-2 KW - COVID-19 KW - Canada KW - pandemic KW - surveillance KW - transmission KW - acceleration KW - deceleration KW - dynamic panel KW - generalized method of moments KW - GMM KW - Arellano-Bond KW - 7-day lag KW - k KW - metrics KW - epidemiology KW - dynamic KW - genomic KW - historical context KW - outbreak threshold N2 - Background: This study provides an update on the status of the COVID-19 pandemic in Canada, building upon our initial analysis conducted in 2020 by incorporating an additional 2 years of data. 
Objective: This study aims to (1) summarize the status of the pandemic in Canada when the World Health Organization (WHO) declared the end of the public health emergency for the COVID-19 pandemic on May 5, 2023; (2) use dynamic and genomic surveillance methods to describe the history of the pandemic in Canada and situate the window of the WHO declaration within the broader history; and (3) provide historical context for the course of the pandemic in Canada. Methods: This longitudinal study analyzed trends in traditional surveillance data and dynamic panel estimates for COVID-19 transmissions and deaths in Canada from June 2020 to May 2023. We also used sequenced SARS-CoV-2 variants from the Global Initiative on Sharing All Influenza Data (GISAID) to identify the appearance and duration of variants of concern. For these sequences, we used Nextclade nomenclature to collect clade designations and Pangolin nomenclature for lineage designations of SARS-CoV-2. We used 1-sided t tests of dynamic panel regression coefficients to measure the persistence of COVID-19 transmissions around the WHO declaration. Finally, we conducted a 1-sided t test for whether provincial and territorial weekly speed was greater than an outbreak threshold of 10. We ran the test iteratively with 6 months of data across the sample period. Results: Canada's speed remained below the outbreak threshold for 8 months by the time of the WHO declaration ending the COVID-19 emergency of international concern. Acceleration and jerk were also low and stable. While the 1-day persistence coefficient remained statistically significant and positive (1.074; P<.001), the 7-day coefficient was negative and small in magnitude (−0.080; P=.02). 
Furthermore, shift parameters for either of the 2 most recent weeks around May 5, 2023, were negligible (0.003 and 0.018, respectively, with P values of .75 and .31), meaning the clustering effect of new COVID-19 cases had remained stable in the 2 weeks around the WHO declaration. From December 2021 onward, Omicron was the predominant variant of concern in sequenced viral samples. The rolling 1-sided t test of speed equal to 10 became entirely insignificant from mid-October 2022 onward. Conclusions: While COVID-19 continues to circulate in Canada, the rate of transmission remained well below the threshold of an outbreak for 8 months ahead of the WHO declaration. Both standard and enhanced surveillance metrics confirm that the pandemic had largely ended in Canada by the time of the WHO declaration. These results can inform future public health interventions and strategies in Canada, as well as contribute to the global understanding of the trajectory of the COVID-19 pandemic. UR - https://publichealth.jmir.org/2024/1/e53218 UR - http://dx.doi.org/10.2196/53218 UR - http://www.ncbi.nlm.nih.gov/pubmed/39471286 ID - info:doi/10.2196/53218 ER - TY - JOUR AU - Camirand Lemyre, Félix AU - Lévesque, Simon AU - Domingue, Marie-Pier AU - Herrmann, Klaus AU - Ethier, Jean-François PY - 2024/11/14 TI - Distributed Statistical Analyses: A Scoping Review and Examples of Operational Frameworks Adapted to Health Analytics JO - JMIR Med Inform SP - e53622 VL - 12 KW - distributed algorithms KW - generalized linear models KW - horizontally partitioned data KW - GLMs KW - learning health systems KW - distributed analysis KW - federated analysis KW - data science KW - data custodians KW - algorithms KW - statistics KW - synthesis KW - review methods KW - searches KW - scoping N2 - Background: Data from multiple organizations are crucial for advancing learning health systems. 
However, ethical, legal, and social concerns may restrict the use of standard statistical methods that rely on pooling data. Although distributed algorithms offer alternatives, they may not always be suitable for health frameworks. Objective: This study aims to support researchers and data custodians in three ways: (1) providing a concise overview of the literature on statistical inference methods for horizontally partitioned data, (2) describing the methods applicable to generalized linear models (GLMs) and assessing their underlying distributional assumptions, and (3) adapting existing methods to make them fully usable in health settings. Methods: A scoping review methodology was used for the literature mapping, from which methods presenting a methodological framework for GLM analyses with horizontally partitioned data were identified and assessed from the perspective of applicability in health settings. Statistical theory was used to adapt methods and derive the properties of the resulting estimators. Results: From the review, 41 articles were selected and 6 approaches were extracted to conduct standard GLM-based statistical analysis. However, these approaches assumed evenly and identically distributed data across nodes. Consequently, statistical procedures were derived to accommodate uneven node sample sizes and heterogeneous data distributions across nodes. Workflows and detailed algorithms were developed to highlight information sharing requirements and operational complexity. Conclusions: This study contributes to the field of health analytics by providing an overview of the methods that can be used with horizontally partitioned data by adapting these methods to the context of heterogeneous health data and clarifying the workflows and quantities exchanged by the methods discussed. Further analysis of the confidentiality preserved by these methods is needed to fully understand the risk associated with the sharing of summary statistics. 
UR - https://medinform.jmir.org/2024/1/e53622 UR - http://dx.doi.org/10.2196/53622 ID - info:doi/10.2196/53622 ER - TY - JOUR AU - Cabrera Alvargonzalez, J. Jorge AU - Larrañaga, Ana AU - Martinez, Javier AU - Pérez Castro, Sonia AU - Rey Cao, Sonia AU - Daviña Nuñez, Carlos AU - Del Campo Pérez, Víctor AU - Duran Parrondo, Carmen AU - Suarez Luque, Silvia AU - González Alonso, Elena AU - Silva Tojo, José Alfredo AU - Porteiro, Jacobo AU - Regueiro, Benito PY - 2024/9/24 TI - Assessment of the Effective Sensitivity of SARS-CoV-2 Sample Pooling Based on a Large-Scale Screening Experience: Retrospective Analysis JO - JMIR Public Health Surveill SP - e54503 VL - 10 KW - pooling KW - sensitivity KW - SARS-CoV-2 KW - PCR KW - saliva KW - screening KW - surveillance KW - COVID-19 KW - nonsymptomatic KW - transmission control N2 - Background: The development of new large-scale saliva pooling detection strategies can significantly enhance testing capacity and frequency for asymptomatic individuals, which is crucial for containing SARS-CoV-2. Objective: This study aims to implement and scale-up a SARS-CoV-2 screening method using pooled saliva samples to control the virus in critical areas and assess its effectiveness in detecting asymptomatic infections. Methods: Between August 2020 and February 2022, our laboratory received a total of 928,357 samples. Participants collected at least 1 mL of saliva using a self-sampling kit and registered their samples via a smartphone app. All samples were directly processed using AutoMate 2550 for preanalytical steps and then transferred to Microlab STAR, managed with the HAMILTON Pooling software for pooling. The standard pool preset size was 20 samples but was adjusted to 5 when the prevalence exceeded 2% in any group. Real-time polymerase chain reaction (RT-PCR) was conducted using the Allplex SARS-CoV-2 Assay until July 2021, followed by the Allplex SARS-CoV-2 FluA/FluB/RSV assay for the remainder of the study period. 
Results: Of the 928,357 samples received, 887,926 (95.64%) were fully processed into 56,126 pools. Of these pools, 4863 tested positive, detecting 5720 asymptomatic infections. This allowed for a comprehensive analysis of pooling's impact on RT-PCR sensitivity and false-negative rate (FNR), including data on positive samples per pool (PPP). We defined Ctref as the minimum cycle threshold (Ct) of each data set from a sample or pool and compared these Ctref results from pooled samples with those of the individual tests (ΔCtP). We then examined their deviation from the expected offset due to dilution [ΔΔCtP = ΔCtP − log2]. In this work, the ΔCtP and ΔΔCtP were 2.23 versus 3.33 and −0.89 versus 0.23, respectively, comparing global results with results for pools with 1 positive sample per pool. Therefore, depending on the number of genes used in the test and the size of the pool, we can evaluate the FNR and effective sensitivity (1 − FNR) of the test configuration. In our scenario, with a maximum of 20 samples per pool and 3 target genes, statistical observations indicated an effective sensitivity exceeding 99%. From an economic perspective, the focus is on pooling efficiency, measured by the effective number of persons that can be tested with 1 test, referred to as persons per test (PPT). In this study, the global PPT was 8.66, reflecting savings of over 20 million euros (US $22 million) based on our reagent prices. Conclusions: Our results demonstrate that, as expected, pooling reduces the sensitivity of RT-PCR. However, with the appropriate pool size and the use of multiple target genes, effective sensitivity can remain above 99%. Saliva pooling may be a valuable tool for screening and surveillance in asymptomatic individuals and can aid in controlling SARS-CoV-2 transmission. Further studies are needed to assess the effectiveness of these strategies for SARS-CoV-2 and their application to other microorganisms or biomarkers detected by PCR. 
UR - https://publichealth.jmir.org/2024/1/e54503 UR - http://dx.doi.org/10.2196/54503 UR - http://www.ncbi.nlm.nih.gov/pubmed/39316785 ID - info:doi/10.2196/54503 ER - TY - JOUR AU - Jing, Liwei AU - Yu, Hongmei AU - Lu, Qing PY - 2024/8/23 TI - Further Exploring the Public Health Implications of the Network Scale-Up Method: Cross-Sectional Survey Study JO - JMIR Public Health Surveill SP - e48289 VL - 10 KW - network scale-up method KW - public health implications KW - people who inject drugs KW - popularity ratio KW - information transmission rate KW - PWID N2 - Background: The decline in the number of new HIV infections among adults has slowed down, gradually becoming the biggest obstacle to achieving the 2030 target of ending the HIV/AIDS epidemic. Thus, a political declaration to ensure that 90% of people at high risk of HIV infection can access comprehensive prevention services was proposed by the United Nations General Assembly. Therefore, obtaining an accurate estimated size of high-risk populations is required as a prior condition to plan and implement HIV prevention services. The network scale-up method (NSUM) was recommended by the United Nations Programme on HIV/AIDS and the World Health Organization to estimate the sizes of populations at high risk of HIV infection; however, we found that the NSUM also revealed underlying population characteristics of female sex workers in addition to being used to estimate the population size. Such information on underlying population characteristics is very useful in improving the planning and implementation of HIV prevention services. This is especially relevant for people who inject drugs, where in addition to stigma and discrimination, criminalization further hinders access to HIV prevention services. 
Objective: We aimed to conduct a further exploration of the public health implications of the NSUM by using it to estimate the population size, popularity ratio, and information transmission rate among people who inject drugs. Methods: A stratified 2-stage cluster survey of the general population and a respondent-driven sampling survey of people who inject drugs were conducted in the urban district of Taiyuan, China, in 2021. Results: The estimated size of the population of people who inject drugs in Taiyuan was 1241.9 (95% CI 1009.2–1474.9), corresponding to 4.4×10⁻²% (95% CI 3.6×10⁻²% to 5.2×10⁻²%) of the adult population aged 15–64 years. The estimated popularity ratio of people who inject drugs was 53.6% (95% CI 47.2%–60.1%), and the estimated information transmission rate was 87.9% (95% CI 86.5%–89.3%). Conclusions: In addition to being used to estimate the size of the population of people who inject drugs, the NSUM revealed that they have smaller-sized personal social networks while concealing their drug use, and these underlying population characteristics are extremely useful for planning appropriate service delivery approaches with the fewest barriers for people who inject drugs to access HIV prevention services. Therefore, more cost-effectiveness brings new public health implications for the NSUM, which makes it even more promising for its application. 
UR - https://publichealth.jmir.org/2024/1/e48289 UR - http://dx.doi.org/10.2196/48289 ID - info:doi/10.2196/48289 ER - TY - JOUR AU - Ohsawa, Yukio AU - Sun, Yi AU - Sekiguchi, Kaira AU - Kondo, Sae AU - Maekawa, Tomohide AU - Takita, Morihito AU - Tanimoto, Tetsuya AU - Kami, Masahiro PY - 2024/8/21 TI - Risk Index of Regional Infection Expansion of COVID-19: Moving Direction Entropy Study Using Mobility Data and Its Application to Tokyo JO - JMIR Public Health Surveill SP - e57742 VL - 10 KW - suppressing the spread of infection KW - index for risk assessment KW - local regions KW - diversity of mobility KW - mobility data KW - moving direction entropy KW - MDE KW - social network model KW - COVID-19 KW - influenza KW - sexually transmitted diseases N2 - Background: Policies, such as stay home, bubbling, and stay with your community, recommending that individuals reduce contact with diverse communities, including families and schools, have been introduced to mitigate the spread of the COVID-19 pandemic. However, these policies are violated if individuals from various communities gather, which is a latent risk in a real society where people move among various unreported communities. Objective: We aimed to create a physical index to assess the possibility of contact between individuals from diverse communities, which serves as an indicator of the potential risk of SARS-CoV-2 spread when considered and combined with existing indices. Methods: Moving direction entropy (MDE), which quantifies the diversity of moving directions of individuals in each local region, is proposed as an index to evaluate a region's risk of contact of individuals from diverse communities. MDE was computed for each inland municipality in Tokyo using mobility data collected from smartphones before and during the COVID-19 pandemic. 
To validate the hypothesis that the impact of intercommunity contact on infection expansion becomes larger for a virus with larger infectivity, we compared the correlations of the expansion of infectious diseases with indices, including MDE and the densities of supermarkets, restaurants, etc. In addition, we analyzed the temporal changes in MDE in municipalities. Results: This study had 4 important findings. First, the MDE values for local regions showed significant invariance between different periods according to the Spearman rank correlation coefficient (>0.9). Second, MDE was found to correlate with the rate of infection cases of COVID-19 among local populations in 53 inland regions (average of 0.76 during the period of expansion). The density of restaurants had a similar correlation with COVID-19. The correlation between MDE and the rate of infection was smaller for influenza than for COVID-19, and tended to be even smaller for sexually transmitted diseases (order of infectivity). These findings support the hypothesis. Third, the spread of COVID-19 was accelerated in regions with high-rank MDE values compared to those with high-rank restaurant densities during and after the period of the governmental declaration of emergency (P<.001). Fourth, the MDE values tended to be high and increased during the pandemic period in regions where influx or daytime movement was present. A possible explanation for the third and fourth findings is that policymakers and living people have been overlooking MDE. Conclusions: We recommend monitoring the regional values of MDE to reduce the risk of infection spread. To aid in this monitoring, we present a method to create a heatmap of MDE values, thereby drawing public attention to behaviors that facilitate contact between communities during a highly infectious disease pandemic. 
UR - https://publichealth.jmir.org/2024/1/e57742 UR - http://dx.doi.org/10.2196/57742 UR - http://www.ncbi.nlm.nih.gov/pubmed/39037745 ID - info:doi/10.2196/57742 ER - TY - JOUR AU - Chang, Min-Chien AU - Wen, Tzai-Hung PY - 2024/8/20 TI - The Mediating Role of Human Mobility in Temporal-Lagged Relationships Between Risk Perception and COVID-19 Dynamics in Taiwan: Statistical Modeling for Comparing the Pre-Omicron and Omicron Eras JO - JMIR Public Health Surveill SP - e55183 VL - 10 KW - human mobility KW - risk perception KW - COVID-19 KW - Omicron KW - Taiwan KW - pandemic KW - disease transmission KW - pandemic dynamics KW - global threats KW - infectious disease KW - behavioural health KW - public health KW - surveillance N2 - Background: The COVID-19 pandemic has profoundly impacted all aspects of human life for over 3 years. Understanding the evolution of public risk perception during these periods is crucial. Few studies explore the mechanisms for reducing disease transmission due to risk perception. Thus, we hypothesize that changes in human mobility play a mediating role between risk perception and the progression of the pandemic. Objective: The study aims to explore how various forms of human mobility, including essential, nonessential, and job-related behaviors, mediate the temporal relationships between risk perception and pandemic dynamics. Methods: We used distributed-lag linear structural equation models to compare the mediating impact of human mobility across different virus variant periods. These models examined the temporal dynamics and time-lagged effects among risk perception, changes in mobility, and virus transmission in Taiwan, focusing on two distinct periods: (1) April-August 2021 (pre-Omicron era) and (2) February-September 2022 (Omicron era). 
Results: In the pre-Omicron era, our findings showed that an increase in public risk perception correlated with significant reductions in COVID-19 cases across various types of mobility within specific time frames. Specifically, we observed a decrease of 5.59 (95% CI −4.35 to −6.83) COVID-19 cases per million individuals after 7 weeks in nonessential mobility, while essential mobility demonstrated a reduction of 10.73 (95% CI −9.6030 to −11.8615) cases after 8 weeks. Additionally, job-related mobility resulted in a decrease of 3.96 (95% CI −3.5039 to −4.4254) cases after 11 weeks. However, during the Omicron era, these effects notably diminished. A reduction of 0.85 (95% CI −1.0046 to −0.6953) cases through nonessential mobility after 10 weeks and a decrease of 0.69 (95% CI −0.7827 to −0.6054) cases through essential mobility after 12 weeks were observed. Conclusions: This study confirms that changes in mobility serve as a mediating factor between heightened risk perception and pandemic mitigation in both pre-Omicron and Omicron periods. This suggests that elevating risk perception is notably effective in impeding virus progression, especially when vaccines are unavailable or their coverage remains limited. Our findings provide significant value for health authorities in devising policies to address the global threats posed by emerging infectious diseases. UR - https://publichealth.jmir.org/2024/1/e55183 UR - http://dx.doi.org/10.2196/55183 ID - info:doi/10.2196/55183 ER - TY - JOUR AU - Pham, Hai-Thanh AU - Do, Toan AU - Baek, Jonggyu AU - Nguyen, Cong-Khanh AU - Pham, Quang-Thai AU - Nguyen, L. 
Hoa AU - Goldberg, Robert AU - Pham, Loc Quang AU - Giang, Minh Le PY - 2024/8/20 TI - Handling Missing Data in COVID-19 Incidence Estimation: Secondary Data Analysis JO - JMIR Public Health Surveill SP - e53719 VL - 10 KW - imputation method KW - COVID-19 incidence rate KW - crude bias KW - crude RMSE KW - root mean square error KW - percentage change KW - pandemic KW - Vietnam KW - surveillance KW - population health KW - analytical method N2 - Background: The COVID-19 pandemic has revealed significant challenges in disease forecasting and in developing a public health response, emphasizing the need to manage missing data from various sources in making accurate forecasts. Objective: We aimed to show how handling missing data can affect estimates of the COVID-19 incidence rate (CIR) in different pandemic situations. Methods: This study used data from the COVID-19/SARS-CoV-2 surveillance system at the National Institute of Hygiene and Epidemiology, Vietnam. We separated the available data set into 3 distinct periods: zero COVID-19, transition, and new normal. We randomly removed 5% to 30% of data that were missing completely at random, with a break of 5% at each time point in the variable daily caseload of COVID-19. We selected 7 analytical methods to assess the effects of handling missing data and calculated statistical and epidemiological indices to measure the effectiveness of each method. Results: Our study examined missing data imputation performance across 3 study time periods: zero COVID-19 (n=3149), transition (n=1290), and new normal (n=9288). Imputation analyses showed that K-nearest neighbor (KNN) had the lowest mean absolute percentage change (APC) in CIR across the range (5% to 30%) of missing data. For instance, with 15% missing data, KNN resulted in 10.6%, 10.6%, and 9.7% average bias across the zero COVID-19, transition, and new normal periods, compared to 39.9%, 51.9%, and 289.7% with the maximum likelihood method. 
The autoregressive integrated moving average model showed the greatest mean APC in the mean number of confirmed cases of COVID-19 during each COVID-19 containment cycle (CCC) when we imputed the missing data in the zero COVID-19 period, rising from 226.3% at the 5% missing level to 6955.7% at the 30% missing level. Imputing missing data with median imputation methods had the lowest bias in the average number of confirmed cases in each CCC at all levels of missing data. In detail, in the 20% missing scenario, while median imputation had an average bias of 16.3% for confirmed cases in each CCC, which was lower than the KNN figure, maximum likelihood imputation showed a bias on average of 92.4% for confirmed cases in each CCC, which was the highest figure. During the new normal period in the 25% and 30% missing data scenarios, KNN imputation had average biases for CIR and confirmed cases in each CCC ranging from 21% to 32% for both, while maximum likelihood and moving average imputation showed biases on average above 250% for both CIR and confirmed cases in each CCC. Conclusions: Our study emphasizes the importance of understanding that the specific imputation method used by investigators should be tailored to the specific epidemiological context and data collection environment to ensure reliable estimates of the CIR. UR - https://publichealth.jmir.org/2024/1/e53719 UR - http://dx.doi.org/10.2196/53719 ID - info:doi/10.2196/53719 ER - TY - JOUR AU - Howell, R. Carrie AU - Zhang, Li AU - Clay, J. Olivio AU - Dutton, Gareth AU - Horton, Trudi AU - Mugavero, J. Michael AU - Cherrington, L. 
Andrea PY - 2024/8/7 TI - Social Determinants of Health Phenotypes and Cardiometabolic Condition Prevalence Among Patients in a Large Academic Health System: Latent Class Analysis JO - JMIR Public Health Surveill SP - e53371 VL - 10 KW - social determinants of health KW - electronic medical record KW - phenotypes KW - diabetes KW - obesity KW - cardiovascular disease KW - obese KW - social determinants KW - social determinant KW - cardiometabolic KW - risk factors KW - risk factor KW - latent class analysis KW - cardiometabolic disease KW - EMR KW - EHR KW - electronic health record N2 - Background: Adverse social determinants of health (SDoH) have been associated with cardiometabolic disease; however, disparities in cardiometabolic outcomes are rarely the result of a single risk factor. Objective: This study aimed to identify and characterize SDoH phenotypes based on patient-reported and neighborhood-level data from the institutional electronic medical record and evaluate the prevalence of diabetes, obesity, and other cardiometabolic diseases by phenotype status. Methods: Patient-reported SDoH were collected (January to December 2020) and neighborhood-level social vulnerability, neighborhood socioeconomic status, and rurality were linked via census tract to geocoded patient addresses. Diabetes status was coded in the electronic medical record using International Classification of Diseases codes; obesity was defined using measured BMI ≥30 kg/m2. Latent class analysis was used to identify clusters of SDoH (eg, phenotypes); we then examined differences in the prevalence of cardiometabolic conditions based on phenotype status using prevalence ratios (PRs). Results: Complete data were available for analysis for 2380 patients (mean age 53, SD 16 years; n=1405, 59% female; n=1198, 50% non-White). 
Roughly 8% (n=179) reported housing insecurity, 30% (n=710) reported resource needs (food, health care, or utilities), and 49% (n=1158) lived in a high-vulnerability census tract. We identified 3 patient SDoH phenotypes: (1) high social risk, defined largely by self-reported SDoH (n=217, 9%); (2) adverse neighborhood SDoH (n=1353, 56%), defined largely by adverse neighborhood-level measures; and (3) low social risk (n=810, 34%), defined as low individual- and neighborhood-level risks. Patients with an adverse neighborhood SDoH phenotype had higher prevalence of diagnosed type 2 diabetes (PR 1.19, 95% CI 1.06-1.33), hypertension (PR 1.14, 95% CI 1.02-1.27), peripheral vascular disease (PR 1.46, 95% CI 1.09-1.97), and heart failure (PR 1.46, 95% CI 1.20-1.79). Conclusions: Patients with the adverse neighborhood SDoH phenotype had higher prevalence of poor cardiometabolic conditions compared to phenotypes determined by individual-level characteristics, suggesting that neighborhood environment plays a role, even if individual measures of socioeconomic status are not suboptimal. UR - https://publichealth.jmir.org/2024/1/e53371 UR - http://dx.doi.org/10.2196/53371 ID - info:doi/10.2196/53371 ER - TY - JOUR AU - Chen, Yu AU - Chen, Shouhang AU - Shen, Yuanfang AU - Li, Zhi AU - Li, Xiaolong AU - Zhang, Yaodong AU - Zhang, Xiaolong AU - Wang, Fang AU - Jin, Yuefei PY - 2024/7/31 TI - Molecular Evolutionary Dynamics of Coxsackievirus A6 Causing Hand, Foot, and Mouth Disease From 2021 to 2023 in China: Genomic Epidemiology Study JO - JMIR Public Health Surveill SP - e59604 VL - 10 KW - coxsackievirus A6 KW - hand, foot, and mouth disease KW - evolution KW - molecular epidemiology KW - China KW - CV-A6 KW - HFMD N2 - Background: Hand, foot, and mouth disease (HFMD) is a global public health concern, notably within the Asia-Pacific region. 
Recently, the primary pathogen causing HFMD outbreaks across numerous countries, including China, is coxsackievirus (CV) A6, one of the most prevalent enteroviruses in the world. It is a new variant that has undergone genetic recombination and evolution, which might not only induce modifications in the clinical manifestations of HFMD but also heighten its pathogenicity because of nucleotide mutation accumulation. Objective: The study assessed the epidemiological characteristics of HFMD in China and characterized the molecular epidemiology of the major pathogen (CV-A6) causing HFMD. We attempted to establish the association between disease progression and viral genetic evolution through a molecular epidemiological study. Methods: Surveillance data from the Chinese Center for Disease Control and Prevention from 2021 to 2023 were used to analyze the epidemiological seasons and peaks of HFMD in Henan, China, and capture the results of HFMD pathogen typing. We analyzed the evolutionary characteristics of all full-length CV-A6 sequences in the NCBI database and the isolated sequences in Henan. To characterize the molecular evolution of CV-A6, time-scaled tree and historical population dynamics regarding CV-A6 sequences were estimated. Additionally, we analyzed the isolated strains for mutated or missing amino acid sites compared to the prototype CV-A6 strain. Results: The 2021-2023 epidemic seasons for HFMD in Henan usually lasted from June to August, with peaks around June and July. The monthly case reporting rate during the peak period ranged from 20.7% (4854/23,440) to 35% (12,135/34,706) of the total annual number of cases. Analysis of the pathogen composition of 2850 laboratory-confirmed cases identified 8 enterovirus serotypes, among which CV-A6 accounted for the highest proportion (652/2850, 22.88%). CV-A6 emerged as the major pathogen for HFMD in 2022 (203/732, 27.73%) and 2023 (262/708, 37.01%). 
We analyzed all CV-A6 full-length sequences in the NCBI database and the evolutionary features of viruses isolated in Henan. In China, the D3 subtype gradually appeared from 2011, and by 2019, all CV-A6 virus strains belonged to the D3 subtype. The VP1 sequences analyzed in Henan showed that its subtypes were consistent with the national subtypes. Furthermore, we analyzed the molecular evolutionary features of CV-A6 using Bayesian phylogeny and found that the most recent common ancestor of CV-A6 D3 dates back to 2006 in China, earlier than the 2011 HFMD outbreak. Moreover, the strains isolated in 2023 had mutations at several amino acid sites compared to the original strain. Conclusions: The CV-A6 virus may have been introduced and circulating covertly within China prior to the large-scale HFMD outbreak. Our laboratory testing data confirmed the fluctuation and periodic patterns of CV-A6 prevalence. Our study provides valuable insights into understanding the evolutionary dynamics of CV-A6. UR - https://publichealth.jmir.org/2024/1/e59604 UR - http://dx.doi.org/10.2196/59604 ID - info:doi/10.2196/59604 ER - TY - JOUR AU - Lee, JinWook AU - Park, JuWon AU - Kim, Nayeon AU - Nari, Fatima AU - Bae, Seowoo AU - Lee, Ji Hyeon AU - Lee, Mingyu AU - Jun, Kwan Jae AU - Choi, Son Kui AU - Suh, Mina PY - 2024/7/22 TI - Socioeconomic Disparities in Six Common Cancer Survival Rates in South Korea: Population-Wide Retrospective Cohort Study JO - JMIR Public Health Surveill SP - e55011 VL - 10 KW - cancer survival KW - income level KW - socioeconomic status KW - deprivation index KW - inequality KW - nationwide analysis KW - cancer KW - South Korea KW - public health N2 - Background: In South Korea, the cancer incidence rate has increased by 56.5% from 2001 to 2021. Nevertheless, the 5-year cancer survival rate from 2017 to 2021 increased by 17.9% compared with that from 2001 to 2005. 
Cancer survival rates tend to decline with lower socioeconomic status, and variations exist in the survival rates among different cancer types. Analyzing socioeconomic patterns in the survival of patients with cancer can help identify high-risk groups and ensure that they benefit from interventions. Objective: The aim of this study was to analyze differences in survival rates among patients diagnosed with six types of cancer (stomach, colorectal, liver, breast, cervical, and lung cancers) based on socioeconomic status using Korean nationwide data. Methods: This study used the Korea Central Cancer Registry database linked to the National Health Information Database to follow up with patients diagnosed with cancer between 2014 and 2018 until December 31, 2021. Kaplan-Meier curves stratified by income status were generated, and log-rank tests were conducted for each cancer type to assess statistical significance. Hazard ratios with 95% CIs for any cause of overall survival were calculated using Cox proportional hazards regression models with the time since diagnosis. Results: The survival rates for the six different types of cancer were as follows: stomach cancer, 69.6% (96,404/138,462); colorectal cancer, 66.6% (83,406/125,156); liver cancer, 33.7% (23,860/70,712); lung cancer, 30.4% (33,203/109,116); breast cancer, 91.5% (90,730/99,159); and cervical cancer, 78% (12,930/16,580). When comparing the medical aid group to the highest income group, the hazard ratios were 1.72 (95% CI 1.66-1.79) for stomach cancer, 1.60 (95% CI 1.54-1.56) for colorectal cancer, 1.51 (95% CI 1.45-1.56) for liver cancer, 1.56 (95% CI 1.51-1.59) for lung cancer, 2.19 (95% CI 2.01-2.38) for breast cancer, and 1.65 (95% CI 1.46-1.87) for cervical cancer. A higher deprivation index and advanced diagnostic stage were associated with an increased risk of mortality. Conclusions: Socioeconomic status significantly mediates disparities in cancer survival in several cancer types. 
This effect is particularly pronounced in less fatal cancers such as breast cancer. Therefore, considering the type of cancer and socioeconomic factors, social and medical interventions such as early cancer detection and appropriate treatment are necessary for vulnerable populations. UR - https://publichealth.jmir.org/2024/1/e55011 UR - http://dx.doi.org/10.2196/55011 ID - info:doi/10.2196/55011 ER - TY - JOUR AU - Zhang, Zhuo AU - Xue, Dongmei AU - Bian, Ying PY - 2024/7/12 TI - Association Between Socioeconomic Inequalities in Pain and All-Cause Mortality in the China Health and Retirement Longitudinal Study: Longitudinal Cohort Study JO - JMIR Public Health Surveill SP - e54309 VL - 10 KW - pain KW - equality KW - all-cause mortality KW - concentration index KW - decomposition N2 - Background: Few studies focus on the equality of pain, and the relationship between pain and death is inconclusive. Investigating the distribution of pain and potential mortality risks is crucial for ameliorating painful conditions and devising targeted intervention measures. Objective: Our study aimed to investigate the association between inequalities in pain and all-cause mortality in China. Methods: Longitudinal cohort data from waves 1 and 2 of the China Health and Retirement Longitudinal Study (2011-2013) were used in this study. Pain was self-reported at baseline, and death information was obtained from the 2013 follow-up survey. The concentration index and its decomposition were used to explain the inequality of pain, and the association between pain and death was analyzed with a Cox proportional risk model. Results: A total of 16,747 participants were included, with an average age of 59.57 (SD 9.82) years. The prevalence of pain was 32.54% (8196/16,747). Among participants with pain, the main pain type was moderate pain (1973/5426, 36.36%), and the common pain locations were the waist (3232/16,747, 19.3%), legs (2476/16,747, 14.78%) and head (2250/16,747, 13.44%). 
We found that the prevalence of pain was concentrated in participants with low economic status (concentration index -0.066, 95% CI -0.078 to -0.054). Educational level (36.49%), location (36.87%), and economic status (25.05%) contributed significantly to the inequality of pain. In addition, Cox regression showed that pain was associated with an increased risk of all-cause mortality (hazard ratio 1.30, 95% CI 1.06-1.61). Conclusions: The prevalence of pain in Chinese adults is concentrated among participants with low economic status, and pain increases the risk of all-cause death. Our results highlight the importance of socioeconomic factors in reducing deaths due to pain inequalities by implementing targeted interventions. UR - https://publichealth.jmir.org/2024/1/e54309 UR - http://dx.doi.org/10.2196/54309 UR - http://www.ncbi.nlm.nih.gov/pubmed/38872381 ID - info:doi/10.2196/54309 ER - TY - JOUR AU - Liu, Chuchu AU - Holme, Petter AU - Lehmann, Sune AU - Yang, Wenchuan AU - Lu, Xin PY - 2024/6/28 TI - Nonrepresentativeness of Human Mobility Data and its Impact on Modeling Dynamics of the COVID-19 Pandemic: Systematic Evaluation JO - JMIR Form Res SP - e55013 VL - 8 KW - human mobility KW - data representativeness KW - population composition KW - COVID-19 KW - epidemiological modeling N2 - Background: In recent years, a range of novel smartphone-derived data streams about human mobility have become available on a near-real-time basis. These data have been used, for example, to perform traffic forecasting and epidemic modeling. During the COVID-19 pandemic in particular, human travel behavior has been considered a key component of epidemiological modeling to provide more reliable estimates about the volumes of the pandemic's importation and transmission routes, or to identify hot spots. However, nearly universally in the literature, the representativeness of these data, how they relate to the underlying real-world human mobility, has been overlooked. 
This disconnect between data and reality is especially relevant in the case of socially disadvantaged minorities. Objective: The objective of this study is to illustrate the nonrepresentativeness of data on human mobility and the impact of this nonrepresentativeness on modeling dynamics of the epidemic. This study systematically evaluates how real-world travel flows differ from census-based estimations, especially in the case of socially disadvantaged minorities, such as older adults and women, and further measures biases introduced by this difference in epidemiological studies. Methods: To understand the demographic composition of population movements, a nationwide mobility data set from 318 million mobile phone users in China from January 1 to February 29, 2020, was curated. Specifically, we quantified the disparity in the population composition between actual migrations and resident composition according to census data, and showed how this nonrepresentativeness impacts epidemiological modeling by constructing an age-structured SEIR (Susceptible-Exposed-Infected-Recovered) model of COVID-19 transmission. Results: We found a significant difference in the demographic composition between those who travel and the overall population. In the population flows, 59% (n=20,067,526) of travelers are young and 36% (n=12,210,565) of them are middle-aged (P<.001), which is completely different from the overall adult population composition of China (where 36% of individuals are young and 40% of them are middle-aged). This difference would introduce a striking bias in epidemiological studies: the estimation of maximum daily infections differs nearly 3 times, and the peak time has a large gap of 46 days. Conclusions: The difference between actual migrations and resident composition strongly impacts outcomes of epidemiological forecasts, which typically assume that flows represent underlying demographics. 
Our findings imply that it is necessary to measure and quantify the inherent biases related to nonrepresentativeness for accurate epidemiological surveillance and forecasting. UR - https://formative.jmir.org/2024/1/e55013 UR - http://dx.doi.org/10.2196/55013 UR - http://www.ncbi.nlm.nih.gov/pubmed/ ID - info:doi/10.2196/55013 ER - TY - JOUR AU - Ma, Shuli AU - Ge, Jie AU - Qin, Lei AU - Chen, Xiaoting AU - Du, Linlin AU - Qi, Yanbo AU - Bai, Li AU - Han, Yunfeng AU - Xie, Zhiping AU - Chen, Jiaxin AU - Jia, Yuehui PY - 2024/6/19 TI - Spatiotemporal Epidemiological Trends of Mpox in Mainland China: Spatiotemporal Ecological Comparison Study JO - JMIR Public Health Surveill SP - e57807 VL - 10 KW - mpox KW - spatiotemporal analysis KW - emergencies KW - prevention and control KW - public health N2 - Background: The World Health Organization declared mpox an international public health emergency. Since January 1, 2022, China has been ranked among the top 10 countries most affected by the mpox outbreak globally. However, there is a lack of spatial epidemiological studies on mpox, which are crucial for accurately mapping the spatial distribution and clustering of the disease. Objective: This study aims to provide geographically accurate visual evidence to determine priority areas for mpox prevention and control. Methods: Locally confirmed mpox cases were collected between June and November 2023 from 31 provinces of mainland China excluding Taiwan, Macao, and Hong Kong. Spatiotemporal epidemiological analyses, including spatial autocorrelation and regression analyses, were conducted to identify the spatiotemporal characteristics and clustering patterns of mpox attack rate and its spatial relationship with sociodemographic and socioeconomic factors. Results: From June to November 2023, a total of 1610 locally confirmed mpox cases were reported in 30 provinces in mainland China, resulting in an attack rate of 11.40 per 10 million people. 
Global spatial autocorrelation analysis showed that in July (Moran I=0.0938; P=.08), August (Moran I=0.1276; P=.08), and September (Moran I=0.0934; P=.07), the attack rates of mpox exhibited a clustered pattern and positive spatial autocorrelation. The Getis-Ord Gi* statistics identified hot spots of mpox attack rates in Beijing, Tianjin, Shanghai, Jiangsu, and Hainan. Beijing and Tianjin were consistent hot spots from June to October. No cold spots with low mpox attack rates were detected by the Getis-Ord Gi* statistics. Local Moran I statistics identified a high-high (HH) clustering of mpox attack rates in Guangdong, Beijing, and Tianjin. Guangdong province consistently exhibited HH clustering from June to November, while Beijing and Tianjin were identified as HH clusters from July to September. Low-low clusters were mainly located in Inner Mongolia, Xinjiang, Xizang, Qinghai, and Gansu. Ordinary least squares regression models showed that the cumulative mpox attack rates were significantly and positively associated with the proportion of the urban population (t0.05/2,1=2.4041; P=.02), per capita gross domestic product (t0.05/2,1=2.6955; P=.01), per capita disposable income (t0.05/2,1=2.8303; P=.008), per capita consumption expenditure (PCCE; t0.05/2,1=2.7452; P=.01), and PCCE for health care (t0.05/2,1=2.5924; P=.01). The geographically weighted regression models indicated a positive association and spatial heterogeneity between cumulative mpox attack rates and the proportion of the urban population, per capita gross domestic product, per capita disposable income, and PCCE, with high R2 values in north and northeast China. Conclusions: Hot spots and HH clustering of mpox attack rates identified by local spatial autocorrelation analysis should be considered key areas for precision prevention and control of mpox. Specifically, Guangdong, Beijing, and Tianjin provinces should be prioritized for mpox prevention and control. 
These findings provide geographically precise and visualized evidence to assist in identifying key areas for targeted prevention and control. UR - https://publichealth.jmir.org/2024/1/e57807 UR - http://dx.doi.org/10.2196/57807 UR - http://www.ncbi.nlm.nih.gov/pubmed/38896444 ID - info:doi/10.2196/57807 ER - TY - JOUR AU - Chan, Pok Chin AU - Lee, Shan Shui AU - Kwan, Ho Tsz AU - Wong, Shan Samuel Yeung AU - Yeoh, Eng-Kiong AU - Wong, Sze Ngai PY - 2024/6/19 TI - Population Behavior Changes Underlying Phasic Shifts of SARS-CoV-2 Exposure Settings Across 3 Omicron Epidemic Waves in Hong Kong: Prospective Cohort Study JO - JMIR Public Health Surveill SP - e51498 VL - 10 KW - exposure risk KW - contact setting KW - social distancing KW - epidemic control KW - participatory surveillance KW - SARS-CoV-2 KW - COVID-19 N2 - Background: Exposure risk was shown to have affected individual susceptibility and the epidemic spread of COVID-19. The dynamics of risk by and across exposure settings alongside the variations following the implementation of social distancing interventions are understudied. Objective: This study aims to examine the population's trajectory of exposure risk in different settings and its association with SARS-CoV-2 infection across 3 consecutive Omicron epidemic waves in Hong Kong. Methods: From March to June 2022, invitation letters were posted to 41,132 randomly selected residential addresses for the recruitment of households into a prospective population cohort. Through web-based monthly surveys coupled with email reminders, a representative from each enrolled household self-reported incidents of SARS-CoV-2 infections, COVID-19 vaccination uptake, their activity pattern in the workplace, and daily and social settings in the preceding month. As a proxy of their exposure risk, the reported activity trend in each setting was differentiated into trajectories based on latent class growth analyses. 
The associations of different trajectories of SARS-CoV-2 infection overall and by Omicron wave (wave 1: February-April; wave 2: May-September; wave 3: October-December) in 2022 were evaluated by using Cox proportional hazards models and Kaplan-Meier analysis. Results: In total, 33,501 monthly responses in the observation period of February-December 2022 were collected from 5321 individuals, with 41.7% (2221/5321) being male and a median age of 46 (IQR 34-57) years. Against an expanding COVID-19 vaccination coverage from 81.9% to 95.9% for 2 doses and 20% to 77.7% for 3 doses, the cumulative incidence of SARS-CoV-2 infection escalated from <0.2% to 25.3%, 32.4%, and 43.8% by the end of waves 1, 2, and 3, respectively. Throughout February-December 2022, 52.2% (647/1240) of participants had worked regularly on-site, 28.7% (356/1240) worked remotely, and 19.1% (237/1240) showed an assorted pattern. For daily and social settings, 4 and 5 trajectories were identified, respectively, with 11.5% (142/1240) and 14.6% (181/1240) of the participants gauged to have a high exposure risk. Compared to remote working, working regularly on-site (adjusted hazard ratio [aHR] 1.47, 95% CI 1.19-1.80) and living in a larger household (aHR 1.12, 95% CI 1.06-1.18) were associated with a higher risk of SARS-CoV-2 infection in wave 1. Those from the highest daily exposure risk trajectory (aHR 1.46, 95% CI 1.07-2.00) and the second highest social exposure risk trajectory (aHR 1.52, 95% CI 1.18-1.97) were also at an increased risk of infection in waves 2 and 3, respectively, relative to the lowest risk trajectory. Conclusions: In an infection-naive population, SARS-CoV-2 transmission was predominantly initiated at the workplace, accelerated in the household, and perpetuated in the daily and social environments, as stringent restrictions were scaled down. 
These patterns highlight the phasic shift of exposure settings, which is important for informing the effective calibration of targeted social distancing measures as an alternative to lockdown. UR - https://publichealth.jmir.org/2024/1/e51498 UR - http://dx.doi.org/10.2196/51498 UR - http://www.ncbi.nlm.nih.gov/pubmed/38896447 ID - info:doi/10.2196/51498 ER - TY - JOUR AU - Gauld, Christophe AU - Hartley, Sarah AU - Micoulaud-Franchi, Jean-Arthur AU - Royant-Parola, Sylvie PY - 2024/6/11 TI - Sleep Health Analysis Through Sleep Symptoms in 35,808 Individuals Across Age and Sex Differences: Comparative Symptom Network Study JO - JMIR Public Health Surveill SP - e51585 VL - 10 KW - symptom KW - epidemiology KW - age KW - sex KW - diagnosis KW - network approach KW - sleep KW - sleep health N2 - Background: Sleep health is a multidimensional construct that includes objective and subjective parameters and is influenced by individual sleep-related behaviors and sleep disorders. Symptom network analysis allows modeling of the interactions between variables, enabling both the visualization of relationships between different factors and the identification of the strength of those relationships. Given the known influence of sex and age on sleep health, network analysis can help explore sets of mutually interacting symptoms relative to these demographic variables. Objective: This study aimed to study the centrality of symptoms and compare age and sex differences regarding sleep health using a symptom network approach in a large French population that feels concerned about their sleep. Methods: Data were extracted from a questionnaire provided by the Réseau Morphée health network. A network analysis was conducted on 39 clinical variables related to sleep disorders and sleep health. After network estimation, statistical analyses consisted of calculating inferences of centrality, robustness (ie, testifying to a sufficient effect size), predictability, and network comparison. 
Sleep clinical variable centralities within the networks were analyzed by both sex and age using 4 age groups (18-30, 31-45, 46-55, and >55 years), and local symptom-by-symptom correlations determined. Results: Data of 35,808 participants were obtained. The mean age was 42.7 (SD 15.7) years, and 24,964 (69.7%) were women. Overall, there were no significant differences in the structure of the symptom networks between sexes or age groups. The most central symptoms across all groups were nonrestorative sleep and excessive daytime sleepiness. In the youngest group, additional central symptoms were chronic circadian misalignment and chronic sleep deprivation (related to sleep behaviors), particularly among women. In the oldest group, leg sensory discomfort and breath abnormality complaint were among the top 4 central symptoms. Symptoms of sleep disorders thus became more central with age than sleep behaviors. The high predictability of central nodes in one of the networks underlined its importance in influencing other nodes. Conclusions: The absence of structural difference between networks is an important finding, given the known differences in sleep between sexes and across age groups. These similarities suggest comparable interactions between clinical sleep variables across sexes and age groups and highlight the implication of common sleep and wake neural circuits and circadian rhythms in understanding sleep health. More precisely, nonrestorative sleep and excessive daytime sleepiness are central symptoms in all groups. The behavioral component is particularly central in young people and women. Sleep-related respiratory and motor symptoms are prominent in older people. These results underscore the importance of comprehensive sleep promotion and screening strategies tailored to sex and age to impact sleep health. 
UR - https://publichealth.jmir.org/2024/1/e51585 UR - http://dx.doi.org/10.2196/51585 UR - http://www.ncbi.nlm.nih.gov/pubmed/38861716 ID - info:doi/10.2196/51585 ER - TY - JOUR AU - Bennett, W. Brady AU - DuBose, Stephanie AU - Huang, A. Ya-Lin AU - Johnson, H. Christopher AU - Hoover, W. Karen AU - Wiener, Jeffrey AU - Purcell, W. David AU - Sullivan, S. Patrick PY - 2024/6/11 TI - Population Percentage and Population Size of Men Who Have Sex With Men in the United States, 2017-2021: Meta-Analysis of 5 Population-Based Surveys JO - JMIR Public Health Surveill SP - e56643 VL - 10 KW - sexual behavior KW - sexual identity KW - sexual attraction KW - men who have sex with men KW - population estimates KW - MSM KW - men who have sex with other men KW - national surveys KW - census KW - United States N2 - Background: Male-to-male sexual transmission continues to account for the greatest proportion of new HIV diagnoses in the United States. However, calculating population-specific surveillance metrics for HIV and other sexually transmitted infections requires regularly updated estimates of the number and proportion of men who have sex with men (MSM) in the United States, which are not collected by census surveys. Objective: The purpose of this analysis was to estimate the number and percentage of MSM in the United States from population-based surveys. Methods: We used data from 5 population-based surveys to calculate weighted estimates of the proportion of MSM in the United States and pooled these estimates using meta-analytic procedures. We estimated the proportion of MSM using sexual behavior-based questions (encompassing anal or oral sex) for 3 recall periods: past 12 months, past 5 years, and lifetime. In addition, we estimated the proportion of MSM using self-reported identity and attraction survey responses. 
The total number of MSM and non-MSM in the United States were calculated from estimates of the percentage of MSM who reported sex with another man in the past 12 months. Results: The percentage of MSM varied by recall period: 3.3% (95% CI 1.7%-4.9%) indicated sex with another male in the past 12 months, 4.7% (95% CI 0.0%-33.8%) in the past 5 years, and 6.2% (95% CI 2.9%-9.5%) in their lifetime. There were comparable percentages of men who identified as gay or bisexual (3.4%, 95% CI 2.2%-4.6%) or who indicated that they are attracted to other men (4.9%, 95% CI 3.1%-6.7%) based on pooled estimates. Our estimate of the total number of MSM in the United States is 4,230,000 (95% CI 2,179,000-6,281,000) based on the history of recent sexual behavior (sex with another man in the past 12 months). Conclusions: We calculated the pooled percentage and number of MSM in the United States from a meta-analysis of population-based surveys collected from 2017 to 2021. These estimates update and expand upon those derived from the Centers for Disease Control and Prevention in 2012 by including estimates of the percentage of MSM based on sexual identity and sexual attraction. The percentage and number of MSM in the United States is an important indicator for calculating population-specific disease rates and eligibility for preventive interventions such as pre-exposure prophylaxis. 
UR - https://publichealth.jmir.org/2024/1/e56643 UR - http://dx.doi.org/10.2196/56643 UR - http://www.ncbi.nlm.nih.gov/pubmed/38861303 ID - info:doi/10.2196/56643 ER - TY - JOUR AU - Hong, Chong Hye AU - Kim, Man Young PY - 2024/6/10 TI - Multimorbidity and its Associated Factors in Korean Shift Workers: Population-Based Cross-Sectional Study JO - JMIR Public Health Surveill SP - e55014 VL - 10 KW - chronic disease KW - multimorbidity KW - shift work schedule KW - shift workers KW - population-based study KW - Korea KW - network analysis KW - logistic regression KW - cross-sectional study KW - public health N2 - Background: Multimorbidity is a crucial factor that influences premature death rates, poor health, depression, quality of life, and use of health care. Approximately one-fifth of the global workforce is involved in shift work, which is associated with increased risk for several chronic diseases and multimorbidity. About 12% to 14% of wage workers in Korea are shift workers. However, the prevalence of multimorbidity and its associated factors in Korean shift workers are rarely reported. Objective: This study aimed to assess multimorbidity prevalence, examine the factors associated with multimorbidity, and identify multimorbidity patterns among shift workers in Korea. Methods: This study is a population-based cross-sectional study using Korea National Health and Nutrition Examination Survey data from 2016 to 2020. The study included 1704 (weighted n=2,697,228) Korean shift workers aged 19 years and older. Multimorbidity was defined as participants having 2 or more chronic diseases. Demographic and job-related variables, including regular work status, average working hours per week, and shift work type, as well as health behaviors, including BMI, smoking status, alcohol use, physical activity, and sleep duration, were included in the analysis. 
A survey-corrected logistic regression analysis was performed to identify factors influencing multimorbidity among the workers, and multimorbidity patterns were identified with a network analysis. Results: The overall prevalence of multimorbidity was 13.7% (302/1704). Logistic regression indicated that age, income, regular work, and obesity were significant factors influencing multimorbidity. Network analysis results revealed that chronic diseases clustered into three groups: (1) cardiometabolic multimorbidity (hypertension, dyslipidemia, diabetes, coronary heart disease, and stroke), (2) musculoskeletal multimorbidity (arthritis and osteoporosis), and (3) unclassified diseases (depression, chronic liver disease, thyroid disease, asthma, cancer, and chronic kidney disease). Conclusions: The findings revealed that several socioeconomic and behavioral factors were associated with multimorbidity among shift workers, indicating the need for policy development related to work schedule modification. Further organization-level screening and intervention programs are needed to prevent and manage multimorbidity among shift workers. We also recommend longitudinal studies to confirm the effects of job-related factors and health behaviors on multimorbidity among shift workers in the future. 
UR - https://publichealth.jmir.org/2024/1/e55014 UR - http://dx.doi.org/10.2196/55014 UR - http://www.ncbi.nlm.nih.gov/pubmed/38857074 ID - info:doi/10.2196/55014 ER - TY - JOUR AU - Dong, Wen-Hong AU - Guo, Jun-Xia AU - Wang, Lei AU - Zheng, Shuang-Shuang AU - Zhu, Bing-Quan AU - Shao, Jie PY - 2024/6/3 TI - Trend of Mortality Due to Congenital Anomalies in Children Younger Than 5 Years in Eastern China, 2012-2021: Surveillance Data Analysis JO - JMIR Public Health Surveill SP - e53860 VL - 10 KW - under-five years KW - congenital anomalies KW - mortality KW - death cause KW - rank N2 - Background: As one of the leading causes of child mortality, deaths due to congenital anomalies (CAs) have been a prominent obstacle to meet Sustainable Development Goal 3.2. Objective: We conducted this study to understand the death burden and trend of under-5 CA mortality (CAMR) in Zhejiang, one of the provinces with the best medical services and public health foundations in Eastern China. Methods: We used data retrieved from the under-5 mortality surveillance system in Zhejiang from 2012 to 2021. CAMR by sex, residence, and age group for each year was calculated and standardized according to 2020 National Population Census sex- and residence-specific live birth data in China. Poisson regression models were used to estimate the annual average change rate (AACR) of CAMR and to obtain the rate ratio between subgroups after adjusting for sex, residence, and age group when appropriate. Results: From 2012 to 2021, a total of 1753 children died from CAs, and the standardized CAMR declined from 121.2 to 62.6 per 100,000 live births with an AACR of −9% (95% CI −10.7% to −7.2%; P<.001). The declining trend was also observed in female and male children, urban and rural children, and neonates and older infants, and the AACRs were −9.7%, −8.5%, −8.5%, −9.2%, −12%, and −6.3%, respectively (all P<.001). However, no significant reduction was observed in children aged 1-4 years (P=.22). 
Generally, the CAMR rate ratios for male versus female children, rural versus urban children, older infants versus neonates, and older children versus neonates were 1.18 (95% CI 1.08-1.30; P<.001), 1.20 (95% CI 1.08-1.32; P=.001), 0.66 (95% CI 0.59-0.73; P<.001), and 0.20 (95% CI 0.17-0.24; P<.001), respectively. Among all broad CA groups, circulatory system malformations, mainly deaths caused by congenital heart diseases, accounted for 49.4% (866/1753) of deaths and ranked first across all years, although it declined yearly with an AACR of −9.8% (P<.001). Deaths due to chromosomal abnormalities tended to grow in recent years, although the AACR was not significant (P=.90). Conclusions: CAMR reduced annually, with cardiovascular malformations ranking first across all years in Zhejiang, China. Future research and practices should focus more on the prevention, early detection, long-term management of CAs and comprehensive support for families with children with CAs to improve their survival chances. UR - https://publichealth.jmir.org/2024/1/e53860 UR - http://dx.doi.org/10.2196/53860 UR - http://www.ncbi.nlm.nih.gov/pubmed/38829691 ID - info:doi/10.2196/53860 ER - TY - JOUR AU - Meng, Fan-Tsui AU - Jhuang, Jing-Rong AU - Peng, Yan-Teng AU - Chiang, Chun-Ju AU - Yang, Ya-Wen AU - Huang, Chi-Yen AU - Huang, Kuo-Ping AU - Lee, Wen-Chung PY - 2024/5/31 TI - Predicting Lung Cancer Survival to the Future: Population-Based Cancer Survival Modeling Study JO - JMIR Public Health Surveill SP - e46737 VL - 10 KW - lung cancer KW - survival KW - survivorship-period-cohort model KW - prediction KW - prognosis KW - early diagnosis KW - lung cancer screening KW - survival trend KW - population-based KW - population health KW - public health KW - surveillance KW - low-dose computed tomography N2 - Background: Lung cancer remains the leading cause of cancer-related mortality globally, with late diagnoses often resulting in poor prognosis. 
In response, the Lung Ambition Alliance aims to double the 5-year survival rate by 2025. Objective: Using the Taiwan Cancer Registry, this study uses the survivorship-period-cohort model to assess the feasibility of achieving this goal by predicting future survival rates of patients with lung cancer in Taiwan. Methods: This retrospective study analyzed data from 205,104 patients with lung cancer registered between 1997 and 2018. Survival rates were calculated using the survivorship-period-cohort model, focusing on 1-year interval survival rates and extrapolating to predict 5-year outcomes for diagnoses up to 2020, as viewed from 2025. Model validation involved comparing predicted rates with actual data using symmetric mean absolute percentage error. Results: The study identified notable improvements in survival rates beginning in 2004, with the predicted 5-year survival rate for 2020 reaching 38.7%, marking a considerable increase from the most recent available data of 23.8% for patients diagnosed in 2013. Subgroup analysis revealed varied survival improvements across different demographics and histological types. Predictions based on current trends indicate that achieving the Lung Ambition Alliance's goal could be within reach. Conclusions: The analysis demonstrates notable improvements in lung cancer survival rates in Taiwan, driven by the adoption of low-dose computed tomography screening, alongside advances in diagnostic technologies and treatment strategies. While the ambitious target set by the Lung Ambition Alliance appears achievable, ongoing advancements in medical technology and health policies will be crucial. The study underscores the potential impact of continued enhancements in lung cancer management and the importance of strategic health interventions to further improve survival outcomes. 
UR - https://publichealth.jmir.org/2024/1/e46737 UR - http://dx.doi.org/10.2196/46737 UR - http://www.ncbi.nlm.nih.gov/pubmed/38819904 ID - info:doi/10.2196/46737 ER - TY - JOUR AU - Haeri Mazanderani, Ahmad AU - Radebe, Lebohang AU - Sherman, G. Gayle PY - 2024/5/14 TI - Attrition Rates in HIV Viral Load Monitoring and Factors Associated With Overdue Testing Among Children Within South Africa's Antiretroviral Treatment Program: Retrospective Descriptive Analysis JO - JMIR Public Health Surveill SP - e40796 VL - 10 KW - HIV KW - monitoring KW - viral load KW - suppression KW - overdue KW - retention KW - VL test KW - attrition KW - child KW - youth KW - pediatric KW - paediatric KW - sexually transmitted KW - sexual transmission KW - virological failure KW - South Africa KW - infant KW - adolescent KW - big data KW - descriptive analysis KW - laboratory data N2 - Background: Numerous studies in South Africa have reported low HIV viral load (VL) suppression and high attrition rates within the pediatric HIV treatment program. Objective: Using routine laboratory data, we evaluated HIV VL monitoring, including mobility and overdue VL (OVL) testing, within 5 priority districts in South Africa. Methods: We performed a retrospective descriptive analysis of National Health Laboratory Service (NHLS) data for children and adolescents aged 1-15 years having undergone HIV VL testing between May 1, 2019, and April 30, 2020, from 152 facilities within the City of Johannesburg, City of Tshwane, eThekwini, uMgungundlovu, and Zululand. HIV VL test-level data were deduplicated to patient-level data using the NHLS CDW (Corporate Data Warehouse) probabilistic record-linking algorithm and then further manually deduplicated. An OVL was defined as no subsequent VL determined within 18 months of the last test. Variables associated with the last VL test, including age, sex, VL findings, district type, and facility type, are described. 
A multivariate logistic regression analysis was performed to identify variables associated with an OVL test. Results: Among 21,338 children and adolescents aged 1-15 years who had an HIV VL test, 72.70% (n=15,512) had a follow-up VL test within 18 months. Furthermore, 13.33% (n=2194) of them were followed up at a different facility, of whom 3.79% (n=624) were in a different district and 1.71% (n=281) were in a different province. Among patients with a VL of ≥1000 RNA copies/mL of plasma, the median time to subsequent testing was 6 (IQR 4-10) months. The younger the age of the patient, the greater the proportion with an OVL, ranging from a peak of 52% among 1-year-olds to a trough of 21% among 14-year-olds. On multivariate analysis, 2 consecutive HIV VL findings of ≥1000 RNA copies/mL of plasma were associated with an increased adjusted odds ratio (AOR) of having an OVL (AOR 2.07, 95% CI 1.71-2.51). Conversely, patients examined at a hospital (AOR 0.86, 95% CI 0.77-0.96), those with ≥2 previous tests (AOR 0.78, 95% CI 0.70-0.86), those examined in a rural district (AOR 0.63, 95% CI 0.54-0.73), and older age groups of 5-9 years (AOR 0.56, 95% CI 0.47-0.65) and 10-14 years (AOR 0.51, 95% CI 0.44-0.59) compared to 1-4 years were associated with a significantly decreased odds of having an OVL test. Conclusions: Considerable attrition occurs within South Africa's pediatric HIV treatment program, with over one-fourth of children having an OVL test 18 months subsequent to their previous test. In particular, younger children and those with virological failure were found to be at increased risk of having an OVL test. Improved HIV VL monitoring is essential for improving outcomes within South Africa's pediatric antiretroviral treatment program. UR - https://publichealth.jmir.org/2024/1/e40796 UR - http://dx.doi.org/10.2196/40796 UR - http://www.ncbi.nlm.nih.gov/pubmed/38743934 ID - info:doi/10.2196/40796 ER - TY - JOUR AU - Resendez, Skyler AU - Brown, H. 
Steven AU - Ruiz Ayala, Sebastian Hugo AU - Rangan, Prahalad AU - Nebeker, Jonathan AU - Montella, Diane AU - Elkin, L. Peter PY - 2024/4/30 TI - Defining the Subtypes of Long COVID and Risk Factors for Prolonged Disease: Population-Based Case-Crossover Study JO - JMIR Public Health Surveill SP - e49841 VL - 10 KW - long COVID KW - PASC KW - postacute sequelae of COVID-19 KW - public health KW - policy initiatives KW - pandemic KW - diagnosis KW - COVID-19 treatment KW - long COVID cause KW - health care support KW - public safety KW - COVID-19 KW - Veterans Affairs KW - United States KW - COVID-19 testing KW - clinician KW - mobile phone N2 - Background: There have been over 772 million confirmed cases of COVID-19 worldwide. A significant portion of these infections will lead to long COVID (post-COVID-19 condition) and its attendant morbidities and costs. Numerous life-altering complications have already been associated with the development of long COVID, including chronic fatigue, brain fog, and dangerous heart rhythms. Objective: We aim to derive an actionable long COVID case definition consisting of significantly increased signs, symptoms, and diagnoses to support pandemic-related clinical, public health, research, and policy initiatives. Methods: This research employs a case-crossover population-based study using International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) data generated at Veterans Affairs medical centers nationwide between January 1, 2020, and August 18, 2022. In total, 367,148 individuals with ICD-10-CM data both before and after a positive COVID-19 test were selected for analysis. We compared ICD-10-CM codes assigned 1 to 7 months following each patient's positive test with those assigned up to 6 months prior. Further, 350,315 patients had novel codes assigned during this window of time. 
We defined signs, symptoms, and diagnoses as being associated with long COVID if they had a novel case frequency of ≥1:1000, and they significantly increased in our entire cohort after a positive test. We present odds ratios with CIs for long COVID signs, symptoms, and diagnoses, organized by ICD-10-CM functional groups and medical specialty. We used our definition to assess long COVID risk based on a patient's demographics, Elixhauser score, vaccination status, and COVID-19 disease severity. Results: We developed a long COVID definition consisting of 323 ICD-10-CM diagnosis codes grouped into 143 ICD-10-CM functional groups that were significantly increased in our 367,148 patient post-COVID-19 population. We defined 17 medical-specialty long COVID subtypes such as cardiology long COVID. Patients who were COVID-19-positive developed signs, symptoms, or diagnoses included in our long COVID definition at a proportion of at least 59.7% (268,320/449,450, based on a denominator of all patients who were COVID-19-positive). The long COVID cohort was 8 years older with more comorbidities (2-year Elixhauser score 7.97 in the patients with long COVID vs 4.21 in the patients with non-long COVID). Patients who had a more severe bout of COVID-19, as judged by their minimum oxygen saturation level, were also more likely to develop long COVID. Conclusions: An actionable, data-driven definition of long COVID can help clinicians screen for and diagnose long COVID, allowing identified patients to be admitted into appropriate monitoring and treatment programs. This long COVID definition can also support public health, research, and policy initiatives. Patients with COVID-19 who are older or have low oxygen saturation levels during their bout of COVID-19, or those who have multiple comorbidities should be preferentially watched for the development of long COVID. 
UR - https://publichealth.jmir.org/2024/1/e49841 UR - http://dx.doi.org/10.2196/49841 UR - http://www.ncbi.nlm.nih.gov/pubmed/38687984 ID - info:doi/10.2196/49841 ER - TY - JOUR AU - Major, G. Chelsea AU - Rodríguez, M. Dania AU - Sánchez-González, Liliana AU - Rodríguez-Estrada, Vanessa AU - Morales-Ortíz, Tatiana AU - Torres, Carolina AU - Pérez-Rodríguez, M. Nicole AU - Medina-Lópes, A. Nicole AU - Alexander, Neal AU - Mabey, David AU - Ryff, Kyle AU - Tosado-Acevedo, Rafael AU - Muñoz-Jordán, Jorge AU - Adams, E. Laura AU - Rivera-Amill, Vanessa AU - Rolfes, Melissa AU - Paz-Bailey, Gabriela PY - 2024/4/19 TI - Investigating SARS-CoV-2 Incidence and Morbidity in Ponce, Puerto Rico: Protocol and Baseline Results From a Community Cohort Study JO - JMIR Res Protoc SP - e53837 VL - 13 KW - cohort studies KW - COVID-19 KW - epidemiologic studies KW - Hispanic or Latino KW - incidence KW - prospective studies KW - research methodology KW - SARS-CoV-2 KW - seroprevalence N2 - Background: A better understanding of SARS-CoV-2 infection risk among Hispanic and Latino populations and in low-resource settings in the United States is needed to inform control efforts and strategies to improve health equity. Puerto Rico has a high poverty rate and other population characteristics associated with increased vulnerability to COVID-19, and there are limited data to date to determine community incidence. Objective: This study describes the protocol and baseline seroprevalence of SARS-CoV-2 in a prospective community-based cohort study (COPA COVID-19 [COCOVID] study) to investigate SARS-CoV-2 infection incidence and morbidity in Ponce, Puerto Rico. Methods: In June 2020, we implemented the COCOVID study within the Communities Organized to Prevent Arboviruses project platform among residents of 15 communities in Ponce, Puerto Rico, aged 1 year or older. 
Weekly, participants answered questionnaires on acute symptoms and preventive behaviors and provided anterior nasal swab samples for SARS-CoV-2 polymerase chain reaction testing; additional anterior nasal swabs were collected for expedited polymerase chain reaction testing from participants that reported 1 or more COVID-19-like symptoms. At enrollment and every 6 months during follow-up, participants answered more comprehensive questionnaires and provided venous blood samples for multiantigen SARS-CoV-2 immunoglobulin G antibody testing (an indicator of seroprevalence). Weekly follow-up activities concluded in April 2022 and 6-month follow-up visits concluded in August 2022. Primary study outcome measures include SARS-CoV-2 infection incidence and seroprevalence, relative risk of SARS-CoV-2 infection by participant characteristics, SARS-CoV-2 household attack rate, and COVID-19 illness characteristics and outcomes. In this study, we describe the characteristics of COCOVID participants overall and by SARS-CoV-2 seroprevalence status at baseline. Results: We enrolled a total of 1030 participants from 388 households. Relative to the general populations of Ponce and Puerto Rico, our cohort overrepresented middle-income households, employed and middle-aged adults, and older children (P<.001). Almost all participants (1021/1025, 99.61%) identified as Latino/a, 17.07% (175/1025) had annual household incomes less than US $10,000, and 45.66% (463/1014) reported 1 or more chronic medical conditions. Baseline SARS-CoV-2 seroprevalence was low (16/1030, 1.55%) overall and increased significantly with later study enrollment time (P=.003). Conclusions: The COCOVID study will provide a valuable opportunity to better estimate the burden of SARS-CoV-2 and associated risk factors in a primarily Hispanic or Latino population, assess the limitations of surveillance, and inform mitigation measures in Puerto Rico and other similar populations. 
International Registered Report Identifier (IRRID): RR1-10.2196/53837 UR - https://www.researchprotocols.org/2024/1/e53837 UR - http://dx.doi.org/10.2196/53837 UR - http://www.ncbi.nlm.nih.gov/pubmed/38640475 ID - info:doi/10.2196/53837 ER - TY - JOUR AU - Chen, Yi-Chu AU - Chen, Yun-Yuan AU - Su, Shih-Yung AU - Jhuang, Jing-Rong AU - Chiang, Chun-Ju AU - Yang, Ya-Wen AU - Lin, Li-Ju AU - Wu, Chao-Chun AU - Lee, Wen-Chung PY - 2024/4/18 TI - Projected Time for the Elimination of Cervical Cancer Under Various Intervention Scenarios: Age-Period-Cohort Macrosimulation Study JO - JMIR Public Health Surveill SP - e46360 VL - 10 KW - age-period-cohort model KW - population attributable fraction KW - macrosimulation KW - cancer screening KW - human papillomavirus KW - HPV KW - cervical cancer KW - intervention KW - women KW - cervical screening KW - public health intervention N2 - Background: The World Health Organization aims for the global elimination of cervical cancer, necessitating modeling studies to forecast long-term outcomes. Objective: This paper introduces a macrosimulation framework using age-period-cohort modeling and population attributable fractions to predict the timeline for eliminating cervical cancer in Taiwan. Methods: Data for cervical cancer cases from 1997 to 2016 were obtained from the Taiwan Cancer Registry. Future incidence rates under the current approach and various intervention strategies, such as scaled-up screening (cytology based or human papillomavirus [HPV] based) and HPV vaccination, were projected. Results: Our projections indicate that Taiwan could eliminate cervical cancer by 2050 with either 70% compliance in cytology-based or HPV-based screening or 90% HPV vaccination coverage. The years projected for elimination are 2047 and 2035 for cytology-based and HPV-based screening, respectively; 2050 for vaccination alone; and 2038 and 2033 for combined screening and vaccination approaches. 
Conclusions: The age-period-cohort macrosimulation framework offers a valuable policy analysis tool for cervical cancer control. Our findings can inform strategies in other high-incidence countries, serving as a benchmark for global efforts to eliminate the disease. UR - https://publichealth.jmir.org/2024/1/e46360 UR - http://dx.doi.org/10.2196/46360 UR - http://www.ncbi.nlm.nih.gov/pubmed/38635315 ID - info:doi/10.2196/46360 ER - TY - JOUR AU - Qian, Weicheng AU - Cooke, Aranock AU - Stanley, Gordon Kevin AU - Osgood, David Nathaniel PY - 2024/4/17 TI - Comparing Contact Tracing Through Bluetooth and GPS Surveillance Data: Simulation-Driven Approach JO - J Med Internet Res SP - e38170 VL - 26 KW - smartphone-based sensing KW - proximity contact data KW - transmission models KW - agent-based simulation KW - health informatics KW - mobile phone N2 - Background: Accurate and responsive epidemiological simulations of epidemic outbreaks inform decision-making to mitigate the impact of pandemics. These simulations must be grounded in quantities derived from measurements, among which the parameters associated with contacts between individuals are notoriously difficult to estimate. Digital contact tracing data, such as those provided by Bluetooth beaconing or GPS colocating, can provide more precise measures of contact than traditional methods based on direct observation or self-reporting. Both measurement modalities have shortcomings and are prone to false positives or negatives, as unmeasured environmental influences bias the data. Objective: We aim to compare GPS colocated versus Bluetooth beacon-derived proximity contact data for their impacts on transmission models' results under community and types of diseases. Methods: We examined the contact patterns derived from 3 data sets collected in 2016, with participants comprising students and staff from the University of Saskatchewan in Canada. 
Each of these 3 data sets used both Bluetooth beaconing and GPS localization on smartphones running the Ethica Data (Avicenna Research) app to collect sensor data about every 5 minutes over a month. We compared the structure of contact networks inferred from proximity contact data collected with the modalities of GPS colocating and Bluetooth beaconing. We assessed the impact of sensing modalities on the simulation results of transmission models informed by proximate contacts derived from sensing data. Specifically, we compared the incidence number, attack rate, and individual infection risks across simulation results of agent-based susceptible-exposed-infectious-removed transmission models of 4 different contagious diseases. We have demonstrated their differences with violin plots, 2-tailed t tests, and Kullback-Leibler divergence. Results: Both network structure analyses show visually salient differences in proximity contact data collected between GPS colocating and Bluetooth beaconing, regardless of the underlying population. Significant differences were found for the estimated attack rate based on distance threshold, measurement modality, and simulated disease. This finding demonstrates that the sensor modality used to trace contact can have a significant impact on the expected propagation of a disease through a population. The violin plots of attack rate and Kullback-Leibler divergence of individual infection risks demonstrated discernible differences for different sensing modalities, regardless of the underlying population and diseases. The results of the t tests on attack rate between different sensing modalities were mostly significant (P<.001). Conclusions: We show that the contact networks generated from these 2 measurement modalities are different and generate significantly different attack rates across multiple data sets and pathogens. 
While both modalities offer higher-resolution portraits of contact behavior than is possible with most traditional contact measures, the differential impact of measurement modality on the simulation outcome cannot be ignored and must be addressed in studies only using a single measure of contact in the future. UR - https://www.jmir.org/2024/1/e38170 UR - http://dx.doi.org/10.2196/38170 UR - http://www.ncbi.nlm.nih.gov/pubmed/38422493 ID - info:doi/10.2196/38170 ER - TY - JOUR AU - Lee, Mi-Sun AU - Lee, Hooyeon PY - 2024/4/10 TI - Chronic Disease Patterns and Their Relationship With Health-Related Quality of Life in South Korean Older Adults With the 2021 Korean National Health and Nutrition Examination Survey: Latent Class Analysis JO - JMIR Public Health Surveill SP - e49433 VL - 10 KW - chronic disease KW - latent class analysis KW - multimorbidity KW - older adults KW - quality of life N2 - Background: Improved life expectancy has increased the prevalence of older adults living with multimorbidities, which likely deteriorates their health-related quality of life (HRQoL). Understanding which chronic conditions frequently co-occur can facilitate person-centered care tailored to the needs of individuals with specific multimorbidity profiles. Objective: The study objectives were to (1) examine the prevalence of multimorbidity among Korean older adults (ie, those aged 65 years and older), (2) investigate chronic disease patterns using latent class analysis, and (3) assess which chronic disease patterns are more strongly associated with HRQoL. Methods: A sample of 1806 individuals aged 65 years and older from the 2021 Korean National Health and Nutrition Examination Survey was analyzed. Latent class analysis was conducted to identify the clustering pattern of chronic diseases. HRQoL was assessed by an 8-item health-related quality of life scale (HINT-8). Multiple linear regression was used to analyze the association with the total score of the HINT-8. 
Logistic regression analysis was performed to evaluate the odds ratio of having problems according to the HINT-8 items. Results: The prevalence of multimorbidity in the sample was 54.8%. Three chronic disease patterns were identified: relatively healthy, cardiometabolic condition, arthritis, allergy, or asthma. The total scores of the HINT-8 were the highest in participants characterized as arthritis, allergy, or asthma group, indicating the lowest quality of life. Conclusions: Current health care models are disease-oriented, meaning that the management of chronic conditions applies to a single condition and may not be relevant to those with multimorbidities. Identifying chronic disease patterns and their impact on overall health and well-being is critical for guiding integrated care. UR - https://publichealth.jmir.org/2024/1/e49433 UR - http://dx.doi.org/10.2196/49433 UR - http://www.ncbi.nlm.nih.gov/pubmed/38598275 ID - info:doi/10.2196/49433 ER - TY - JOUR AU - Hall, William Eric AU - Sullivan, Sean Patrick AU - Bradley, Heather PY - 2024/4/5 TI - Estimated Number of Injection-Involved Overdose Deaths in US States From 2000 to 2020: Secondary Analysis of Surveillance Data JO - JMIR Public Health Surveill SP - e49527 VL - 10 KW - death rate KW - death KW - drug abuse KW - drugs KW - injection drug use KW - injection KW - mortality KW - National Vital Statistics System KW - overdose death rate KW - overdose KW - state KW - substance abuse KW - Treatment Episode Dataset-Admission KW - treatment N2 - Background: In the United States, both drug overdose mortality and injection-involved drug overdose mortality have increased nationally over the past 25 years. Despite documented geographic differences in overdose mortality and substances implicated in overdose mortality trends, injection-involved overdose mortality has not been summarized at a subnational level. 
Objective: We aimed to estimate the annual number of injection-involved overdose deaths in each US state from 2000 to 2020. Methods: We conducted a stratified analysis that used data from drug treatment admissions (Treatment Episodes Data Set-Admissions; TEDS-A) and the National Vital Statistics System (NVSS) to estimate state-specific percentages of reported drug overdose deaths that were injection-involved from 2000 to 2020. TEDS-A collects data on the route of administration and the type of substance used upon treatment admission. We used these data to calculate the percentage of reported injections for each drug type by demographic group (race or ethnicity, sex, and age group), year, and state. Additionally, using NVSS mortality data, the annual number of overdose deaths involving selected drug types was identified by the following specific multiple-cause-of-death codes: heroin or synthetic opioids other than methadone (T40.1, T40.4), natural or semisynthetic opioids and methadone (T40.2, T40.3), cocaine (T40.5), psychostimulants with abuse potential (T43.6), sedatives (T42.3, T42.4), and others (T36-T59.0). We used the probabilities of injection with the annual number of overdose deaths, by year, primary substance, and demographic groups to estimate the number of overdose deaths that were injection-involved. Results: In 2020, there were 91,071 overdose deaths among adults recorded in the United States, and 93.1% (84,753/91,071) occurred in the 46 jurisdictions that reported data to TEDS-A. Slightly less than half (38,253/84,753, 45.1%; 95% CI 41.1%-49.8%) of those overdose deaths were estimated to be injection-involved, translating to 38,253 (95% CI 34,839-42,181) injection-involved overdose deaths in 2020. There was large variation among states in the estimated injection-involved overdose death rate (median 14.72, range 5.45-31.77 per 100,000 people). 
The national injection-involved overdose death rate increased by 323% (95% CI 255%-391%) from 2010 (3.78, 95% CI 3.33-4.31) to 2020 (15.97, 95% CI 14.55-17.61). States in which the estimated injection-involved overdose death rate increased faster than the national average were disproportionately concentrated in the Northeast region. Conclusions: Although overdose mortality and injection-involved overdose mortality have increased dramatically across the country, these trends have been more pronounced in some regions. A better understanding of state-level trends in injection-involved mortality can inform the prioritization of public health strategies that aim to reduce overdose mortality and prevent downstream consequences of injection drug use. UR - https://publichealth.jmir.org/2024/1/e49527 UR - http://dx.doi.org/10.2196/49527 UR - http://www.ncbi.nlm.nih.gov/pubmed/38578676 ID - info:doi/10.2196/49527 ER - TY - JOUR AU - Ko, Yousang AU - Park, Seuk Jae AU - Min, Jinsoo AU - Kim, Woo Hyung AU - Koo, Hyeon-Kyoung AU - Oh, Youn Jee AU - Jeong, Yun-Jeong AU - Lee, Eunhye AU - Yang, Bumhee AU - Kim, Sang Ju AU - Lee, Sung-Soon AU - Kwon, Yunhyung AU - Yang, Jiyeon AU - Han, yeon Ji AU - Jang, Jin You AU - Kim, Jinseob PY - 2024/4/1 TI - Timely Pulmonary Tuberculosis Diagnosis Based on the Epidemiological Disease Spectrum: Population-Based Prospective Cohort Study in the Republic of Korea JO - JMIR Public Health Surveill SP - e47422 VL - 10 KW - pulmonary tuberculosis KW - disease spectrum KW - timely diagnosis KW - patient delay KW - health care delay KW - risk factor KW - epidemiological disease KW - tuberculosis KW - treatment KW - TB KW - PTB disease spectrum KW - mortality KW - early diagnosis N2 - Background: Timely pulmonary tuberculosis (PTB) diagnosis is a global health priority for interrupting transmission and optimizing treatment outcomes. 
The traditional dichotomous time-divided approach for addressing time delays in diagnosis has limited clinical application because the time delay significantly varies depending on each community in question. Objective: We aimed to reevaluate the diagnosis time delay based on the PTB disease spectrum using a novel scoring system that was applied at the national level in the Republic of Korea. Methods: The Pulmonary Tuberculosis Spectrum Score (PTBSS) was developed based on previously published proposals related to the disease spectrum, and its validity was assessed by examining both all-cause and PTB-related mortality. In our analysis, we integrated the PTBSS into the Korea Tuberculosis Cohort Registry. We evaluated various time delays, including patient, health care, and overall delays, and their system-associated variables in line with each PTBSS. Furthermore, we reclassified the scores into distinct categories of mild (PTBSS=0-1), moderate (PTBSS=2-3), and severe (PTBSS=4-6) using a multivariate regression approach. Results: Among the 14,031 Korean patients with active PTB whose data were analyzed from 2018 to 2020, 37% (n=5191), 38% (n=5328), and 25% (n=3512) were classified as having a mild, moderate, and severe disease status, respectively, according to the PTBSS. This classification can therefore reflect the disease spectrum of PTB by considering the correlation of the score with mortality. The time delay patterns differed according to the PTBSS. In health care delays according to the PTBSS, greater PTB disease progression was associated with a shorter diagnosis period, since the condition is microbiologically easy to diagnose. However, with respect to patient delays, the change in elapsed time showed a U-shaped pattern as PTB progressed. This means that a remarkable patient delay in the real-world setting might occur at both apical ends of the spectrum (ie, in both mild and severe cases of PTB). 
Independent risk factors for a severe PTB pattern were age (adjusted odds ratio 1.014) and male sex (adjusted odds ratio 1.422), whereas no significant risk factor was found for mild PTB. Conclusions: Timely PTB diagnosis should be accomplished. This can be improved with use of the PTBSS, a simple and intuitive scoring system, which can be more helpful in clinical and public health applications compared to the traditional dichotomous time-only approach. UR - https://publichealth.jmir.org/2024/1/e47422 UR - http://dx.doi.org/10.2196/47422 UR - http://www.ncbi.nlm.nih.gov/pubmed/38557939 ID - info:doi/10.2196/47422 ER - TY - JOUR AU - Fellows, E. Ian AU - Corcoran, Carl AU - McIntyre, F. Anne PY - 2024/3/19 TI - Triangulating Truth and Reaching Consensus on Population Size, Prevalence, and More: Modeling Study JO - JMIR Public Health Surveill SP - e48738 VL - 10 KW - HIV KW - epidemiology KW - population size estimation KW - key populations KW - Bayesian models KW - consensus estimation KW - statistical tool KW - prevalence KW - Bayesian model KW - population KW - estimate KW - consensus KW - population size N2 - Background: Population size, prevalence, and incidence are essential metrics that influence public health programming and policy. However, stakeholders are frequently tasked with setting performance targets, reporting global indicators, and designing policies based on multiple (often incongruous) estimates of these variables, and they often do so in the absence of a formal, transparent framework for reaching a consensus estimate. Objective: This study aims to describe a model to synthesize multiple study estimates while incorporating stakeholder knowledge, introduce an R Shiny app to implement the model, and demonstrate the model and app using real data. Methods: In this study, we developed a Bayesian hierarchical model to synthesize multiple study estimates that allow the user to incorporate the quality of each estimate as a confidence score. 
The model was implemented as a user-friendly R Shiny app aimed at practitioners of population size estimation. The underlying Bayesian model was programmed in Stan for efficient sampling and computation. Results: The app was demonstrated using biobehavioral survey-based population size estimates (and accompanying confidence scores) of female sex workers and men who have sex with men from 3 survey locations in a country in sub-Saharan Africa. The consensus results incorporating confidence scores are compared with the case where they are absent, and the results with confidence scores are shown to perform better according to an app-supplied metric for unaccounted-for variation. Conclusions: The utility of the triangulator model, including the incorporation of confidence scores, as a user-friendly app is demonstrated using a use case example. Our results offer empirical evidence of the model's effectiveness in producing an accurate consensus estimate and emphasize the significant impact that the accessible model and app offer for public health. It offers a solution to the long-standing problem of synthesizing multiple estimates, potentially leading to more informed and evidence-based decision-making processes. The Triangulator has broad utility and flexibility to be adapted and used in various other contexts and regions to address similar challenges. 
UR - https://publichealth.jmir.org/2024/1/e48738 UR - http://dx.doi.org/10.2196/48738 UR - http://www.ncbi.nlm.nih.gov/pubmed/38502183 ID - info:doi/10.2196/48738 ER - TY - JOUR AU - Tuyishime, Elysee AU - Remera, Eric AU - Kayitesi, Catherine AU - Malamba, Samuel AU - Sangwayire, Beata AU - Habimana Kabano, Ignace AU - Ruisenor-Escudero, Horacio AU - Oluoch, Tom AU - Unna Chukwu, Angela PY - 2024/3/15 TI - Estimation of the Population Size of Street- and Venue-Based Female Sex Workers and Sexually Exploited Minors in Rwanda in 2022: 3-Source Capture-Recapture JO - JMIR Public Health Surveill SP - e50743 VL - 10 KW - population size KW - female sex workers KW - capture-recapture KW - 3-source KW - Rwanda KW - HIV KW - surveillance KW - population KW - epidemiology KW - prevention KW - AIDS KW - sexually transmitted disease KW - STD KW - minor KW - young adult KW - sexually exploited minor KW - children N2 - Background: HIV surveillance among key populations is a priority in all epidemic settings. Female sex workers (FSWs) globally as well as in Rwanda are disproportionately affected by the HIV epidemic; hence, the Rwanda HIV and AIDS National Strategic Plan (2018-2024) has adopted regular surveillance of population size estimation (PSE) of FSWs every 2-3 years. Objective: We aimed at estimating, for the fourth time, the population size of street- and venue-based FSWs and sexually exploited minors aged ≥15 years in Rwanda. Methods: In August 2022, the 3-source capture-recapture method was used to estimate the population size of FSWs and sexually exploited minors in Rwanda. The field work took 3 weeks to complete, with each capture occasion lasting for a week. The sample size for each capture was calculated using shinyrecap with inputs drawn from previously conducted estimation exercises. In each capture round, a stratified multistage sampling process was used, with administrative provinces as strata and FSW hotspots as the primary sampling unit. 
Different unique objects were distributed to FSWs in each capture round; acceptance of the unique object was marked as successful capture. Sampled FSWs for the subsequent capture occasions were asked if they had received the previously distributed unique object in order to determine recaptures. Statistical analysis was performed in R (version 4.0.5), and Bayesian Model Averaging was performed to produce the final PSE with a 95% credibility set (CS). Results: We sampled 1766, 1848, and 1865 FSWs and sexually exploited minors in each capture round. There were 169 recaptures strictly between captures 1 and 2, 210 recaptures exclusively between captures 2 and 3, and 65 recaptures between captures 1 and 3 only. In all 3 captures, 61 FSWs were captured. The median PSE of street- and venue-based FSWs and sexually exploited minors in Rwanda was 37,647 (95% CS 31,873-43,354), corresponding to 1.1% (95% CI 0.9%-1.3%) of the total adult females in the general population. Relative to the adult females in the general population, the western and northern provinces ranked first and second with a higher concentration of FSWs, respectively. The cities of Kigali and eastern province ranked third and fourth, respectively. The southern province was identified as having a low concentration of FSWs. Conclusions: We provide, for the first time, both the national and provincial level population size estimate of street- and venue-based FSWs in Rwanda. Compared with the previous 2 rounds of FSW PSEs at the national level, we observed differences in the street- and venue-based FSW population size in Rwanda. Our study might not have considered FSWs who do not want anyone to know they are FSWs due to several reasons, leading to a possible underestimation of the true PSE. UR - https://publichealth.jmir.org/2024/1/e50743 UR - http://dx.doi.org/10.2196/50743 UR - http://www.ncbi.nlm.nih.gov/pubmed/38488847 ID - info:doi/10.2196/50743 ER - TY - JOUR AU - Fahimi, Mansour AU - Hair, C. 
Elizabeth AU - Do, K. Elizabeth AU - Kreslake, M. Jennifer AU - Yan, Xiaolu AU - Chan, Elisa AU - Barlas, M. Frances AU - Giles, Abigail AU - Osborn, Larry PY - 2024/3/7 TI - Improving the Efficiency of Inferences From Hybrid Samples for Effective Health Surveillance Surveys: Comprehensive Review of Quantitative Methods JO - JMIR Public Health Surveill SP - e48186 VL - 10 KW - hybrid samples KW - composite estimation KW - optimal composition factor KW - unequal weighting effect KW - composite weighting KW - weighting KW - surveillance KW - sample survey KW - data collection KW - risk factor N2 - Background: Increasingly, survey researchers rely on hybrid samples to improve coverage and increase the number of respondents by combining independent samples. For instance, it is possible to combine 2 probability samples with one relying on telephone and another on mail. More commonly, however, researchers are now supplementing probability samples with those from online panels that are less costly. Setting aside ad hoc approaches that are void of rigor, traditionally, the method of composite estimation has been used to blend results from different sample surveys. This means individual point estimates from different surveys are pooled together, 1 estimate at a time. Given that for a typical study many estimates must be produced, this piecemeal approach is computationally burdensome and subject to the inferential limitations of the individual surveys that are used in this process. Objective: In this paper, we will provide a comprehensive review of the traditional method of composite estimation. Subsequently, the method of composite weighting is introduced, which is significantly more efficient, both computationally and inferentially when pooling data from multiple surveys. With the growing interest in hybrid sampling alternatives, we hope to offer an accessible methodology for improving the efficiency of inferences from such sample surveys without sacrificing rigor. 
Methods: Specifically, we will illustrate why the many ad hoc procedures for blending survey data from multiple surveys are void of scientific integrity and subject to misleading inferences. Moreover, we will demonstrate how the traditional approach of composite estimation fails to offer a pragmatic and scalable solution in practice. By relying on theoretical and empirical justifications, in contrast, we will show how our proposed methodology of composite weighting is both scientifically sound and inferentially and computationally superior to the old method of composite estimation. Results: Using data from 3 large surveys that have relied on hybrid samples composed of probability-based and supplemental sample components from online panels, we illustrate that our proposed method of composite weighting is superior to the traditional method of composite estimation in 2 distinct ways. Computationally, it is vastly less demanding and hence more accessible for practitioners. Inferentially, it produces more efficient estimates with higher levels of external validity when pooling data from multiple surveys. Conclusions: The new realities of the digital age have brought about a number of resilient challenges for survey researchers, which in turn have exposed some of the inefficiencies associated with the traditional methods this community has relied upon for decades. The resilience of such challenges suggests that piecemeal approaches that may have limited applicability or restricted accessibility will prove to be inadequate and transient. It is from this perspective that our proposed method of composite weighting has aimed to introduce a durable and accessible solution for hybrid sample surveys. 
UR - https://publichealth.jmir.org/2024/1/e48186 UR - http://dx.doi.org/10.2196/48186 UR - http://www.ncbi.nlm.nih.gov/pubmed/38451620 ID - info:doi/10.2196/48186 ER - TY - JOUR AU - Zeng, Jie AU - Lin, Guozhen AU - Dong, Hang AU - Li, Mengmeng AU - Ruan, Honglian AU - Yang, Jun PY - 2024/2/5 TI - Association Between Nitrogen Dioxide Pollution and Cause-Specific Mortality in China: Cross-Sectional Time Series Study JO - JMIR Public Health Surveill SP - e44648 VL - 10 KW - nitrogen dioxide KW - cause-specific mortality KW - stratification effect KW - vulnerable subpopulations KW - China N2 - Background: Nitrogen dioxide (NO2) has been frequently linked to a range of diseases and associated with high rates of mortality and morbidity worldwide. However, there is limited evidence regarding the risk of NO2 on a spectrum of causes of mortality. Moreover, adjustment for potential confounders in NO2 analysis has been insufficient, and the spatial resolution of exposure assessment has been limited. Objective: This study aimed to quantitatively assess the relationship between short-term NO2 exposure and death from a range of causes by adjusting for potential confounders in Guangzhou, China, and determine the modifying effect of gender and age. Methods: A time series study was conducted on 413,703 deaths that occurred in Guangzhou during the period of 2010 to 2018. The causes of death were classified into 10 categories and 26 subcategories. We utilized a generalized additive model with quasi-Poisson regression analysis using a natural cubic splines function with lag structure of 0 to 4 days to estimate the potential lag effect of NO2 on cause-specific mortality. We estimated the percentage change in cause-specific mortality rates per 10 µg/m3 increase in NO2 levels. 
We stratified meteorological factors such as temperature, humidity, wind speed, and air pressure into high and low levels with the median as the critical value and analyzed the effects of NO2 on various death-causing diseases at those high and low levels. To further identify potentially vulnerable subpopulations, we analyzed groups stratified by gender and age. Results: A significant association existed between NO2 exposure and deaths from multiple causes. Each 10 µg/m3 increment in NO2 density at a lag of 0 to 4 days increased the risks of all-cause mortality by 1.73% (95% CI 1.36%-2.09%) and mortality due to nonaccidental causes, cardiovascular disease, respiratory disease, endocrine disease, and neoplasms by 1.75% (95% CI 1.38%-2.12%), 2.06% (95% CI 1.54%-2.59%), 2.32% (95% CI 1.51%-3.13%), 2.40% (95% CI 0.84%-3.98%), and 1.18% (95% CI 0.59%-1.78%), respectively. Among the 26 subcategories, mortality risk was associated with 16, including intentional self-harm, hypertensive disease, and ischemic stroke disease. Relatively higher effect estimates of NO2 on mortality existed for low levels of temperature, relative humidity, wind speed, and air pressure than with high levels, except a relatively higher effect estimate was present for endocrine disease at a high air pressure level. Most of the differences between subgroups were not statistically significant. The effect estimates for NO2 were similar by gender. There were significant differences between the age groups for mortality due to all causes, nonaccidental causes, and cardiovascular disease. Conclusions: Short-term NO2 exposure may increase the risk of mortality due to a spectrum of causes, especially in potentially vulnerable populations. These findings may be important for predicting and modifying guidelines for NO2 exposure in China. 
UR - https://publichealth.jmir.org/2024/1/e44648 UR - http://dx.doi.org/10.2196/44648 UR - http://www.ncbi.nlm.nih.gov/pubmed/38315528 ID - info:doi/10.2196/44648 ER - TY - JOUR AU - Fong, T. Ted C. AU - Cheung, Tak Derek Yee AU - Choi, Hang Edmond Pui AU - Fong, T. Daniel Y. AU - Ho, H. Rainbow T. AU - Ip, Patrick AU - Kung, Chun Man AU - Lam, Cheung Mona Wai AU - Lee, Marie Antoinette AU - Wong, Wai William Chi AU - Lam, Hing Tai AU - Yip, F. Paul S. PY - 2024/1/26 TI - Latent Heterogeneity of Online Sexual Experiences and Associations With Sexual Risk Behaviors and Behavioral Health Outcomes in Chinese Young Adults: Cross-Sectional Study JO - JMIR Public Health Surveill SP - e50020 VL - 10 KW - Hong Kong KW - latent class analysis KW - mediation KW - mental health KW - sex knowledge KW - sexual risk behaviors KW - sexually transmitted infections KW - structural equation modeling KW - youth sexuality N2 - Background: Online sexual experiences (OSEs) are becoming increasingly common in young adults, but existing papers have reported only on specific types of OSEs and have not shown the heterogeneous nature of the repertoire of OSEs. The use patterns of OSEs remain unclear, and the relationships of OSEs with sexual risk behaviors and behavioral health outcomes have not been evaluated. Objective: This study aimed to examine the latent heterogeneity of OSEs in young adults and the associations with sexual risk behaviors and behavioral health outcomes. 
Methods: The 2021 Youth Sexuality Study of the Hong Kong Family Planning Association phone interviewed a random sample of 1205 young adults in Hong Kong in 2022 (male sex: 613/1205, 50.9%; mean age 23.0 years, SD 2.86 years) on lifetime OSEs, demographic and family characteristics, Patient Health Questionnaire-4 (PHQ-4) scores, sex-related factors (sexual orientation, sex knowledge, and sexual risk behaviors), and behavioral health outcomes (sexually transmitted infections [STIs], drug use, and suicidal ideation) in the past year. Sample heterogeneity of OSEs was analyzed via latent class analysis with substantive checking of the class profiles. Structural equation modeling was used to examine the direct and indirect associations between the OSE class and behavioral health outcomes via sexual risk behaviors and PHQ-4 scores. Results: The data supported 3 latent classes of OSEs with measurement invariance by sex. In this study, 33.1% (398/1205), 56.0% (675/1205), and 10.9% (132/1205) of the sample were in the abstinent class (minimal OSEs), normative class (occasional OSEs), and active class (substantive OSEs), respectively. Male participants showed a lower prevalence of the abstinent class (131/613, 21.4% versus 263/592, 44.4%) and a higher prevalence of the active class (104/613, 17.0% versus 28/592, 4.7%) than female participants. The normative class showed significantly higher sex knowledge than the other 2 classes. The active class was associated with male sex, nonheterosexual status, higher sex desire and PHQ-4 scores, and more sexual risk behaviors than the other 2 classes. Compared with the nonactive (abstinent and normative) classes, the active class was indirectly associated with higher rates of STIs (absolute difference in percentage points [Δ]=4.8%; P=.03) and drug use (Δ=7.6%; P=.001) via sexual risk behaviors, and with higher rates of suicidal ideation (Δ=2.5%; P=.007) via PHQ-4 scores. 
Conclusions: This study provided the first results on the 3 (abstinent, normative, and active) latent classes of OSEs with distinct profiles in OSEs, demographic and family characteristics, PHQ-4 scores, sex-related factors, and behavioral health outcomes. The active class showed indirect associations with higher rates of STIs and drug use via sexual risk behaviors and higher rates of suicidal ideation via PHQ-4 scores than the other 2 classes. These results have implications for the formulation and evaluation of targeted interventions to help young adults. UR - https://publichealth.jmir.org/2024/1/e50020 UR - http://dx.doi.org/10.2196/50020 UR - http://www.ncbi.nlm.nih.gov/pubmed/38277190 ID - info:doi/10.2196/50020 ER - TY - JOUR AU - De La Cerda, Isela AU - Bauer, X. Cici AU - Zhang, Kehe AU - Lee, Miryoung AU - Jones, Michelle AU - Rodriguez, Arturo AU - McCormick, B. Joseph AU - Fisher-Hoch, P. Susan PY - 2023/12/20 TI - Evaluation of a Targeted COVID-19 Community Outreach Intervention: Case Report for Precision Public Health JO - JMIR Public Health Surveill SP - e47981 VL - 9 KW - community interventions KW - emergency preparedness KW - health disparities KW - intervention evaluation KW - precision public health KW - public health informatics KW - public health intervention KW - public health KW - spatial epidemiology KW - surveillance N2 - Background: Cameron County, a low-income south Texas-Mexico border county marked by severe health disparities, was consistently among the top counties with the highest COVID-19 mortality in Texas at the onset of the pandemic. The disparity in COVID-19 burden within Texas counties revealed the need for effective interventions to address the specific needs of local health departments and their communities. Publicly available COVID-19 surveillance data were not sufficiently timely or granular to deliver such targeted interventions. 
An agency-academic collaboration in Cameron used novel geographic information science methods to produce granular COVID-19 surveillance data. These data were used to strategically target an educational outreach intervention named "Boots on the Ground" (BOG) in the City of Brownsville (COB). Objective: This study aimed to evaluate the impact of a spatially targeted community intervention on daily COVID-19 test counts. Methods: The agency-academic collaboration between the COB and UTHealth Houston led to the creation of weekly COVID-19 epidemiological reports at the census tract level. These reports guided the selection of census tracts to deliver targeted BOG between April 21 and June 8, 2020. Recordkeeping of the targeted BOG tracts and the intervention dates, along with COVID-19 daily testing counts per census tract, provided data for intervention evaluation. An interrupted time series design was used to evaluate the impact on COVID-19 test counts 2 weeks before and after targeted BOG. A piecewise Poisson regression analysis was used to quantify the slope (sustained) and intercept (immediate) change between pre- and post-BOG COVID-19 daily test count trends. Additional analysis of COB tracts that did not receive targeted BOG was conducted for comparison purposes. Results: During the intervention period, 18 of the 48 COB census tracts received targeted BOG. Among these, a significant change in the slope between pre- and post-BOG daily test counts was observed in 5 tracts, 80% (n=4) of which had a positive slope change. A positive slope change implied a significant increase in daily COVID-19 test counts 2 weeks after targeted BOG compared to the testing trend observed 2 weeks before intervention. In an additional analysis of the 30 census tracts that did not receive targeted BOG, significant slope changes were observed in 10 tracts, of which positive slope changes were only observed in 20% (n=2). 
In summary, we found that BOG-targeted tracts had mostly positive daily COVID-19 test count slope changes, whereas untargeted tracts had mostly negative daily COVID-19 test count slope changes. Conclusions: Evaluation of spatially targeted community interventions is necessary to strengthen the evidence base of this important approach for local emergency preparedness. This report highlights how an academic-agency collaboration established and evaluated the impact of a real-time, targeted intervention delivering precision public health to a small community. UR - https://publichealth.jmir.org/2023/1/e47981 UR - http://dx.doi.org/10.2196/47981 UR - http://www.ncbi.nlm.nih.gov/pubmed/38117549 ID - info:doi/10.2196/47981 ER - TY - JOUR AU - Bougeard, Stéphanie AU - Huneau-Salaun, Adeline AU - Attia, Mikael AU - Richard, Jean-Baptiste AU - Demeret, Caroline AU - Platon, Johnny AU - Allain, Virginie AU - Le Vu, Stéphane AU - Goyard, Sophie AU - Gillon, Véronique AU - Bernard-Stoecklin, Sibylle AU - Crescenzo-Chaigne, Bernadette AU - Jones, Gabrielle AU - Rose, Nicolas AU - van der Werf, Sylvie AU - Lantz, Olivier AU - Rose, Thierry AU - Noël, Harold PY - 2023/11/28 TI - Application of Machine Learning Prediction of Individual SARS-CoV-2 Vaccination and Infection Status to the French Serosurveillance Survey From March 2020 to 2022: Cross-Sectional Study JO - JMIR Public Health Surveill SP - e46898 VL - 9 KW - SARS-CoV-2 KW - serological surveillance KW - infection KW - vaccination KW - machine learning KW - seroprevalence KW - blood testing KW - immunity KW - survey KW - vaccine response KW - French population KW - prediction N2 - Background: The seroprevalence of SARS-CoV-2 infection in the French population was estimated with a representative, repeated cross-sectional survey based on residual sera from routine blood testing. 
These data contained no information on infection or vaccination status, thus limiting the ability to detail changes observed in the immunity level of the population over time. Objective: Our aim is to predict the infected or vaccinated status of individuals in the French serosurveillance survey based only on the results of serological assays. Reference data on longitudinal serological profiles of seronegative, infected, and vaccinated individuals from another French cohort were used to build the predictive model. Methods: A model of individual vaccination or infection status with respect to SARS-CoV-2 obtained from a machine learning procedure was proposed based on 3 complementary serological assays. This model was applied to the French nationwide serosurveillance survey from March 2020 to March 2022 to estimate the proportions of the population that were negative, infected, vaccinated, or infected and vaccinated. Results: From February 2021 to March 2022, the estimated percentage of infected and unvaccinated individuals in France increased from 7.5% to 16.8%. During this period, the estimated percentage increased from 3.6% to 45.2% for vaccinated and uninfected individuals and from 2.1% to 29.1% for vaccinated and infected individuals. The decrease in the seronegative population can be largely attributed to vaccination. Conclusions: Combining results from the serosurveillance survey with more complete data from another longitudinal cohort completes the information retrieved from serosurveillance while keeping its protocol simple and easy to implement. 
UR - https://publichealth.jmir.org/2023/1/e46898 UR - http://dx.doi.org/10.2196/46898 UR - http://www.ncbi.nlm.nih.gov/pubmed/38015594 ID - info:doi/10.2196/46898 ER - TY - JOUR AU - Yang, Wenyi AU - Wang, Baohua AU - Ma, Shaobo AU - Wang, Jingxin AU - Ai, Limei AU - Li, Zhengyu AU - Wan, Xia PY - 2023/11/6 TI - Optimal Look-Back Period to Identify True Incident Cases of Diabetes in Medical Insurance Data in the Chinese Population: Retrospective Analysis Study JO - JMIR Public Health Surveill SP - e46708 VL - 9 KW - diabetes KW - incident cases KW - administrative data KW - look-back period KW - retrograde survival function N2 - Background: Accurate estimation of incidence and prevalence is vital for preventing and controlling diabetes. Administrative data (including insurance data) could be a good source to estimate the incidence of diabetes. However, how to determine the look-back period (LP) to remove cases with preceding records remains a problem for administrative data. A short LP will cause overestimation of incidence, whereas a long LP will limit the usefulness of a database. Therefore, it is necessary to determine the optimal LP length for identifying incident cases in administrative data. Objective: This study aims to offer different methods to identify the optimal LP for diabetes by using medical insurance data from the Chinese population with reference to other diseases in the administrative data. Methods: Data from the insurance database of the city of Weifang, China from between January 2016 and December 2020 were used. To identify the incident cases in 2020, we removed prevalent patients with preceding records of diabetes between 2016 and 2019 (ie, a 4-year LP). Using this 4-year LP as a reference, consistency examination indexes (CEIs), including positive predictive values, the κ coefficient, and overestimation rate, were calculated to determine the level of agreement between different LPs and an LP of 4 years (the longest LP). 
Moreover, we constructed a retrograde survival function, in which survival (ie, incident cases) means not having a preceding record at the given time and the survival time is the difference between the date of the last record in 2020 and the most recent previous record in the LP. Based on the survival outcome and survival time, we established the survival function and survival hazard function. When the survival probability, S(t), remains stable, and survival hazard converges to zero, we obtain the optimal LP. Combined with the results of these two methods, we determined the optimal LP for Chinese diabetes patients. Results: The κ agreement was excellent (0.950), with a high positive predictive value (92.2%) and a low overestimation rate (8.4%) after a 2-year LP. As for the retrograde survival function, S(t) dropped rapidly during the first 1-year LP (from 1.00 to 0.11). At a 417-day LP, the hazard function reached approximately zero (ht=0.000459), S(t) remained at 0.10, and at 480 days, the frequency of S(t) did not increase. Combining the two methods, we found that the optimal LP is 2 years for Chinese diabetes patients. Conclusions: The retrograde survival method and CEIs both showed effectiveness. A 2-year LP should be considered when identifying incident cases of diabetes using insurance data in the Chinese population. 
UR - https://publichealth.jmir.org/2023/1/e46708 UR - http://dx.doi.org/10.2196/46708 UR - http://www.ncbi.nlm.nih.gov/pubmed/37930785 ID - info:doi/10.2196/46708 ER - TY - JOUR AU - Yang, Jun AU - Dong, Hang AU - Yu, Chao AU - Li, Bixia AU - Lin, Guozhen AU - Chen, Sujuan AU - Cai, Dongjie AU - Huang, Lin AU - Wang, Boguang AU - Li, Mengmeng PY - 2023/10/9 TI - Mortality Risk and Burden From a Spectrum of Causes in Relation to Size-Fractionated Particulate Matters: Time Series Analysis JO - JMIR Public Health Surveill SP - e41862 VL - 9 KW - size-fractionated particulate matter KW - cause-specific mortality KW - cardiovascular disease KW - respiratory disease KW - neoplasm KW - attributable burden N2 - Background: There is limited evidence regarding the adverse impact of particulate matters (PMs) on multiple body systems from both epidemiological and mechanistic studies. The association between size-fractionated PMs and mortality risk, as well as the burden of a whole spectrum of causes of death, remains poorly characterized. Objective: We aimed to examine the wide range of susceptible diseases affected by different sizes of PMs. We also assessed the association between PMs with an aerodynamic diameter less than 1 µm (PM1), 2.5 µm (PM2.5), and 10 µm (PM10) and deaths from 36 causes in Guangzhou, China. Methods: Daily data were obtained on cause-specific mortality, PMs, and meteorology from 2014 to 2016. A time-stratified case-crossover approach was applied to estimate the risk and burden of cause-specific mortality attributable to PMs after adjusting for potential confounding variables, such as long-term trend and seasonality, relative humidity, temperature, air pressure, and public holidays. Stratification analyses were further conducted to explore the potential modification effects of season and demographic characteristics (eg, gender and age). 
We also assessed the reduction in mortality achieved by meeting the new air quality guidelines set by the World Health Organization (WHO). Results: Positive and monotonic associations were generally observed between PMs and mortality. For every 10 µg/m3 increase in 4-day moving average concentrations of PM1, PM2.5, and PM10, the risk of all-cause mortality increased by 2.00% (95% CI 1.08%-2.92%), 1.54% (95% CI 0.93%-2.16%), and 1.38% (95% CI 0.95%-1.82%), respectively. Significant effects of size-fractionated PMs were observed for deaths attributed to nonaccidental causes, cardiovascular disease, respiratory disease, neoplasms, chronic rheumatic heart diseases, hypertensive diseases, cerebrovascular diseases, stroke, influenza, and pneumonia. If daily concentrations of PM1, PM2.5, and PM10 reached the WHO target levels of 10, 15, and 45 µg/m3, 7921 (95% empirical CI [eCI] 4454-11,206), 8303 (95% eCI 5063-11,248), and 8326 (95% eCI 5980-10,690) deaths could be prevented, respectively. The effect estimates of PMs were relatively higher during hot months, among female individuals, and among those aged 85 years and older, although the differences between subgroups were not statistically significant. Conclusions: We observed positive and monotonic exposure-response curves between PMs and deaths from several diseases. The effect of PM1 was stronger on mortality than that of PM2.5 and PM10. A substantial number of premature deaths could be preventable by adhering to the WHO's new guidelines for PMs. Our findings highlight the importance of a size-based strategy in controlling PMs and managing their health impact. 
UR - https://publichealth.jmir.org/2023/1/e41862 UR - http://dx.doi.org/10.2196/41862 UR - http://www.ncbi.nlm.nih.gov/pubmed/37812487 ID - info:doi/10.2196/41862 ER - TY - JOUR AU - Song, Han In AU - Lee, Hyuk Jin AU - Shin, Soo Jee PY - 2023/9/29 TI - Firearm Possession Rates in Home Countries and Firearm Suicide Rates Among US- and Foreign-Born Suicide Decedents in the United States: Analysis of Combined Data from the National Violent Death Reporting System and the Small Arms Survey JO - JMIR Public Health Surveill SP - e44211 VL - 9 KW - firearm suicide KW - US born KW - foreign born KW - means of suicide KW - firearm possession rate KW - suicide decedents N2 - Background: Suicide by firearms is a serious public health issue in the United States. However, little research has been conducted on the relationship between cultural backgrounds and suicide by firearms, specifically in those born and raised in the United States compared to those who have immigrated to the United States. Objective: To better understand the relationship between cultural backgrounds and suicide, this study aimed to examine firearm suicide rates among US- and foreign-born suicide decedents based on the firearm possession rate in the decedent's home country. Methods: Multivariate logistic regression was performed to analyze data of 28,895 suicide decedents from 37 states obtained from the 2017 National Violent Death Reporting System data set. The firearm possession rate in the home countries of foreign-born suicide decedents was obtained from the 2017 Small Arms Survey. Results: The firearm suicide rate was about twice as high among US-born suicide decedents compared to their foreign-born counterparts. Meanwhile, suicide by hanging was about 75% higher among foreign-born compared to US-born suicide decedents. 
Those from countries with a low-to-medium firearm possession rate were significantly less likely to use firearms compared to US-born suicide decedents (adjusted odds ratio [AOR]=0.45, 95% CI 0.31-0.65, and AOR=0.46, 95% CI 0.39-0.53, respectively). Meanwhile, firearm suicide rates were not different between US- and foreign-born suicide decedents from countries with a similarly high firearm possession rate. Conclusions: The results suggest that there is an association between using firearms as a means of suicide and the firearm possession rate in the decedent's home country. Suicide by firearms in the United States needs to be understood in the sociocultural context related to firearm possession. UR - https://publichealth.jmir.org/2023/1/e44211 UR - http://dx.doi.org/10.2196/44211 UR - http://www.ncbi.nlm.nih.gov/pubmed/37773604 ID - info:doi/10.2196/44211 ER - TY - JOUR AU - Zhang, Yanting AU - Rumgay, Harriet AU - Li, Mengmeng AU - Cao, Sumei AU - Chen, Wanqing PY - 2023/9/20 TI - Nasopharyngeal Cancer Incidence and Mortality in 185 Countries in 2020 and the Projected Burden in 2040: Population-Based Global Epidemiological Profiling JO - JMIR Public Health Surveill SP - e49968 VL - 9 KW - nasopharyngeal cancer KW - incidence KW - mortality KW - epidemiology KW - worldwide N2 - Background: Nasopharyngeal cancer (NPC) is one of the most common head and neck cancers. Objective: This study describes the global epidemiological profiles of NPC incidence and mortality in 185 countries in 2020 and the projected burden in 2040. Methods: The estimated numbers of NPC cases and deaths were retrieved from the GLOBOCAN 2020 data set. Age-standardized incidence rates (ASIRs) and age-standardized mortality rates (ASMRs) were calculated using the world standard. The future number of NPC cases and deaths by 2040 were estimated based on global demographic projections. 
Results: Globally, approximately 133,354 cases and 80,008 deaths from NPC were estimated in 2020 corresponding to ASIRs and ASMRs of 1.5 and 0.9 per 100,000 person-years, respectively. The largest numbers of both global cases and deaths from NPC occurred in Eastern Asia (65,866/133,354, 49.39% and 36,453/80,008, 45.56%, respectively), in which China contributed most to this burden (62,444/133,354, 46.82% and 34,810/80,008, 43.50%, respectively). The ASIRs and ASMRs in men were approximately 3-fold higher than those in women. Incidence rates varied across world regions, with the highest ASIRs for both men and women detected in South-Eastern Asia (7.7 and 2.5 per 100,000 person-years, respectively) and Eastern Asia (3.9 and 1.5 per 100,000 person-years, respectively). The highest ASMRs for both men and women were found in South-Eastern Asia (5.4 and 1.5 per 100,000 person-years, respectively). By 2040, the annual number of cases and deaths will increase to 179,476 (46,122/133,354, a 34.58% increase from the year 2020) and 113,851 (33,843/80,008, a 42.29% increase), respectively. Conclusions: Disparities in NPC incidence and mortality persist worldwide. Our study highlights the urgent need to develop and accelerate NPC control initiatives to tackle the NPC burden in certain regions and countries (eg, South-Eastern Asia, China). 
UR - https://publichealth.jmir.org/2023/1/e49968 UR - http://dx.doi.org/10.2196/49968 UR - http://www.ncbi.nlm.nih.gov/pubmed/37728964 ID - info:doi/10.2196/49968 ER - TY - JOUR AU - Ji, Zixiang AU - Wu, Hengjing AU - Zhu, Rongyu AU - Wang, Lu AU - Wang, Yuzhu AU - Zhang, Lijuan PY - 2023/9/15 TI - Trends in Cause-Specific Injury Mortality in China in 2005-2019: Longitudinal Observational Study JO - JMIR Public Health Surveill SP - e47902 VL - 9 KW - reverse KW - age-standardized mortality rate KW - injury KW - suicide KW - trend KW - potential years of life lost KW - average years of life lost KW - crude mortality rate KW - falls KW - older adults KW - young adults N2 - Background: Over the last few decades, although the age-standardized mortality rate (ASMR) of injury has shown a significant declining trend in China, this pattern has dramatically reversed recently. Objective: We aimed to elucidate the geographical, demographic, and temporal trends of cause-specific injuries, the reversal phenomenon of these trends, and the fluctuations of injury burden from 2005 to 2019 in China. Methods: A longitudinal observational study was performed using the raw data of injury deaths in the National Cause-of-Death surveillance data provided by the disease surveillance points system in 2005-2019. The cause-specific injuries were divided into disparate subgroups by sex, age, urban/rural region, and eastern/central/western areas of China. The burden of injury was assessed using potential years of life lost (PYLL), average years of life lost (AYLL), and PYLL rate (PYLLR). Temporal trends of mortality rates and burden were evaluated using best-fitting joinpoint models. Results: Injury deaths accounted for 7.51% (1,156,504/15,403,835) of all-cause deaths in China in 2005-2019. The crude mortality rate of all-cause injury was 47.74 per 100,000 persons. 
The top 3 injury types (traffic accident, falls, and suicide) accounted for 70.57% (816,145/1,156,504) of all injury-related deaths. The ASMR of all-cause injury decreased (P=.003), while the crude mortality rate remained unchanged (P=.52) during 2005-2019. A significant reverse trend in ASMR of all-cause injury was observed in urban older adults since 2013, mainly due to the inverted trend in injuries from falls. A reverse trend in ASMR of suicide was observed among individuals aged 10-24 years, with notable increases by 35.18% (annual percentage change 15.4%, 95% CI 4.1%-28.0%) in men since 2017. The AYLL and PYLLR of all-cause injury among older adults showed consistent ascending trends from 2005 to 2019 (average annual percentage change [AAPC] 6.1%, 95% CI 5.4%-6.9%, 129.04% increase for AYLL; AAPC 5.4%, 95% CI 2.4%-8.4%, 105.52% increase for PYLLR). The AYLL due to suicide for individuals aged 10-24 years showed a considerable upswing tendency (AAPC 0.5%, 95% CI 0.4%-0.7%, 8.02% increase). Conclusions: Although the ASMR of all-cause injury decreased in China from 2005 to 2019, the trend in suicide among adolescents and young adults and falls among older adults has been on the rise in recent years. Interventions should be encouraged to mitigate the cause-specific burdens of injury death. 
UR - https://publichealth.jmir.org/2023/1/e47902 UR - http://dx.doi.org/10.2196/47902 UR - http://www.ncbi.nlm.nih.gov/pubmed/37713250 ID - info:doi/10.2196/47902 ER - TY - JOUR AU - He, Qiyu AU - Mok, Tsz-Ngai AU - Sin, Tat-Hang AU - Yin, Jiaying AU - Li, Sicun AU - Yin, Yiyue AU - Ming, Wai-Kit AU - Feng, Bin PY - 2023/6/7 TI - Global, Regional, and National Prevalence of Gout From 1990 to 2019: Age-Period-Cohort Analysis With Future Burden Prediction JO - JMIR Public Health Surveill SP - e45943 VL - 9 KW - gout KW - prevalence KW - age-period-cohort analysis KW - Global Burden of Disease Study 2019 KW - prediction KW - Bayesian age-period-cohort analysis KW - Nordpred age-period-cohort analysis N2 - Background: Gout is a common and debilitating condition that is associated with significant morbidity and mortality. Despite advances in medical treatment, the global burden of gout continues to increase, particularly in high-sociodemographic index (SDI) regions. Objective: To address the aforementioned issue, we used age-period-cohort (APC) modeling to analyze global trends in gout incidence and prevalence from 1990 to 2019. Methods: Data were extracted from the Global Burden of Disease Study 2019 to assess all-age prevalence and age-standardized prevalence rates, as well as years lived with disability rates, for 204 countries and territories. APC effects were also examined in relation to gout prevalence. Future burden prediction was carried out using the Nordpred APC prediction of future incidence cases and the Bayesian APC model. Results: The global gout incidence has increased by 63.44% over the past 2 decades, with a corresponding increase of 51.12% in global years lived with disability. The sex ratio remained consistent at 3:1 (male to female), but the global gout incidence increased in both sexes over time. Notably, the prevalence and incidence of gout were the highest in high-SDI regions (95% uncertainty interval 14.19-20.62), with a growth rate of 94.3%. 
Gout prevalence increases steadily with age, and the prevalence increases rapidly in high-SDI quantiles for the period effect. Finally, the cohort effect showed that gout prevalence increases steadily, with the risk of morbidity increasing in younger birth cohorts. The prediction model suggests that the gout incidence rate will continue to increase globally. Conclusions: Our study provides important insights into the global burden of gout and highlights the need for effective management and prophylaxis of this condition. The APC model used in our analysis provides a novel approach to understanding the complex trends in gout prevalence and incidence, and our findings can inform the development of targeted interventions to address this growing health issue. UR - https://publichealth.jmir.org/2023/1/e45943 UR - http://dx.doi.org/10.2196/45943 UR - http://www.ncbi.nlm.nih.gov/pubmed/37285198 ID - info:doi/10.2196/45943 ER - TY - JOUR AU - Li, Xi-liang AU - Huang, Hang AU - Lu, Ying AU - Stafford, S. Randall AU - Lima, Maria Simone AU - Mota, Caroline AU - Shi, Xin PY - 2023/5/30 TI - Prediction of Multimorbidity in Brazil: Latest Fifth of a Century Population Study JO - JMIR Public Health Surveill SP - e44647 VL - 9 KW - Brazil KW - demographic factors KW - logistic regression analysis KW - multimorbidity KW - nomogram prediction KW - prevalence N2 - Background: Multimorbidity is characterized by the co-occurrence of 2 or more chronic diseases and has been a focus of the health care sector and health policy makers due to its severe adverse effects. Objective: This paper aims to use the latest 2 decades of national health data in Brazil to analyze the effects of demographic factors and predict the impact of various risk factors on multimorbidity. Methods: Data analysis methods include descriptive analysis, logistic regression, and nomogram prediction. The study makes use of a set of national cross-sectional data with a sample size of 877,032. 
The study used data from 1998, 2003, and 2008 from the Brazilian National Household Sample Survey, and from 2013 and 2019 from the Brazilian National Health Survey. We developed a logistic regression model to assess the influence of risk factors on multimorbidity and predict the influence of the key risk factors in the future, based on the prevalence of multimorbidity in Brazil. Results: Overall, females were 1.7 times more likely to experience multimorbidity than males (odds ratio [OR] 1.72, 95% CI 1.69-1.74). The prevalence of multimorbidity among unemployed individuals was 1.5 times that of employed individuals (OR 1.51, 95% CI 1.49-1.53). Multimorbidity prevalence increased significantly with age. People over 60 years of age were about 20 times more likely to have multiple chronic diseases than those between 18 and 29 years of age (OR 19.6, 95% CI 19.15-20.07). The prevalence of multimorbidity in illiterate individuals was 1.2 times that in literate ones (OR 1.26, 95% CI 1.24-1.28). The subjective well-being of seniors without multimorbidity was 15 times that among people with multimorbidity (OR 15.29, 95% CI 14.97-15.63). Adults with multimorbidity were more than 1.5 times more likely to be hospitalized than those without (OR 1.53, 95% CI 1.50-1.56) and 1.9 times more likely to need medical care (OR 1.94, 95% CI 1.91-1.97). These patterns were similar in all 5 cohort studies and remained stable for over 21 years. A nomogram model was used to predict multimorbidity prevalence under the influence of various risk factors. The prediction results were consistent with the effects of logistic regression; older age and poorer participant well-being had the strongest correlation with multimorbidity. Conclusions: Our study shows that multimorbidity prevalence varied little in the past 2 decades but varies widely across social groups. 
Identifying populations with higher rates of multimorbidity prevalence may improve policy making around multimorbidity prevention and management. The Brazilian government can create public health policies targeting these groups, and provide more medical treatment and health services to support and protect the multimorbidity population. UR - https://publichealth.jmir.org/2023/1/e44647 UR - http://dx.doi.org/10.2196/44647 UR - http://www.ncbi.nlm.nih.gov/pubmed/37252771 ID - info:doi/10.2196/44647 ER - TY - JOUR AU - Wittwer, Salome AU - Paolotti, Daniela AU - Lichand, Guilherme AU - Leal Neto, Onicio PY - 2023/4/26 TI - Participatory Surveillance for COVID-19 Trend Detection in Brazil: Cross-sectional Study JO - JMIR Public Health Surveill SP - e44517 VL - 9 KW - participatory surveillance KW - COVID-19 KW - digital epidemiology KW - coronavirus KW - infectious disease KW - epidemic KW - pandemic KW - SARS-CoV-2 KW - forecast KW - trend KW - reporting KW - self-report KW - surveillance N2 - Background: The ongoing COVID-19 pandemic has emphasized the necessity of a well-functioning surveillance system to detect and mitigate disease outbreaks. Traditional surveillance (TS) usually relies on health care providers and generally suffers from reporting lags that prevent immediate response plans. Participatory surveillance (PS), an innovative digital approach whereby individuals voluntarily monitor and report on their own health status via web-based surveys, has emerged in the past decade to complement traditional data collection approaches. Objective: This study compared novel PS data on COVID-19 infection rates across 9 Brazilian cities with official TS data to examine the opportunities and challenges of using PS data, and the potential advantages of combining the 2 approaches. Methods: The TS data for Brazil are publicly accessible on GitHub. The PS data were collected through the Brazil Sem Corona platform, a Colab platform. 
To gather information on an individual's health status, each participant was asked to fill out a daily questionnaire on symptoms and exposure in the Colab app. Results: We found that high participation rates are key for PS data to adequately mirror TS infection rates. Where participation was high, we documented a significant trend correlation between lagged PS data and TS infection rates, suggesting that PS data could be used for early detection. In our data, forecasting models integrating both approaches increased accuracy up to 3% relative to a 14-day forecast model based exclusively on TS data. Furthermore, we showed that PS data captured a population that significantly differed from a traditional observation. Conclusions: In the traditional system, the new recorded COVID-19 cases per day are aggregated based on positive laboratory-confirmed tests. In contrast, PS data show a significant share of reports categorized as potential COVID-19 cases that are not laboratory confirmed. Quantifying the economic value of PS system implementation remains difficult. However, scarce public funds and persisting constraints to the TS system provide motivation for a PS system, making it an important avenue for future research. The decision to set up a PS system requires careful evaluation of its expected benefits, relative to the costs of setting up platforms and incentivizing engagement to increase both coverage and consistent reporting over time. The ability to compute such economic tradeoffs might be key to have PS become a more integral part of policy toolkits moving forward. These results corroborate previous studies when it comes to the benefits of an integrated and comprehensive surveillance system, and shed light on its limitations and on the need for additional research to improve future implementations of PS platforms. 
UR - https://publichealth.jmir.org/2023/1/e44517 UR - http://dx.doi.org/10.2196/44517 UR - http://www.ncbi.nlm.nih.gov/pubmed/36888908 ID - info:doi/10.2196/44517 ER - TY - JOUR AU - Mason, Maryann AU - Khazanchi, Rushmin AU - Brewer, Audrey AU - Sheehan, Karen AU - Liu, Yingxuan AU - Post, Lori PY - 2023/4/7 TI - Changes in the Demographic Distribution of Chicago Gun-Homicide Decedents From 2015-2021: Violent Death Surveillance Cross-sectional Study JO - JMIR Public Health Surveill SP - e43723 VL - 9 KW - gun-homicide surveillance KW - gun-homicide decedents KW - demographics KW - age, gun violence KW - firearm N2 - Background: Homicide is one of the 5 leading causes of death in the United States for persons aged 1 to 44 years. In 2019, 75% of US homicides were by gun. Chicago has a gun-homicide rate 4 times the national average, and 90% of all homicides are by gun. The public health approach to violence prevention calls for a 4-step process, beginning with defining and monitoring the problem. Insight into the characteristics of gun-homicide decedents can help frame next steps, including identifying risk and protective factors, developing prevention and intervention strategies, and scaling effective responses. Although much is known about gun homicide because it is a long-standing, entrenched public health problem, it is useful to monitor trends to update ongoing prevention efforts. Objective: This study aimed to use public health surveillance data and methods to describe changes in the race/ethnicity, sex, and age of Chicago gun-homicide decedents from 2015-2021, in the context of year-to-year variation and an overall increase in the city's gun-homicide rate. Methods: We calculated the distribution of gun-related homicide deaths by 6 race/ethnicity and sex groups (non-Hispanic Black female, non-Hispanic White female, Hispanic female, non-Hispanic Black male, non-Hispanic White male, and Hispanic male), age in years, and age by age group. 
We used counts, percentages, and rates per 100,000 persons to describe the distribution of deaths among these demographic groups. Comparisons of means and column proportions with tests of significance set at P≤.05 were used to describe changes in the distribution of gun-homicide decedents over time by race-ethnicity-sex and age groups. The comparison of mean age by race-ethnicity-sex group is done using 1-way ANOVA with significance set at P≤.05. Results: The distribution of gun-homicide decedents in Chicago by race/ethnicity and sex groups had been relatively stable from 2015 to 2021 with 2 notable exceptions: a more than doubling of the proportion of gun-homicide decedents who were non-Hispanic Black female (3.6% in 2015 to 8.2% in 2021) and an increase of 3.27 years in the mean age of gun-homicide decedents. The increase in mean age coincided with a decrease in the proportion of non-Hispanic Black male gun-homicide decedents between the ages of 15-19 and 20-24 years and, conversely, an increase in the proportion of non-Hispanic Black male gun-homicide decedents aged 25-34 years. Conclusions: The annual gun-homicide rate in Chicago had been increasing since 2015 with year-to-year variation. Continued monitoring of trends in the demographic makeup of gun-homicide decedents is necessary to provide the most relevant and timely information to help shape violence prevention efforts. We detected several changes that suggest a need for increased outreach and engagement marketed toward non-Hispanic Black female and non-Hispanic Black male individuals between the ages of 25-34 years. 
UR - https://publichealth.jmir.org/2023/1/e43723 UR - http://dx.doi.org/10.2196/43723 UR - http://www.ncbi.nlm.nih.gov/pubmed/37027193 ID - info:doi/10.2196/43723 ER - TY - JOUR AU - Marques-Cruz, Manuel AU - Nogueira-Leite, Diogo AU - Alves, Miguel João AU - Fernandes, Francisco AU - Fernandes, Miguel José AU - Almeida, Ângelo Miguel AU - Cunha Correia, Patrícia AU - Perestrelo, Paula AU - Cruz-Correia, Ricardo AU - Pita Barros, Pedro PY - 2023/4/6 TI - COVID-19 Contact Tracing as an Indicator for Evaluating a Pandemic Situation: Simulation Study JO - JMIR Public Health Surveill SP - e43836 VL - 9 KW - COVID-19 KW - public health KW - public health surveillance KW - quarantine KW - infection transmission KW - epidemiological models N2 - Background: Contact tracing is a fundamental intervention in public health. When systematically applied, it enables the breaking of chains of transmission, which is important for controlling COVID-19 transmission. In theoretically perfect contact tracing, all new cases should occur among quarantined individuals, and an epidemic should vanish. However, the availability of resources influences the capacity to perform contact tracing. Therefore, it is necessary to estimate its effectiveness threshold. We propose that this effectiveness threshold may be indirectly estimated using the ratio of COVID-19 cases arising from quarantined high-risk contacts, where higher ratios indicate better control and, under a threshold, contact tracing may fail and other restrictions become necessary. Objective: This study assessed the ratio of COVID-19 cases in high-risk contacts quarantined through contact tracing and its potential use as an ancillary pandemic control indicator. Methods: We built a 6-compartment epidemiological model to emulate COVID-19 infection flow according to publicly available data from Portuguese authorities. 
Our model extended the usual susceptible-exposed-infected-recovered model by adding a compartment Q with individuals in mandated quarantine who could develop infection or return to the susceptible pool and a compartment P with individuals protected from infection because of vaccination. To model infection dynamics, data on SARS-CoV-2 infection risk (IR), time until infection, and vaccine efficacy were collected. Estimation was needed for vaccine data to reflect the timing of inoculation and booster efficacy. In total, 2 simulations were built: one adjusting for the presence and absence of variants or vaccination and another maximizing IR in quarantined individuals. Both simulations were based on a set of 100 unique parameterizations. The daily ratio of infected cases arising from high-risk contacts (q estimate) was calculated. A theoretical effectiveness threshold of contact tracing was defined for 14-day average q estimates based on the classification of COVID-19 daily cases according to the pandemic phases and was compared with the timing of population lockdowns in Portugal. A sensitivity analysis was performed to understand the relationship between different parameter values and the threshold obtained. Results: An inverse relationship was found between the q estimate and daily cases in both simulations (correlations >0.70). The theoretical effectiveness thresholds for both simulations attained an alert phase positive predictive value of >70% and could have anticipated the need for additional measures in at least 4 days for the second and fourth lockdowns. Sensitivity analysis showed that only the IR and booster dose efficacy at inoculation significantly affected the q estimates. Conclusions: We demonstrated the impact of applying an effectiveness threshold for contact tracing on decision-making. 
Although only theoretical thresholds could be provided, their relationship with the number of confirmed cases and the prediction of pandemic phases shows the role as an indirect indicator of the efficacy of contact tracing. UR - https://publichealth.jmir.org/2023/1/e43836 UR - http://dx.doi.org/10.2196/43836 UR - http://www.ncbi.nlm.nih.gov/pubmed/36877958 ID - info:doi/10.2196/43836 ER - TY - JOUR AU - Nguyen, Tue Trong AU - Ho, Tu Cam AU - Bui, Thu Huong Thi AU - Ho, Khanh Lam AU - Ta, Thanh Van PY - 2023/2/16 TI - Multidimensional Machine Learning for Assessing Parameters Associated With COVID-19 in Vietnam: Validation Study JO - JMIR Form Res SP - e42895 VL - 7 KW - COVID-19 KW - multidimensional analysis KW - hierarchical cluster analysis KW - regression analysis KW - mild KW - moderate KW - severe KW - age KW - scoring index of chest x-ray KW - percentage and quantity of neutrophils KW - albumin KW - C-reactive protein KW - ratio of lymphocytes N2 - Background: Machine learning (ML) is a type of artificial intelligence strategy. Its algorithms are used on big data sets to see patterns, learn from their results, and perform tasks autonomously without being instructed on how to address problems. New diseases like COVID-19 provide important data for ML. Therefore, all relevant parameters should be explicitly quantified and modeled. Objective: The purpose of this study was to determine (1) the overall preclinical characteristics, (2) the cumulative cutoff values and risk ratios (RRs), and (3) the factors associated with COVID-19 severity in unidimensional and multidimensional analyses involving 2173 SARS-CoV-2 patients. Methods: The study population consisted of 2173 patients (1587 mild status [mild group] and asymptomatic patients, 377 moderate status patients [moderate group], and 209 severe status patients [severe group]). The status of the patients was recorded from September 2021 to March 2022. 
Two correlation tests, relative risk, and RR were used to eliminate unbalanced parameters and select the most remarkable parameters. The independent methods of hierarchical cluster analysis and k-means were used to classify parameters according to their r values. Finally, network analysis provided a 3-dimensional view of the results. Results: COVID-19 severity was significantly correlated with age (mild-moderate group: RR 4.19, 95% CI 3.58-4.95; P<.001), scoring index of chest x-ray (mild-moderate group: RR 3.29, 95% CI 2.76-3.92; P<.001; moderate-severe group: RR 3.03, 95% CI 2.4023-3.8314; P<.001), percentage of neutrophils (mild-moderate group: RR 3.18, 95% CI 2.73-3.70; P<.001; moderate-severe group: RR 3.32, 95% CI 2.6480-4.1529; P<.001), quantity of neutrophils (moderate-severe group: RR 3.15, 95% CI 2.6153-3.8025; P<.001), albumin (moderate-severe group: RR 0.46, 95% CI 0.3650-0.5752; P<.001), C-reactive protein (mild-moderate group: RR 3.4, 95% CI 2.91-3.97; P<.001), and ratio of lymphocytes (moderate-severe group: RR 0.34, 95% CI 0.2743-0.4210; P<.001). Significant inversion of correlations among the severity groups is important. Alanine transaminase and leucocytes showed a significant negative correlation (r=?1; P<.001) in the mild group and a significant positive correlation in the moderate group (r=1; P<.001). Transferrin and anion Cl showed a significant positive correlation (r=1; P<.001) in the mild group and a significant negative correlation in the moderate group (r=?0.59; P<.001). The clustering and network analysis showed that in the mild-moderate group, the closest neighbors of COVID-19 severity were ferritin and age. C-reactive protein, scoring index of chest x-ray, albumin, and lactate dehydrogenase were the next closest neighbors of these 3 factors. 
In the moderate-severe group, the closest neighbors of COVID-19 severity were ferritin, fibrinogen, albumin, quantity of lymphocytes, scoring index of chest x-ray, white blood cell count, lactate dehydrogenase, and quantity of neutrophils. Conclusions: This multidimensional study in Vietnam showed possible correlations between several elements and COVID-19 severity to provide clinical reference markers for surveillance and diagnostic management. UR - https://formative.jmir.org/2023/1/e42895 UR - http://dx.doi.org/10.2196/42895 UR - http://www.ncbi.nlm.nih.gov/pubmed/36668902 ID - info:doi/10.2196/42895 ER - TY - JOUR AU - Bauer, Cici AU - Zhang, Kehe AU - Li, Wenjun AU - Bernson, Dana AU - Dammann, Olaf AU - LaRochelle, R. Marc AU - Stopka, J. Thomas PY - 2023/2/10 TI - Small Area Forecasting of Opioid-Related Mortality: Bayesian Spatiotemporal Dynamic Modeling Approach JO - JMIR Public Health Surveill SP - e41450 VL - 9 KW - opioid-related mortality KW - small area estimation KW - spatiotemporal models KW - Bayesian KW - forecasting N2 - Background: Opioid-related overdose mortality has remained at crisis levels across the United States, increasing 5-fold and worsened during the COVID-19 pandemic. The ability to provide forecasts of opioid-related mortality at granular geographical and temporal scales may help guide preemptive public health responses. Current forecasting models focus on prediction on a large geographical scale, such as states or counties, lacking the spatial granularity that local public health officials desire to guide policy decisions and resource allocation. Objective: The overarching objective of our study was to develop Bayesian spatiotemporal dynamic models to predict opioid-related mortality counts and rates at temporally and geographically granular scales (ie, ZIP Code Tabulation Areas [ZCTAs]) for Massachusetts. Methods: We obtained decedent data from the Massachusetts Registry of Vital Records and Statistics for 2005 through 2019. 
We developed Bayesian spatiotemporal dynamic models to predict opioid-related mortality across Massachusetts' 537 ZCTAs. We evaluated the prediction performance of our models using the one-year ahead approach. We investigated the potential improvement of prediction accuracy by incorporating ZCTA-level demographic and socioeconomic determinants. We identified ZCTAs with the highest predicted opioid-related mortality in terms of rates and counts and stratified them by rural and urban areas. Results: Bayesian dynamic models with the full spatial and temporal dependency performed best. Inclusion of the ZCTA-level demographic and socioeconomic variables as predictors improved the prediction accuracy, but only in the model that did not account for the neighborhood-level spatial dependency of the ZCTAs. Predictions were better for urban areas than for rural areas, which were more sparsely populated. Using the best performing model and the Massachusetts opioid-related mortality data from 2005 through 2019, our models suggested a stabilizing pattern in opioid-related overdose mortality in 2020 and 2021 if there were no disruptive changes to the trends observed for 2005-2019. Conclusions: Our Bayesian spatiotemporal models focused on opioid-related overdose mortality data facilitated prediction approaches that can inform preemptive public health decision-making and resource allocation. While sparse data from rural and less populated locales typically pose special challenges in small area predictions, our dynamic Bayesian models, which maximized information borrowing across geographic areas and time points, were used to provide more accurate predictions for small areas. Such approaches can be replicated in other jurisdictions and at varying temporal and geographical levels. We encourage the formation of a modeling consortium for fatal opioid-related overdose predictions, where different modeling techniques could be ensembled to inform public health policy. 
UR - https://publichealth.jmir.org/2023/1/e41450 UR - http://dx.doi.org/10.2196/41450 UR - http://www.ncbi.nlm.nih.gov/pubmed/36763450 ID - info:doi/10.2196/41450 ER - TY - JOUR AU - Ryu, Wook Gi AU - Park, Shin Young AU - Kim, Jeewuan AU - Yang, Sook Yong AU - Ko, Young-Guk AU - Choi, Mona PY - 2022/11/18 TI - Incidence and Prevalence of Peripheral Arterial Disease in South Korea: Retrospective Analysis of National Claims Data JO - JMIR Public Health Surveill SP - e34908 VL - 8 IS - 11 KW - peripheral arterial disease KW - insurance claims KW - incidence KW - prevalence KW - endovascular revascularization KW - amputation KW - population-based study KW - blood flow KW - intermittent claudication KW - age KW - sex N2 - Background: Peripheral arterial disease (PAD) causes blood vessel narrowing that decreases blood flow to the lower extremities, with symptoms such as leg pain, discomfort, and intermittent claudication. PAD increases risks for amputation, poor health-related quality of life, and mortality. It is estimated that more than 200 million people worldwide have PAD, although the paucity of PAD research in the East detracts from knowledge on global PAD epidemiology. There are few national data-based analyses or health care utilization investigations. Thus, a national data analysis of PAD incidence and prevalence would provide baseline data to enable health promotion strategies for patients with PAD. Objective: This study aims to identify South Korean trends in the incidence and prevalence of PAD and PAD treatment, in-hospital deaths, and health care utilization. Methods: This was a retrospective analysis of South Korean national claims data from 2009 to 2018. The incidence of PAD was determined by setting the years 2010 and 2011 as a washout period to exclude previously diagnosed patients with PAD. 
The study included adults aged ≥20 and <90 years who received a primary diagnosis of PAD between 2011 and 2018; patients were stratified according to age, sex, and insurance status for the incidence and prevalence analyses. Descriptive statistics were used to assess incidence, prevalence, endovascular revascularization (EVR) events, amputations, in-hospital deaths, and the health care utilization characteristics of patients with PAD. Results: Based on data from 2011 to 2018, there were an average of 124,682 and 993,048 incident and prevalent PAD cases, respectively, in 2018. PAD incidence (per 1000 persons) ranged from 2.68 to 3.09 during the study period. From 2012 to 2018, the incidence rate in both sexes showed an increasing trend. PAD incidence continued to increase with age. PAD prevalence (per 1000 persons) increased steadily, from 3.93 in 2011 to 23.55 in 2018. The number of EVR events varied between 933 and 1422 during the study period, and both major and minor amputations showed a decreasing trend. Health care utilization characteristics showed that women visited clinics more frequently than men, whereas men used tertiary and general hospitals more often than women. Conclusions: The number of incident and prevalent PAD cases generally showed an increasing trend. Visits to tertiary and general hospitals were higher among men than women. These results indicate the need for attention not only to Western and male patients, but also to Eastern and female patients with PAD. The results are generalizable, as they are based on national claims data from the entire South Korean population, and they can promote preventive care and management strategies for patients with PAD in clinical and public health settings. UR - https://publichealth.jmir.org/2022/11/e34908 UR - http://dx.doi.org/10.2196/34908 UR - http://www.ncbi.nlm.nih.gov/pubmed/36399371 ID - info:doi/10.2196/34908 ER - TY - JOUR AU - Ansari, Bahareh AU - Hart-Malloy, Rachel AU - Rosenberg, S. 
Eli AU - Trigg, Monica AU - Martin, G. Erika PY - 2022/11/9 TI - Modeling the Potential Impact of Missing Race and Ethnicity Data in Infectious Disease Surveillance Systems on Disparity Measures: Scenario Analysis of Different Imputation Strategies JO - JMIR Public Health Surveill SP - e38037 VL - 8 IS - 11 KW - missing data KW - sexually transmitted diseases KW - imputation KW - surveillance KW - health equity N2 - Background: Monitoring progress toward population health equity goals requires developing robust disparity indicators. However, surveillance data gaps that result in undercounting racial and ethnic minority groups might influence the observed disparity measures. Objective: This study aimed to assess the impact of missing race and ethnicity data in surveillance systems on disparity measures. Methods: We explored variations in missing race and ethnicity information in reported annual chlamydia and gonorrhea diagnoses in the United States from 2007 to 2018 by state, year, reported sex, and infection. For diagnoses with incomplete demographic information in 2018, we estimated disparity measures (relative rate ratio and rate difference) with 5 imputation scenarios compared with the base case (no adjustments). The 5 scenarios used the racial and ethnic distribution of chlamydia or gonorrhea diagnoses in the same state, chlamydia or gonorrhea diagnoses in neighboring states, chlamydia or gonorrhea diagnoses within the geographic region, HIV diagnoses, and syphilis diagnoses. Results: In 2018, a total of 31.93% (560,551/1,755,510) of chlamydia and 22.11% (128,790/582,475) of gonorrhea diagnoses had missing race and ethnicity information. Missingness differed by infection type but not by reported sex. Missing race and ethnicity information varied widely across states and times (range across state-years: from 0.0% to 96.2%). The rate ratio remained similar in the imputation scenarios, although the rate difference differed nationally and in some states. 
Conclusions: We found that missing race and ethnicity information affects measured disparities, which is important to consider when interpreting disparity metrics. Addressing missing information in surveillance systems requires system-level solutions, such as collecting more complete laboratory data, improving the linkage of data systems, and designing more efficient data collection procedures. As a short-term solution, local public health agencies can adapt these imputation scenarios to their aggregate data to adjust surveillance data for use in population indicators of health equity. UR - https://publichealth.jmir.org/2022/11/e38037 UR - http://dx.doi.org/10.2196/38037 UR - http://www.ncbi.nlm.nih.gov/pubmed/36350701 ID - info:doi/10.2196/38037 ER - TY - JOUR AU - Dehesh, Paria AU - Baradaran, Reza Hamid AU - Eshrati, Babak AU - Motevalian, Abbas Seyed AU - Salehi, Masoud AU - Donyavi, Tahereh PY - 2022/11/8 TI - The Relationship Between Population-Level SARS-CoV-2 Cycle Threshold Values and Trend of COVID-19 Infection: Longitudinal Study JO - JMIR Public Health Surveill SP - e36424 VL - 8 IS - 11 KW - cycle threshold value KW - COVID-19 KW - trend KW - surveillance KW - epidemiology KW - disease surveillance KW - digital surveillance KW - prediction model KW - epidemic modeling KW - health system KW - infectious disease N2 - Background: The distribution of population-level real-time reverse transcription-polymerase chain reaction (RT-PCR) cycle threshold (Ct) values as a proxy of viral load may be a useful indicator for predicting COVID-19 dynamics. Objective: The aim of this study was to determine the relationship between the daily trend of average Ct values and COVID-19 dynamics, calculated as the daily number of hospitalized patients with COVID-19, daily number of new positive tests, daily number of COVID-19 deaths, and number of hospitalized patients with COVID-19 by age. We further sought to determine the lag between these data series. 
Methods: The samples included in this study were collected from March 21, 2021, to December 1, 2021. Daily Ct values of all patients who were referred to the Molecular Diagnostic Laboratory of Iran University of Medical Sciences in Tehran, Iran, for RT-PCR tests were recorded. The daily number of positive tests and the number of hospitalized patients by age group were extracted from the COVID-19 patient information registration system in Tehran province, Iran. An autoregressive integrated moving average (ARIMA) model was constructed for the time series of variables. Cross-correlation analysis was then performed to determine the best lag and correlations between the average daily Ct value and other COVID-19 dynamics-related variables. Finally, the best-selected lag of Ct identified through cross-correlation was incorporated as a covariate into the autoregressive integrated moving average with exogenous variables (ARIMAX) model to calculate the coefficients. Results: Daily average Ct values showed a significant negative correlation (23-day time delay) with the daily number of newly hospitalized patients (P=.02), 30-day time delay with the daily number of new positive tests (P=.02), and daily number of COVID-19 deaths (P=.02). The daily average Ct value with a 30-day delay could impact the daily number of positive tests for COVID-19 (β=−16.87, P<.001) and the daily number of deaths from COVID-19 (β=−1.52, P=.03). There was a significant association between Ct lag (23 days) and the number of COVID-19 hospitalizations (β=−24.12, P=.005). Cross-correlation analysis showed significant time delays in the average Ct values and daily hospitalized patients between 18-59 years (23-day time delay, P=.02) and in patients over 60 years old (23-day time delay, P<.001). No statistically significant relation was detected in the number of daily hospitalized patients under 5 years old (9-day time delay, P=.27) and aged 5-17 years (13-day time delay, P=.39). 
Conclusions: It is important for surveillance of COVID-19 to find a good indicator that can predict epidemic surges in the community. Our results suggest that the average daily Ct value with a 30-day delay can predict increases in the number of positive confirmed COVID-19 cases, which may be a useful indicator for the health system. UR - https://publichealth.jmir.org/2022/11/e36424 UR - http://dx.doi.org/10.2196/36424 UR - http://www.ncbi.nlm.nih.gov/pubmed/36240022 ID - info:doi/10.2196/36424 ER - TY - JOUR AU - McIntyre, F. Anne AU - Mitchell, Andrew AU - Stafford, A. Kristen AU - Nwafor, Uchenna Samuel AU - Lo, Julia AU - Sebastian, Victor AU - Schwitters, Amee AU - Swaminathan, Mahesh AU - Dalhatu, Ibrahim AU - Charurat, Man PY - 2022/10/26 TI - Key Population Size Estimation to Guide HIV Epidemic Responses in Nigeria: Bayesian Analysis of 3-Source Capture-Recapture Data JO - JMIR Public Health Surveill SP - e34555 VL - 8 IS - 10 KW - sex workers KW - men who have sex with men KW - people who inject drugs KW - HIV KW - population size KW - population KW - data KW - female KW - men KW - drugs KW - drug injection KW - epidemic KW - Nigeria N2 - Background: Nigeria has the fourth largest burden of HIV globally. Key populations, including female sex workers, men who have sex with men, and people who inject drugs, are more vulnerable to HIV than the general population due to stigmatized and criminalized behaviors. Reliable key population size estimates are needed to guide HIV epidemic response efforts. Objective: The objective of our study was to use empirical methods for sampling and analysis to improve the quality of population size estimates of female sex workers, men who have sex with men, and people who inject drugs in 7 states (Akwa Ibom, Benue, Cross River, Lagos, Nasarawa, Rivers, and the Federal Capital Territory) of Nigeria for program planning and to demonstrate improved statistical estimation methods. 
Methods: From October to December 2018, we used 3-source capture-recapture to produce population size estimates in 7 states in Nigeria. Hotspots were mapped before 3-source capture-recapture started. We sampled female sex workers, men who have sex with men, and people who inject drugs during 3 independent captures about one week apart. During hotspot encounters, key population members were offered inexpensive, memorable objects unique to each capture round. In subsequent rounds, key population members were offered an object and asked to identify objects received during previous rounds (if any). Correct responses were tallied and recorded on tablets. Data were aggregated by key population and state for analysis. Median population size estimates were derived using Bayesian nonparametric latent-class models with 80% highest density intervals. Results: Overall, we sampled approximately 310,000 persons at 9015 hotspots during 3 independent captures. Population size estimates for female sex workers ranged from 14,500 to 64,300; population size estimates for men who have sex with men ranged from 3200 to 41,400; and population size estimates for people who inject drugs ranged from 3400 to 30,400. Conclusions: This was the first implementation of these 3-source capture-recapture methods in Nigeria. Our population size estimates were larger than previously documented for each key population in all states. The Bayesian models account for factors, such as social visibility, that influence heterogeneous capture probabilities, resulting in more reliable population size estimates. The larger population size estimates suggest a need for programmatic scale-up to reach these populations, which are at highest risk for HIV. 
UR - https://publichealth.jmir.org/2022/10/e34555 UR - http://dx.doi.org/10.2196/34555 UR - http://www.ncbi.nlm.nih.gov/pubmed/36287587 ID - info:doi/10.2196/34555 ER - TY - JOUR AU - Karystianis, George AU - Cabral, Carines Rina AU - Adily, Armita AU - Lukmanjaya, Wilson AU - Schofield, Peter AU - Buchan, Iain AU - Nenadic, Goran AU - Butler, Tony PY - 2022/10/20 TI - Mental Illness Concordance Between Hospital Clinical Records and Mentions in Domestic Violence Police Narratives: Data Linkage Study JO - JMIR Form Res SP - e39373 VL - 6 IS - 10 KW - data linkage KW - mental health KW - domestic violence KW - police records KW - hospital records KW - text mining N2 - Background: To better understand domestic violence, data sources from multiple sectors such as police, justice, health, and welfare are needed. Linking police data to data collections from other agencies could provide unique insights and promote an all-of-government response to domestic violence. The New South Wales Police Force attends domestic violence events and records information in the form of both structured data and a free-text narrative, with the latter shown to be a rich source of information on the mental health status of persons of interest (POIs) and victims, abuse types, and sustained injuries. Objective: This study aims to examine the concordance (ie, matching) between mental illness mentions extracted from the police's event narratives and mental health diagnoses from hospital and emergency department records. Methods: We applied a rule-based text mining method on 416,441 domestic violence police event narratives between December 2005 and January 2016 to identify mental illness mentions for POIs and victims. 
Using different window periods (1, 3, 6, and 12 months) before and after a domestic violence event, we linked the extracted mental illness mentions of victims and POIs to clinical records from the Emergency Department Data Collection and the Admitted Patient Data Collection in New South Wales, Australia using a unique identifier for each individual in the same cohort. Results: Using a 2-year window period (ie, 12 months before and after the domestic violence event), less than 1% (3020/416,441, 0.73%) of events had a mental illness mention and also a corresponding hospital record. About 16% of domestic violence events for both POIs (382/2395, 15.95%) and victims (101/631, 16.01%) had an agreement between hospital records and police narrative mentions of mental illness. A total of 51,025/416,441 (12.25%) events for POIs and 14,802/416,441 (3.55%) events for victims had mental illness mentions in their narratives but no hospital record. Only 841 events for POIs and 919 events for victims had a documented hospital record within 48 hours of the domestic violence event. Conclusions: Our findings suggest that current surveillance systems used to report on domestic violence may be enhanced by accessing rich information (ie, mental illness) contained in police text narratives, made available for both POIs and victims through the application of text mining. Additional insights can be gained by linkage to other health and welfare data collections. 
UR - https://formative.jmir.org/2022/10/e39373 UR - http://dx.doi.org/10.2196/39373 UR - http://www.ncbi.nlm.nih.gov/pubmed/36264613 ID - info:doi/10.2196/39373 ER - TY - JOUR AU - Lv, Yi AU - Zhu, Qiyu AU - Xu, Chengdong AU - Zhang, Guanbin AU - Jiang, Yan AU - Han, Mengjie AU - Jin, Cong PY - 2022/9/27 TI - Spatiotemporal Analysis of Online Purchase of HIV Self-testing Kits in China, 2015-2017: Longitudinal Observational Study JO - JMIR Public Health Surveill SP - e37922 VL - 8 IS - 9 KW - spatiotemporal KW - characteristics KW - online KW - purchase KW - HIV KW - self-testing KW - e-commerce KW - economic status KW - HIV epidemic KW - China N2 - Background: Since the introduction of HIV self-testing by UNAIDS in 2014, the practice has been extensively implemented around the world. HIV self-testing (HIVST) was developed in China around 2015, and the online purchase of HIVST kits through e-commerce platforms has since become the most important delivery method for self-testing, with advantages such as user-friendliness, speed, and better privacy protection. Objective: Understanding the spatiotemporal characteristics of online HIVST kit purchasing behavior and identifying potential impacting factors will help promote the HIV self-testing strategy. Methods: The online retail data of HIVST kits from the 2 largest e-commerce platforms in China from 2015 to 2017 were collected for this study. The Bayesian spatiotemporal hierarchical model was used to investigate the spatiotemporal characteristics of online purchased HIVST kits. Ordinary least squares regression was used to identify potential factors associated with online purchase, including GDP per capita, population density, road density, HIV screening laboratory density, and newly diagnosed HIV/AIDS cases per 100,000 persons. The q statistics calculated by Geodetector were used to determine the interactive effect of every 2 factors on the online purchase. 
Results: The online purchase of HIVST kits increased rapidly in China from 2015 to 2017, with annual peak sales in May and December. Five economically superior regions in China, Pearl River Delta, Yangtze River Delta, Chengdu and surrounding areas, Beijing and Tianjin areas, and Shandong Peninsula, showed a comparatively higher spatial preference for online purchased HIVST kits. The GDP per capita (P<.001) and the rate of newly diagnosed HIV/AIDS cases per 100,000 persons (P<.001) were identified as 2 factors positively associated with online purchase. Among the factors we investigated in this study, 2 factors associated with online purchase, GDP per capita and the rate of newly diagnosed HIV/AIDS cases per 100,000 persons, also displayed the strongest interactive effect, with a q value of 0.66. Conclusions: Individuals in better-off areas are more inclined to purchase HIVST kits online. In addition to economic status, the severity of the HIV epidemic is also a factor influencing the online purchase of HIVST kits. UR - https://publichealth.jmir.org/2022/9/e37922 UR - http://dx.doi.org/10.2196/37922 UR - http://www.ncbi.nlm.nih.gov/pubmed/35918844 ID - info:doi/10.2196/37922 ER - TY - JOUR AU - Weiss, Samuel Paul AU - Waller, Allyn Lance PY - 2022/9/9 TI - The Impact of Nonrandom Missingness in Surveillance Data for Population-Level Summaries: Simulation Study JO - JMIR Public Health Surveill SP - e37887 VL - 8 IS - 9 KW - surveillance KW - estimation KW - missing data KW - population-level estimates KW - health policy KW - public health policy KW - estimates KW - data KW - policy decision KW - bias KW - response rate N2 - Background: Surveillance data are essential public health resources for guiding policy and allocation of human and capital resources. These data often consist of large collections of information based on nonrandom sample designs. 
Population estimates based on such data may be impacted by the underlying sample distribution compared to the true population of interest. In this study, we simulate a population of interest and allow response rates to vary in nonrandom ways to illustrate and measure the effect this has on population-based estimates of an important public health policy outcome. Objective: The aim of this study was to illustrate the effect of nonrandom missingness on population-based survey sample estimation. Methods: We simulated a population of respondents answering a survey question about their satisfaction with their community's policy regarding vaccination mandates for government personnel. We allowed response rates to differ between the generally satisfied and dissatisfied and considered the effect of common efforts to control for potential bias such as sampling weights, sample size inflation, and hypothesis tests for determining missingness at random. We compared these conditions via mean squared errors and sampling variability to characterize the bias in estimation arising under these different approaches. Results: Sample estimates present clear and quantifiable bias, even in the most favorable response profile. On a 5-point Likert scale, nonrandom missingness resulted in errors averaging to almost a full point away from the truth. Efforts to mitigate bias through sample size inflation and sampling weights have negligible effects on the overall results. Additionally, hypothesis testing for departures from random missingness rarely detects the nonrandom missingness across the widest range of response profiles considered. Conclusions: Our results suggest that assuming surveillance data are missing at random during analysis could provide estimates that are widely different from what we might see in the whole population. Policy decisions based on such potentially biased estimates could be devastating in terms of community disengagement and health disparities. 
Alternative approaches to analysis that move away from broad generalization of a mismeasured population at risk are necessary to identify the marginalized groups, where overall response may be very different from those observed in measured respondents. UR - https://publichealth.jmir.org/2022/9/e37887 UR - http://dx.doi.org/10.2196/37887 UR - http://www.ncbi.nlm.nih.gov/pubmed/36083618 ID - info:doi/10.2196/37887 ER - TY - JOUR AU - Templ, Matthias AU - Kanjala, Chifundo AU - Siems, Inken PY - 2022/9/2 TI - Privacy of Study Participants in Open-access Health and Demographic Surveillance System Data: Requirements Analysis for Data Anonymization JO - JMIR Public Health Surveill SP - e34472 VL - 8 IS - 9 KW - longitudinal data and event history data KW - low- and middle-income countries KW - LMIC KW - anonymization KW - health and demographic surveillance system N2 - Background: Data anonymization and sharing have become popular topics for individuals, organizations, and countries worldwide. Open-access sharing of anonymized data containing sensitive information about individuals makes the most sense whenever the utility of the data can be preserved and the risk of disclosure can be kept below acceptable levels. In this case, researchers can use the data without access restrictions and limitations. Objective: This study aimed to highlight the requirements and possible solutions for sharing health surveillance event history data. The challenges lie in the anonymization of multiple event dates and time-varying variables. Methods: A sequential approach that adds noise to event dates is proposed. This approach maintains the event order and preserves the average time between events. In addition, a nosy neighbor distance-based matching approach to estimate the risk is proposed. 
Regarding the key variables that change over time, such as educational level or occupation, we make 2 proposals: one based on limiting the intermediate statuses of the individual and the other to achieve k-anonymity in subsets of the data. The proposed approaches were applied to the Karonga health and demographic surveillance system (HDSS) core residency data set, which contains longitudinal data from 1995 to the end of 2016 and includes 280,381 events with time-varying socioeconomic variables and demographic information. Results: An anonymized version of the event history data, including longitudinal information on individuals over time, with high data utility, was created. Conclusions: The proposed anonymization of event history data comprising static and time-varying variables applied to HDSS data led to acceptable disclosure risk, preserved utility, and being sharable as public use data. It was found that high utility was achieved, even with the highest level of noise added to the core event dates. The details are important to ensure consistency or credibility. Importantly, the sequential noise addition approach presented in this study does not only maintain the event order recorded in the original data but also maintains the time between events. We proposed an approach that preserves the data utility well but limits the number of response categories for the time-varying variables. Furthermore, using distance-based neighborhood matching, we simulated an attack under a nosy neighbor situation and by using a worst-case scenario where attackers have full information on the original data. We showed that the disclosure risk is very low, even when assuming that the attacker's database and information are optimal. 
The HDSS and medical science research communities in low- and middle-income country settings will be the primary beneficiaries of the results and methods presented in this paper; however, the results will be useful for anyone working on anonymizing longitudinal event history data with time-varying variables for the purposes of sharing. UR - https://publichealth.jmir.org/2022/9/e34472 UR - http://dx.doi.org/10.2196/34472 UR - http://www.ncbi.nlm.nih.gov/pubmed/36053573 ID - info:doi/10.2196/34472 ER - TY - JOUR AU - Tung, S. Keith T. AU - Wong, S. Rosa AU - Ho, K. Frederick AU - Chan, Ling Ko AU - Wong, S. Wilfred H. AU - Leung, Hugo AU - Leung, Ming AU - Leung, K. Gilberto K. AU - Chow, Bong Chun AU - Ip, Patrick PY - 2022/8/18 TI - Development and Validation of Indicators for Population Injury Surveillance in Hong Kong: Development and Usability Study JO - JMIR Public Health Surveill SP - e36861 VL - 8 IS - 8 KW - injury KW - indicators KW - modified Delphi research design KW - surveillance N2 - Background: Injury is an increasingly pressing global health issue. An effective surveillance system is required to monitor the trends and burden of injuries. Objective: This study aimed to identify a set of valid and context-specific injury indicators to facilitate the establishment of an injury surveillance program in Hong Kong. Methods: This development of indicators adopted a multiphased modified Delphi research design. A literature search was conducted on academic databases using injury-related search terms in various combinations. A list of potential indicators was sent to a panel of experts from various backgrounds to rate the validity and context-specificity of these indicators. Local hospital data on the selected core indicators were used to examine their applicability in the context of Hong Kong. Results: We reviewed 142 articles and identified 55 indicators, which were classified into 4 domains. 
On the basis of the ratings by the expert panel, 13 indicators were selected as core indicators because of their good validity and high relevance to the local context. Among these indicators, 10 were from the construct of health care service use, and 3 were from the construct of postdischarge outcomes. Regression analyses of local hospitalization data showed that the Hong Kong Safe Community certification status had no association with 5 core indicators (admission to intensive care unit, mortality rate, length of intensive care unit stay, need for a rehabilitation facility, and long-term behavioral and emotional outcomes), negative associations with 4 core indicators (operative intervention, infection rate, length of hospitalization, and disability-adjusted life years), and positive associations with the remaining 4 core indicators (attendance to accident and emergency department, discharge rate, suicide rate, and hospitalization rate after attending the accident and emergency department). These results confirmed the validity of the selected core indicators for the quantification of injury burden and evaluation of injury-related services, although some indicators may better measure the consequences of severe injuries. Conclusions: This study developed a set of injury outcome indicators that would be useful for monitoring injury trends and burdens in Hong Kong. UR - https://publichealth.jmir.org/2022/8/e36861 UR - http://dx.doi.org/10.2196/36861 UR - http://www.ncbi.nlm.nih.gov/pubmed/35980728 ID - info:doi/10.2196/36861 ER - TY - JOUR AU - Donegan, Connor AU - Hughes, E. Amy AU - Lee, Craddock Simon J. PY - 2022/8/16 TI - Colorectal Cancer Incidence, Inequalities, and Prevention Priorities in Urban Texas: Surveillance Study With the "surveil" 
Software Package JO - JMIR Public Health Surveill SP - e34589 VL - 8 IS - 8 KW - Bayesian analysis KW - cancer prevention KW - colorectal cancer KW - health equity KW - open source software KW - public health monitoring KW - time-series analysis N2 - Background: Monitoring disease incidence rates over time with population surveillance data is fundamental to public health research and practice. Bayesian disease monitoring methods provide advantages over conventional methods including greater flexibility in model specification and the ability to conduct formal inference on model-derived quantities of interest. However, software platforms for Bayesian inference are often inaccessible to nonspecialists. Objective: To increase the accessibility of Bayesian methods among health surveillance researchers, we introduce a Bayesian methodology and open source software package, surveil, for time-series modeling of disease incidence and mortality. Given case count and population-at-risk data, the software enables health researchers to draw inferences about underlying risk and derivative quantities including age-standardized rates, annual and cumulative percent change, and measures of inequality. Methods: We specify a Poisson likelihood for case counts and model trends in log-risk using the first-difference (random-walk) prior. Models in the surveil R package were built using the Stan modeling language. We demonstrate the methodology and software by analyzing age-standardized colorectal cancer (CRC) incidence rates by race and ethnicity for non-Latino Black (Black), non-Latino White (White), and Hispanic/Latino (of any race) adults aged 50-79 years in Texas's 4 largest metropolitan statistical areas between 1999 and 2018. Results: Our analysis revealed a cumulative decline of 31% (95% CI −37% to −25%) in CRC risk among Black adults, 17% (95% CI −23% to −11%) for Latino adults, and 35% (95% CI −38% to −31%) for White adults from 1999 to 2018. 
None of the 3 observed groups experienced significant incidence reduction in the final 4 years of the study (2015-2018). The Black-White rate difference (per 100,000) was 44 (95% CI 30-57) in 1999 and 35 (95% CI 28-43) in 2018. Cumulatively, the Black-White gap accounts for 3983 CRC cases (95% CI 3746-4219) or 31% (95% CI 29%-32%) of total CRC incidence among Black adults in this period. Conclusions: Stalled progress on CRC prevention and excess CRC risk among Black residents warrant special attention as cancer prevention and control priorities in urban Texas. Our methodology and software can help the public and health agencies monitor health inequalities and evaluate progress toward disease prevention goals. Advantages of the methodology over current common practice include the following: (1) the absence of piecewise linearity constraints on the model space, and (2) formal inference can be undertaken on any model-derived quantities of interest using Bayesian methods. UR - https://publichealth.jmir.org/2022/8/e34589 UR - http://dx.doi.org/10.2196/34589 UR - http://www.ncbi.nlm.nih.gov/pubmed/35972778 ID - info:doi/10.2196/34589 ER - TY - JOUR AU - Luo, Wei AU - Liu, Zhaoyin AU - Zhou, Yuxuan AU - Zhao, Yumin AU - Li, Elita Yunyue AU - Masrur, Arif AU - Yu, Manzhu PY - 2022/8/9 TI - Investigating Linkages Between Spatiotemporal Patterns of the COVID-19 Delta Variant and Public Health Interventions in Southeast Asia: Prospective Space-Time Scan Statistical Analysis Method JO - JMIR Public Health Surveill SP - e35840 VL - 8 IS - 8 KW - COVID-19 KW - Delta variant KW - space-time scan KW - intervention KW - Southeast Asia N2 - Background: The COVID-19 Delta variant has presented an unprecedented challenge to countries in Southeast Asia (SEA). Its transmission has shown spatial heterogeneity in SEA after countries have adopted different public health interventions during the process. 
Hence, it is crucial for public health authorities to discover potential linkages between epidemic progression and corresponding interventions such that collective and coordinated control measurements can be designed to increase their effectiveness at reducing transmission in SEA. Objective: The purpose of this study is to explore potential linkages between the spatiotemporal progression of the COVID-19 Delta variant and nonpharmaceutical intervention (NPI) measures in SEA. We detected the space-time clusters of outbreaks of COVID-19 and analyzed how the NPI measures relate to the propagation of COVID-19. Methods: We collected district-level daily new cases of COVID-19 from June 1 to October 31, 2021, and district-level population data in SEA. We adopted prospective space-time scan statistics to identify the space-time clusters. Using cumulative prospective space-time scan statistics, we further identified variations of relative risk (RR) across each district at a half-month interval and their potential public health intervention linkages. Results: We found 7 high-risk clusters (clusters 1-7) of COVID-19 transmission in Malaysia, the Philippines, Thailand, Vietnam, and Indonesia between June and August, 2021, with an RR of 5.45 (P<.001), 3.50 (P<.001), 2.30 (P<.001), 1.36 (P<.001), 5.62 (P<.001), 2.38 (P<.001), 3.45 (P<.001), respectively. There were 34 provinces in Indonesia that have successfully mitigated the risk of COVID-19, with a decreasing range between −0.05 and −1.46 due to the assistance of continuous restrictions. However, 58.6% of districts in Malaysia, Singapore, Thailand, and the Philippines saw an increase in the infection risk, which is aligned with their loosened restrictions. Continuous strict interventions were effective in mitigating COVID-19, while relaxing restrictions may exacerbate the propagation risk of this epidemic. 
Conclusions: The analyses of space-time clusters and RRs of districts benefit public health authorities with continuous surveillance of COVID-19 dynamics using real-time data. International coordination with more synchronized interventions amidst all SEA countries may play a key role in mitigating the progression of COVID-19. UR - https://publichealth.jmir.org/2022/8/e35840 UR - http://dx.doi.org/10.2196/35840 UR - http://www.ncbi.nlm.nih.gov/pubmed/35861674 ID - info:doi/10.2196/35840 ER - TY - JOUR AU - Stockham, Nathaniel AU - Washington, Peter AU - Chrisman, Brianna AU - Paskov, Kelley AU - Jung, Jae-Yoon AU - Wall, Paul Dennis PY - 2022/7/21 TI - Causal Modeling to Mitigate Selection Bias and Unmeasured Confounding in Internet-Based Epidemiology of COVID-19: Model Development and Validation JO - JMIR Public Health Surveill SP - e31306 VL - 8 IS - 7 KW - selection bias KW - COVID-19 KW - epidemiology KW - causality KW - sensitivity analysis KW - public health KW - surveillance KW - method KW - epidemiologic research design KW - model KW - bias KW - development KW - validation KW - utility KW - implementation KW - sensitivity KW - design KW - research N2 - Background: Selection bias and unmeasured confounding are fundamental problems in epidemiology that threaten study internal and external validity. These phenomena are particularly dangerous in internet-based public health surveillance, where traditional mitigation and adjustment methods are inapplicable, unavailable, or out of date. Recent theoretical advances in causal modeling can mitigate these threats, but these innovations have not been widely deployed in the epidemiological community. Objective: The purpose of our paper is to demonstrate the practical utility of causal modeling to both detect unmeasured confounding and selection bias and guide model selection to minimize bias. 
We implemented this approach in an applied epidemiological study of the COVID-19 cumulative infection rate in the New York City (NYC) spring 2020 epidemic. Methods: We collected primary data from Qualtrics surveys of Amazon Mechanical Turk (MTurk) crowd workers residing in New Jersey and New York State across 2 sampling periods: April 11-14 and May 8-11, 2020. The surveys queried the subjects on household health status and demographic characteristics. We constructed a set of possible causal models of household infection and survey selection mechanisms and ranked them by compatibility with the collected survey data. The most compatible causal model was then used to estimate the cumulative infection rate in each survey period. Results: There were 527 and 513 responses collected for the 2 periods, respectively. Response demographics were highly skewed toward a younger age in both survey periods. Despite the extremely strong relationship between age and COVID-19 symptoms, we recovered minimally biased estimates of the cumulative infection rate using only primary data and the most compatible causal model, with a relative bias of +3.8% and −1.9% from the reported cumulative infection rate for the first and second survey periods, respectively. Conclusions: We successfully recovered accurate estimates of the cumulative infection rate from an internet-based crowdsourced sample despite considerable selection bias and unmeasured confounding in the primary data. This implementation demonstrates how simple applications of structural causal modeling can be effectively used to determine falsifiable model conditions, detect selection bias and confounding factors, and minimize estimate bias through model selection in a novel epidemiological context. As the disease and social dynamics of COVID-19 continue to evolve, public health surveillance protocols must continue to adapt, with the emergence of Omicron variants and the shift to at-home testing as recent challenges. 
Rigorous and transparent methods to develop, deploy, and diagnose adapted surveillance protocols will be critical to their success. UR - https://publichealth.jmir.org/2022/7/e31306 UR - http://dx.doi.org/10.2196/31306 UR - http://www.ncbi.nlm.nih.gov/pubmed/35605128 ID - info:doi/10.2196/31306 ER - TY - JOUR AU - Li, Jingwei AU - Huang, Wei AU - Sia, Ling Choon AU - Chen, Zhuo AU - Wu, Tailai AU - Wang, Qingnan PY - 2022/6/16 TI - Enhancing COVID-19 Epidemic Forecasting Accuracy by Combining Real-time and Historical Data From Multiple Internet-Based Sources: Analysis of Social Media Data, Online News Articles, and Search Queries JO - JMIR Public Health Surveill SP - e35266 VL - 8 IS - 6 KW - SARS-CoV-2 KW - COVID 19 KW - epidemic forecasting KW - disease surveillance KW - infectious disease epidemiology KW - social media KW - online news KW - search query KW - autoregression model N2 - Background: The SARS-CoV-2 virus and its variants pose extraordinary challenges for public health worldwide. Timely and accurate forecasting of the COVID-19 epidemic is key to sustaining interventions and policies and efficient resource allocation. Internet-based data sources have shown great potential to supplement traditional infectious disease surveillance, and the combination of different Internet-based data sources has shown greater power to enhance epidemic forecasting accuracy than using a single Internet-based data source. However, existing methods incorporating multiple Internet-based data sources only used real-time data from these sources as exogenous inputs but did not take all the historical data into account. Moreover, the predictive power of different Internet-based data sources in providing early warning for COVID-19 outbreaks has not been fully explored. 
Objective: The main aim of our study is to explore whether combining real-time and historical data from multiple Internet-based sources could improve the COVID-19 forecasting accuracy over the existing baseline models. A secondary aim is to explore the COVID-19 forecasting timeliness based on different Internet-based data sources. Methods: We first used core terms and symptom-related keyword-based methods to extract COVID-19-related Internet-based data from December 21, 2019, to February 29, 2020. The Internet-based data we explored included 90,493,912 online news articles, 37,401,900 microblogs, and all the Baidu search query data during that period. We then proposed an autoregressive model with exogenous inputs, incorporating real-time and historical data from multiple Internet-based sources. Our proposed model was compared with baseline models, and all the models were tested during the first wave of COVID-19 epidemics in Hubei province and the rest of mainland China separately. We also used lagged Pearson correlations for COVID-19 forecasting timeliness analysis. Results: Our proposed model achieved the highest accuracy in all 5 accuracy measures, compared with all the baseline models of both Hubei province and the rest of mainland China. In mainland China, except for Hubei, the COVID-19 epidemic forecasting accuracy differences between our proposed model (model i) and all the other baseline models were statistically significant (model 1, t198=−8.722, P<.001; model 2, t198=−5.000, P<.001, model 3, t198=−1.882, P=.06; model 4, t198=−4.644, P<.001; model 5, t198=−4.488, P<.001). In Hubei province, our proposed model's forecasting accuracy improved significantly compared with the baseline model using historical new confirmed COVID-19 case counts only (model 1, t198=−1.732, P=.09). Our results also showed that Internet-based sources could provide a 2- to 6-day earlier warning for COVID-19 outbreaks. 
Conclusions: Our approach incorporating real-time and historical data from multiple Internet-based sources could improve forecasting accuracy for epidemics of COVID-19 and its variants, which may help improve public health agencies' interventions and resource allocation in mitigating and controlling new waves of COVID-19 or other relevant epidemics. UR - https://publichealth.jmir.org/2022/6/e35266 UR - http://dx.doi.org/10.2196/35266 UR - http://www.ncbi.nlm.nih.gov/pubmed/35507921 ID - info:doi/10.2196/35266 ER - TY - JOUR AU - Ren, Ningjun AU - Li, Yuansheng AU - Wang, Ruolan AU - Zhang, Wenxin AU - Chen, Run AU - Xiao, Ticheng AU - Chen, Hang AU - Li, Ailing AU - Fan, Song PY - 2022/6/14 TI - The Distribution of HIV and AIDS Cases in Luzhou, China, From 2011 to 2020: Bayesian Spatiotemporal Analysis JO - JMIR Public Health Surveill SP - e37491 VL - 8 IS - 6 KW - HIV and AIDS KW - reported incidence KW - Bayesian model KW - spatio-temporal distribution N2 - Background: The vastly increasing number of reported HIV and AIDS cases in Luzhou, China, in recent years, coupled with the city's unique geographical location at the intersection of 4 provinces, makes it particularly important to conduct a spatiotemporal analysis of HIV and AIDS cases. Objective: The aim of this study is to understand the spatiotemporal distribution of HIV and the factors influencing this distribution in Luzhou, China, from 2011 to 2020. Methods: Data on the incidence of HIV and AIDS in Luzhou from 2011 to 2020 were obtained from the AIDS Information Management System of the Luzhou Center for Disease Control and Prevention. ArcGIS was used to visualize the spatiotemporal distribution of HIV and AIDS cases. The Bayesian spatiotemporal model was used to investigate factors affecting the spatiotemporal distribution of HIV and AIDS, including the gross domestic product (GDP) per capita, urbanization rate, number of hospital beds, population density, and road mileage. 
Results: The reported incidence of HIV and AIDS rose from 8.50 cases per 100,000 population in 2011 to 49.25 cases per 100,000 population in 2020, an increase of 578.87%. In the first 5 years, hotspots were concentrated in Jiangyang district, Longmatan district, and Luxian county. After 2016, Luzhou's high HIV incidence areas gradually shifted eastward, with Hejiang county having the highest average prevalence rate (41.68 cases per 100,000 population) from 2011 to 2020, being 2.28 times higher than that in Gulin county (18.30 cases per 100,000), where cold spots were concentrated. The risk for the incidence of HIV and AIDS was associated with the urbanization rate, population density, and GDP per capita. For every 1% increase in the urbanization rate, the relative risk (RR) increases by 1.3%, while an increase of 100 people per square kilometer would increase the RR by 8.7%; for every 1000 Yuan (US $148.12) increase in GDP per capita, the RR decreases by 1.5%. Conclusions: In Luzhou, current HIV and AIDS prevention and control efforts must be focused on the location of each district or county government; we suggest the region balance urban development and HIV and AIDS prevention. Moreover, more attention should be paid to economically disadvantaged areas. UR - https://publichealth.jmir.org/2022/6/e37491 UR - http://dx.doi.org/10.2196/37491 UR - http://www.ncbi.nlm.nih.gov/pubmed/35700022 ID - info:doi/10.2196/37491 ER - TY - JOUR AU - Lundberg, L. Alexander AU - Lorenzo-Redondo, Ramon AU - Hultquist, F. Judd AU - Hawkins, A. Claudia AU - Ozer, A. Egon AU - Welch, B. Sarah AU - Prasad, Vara P. V. AU - Achenbach, J. Chad AU - White, I. Janine AU - Oehmke, F. James AU - Murphy, L. Robert AU - Havey, J. Robert AU - Post, A. 
Lori PY - 2022/6/3 TI - Overlapping Delta and Omicron Outbreaks During the COVID-19 Pandemic: Dynamic Panel Data Estimates JO - JMIR Public Health Surveill SP - e37377 VL - 8 IS - 6 KW - Omicron variant of concern KW - Delta KW - COVID-19 KW - SARS-CoV-2 KW - B.1.1.529 KW - outbreak KW - Arellano-Bond estimator KW - dynamic panel data KW - stringency index KW - surveillance KW - disease transmission metrics N2 - Background: The Omicron variant of SARS-CoV-2 is more transmissible than prior variants of concern (VOCs). It has caused the largest outbreaks in the pandemic, with increases in mortality and hospitalizations. Early data on the spread of Omicron were captured in countries with relatively low case counts, so it was unclear how the arrival of Omicron would impact the trajectory of the pandemic in countries already experiencing high levels of community transmission of Delta. Objective: The objective of this study is to quantify and explain the impact of Omicron on pandemic trajectories and how they differ between countries that were or were not in a Delta outbreak at the time Omicron occurred. Methods: We used SARS-CoV-2 surveillance and genetic sequence data to classify countries into 2 groups: those that were in a Delta outbreak (defined by at least 10 novel daily transmissions per 100,000 population) when Omicron was first sequenced in the country and those that were not. We used trend analysis, survival curves, and dynamic panel regression models to compare outbreaks in the 2 groups over the period from November 1, 2021, to February 11, 2022. We summarized the outbreaks in terms of their peak rate of SARS-CoV-2 infections and the duration of time the outbreaks took to reach the peak rate. 
Results: Countries that were already in an outbreak with predominantly Delta lineages when Omicron arrived took longer to reach their peak rate and saw greater than a twofold increase (2.04) in the average apex of the Omicron outbreak compared to countries that were not yet in an outbreak. Conclusions: These results suggest that high community transmission of Delta at the time of the first detection of Omicron was not protective, but rather preluded larger outbreaks in those countries. Outbreak status may reflect a generally susceptible population, due to overlapping factors, including climate, policy, and individual behavior. In the absence of strong mitigation measures, arrival of a new, more transmissible variant in these countries is therefore more likely to lead to larger outbreaks. Alternately, countries with enhanced surveillance programs and incentives may be more likely to both exist in an outbreak status and detect more cases during an outbreak, resulting in a spurious relationship. Either way, these data argue against herd immunity mitigating future outbreaks with variants that have undergone significant antigenic shifts. UR - https://publichealth.jmir.org/2022/6/e37377 UR - http://dx.doi.org/10.2196/37377 UR - http://www.ncbi.nlm.nih.gov/pubmed/35500140 ID - info:doi/10.2196/37377 ER - TY - JOUR AU - Couture, Alexia AU - Iuliano, Danielle A. AU - Chang, H. Howard AU - Patel, N. Neha AU - Gilmer, Matthew AU - Steele, Molly AU - Havers, P. 
Fiona AU - Whitaker, Michael AU - Reed, Carrie PY - 2022/6/2 TI - Estimating COVID-19 Hospitalizations in the United States With Surveillance Data Using a Bayesian Hierarchical Model: Modeling Study JO - JMIR Public Health Surveill SP - e34296 VL - 8 IS - 6 KW - COVID-19 KW - SARS-CoV-2 KW - hospitalization KW - Bayesian KW - COVID-NET KW - extrapolation KW - hospital KW - estimation KW - prediction KW - United States KW - surveillance KW - data KW - model KW - modeling KW - hierarchical KW - rate KW - novel KW - framework KW - monitoring N2 - Background: In the United States, COVID-19 is a nationally notifiable disease, meaning cases and hospitalizations are reported by states to the Centers for Disease Control and Prevention (CDC). Identifying and reporting every case from every facility in the United States may not be feasible in the long term. Creating sustainable methods for estimating the burden of COVID-19 from established sentinel surveillance systems is becoming more important. Objective: We aimed to provide a method leveraging surveillance data to create a long-term solution to estimate monthly rates of hospitalizations for COVID-19. Methods: We estimated monthly hospitalization rates for COVID-19 from May 2020 through April 2021 for the 50 states using surveillance data from the COVID-19-Associated Hospitalization Surveillance Network (COVID-NET) and a Bayesian hierarchical model for extrapolation. Hospitalization rates were calculated from patients hospitalized with a lab-confirmed SARS-CoV-2 test during or within 14 days before admission. We created a model for 6 age groups (0-17, 18-49, 50-64, 65-74, 75-84, and ≥85 years) separately. We identified covariates from multiple data sources that varied by age, state, and month and performed covariate selection for each age group based on 2 methods, Least Absolute Shrinkage and Selection Operator (LASSO) and spike and slab selection methods. 
We validated our method by checking the sensitivity of model estimates to covariate selection and model extrapolation as well as comparing our results to external data. Results: We estimated 3,583,100 (90% credible interval [CrI] 3,250,500-3,945,400) hospitalizations for a cumulative incidence of 1093.9 (992.4-1204.6) hospitalizations per 100,000 population with COVID-19 in the United States from May 2020 through April 2021. Cumulative incidence varied from 359 to 1856 per 100,000 between states. The age group with the highest cumulative incidence was those aged ≥85 years (5575.6; 90% CrI 5066.4-6133.7). The monthly hospitalization rate was highest in December (183.7; 90% CrI 154.3-217.4). Our monthly estimates by state showed variations in magnitudes of peak rates, number of peaks, and timing of peaks between states. Conclusions: Our novel approach to estimate hospitalizations for COVID-19 has potential to provide sustainable estimates for monitoring COVID-19 burden as well as a flexible framework leveraging surveillance data. UR - https://publichealth.jmir.org/2022/6/e34296 UR - http://dx.doi.org/10.2196/34296 UR - http://www.ncbi.nlm.nih.gov/pubmed/35452402 ID - info:doi/10.2196/34296 ER - TY - JOUR AU - Adhikari, KJ Neill AU - Pinto, Ruxandra AU - Day, G. Andrew AU - Masse, Marie-Hélène AU - Ménard, Julie AU - Sprague, Sheila AU - Annane, Djillali AU - Arabi, M. Yaseen AU - Battista, Marie-Claude AU - Cohen, Dian AU - Cook, J. Deborah AU - Guyatt, H. Gordon AU - Heyland, K. Daren AU - Kanji, Salmaan AU - McGuinness, P. Shay AU - Parke, L. Rachael AU - Tirupakuzhi Vijayaraghavan, Kumar Bharath AU - Charbonney, Emmanuel AU - Chassé, Michaël AU - Del Sorbo, Lorenzo AU - Kutsogiannis, James Demetrios AU - Lauzier, François AU - Leblanc, Rémi AU - Maslove, M. David AU - Mehta, Sangeeta AU - Mekontso Dessap, Armand AU - Mele, S. Tina AU - Rochwerg, Bram AU - Rewa, G. 
Oleksa AU - Shahin, Jason AU - Twardowski, Pawel AU - Young, Jeffrey Paul AU - Lamontagne, François AU - PY - 2022/5/20 TI - Lessening Organ Dysfunction With Vitamin C (LOVIT) Trial: Statistical Analysis Plan JO - JMIR Res Protoc SP - e36261 VL - 11 IS - 5 KW - sepsis KW - vitamin C KW - statistical analysis KW - organ KW - ascorbic acid KW - critical care KW - organ dysfunction KW - intensive care unit KW - intensive care KW - patient KW - vasopressor KW - infection KW - intravenous KW - health data KW - trial database KW - patient outcome KW - mortality KW - statistical framework KW - binomial distribution N2 - Background: The LOVIT (Lessening Organ Dysfunction with Vitamin C) trial is a blinded multicenter randomized clinical trial comparing high-dose intravenous vitamin C to placebo in patients admitted to the intensive care unit with proven or suspected infection as the main diagnosis and receiving a vasopressor. Objective: We aim to describe a prespecified statistical analysis plan (SAP) for the LOVIT trial prior to unblinding and locking of the trial database. Methods: The SAP was designed by the LOVIT principal investigators and statisticians, and approved by the steering committee and coinvestigators. The SAP defines the primary and secondary outcomes, and describes the planned primary, secondary, and subgroup analyses. Results: The SAP includes a draft participant flow diagram, tables, and planned figures. The primary outcome is a composite of mortality and persistent organ dysfunction (receipt of mechanical ventilation, vasopressors, or new renal replacement therapy) at 28 days, where day 1 is the day of randomization. All analyses will use a frequentist statistical framework. The analysis of the primary outcome will estimate the risk ratio and 95% CI in a generalized linear mixed model with binomial distribution and log link, with site as a random effect. We will perform a secondary analysis adjusting for prespecified baseline clinical variables. 
Subgroup analyses will include age, sex, frailty, severity of illness, Sepsis-3 definition of septic shock, baseline ascorbic acid level, and COVID-19 status. Conclusions: We have developed an SAP for the LOVIT trial and will adhere to it in the analysis phase. International Registered Report Identifier (IRRID): DERR1-10.2196/36261 UR - https://www.researchprotocols.org/2022/5/e36261 UR - http://dx.doi.org/10.2196/36261 UR - http://www.ncbi.nlm.nih.gov/pubmed/35420994 ID - info:doi/10.2196/36261 ER - TY - JOUR AU - Chen, Uan-I AU - Xu, Hua AU - Krause, Millard Trudy AU - Greenberg, Raymond AU - Dong, Xiao AU - Jiang, Xiaoqian PY - 2022/5/12 TI - Factors Associated With COVID-19 Death in the United States: Cohort Study JO - JMIR Public Health Surveill SP - e29343 VL - 8 IS - 5 KW - COVID-19 KW - risk factors KW - survival analysis KW - cohort studies KW - EHR data N2 - Background: Since the initial COVID-19 cases were identified in the United States in February 2020, the United States has experienced a high incidence of the disease. Understanding the risk factors for severe outcomes identifies the most vulnerable populations and helps in decision-making. Objective: This study aims to assess the factors associated with COVID-19-related deaths from a large, national, individual-level data set. Methods: A cohort study was conducted using data from the Optum de-identified COVID-19 electronic health record (EHR) data set; 1,271,033 adult participants were observed from February 1, 2020, to August 31, 2020, until their deaths due to COVID-19, deaths due to other reasons, or the end of the study. Cox proportional hazards models were constructed to evaluate the risks for each patient characteristic. Results: A total of 1,271,033 participants (age: mean 52.6, SD 17.9 years; male: 507,574/1,271,033, 39.93%) were included in the study, and 3315 (0.26%) deaths were attributed to COVID-19. 
Factors associated with COVID-19-related death included older age (≥80 vs 50-59 years old: hazard ratio [HR] 13.28, 95% CI 11.46-15.39), male sex (HR 1.68, 95% CI 1.57-1.80), obesity (BMI ≥40 vs <30 kg/m²: HR 1.71, 95% CI 1.50-1.96), race (Hispanic White, African American, Asian vs non-Hispanic White: HR 2.46, 95% CI 2.01-3.02; HR 2.27, 95% CI 2.06-2.50; HR 2.06, 95% CI 1.65-2.57), region (South, Northeast, Midwest vs West: HR 1.62, 95% CI 1.33-1.98; HR 2.50, 95% CI 2.06-3.03; HR 1.35, 95% CI 1.11-1.64), chronic respiratory disease (HR 1.21, 95% CI 1.12-1.32), cardiac disease (HR 1.10, 95% CI 1.01-1.19), diabetes (HR 1.92, 95% CI 1.75-2.10), recent diagnosis of lung cancer (HR 1.70, 95% CI 1.14-2.55), severely reduced kidney function (HR 1.92, 95% CI 1.69-2.19), stroke or dementia (HR 1.25, 95% CI 1.15-1.36), other neurological diseases (HR 1.77, 95% CI 1.59-1.98), organ transplant (HR 1.35, 95% CI 1.09-1.67), and other immunosuppressive conditions (HR 1.21, 95% CI 1.01-1.46). Conclusions: This is one of the largest national cohort studies in the United States; we identified several patient characteristics associated with COVID-19-related deaths, and the results can serve as the basis for policy making. The study also offered directions for future studies, including the effect of other socioeconomic factors on the increased risk for minority groups. 
UR - https://publichealth.jmir.org/2022/5/e29343 UR - http://dx.doi.org/10.2196/29343 UR - http://www.ncbi.nlm.nih.gov/pubmed/35377319 ID - info:doi/10.2196/29343 ER - TY - JOUR AU - Fränti, Pasi AU - Sieranoja, Sami AU - Wikström, Katja AU - Laatikainen, Tiina PY - 2022/5/4 TI - Clustering Diagnoses From 58 Million Patient Visits in Finland Between 2015 and 2018 JO - JMIR Med Inform SP - e35422 VL - 10 IS - 5 KW - multimorbidity KW - cluster analysis KW - disease co-occurrence KW - multimorbidity network KW - health care data analysis KW - graph clustering KW - k-means KW - data analysis KW - cluster KW - machine learning KW - comorbidity KW - register KW - big data KW - Finland KW - Europe KW - health record N2 - Background: Multiple chronic diseases in patients are a major burden on the health service system. Currently, diseases are mostly treated separately without paying sufficient attention to their relationships, which results in the fragmentation of the care process. The better integration of services can lead to the more effective organization of the overall health care system. Objective: This study aimed to analyze the connections between diseases based on their co-occurrences to support decision-makers in better organizing health care services. Methods: We performed a cluster analysis of diagnoses by using data from the Finnish Health Care Registers for primary and specialized health care visits and inpatient care. The target population of this study comprised those 3.8 million individuals (3,835,531/5,487,308, 69.90% of the whole population) aged ≥18 years who used health care services from the years 2015 to 2018. They had a total of 58 million visits. Clustering was performed based on the co-occurrence of diagnoses. The more the same pair of diagnoses appeared in the records of the same patients, the more the diagnoses correlated with each other. 
On the basis of the co-occurrences, we calculated the relative risk of each pair of diagnoses and clustered the data by using a graph-based clustering algorithm called the M-algorithm, a variant of k-means. Results: The results revealed multimorbidity clusters, of which some were expected (eg, one representing hypertensive and cardiovascular diseases). Other clusters were more unexpected, such as the cluster containing lower respiratory tract diseases and systemic connective tissue disorders. The annual cost of all clusters was €10.0 billion, and the costliest cluster was cardiovascular and metabolic problems, costing €2.3 billion. Conclusions: The method and the achieved results provide new insights into identifying key multimorbidity groups, especially those resulting in burden and costs in health care services. UR - https://medinform.jmir.org/2022/5/e35422 UR - http://dx.doi.org/10.2196/35422 UR - http://www.ncbi.nlm.nih.gov/pubmed/35507390 ID - info:doi/10.2196/35422 ER - TY - JOUR AU - Rovetta, Alessandro AU - Bhagavathula, Srikanth Akshaya PY - 2022/4/7 TI - The Impact of COVID-19 on Mortality in Italy: Retrospective Analysis of Epidemiological Trends JO - JMIR Public Health Surveill SP - e36022 VL - 8 IS - 4 KW - COVID-19 KW - deniers KW - excess deaths KW - epidemiology KW - infodemic KW - infodemiology KW - Italy KW - longitudinal analysis KW - mortality KW - time series KW - pandemic KW - public health N2 - Background: Despite the available evidence on its severity, COVID-19 has often been compared with seasonal flu by some conspirators and even scientists. Various public discussions arose about the noncausal correlation between COVID-19 and the observed deaths during the pandemic period in Italy. Objective: This paper aimed to search for endogenous reasons for the mortality increase recorded in Italy during 2020 to test this controversial hypothesis. Furthermore, we provide a framework for epidemiological analyses of time series. 
Methods: We analyzed deaths by age, sex, region, and cause of death in Italy from 2011 to 2019. Ordinary least squares (OLS) linear regression analyses and autoregressive integrated moving average (ARIMA) were used to predict the best value for 2020. A Grubbs 1-sided test was used to assess the significance of the difference between predicted and observed 2020 deaths/mortality. Finally, a 1-sample t test was used to compare the population of regional excess deaths to a null mean. The relationship between mortality and predictive variables was assessed using OLS multiple regression models. Since there is no uniform opinion on multicomparison adjustment and false negatives imply great epidemiological risk, the less-conservative Siegel approach and more-conservative Holm-Bonferroni approach were employed. By doing so, we provided the reader with the means to carry out an independent analysis. Results: Both ARIMA and OLS linear regression models predicted the number of deaths in Italy during 2020 to be between 640,000 and 660,000 (range of 95% CIs: 620,000-695,000) against the observed value of above 750,000. We found strong evidence supporting that the death increase in all regions (average excess=12.2%) was not due to chance (t21=7.2; adjusted P<.001). Male and female national mortality excesses were 18.4% (P<.001; adjusted P=.006) and 14.1% (P=.005; adjusted P=.12), respectively. However, we found limited significance when comparing male and female mortality residuals using the Mann-Whitney U test (P=.27; adjusted P=.99). Finally, mortality was strongly and positively correlated with latitude (R=0.82; adjusted P<.001). In this regard, the significance of the mortality increases during 2020 varied greatly from region to region. Lombardy recorded the highest mortality increase (38% for men, adjusted P<.001; 31% for women, P<.001; adjusted P=.006). 
Conclusions: Our findings support the absence of historical endogenous reasons capable of justifying the mortality increase observed in Italy during 2020. Together with the current knowledge on SARS-CoV-2, these results provide decisive evidence on the devastating impact of COVID-19. We suggest that this research be leveraged by government, health, and information authorities to furnish proof against conspiracy hypotheses that minimize COVID-19-related risks. Finally, given the marked concordance between ARIMA and OLS regression, we suggest that these models be exploited for public health surveillance. Specifically, meaningful information can be deduced by comparing predicted and observed epidemiological trends. UR - https://publichealth.jmir.org/2022/4/e36022 UR - http://dx.doi.org/10.2196/36022 UR - http://www.ncbi.nlm.nih.gov/pubmed/35238784 ID - info:doi/10.2196/36022 ER - TY - JOUR AU - Zhou, Lexin AU - Romero-García, Nekane AU - Martínez-Miranda, Juan AU - Conejero, Alberto J. AU - García-Gómez, M. Juan AU - Sáez, Carlos PY - 2022/3/30 TI - Subphenotyping of Mexican Patients With COVID-19 at Preadmission To Anticipate Severity Stratification: Age-Sex Unbiased Meta-Clustering Technique JO - JMIR Public Health Surveill SP - e30032 VL - 8 IS - 3 KW - COVID-19 KW - subphenotypes KW - clustering KW - characterization KW - observational KW - epidemiology KW - Mexico N2 - Background: The COVID-19 pandemic has led to an unprecedented global health care challenge for both medical institutions and researchers. Recognizing different COVID-19 subphenotypes (the division of populations of patients into more meaningful subgroups driven by clinical features) and their severity characterization may assist clinicians during the clinical course, the vaccination process, research efforts, the surveillance system, and the allocation of limited resources. 
Objective: We aimed to discover age-sex unbiased COVID-19 patient subphenotypes based on easily available phenotypical data before admission, such as pre-existing comorbidities, lifestyle habits, and demographic features, to study the potential early severity stratification capabilities of the discovered subgroups through characterizing their severity patterns, including prognostic, intensive care unit (ICU), and morbimortality outcomes. Methods: We used the Mexican Government COVID-19 open data, including 778,692 SARS-CoV-2 population-based patient-level data as of September 2020. We applied a meta-clustering technique that consists of a 2-stage clustering approach combining dimensionality reduction (ie, principal components analysis and multiple correspondence analysis) and hierarchical clustering using the Ward minimum variance method with Euclidean squared distance. Results: In the independent age-sex clustering analyses, 56 clusters supported 11 clinically distinguishable meta-clusters (MCs). MCs 1-3 showed high recovery rates (90.27%-95.22%), including healthy patients of all ages, children with comorbidities and priority in receiving medical resources (ie, higher rates of hospitalization, intubation, and ICU admission) compared with other adult subgroups that have similar conditions, and young obese smokers. MCs 4-5 showed moderate recovery rates (81.30%-82.81%), including patients with hypertension or diabetes of all ages and obese patients with pneumonia, hypertension, and diabetes. MCs 6-11 showed low recovery rates (53.96%-66.94%), including immunosuppressed patients with high comorbidity rates, patients with chronic kidney disease with a poor survival length and probability of recovery, older smokers with chronic obstructive pulmonary disease, older adults with severe diabetes and hypertension, and the oldest obese smokers with chronic obstructive pulmonary disease and mild cardiovascular disease. 
Group outcomes conformed to the recent literature on dedicated age-sex groups. Mexican states and several types of clinical institutions showed relevant heterogeneity regarding severity, potentially linked to socioeconomic or health inequalities. Conclusions: The proposed 2-stage cluster analysis methodology produced a discriminative characterization of the sample and explainability over age and sex. These results can potentially help in understanding the clinical patient and their stratification for automated early triage before further tests and laboratory results are available and even in locations where additional tests are not available or to help decide resource allocation among vulnerable subgroups such as to prioritize vaccination or treatments. UR - https://publichealth.jmir.org/2022/3/e30032 UR - http://dx.doi.org/10.2196/30032 UR - http://www.ncbi.nlm.nih.gov/pubmed/35144239 ID - info:doi/10.2196/30032 ER - TY - JOUR AU - Li, Yuanyuan AU - Xu, Junfang AU - Gu, Yuxuan AU - Sun, Xueshan AU - Dong, Hengjin AU - Chen, Changgui PY - 2022/3/2 TI - The Disease and Economic Burdens of Esophageal Cancer in China from 2013 to 2030: Dynamic Cohort Modeling Study JO - JMIR Public Health Surveill SP - e33191 VL - 8 IS - 3 KW - esophageal cancer KW - disease burden KW - disability-adjusted life year KW - economic burden N2 - Background: Esophageal cancer (EC) is the sixth leading cause of tumor-related deaths worldwide. Estimates of the EC burden are necessary and could offer evidence-based suggestions for local cancer control. Objective: The aim of this study was to predict the disease burden of EC in China through the estimation of disability-adjusted life years (DALYs) and direct medical expenditure by sex from 2013 to 2030. Methods: A dynamic cohort Markov model was developed to simulate EC prevalence, DALYs, and direct medical expenditure by sex. 
Input data were collected from the China Statistical Yearbooks, Statistical Report of China Children's Development, World Population Prospects 2019, and published papers. The JoinPoint Regression Program was used to calculate the average annual percentage change (AAPC) of DALY rates, whereas the average annual growth rate (AAGR) was applied to analyze the changing direct medical expenditure trend over time. Results: From 2013 to 2030, the predicted EC prevalence is projected to increase from 61.0 to 64.5 per 100,000 people, with annual EC cases increasing by 11.5% (from 835,600 to 931,800). The DALYs will increase by 21.3% (from 30,034,000 to 36,444,000), and the years of life lost (YLL) will account for over 90% of the DALYs. The DALY rates per 100,000 people will increase from 219.2 to 252.3; however, there was a difference between sexes, with an increase from 302.9 to 384.3 in males and a decline from 131.2 to 115.9 in females. The AAPC was 0.8% (95% CI 0.8% to 0.9%), 1.4% (95% CI 1.3% to 1.5%), and -0.7% (95% CI -0.8% to -0.7%) for both sexes, males, and females, respectively. The direct medical expenditure will increase by 128.7% (from US $33.4 to US $76.4 billion), with an AAGR of 5.0%. The direct medical expenditure is 2-3 times higher in males than in females. Conclusions: EC still causes severe disease and economic burdens. YLL are responsible for the majority of DALYs, which highlights an urgent need to establish a beneficial policy to reduce the EC burden. 
UR - https://publichealth.jmir.org/2022/3/e33191 UR - http://dx.doi.org/10.2196/33191 UR - http://www.ncbi.nlm.nih.gov/pubmed/34963658 ID - info:doi/10.2196/33191 ER - TY - JOUR AU - Wang, Liya AU - Qiu, Hang AU - Luo, Li AU - Zhou, Li PY - 2022/2/25 TI - Age- and Sex-Specific Differences in Multimorbidity Patterns and Temporal Trends on Assessing Hospital Discharge Records in Southwest China: Network-Based Study JO - J Med Internet Res SP - e27146 VL - 24 IS - 2 KW - multimorbidity pattern KW - temporal trend KW - network analysis KW - multimorbidity prevalence KW - administrative data KW - longitudinal study KW - regional research N2 - Background: Multimorbidity represents a global health challenge, which requires a more global understanding of multimorbidity patterns and trends. However, the majority of studies completed to date have often relied on self-reported conditions, and a simultaneous assessment of the entire spectrum of chronic disease co-occurrence, especially in developing regions, has not yet been performed. Objective: We attempted to provide a multidimensional approach to understand the full spectrum of chronic disease co-occurrence among general inpatients in southwest China, in order to investigate multimorbidity patterns and temporal trends, and assess their age and sex differences. Methods: We conducted a retrospective cohort analysis based on 8.8 million hospital discharge records of about 5.0 million individuals of all ages from 2015 to 2019 in a megacity in southwest China. We examined all chronic diagnoses using the ICD-10 (International Classification of Diseases, 10th revision) codes at 3 digits and focused on chronic diseases with ≥1% prevalence for each of the age and sex strata, which resulted in a total of 149 and 145 chronic diseases in males and females, respectively. We constructed multimorbidity networks in the general population based on sex and age, and used the cosine index to measure the co-occurrence of chronic diseases. 
Then, we divided the networks into communities and assessed their temporal trends. Results: The results showed complex interactions among chronic diseases, with more intensive connections among males and inpatients ≥40 years old. A total of 9 chronic diseases were simultaneously classified as central diseases, hubs, and bursts in the multimorbidity networks. Among them, 5 diseases were common to both males and females, including hypertension, chronic ischemic heart disease, cerebral infarction, other cerebrovascular diseases, and atherosclerosis. The earliest leaps (degree leaps ≥6) appeared at a disorder of glycoprotein metabolism that happened at 25-29 years in males, about 15 years earlier than in females. The number of chronic diseases in the community increased over time, but the new entrants did not replace the root of the community. Conclusions: Our multimorbidity network analysis identified specific differences in the co-occurrence of chronic diagnoses by sex and age, which could help in the design of clinical interventions for inpatient multimorbidity. UR - https://www.jmir.org/2022/2/e27146 UR - http://dx.doi.org/10.2196/27146 UR - http://www.ncbi.nlm.nih.gov/pubmed/35212632 ID - info:doi/10.2196/27146 ER - TY - JOUR AU - Oehmke, B. Theresa AU - Moss, B. Charles AU - Oehmke, F. 
James PY - 2022/2/24 TI - COVID-19 Surveillance Updates in US Metropolitan Areas: Dynamic Panel Data Modeling JO - JMIR Public Health Surveill SP - e28737 VL - 8 IS - 2 KW - surveillance system KW - COVID-19 KW - coronavirus KW - Sars-CoV-2 KW - Houston KW - dynamic panel data model KW - speed KW - jerk KW - acceleration KW - 7-Day persistence KW - modeling KW - data KW - surveillance KW - monitoring KW - public health KW - United States KW - transmission KW - response N2 - Background: Despite the availability of vaccines, the US incidence of new COVID-19 cases per day nearly doubled from the beginning of July to the end of August 2021, fueled largely by the rapid spread of the Delta variant. While the "Delta wave" appears to have peaked nationally, some states and municipalities continue to see elevated numbers of new cases. Vigilant surveillance including at a metropolitan level can help identify any reignition and validate continued and strong public health policy responses in problem localities. Objective: This surveillance report aimed to provide up-to-date information for the 25 largest US metropolitan areas about the rapidity of descent in the number of new cases following the Delta wave peak, as well as any potential reignition of the pandemic associated with declining vaccine effectiveness over time, new variants, or other factors. Methods: COVID-19 pandemic dynamics for the 25 largest US metropolitan areas were analyzed through September 19, 2021, using novel metrics of speed, acceleration, jerk, and 7-day persistence, calculated from the observed data on the cumulative number of cases as reported by USAFacts. Statistical analysis was conducted using dynamic panel data models estimated with the Arellano-Bond regression techniques. The results are presented in tabular and graphic forms for visual interpretation. 
Results: On average, speed in the 25 largest US metropolitan areas declined from 34 new cases per day per 100,000 population, during the week ending August 15, 2021, to 29 new cases per day per 100,000 population, during the week ending September 19, 2021. This average masks important differences across metropolitan areas. For example, Miami's speed decreased from 105 for the week ending August 15, 2021, to 40 for the week ending September 19, 2021. Los Angeles, San Francisco, Riverside, and San Diego had decreasing speed over the sample period and ended with single-digit speeds for the week ending September 19, 2021. However, Boston, Washington DC, Detroit, Minneapolis, Denver, and Charlotte all had their highest speed of the sample during the week ending September 19, 2021. These cities, as well as Houston and Baltimore, had positive acceleration for the week ending September 19, 2021. Conclusions: There is great variation in epidemiological curves across US metropolitan areas, including increasing numbers of new cases in 8 of the largest 25 metropolitan areas for the week ending September 19, 2021. These trends, including the possibility of waning vaccine effectiveness and the emergence of resistant variants, strongly indicate the need for continued surveillance and perhaps a return to more restrictive public health guidelines for some areas. UR - https://publichealth.jmir.org/2022/2/e28737 UR - http://dx.doi.org/10.2196/28737 UR - http://www.ncbi.nlm.nih.gov/pubmed/34882569 ID - info:doi/10.2196/28737 ER - TY - JOUR AU - Postill, Gemma AU - Murray, Regan AU - Wilton, S. Andrew AU - Wells, A. Richard AU - Sirbu, Renee AU - Daley, J. 
Mark AU - Rosella, Laura PY - 2022/2/21 TI - The Use of Cremation Data for Timely Mortality Surveillance During the COVID-19 Pandemic in Ontario, Canada: Validation Study JO - JMIR Public Health Surveill SP - e32426 VL - 8 IS - 2 KW - excess deaths KW - real-time mortality KW - cremation KW - COVID-19 KW - SARS-CoV-2 KW - mortality KW - estimate KW - impact KW - public health KW - validation KW - pattern KW - trend KW - utility KW - Canada KW - mortality data KW - pandemic KW - death KW - cremation data KW - cause of death KW - vital statistics KW - excess mortality N2 - Background: Early estimates of excess mortality are crucial for understanding the impact of COVID-19. However, there is a lag of several months in the reporting of vital statistics mortality data for many jurisdictions, including across Canada. In Ontario, a Canadian province, certification by a coroner is required before cremation can occur, creating real-time mortality data that encompasses the majority of deaths within the province. Objective: This study aimed to validate the use of cremation data as a timely surveillance tool for all-cause mortality during a public health emergency in a jurisdiction with delays in vital statistics data. Specifically, this study aimed to validate this surveillance tool by determining the stability, timeliness, and robustness of its real-time estimation of all-cause mortality. Methods: Cremation records from January 2020 until April 2021 were compared to the historical records from 2017 to 2019, grouped according to week, age, sex, and whether COVID-19 was the cause of death. Cremation data were compared to Ontario's provisional vital statistics mortality data released by Statistics Canada. The 2020 and 2021 records were then compared to previous years (2017-2019) to determine whether there was excess mortality within various age groups and whether deaths attributed to COVID-19 accounted for the entirety of the excess mortality. 
Results: Between 2017 and 2019, cremations were performed for 67.4% (95% CI 67.3%-67.5%) of deaths. The proportion of cremated deaths remained stable throughout 2020, even within age and sex categories. Cremation records are 99% complete within 3 weeks of the date of death, which precedes the compilation of vital statistics data by several months. Consequently, during the first wave (from April to June 2020), cremation records detected a 16.9% increase (95% CI 14.6%-19.3%) in all-cause mortality, a finding that was confirmed several months later with cremation data. Conclusions: The percentage of Ontarians cremated and the completion of cremation data several months before vital statistics did not change meaningfully during the COVID-19 pandemic period, establishing that the pandemic did not significantly alter cremation practices. Cremation data can be used to accurately estimate all-cause mortality in near real-time, particularly when real-time mortality estimates are needed to inform policy decisions for public health measures. The accuracy of this excess mortality estimation was confirmed by comparing it with official vital statistics data. These findings demonstrate the utility of cremation data as a complementary data source for timely mortality information during public health emergencies. UR - https://publichealth.jmir.org/2022/2/e32426 UR - http://dx.doi.org/10.2196/32426 UR - http://www.ncbi.nlm.nih.gov/pubmed/35038302 ID - info:doi/10.2196/32426 ER - TY - JOUR AU - Lundberg, L. Alexander AU - Lorenzo-Redondo, Ramon AU - Ozer, A. Egon AU - Hawkins, A. Claudia AU - Hultquist, F. Judd AU - Welch, B. Sarah AU - Prasad, Vara P. V. AU - Oehmke, F. James AU - Achenbach, J. Chad AU - Murphy, L. Robert AU - White, I. Janine AU - Havey, J. Robert AU - Post, Ann Lori PY - 2022/1/31 TI - Has Omicron Changed the Evolution of the Pandemic? 
JO - JMIR Public Health Surveill SP - e35763 VL - 8 IS - 1 KW - Omicron KW - SARS-CoV-2 KW - public health surveillance KW - VOC KW - variant of concern KW - Delta KW - Beta KW - COVID-19 KW - sub-Saharan Africa KW - public health KW - pandemic KW - epidemiology N2 - Background: Variants of the SARS-CoV-2 virus carry differential risks to public health. The Omicron (B.1.1.529) variant, first identified in Botswana on November 11, 2021, has spread globally faster than any previous variant of concern. Understanding the transmissibility of Omicron is vital in the development of public health policy. Objective: The aim of this study is to compare SARS-CoV-2 outbreaks driven by Omicron to those driven by prior variants of concern in terms of both the speed and magnitude of an outbreak. Methods: We analyzed trends in outbreaks by variant of concern with validated surveillance metrics in several southern African countries. The region offers an ideal setting for a natural experiment given that most outbreaks thus far have been driven primarily by a single variant at a time. With a daily longitudinal data set of new infections, total vaccinations, and cumulative infections in countries in sub-Saharan Africa, we estimated how the emergence of Omicron has altered the trajectory of SARS-CoV-2 outbreaks. We used the Arellano-Bond method to estimate regression coefficients from a dynamic panel model, in which new infections are a function of infections yesterday and last week. We controlled for vaccinations and prior infections in the population. To test whether Omicron has changed the average trajectory of a SARS-CoV-2 outbreak, we included an interaction between an indicator variable for the emergence of Omicron and lagged infections. Results: The observed Omicron outbreaks in this study reach the outbreak threshold within 5-10 days after first detection, whereas other variants of concern have taken at least 14 days and up to as many as 35 days. 
The Omicron outbreaks also reach peak rates of new cases that are roughly 1.5-2 times those of prior variants of concern. Dynamic panel regression estimates confirm Omicron has created a statistically significant shift in viral spread. Conclusions: The transmissibility of Omicron is markedly higher than prior variants of concern. At the population level, the Omicron outbreaks occurred more quickly and with larger magnitude, despite substantial increases in vaccinations and prior infections, which should have otherwise reduced susceptibility to new infections. Unless public health policies are substantially altered, Omicron outbreaks in other countries are likely to occur with little warning. UR - https://publichealth.jmir.org/2022/1/e35763 UR - http://dx.doi.org/10.2196/35763 UR - http://www.ncbi.nlm.nih.gov/pubmed/35072638 ID - info:doi/10.2196/35763 ER - TY - JOUR AU - Zimba, Rebecca AU - Romo, L. Matthew AU - Kulkarni, G. Sarah AU - Berry, Amanda AU - You, William AU - Mirzayi, Chloe AU - Westmoreland, A. Drew AU - Parcesepe, M. Angela AU - Waldron, Levi AU - Rane, S. Madhura AU - Kochhar, Shivani AU - Robertson, M. McKaylee AU - Maroko, R. Andrew AU - Grov, Christian AU - Nash, Denis PY - 2021/12/30 TI - Patterns of SARS-CoV-2 Testing Preferences in a National Cohort in the United States: Latent Class Analysis of a Discrete Choice Experiment JO - JMIR Public Health Surveill SP - e32846 VL - 7 IS - 12 KW - SARS-CoV-2 KW - testing KW - discrete choice experiment KW - latent class analysis KW - COVID-19 KW - pattern KW - trend KW - preference KW - cohort KW - United States KW - discrete choice KW - diagnostic KW - transmission KW - vaccine KW - uptake KW - public health N2 - Background: Inadequate screening and diagnostic testing in the United States throughout the first several months of the COVID-19 pandemic led to undetected cases transmitting disease in the community and an underestimation of cases. 
Though testing supply has increased, maintaining testing uptake remains a public health priority in the efforts to control community transmission considering the availability of vaccinations and threats from variants. Objective: This study aimed to identify patterns of preferences for SARS-CoV-2 screening and diagnostic testing prior to widespread vaccine availability and uptake. Methods: We conducted a discrete choice experiment (DCE) among participants in the national, prospective CHASING COVID (Communities, Households, and SARS-CoV-2 Epidemiology) Cohort Study from July 30 to September 8, 2020. The DCE elicited preferences for SARS-CoV-2 test type, specimen type, testing venue, and result turnaround time. We used latent class multinomial logit to identify distinct patterns of preferences related to testing as measured by attribute-level part-worth utilities and conducted a simulation based on the utility estimates to predict testing uptake if additional testing scenarios were offered. Results: Of the 5098 invited cohort participants, 4793 (94.0%) completed the DCE. Five distinct patterns of SARS-CoV-2 testing emerged. Noninvasive home testers (n=920, 19.2% of participants) were most influenced by specimen type and favored less invasive specimen collection methods, with saliva being most preferred; this group was the least likely to opt out of testing. Fast-track testers (n=1235, 25.8%) were most influenced by result turnaround time and favored immediate and same-day turnaround time. Among dual testers (n=889, 18.5%), test type was the most important attribute, and preference was given to both antibody and viral tests. Noninvasive dual testers (n=1578, 32.9%) were most strongly influenced by specimen type and test type, preferring saliva and cheek swab specimens and both antibody and viral tests. Among hesitant home testers (n=171, 3.6%), the venue was the most important attribute; notably, this group was the most likely to opt out of testing. 
In addition to variability in preferences for testing features, heterogeneity was observed in the distribution of certain demographic characteristics (age, race/ethnicity, education, and employment), history of SARS-CoV-2 testing, COVID-19 diagnosis, and concern about the pandemic. Simulation models predicted that testing uptake would increase from 81.6% (with a status quo scenario of polymerase chain reaction by nasal swab in a provider's office and a turnaround time of several days) to 98.1% by offering additional scenarios using less invasive specimens, both viral and antibody tests from a single specimen, faster turnaround time, and at-home testing. Conclusions: We identified substantial differences in preferences for SARS-CoV-2 testing and found that offering additional testing options would likely increase testing uptake in line with public health goals. Additional studies may be warranted to understand if preferences for testing have changed since the availability and widespread uptake of vaccines. UR - https://publichealth.jmir.org/2021/12/e32846 UR - http://dx.doi.org/10.2196/32846 UR - http://www.ncbi.nlm.nih.gov/pubmed/34793320 ID - info:doi/10.2196/32846 ER - TY - JOUR AU - Husnayain, Atina AU - Shim, Eunha AU - Fuad, Anis AU - Su, Chia-Yu Emily PY - 2021/12/22 TI - Predicting New Daily COVID-19 Cases and Deaths Using Search Engine Query Data in South Korea From 2020 to 2021: Infodemiology Study JO - J Med Internet Res SP - e34178 VL - 23 IS - 12 KW - prediction KW - internet search KW - COVID-19 KW - South Korea KW - infodemiology N2 - Background: Given the ongoing COVID-19 pandemic situation, accurate predictions could greatly help in the health resource management for future waves. However, as a new entity, COVID-19's disease dynamics seemed difficult to predict. External factors, such as internet search data, need to be included in the models to increase their accuracy. 
However, it remains unclear whether incorporating online search volumes into models leads to better predictive performances for long-term prediction. Objective: The aim of this study was to analyze whether search engine query data are important variables that should be included in the models predicting new daily COVID-19 cases and deaths in short- and long-term periods. Methods: We used country-level case-related data, NAVER search volumes, and mobility data obtained from Google and Apple for the period of January 20, 2020, to July 31, 2021, in South Korea. Data were aggregated into four subsets: 3, 6, 12, and 18 months after the first case was reported. The first 80% of the data in all subsets were used as the training set, and the remaining data served as the test set. Generalized linear models (GLMs) with normal, Poisson, and negative binomial distribution were developed, along with linear regression (LR) models with lasso, adaptive lasso, and elastic net regularization. Root mean square error values were defined as a loss function and were used to assess the performance of the models. All analyses and visualizations were conducted in SAS Studio, which is part of the SAS OnDemand for Academics. Results: GLMs with different types of distribution functions may have been beneficial in predicting new daily COVID-19 cases and deaths in the early stages of the outbreak. Over longer periods, as the distribution of cases and deaths became more normally distributed, LR models with regularization may have outperformed the GLMs. This study also found that models performed better when predicting new daily deaths compared to new daily cases. In addition, an evaluation of feature effects in the models showed that NAVER search volumes were useful variables in predicting new daily COVID-19 cases, particularly in the first 6 months of the outbreak. Searches related to logistical needs, particularly for "thermometer" and "mask strap," showed higher feature effects in that period. 
For longer prediction periods, NAVER search volumes were still found to constitute an important variable, although with a lower feature effect. This finding suggests that search term use should be considered to maintain the predictive performance of models. Conclusions: NAVER search volumes were important variables in short- and long-term prediction, with higher feature effects for predicting new daily COVID-19 cases in the first 6 months of the outbreak. Similar results were also found for death predictions. UR - https://www.jmir.org/2021/12/e34178 UR - http://dx.doi.org/10.2196/34178 UR - http://www.ncbi.nlm.nih.gov/pubmed/34762064 ID - info:doi/10.2196/34178 ER - TY - JOUR AU - Taira, Kazuya AU - Hosokawa, Rikuya AU - Itatani, Tomoya AU - Fujita, Sumio PY - 2021/12/3 TI - Predicting the Number of Suicides in Japan Using Internet Search Queries: Vector Autoregression Time Series Model JO - JMIR Public Health Surveill SP - e34016 VL - 7 IS - 12 KW - suicide KW - internet search engine KW - infoveillance KW - query KW - time series analysis KW - vector autoregression model KW - COVID-19 KW - suicide-related terms KW - internet KW - information seeking KW - time series KW - model KW - loneliness KW - mental health KW - prediction KW - Japan KW - behavior KW - trend N2 - Background: The number of suicides in Japan increased during the COVID-19 pandemic. Predicting the number of suicides is important to take timely preventive measures. Objective: This study aims to clarify whether the number of suicides can be predicted by suicide-related search queries used before searching for the keyword "suicide." Methods: This study uses the infoveillance approach for suicide in Japan by search trends in search engines. The monthly number of suicides by gender, collected and published by the National Police Agency, was used as an outcome variable. The number of searches by gender with queries associated with "suicide" on "Yahoo! JAPAN Search" 
from January 2016 to December 2020 was used as a predictive variable. The following five phrases highly relevant to suicide were used as search terms before searching for the keyword "suicide" and extracted and used for analyses: "abuse"; "work, don't want to go"; "company, want to quit"; "divorce"; and "no money." The augmented Dickey-Fuller and Johansen tests were performed for the original series and to verify the existence of unit roots and cointegration for each variable, respectively. The vector autoregression model was applied to predict the number of suicides. The Breusch-Godfrey Lagrangian multiplier (BG-LM) test, autoregressive conditional heteroskedasticity Lagrangian multiplier (ARCH-LM) test, and Jarque-Bera (JB) test were used to confirm model convergence. In addition, a Granger causality test was performed for each predictive variable. Results: In the original series, unit roots were found in the trend model, whereas in the first-order difference series, both men (minimum tau 3: -9.24; max tau 3: -5.38) and women (minimum tau 3: -9.24; max tau 3: -5.38) had no unit roots for all variables. In the Johansen test, a cointegration relationship was observed among several variables. The queries used in the converged models were "divorce" for men (BG-LM test: P=.55; ARCH-LM test: P=.63; JB test: P=.66) and "no money" for women (BG-LM test: P=.17; ARCH-LM test: P=.15; JB test: P=.10). In the Granger causality test for each variable, "divorce" was significant for both men (F104=3.29; P=.04) and women (F104=3.23; P=.04). Conclusions: The number of suicides can be predicted by search queries related to the keyword "suicide." Previous studies have reported that financial poverty and divorce are associated with suicide. The results of this study, in which search queries on "no money" and "divorce" predicted suicide, support the findings of previous studies. Further research on the economic poverty of women and those with complex problems is necessary. 
UR - https://publichealth.jmir.org/2021/12/e34016 UR - http://dx.doi.org/10.2196/34016 UR - http://www.ncbi.nlm.nih.gov/pubmed/34823225 ID - info:doi/10.2196/34016 ER - TY - JOUR AU - Donnat, Claire AU - Bunbury, Freddy AU - Kreindler, Jack AU - Liu, David AU - Filippidis, T. Filippos AU - Esko, Tonu AU - El-Osta, Austen AU - Harris, Matthew PY - 2021/12/1 TI - Predicting COVID-19 Transmission to Inform the Management of Mass Events: Model-Based Approach JO - JMIR Public Health Surveill SP - e30648 VL - 7 IS - 12 KW - COVID-19 KW - transmission dynamics KW - live event management KW - Monte Carlo simulation N2 - Background: Modelling COVID-19 transmission at live events and public gatherings is essential to controlling the probability of subsequent outbreaks and communicating to participants their personalized risk. Yet, despite the fast-growing body of literature on COVID-19 transmission dynamics, current risk models either neglect contextual information including vaccination rates or disease prevalence or do not attempt to quantitatively model transmission. Objective: This paper attempted to bridge this gap by providing informative risk metrics for live public events, along with a measure of their uncertainty. Methods: Building upon existing models, our approach ties together 3 main components: (1) reliable modelling of the number of infectious cases at the time of the event, (2) evaluation of the efficiency of pre-event screening, and (3) modelling of the event's transmission dynamics and their uncertainty using Monte Carlo simulations. Results: We illustrated the application of our pipeline for a concert at the Royal Albert Hall and highlighted the risk's dependency on factors such as prevalence, mask wearing, and event duration. 
We demonstrate how this event held on 3 different dates (August 20, 2020; January 20, 2021; and March 20, 2021) would likely lead to transmission events that are similar to community transmission rates (0.06 vs 0.07, 2.38 vs 2.39, and 0.67 vs 0.60, respectively). However, differences between event and background transmissions substantially widened in the upper tails of the distribution of the number of infections (as denoted by their respective 99th quantiles: 1 vs 1, 19 vs 8, and 6 vs 3, respectively, for our 3 dates), further demonstrating that sole reliance on vaccination and antigen testing to gain entry would likely significantly underestimate the tail risk of the event. Conclusions: Despite the unknowns surrounding COVID-19 transmission, our estimation pipeline opens the discussion on contextualized risk assessment by combining the best tools at hand to assess the order of magnitude of the risk. Our model can be applied to any future event and is presented in a user-friendly RShiny interface. Finally, we discussed our model?s limitations as well as avenues for model evaluation and improvement. UR - https://publichealth.jmir.org/2021/12/e30648 UR - http://dx.doi.org/10.2196/30648 UR - http://www.ncbi.nlm.nih.gov/pubmed/34583317 ID - info:doi/10.2196/30648 ER - TY - JOUR AU - Shi, Xin AU - Lima, Silva Simone Maria da AU - Mota, Miranda Caroline Maria de AU - Lu, Ying AU - Stafford, S. Randall AU - Pereira, Viana Corintho PY - 2021/11/25 TI - Prevalence of Multimorbidity of Chronic Noncommunicable Diseases in Brazil: Population-Based Study JO - JMIR Public Health Surveill SP - e29693 VL - 7 IS - 11 KW - multimorbidity KW - prevalence KW - health care KW - public health KW - Brazil KW - logistic regression N2 - Background: Multimorbidity is the co-occurrence of two or more chronic diseases. 
Objective: This study, based on self-reported medical diagnosis, aims to investigate the dynamic distribution of multimorbidity across sociodemographic levels and its impacts on health-related issues over 15 years in Brazil using national data. Methods: Data were analyzed using descriptive statistics, hypothesis tests, and logistic regression. The study sample comprised 679,572 adults (18-59 years of age) and 115,699 elderly people (≥60 years of age) from the two latest cross-sectional, multiple-cohort, national-based studies: the National Sample Household Survey (PNAD) of 1998, 2003, and 2008, and the Brazilian National Health Survey (PNS) of 2013. Results: Overall, the risk of multimorbidity in adults was 1.7 times higher in women (odds ratio [OR] 1.73, 95% CI 1.67-1.79) and 1.3 times higher among people without education (OR 1.34, 95% CI 1.28-1.41). Multiple chronic diseases considerably increased with age in Brazil, and people between 50 and 59 years old were about 12 times more likely to have multimorbidity than adults between 18 and 29 years of age (OR 11.89, 95% CI 11.27-12.55). Seniors with multimorbidity had more than twice the likelihood of receiving health assistance in community services or clinics (OR 2.16, 95% CI 2.02-2.31) and of being hospitalized (OR 2.37, 95% CI 2.21-2.56). The subjective well-being of adults with multimorbidity was often worse than people without multiple chronic diseases (OR 12.85, 95% CI 12.07-13.68). These patterns were similar across all 4 cohorts analyzed and were relatively stable over 15 years. Conclusions: Our study shows little variation in the prevalence of the multimorbidity of chronic diseases in Brazil over time, but there are differences in the prevalence of multimorbidity across different social groups. It is hoped that the analysis of multimorbidity from the two latest Brazil national surveys will support policy making on epidemic prevention and management. 
UR - https://publichealth.jmir.org/2021/11/e29693 UR - http://dx.doi.org/10.2196/29693 UR - http://www.ncbi.nlm.nih.gov/pubmed/34842558 ID - info:doi/10.2196/29693 ER - TY - JOUR AU - Yang, Shi-Ping AU - Su, Hui-Luan AU - Chen, Xiu-Bei AU - Hua, Li AU - Chen, Jian-Xian AU - Hu, Min AU - Lei, Jian AU - Wu, San-Gang AU - Zhou, Juan PY - 2021/11/17 TI - Long-Term Survival Among Histological Subtypes in Advanced Epithelial Ovarian Cancer: Population-Based Study Using the Surveillance, Epidemiology, and End Results Database JO - JMIR Public Health Surveill SP - e25976 VL - 7 IS - 11 KW - ovarian epithelial carcinoma KW - survivors KW - histology KW - survival rate KW - survival KW - ovarian KW - cancer KW - surveillance KW - epidemiology KW - women's health KW - reproductive health KW - Surveillance, Epidemiology, and End Results KW - ovary KW - oncology KW - survivorship KW - long-term outcome KW - epithelial N2 - Background: Actual long-term survival rates for advanced epithelial ovarian cancer (EOC) are rarely reported. Objective: This study aimed to assess the role of histological subtypes in predicting the prognosis among long-term survivors (≥5 years) of advanced EOC. Methods: We performed a retrospective analysis of data among patients with stage III-IV EOC diagnosed from 2000 to 2014 using the Surveillance, Epidemiology, and End Results cancer data of the United States. We used the chi-square test, Kaplan-Meier analysis, and multivariate Cox proportional hazards model for the analyses. Results: We included 8050 patients in this study, including 6929 (86.1%), 743 (9.2%), 237 (2.9%), and 141 (1.8%) patients with serous, endometrioid, clear cell, and mucinous tumors, respectively. With a median follow-up of 91 months, the most common cause of death was primary ovarian cancer (80.3%), followed by other cancers (8.1%), other causes of death (7.3%), cardiac-related death (3.2%), and nonmalignant pulmonary disease (3.2%). 
Patients with the serous subtype were more likely to die from primary ovarian cancer, and patients with the mucinous subtype were more likely to die from other cancers and cardiac-related disease. Multivariate Cox analysis showed that patients with endometrioid (hazard ratio [HR] 0.534, P<.001), mucinous (HR 0.454, P<.001), and clear cell (HR 0.563, P<.001) subtypes showed better ovarian cancer-specific survival than those with the serous subtype. Similar results were found regarding overall survival. However, ovarian cancer-specific survival and overall survival were comparable among those with endometrioid, clear cell, and mucinous tumors. Conclusions: Ovarian cancer remains the primary cause of death in long-term ovarian cancer survivors. Moreover, the probability of death was significantly different among those with different histological subtypes. It is important for clinicians to individualize the surveillance program for long-term ovarian cancer survivors. UR - https://publichealth.jmir.org/2021/11/e25976 UR - http://dx.doi.org/10.2196/25976 UR - http://www.ncbi.nlm.nih.gov/pubmed/34787583 ID - info:doi/10.2196/25976 ER - TY - JOUR AU - Monzani, Dario AU - Vergani, Laura AU - Pizzoli, Maria Silvia Francesca AU - Marton, Giulia AU - Pravettoni, Gabriella PY - 2021/10/27 TI - Emotional Tone, Analytical Thinking, and Somatosensory Processes of a Sample of Italian Tweets During the First Phases of the COVID-19 Pandemic: Observational Study JO - J Med Internet Res SP - e29820 VL - 23 IS - 10 KW - internet KW - mHealth KW - infodemiology KW - infoveillance KW - pandemic KW - public health KW - COVID-19 KW - Twitter KW - psycholinguistic analysis KW - trauma N2 - Background: The COVID-19 pandemic is a traumatic individual and collective chronic experience, with tremendous consequences on mental and psychological health that can also be reflected in people's use of words. 
Psycholinguistic analysis of tweets from Twitter allows obtaining information about people's emotional expression, analytical thinking, and somatosensory processes, which are particularly important in traumatic events contexts. Objective: We aimed to analyze the influence of official Italian COVID-19 daily data (new cases, deaths, and hospital discharges) and the phase of managing the pandemic on how people expressed emotions and their analytical thinking and somatosensory processes in Italian tweets written during the first phases of the COVID-19 pandemic in Italy. Methods: We retrieved 1,697,490 Italian COVID-19-related tweets written from February 24, 2020 to June 14, 2020 and analyzed them using LIWC2015 to calculate 3 summary psycholinguistic variables: emotional tone, analytical thinking, and somatosensory processes. Official daily data about new COVID-19 cases, deaths, and hospital discharges were retrieved from the Italian Prime Minister's Office and Civil Protection Department GitHub page. We considered 3 phases of managing the COVID-19 pandemic in Italy. We performed 3 general models, 1 for each summary variable as the dependent variable and with daily data and phase of managing the pandemic as independent variables. Results: General linear models to assess differences in daily scores of emotional tone, analytical thinking, and somatosensory processes were significant (F6,104=21.53, P<.001, R2=.55; F5,105=9.20, P<.001, R2=.30; F6,104=6.15, P<.001, R2=.26, respectively). Conclusions: The COVID-19 pandemic affects how people express emotions, analytical thinking, and somatosensory processes in tweets. Our study contributes to the investigation of pandemic psychological consequences through psycholinguistic analysis of social media textual data. 
UR - https://www.jmir.org/2021/10/e29820 UR - http://dx.doi.org/10.2196/29820 UR - http://www.ncbi.nlm.nih.gov/pubmed/34516386 ID - info:doi/10.2196/29820 ER - TY - JOUR AU - Gwon, Hansle AU - Ahn, Imjin AU - Kim, Yunha AU - Kang, Jun Hee AU - Seo, Hyeram AU - Cho, Na Ha AU - Choi, Heejung AU - Jun, Joon Tae AU - Kim, Young-Hak PY - 2021/10/13 TI - Self-Training With Quantile Errors for Multivariate Missing Data Imputation for Regression Problems in Electronic Medical Records: Algorithm Development Study JO - JMIR Public Health Surveill SP - e30824 VL - 7 IS - 10 KW - self-training KW - artificial intelligence KW - electronic medical records KW - imputation N2 - Background: When using machine learning in the real world, the missing value problem is the first problem encountered. Methods to impute this missing value include statistical methods such as mean, expectation-maximization, and multiple imputations by chained equations (MICE) as well as machine learning methods such as multilayer perceptron, k-nearest neighbor, and decision tree. Objective: The objective of this study was to impute numeric medical data such as physical data and laboratory data. We aimed to effectively impute data using a progressive method called self-training in the medical field where training data are scarce. Methods: In this paper, we propose a self-training method that gradually increases the available data. Models trained with complete data predict the missing values in incomplete data. Among the incomplete data, the data in which the missing value is validly predicted are incorporated into the complete data. Using the predicted value as the actual value is called pseudolabeling. This process is repeated until the condition is satisfied. The most important part of this process is how to evaluate the accuracy of pseudolabels. They can be evaluated by observing the effect of the pseudolabeled data on the performance of the model. 
Results: In self-training using random forest (RF), mean squared error was up to 12% lower than pure RF, and the Pearson correlation coefficient was 0.1% higher. This difference was confirmed statistically. In the Friedman test performed on MICE and RF, self-training showed a P value between .003 and .02. A Wilcoxon signed-rank test performed on the mean imputation showed the lowest possible P value, 3.05e-5, in all situations. Conclusions: Self-training showed significant results in comparing the predicted values and actual values, but it needs to be verified in an actual machine learning system. And self-training has the potential to improve performance according to the pseudolabel evaluation method, which will be the main subject of our future research. UR - https://publichealth.jmir.org/2021/10/e30824 UR - http://dx.doi.org/10.2196/30824 UR - http://www.ncbi.nlm.nih.gov/pubmed/34643539 ID - info:doi/10.2196/30824 ER - TY - JOUR AU - Khader, Yousef AU - Al Nsour, Mohannad PY - 2021/10/7 TI - Excess Mortality During the COVID-19 Pandemic in Jordan: Secondary Data Analysis JO - JMIR Public Health Surveill SP - e32559 VL - 7 IS - 10 KW - COVID-19 KW - excess mortality KW - pandemic N2 - Background: All-cause mortality and estimates of excess deaths are commonly used in different countries to estimate the burden of COVID-19 and assess its direct and indirect effects. Objective: This study aimed to analyze the excess mortality during the COVID-19 pandemic in Jordan in April-December 2020. Methods: Official data on deaths in Jordan for 2020 and previous years (2016-2019) were obtained from the Department of Civil Status. We contrasted mortality rates in 2020 with those in each year and the pooled period 2016-2020 using a standardized mortality ratio (SMR) measure. Expected deaths for 2020 were estimated by fitting the overdispersed Poisson generalized linear models to the monthly death counts for the period of 2016-2019. 
Results: Overall, a 21% increase in standardized mortality (SMR 1.21, 95% CI 1.19-1.22) occurred in April-December 2020 compared with the April-December months in the pooled period 2016-2019. The SMR was more pronounced for men than for women (SMR 1.26, 95% CI 1.24-1.29 vs SMR 1.12, 95% CI 1.10-1.14), and it was statistically significant for both genders (P<.05). Using overdispersed Poisson generalized linear models, the number of expected deaths in April-December 2020 was 12,845 (7957 for women and 4888 for men). The total number of excess deaths during this period was estimated at 4583 (95% CI 4451-4716), with higher excess deaths in men (3112, 95% CI 3003-3221) than in women (1503, 95% CI 1427-1579). Almost 83.66% of excess deaths were attributed to COVID-19 in the Ministry of Health database. The vast majority of excess deaths occurred in people aged 60 years or older. Conclusions: The reported COVID-19 death counts underestimated mortality attributable to COVID-19. Excess deaths could reflect the increased deaths secondary to the pandemic and its containment measures. The majority of excess deaths occurred among old age groups. It is, therefore, important to maintain essential services for the elderly during pandemics. UR - https://publichealth.jmir.org/2021/10/e32559 UR - http://dx.doi.org/10.2196/32559 UR - http://www.ncbi.nlm.nih.gov/pubmed/34617910 ID - info:doi/10.2196/32559 ER - TY - JOUR AU - Wong, Chi-Yin Kenneth AU - Xiang, Yong AU - Yin, Liangying AU - So, Hon-Cheong PY - 2021/9/30 TI - Uncovering Clinical Risk Factors and Predicting Severe COVID-19 Cases Using UK Biobank Data: Machine Learning Approach JO - JMIR Public Health Surveill SP - e29544 VL - 7 IS - 9 KW - prediction KW - COVID-19 KW - risk factors KW - machine learning KW - pandemic KW - biobank KW - public health KW - prediction models KW - medical informatics N2 - Background: COVID-19 is a major public health concern. 
Given the extent of the pandemic, it is urgent to identify risk factors associated with disease severity. More accurate prediction of those at risk of developing severe infections is of high clinical importance. Objective: Based on the UK Biobank (UKBB), we aimed to build machine learning models to predict the risk of developing severe or fatal infections, and uncover major risk factors involved. Methods: We first restricted the analysis to infected individuals (n=7846), then performed analysis at a population level, considering those with no known infection as controls (ncontrols=465,728). Hospitalization was used as a proxy for severity. A total of 97 clinical variables (collected prior to the COVID-19 outbreak) covering demographic variables, comorbidities, blood measurements (eg, hematological/liver/renal function/metabolic parameters), anthropometric measures, and other risk factors (eg, smoking/drinking) were included as predictors. We also constructed a simplified (lite) prediction model using 27 covariates that can be more easily obtained (demographic and comorbidity data). XGboost (gradient-boosted trees) was used for prediction and predictive performance was assessed by cross-validation. Variable importance was quantified by Shapley values (ShapVal), permutation importance (PermImp), and accuracy gain. Shapley dependency and interaction plots were used to evaluate the pattern of relationships between risk factors and outcomes. Results: A total of 2386 severe and 477 fatal cases were identified. For analyses within infected individuals (n=7846), our prediction model achieved area under the receiving-operating characteristic curve (AUC-ROC) of 0.723 (95% CI 0.711-0.736) and 0.814 (95% CI 0.791-0.838) for severe and fatal infections, respectively. The top 5 contributing factors (sorted by ShapVal) for severity were age, number of drugs taken (cnt_tx), cystatin C (reflecting renal function), waist-to-hip ratio (WHR), and Townsend deprivation index (TDI). 
For mortality, the top features were age, testosterone, cnt_tx, waist circumference (WC), and red cell distribution width. For analyses involving the whole UKBB population, AUCs for severity and fatality were 0.696 (95% CI 0.684-0.708) and 0.825 (95% CI 0.802-0.848), respectively. The same top 5 risk factors were identified for both outcomes, namely, age, cnt_tx, WC, WHR, and TDI. Apart from the above, age, cystatin C, TDI, and cnt_tx were among the top 10 across all 4 analyses. Other diseases top ranked by ShapVal or PermImp were type 2 diabetes mellitus (T2DM), coronary artery disease, atrial fibrillation, and dementia, among others. For the "lite" models, predictive performances were broadly similar, with estimated AUCs of 0.716, 0.818, 0.696, and 0.830, respectively. The top ranked variables were similar to above, including age, cnt_tx, WC, sex (male), and T2DM. Conclusions: We identified numerous baseline clinical risk factors for severe/fatal infection by XGboost. For example, age, central obesity, impaired renal function, multiple comorbidities, and cardiometabolic abnormalities may predispose to poorer outcomes. The prediction models may be useful at a population level to identify those susceptible to developing severe/fatal infections, facilitating targeted prevention strategies. A risk-prediction tool is also available online. Further replications in independent cohorts are required to verify our findings. 
UR - https://publichealth.jmir.org/2021/9/e29544 UR - http://dx.doi.org/10.2196/29544 UR - http://www.ncbi.nlm.nih.gov/pubmed/34591027 ID - info:doi/10.2196/29544 ER - TY - JOUR AU - Hu, Tao AU - Wang, Siqin AU - Luo, Wei AU - Zhang, Mengxi AU - Huang, Xiao AU - Yan, Yingwei AU - Liu, Regina AU - Ly, Kelly AU - Kacker, Viraj AU - She, Bing AU - Li, Zhenlong PY - 2021/9/10 TI - Revealing Public Opinion Towards COVID-19 Vaccines With Twitter Data in the United States: Spatiotemporal Perspective JO - J Med Internet Res SP - e30854 VL - 23 IS - 9 KW - Twitter KW - public opinion KW - COVID-19 vaccines KW - sentiment analysis KW - emotion analysis KW - topic modeling KW - COVID-19 N2 - Background: The COVID-19 pandemic has imposed a large, initially uncontrollable, public health crisis both in the United States and across the world, with experts looking to vaccines as the ultimate mechanism of defense. The development and deployment of COVID-19 vaccines have been rapidly advancing via global efforts. Hence, it is crucial for governments, public health officials, and policy makers to understand public attitudes and opinions towards vaccines, such that effective interventions and educational campaigns can be designed to promote vaccine acceptance. Objective: The aim of this study was to investigate public opinion and perception on COVID-19 vaccines in the United States. We investigated the spatiotemporal trends of public sentiment and emotion towards COVID-19 vaccines and analyzed how such trends relate to popular topics found on Twitter. Methods: We collected over 300,000 geotagged tweets in the United States from March 1, 2020 to February 28, 2021. We examined the spatiotemporal patterns of public sentiment and emotion over time at both national and state scales and identified 3 phases along the pandemic timeline with sharp changes in public sentiment and emotion. 
Using sentiment analysis, emotion analysis (with cloud mapping of keywords), and topic modeling, we further identified 11 key events and major topics as the potential drivers to such changes. Results: An increasing trend in positive sentiment in conjunction with a decrease in negative sentiment were generally observed in most states, reflecting the rising confidence and anticipation of the public towards vaccines. The overall tendency of the 8 types of emotion implies that the public trusts and anticipates the vaccine. This is accompanied by a mixture of fear, sadness, and anger. Critical social or international events or announcements by political leaders and authorities may have potential impacts on public opinion towards vaccines. These factors help identify underlying themes and validate insights from the analysis. Conclusions: The analyses of near real-time social media big data benefit public health authorities by enabling them to monitor public attitudes and opinions towards vaccine-related information in a geo-aware manner, address the concerns of vaccine skeptics, and promote the confidence that individuals within a certain region or community have towards vaccines. 
UR - https://www.jmir.org/2021/9/e30854 UR - http://dx.doi.org/10.2196/30854 UR - http://www.ncbi.nlm.nih.gov/pubmed/34346888 ID - info:doi/10.2196/30854 ER - TY - JOUR AU - Giacopelli, Giuseppe PY - 2021/9/10 TI - A Full-Scale Agent-Based Model to Hypothetically Explore the Impact of Lockdown, Social Distancing, and Vaccination During the COVID-19 Pandemic in Lombardy, Italy: Model Development JO - JMIRx Med SP - e24630 VL - 2 IS - 3 KW - epidemiology KW - computational KW - model KW - COVID-19 KW - modeling KW - outbreak KW - virus KW - infectious disease KW - simulation KW - impact KW - vaccine KW - agent-based model N2 - Background: The COVID-19 outbreak, an event of global concern, has provided scientists the opportunity to use mathematical modeling to run simulations and test theories about the pandemic. Objective: The aim of this study was to propose a full-scale individual-based model of the COVID-19 outbreak in Lombardy, Italy, to test various scenarios pertaining to the pandemic and achieve novel performance metrics. Methods: The model was designed to simulate all 10 million inhabitants of Lombardy person by person via a simple agent-based approach using a commercial computer. In order to obtain performance data, a collision detection model was developed to enable cluster nodes in small cells that can be processed fully in parallel. Within this collision detection model, an epidemic model based mostly on experimental findings about COVID-19 was developed. Results: The model was used to explain the behavior of the COVID-19 outbreak in Lombardy. Different parameters were used to simulate various scenarios relating to social distancing and lockdown. According to the model, these simple actions were enough to control the virus. The model also explained the decline in cases in the spring and simulated a hypothetical vaccination scenario, confirming, for example, the herd immunity threshold computed in previous works. 
Conclusions: The model made it possible to test the impact of people's daily actions (eg, maintaining social distance) on the epidemic and to investigate interactions among agents within a social network. It also provided insight on the impact of a hypothetical vaccine. UR - https://med.jmirx.org/2021/3/e24630 UR - http://dx.doi.org/10.2196/24630 UR - http://www.ncbi.nlm.nih.gov/pubmed/34606524 ID - info:doi/10.2196/24630 ER - TY - JOUR AU - Hamadeh, Abdullah AU - Feng, Zeny AU - Niergarth, Jessmyn AU - Wong, WL William PY - 2021/9/9 TI - Estimation of COVID-19 Period Prevalence and the Undiagnosed Population in Canadian Provinces: Model-Based Analysis JO - JMIR Public Health Surveill SP - e26409 VL - 7 IS - 9 KW - COVID-19 KW - prevalence KW - undiagnosed proportion KW - mathematical modeling KW - estimate KW - Canada KW - diagnosis KW - control KW - distribution KW - infectious disease KW - model KW - framework KW - progression KW - transmission N2 - Background: The development of a successful COVID-19 control strategy requires a thorough understanding of the trends in geographic and demographic distributions of disease burden. In terms of the estimation of the population prevalence, this includes the crucial process of unravelling the number of patients who remain undiagnosed. Objective: This study estimates the period prevalence of COVID-19 between March 1, 2020, and November 30, 2020, and the proportion of the infected population that remained undiagnosed in the Canadian provinces of Quebec, Ontario, Alberta, and British Columbia. Methods: A model-based mathematical framework based on a disease progression and transmission model was developed to estimate the historical prevalence of COVID-19 using provincial-level statistics reporting seroprevalence, diagnoses, and deaths resulting from COVID-19. The framework was applied to three different age cohorts (<30; 30-69; and ≥70 years) in each of the provinces studied. 
Results: The estimates of COVID-19 period prevalence between March 1, 2020, and November 30, 2020, were 4.73% (95% CI 4.42%-4.99%) for Quebec, 2.88% (95% CI 2.75%-3.02%) for Ontario, 3.27% (95% CI 2.72%-3.70%) for Alberta, and 2.95% (95% CI 2.77%-3.15%) for British Columbia. Among the cohorts considered in this study, the estimated total number of infections ranged from 2-fold the number of diagnoses (among Quebecers aged ≥70 years: 26,476/53,549, 49.44%) to 6-fold the number of diagnoses (among British Columbians aged ≥70 years: 3108/18,147, 17.12%). Conclusions: Our estimates indicate that a high proportion of the population infected between March 1 and November 30, 2020, remained undiagnosed. Knowledge of COVID-19 period prevalence and the undiagnosed population can provide vital evidence that policy makers can consider when planning COVID-19 control interventions and vaccination programs. UR - https://publichealth.jmir.org/2021/9/e26409 UR - http://dx.doi.org/10.2196/26409 UR - http://www.ncbi.nlm.nih.gov/pubmed/34228626 ID - info:doi/10.2196/26409 ER - TY - JOUR AU - Kishore, Kamal AU - Jaswal, Vidushi AU - Verma, Madhur AU - Koushal, Vipin PY - 2021/8/30 TI - Exploring the Utility of Google Mobility Data During the COVID-19 Pandemic in India: Digital Epidemiological Analysis JO - JMIR Public Health Surveill SP - e29957 VL - 7 IS - 8 KW - COVID-19 KW - lockdown KW - nonpharmaceutical Interventions KW - social distancing KW - digital surveillance KW - Google Community Mobility Reports KW - community mobility N2 - Background: Association between human mobility and disease transmission has been established for COVID-19, but quantifying the levels of mobility over large geographical areas is difficult. Google has released Community Mobility Reports (CMRs) containing data about the movement of people, collated from mobile devices. Objective: The aim of this study is to explore the use of CMRs to assess the role of mobility in spreading COVID-19 infection in India. 
Methods: In this ecological study, we analyzed CMRs to determine human mobility between March and October 2020. The data were compared for the phases before the lockdown (between March 14 and 25, 2020), during lockdown (March 25-June 7, 2020), and after the lockdown (June 8-October 15, 2020) with the reference periods (ie, January 3-February 6, 2020). Another data set depicting the burden of COVID-19 as per various disease severity indicators was derived from a crowdsourced API. The relationship between the two data sets was investigated using the Kendall tau correlation to depict the correlation between mobility and disease severity. Results: At the national level, mobility decreased from -38% to -77% for all areas but residential (which showed an increase of 24.6%) during the lockdown compared to the reference period. At the beginning of the unlock phase, the state of Sikkim (minimum cases: 7) with a -60% reduction in mobility depicted more mobility compared to -82% in Maharashtra (maximum cases: 1.59 million). Residential mobility was negatively correlated (-0.05 to -0.91) with all other measures of mobility. The magnitude of the correlations for intramobility indicators was comparatively low for the lockdown phase (correlation ≥0.5 for 12 indicators) compared to the other phases (correlation ≥0.5 for 45 and 18 indicators in the prelockdown and unlock phases, respectively). A high correlation coefficient between epidemiological and mobility indicators was observed for the lockdown and unlock phases compared to the prelockdown phase. Conclusions: Mobile-based open-source mobility data can be used to assess the effectiveness of social distancing in mitigating disease spread. CMR data depicted an association between mobility and disease severity, and we suggest using this technique to supplement future COVID-19 surveillance. 
UR - https://publichealth.jmir.org/2021/8/e29957 UR - http://dx.doi.org/10.2196/29957 UR - http://www.ncbi.nlm.nih.gov/pubmed/34174780 ID - info:doi/10.2196/29957 ER - TY - JOUR AU - Trotter II, Robert AU - Baldwin, Julie AU - Buck, Loren Charles AU - Remiker, Mark AU - Aguirre, Amanda AU - Milner, Trudie AU - Torres, Emma AU - von Hippel, Arthur Frank PY - 2021/8/11 TI - Health Impacts of Perchlorate and Pesticide Exposure: Protocol for Community-Engaged Research to Evaluate Environmental Toxicants in a US Border Community JO - JMIR Res Protoc SP - e15864 VL - 10 IS - 8 KW - community-engaged research KW - endocrine disruption KW - environmental contaminants KW - health disparities KW - toxic metal contamination KW - perchlorates KW - pesticides KW - population health KW - thyroid disease N2 - Background: The Northern Arizona University (NAU) Center for Health Equity Research (CHER) is conducting community-engaged health research involving "environmental scans" in Yuma County in collaboration with community health stakeholders, including the Yuma Regional Medical Center (YRMC), Regional Center for Border Health, Inc. (RCBH), Campesinos Sin Fronteras (CSF), Yuma County Public Health District, and government agencies and nongovernmental organizations (NGOs) working on border health issues. The purpose of these efforts is to address community-generated environmental health hazards identified through ongoing coalitions among NAU, and local health care and research institutions. Objective: We are undertaking joint community/university efforts to examine human exposures to perchlorate and agricultural pesticides. This project also includes the parallel development of a new animal model for investigating the mechanisms of toxicity following a "one health" approach. The ultimate goal of this community-engaged effort is to develop interventions to reduce exposures and health impacts of contaminants in Yuma populations. 
Methods: All participants completed the informed consent process, which included information on the purpose of the study, a request for access to health histories and medical records, and interviews. The interview included questions related to (1) demographics, (2) social determinants of health, (3) health screening, (4) occupational and environmental exposures to perchlorate and pesticides, and (5) access to health services. Each participant provided a hair sample for quantifying the metals used in pesticides, urine sample for perchlorate quantification, and blood sample for endocrine assays. Modeling will examine the relationships between the concentrations of contaminants and hormones, demographics and social determinants of health, and health status of the study population, including health markers known to be impacted by perchlorate and pesticides. Results: We recruited 323 adults residing in Yuma County during a 1-year pilot/feasibility study. Among these, 147 residents were patients from either YRMC or RCBH with a primary diagnosis of thyroid disease, including hyperthyroidism, hypothyroidism, thyroid cancer, or goiter. The remaining 176 participants were from the general population but with no history of thyroid disorder. The pilot study confirmed the feasibility of using the identified community-engaged protocol to recruit, consent, and collect data from a difficult-to-access, vulnerable population. The demographics of the pilot study population and positive feedback on the success of the community-engaged approach indicate that the project can be scaled up to a broader study with replicable population health findings. Conclusions: Using a community-engaged approach, the research protocol provided substantial evidence regarding the effectiveness of designing and implementing culturally relevant recruitment and dissemination processes that combine laboratory findings and public health information. 
Future findings will elucidate the mechanisms of toxicity and the population health effects of the contaminants of concern, as well as provide a new animal model to develop precision medicine capabilities for the population. International Registered Report Identifier (IRRID): DERR1-10.2196/15864 UR - https://www.researchprotocols.org/2021/8/e15864 UR - http://dx.doi.org/10.2196/15864 UR - http://www.ncbi.nlm.nih.gov/pubmed/34383679 ID - info:doi/10.2196/15864 ER - TY - JOUR AU - Nguyen, M. Hieu AU - Turk, J. Philip AU - McWilliams, D. Andrew PY - 2021/8/4 TI - Forecasting COVID-19 Hospital Census: A Multivariate Time-Series Model Based on Local Infection Incidence JO - JMIR Public Health Surveill SP - e28195 VL - 7 IS - 8 KW - COVID-19 KW - forecasting KW - time-series model KW - vector error correction model KW - hospital census KW - hospital resource utilization KW - infection incidence N2 - Background: COVID-19 has been one of the most serious global health crises in world history. During the pandemic, health care systems require accurate forecasts for key resources to guide preparation for patient surges. Forecasting the COVID-19 hospital census is among the most important planning decisions to ensure adequate staffing, number of beds, intensive care units, and vital equipment. Objective: The goal of this study was to explore the potential utility of local COVID-19 infection incidence data in developing a forecasting model for the COVID-19 hospital census. Methods: The study data comprised aggregated daily COVID-19 hospital census data across 11 Atrium Health hospitals plus a virtual hospital in the greater Charlotte metropolitan area of North Carolina, as well as the total daily infection incidence across the same region during the May 15 to December 5, 2020, period. Cross-correlations between hospital census and local infection incidence lagging up to 21 days were computed. 
A multivariate time-series framework, called the vector error correction model (VECM), was used to simultaneously incorporate both time series and account for their possible long-run relationship. Hypothesis tests and model diagnostics were performed to test for the long-run relationship and examine model goodness of fit. The 7-days-ahead forecast performance was measured by mean absolute percentage error (MAPE), with time-series cross-validation. The forecast performance was also compared with an autoregressive integrated moving average (ARIMA) model in the same cross-validation time frame. Based on different scenarios of the pandemic, the fitted model was leveraged to produce 60-days-ahead forecasts. Results: The cross-correlations were uniformly high, falling between 0.7 and 0.8. There was sufficient evidence that the two time series have a stable long-run relationship at the .01 significance level. The model had very good fit to the data. The out-of-sample MAPE had a median of 5.9% and a 95th percentile of 13.4%. In comparison, the MAPE of the ARIMA had a median of 6.6% and a 95th percentile of 14.3%. Scenario-based 60-days-ahead forecasts exhibited concave trajectories with peaks lagging 2 to 3 weeks later than the peak infection incidence. In the worst-case scenario, the COVID-19 hospital census can reach a peak over 3 times greater than the peak observed during the second wave. Conclusions: When used in the VECM framework, the local COVID-19 infection incidence can be an effective leading indicator to predict the COVID-19 hospital census. The VECM model had a very good 7-days-ahead forecast performance and outperformed the traditional ARIMA model. Leveraging the relationship between the two time series, the model can produce realistic 60-days-ahead scenario-based projections, which can inform health care systems about the peak timing and volume of the hospital census for long-term planning purposes. 
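The forecast evaluation described above (7-days-ahead MAPE under time-series cross-validation) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the default forecaster is a naive persistence rule used only as a placeholder (it is neither the VECM nor the ARIMA model from the study), and the `min_train` window length is an arbitrary choice.

```python
from statistics import median


def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def persistence_forecast(history, horizon):
    """Placeholder model: repeat the last observed value for every step."""
    return [history[-1]] * horizon


def rolling_origin_mape(series, horizon=7, min_train=14,
                        forecaster=persistence_forecast):
    """Score horizon-step forecasts made from each successive forecast origin."""
    scores = []
    for origin in range(min_train, len(series) - horizon + 1):
        history = series[:origin]
        future = series[origin:origin + horizon]
        scores.append(mape(future, forecaster(history, horizon)))
    return scores


# Invented daily census values, for illustration only.
census = [100 + 3 * day for day in range(40)]
scores = rolling_origin_mape(census)
print(len(scores), median(scores))
```

Any model that maps a history and a horizon to a list of forecasts can be passed as `forecaster`, so two models can be compared over the same set of origins, which mirrors the like-for-like comparison the abstract reports.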
UR - https://publichealth.jmir.org/2021/8/e28195 UR - http://dx.doi.org/10.2196/28195 UR - http://www.ncbi.nlm.nih.gov/pubmed/34346897 ID - info:doi/10.2196/28195 ER - TY - JOUR AU - Gupta, K. Agrayan AU - Grannis, J. Shaun AU - Kasthurirathne, N. Suranga PY - 2021/7/26 TI - Evaluation of a Parsimonious COVID-19 Outbreak Prediction Model: Heuristic Modeling Approach Using Publicly Available Data Sets JO - J Med Internet Res SP - e28812 VL - 23 IS - 7 KW - coronavirus KW - COVID-19 KW - emerging outbreak KW - modeling disease outbreak KW - precision public health KW - predictive modeling N2 - Background: The COVID-19 pandemic has changed public health policies and human and community behaviors through lockdowns and mandates. Governments are rapidly evolving policies to increase hospital capacity and supply personal protective equipment and other equipment to mitigate disease spread in affected regions. Current models that predict COVID-19 case counts and spread are complex by nature and offer limited explainability and generalizability. This has highlighted the need for accurate and robust outbreak prediction models that balance model parsimony and performance. Objective: We sought to leverage readily accessible data sets extracted from multiple states to train and evaluate a parsimonious predictive model capable of identifying county-level risk of COVID-19 outbreaks on a day-to-day basis. Methods: Our modeling approach leveraged the following data inputs: COVID-19 case counts per county per day and county populations. We developed an outbreak gold standard across California, Indiana, and Iowa. The model utilized a per capita running 7-day sum of the case counts per county per day and the mean cumulative case count to develop baseline values. The model was trained with data recorded between March 1 and August 31, 2020, and tested on data recorded between September 1 and October 31, 2020. 
Results: The model reported sensitivities of 81%, 92%, and 90% for California, Indiana, and Iowa, respectively. The precision in each state was above 85% while specificity and accuracy scores were generally >95%. Conclusions: Our parsimonious model provides a generalizable and simple alternative approach to outbreak prediction. This methodology can be applied to diverse regions to help state officials and hospitals with resource allocation and to guide risk management, community education, and mitigation strategies. UR - https://www.jmir.org/2021/7/e28812 UR - http://dx.doi.org/10.2196/28812 UR - http://www.ncbi.nlm.nih.gov/pubmed/34156964 ID - info:doi/10.2196/28812 ER - TY - JOUR AU - Castro, A. Lauren AU - Shelley, D. Courtney AU - Osthus, Dave AU - Michaud, Isaac AU - Mitchell, Jason AU - Manore, A. Carrie AU - Del Valle, Y. Sara PY - 2021/6/9 TI - How New Mexico Leveraged a COVID-19 Case Forecasting Model to Preemptively Address the Health Care Needs of the State: Quantitative Analysis JO - JMIR Public Health Surveill SP - e27888 VL - 7 IS - 6 KW - COVID-19 KW - forecasting KW - health care KW - prediction KW - forecast KW - model KW - quantitative KW - hospital KW - ICU KW - ventilator KW - intensive care unit KW - probability KW - trend KW - plan N2 - Background: Prior to the COVID-19 pandemic, US hospitals relied on static projections of future trends for long-term planning and were only beginning to consider forecasting methods for short-term planning of staffing and other resources. With the overwhelming burden imposed by COVID-19 on the health care system, an emergent need exists to accurately forecast hospitalization needs within an actionable timeframe. Objective: Our goal was to leverage an existing COVID-19 case and death forecasting tool to generate the expected number of concurrent hospitalizations, occupied intensive care unit (ICU) beds, and in-use ventilators 1 day to 4 weeks in the future for New Mexico and each of its five health regions. 
Methods: We developed a probabilistic model that took as input the number of new COVID-19 cases for New Mexico from Los Alamos National Laboratory's COVID-19 Forecasts Using Fast Evaluations and Estimation tool, and we used the model to estimate the number of new daily hospital admissions 4 weeks into the future based on current statewide hospitalization rates. The model estimated the number of new admissions that would require an ICU bed or use of a ventilator and then projected the individual lengths of hospital stays based on the resource need. By tracking the lengths of stay through time, we captured the projected simultaneous need for inpatient beds, ICU beds, and ventilators. We used a postprocessing method to adjust the forecasts based on the differences between prior forecasts and the subsequent observed data. Thus, we ensured that our forecasts could reflect a dynamically changing situation on the ground. Results: Forecasts made between September 1 and December 9, 2020, showed variable accuracy across time, health care resource needs, and forecast horizon. Forecasts made in October, when new COVID-19 cases were steadily increasing, had an average accuracy error of 20.0%, while the error in forecasts made in September, a month with low COVID-19 activity, was 39.7%. Across health care use categories, state-level forecasts were more accurate than those at the regional level. Although the accuracy declined as the forecast was projected further into the future, the stated uncertainty of the prediction improved. Forecasts were within 5% of their stated uncertainty at the 50% and 90% prediction intervals at the 3- to 4-week forecast horizon for state-level inpatient and ICU needs. However, uncertainty intervals were too narrow for forecasts of state-level ventilator need and all regional health care resource needs. 
Conclusions: Real-time forecasting of the burden imposed by a spreading infectious disease is a crucial component of decision support during a public health emergency. Our proposed methodology demonstrated utility in providing near-term forecasts, particularly at the state level. This tool can aid other stakeholders as they face COVID-19 population impacts now and in the future. UR - https://publichealth.jmir.org/2021/6/e27888 UR - http://dx.doi.org/10.2196/27888 UR - http://www.ncbi.nlm.nih.gov/pubmed/34003763 ID - info:doi/10.2196/27888 ER - TY - JOUR AU - Tso, Foon Chak AU - Garikipati, Anurag AU - Green-Saxena, Abigail AU - Mao, Qingqing AU - Das, Ritankar PY - 2021/6/3 TI - Correlation of Population SARS-CoV-2 Cycle Threshold Values to Local Disease Dynamics: Exploratory Observational Study JO - JMIR Public Health Surveill SP - e28265 VL - 7 IS - 6 KW - reverse transcription polymerase chain reaction KW - testing KW - cycle threshold KW - COVID-19 KW - epidemiology KW - Rt KW - exploratory KW - correlation KW - population KW - threshold KW - disease dynamic KW - distribution KW - transmission N2 - Background: Despite the limitations in the use of cycle threshold (CT) values for individual patient care, population distributions of CT values may be useful indicators of local outbreaks. Objective: We aimed to conduct an exploratory analysis of potential correlations between the population distribution of cycle threshold (CT) values and COVID-19 dynamics, which were operationalized as percent positivity, transmission rate (Rt), and COVID-19 hospitalization count. Methods: In total, 148,410 specimens collected between September 15, 2020, and January 11, 2021, from the greater El Paso area were processed in the Dascena COVID-19 Laboratory. The daily median CT value, daily Rt, daily count of COVID-19 hospitalizations, daily change in percent positivity, and rolling averages of these features were plotted over time. 
Two-way scatterplots and linear regression were used to evaluate possible associations between daily median CT values and outbreak measures. Cross-correlation plots were used to determine whether a time delay existed between changes in daily median CT values and measures of community disease dynamics. Results: Daily median CT values negatively correlated with the daily Rt values (P<.001), the daily COVID-19 hospitalization counts (with a 33-day time delay; P<.001), and the daily changes in percent positivity among testing samples (P<.001). Despite visual trends suggesting time delays in the plots for median CT values and outbreak measures, a statistically significant delay was only detected between changes in median CT values and COVID-19 hospitalization counts (P<.001). Conclusions: This study adds to the literature by analyzing samples collected from an entire geographical area and contextualizing the results with other research investigating population CT values. UR - https://publichealth.jmir.org/2021/6/e28265 UR - http://dx.doi.org/10.2196/28265 UR - http://www.ncbi.nlm.nih.gov/pubmed/33999831 ID - info:doi/10.2196/28265 ER - TY - JOUR AU - Benoni, Roberto AU - Panunzi, Silvia AU - Campagna, Irene AU - Moretti, Francesca AU - Lo Cascio, Giuliana AU - Spiteri, Gianluca AU - Porru, Stefano AU - Tardivo, Stefano PY - 2021/6/3 TI - The Effect of Test Timing on the Probability of Positive SARS-CoV-2 Swab Test Results: Mixed Model Approach JO - JMIR Public Health Surveill SP - e27189 VL - 7 IS - 6 KW - close contact KW - COVID-19 KW - health care workers KW - health surveillance KW - swab test timing N2 - Background: During the COVID-19 pandemic, swab tests proved to be effective in containing the infection and served as a means for early diagnosis and contact tracing. However, little evidence exists regarding the correct timing for the execution of the swab test, especially for asymptomatic individuals and health care workers. 
Objective: The objective of this study was to analyze changes in the positive findings over time in individual SARS-CoV-2 swab tests during a health surveillance program. Methods: The study was conducted with 2071 health care workers at the University Hospital of Verona, with a known date of close contact with a patient with COVID-19, between February 29 and April 17, 2020. The health care workers underwent a health surveillance program with repeated swab tests to track their virological status. A generalized additive mixed model was used to investigate how the probability of a positive test result changes over time since the last known date of close contact, in an overall sample of individuals who tested positive for COVID-19 and in a subset of individuals with an initial negative swab test finding before being proven positive, to assess different surveillance time intervals. Results: Among the 2071 health care workers in this study, 191 (9.2%) tested positive for COVID-19, and 103 (54%) were asymptomatic with no differences based on sex or age. Among 49 (25.7%) cases, the initial swab test yielded negative findings after close contact with a patient with COVID-19. Sex, age, symptoms, and the time of sampling were not different between individuals with an initial negative swab test finding and those who initially tested positive after close contact. In the overall sample, the estimated probability of testing positive was 0.74 on day 1 after close contact, which increased to 0.77 between days 5 and 8. In the 3 different scenarios for scheduled repeated testing intervals (3, 5, and 7 days) in the subgroup of individuals with an initially negative swab test finding, the probability peaked on the sixth, ninth and tenth, and 13th and 14th days, respectively. Conclusions: Swab tests can initially yield false-negative outcomes. The probability of testing positive increases from day 1, peaking between days 5 and 8 after close contact with a patient with COVID-19. 
Early testing, especially in this final time window, is recommended together with a health surveillance program scheduled in close intervals. UR - https://publichealth.jmir.org/2021/6/e27189 UR - http://dx.doi.org/10.2196/27189 UR - http://www.ncbi.nlm.nih.gov/pubmed/34003761 ID - info:doi/10.2196/27189 ER - TY - JOUR AU - Moghalles, Ameen Suaad AU - Aboasba, Ahmed Basher AU - Alamad, Abdullah Mohammed AU - Khader, Saleh Yousef PY - 2021/6/2 TI - Epidemiology of Diphtheria in Yemen, 2017-2018: Surveillance Data Analysis JO - JMIR Public Health Surveill SP - e27590 VL - 7 IS - 6 KW - diphtheria KW - epidemiology KW - incidence KW - case fatality rate N2 - Background: As a consequence of war and the collapse of the health system in Yemen, which prevented many people from accessing health facilities to obtain primary health care, vaccination coverage was affected, leading to a deadly diphtheria epidemic at the end of 2017. Objective: This study aimed to describe the epidemiology of diphtheria in Yemen and determine its incidence and case fatality rate. Methods: Data were obtained from the diphtheria surveillance program 2017-2018, using case definitions of the World Health Organization. A probable case was defined as a case involving a person having laryngitis, pharyngitis, or tonsillitis and an adherent membrane of the tonsils, pharynx, and/or nose. A confirmed case was defined as a probable case that was laboratory confirmed or linked epidemiologically to a laboratory-confirmed case. Data from the Central Statistical Organization was used to calculate the incidence per 100,000 population. A P value <.05 was considered significant. Results: A total of 2243 cases were reported during the period between July 2017 and August 2018. About 49% (1090/2243, 48.6%) of the cases were males. About 44% (978/2243, 43.6%) of the cases involved children aged 5 to 15 years. 
Respiratory tract infection was the predominant symptom (2044/2243, 91.1%), followed by pseudomembrane (1822/2243, 81.2%). Based on the vaccination status, the percentages of partially vaccinated, vaccinated, unvaccinated, and unknown status patients were 6.6% (148/2243), 30.8% (690/2243), 48.6% (1090/2243), and 14.0% (315/2243), respectively. The overall incidence of diphtheria was 8 per 100,000 population. The highest incidence was among the age group <15 years (11 per 100,000 population), and the lowest incidence was among the age group ≥15 years (5 per 100,000 population). The overall case fatality rate among all age groups was 5%, and it was higher (10%) in the age group <5 years. Five governorates that were difficult to access (Raymah, Abyan, Sa'ada, Lahj, and Al Jawf) had a very high case fatality rate (22%). Conclusions: Diphtheria affected a large number of people in Yemen in 2017-2018. The majority of patients were partially or not vaccinated. Children aged ≤15 years were more affected, with higher fatality among children aged <5 years. Five governorates that were difficult to access had a case fatality rate twice that of the World Health Organization estimate (5%-10%). To control the diphtheria epidemic in Yemen, it is recommended to increase routine vaccination coverage and booster immunizations, increase public health awareness toward diphtheria, and strengthen the surveillance system for early detection and immediate response. 
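The proportions reported in this abstract can be checked with simple arithmetic. The Python sketch below recomputes each percentage from its count and the 2243 total, reading the unvaccinated count as 1090 of 2243, and confirms that the four vaccination-status counts sum to the total. The helper names are hypothetical. The `incidence_per_100k` function shows the standard rate formula only; the abstract does not give the population denominators, so no incidence figure is recomputed here.

```python
TOTAL_CASES = 2243

# (count, percentage as reported in the abstract)
reported = {
    "male": (1090, 48.6),
    "aged 5 to 15 years": (978, 43.6),
    "respiratory tract infection": (2044, 91.1),
    "pseudomembrane": (1822, 81.2),
    "partially vaccinated": (148, 6.6),
    "vaccinated": (690, 30.8),
    "unvaccinated": (1090, 48.6),
    "unknown vaccination status": (315, 14.0),
}


def percent(count, total):
    """Share of the total, in percent, rounded to one decimal place."""
    return round(100 * count / total, 1)


def incidence_per_100k(cases, population):
    """Cases per 100,000 population (population must be supplied by the user)."""
    return 100_000 * cases / population


recomputed = {name: percent(count, TOTAL_CASES)
              for name, (count, _) in reported.items()}
mismatches = {name: value for name, value in recomputed.items()
              if value != reported[name][1]}
vaccination_total = sum(reported[key][0] for key in (
    "partially vaccinated", "vaccinated", "unvaccinated",
    "unknown vaccination status"))

print(mismatches)          # empty when every percentage matches its count
print(vaccination_total)   # should equal TOTAL_CASES
```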
UR - https://publichealth.jmir.org/2021/6/e27590 UR - http://dx.doi.org/10.2196/27590 UR - http://www.ncbi.nlm.nih.gov/pubmed/34076583 ID - info:doi/10.2196/27590 ER - TY - JOUR AU - Lee, Hyojung AU - Kim, Yeahwon AU - Kim, Eunsu AU - Lee, Sunmi PY - 2021/6/1 TI - Risk Assessment of Importation and Local Transmission of COVID-19 in South Korea: Statistical Modeling Approach JO - JMIR Public Health Surveill SP - e26784 VL - 7 IS - 6 KW - COVID-19 KW - transmission dynamics KW - South Korea KW - international travels KW - imported and local transmission KW - basic reproduction number KW - effective reproduction number KW - mitigation intervention strategies KW - risk KW - assessment KW - transmission KW - mitigation KW - strategy KW - travel KW - mobility KW - spread KW - intervention KW - diagnosis KW - monitoring KW - testing N2 - Background: Despite recent achievements in vaccines, antiviral drugs, and medical infrastructure, the emergence of COVID-19 has posed a serious threat to humans worldwide. Most countries are well connected on a global scale, making it nearly impossible to implement perfect and prompt mitigation strategies for infectious disease outbreaks. In particular, due to the explosive growth of international travel, the complex network of human mobility enabled the rapid spread of COVID-19 globally. Objective: South Korea was one of the earliest countries to be affected by COVID-19. In the absence of vaccines and treatments, South Korea has implemented and maintained stringent interventions, such as large-scale epidemiological investigations, rapid diagnosis, social distancing, and prompt clinical classification of severely ill patients with appropriate medical measures. In particular, South Korea has implemented effective airport screenings and quarantine measures. In this study, we aimed to assess the country-specific importation risk of COVID-19 and investigate its impact on the local transmission of COVID-19. 
Methods: The country-specific importation risk of COVID-19 in South Korea was assessed. We investigated the relationships between country-specific imported cases, passenger numbers, and the severity of country-specific COVID-19 prevalence from January to October 2020. We assessed the country-specific risk by incorporating country-specific information. A renewal mathematical model was employed, considering both imported and local cases of COVID-19 in South Korea. Furthermore, we estimated the basic and effective reproduction numbers. Results: The risk of importation from China was highest between January and February 2020, while that from North America (the United States and Canada) was high from April to October 2020. The R0 was estimated at 1.87 (95% CI 1.47-2.34), using the rate of α=0.07 for secondary transmission caused by imported cases. The Rt was estimated in South Korea and in both Seoul and Gyeonggi. Conclusions: A statistical model accounting for imported and locally transmitted cases was employed to estimate R0 and Rt. Our results indicated that the prompt implementation of airport screening measures (contact tracing with case isolation and quarantine) successfully reduced local transmission caused by imported cases despite passengers arriving from high-risk countries throughout the year. Moreover, various mitigation interventions, including social distancing and travel restrictions within South Korea, have been effectively implemented to reduce the spread of local cases in South Korea. 
UR - https://publichealth.jmir.org/2021/6/e26784 UR - http://dx.doi.org/10.2196/26784 UR - http://www.ncbi.nlm.nih.gov/pubmed/33819165 ID - info:doi/10.2196/26784 ER - TY - JOUR AU - Bongolan, Pearl Vena AU - Minoza, Antonio Jose Marie AU - de Castro, Romulo AU - Sevilleja, Emmanuel Jesus PY - 2021/5/31 TI - Age-Stratified Infection Probabilities Combined With a Quarantine-Modified Model for COVID-19 Needs Assessments: Model Development Study JO - J Med Internet Res SP - e19544 VL - 23 IS - 5 KW - COVID-19 KW - epidemic modeling KW - age stratification theory KW - infection probability KW - SEIR KW - mathematical modelling N2 - Background: Classic compartmental models such as the susceptible-exposed-infectious-removed (SEIR) model all have the weakness of assuming a homogenous population, where everyone has an equal chance of getting infected and dying. Since it was identified in Hubei, China, in December 2019, COVID-19 has rapidly spread around the world and been declared a pandemic. Based on data from Hubei, infection and death distributions vary with age. To control the spread of the disease, various preventive and control measures such as community quarantine and social distancing have been widely used. Objective: Our aim is to develop a model where age is a factor, considering the study area's age stratification. Additionally, we want to account for the effects of quarantine on the SEIR model. Methods: We use the age-stratified COVID-19 infection and death distributions from Hubei, China (more than 44,672 infections as of February 11, 2020) as an estimate or proxy for a study area's infection and mortality probabilities for each age group. We then apply these probabilities to the actual age-stratified population of Quezon City, Philippines, to predict infectious individuals and deaths at peak. Testing with different countries shows the predicted number of infectious individuals skewing with the country's median age and age stratification, as expected. 
We added a Q parameter to the SEIR model to include the effects of quarantine (Q-SEIR). Results: The projections from the age-stratified probabilities give much lower predicted incidences of infection than the Q-SEIR model. As expected, quarantine tends to delay the peaks for both the exposed and infectious groups, and to "flatten" the curve or lower the predicted values for each compartment. These two estimates were used as a range to inform the local government's planning and response to the COVID-19 threat. Conclusions: Age stratification combined with a quarantine-modified model has good qualitative agreement with observations on infections and death rates. That younger populations will have lower death rates due to COVID-19 is a fair expectation for a disease where most fatalities are among older adults. UR - https://www.jmir.org/2021/5/e19544 UR - http://dx.doi.org/10.2196/19544 UR - http://www.ncbi.nlm.nih.gov/pubmed/33900929 ID - info:doi/10.2196/19544 ER - TY - JOUR AU - Jang, Beakcheol AU - Kim, Inhwan AU - Kim, Wook Jong PY - 2021/5/25 TI - Effective Training Data Extraction Method to Improve Influenza Outbreak Prediction from Online News Articles: Deep Learning Model Study JO - JMIR Med Inform SP - e23305 VL - 9 IS - 5 KW - influenza KW - training data extraction KW - keyword KW - sorting KW - word embedding KW - Pearson correlation coefficient KW - long short-term memory KW - surveillance KW - infodemiology KW - infoveillance KW - model N2 - Background: Each year, influenza affects 3 to 5 million people and causes 290,000 to 650,000 fatalities worldwide. To reduce the fatalities caused by influenza, several countries have established influenza surveillance systems to collect early warning data. However, proper and timely warnings are hindered by a 1- to 2-week delay between the actual disease outbreaks and the publication of surveillance data. 
To address the issue, novel methods for influenza surveillance and prediction using real-time internet data (such as search queries, microblogging, and news) have been proposed. Some of the currently popular approaches extract online data and use machine learning to predict influenza occurrences in a classification mode. However, many of these methods extract training data subjectively, and it is difficult to capture the latent characteristics of the data correctly. There is a critical need to devise new approaches that focus on extracting training data by reflecting the latent characteristics of the data. Objective: In this paper, we propose an effective method to extract training data in a manner that reflects the hidden features and improves the performance by filtering and selecting only the keywords related to influenza before the prediction. Methods: Although word embedding provides a distributed representation of words by encoding the hidden relationships between various tokens, we enhanced the word embeddings by selecting keywords related to the influenza outbreak and sorting the extracted keywords using the Pearson correlation coefficient in order to solely keep the tokens with high correlation with the actual influenza outbreak. The keyword extraction process was followed by a predictive model based on long short-term memory that predicts the influenza outbreak. To assess the performance of the proposed predictive model, we used and compared a variety of word embedding techniques. Results: Word embedding without our proposed sorting process showed 0.8705 prediction accuracy when 50.2 keywords were selected on average. Conversely, word embedding using our proposed sorting process showed 0.8868 prediction accuracy and an improvement in prediction accuracy of 12.6%, although smaller amounts of training data were selected, with only 20.6 keywords on average. 
Conclusions: The sorting stage empowers the embedding process, which improves the feature extraction process because it acts as a knowledge base for the prediction component. The model outperformed other current approaches that use flat extraction before prediction. UR - https://medinform.jmir.org/2021/5/e23305 UR - http://dx.doi.org/10.2196/23305 UR - http://www.ncbi.nlm.nih.gov/pubmed/34032577 ID - info:doi/10.2196/23305 ER - TY - JOUR AU - Fokas, S. Athanassios AU - Kastis, A. George PY - 2021/5/18 TI - SARS-CoV-2: The Second Wave in Europe JO - J Med Internet Res SP - e22431 VL - 23 IS - 5 KW - mathematical modelling of epidemics KW - COVID-19 KW - SARS CoV-2 KW - pandemic KW - lockdown in Europe UR - https://www.jmir.org/2021/5/e22431 UR - http://dx.doi.org/10.2196/22431 UR - http://www.ncbi.nlm.nih.gov/pubmed/33939621 ID - info:doi/10.2196/22431 ER - TY - JOUR AU - Post, Lori AU - Boctor, J. Michael AU - Issa, Z. Tariq AU - Moss, B. Charles AU - Murphy, Leo Robert AU - Achenbach, J. Chad AU - Ison, G. Michael AU - Resnick, Danielle AU - Singh, Lauren AU - White, Janine AU - Welch, B. Sarah AU - Oehmke, F. 
James PY - 2021/5/10 TI - SARS-CoV-2 Surveillance System in Canada: Longitudinal Trend Analysis JO - JMIR Public Health Surveill SP - e25753 VL - 7 IS - 5 KW - global COVID surveillance KW - COVID-19 KW - COVID-21 KW - new COVID strains KW - Canada Public Health Surveillance KW - Great COVID Shutdown KW - Canadian COVID-19 KW - surveillance metrics KW - wave 2 Canada COVID-19 KW - dynamic panel data KW - generalized method of the moments KW - Canadian econometrics KW - Canada SARS-CoV-2 KW - Canadian COVID-19 surveillance system KW - Canadian COVID transmission speed KW - Canadian COVID transmission acceleration KW - COVID transmission deceleration KW - COVID transmission jerk KW - COVID 7-day lag KW - Alberta KW - British Columbia KW - Manitoba KW - New Brunswick KW - Newfoundland and Labrador KW - Northwest Territories KW - Nova Scotia KW - Nunavut KW - Ontario KW - Prince Edward Island KW - Quebec KW - Saskatchewan KW - Yukon N2 - Background: The COVID-19 global pandemic has disrupted structures and communities across the globe. Numerous regions of the world have had varying responses in their attempts to contain the spread of the virus. Factors such as public health policies, governance, and sociopolitical climate have led to differential levels of success at controlling the spread of SARS-CoV-2. Ultimately, a more advanced surveillance metric for COVID-19 transmission is necessary to help government systems and national leaders understand which responses have been effective and gauge where outbreaks occur. Objective: The goal of this study is to provide advanced COVID-19 surveillance metrics for Canada at the country, province, and territory level that account for shifts in the pandemic including speed, acceleration, jerk, and persistence. Enhanced surveillance identifies risks for explosive growth and regions that have controlled outbreaks successfully. 
Methods: Using a longitudinal trend analysis study design, we extracted 62 days of COVID-19 data from Canadian public health registries for 13 provinces and territories. We used an empirical difference equation to measure the daily number of cases in Canada as a function of the prior number of cases, the level of testing, and weekly shift variables based on a dynamic panel model that was estimated using the generalized method of moments approach by implementing the Arellano-Bond estimator in R. Results: We compare the week of February 7-13, 2021, with the week of February 14-20, 2021. Canada, as a whole, had a decrease in speed from 8.4 daily new cases per 100,000 population to 7.5 daily new cases per 100,000 population. The persistence of new cases during the week of February 14-20 reported 7.5 cases that are a result of COVID-19 transmissions 7 days earlier. The two most populous provinces of Ontario and Quebec both experienced decreases in speed from 7.9 and 11.5 daily new cases per 100,000 population for the week of February 7-13 to speeds of 6.9 and 9.3 for the week of February 14-20, respectively. Nunavut experienced a significant increase in speed during this time, from 3.3 daily new cases per 100,000 population to 10.9 daily new cases per 100,000 population. Conclusions: Canada excelled at COVID-19 control early on in the pandemic, especially during the first COVID-19 shutdown. The second wave at the end of 2020 resulted in a resurgence of the outbreak, which has since been controlled. Enhanced surveillance identifies outbreaks and where there is the potential for explosive growth, which informs proactive health policy. 
UR - https://publichealth.jmir.org/2021/5/e25753 UR - http://dx.doi.org/10.2196/25753 UR - http://www.ncbi.nlm.nih.gov/pubmed/33852410 ID - info:doi/10.2196/25753 ER - TY - JOUR AU - Kundu, Sampurna AU - Chauhan, Kirti AU - Mandal, Debarghya PY - 2021/5/6 TI - Survival Analysis of Patients With COVID-19 in India by Demographic Factors: Quantitative Study JO - JMIR Form Res SP - e23251 VL - 5 IS - 5 KW - survival analysis KW - COVID-19 KW - patient data KW - Kaplan-Meier KW - hazard model KW - modeling KW - survival KW - mortality KW - demographic KW - India KW - transmission N2 - Background: Studies of the transmission dynamics of COVID-19 have depicted the rate, patterns, and predictions of cases of this pandemic disease. To combat transmission of the disease in India, the government declared a lockdown on March 25, 2020. Even after this strict lockdown was enacted nationwide, the number of COVID-19 cases increased and surpassed 450,000. A positive point to note is that the number of recovered cases began to slowly exceed that of active cases. The survival of patients, taking death as the event that varies by age group and sex, is noteworthy. Objective: The aim of this study was to conduct a survival analysis to establish the variability in survivorship of patients with COVID-19 in India by age group and sex at different levels, that is, the national, state, and district levels. Methods: The study period was taken from the date of the first reported case of COVID-19 in India, which was January 30, 2020, up to June 30, 2020. Due to the amount of underreported data and removal of missing columns, a total sample of 26,815 patients was considered. Kaplan-Meier survival estimation, the Cox proportional hazard model, and the multilevel survival model were used to perform the survival analysis. 
Results: The Kaplan-Meier survival function showed that the probability of survival of patients with COVID-19 declined during the study period of 5 months, which was supplemented by the log rank test (P<.001) and Wilcoxon test (P<.001) to compare the survival functions. Significant variability was observed in the age groups, as evident from all the survival estimates; with increasing age, the risk of dying of COVID-19 increased. The Cox proportional hazard model reiterated that male patients with COVID-19 had a 1.14 times higher risk of dying than female patients (hazard ratio 1.14; SE 0.11; 95% CI 0.93-1.38). Western and Central India showed decreasing survival rates in the framed time period, while Eastern, North Eastern, and Southern India showed slightly better results in terms of survival. Conclusions: This study depicts a grave scenario of decreasing survival rates in various regions of India and shows variability in these rates by age and sex. In essence, we can safely conclude that the critical appraisal of the survival rate and thorough analysis of patient data in this study equipped us to identify risk groups and perform comparative studies of various segments in India. 
International Registered Report Identifier (IRRID): RR2-10.1101/2020.08.01.20162115 UR - https://formative.jmir.org/2021/5/e23251 UR - http://dx.doi.org/10.2196/23251 UR - http://www.ncbi.nlm.nih.gov/pubmed/33882017 ID - info:doi/10.2196/23251 ER - TY - JOUR AU - Li, Junjiang AU - Giabbanelli, Philippe PY - 2021/4/29 TI - Returning to a Normal Life via COVID-19 Vaccines in the United States: A Large-scale Agent-Based Simulation Study JO - JMIR Med Inform SP - e27419 VL - 9 IS - 4 KW - agent-based model KW - cloud-based simulations KW - COVID-19 KW - large-scale simulations KW - vaccine KW - model KW - simulation KW - United States KW - agent-based KW - effective KW - willingness KW - capacity KW - plan KW - strategy KW - outcome KW - interaction KW - intervention KW - scenario KW - impact N2 - Background: In 2020, COVID-19 has claimed more than 300,000 deaths in the United States alone. Although nonpharmaceutical interventions were implemented by federal and state governments in the United States, these efforts have failed to contain the virus. Following the Food and Drug Administration's approval of two COVID-19 vaccines, however, the hope for the return to normalcy has been renewed. This hope rests on an unprecedented nationwide vaccine campaign, which faces many logistical challenges and is also contingent on several factors whose values are currently unknown. Objective: We study the effectiveness of a nationwide vaccine campaign in response to different vaccine efficacies, the willingness of the population to be vaccinated, and the daily vaccine capacity under two different federal plans. To characterize the possible outcomes most accurately, we also account for the interactions between nonpharmaceutical interventions and vaccines through 6 scenarios that capture a range of possible impacts from nonpharmaceutical interventions. 
Methods: We used large-scale, cloud-based, agent-based simulations by implementing the vaccination campaign using COVASIM, an open-source agent-based model for COVID-19 that has been used in several peer-reviewed studies and accounts for individual heterogeneity and a multiplicity of contact networks. Several modifications to the parameters and simulation logic were made to better align the model with current evidence. We chose 6 nonpharmaceutical intervention scenarios and applied the vaccination intervention following both the plan proposed by Operation Warp Speed (former Trump administration) and the plan of one million vaccines per day, proposed by the Biden administration. We accounted for unknowns in vaccine efficacies and levels of population compliance by varying both parameters. For each experiment, the cumulative infection growth was fitted to a logistic growth model, and the carrying capacities and the growth rates were recorded. Results: For both vaccination plans and all nonpharmaceutical intervention scenarios, the presence of the vaccine intervention considerably lowers the total number of infections when life returns to normal, even when the population compliance to vaccines is as low as 20%. We noted an unintended consequence; given the vaccine availability estimates under both federal plans and the focus on vaccinating individuals by age categories, a significant reduction in nonpharmaceutical interventions results in a counterintuitive situation in which higher vaccine compliance then leads to more total infections. Conclusions: Although potent, vaccines alone cannot effectively end the pandemic given the current availability estimates and the adopted vaccination strategy. Nonpharmaceutical interventions need to continue and be enforced to ensure high compliance so that the rate of immunity established by vaccination outpaces that induced by infections. 
UR - https://medinform.jmir.org/2021/4/e27419 UR - http://dx.doi.org/10.2196/27419 UR - http://www.ncbi.nlm.nih.gov/pubmed/33872188 ID - info:doi/10.2196/27419 ER - TY - JOUR AU - Post, Lori AU - Culler, Kasen AU - Moss, B. Charles AU - Murphy, L. Robert AU - Achenbach, J. Chad AU - Ison, G. Michael AU - Resnick, Danielle AU - Singh, Nadya Lauren AU - White, Janine AU - Boctor, J. Michael AU - Welch, B. Sarah AU - Oehmke, Francis James PY - 2021/4/28 TI - Surveillance of the Second Wave of COVID-19 in Europe: Longitudinal Trend Analyses JO - JMIR Public Health Surveill SP - e25695 VL - 7 IS - 4 KW - SARS-CoV-2 surveillance KW - wave two KW - second wave KW - global COVID surveillance KW - Europe Public Health Surveillance KW - Europe COVID KW - Europe surveillance metrics KW - dynamic panel data KW - generalized method of the moments KW - Europe econometrics KW - Europe SARS-CoV-2 KW - Europe COVID surveillance system KW - European COVID transmission speed KW - European COVID transmission acceleration KW - COVID transmission deceleration KW - COVID transmission jerk KW - COVID 7-day lag KW - SARS-CoV-2 KW - Arellano-Bond estimator KW - GMM KW - Albania KW - Andorra KW - Austria KW - Belarus KW - Belgium KW - Bosnia and Herzegovina KW - Bulgaria KW - Croatia KW - Czech Republic KW - Denmark KW - Estonia KW - Finland KW - France KW - Germany KW - Greece KW - Greenland KW - Hungary KW - Iceland KW - Ireland KW - Isle of Man KW - Italy KW - Latvia KW - Liechtenstein KW - Lithuania KW - Luxembourg KW - Moldova KW - Monaco KW - Montenegro KW - Netherlands KW - Norway KW - Poland KW - Portugal KW - Romania KW - San Marino KW - Serbia KW - Slovakia KW - Slovenia KW - Spain KW - Sweden KW - Switzerland KW - Ukraine KW - United Kingdom KW - Vatican City N2 - Background: The COVID-19 pandemic has severely impacted Europe, resulting in a high caseload and deaths that varied by country. The second wave of the COVID-19 pandemic has breached the borders of Europe. 
Public health surveillance is necessary to inform policy and guide leaders. Objective: This study aimed to provide advanced surveillance metrics for COVID-19 transmission that account for weekly shifts in the pandemic, speed, acceleration, jerk, and persistence, to better understand countries at risk for explosive growth and those that are managing the pandemic effectively. Methods: We performed a longitudinal trend analysis and extracted 62 days of COVID-19 data from public health registries. We used an empirical difference equation to measure the daily number of cases in Europe as a function of the prior number of cases, the level of testing, and weekly shift variables based on a dynamic panel model that was estimated using the generalized method of moments approach by implementing the Arellano-Bond estimator in R. Results: New COVID-19 cases slightly decreased from 158,741 (week 1, January 4-10, 2021) to 152,064 (week 2, January 11-17, 2021), and cumulative cases increased from 22,507,271 (week 1) to 23,890,761 (week 2), with a weekly increase of 1,383,490 between January 10 and January 17. France, Germany, Italy, Spain, and the United Kingdom had the largest 7-day moving averages for new cases during week 1. During week 2, the 7-day moving average for France and Spain increased. From week 1 to week 2, the speed decreased (37.72 to 33.02 per 100,000), acceleration decreased (0.39 to -0.16 per 100,000), and jerk increased (-1.30 to 1.37 per 100,000). Conclusions: The United Kingdom, Spain, and Portugal, in particular, are at risk for a rapid expansion in COVID-19 transmission. An examination of the European region suggests that there was a decrease in the COVID-19 caseload between January 4 and January 17, 2021. Unfortunately, the rates of jerk, which were negative for Europe at the beginning of the month, reversed course and became positive, despite decreases in speed and acceleration. 
Finally, the 7-day persistence rate was higher during week 2 than during week 1. These measures indicate that the second wave of the pandemic may be subsiding, but some countries remain at risk for new outbreaks and increased transmission in the absence of rapid policy responses. UR - https://publichealth.jmir.org/2021/4/e25695 UR - http://dx.doi.org/10.2196/25695 UR - http://www.ncbi.nlm.nih.gov/pubmed/33818391 ID - info:doi/10.2196/25695 ER - TY - JOUR AU - Post, Lori AU - Ohiomoba, O. Ramael AU - Maras, Ashley AU - Watts, J. Sean AU - Moss, B. Charles AU - Murphy, Leo Robert AU - Ison, G. Michael AU - Achenbach, J. Chad AU - Resnick, Danielle AU - Singh, Nadya Lauren AU - White, Janine AU - Chaudhury, S. Azraa AU - Boctor, J. Michael AU - Welch, B. Sarah AU - Oehmke, Francis James PY - 2021/4/27 TI - Latin America and the Caribbean SARS-CoV-2 Surveillance: Longitudinal Trend Analysis JO - JMIR Public Health Surveill SP - e25728 VL - 7 IS - 4 KW - 7-day persistence KW - acceleration KW - Arellano-Bond estimator KW - COVID-19 surveillance system KW - COVID-19 KW - dynamic panel data KW - econometrics KW - economic KW - generalized method of moments KW - global COVID-19 surveillance KW - Latin America and the Caribbean KW - longitudinal KW - metric KW - persistence KW - policy KW - public health surveillance KW - SARS-CoV-2 KW - second wave KW - surveillance metrics KW - transmission deceleration KW - transmission jerk KW - transmission speed KW - trend analysis N2 - Background: The COVID-19 pandemic has placed unprecedented stress on economies, food systems, and health care resources in Latin America and the Caribbean (LAC). Existing surveillance provides a proxy of the COVID-19 caseload and mortalities; however, these measures make it difficult to identify the dynamics of the pandemic and places where outbreaks are likely to occur. Moreover, existing surveillance techniques have failed to measure the dynamics of the pandemic. 
Objective: This study aimed to provide additional surveillance metrics for COVID-19 transmission to track changes in the speed, acceleration, jerk, and persistence in the transmission of the pandemic more accurately than existing metrics. Methods: Through a longitudinal trend analysis, we extracted COVID-19 data over 45 days from public health registries. We used an empirical difference equation to monitor the daily number of cases in the LAC as a function of the prior number of cases, the level of testing, and weekly shift variables based on a dynamic panel model that was estimated using the generalized method of moments approach by implementing the Arellano-Bond estimator in R. COVID-19 transmission rates were tracked for the LAC between September 30 and October 6, 2020, and between October 7 and 13, 2020. Results: The LAC saw a reduction in the speed, acceleration, and jerk for the week of October 13, 2020, compared to the week of October 6, 2020, accompanied by reductions in new cases and the 7-day moving average. For the week of October 6, 2020, Belize reported the highest acceleration and jerk, at 1.7 and 1.8, respectively, which is particularly concerning, given its high mortality rate. The Bahamas also had a high acceleration at 1.5. In total, 11 countries had a positive acceleration during the week of October 6, 2020, whereas only 6 countries had a positive acceleration for the week of October 13, 2020. The LAC displayed an overall positive trend, with a speed of 10.40, acceleration of 0.27, and jerk of -0.31, all of which decreased in the subsequent week to 9.04, -0.81, and -0.03, respectively. 
Conclusions: Metrics such as new cases, cumulative cases, deaths, and 7-day moving averages provide a static view of the pandemic but fail to identify where and how fast SARS-CoV-2 infects new individuals, the rate of acceleration or deceleration of the pandemic, and whether weekly comparisons of the rate of acceleration indicate impending explosive growth or control of the pandemic. Enhanced surveillance will inform policymakers and leaders in the LAC about COVID-19 outbreaks. UR - https://publichealth.jmir.org/2021/4/e25728 UR - http://dx.doi.org/10.2196/25728 UR - http://www.ncbi.nlm.nih.gov/pubmed/33852413 ID - info:doi/10.2196/25728 ER - TY - JOUR AU - Yeung, YS Arnold AU - Roewer-Despres, Francois AU - Rosella, Laura AU - Rudzicz, Frank PY - 2021/4/23 TI - Machine Learning-Based Prediction of Growth in Confirmed COVID-19 Infection Cases in 114 Countries Using Metrics of Nonpharmaceutical Interventions and Cultural Dimensions: Model Development and Validation JO - J Med Internet Res SP - e26628 VL - 23 IS - 4 KW - COVID-19 KW - machine learning KW - nonpharmaceutical interventions KW - cultural dimensions KW - random forest KW - AdaBoost KW - forecast KW - informatics KW - epidemiology KW - artificial intelligence N2 - Background: National governments worldwide have implemented nonpharmaceutical interventions to control the COVID-19 pandemic and mitigate its effects. Objective: The aim of this study was to investigate the prediction of future daily national confirmed COVID-19 infection growth (the percentage change in total cumulative cases) across 14 days for 114 countries using nonpharmaceutical intervention metrics and cultural dimension metrics, which are indicative of specific national sociocultural norms. 
Methods: We combined the Oxford COVID-19 Government Response Tracker data set, Hofstede cultural dimensions, and daily reported COVID-19 infection case numbers to train and evaluate five non-time series machine learning models in predicting confirmed infection growth. We used three validation methods (in-distribution, out-of-distribution, and country-based cross-validation) for the evaluation, each of which was applicable to a different use case of the models. Results: Our results demonstrate high R2 values between the labels and predictions for the in-distribution method (0.959) and moderate R2 values for the out-of-distribution and country-based cross-validation methods (0.513 and 0.574, respectively) using random forest and adaptive boosting (AdaBoost) regression. Although these models may be used to predict confirmed infection growth, the differing accuracies obtained from the three tasks suggest a strong influence of the use case. Conclusions: This work provides new considerations in using machine learning techniques with nonpharmaceutical interventions and cultural dimensions as metrics to predict the national growth of confirmed COVID-19 infections. UR - https://www.jmir.org/2021/4/e26628 UR - http://dx.doi.org/10.2196/26628 UR - http://www.ncbi.nlm.nih.gov/pubmed/33844636 ID - info:doi/10.2196/26628 ER - TY - JOUR AU - Her, Qoua AU - Kent, Thomas AU - Samizo, Yuji AU - Slavkovic, Aleksandra AU - Vilk, Yury AU - Toh, Sengwee PY - 2021/4/23 TI - Automatable Distributed Regression Analysis of Vertically Partitioned Data Facilitated by PopMedNet: Feasibility and Enhancement Study JO - JMIR Med Inform SP - e21459 VL - 9 IS - 4 KW - distributed regression analysis KW - distributed data networks KW - privacy-protecting analytics KW - vertically partitioned data KW - informatics KW - data networks KW - data N2 - Background: In clinical research, important variables may be collected from multiple data sources. 
Physical pooling of patient-level data from multiple sources often raises several challenges, including proper protection of patient privacy and proprietary interests. We previously developed an SAS-based package to perform distributed regression (a suite of privacy-protecting methods that perform multivariable-adjusted regression analysis using only summary-level information) with horizontally partitioned data, a setting where distinct cohorts of patients are available from different data sources. We integrated the package with PopMedNet, an open-source file transfer software, to facilitate secure file transfer between the analysis center and the data-contributing sites. The feasibility of using PopMedNet to facilitate distributed regression analysis (DRA) with vertically partitioned data, a setting where the data attributes from a cohort of patients are available from different data sources, was unknown. Objective: The objective of the study was to describe the feasibility of using PopMedNet and enhancements to PopMedNet to facilitate automatable vertical DRA (vDRA) in real-world settings. Methods: We gathered the statistical and informatic requirements of using PopMedNet to facilitate automatable vDRA. We enhanced PopMedNet based on these requirements to improve its technical capability to support vDRA. Results: PopMedNet can enable automatable vDRA. We identified and implemented two enhancements to PopMedNet that improved its technical capability to perform automatable vDRA in real-world settings. The first was the ability to simultaneously upload and download multiple files, and the second was the ability to directly transfer summary-level information between the data-contributing sites without a third-party analysis center. Conclusions: PopMedNet can be used to facilitate automatable vDRA to protect patient privacy and support clinical research in real-world settings. 
UR - https://medinform.jmir.org/2021/4/e21459 UR - http://dx.doi.org/10.2196/21459 UR - http://www.ncbi.nlm.nih.gov/pubmed/33890866 ID - info:doi/10.2196/21459 ER - TY - JOUR AU - Post, Lori AU - Mason, Maryann AU - Singh, Nadya Lauren AU - Wleklinski, P. Nicholas AU - Moss, B. Charles AU - Mohammad, Hassan AU - Issa, Z. Tariq AU - Akhetuamhen, I. Adesuwa AU - Brandt, A. Cynthia AU - Welch, B. Sarah AU - Oehmke, Francis James PY - 2021/4/22 TI - Impact of Firearm Surveillance on Gun Control Policy: Regression Discontinuity Analysis JO - JMIR Public Health Surveill SP - e26042 VL - 7 IS - 4 KW - firearm surveillance KW - assault weapons ban KW - large-capacity magazines KW - guns control policy KW - mass shootings KW - regression lines of discontinuity N2 - Background: Public mass shootings are a significant public health problem that require ongoing systematic surveillance to test and inform policies that combat gun injuries. Although there is widespread agreement that something needs to be done to stop public mass shootings, opinions on exactly which policies that entails vary, such as the prohibition of assault weapons and large-capacity magazines. Objective: The aim of this study was to determine if the Federal Assault Weapons Ban (FAWB) (1994-2004) reduced the number of public mass shootings while it was in place. Methods: We extracted public mass shooting surveillance data from the Violence Project that matched our inclusion criteria of 4 or more fatalities in a public space during a single event. We performed regression discontinuity analysis, taking advantage of the imposition of the FAWB, which included a prohibition on large-capacity magazines in addition to assault weapons. 
We estimated a regression model of the 5-year moving average number of public mass shootings per year for the period of 1966 to 2019 controlling for population growth and homicides in general, introduced regression discontinuities in the intercept and a time trend for years coincident with the federal legislation (ie, 1994-2004), and also allowed for a differential effect of the homicide rate during this period. We introduced a second set of trend and intercept discontinuities for post-FAWB years to capture the effects of termination of the policy. We used the regression results to predict what would have happened from 1995 to 2019 had there been no FAWB and also to project what would have happened from 2005 onward had it remained in place. Results: The FAWB resulted in a significant decrease in public mass shootings, number of gun deaths, and number of gun injuries. We estimate that the FAWB prevented 11 public mass shootings during the decade the ban was in place. A continuation of the FAWB would have prevented 30 public mass shootings that killed 339 people and injured an additional 1139 people. Conclusions: This study demonstrates the utility of public health surveillance on gun violence. Surveillance informs policy on whether a ban on assault weapons and large-capacity magazines reduces public mass shootings. As society searches for effective policies to prevent the next mass shooting, we must consider the overwhelming evidence that bans on assault weapons and/or large-capacity magazines work. 
UR - https://publichealth.jmir.org/2021/4/e26042 UR - http://dx.doi.org/10.2196/26042 UR - http://www.ncbi.nlm.nih.gov/pubmed/33783360 ID - info:doi/10.2196/26042 ER - TY - JOUR AU - De Leo, Stefano PY - 2021/4/21 TI - Impact of COVID-19 Testing Strategies and Lockdowns on Disease Management Across Europe, South America, and the United States: Analysis Using Skew-Normal Distributions JO - JMIRx Med SP - e21269 VL - 2 IS - 2 KW - COVID-19 KW - testing strategy KW - skew-normal distributions KW - lockdown KW - forecast KW - modeling KW - outbreak KW - infectious disease KW - prediction N2 - Background: As COVID-19 infections worldwide exceed 6 million confirmed cases, the data reveal that the first wave of the outbreak is coming to an end in many European countries. There is variation in the testing strategies (eg, massive testing vs testing only those displaying symptoms) and the strictness of lockdowns imposed by countries around the world. For example, Brazil's mitigation measures lie between the strict lockdowns imposed by many European countries and the more liberal approach taken by Sweden. This can influence COVID-19 metrics (eg, total deaths, confirmed cases) in unexpected ways. Objective: This study aimed to evaluate the effectiveness of local authorities' strategies in managing the COVID-19 pandemic in Europe, South America, and the United States. Methods: The early stage of the COVID-19 outbreak in Brazil was compared to Europe using the weekly transmission rate. Using the European data as a basis for our analysis, we examined the spread of COVID-19 and modeled curves pertaining to daily confirmed cases and deaths per million using skew-normal probability density functions. For Sweden, the United Kingdom, and the United States, we forecasted the end of the pandemic, and for Brazil, we predicted the peak value for daily deaths per million. 
We also discussed additional factors that could play an important role in the fight against COVID-19, such as the fast response of local authorities, testing strategies, number of beds in the intensive care unit, and isolation strategies adopted. Results: The European data analysis demonstrated that the transmission rate of COVID-19 increased similarly for all countries in the initial stage of the pandemic but changed as the total confirmed cases per million in each country grew. This was caused by the variation in timely action by local authorities in adopting isolation measures and/or massive testing strategies. The behavior of daily confirmed cases for the United States and Brazil during the early stage of the outbreak was similar to that of Italy and Sweden, respectively. For daily deaths per million, transmission in the United States was similar to that of Switzerland, whereas for Brazil, it was greater than the counts for Portugal, Germany, and Austria (which had, in terms of total deaths per million, the best results in Europe) but lower than other European countries. Conclusions: The fitting skew parameters used to model the curves for daily confirmed cases per million and daily deaths per million allow for a more realistic prediction of the end of the pandemic and permit us to compare the mitigation measures adopted by local authorities by analyzing their respective skew-normal parameters. The massive testing strategy adopted in the early stage of the pandemic by German authorities made a positive difference compared to other countries like Italy where an effective testing strategy was adopted too late. This explains why, despite a strictly indiscriminate lockdown, Italy's mortality rate was one of the highest in the world. UR - https://xmed.jmir.org/2021/2/e21269 UR - http://dx.doi.org/10.2196/21269 UR - http://www.ncbi.nlm.nih.gov/pubmed/34032814 ID - info:doi/10.2196/21269 ER - TY - JOUR AU - Ilyin, O. 
Sergey PY - 2021/4/19 TI - A Recursive Model of the Spread of COVID-19: Modelling Study JO - JMIR Public Health Surveill SP - e21468 VL - 7 IS - 4 KW - epidemiology KW - COVID-19 KW - model KW - modelling KW - prediction KW - spread KW - infection KW - effective KW - contagious KW - transmission N2 - Background: The major medical and social challenge of the 21st century is COVID-19, caused by the novel coronavirus SARS-CoV-2. Critical issues include the rate at which the coronavirus spreads and the effect of quarantine measures and population vaccination on this rate. Knowledge of the laws of the spread of COVID-19 will enable assessment of the effectiveness and reasonableness of the quarantine measures used, as well as determination of the necessary level of vaccination needed to overcome this crisis. Objective: This study aims to establish the laws of the spread of COVID-19 and to use them to develop a mathematical model to predict changes in the number of active cases over time, possible human losses, and the rate of recovery of patients, to make informed decisions about the number of necessary beds in hospitals, the introduction and type of quarantine measures, and the required threshold of vaccination of the population. Methods: This study analyzed the onset of COVID-19 spread in countries such as China, Italy, Spain, the United States, the United Kingdom, Japan, France, and Germany based on publicly available statistical data. The change in the number of COVID-19 cases, deaths, and recovered persons over time was examined, considering the possible introduction of quarantine measures and isolation of infected people in these countries. Based on the data, the virus transmissibility and the average duration of the disease at different stages were evaluated, and a model based on the principle of recursion was developed. 
Its key features are the separation of active (nonisolated) infected persons into a distinct category and the prediction of their number based on the average duration of the disease in the inactive phase and the concentration of these persons in the population in the preceding days. Results: Specific values for SARS-CoV-2 transmissibility and COVID-19 duration were estimated for different countries. In China, the viral transmissibility was 3.12 before quarantine measures were implemented and 0.36 after these measures were lifted. For the other countries, the viral transmissibility was 2.28-2.76 initially, and it then decreased to 0.87-1.29 as a result of quarantine measures. Therefore, it can be expected that the spread of SARS-CoV-2 will be suppressed if 56%-64% of the total population becomes vaccinated or survives COVID-19. Conclusions: The quarantine measures adopted in most countries are too weak compared to those previously used in China. Therefore, it is not expected that the spread of COVID-19 will stop and the disease will cease to exist naturally or owing to quarantine measures. Active vaccination of the population is needed to prevent the spread of COVID-19. Furthermore, the required specific percentage of vaccinated individuals depends on the magnitude of viral transmissibility, which can be evaluated using the proposed model and statistical data for the country of interest. UR - https://publichealth.jmir.org/2021/4/e21468 UR - http://dx.doi.org/10.2196/21468 UR - http://www.ncbi.nlm.nih.gov/pubmed/33871381 ID - info:doi/10.2196/21468 ER - TY - JOUR AU - Shapiro, B. 
Mark AU - Karim, Fazle AU - Muscioni, Guido AU - Augustine, Saju Abel PY - 2021/4/7 TI - Adaptive Susceptible-Infectious-Removed Model for Continuous Estimation of the COVID-19 Infection Rate and Reproduction Number in the United States: Modeling Study JO - J Med Internet Res SP - e24389 VL - 23 IS - 4 KW - compartmental models KW - COVID-19 KW - decision-making KW - estimate KW - infection rate KW - infectious disease KW - modeling KW - pandemic KW - prediction KW - reproduction number KW - SARS-CoV-2 KW - United States N2 - Background: The dynamics of the COVID-19 pandemic vary owing to local population density and policy measures. During decision-making, policymakers consider an estimate of the effective reproduction number Rt, which is the expected number of secondary infections spread by a single infected individual. Objective: We propose a simple method for estimating the time-varying infection rate and the Rt. Methods: We used a sliding window approach with a Susceptible-Infectious-Removed (SIR) model. We estimated the infection rate from the reported cases over a 7-day window to obtain a continuous estimation of Rt. A proposed adaptive SIR (aSIR) model was applied to analyze the data at the state and county levels. Results: The aSIR model showed an excellent fit for the number of reported COVID-19 cases, and the 1-day forecast mean absolute prediction error was <2.6% across all states. However, the 7-day forecast mean absolute prediction error approached 16.2% and strongly overestimated the number of cases when the Rt was rapidly decreasing. The maximal Rt displayed a wide range of 2.0 to 4.5 across all states, with the highest values for New York (4.4) and Michigan (4.5). We found that the aSIR model can rapidly adapt to an increase in the number of tests and an associated increase in the reported cases of infection. Our results also suggest that intensive testing may be an effective method of reducing Rt. 
Conclusions: The aSIR model provides a simple and accurate computational tool for continuous Rt estimation and evaluation of the efficacy of mitigation measures. UR - https://www.jmir.org/2021/4/e24389 UR - http://dx.doi.org/10.2196/24389 UR - http://www.ncbi.nlm.nih.gov/pubmed/33755577 ID - info:doi/10.2196/24389 ER - TY - JOUR AU - Benneyan, James AU - Gehrke, Christopher AU - Ilies, Iulian AU - Nehls, Nicole PY - 2021/4/7 TI - Community and Campus COVID-19 Risk Uncertainty Under University Reopening Scenarios: Model-Based Analysis JO - JMIR Public Health Surveill SP - e24292 VL - 7 IS - 4 KW - COVID-19 KW - university reopening KW - community impact KW - epidemic model KW - model KW - community KW - university KW - safety KW - strategy KW - risk KW - infectious disease N2 - Background: Significant uncertainty has existed about the safety of reopening college and university campuses before the COVID-19 pandemic is better controlled. Moreover, little is known about the effects that on-campus students may have on local higher-risk communities. Objective: We aimed to estimate the range of potential community and campus COVID-19 exposures, infections, and mortality under various university reopening plans and uncertainties. Methods: We developed campus-only, community-only, and campus × community epidemic differential equations and agent-based models, with inputs estimated via published and grey literature, expert opinion, and parameter search algorithms. Campus opening plans (spanning fully open, hybrid, and fully virtual approaches) were identified from websites and publications. Additional student and community exposures, infections, and mortality over 16-week semesters were estimated under each scenario, with 10% trimmed medians, standard deviations, and probability intervals computed to omit extreme outliers. Sensitivity analyses were conducted to inform potential effective interventions. 
Results: Predicted 16-week campus and additional community exposures, infections, and mortality for the base case with no precautions (or negligible compliance) varied significantly from their medians (4- to 10-fold). Over 5% of on-campus students were infected after a mean of 76 (SD 17) days, with the greatest increase (first inflection point) occurring on average on day 84 (SD 10.2 days) of the semester and with total additional community exposures, infections, and mortality ranging from 1-187, 13-820, and 1-21 per 10,000 residents, respectively. Reopening precautions reduced infections by 24%-26% and mortality by 36%-50% in both populations. Beyond campus and community reproductive numbers, sensitivity analysis indicated no dominant factors that interventions could primarily target to reduce the magnitude and variability in outcomes, suggesting the importance of comprehensive public health measures and surveillance. Conclusions: Community and campus COVID-19 exposures, infections, and mortality resulting from reopening campuses are highly unpredictable regardless of precautions. Public health implications include the need for effective surveillance and flexible campus operations. UR - https://publichealth.jmir.org/2021/4/e24292 UR - http://dx.doi.org/10.2196/24292 UR - http://www.ncbi.nlm.nih.gov/pubmed/33667173 ID - info:doi/10.2196/24292 ER - TY - JOUR AU - Staffini, Alessio AU - Svensson, Kishi Akiko AU - Chung, Ung-Il AU - Svensson, Thomas PY - 2021/4/6 TI - An Agent-Based Model of the Local Spread of SARS-CoV-2: Modeling Study JO - JMIR Med Inform SP - e24192 VL - 9 IS - 4 KW - computational epidemiology KW - COVID-19 KW - SARS-CoV-2 KW - agent-based modeling KW - public health KW - computational models KW - modeling KW - agent KW - spread KW - computation KW - epidemiology KW - policy N2 - Background: The spread of SARS-CoV-2, originating in Wuhan, China, was classified as a pandemic by the World Health Organization on March 11, 2020. 
The governments of affected countries have implemented various measures to limit the spread of the virus. The starting point of this paper is the different government approaches, in terms of promulgating new legislative regulations to limit the virus diffusion and to contain negative effects on the populations. Objective: This paper aims to study how the spread of SARS-CoV-2 is linked to government policies and to analyze how different policies have produced different results on public health. Methods: Considering the official data provided by 4 countries (Italy, Germany, Sweden, and Brazil) and from the measures implemented by each government, we built an agent-based model to study the effects that these measures will have over time on different variables such as the total number of COVID-19 cases, intensive care unit (ICU) bed occupancy rates, and recovery and case-fatality rates. The model we implemented provides the possibility of modifying some starting variables, and it was thus possible to study the effects that some policies (eg, keeping the national borders closed or increasing the ICU beds) would have had on the spread of the infection. Results: The 4 considered countries have adopted different containment measures for COVID-19, and the forecasts provided by the model for the considered variables have given different results. Italy and Germany seem to be able to limit the spread of the infection and any eventual second wave, while Sweden and Brazil do not seem to have the situation under control. This situation is also reflected in the forecasts of pressure on the National Health Services, which see Sweden and Brazil with a high occupancy rate of ICU beds in the coming months, with a consequent high number of deaths. 
Conclusions: In line with what we expected, the obtained results showed that the countries that have taken restrictive measures in terms of limiting the population mobility have managed more successfully than others to contain the spread of COVID-19. Moreover, the model demonstrated that herd immunity cannot be reached even in countries that have relied on a strategy without strict containment measures. UR - https://medinform.jmir.org/2021/4/e24192 UR - http://dx.doi.org/10.2196/24192 UR - http://www.ncbi.nlm.nih.gov/pubmed/33750735 ID - info:doi/10.2196/24192 ER - TY - JOUR AU - Chu, MY Amanda AU - Chan, NL Jacky AU - Tsang, TY Jenny AU - Tiwari, Agnes AU - So, KP Mike PY - 2021/3/29 TI - Analyzing Cross-country Pandemic Connectedness During COVID-19 Using a Spatial-Temporal Database: Network Analysis JO - JMIR Public Health Surveill SP - e27317 VL - 7 IS - 3 KW - air traffic KW - coronavirus KW - COVID-19 KW - human mobility KW - network analysis KW - travel restrictions UR - https://publichealth.jmir.org/2021/3/e27317 UR - http://dx.doi.org/10.2196/27317 UR - http://www.ncbi.nlm.nih.gov/pubmed/33711799 ID - info:doi/10.2196/27317 ER - TY - JOUR AU - Huang, Yingxiang AU - Radenkovic, Dina AU - Perez, Kevin AU - Nadeau, Kari AU - Verdin, Eric AU - Furman, David PY - 2021/3/25 TI - Modeling Predictive Age-Dependent and Age-Independent Symptoms and Comorbidities of Patients Seeking Treatment for COVID-19: Model Development and Validation Study JO - J Med Internet Res SP - e25696 VL - 23 IS - 3 KW - clinical informatics KW - predictive modeling KW - COVID-19 KW - app KW - model KW - prediction KW - symptom KW - informatics KW - age KW - morbidity KW - hospital N2 - Background: The COVID-19 pandemic continues to ravage and burden hospitals around the world. The epidemic started in Wuhan, China, and was subsequently recognized by the World Health Organization as an international public health emergency and declared a pandemic in March 2020. 
Since then, the disruptions caused by the COVID-19 pandemic have had an unparalleled effect on all aspects of life. Objective: With increasing total hospitalization and intensive care unit admissions, a better understanding of features related to patients with COVID-19 could help health care workers stratify patients based on the risk of developing a more severe case of COVID-19. Using predictive models, we strive to select the features that are most associated with more severe cases of COVID-19. Methods: Over 3 million participants reported their potential symptoms of COVID-19, along with their comorbidities and demographic information, on a smartphone-based app. Using data from the >10,000 individuals who indicated that they had tested positive for COVID-19 in the United Kingdom, we leveraged the Elastic Net regularized binary classifier to derive the predictors that are most correlated with users having a severe enough case of COVID-19 to seek treatment in a hospital setting. We then analyzed such features in relation to age and other demographics and their longitudinal trend. Results: The most predictive features found include fever, use of immunosuppressant medication, use of a mobility aid, shortness of breath, and severe fatigue. Such features are age-related, and some are disproportionally high in minority populations. Conclusions: Predictors selected from the predictive models can be used to stratify patients into groups based on how much medical attention they are expected to require. This could help health care workers devote valuable resources to prevent the escalation of the disease in vulnerable populations. UR - https://www.jmir.org/2021/3/e25696 UR - http://dx.doi.org/10.2196/25696 UR - http://www.ncbi.nlm.nih.gov/pubmed/33621185 ID - info:doi/10.2196/25696 ER - TY - JOUR AU - Lynch, J. 
Christopher AU - Gore, Ross PY - 2021/3/23 TI - Short-Range Forecasting of COVID-19 During Early Onset at County, Health District, and State Geographic Levels Using Seven Methods: Comparative Forecasting Study JO - J Med Internet Res SP - e24925 VL - 23 IS - 3 KW - coronavirus disease 2019 KW - COVID-19 KW - infectious disease KW - emerging outbreak KW - forecasting KW - modeling and simulation KW - public health KW - modeling disease outbreaks N2 - Background: Forecasting methods rely on trends and averages of prior observations to forecast COVID-19 case counts. COVID-19 forecasts have received much media attention, and numerous platforms have been created to inform the public. However, forecasting effectiveness varies by geographic scope and is affected by changing assumptions in behaviors and preventative measures in response to the pandemic. Due to time requirements for developing a COVID-19 vaccine, evidence is needed to inform short-term forecasting method selection at county, health district, and state levels. Objective: COVID-19 forecasts keep the public informed and contribute to public policy. As such, proper understanding of forecasting purposes and outcomes is needed to advance knowledge of health statistics for policy makers and the public. Using publicly available real-time data provided online, we aimed to evaluate the performance of seven forecasting methods utilized to forecast cumulative COVID-19 case counts. Forecasts were evaluated based on how well they forecast 1, 3, and 7 days forward when utilizing 1-, 3-, 7-, or all prior-day cumulative case counts during early virus onset. This study provides an objective evaluation of the forecasting methods to identify forecasting model assumptions that contribute to lower error in forecasting COVID-19 cumulative case growth. This information benefits professionals, decision makers, and the public relying on the data provided by short-term case count estimates at varied geographic levels. 
Methods: We created 1-, 3-, and 7-day forecasts at the county, health district, and state levels using (1) a naïve approach, (2) Holt-Winters (HW) exponential smoothing, (3) a growth rate approach, (4) a moving average (MA) approach, (5) an autoregressive (AR) approach, (6) an autoregressive moving average (ARMA) approach, and (7) an autoregressive integrated moving average (ARIMA) approach. Forecasts relied on Virginia's 3464 historical county-level cumulative case counts from March 7 to April 22, 2020, as reported by The New York Times. Statistically significant results were identified using 95% CIs of median absolute error (MdAE) and median absolute percentage error (MdAPE) metrics of the resulting 216,698 forecasts. Results: The next-day MA forecast with 3-day look-back length obtained the lowest MdAE (median 0.67, 95% CI 0.49-0.84, P<.001) and statistically significantly differed from 39 out of 59 alternatives (66%) to 53 out of 59 alternatives (90%) at each geographic level at a significance level of .01. For short-range forecasting, methods assuming stationary means of prior days' counts outperformed methods with assumptions of weak stationarity or nonstationarity means. MdAPE results revealed statistically significant differences across geographic levels. Conclusions: For short-range COVID-19 cumulative case count forecasting at the county, health district, and state levels during early onset, the following were found: (1) the MA method was effective for forecasting 1-, 3-, and 7-day cumulative case counts; (2) exponential growth was not the best representation of case growth during early virus onset when the public was aware of the virus; and (3) geographic resolution was a factor in the selection of forecasting methods. 
UR - https://www.jmir.org/2021/3/e24925 UR - http://dx.doi.org/10.2196/24925 UR - http://www.ncbi.nlm.nih.gov/pubmed/33621186 ID - info:doi/10.2196/24925 ER - TY - JOUR AU - De Carvalho, Atem Eduardo AU - De Carvalho, Atem Rogerio PY - 2021/3/18 TI - A Framework for a Statistical Characterization of Epidemic Cycles: COVID-19 Case Study JO - JMIRx Med SP - e22617 VL - 2 IS - 1 KW - COVID-19 KW - SARS-CoV-2 KW - pandemics KW - infection control KW - models KW - experimental KW - longitudinal studies KW - statistical modeling KW - epidemic cycles N2 - Background: Since the beginning of the COVID-19 pandemic, researchers and health authorities have sought to identify the different parameters that drive its local transmission cycles to make better decisions regarding prevention and control measures. Different modeling approaches have been proposed in an attempt to predict the behavior of these local cycles. Objective: This paper presents a framework to characterize the different variables that drive the local, or epidemic, cycles of the COVID-19 pandemic, in order to provide a set of relatively simple, yet efficient, statistical tools to be used by local health authorities to support decision making. Methods: Virtually closed cycles were compared to cycles in progress from different locations that present similar patterns in the figures that describe them. With the aim to compare populations of different sizes at different periods of time and locations, the cycles were normalized, allowing an analysis based on the core behavior of the numerical series. A model for the reproduction number was derived from the experimental data, and its performance was presented, including the effect of subnotification (ie, underreporting). A variation of the logistic model was used together with an innovative inventory model to calculate the actual number of infected persons, analyze the incubation period, and determine the actual onset of local epidemic cycles. 
Results: The similarities among cycles were demonstrated. A pattern between the cycles studied, which took on a triangular shape, was identified and used to make predictions about the duration of future cycles. Analyses on effective reproduction number (Rt) and subnotification effects for Germany, Italy, and Sweden were presented to show the performance of the framework introduced here. After comparing data from the three countries, it was possible to determine the probable dates of the actual onset of the epidemic cycles for each country, the typical duration of the incubation period for the disease, and the total number of infected persons during each cycle. In general terms, a probable average incubation time of 5 days was found, and the method used here was able to estimate the end of the cycles up to 34 days in advance, while demonstrating that the impact of the subnotification level (ie, error) on the effective reproduction number was <5%. Conclusions: It was demonstrated that, with relatively simple mathematical tools, it is possible to obtain a reliable understanding of the behavior of COVID-19 local epidemic cycles, by introducing an integrated framework for identifying cycle patterns and calculating the variables that drive it, namely: the Rt, the subnotification effects on estimations, the most probable actual cycles start dates, the total number of infected, and the most likely incubation period for SARS-CoV-2. 
UR - https://xmed.jmir.org/2021/1/e22617 UR - http://dx.doi.org/10.2196/22617 UR - http://www.ncbi.nlm.nih.gov/pubmed/34077489 ID - info:doi/10.2196/22617 ER - TY - JOUR AU - Tran, Phoebe AU - Tran, Lam AU - Tran, Liem PY - 2021/3/18 TI - The Influence of Social Distancing on COVID-19 Mortality in US Counties: Cross-sectional Study JO - JMIR Public Health Surveill SP - e21606 VL - 7 IS - 3 KW - COVID-19 KW - marginal effects KW - mortality KW - negative binomial model KW - social distancing N2 - Background: Previous studies on the impact of social distancing on COVID-19 mortality in the United States have predominantly examined this relationship at the national level and have not separated COVID-19 deaths in nursing homes from total COVID-19 deaths. This approach may obscure differences in social distancing behaviors by county in addition to the actual effectiveness of social distancing in preventing COVID-19 deaths. Objective: This study aimed to determine the influence of county-level social distancing behavior on COVID-19 mortality (deaths per 100,000 people) across US counties over the period of the implementation of stay-at-home orders in most US states (March-May 2020). Methods: Using social distancing data from tracked mobile phones in all US counties, we estimated the relationship between social distancing (average proportion of mobile phone usage outside of home between March and May 2020) and COVID-19 mortality (when the state in which the county is located reported its first confirmed case of COVID-19 and up to May 31, 2020) with a mixed-effects negative binomial model while distinguishing COVID-19 deaths in nursing homes from total COVID-19 deaths and accounting for social distancing- and COVID-19-related factors (including the period between the report of the first confirmed case of COVID-19 and May 31, 2020; population density; social vulnerability; and hospital resource availability). 
Results from the mixed-effects negative binomial model were then used to generate marginal effects at the mean, which helped separate the influence of social distancing on COVID-19 deaths from other covariates while calculating COVID-19 deaths per 100,000 people. Results: We observed that a 1% increase in average mobile phone usage outside of home between March and May 2020 led to a significant increase in COVID-19 mortality by a factor of 1.18 (P<.001), while every 1% increase in the average proportion of mobile phone usage outside of home in February 2020 was found to significantly decrease COVID-19 mortality by a factor of 0.90 (P<.001). Conclusions: As stay-at-home orders have been lifted in many US states, continued adherence to other social distancing measures, such as avoiding large gatherings and maintaining physical distance in public, is key to preventing additional COVID-19 deaths in counties across the country. UR - https://publichealth.jmir.org/2021/3/e21606 UR - http://dx.doi.org/10.2196/21606 UR - http://www.ncbi.nlm.nih.gov/pubmed/33497348 ID - info:doi/10.2196/21606 ER - TY - JOUR AU - Kurita, Junko AU - Sugishita, Yoshiyuki AU - Sugawara, Tamie AU - Ohkusa, Yasushi PY - 2021/2/15 TI - Evaluating Apple Inc Mobility Trend Data Related to the COVID-19 Outbreak in Japan: Statistical Analysis JO - JMIR Public Health Surveill SP - e20335 VL - 7 IS - 2 KW - peak KW - COVID-19 KW - effective reproduction number KW - mobility trend data KW - Apple KW - countermeasure N2 - Background: In Japan, as a countermeasure against the COVID-19 outbreak, both the national and local governments issued voluntary restrictions against going out from residences at the end of March 2020 in preference to the lockdowns instituted in European and North American countries. 
The effect of such measures can be studied with mobility data, such as data which is generated by counting the number of requests made to Apple Maps for directions in select countries/regions, sub-regions, and cities. Objective: We investigate the associations between mobility data provided by Apple Inc and an estimate of the effective reproduction number R(t). Methods: We regressed R(t) on a polynomial function of daily Apple data, estimated using the whole period, and analyzed subperiods delimited by March 10, 2020. Results: In the estimation results, R(t) was 1.72 when voluntary restrictions against going out ceased and mobility reverted to a normal level. However, the critical level of reducing R(t) to <1 was obtained at 89.3% of normal mobility. Conclusions: We demonstrated that Apple mobility data are useful for short-term prediction of R(t). The results indicate that the number of trips should decrease by 10% until herd immunity is achieved and that higher voluntary restrictions against going out might not be necessary for avoiding a re-emergence of the outbreak. UR - http://publichealth.jmir.org/2021/2/e20335/ UR - http://dx.doi.org/10.2196/20335 UR - http://www.ncbi.nlm.nih.gov/pubmed/33481755 ID - info:doi/10.2196/20335 ER - TY - JOUR AU - Lin, Sheng-Hsuan AU - Fu, Shih-Chen AU - Kao, Michael Chu-Lan PY - 2021/1/27 TI - Using the Novel Mortality-Prevalence Ratio to Evaluate Potentially Undocumented SARS-CoV-2 Infection: Correlational Study JO - JMIR Public Health Surveill SP - e23034 VL - 7 IS - 1 KW - COVID-19 KW - prevalence KW - mortality KW - undocumented infection KW - mortality-prevalence ratio KW - China N2 - Background: The high prevalence of COVID-19 has resulted in 200,000 deaths as of early 2020. The corresponding mortality rate among different countries and times varies. Objective: This study aims to investigate the relationship between the mortality rate and prevalence of COVID-19 within a country. 
Methods: We collected data from the Johns Hopkins Coronavirus Resource Center. These data included the daily cumulative death count, recovered count, and confirmed count for each country. This study focused on a total of 36 countries with over 10,000 confirmed COVID-19 cases. Mortality was the main outcome and dependent variable, and it was computed by dividing the number of COVID-19 deaths by the number of confirmed cases. Results: The results of our global panel regression analysis showed that there was a highly significant correlation between prevalence and mortality (ρ=0.8304; P<.001). We found that every increment of 1 confirmed COVID-19 case per 1000 individuals led to a 1.29268% increase in mortality, after controlling for country-specific baseline mortality and time-fixed effects. Over 70% of excess mortality could be attributed to prevalence, and the heterogeneity among countries' mortality-prevalence ratio was significant (P<.001). Further, our results showed that China had an abnormally high and significant mortality-prevalence ratio compared to other countries (P<.001). This unusual deviation in the mortality-prevalence ratio disappeared with the removal of the data that was collected from China after February 17, 2020. It is worth noting that the prevalence of a disease relies on accurate diagnoses and comprehensive surveillance, which can be difficult to achieve due to practical or political concerns. Conclusions: The association between COVID-19 mortality and prevalence was observed and quantified as the mortality-prevalence ratio. Our results highlight the importance of constraining disease transmission to decrease mortality rates. The comparison of mortality-prevalence ratios between countries can be a powerful method for detecting, or even quantifying, the proportion of individuals with undocumented SARS-CoV-2 infection. 
UR - http://publichealth.jmir.org/2021/1/e23034/ UR - http://dx.doi.org/10.2196/23034 UR - http://www.ncbi.nlm.nih.gov/pubmed/33332282 ID - info:doi/10.2196/23034 ER - TY - JOUR AU - McKee, L. Kevin AU - Crandell, C. Ian AU - Hanlon, L. Alexandra PY - 2020/12/23 TI - County-Level Social Distancing and Policy Impact in the United States: A Dynamical Systems Model JO - JMIR Public Health Surveill SP - e23902 VL - 6 IS - 4 KW - pandemic KW - SARS-CoV-2 KW - infection control KW - COVID-19 KW - social distancing KW - lockdown KW - nonpharmaceutical interventions KW - public health KW - intervention KW - model KW - infectious disease KW - policy N2 - Background: Social distancing and public policy have been crucial for minimizing the spread of SARS-CoV-2 in the United States. Publicly available, county-level time series data on mobility are derived from individual devices with global positioning systems, providing a variety of indices of social distancing behavior per day. Such indices allow a fine-grained approach to modeling public behavior during the pandemic. Previous studies of social distancing and policy have not accounted for the occurrence of pre-policy social distancing and other dynamics reflected in the long-term trajectories of public mobility data. Objective: We propose a differential equation state-space model of county-level social distancing that accounts for distancing behavior leading up to the first official policies, equilibrium dynamics reflected in the long-term trajectories of mobility, and the specific impacts of four kinds of policy. The model is fit to each US county individually, producing a nationwide data set of novel estimated mobility indices. Methods: A differential equation model was fit to three indicators of mobility for each of 3054 counties, with T=100 occasions per county of the following: distance traveled, visitations to key sites, and the log number of interpersonal encounters. 
The indicators were highly correlated and assumed to share a common underlying latent trajectory, dynamics, and responses to policy. Maximum likelihood estimation with the Kalman-Bucy filter was used to estimate the model parameters. Bivariate distributional plots and descriptive statistics were used to examine the resulting county-level parameter estimates. The association of chronology with policy impact was also considered. Results: Mobility dynamics show moderate correlations with two census covariates: population density (Spearman r ranging from 0.11 to 0.31) and median household income (Spearman r ranging from -0.03 to 0.39). Stay-at-home order effects were negatively correlated with both (r=-0.37 and r=-0.38, respectively), while the effects of the ban on all gatherings were positively correlated with both (r=0.51, r=0.39). Chronological ordering of policies was a moderate to strong determinant of their effect per county (Spearman r ranging from -0.12 to -0.56), with earlier policies accounting for most of the change in mobility, and later policies having little or no additional effect. Conclusions: Chronological ordering, population density, and median household income were all associated with policy impact. The stay-at-home order and the ban on gatherings had the largest impacts on mobility on average. The model is implemented in a graphical online app for exploring county-level statistics and running counterfactual simulations. Future studies can incorporate the model-derived indices of social distancing and policy impacts as important social determinants of COVID-19 health outcomes. 
UR - http://publichealth.jmir.org/2020/4/e23902/ UR - http://dx.doi.org/10.2196/23902 UR - http://www.ncbi.nlm.nih.gov/pubmed/33296866 ID - info:doi/10.2196/23902 ER - TY - JOUR AU - Ebrahim, Senan AU - Ashworth, Henry AU - Noah, Cray AU - Kadambi, Adesh AU - Toumi, Asmae AU - Chhatwal, Jagpreet PY - 2020/12/21 TI - Reduction of COVID-19 Incidence and Nonpharmacologic Interventions: Analysis Using a US County-Level Policy Data Set JO - J Med Internet Res SP - e24614 VL - 22 IS - 12 KW - communicable diseases KW - COVID-19 KW - data set KW - pandemic KW - policy KW - public health KW - data KW - intervention KW - effectiveness KW - incidence KW - time series N2 - Background: Worldwide, nonpharmacologic interventions (NPIs) have been the main tool used to mitigate the COVID-19 pandemic. This includes social distancing measures (closing businesses, closing schools, and quarantining symptomatic persons) and contact tracing (tracking and following exposed individuals). While preliminary research across the globe has shown these policies to be effective, there is currently a lack of information on the effectiveness of NPIs in the United States. Objective: The purpose of this study was to create a granular NPI data set at the county level and then analyze the relationship between NPI policies and changes in reported COVID-19 cases. Methods: Using a standardized crowdsourcing methodology, we collected time-series data on 7 key NPIs for 1320 US counties. Results: This open-source data set is the largest and most comprehensive collection of county NPI policy data and meets the need for higher-resolution COVID-19 policy data. Our analysis revealed a wide variation in county-level policies both within and among states (P<.001). We identified a correlation between workplace closures and lower growth rates of COVID-19 cases (P=.004). 
We found weak correlations between shelter-in-place enforcement and measures of Democratic local voter proportion (R=0.21) and elected leadership (R=0.22). Conclusions: This study is the first large-scale NPI analysis at the county level demonstrating a correlation between NPIs and decreased rates of COVID-19. Future work using this data set will explore the relationship between county-level policies and COVID-19 transmission to optimize real-time policy formulation. UR - http://www.jmir.org/2020/12/e24614/ UR - http://dx.doi.org/10.2196/24614 UR - http://www.ncbi.nlm.nih.gov/pubmed/33302253 ID - info:doi/10.2196/24614 ER - TY - JOUR AU - Kim, Ki-Hun AU - Kim, Kwang-Jae PY - 2020/12/17 TI - Missing-Data Handling Methods for Lifelogs-Based Wellness Index Estimation: Comparative Analysis With Panel Data JO - JMIR Med Inform SP - e20597 VL - 8 IS - 12 KW - lifelogs-based wellness index KW - missing-data handling KW - health behavior lifelogs KW - panel data KW - smart wellness service N2 - Background: A lifelogs-based wellness index (LWI) is a function for calculating wellness scores based on health behavior lifelogs (eg, daily walking steps and sleep times collected via a smartwatch). A wellness score intuitively shows the users of smart wellness services the overall condition of their health behaviors. LWI development includes estimation (ie, estimating coefficients in LWI with data). A panel data set comprising health behavior lifelogs allows LWI estimation to control for unobserved variables, thereby resulting in less bias. However, these data sets typically have missing data due to events that occur in daily life (eg, smart devices stop collecting data when batteries are depleted), which can introduce biases into LWI coefficients. Thus, the appropriate choice of method to handle missing data is important for reducing biases in LWI estimations with panel data. However, there is a lack of research in this area. 
Objective: This study aims to identify a suitable missing-data handling method for LWI estimation with panel data. Methods: Listwise deletion, mean imputation, expectation maximization-based multiple imputation, predictive-mean matching-based multiple imputation, k-nearest neighbors-based imputation, and low-rank approximation-based imputation were comparatively evaluated by simulating an existing case of LWI development. A panel data set comprising health behavior lifelogs of 41 college students over 4 weeks was transformed into a reference data set without any missing data. Then, 200 simulated data sets were generated by randomly introducing missing data at proportions from 1% to 80%. The missing-data handling methods were each applied to transform the simulated data sets into complete data sets, and coefficients in a linear LWI were estimated for each complete data set. For each proportion for each method, a bias measure was calculated by comparing the estimated coefficient values with values estimated from the reference data set. Results: Methods performed differently depending on the proportion of missing data. For 1% to 30% proportions, low-rank approximation-based imputation, predictive-mean matching-based multiple imputation, and expectation maximization-based multiple imputation were superior. For 31% to 60% proportions, low-rank approximation-based imputation and predictive-mean matching-based multiple imputation performed best. For over 60% proportions, only low-rank approximation-based imputation performed acceptably. Conclusions: Low-rank approximation-based imputation was the best of the 6 data-handling methods regardless of the proportion of missing data. This superiority is generalizable to other panel data sets comprising health behavior lifelogs given their verified low-rank nature, for which low-rank approximation-based imputation is known to perform effectively. 
This result will guide missing-data handling in reducing coefficient biases in new development cases of linear LWIs with panel data. UR - http://medinform.jmir.org/2020/12/e20597/ UR - http://dx.doi.org/10.2196/20597 UR - http://www.ncbi.nlm.nih.gov/pubmed/33331831 ID - info:doi/10.2196/20597 ER - TY - JOUR AU - Utamura, Motoaki AU - Koizumi, Makoto AU - Kirikami, Seiichi PY - 2020/12/16 TI - An Epidemiological Model Considering Isolation to Predict COVID-19 Trends in Tokyo, Japan: Numerical Analysis JO - JMIR Public Health Surveill SP - e23624 VL - 6 IS - 4 KW - coronavirus KW - COVID-19 KW - epidemiological model KW - prediction KW - Tokyo KW - delay differential equation KW - SIR model KW - model KW - epidemiology KW - isolation KW - trend N2 - Background: COVID-19 currently poses a global public health threat. Although Tokyo, Japan, is no exception to this, it was initially affected by only a small-level epidemic. Nevertheless, medical collapse nearly happened since no predictive methods were available to assess infection counts. A standard susceptible-infectious-removed (SIR) epidemiological model has been widely used, but its applicability is limited often to the early phase of an epidemic in the case of a large collective population. A full numerical simulation of the entire period from beginning until end would be helpful for understanding COVID-19 trends in (separate) counts of inpatient and infectious cases and can also aid the preparation of hospital beds and development of quarantine strategies. Objective: This study aimed to develop an epidemiological model that considers the isolation period to simulate a comprehensive trend of the initial epidemic in Tokyo that yields separate counts of inpatient and infectious cases. It was also intended to induce important corollaries of governing equations (ie, effective reproductive number) and equations for the final count. 
Methods: Time-series data related to SARS-CoV-2 from February 28 to May 23, 2020, from Tokyo and antibody testing conducted by the Japanese government were adopted for this study. A novel epidemiological model based on a discrete delay differential equation (apparent time-lag model [ATLM]) was introduced. The model can predict trends in inpatient and infectious cases in the field. Various data such as daily new confirmed cases, cumulative infections, inpatients, and PCR (polymerase chain reaction) test positivity ratios were used to verify the model. This approach also derived an alternative formulation equivalent to the standard SIR model. Results: In a typical parameter setting, the present ATLM provided 20% less infectious cases in the field compared to the standard SIR model prediction owing to isolation. The basic reproductive number was inferred as 2.30 under the condition that the time lag T from infection to detection and isolation is 14 days. Based on this, an adequate vaccine ratio to avoid an outbreak was evaluated for 57% of the population. We assessed the date (May 23) that the government declared a rescission of the state of emergency. Taking into consideration the number of infectious cases in the field, a date of 1 week later (May 30) would have been most effective. Furthermore, simulation results with a shorter time lag of T=7 and a larger transmission rate of ?=1.43?0 suggest that infections at large should reduce by half and inpatient numbers should be similar to those of the first wave of COVID-19. Conclusions: A novel mathematical model was proposed and examined using SARS-CoV-2 data for Tokyo. The simulation agreed with data from the beginning of the pandemic. Shortening the period from infection to hospitalization is effective against outbreaks without rigorous public health interventions and control. 
UR - http://publichealth.jmir.org/2020/4/e23624/ UR - http://dx.doi.org/10.2196/23624 UR - http://www.ncbi.nlm.nih.gov/pubmed/33259325 ID - info:doi/10.2196/23624 ER - TY - JOUR AU - Post, Ann Lori AU - Issa, Ziad Tariq AU - Boctor, J. Michael AU - Moss, B. Charles AU - Murphy, L. Robert AU - Ison, G. Michael AU - Achenbach, J. Chad AU - Resnick, Danielle AU - Singh, Nadya Lauren AU - White, Janine AU - Faber, Mitchell Joshua Marco AU - Culler, Kasen AU - Brandt, A. Cynthia AU - Oehmke, Francis James PY - 2020/12/3 TI - Dynamic Public Health Surveillance to Track and Mitigate the US COVID-19 Epidemic: Longitudinal Trend Analysis Study JO - J Med Internet Res SP - e24286 VL - 22 IS - 12 KW - global COVID-19 surveillance KW - United States public health surveillance KW - US COVID-19 KW - surveillance metrics KW - dynamic panel data KW - generalized method of the moments KW - United States econometrics KW - US SARS-CoV-2 KW - US COVID-19 surveillance system KW - US COVID-19 transmission speed KW - COVID-19 transmission acceleration KW - COVID-19 speed KW - COVID-19 acceleration KW - COVID-19 jerk KW - COVID-19 persistence KW - Arellano-Bond estimator KW - COVID-19 N2 - Background: The emergence of SARS-CoV-2, the virus that causes COVID-19, has led to a global pandemic. The United States has been severely affected, accounting for the most COVID-19 cases and deaths worldwide. Without a coordinated national public health plan informed by surveillance with actionable metrics, the United States has been ineffective at preventing and mitigating the escalating COVID-19 pandemic. Existing surveillance has incomplete ascertainment and is limited by the use of standard surveillance metrics. Although many COVID-19 data sources track infection rates, informing prevention requires capturing the relevant dynamics of the pandemic. Objective: The aim of this study is to develop dynamic metrics for public health surveillance that can inform worldwide COVID-19 prevention efforts. 
Advanced surveillance techniques are essential to inform public health decision making and to identify where and when corrective action is required to prevent outbreaks. Methods: Using a longitudinal trend analysis study design, we extracted COVID-19 data from global public health registries. We used an empirical difference equation to measure daily case numbers for our use case in 50 US states and the District of Columbia as a function of the prior number of cases, the level of testing, and weekly shift variables based on a dynamic panel model that was estimated using the generalized method of moments approach by implementing the Arellano-Bond estimator in R. Results: Examination of the United States and state data demonstrated that most US states are experiencing outbreaks as measured by these new metrics of speed, acceleration, jerk, and persistence. Larger US states have high COVID-19 caseloads as a function of population size, density, and deficits in adherence to public health guidelines early in the epidemic, and other states have alarming rates of speed, acceleration, jerk, and 7-day persistence in novel infections. North and South Dakota have had the highest rates of COVID-19 transmission combined with positive acceleration, jerk, and 7-day persistence. Wisconsin and Illinois also have alarming indicators and already lead the nation in daily new COVID-19 infections. As the United States enters its third wave of COVID-19, all 50 states and the District of Columbia have positive rates of speed between 7.58 (Hawaii) and 175.01 (North Dakota), and persistence, ranging from 4.44 (Vermont) to 195.35 (North Dakota) new infections per 100,000 people. Conclusions: Standard surveillance techniques such as daily and cumulative infections and deaths are helpful but only provide a static view of what has already occurred in the pandemic and are less helpful in prevention. 
Public health policy that is informed by dynamic surveillance can shift the country from reacting to COVID-19 transmissions to being proactive and taking corrective action when indicators of speed, acceleration, jerk, and persistence remain positive week over week. Implicit within our dynamic surveillance is an early warning system that indicates when there is problematic growth in COVID-19 transmissions as well as signals when growth will become explosive without action. A public health approach that focuses on prevention can prevent major outbreaks in addition to endorsing effective public health policies. Moreover, subnational analyses on the dynamics of the pandemic allow us to zero in on where transmissions are increasing, meaning corrective action can be applied with precision in problematic areas. Dynamic public health surveillance can inform specific geographies where quarantines are necessary while preserving the economy in other US areas. UR - https://www.jmir.org/2020/12/e24286 UR - http://dx.doi.org/10.2196/24286 UR - http://www.ncbi.nlm.nih.gov/pubmed/33216726 ID - info:doi/10.2196/24286 ER - TY - JOUR AU - Nakano, Takashi AU - Ikeda, Yoichi PY - 2020/11/30 TI - Novel Indicator to Ascertain the Status and Trend of COVID-19 Spread: Modeling Study JO - J Med Internet Res SP - e20144 VL - 22 IS - 11 KW - communicable diseases KW - COVID-19 KW - SARS-CoV-2 KW - model KW - modeling KW - virus KW - infectious disease KW - spread N2 - Background: In the fight against the pandemic of COVID-19, it is important to ascertain the status and trend of the infection spread quickly and accurately. Objective: The purpose of our study is to formulate a new and simple indicator that represents the COVID-19 spread rate by using publicly available data. Methods: The new indicator K is a backward difference approximation of the logarithmic derivative of the cumulative number of cases with a time interval of 7 days. 
It is calculated as a ratio of the number of newly confirmed cases in a week to the total number of cases. Results: The analysis of the current status of COVID-19 spreading over countries showed an approximate linear decrease in the time evolution of the K value. The slope of the linear decrease differed from country to country. In addition, it was steeper for East and Southeast Asian countries than for European countries. The regional difference in the slope seems to reflect both social and immunological circumstances for each country. Conclusions: The approximate linear decrease of the K value indicates that the COVID-19 spread does not grow exponentially but starts to attenuate from the early stage. The K trajectory in a wide range was successfully reproduced by a phenomenological model with the constant attenuation assumption, indicating that the total number of the infected people follows the Gompertz curve. Focusing on the change in the value of K will help to improve and refine epidemiological models of COVID-19. UR - http://www.jmir.org/2020/11/e20144/ UR - http://dx.doi.org/10.2196/20144 UR - http://www.ncbi.nlm.nih.gov/pubmed/33180742 ID - info:doi/10.2196/20144 ER - TY - JOUR AU - Post, Ann Lori AU - Argaw, T. Salem AU - Jones, Cameron AU - Moss, B. Charles AU - Resnick, Danielle AU - Singh, Nadya Lauren AU - Murphy, Leo Robert AU - Achenbach, J. Chad AU - White, Janine AU - Issa, Ziad Tariq AU - Boctor, J. 
Michael AU - Oehmke, Francis James PY - 2020/11/19 TI - A SARS-CoV-2 Surveillance System in Sub-Saharan Africa: Modeling Study for Persistence and Transmission to Inform Policy JO - J Med Internet Res SP - e24248 VL - 22 IS - 11 KW - global COVID-19 surveillance KW - African public health surveillance KW - sub-Saharan African COVID-19 KW - African surveillance metrics KW - dynamic panel data KW - generalized method of the moments KW - African econometrics KW - African SARS-CoV-2 KW - African COVID-19 surveillance system KW - African COVID-19 transmission speed KW - African COVID-19 transmission acceleration KW - COVID-19 transmission deceleration KW - COVID-19 transmission jerk KW - COVID-19 7-day persistence KW - Sao Tome and Principe KW - Senegal KW - Seychelles KW - Sierra Leone KW - Somalia KW - South Africa KW - South Sudan KW - Sudan KW - Suriname KW - Swaziland KW - Tanzania KW - Togo KW - Uganda KW - Zambia KW - Zimbabwe KW - Gambia KW - Ghana KW - Guinea KW - Guinea-Bissau KW - Kenya KW - Lesotho KW - Liberia KW - Madagascar KW - Malawi KW - Mali KW - Mauritania KW - Mauritius KW - Mozambique KW - Namibia KW - Niger KW - Nigeria KW - Rwanda KW - Angola KW - Benin KW - Botswana KW - Burkina Faso KW - Burundi KW - Cameroon KW - Central African Republic KW - Chad KW - Comoros KW - Congo KW - Cote d'Ivoire KW - Democratic Republic of Congo KW - Equatorial Guinea KW - Eritrea KW - Ethiopia KW - Gabon N2 - Background: Since the novel coronavirus emerged in late 2019, the scientific and public health community around the world have sought to better understand, surveil, treat, and prevent the disease, COVID-19. In sub-Saharan Africa (SSA), many countries responded aggressively and decisively with lockdown measures and border closures. Such actions may have helped prevent large outbreaks throughout much of the region, though there is substantial variation in caseloads and mortality between nations. 
Additionally, the health system infrastructure remains a concern throughout much of SSA, and the lockdown measures threaten to increase poverty and food insecurity for the subcontinent's poorest residents. The lack of sufficient testing, asymptomatic infections, and poor reporting practices in many countries limit our understanding of the virus's impact, creating a need for better and more accurate surveillance metrics that account for underreporting and data contamination. Objective: The goal of this study is to improve infectious disease surveillance by complementing standardized metrics with new and decomposable surveillance metrics of COVID-19 that overcome data limitations and contamination inherent in public health surveillance systems. In addition to prevalence of observed daily and cumulative testing, testing positivity rates, morbidity, and mortality, we derived COVID-19 transmission in terms of speed, acceleration or deceleration, change in acceleration or deceleration (jerk), and 7-day transmission rate persistence, which explains where and how rapidly COVID-19 is transmitting and quantifies shifts in the rate of acceleration or deceleration to inform policies to mitigate and prevent COVID-19 and food insecurity in SSA. Methods: We extracted 60 days of COVID-19 data from public health registries and employed an empirical difference equation to measure daily case numbers in 47 sub-Saharan countries as a function of the prior number of cases, the level of testing, and weekly shift variables based on a dynamic panel model that was estimated using the generalized method of moments approach by implementing the Arellano-Bond estimator in R. Results: Kenya, Ghana, Nigeria, Ethiopia, and South Africa have the most observed cases of COVID-19, and the Seychelles, Eritrea, Mauritius, Comoros, and Burundi have the fewest. In contrast, the speed, acceleration, jerk, and 7-day persistence indicate rates of COVID-19 transmissions differ from observed cases. 
In September 2020, Cape Verde, Namibia, Eswatini, and South Africa had the highest speed of COVID-19 transmissions at 13.1, 7.1, 3.6, and 3 infections per 100,000, respectively; Zimbabwe had an acceleration rate of transmission, while Zambia had the largest rate of deceleration this week compared to last week, referred to as a jerk. Finally, the 7-day persistence rate indicates the number of cases on September 15, 2020, which are a function of new infections from September 8, 2020, decreased in South Africa from 216.7 to 173.2 and Ethiopia from 136.7 to 106.3 per 100,000. The statistical approach was validated based on the regression results; they determined recent changes in the pattern of infection, and during the weeks of September 1-8 and September 9-15, there were substantial country differences in the evolution of the SSA pandemic. This change represents a decrease in the transmission model R value for that week and is consistent with a de-escalation in the pandemic for the sub-Saharan African continent in general. Conclusions: Standard surveillance metrics such as daily observed new COVID-19 cases or deaths are necessary but insufficient to mitigate and prevent COVID-19 transmission. Public health leaders also need to know where COVID-19 transmission rates are accelerating or decelerating, whether those rates increase or decrease over short time frames because the pandemic can quickly escalate, and how many cases today are a function of new infections 7 days ago. Even though SSA is home to some of the poorest countries in the world, development and population size are not necessarily predictive of COVID-19 transmission, meaning higher income countries like the United States can learn from African countries on how best to implement mitigation and prevention efforts. 
International Registered Report Identifier (IRRID): RR2-10.2196/21955 UR - https://www.jmir.org/2020/11/e24248 UR - http://dx.doi.org/10.2196/24248 UR - http://www.ncbi.nlm.nih.gov/pubmed/33211026 ID - info:doi/10.2196/24248 ER - TY - JOUR AU - Arias Garcia, Sonia AU - Chen, Jia AU - Calleja, Garcia Jesus AU - Sabin, Keith AU - Ogbuanu, Chinelo AU - Lowrance, David AU - Zhao, Jinkou PY - 2020/11/17 TI - Availability and Quality of Surveillance and Survey Data on HIV Prevalence Among Sex Workers, Men Who Have Sex With Men, People Who Inject Drugs, and Transgender Women in Low- and Middle-Income Countries: Review of Available Data (2001-2017) JO - JMIR Public Health Surveill SP - e21688 VL - 6 IS - 4 KW - Key populations KW - HIV prevalence KW - men who have sex with men KW - people who inject drugs KW - sex workers KW - transgender women KW - low- and middle-income countries N2 - Background: In 2019, 62% of new HIV infections occurred among key populations (KPs) and their sexual partners. The World Health Organization (WHO) recommends implementation of bio-behavioral surveys every 2-3 years to obtain HIV prevalence data for all KPs. However, the collection of these data is often less frequent and geographically limited. Objective: This study intended to assess the availability and quality of HIV prevalence data among sex workers (SWs), men who have sex with men (MSM), people who inject drugs, and transgender women (transwomen) in low- and middle-income countries. Methods: Data were obtained from survey reports, national reports, journal articles, and other grey literature available to the Global Fund, Joint United Nations Programme on HIV/AIDS, and WHO or from other open sources. Elements reviewed included names of subnational units, HIV prevalence, sampling method, and size. 
Based on geographical coverage, availability of trends over time, and recency of estimates, data were categorized by country and grouped as follows: nationally adequate, locally adequate but nationally inadequate, no recent data, no trends available, and no data. Results: Among the 123 countries assessed, 91.9% (113/123) presented at least 1 HIV prevalence data point for any KP; 78.0% (96/123) presented data for at least 2 groups; and 51.2% (63/123), for at least 3 groups. Data on all 4 groups were available for only 14.6% (18/123) of the countries. HIV prevalence data for SWs, MSM, people who inject drugs, and transwomen were available in 86.2% (106/123), 80.5% (99/123), 45.5% (56/123), and 23.6% (29/123) of the countries, respectively. Only 10.6% (13/123) of the countries presented nationally adequate data for any KP between 2001 and 2017; 6 for SWs; 2 for MSM; and 5 for people who inject drugs. Moreover, 26.8% (33/123) of the countries were categorized as locally adequate but nationally inadequate, mostly for SWs and MSM. No trend data on SWs and MSM were available for 38.2% (47/123) and 43.9% (54/123) of the countries, respectively, while no data on people who inject drugs and transwomen were available for 76.4% (94/123) and 54.5% (67/123) of the countries, respectively. An increase in the number of data points was observed for MSM and transwomen. Overall increases were noted in the number and proportions of data points, especially for MSM, people who inject drugs, and transwomen, with sample sizes exceeding 100. Conclusions: Despite general improvements in health data availability and quality, the availability of HIV prevalence data among the most vulnerable populations in low- and middle-income countries remains insufficient. Data collection should be expanded to include behavioral, clinical, and epidemiologic data through context-specific differentiated survey approaches while emphasizing data use for program improvements. 
Ending the HIV epidemic by 2030 is possible only if the epidemic is controlled among KPs. UR - http://publichealth.jmir.org/2020/4/e21688/ UR - http://dx.doi.org/10.2196/21688 UR - http://www.ncbi.nlm.nih.gov/pubmed/33200996 ID - info:doi/10.2196/21688 ER - TY - JOUR AU - El Emam, Khaled AU - Mosquera, Lucy AU - Bass, Jason PY - 2020/11/16 TI - Evaluating Identity Disclosure Risk in Fully Synthetic Health Data: Model Development and Validation JO - J Med Internet Res SP - e23139 VL - 22 IS - 11 KW - synthetic data KW - privacy KW - data sharing KW - data access KW - de-identification KW - open data N2 - Background: There has been growing interest in data synthesis for enabling the sharing of data for secondary analysis; however, there is a need for a comprehensive privacy risk model for fully synthetic data: If the generative models have been overfit, then it is possible to identify individuals from synthetic data and learn something new about them. Objective: The purpose of this study is to develop and apply a methodology for evaluating the identity disclosure risks of fully synthetic data. Methods: A full risk model is presented, which evaluates both identity disclosure and the ability of an adversary to learn something new if there is a match between a synthetic record and a real person. We term this "meaningful identity disclosure risk." The model is applied on samples from the Washington State Hospital discharge database (2007) and the Canadian COVID-19 cases database. Both of these datasets were synthesized using a sequential decision tree process commonly used to synthesize health and social science data. Results: The meaningful identity disclosure risk for both of these synthesized samples was below the commonly used 0.09 risk threshold (0.0198 and 0.0086, respectively), and 4 times and 5 times lower than the risk values for the original datasets, respectively. 
Conclusions: We have presented a comprehensive identity disclosure risk model for fully synthetic data. The results for this synthesis method on 2 datasets demonstrate that synthesis can reduce meaningful identity disclosure risks considerably. The risk model can be applied in the future to evaluate the privacy of fully synthetic data. UR - http://www.jmir.org/2020/11/e23139/ UR - http://dx.doi.org/10.2196/23139 UR - http://www.ncbi.nlm.nih.gov/pubmed/33196453 ID - info:doi/10.2196/23139 ER - TY - JOUR AU - Qin, Lei AU - Wang, Yidan AU - Sun, Qiang AU - Zhang, Xiaomei AU - Shia, Ben-Chang AU - Liu, Chengcheng PY - 2020/11/13 TI - Analysis of the COVID-19 Epidemic Transmission Network in Mainland China: K-Core Decomposition Study JO - JMIR Public Health Surveill SP - e24291 VL - 6 IS - 4 KW - COVID-19 KW - epidemic network KW - prevention and control KW - k-core decomposition N2 - Background: Since the outbreak of COVID-19 in December 2019 in Wuhan, Hubei Province, China, frequent interregional contacts and the high rate of infection spread have catalyzed the formation of an epidemic network. Objective: The aim of this study was to identify influential nodes and highlight the hidden structural properties of the COVID-19 epidemic network, which we believe is central to prevention and control of the epidemic. Methods: We first constructed a network of the COVID-19 epidemic among 31 provinces in mainland China; after some basic characteristics were revealed by the degree distribution, the k-core decomposition method was employed to provide static and dynamic evidence to determine the influential nodes and hierarchical structure. We then exhibited the influence power of the above nodes and the evolution of this power. Results: Only a small fraction of the provinces studied showed relatively strong outward or inward epidemic transmission effects. 
The three provinces of Hubei, Beijing, and Guangzhou showed the highest out-degrees, and the three highest in-degrees were observed for the provinces of Beijing, Henan, and Liaoning. In terms of the hierarchical structure of the COVID-19 epidemic network over the whole period, more than half of the 31 provinces were located in the innermost core. Considering the correlation of the characteristics and coreness of each province, we identified some significant negative and positive factors. Specific to the dynamic transmission process of the COVID-19 epidemic, three provinces of Anhui, Beijing, and Guangdong always showed the highest coreness from the third to the sixth week; meanwhile, Hubei Province maintained the highest coreness until the fifth week and then suddenly dropped to the lowest in the sixth week. We also found that the out-strengths of the innermost nodes were greater than their in-strengths before January 27, 2020, at which point a reversal occurred. Conclusions: Increasing our understanding of how epidemic networks form and function may help reduce the damaging effects of COVID-19 in China as well as in other countries and territories worldwide. 
UR - http://publichealth.jmir.org/2020/4/e24291/ UR - http://dx.doi.org/10.2196/24291 UR - http://www.ncbi.nlm.nih.gov/pubmed/33108309 ID - info:doi/10.2196/24291 ER - TY - JOUR AU - Saurabh, Suman AU - Verma, Kumar Mahendra AU - Gautam, Vaishali AU - Kumar, Nitesh AU - Goel, Dhanesh Akhil AU - Gupta, Kumar Manoj AU - Bhardwaj, Pankaj AU - Misra, Sanjeev PY - 2020/10/15 TI - Transmission Dynamics of the COVID-19 Epidemic at the District Level in India: Prospective Observational Study JO - JMIR Public Health Surveill SP - e22678 VL - 6 IS - 4 KW - Epidemiology KW - SARS-CoV-2 KW - COVID-19 KW - serial interval KW - basic reproduction number KW - projection KW - outbreak response KW - India KW - mathematical modeling KW - infectious disease N2 - Background: On March 9, 2020, the first COVID-19 case was reported in Jodhpur, Rajasthan, in the northwestern part of India. Understanding the epidemiology of COVID-19 at a local level is becoming increasingly important to guide measures to control the pandemic. Objective: The aim of this study was to estimate the serial interval and basic reproduction number (R0) to understand the transmission dynamics of the COVID-19 outbreak at a district level. We used standard mathematical modeling approaches to assess the utility of these factors in determining the effectiveness of COVID-19 responses and projecting the size of the epidemic. Methods: Contact tracing of individuals infected with SARS-CoV-2 was performed to obtain the serial intervals. The median and 95th percentile values of the SARS-CoV-2 serial interval were obtained from the best fits with the Weibull, log-normal, log-logistic, gamma, and generalized gamma distributions. Aggregate and instantaneous R0 values were derived with different methods using the EarlyR and EpiEstim packages in R software. Results: The median and 95th percentile values of the serial interval were 5.23 days (95% CI 4.72-5.79) and 13.20 days (95% CI 10.90-18.18), respectively. 
R0 during the first 30 days of the outbreak was 1.62 (95% CI 1.07-2.17), which subsequently decreased to 1.15 (95% CI 1.09-1.21). The peak instantaneous R0 values obtained using a Poisson process developed by Jombart et al were 6.53 (95% CI 2.12-13.38) and 3.43 (95% CI 1.71-5.74) for sliding time windows of 7 and 14 days, respectively. The peak R0 values obtained using the method by Wallinga and Teunis were 2.96 (95% CI 2.52-3.36) and 2.92 (95% CI 2.65-3.22) for sliding time windows of 7 and 14 days, respectively. R0 values of 1.21 (95% CI 1.09-1.34) and 1.12 (95% CI 1.03-1.21) for the 7- and 14-day sliding time windows, respectively, were obtained on July 6, 2020, using the method by Jombart et al. Using the method by Wallinga and Teunis, values of 0.32 (95% CI 0.27-0.36) and 0.61 (95% CI 0.58-0.63) were obtained for the 7- and 14-day sliding time windows, respectively. The projection of cases over the next month was 2131 (95% CI 1799-2462). Reductions of transmission by 25% and 50% corresponding to reasonable and aggressive control measures could lead to 58.7% and 84.0% reductions in epidemic size, respectively. Conclusions: The projected transmission reductions indicate that strengthening control measures could lead to proportionate reductions of the size of the COVID-19 epidemic. Time-dependent instantaneous R0 estimation based on the process by Jombart et al was found to be better suited for guiding COVID-19 response at the district level than overall R0 or instantaneous R0 estimation by the Wallinga and Teunis method. A data-driven approach at the local level is proposed to be useful in guiding public health strategy and surge capacity planning. 
UR - http://publichealth.jmir.org/2020/4/e22678/ UR - http://dx.doi.org/10.2196/22678 UR - http://www.ncbi.nlm.nih.gov/pubmed/33001839 ID - info:doi/10.2196/22678 ER - TY - JOUR AU - Tosi, Davide AU - Campi, Alessandro PY - 2020/10/14 TI - How Data Analytics and Big Data Can Help Scientists in Managing COVID-19 Diffusion: Modeling Study to Predict the COVID-19 Diffusion in Italy and the Lombardy Region JO - J Med Internet Res SP - e21081 VL - 22 IS - 10 KW - COVID-19 KW - SARS-CoV-2 KW - big data KW - data analytics KW - predictive models KW - prediction KW - modeling KW - Italy KW - diffusion N2 - Background: COVID-19 is the most widely discussed topic worldwide in 2020, and at the beginning of the Italian epidemic, scientists tried to understand the virus diffusion and the epidemic curve of positive cases with controversial findings and numbers. Objective: In this paper, a data analytics study on the diffusion of COVID-19 in Italy and the Lombardy Region is developed to define a predictive model tailored to forecast the evolution of the diffusion over time. Methods: Starting with all available official data collected worldwide about the diffusion of COVID-19, we defined a predictive model at the beginning of March 2020 for the Italian country. Results: This paper aims at showing how this predictive model was able to forecast the behavior of the COVID-19 diffusion and how it predicted the total number of positive cases in Italy over time. The predictive model forecasted, for the Italian country, the end of the COVID-19 first wave by the beginning of June. Conclusions: This paper shows that big data and data analytics can help medical experts and epidemiologists in promptly designing accurate and generalized models to predict the different COVID-19 evolutionary phases in other countries and regions, and for second and third possible epidemic waves. 
UR - http://www.jmir.org/2020/10/e21081/ UR - http://dx.doi.org/10.2196/21081 UR - http://www.ncbi.nlm.nih.gov/pubmed/33027038 ID - info:doi/10.2196/21081 ER - TY - JOUR AU - Oehmke, Francis James AU - Moss, B. Charles AU - Singh, Nadya Lauren AU - Oehmke, Bristol Theresa AU - Post, Ann Lori PY - 2020/10/5 TI - Dynamic Panel Surveillance of COVID-19 Transmission in the United States to Inform Health Policy: Observational Statistical Study JO - J Med Internet Res SP - e21955 VL - 22 IS - 10 KW - COVID-19 KW - models KW - surveillance KW - reopening America KW - contagion KW - metrics KW - health policy KW - public health N2 - Background: The Great COVID-19 Shutdown aimed to eliminate or slow the spread of SARS-CoV-2, the virus that causes COVID-19. The United States has no national policy, leaving states to independently implement public health guidelines that are predicated on a sustained decline in COVID-19 cases. Operationalization of "sustained decline" varies by state and county. Existing models of COVID-19 transmission rely on parameters such as case estimates or R0 and are dependent on intensive data collection efforts. Static statistical models do not capture all of the relevant dynamics required to measure sustained declines. Moreover, existing COVID-19 models use data that are subject to significant measurement error and contamination. Objective: This study will generate novel metrics of speed, acceleration, jerk, and 7-day lag in the speed of COVID-19 transmission using state government tallies of SARS-CoV-2 infections, including state-level dynamics of SARS-CoV-2 infections. This study provides the prototype for a global surveillance system to inform public health practice, including novel standardized metrics of COVID-19 transmission, for use in combination with traditional surveillance tools. Methods: Dynamic panel data models were estimated with the Arellano-Bond estimator using the generalized method of moments. 
This statistical technique allows for the control of a variety of deficiencies in the existing data. Tests of the validity of the model and statistical techniques were applied. Results: The statistical approach was validated based on the regression results, which determined recent changes in the pattern of infection. During the weeks of August 17-23 and August 24-30, 2020, there were substantial regional differences in the evolution of the US pandemic. Census regions 1 and 2 were relatively quiet with a small but significant persistence effect that remained relatively unchanged from the prior 2 weeks. Census region 3 was sensitive to the number of tests administered, with a high constant rate of cases. A weekly special analysis showed that these results were driven by states with a high number of positive test reports from universities. Census region 4 had a high constant number of cases and a significantly increased persistence effect during the week of August 24-30. This change represents an increase in the transmission model R value for that week and is consistent with a re-emergence of the pandemic. Conclusions: Reopening the United States comes with three certainties: (1) the "social" end of the pandemic and reopening are going to occur before the "medical" end even while the pandemic is growing. We need improved standardized surveillance techniques to inform leaders when it is safe to open sections of the country; (2) varying public health policies and guidelines unnecessarily result in varying degrees of transmission and outbreaks; and (3) even those states most successful in containing the pandemic continue to see a small but constant stream of new cases daily. UR - https://www.jmir.org/2020/10/e21955 UR - http://dx.doi.org/10.2196/21955 UR - http://www.ncbi.nlm.nih.gov/pubmed/32924962 ID - info:doi/10.2196/21955 ER - TY - JOUR AU - Oehmke, Francis James AU - Oehmke, B. 
Theresa AU - Singh, Nadya Lauren AU - Post, Ann Lori PY - 2020/9/22 TI - Dynamic Panel Estimate-Based Health Surveillance of SARS-CoV-2 Infection Rates to Inform Public Health Policy: Model Development and Validation JO - J Med Internet Res SP - e20924 VL - 22 IS - 9 KW - COVID-19 KW - models KW - surveillance KW - COVID-19 surveillance system KW - dynamic panel data KW - infectious disease modeling KW - reopening America KW - COVID-19 guidelines KW - COVID-19 health policy N2 - Background: SARS-CoV-2, the novel coronavirus that causes COVID-19, is a global pandemic with higher mortality and morbidity than any other virus in the last 100 years. Without public health surveillance, policy makers cannot know where and how the disease is accelerating, decelerating, and shifting. Unfortunately, existing models of COVID-19 contagion rely on parameters such as the basic reproduction number and use static statistical methods that do not capture all the relevant dynamics needed for surveillance. Existing surveillance methods use data that are subject to significant measurement error and other contaminants. Objective: The aim of this study is to provide a proof of concept of the creation of surveillance metrics that correct for measurement error and data contamination to determine when it is safe to ease pandemic restrictions. We applied state-of-the-art statistical modeling to existing internet data to derive the best available estimates of the state-level dynamics of COVID-19 infection in the United States. Methods: Dynamic panel data (DPD) models were estimated with the Arellano-Bond estimator using the generalized method of moments. This statistical technique enables control of various deficiencies in a data set. The validity of the model and statistical technique was tested. 
Results: A Wald chi-square test of the explanatory power of the statistical approach indicated that it is valid (χ²(10)=1489.84, P<.001), and a Sargan chi-square test indicated that the model identification is valid (χ²(946)=935.52, P=.59). The 7-day persistence rate for the week of June 27 to July 3 was 0.5188 (P<.001), meaning that every 10,000 new cases in the prior week were associated with 5188 cases 7 days later. For the week of July 4 to 10, the 7-day persistence rate increased by 0.2691 (P=.003), indicating that every 10,000 new cases in the prior week were associated with 7879 new cases 7 days later. Applied to the reported number of cases, these results indicate an increase of almost 100 additional new cases per day per state for the week of July 4-10. This signifies an increase in the reproduction parameter in the contagion models and corroborates the hypothesis that economic reopening without applying best public health practices is associated with a resurgence of the pandemic. Conclusions: DPD models successfully correct for measurement error and data contamination and are useful to derive surveillance metrics. The opening of America involves two certainties: the country will be COVID-19-free only when there is an effective vaccine, and the "social" end of the pandemic will occur before the "medical" end. Therefore, improved surveillance metrics are needed to inform leaders of how to open sections of the United States more safely. DPD models can inform this reopening in combination with the extraction of COVID-19 data from existing websites. 
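The persistence figures in the abstract above can be checked with a few lines of arithmetic. The sketch below only reproduces the stated numbers (a base rate of 0.5188 and a shift of 0.2691); the function name `expected_cases` and the constant names are illustrative and do not come from the paper, and nothing here re-implements the Arellano-Bond estimation itself.

```python
# Illustrative arithmetic only: both rates are quoted from the abstract,
# while the names below are hypothetical.
BASE_PERSISTENCE = 0.5188  # 7-day persistence rate, week of June 27 to July 3
SHIFT = 0.2691             # reported increase for the week of July 4 to 10


def expected_cases(prior_week_cases: int, persistence: float) -> int:
    """New cases associated, 7 days later, with a given prior-week case count."""
    return round(prior_week_cases * persistence)


# The later week's rate is the base rate plus the reported shift.
updated_persistence = round(BASE_PERSISTENCE + SHIFT, 4)

print(expected_cases(10_000, BASE_PERSISTENCE))
print(expected_cases(10_000, updated_persistence))
```

Adding the shift to the base rate gives 0.7879, which is where the abstract's second "per 10,000" figure comes from.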
UR - http://www.jmir.org/2020/9/e20924/ UR - http://dx.doi.org/10.2196/20924 UR - http://www.ncbi.nlm.nih.gov/pubmed/32915762 ID - info:doi/10.2196/20924 ER - TY - JOUR AU - Krishnamurthy, Kamalanand AU - Ambikapathy, Bakiya AU - Kumar, Ashwani AU - Britto, De Lourduraj PY - 2020/9/18 TI - Prediction of the Transition From Subexponential to the Exponential Transmission of SARS-CoV-2 in Chennai, India: Epidemic Nowcasting JO - JMIR Public Health Surveill SP - e21152 VL - 6 IS - 3 KW - COVID-19 KW - epidemic KW - mathematical modeling KW - probabilistic models KW - public transport KW - exponential transmission N2 - Background: Several countries adopted lockdown to slow down the exponential transmission of the coronavirus disease (COVID-19) epidemic. Disease transmission models and the epidemic forecasts at the national level steer the policy to implement appropriate intervention strategies and budgeting. However, it is critical to design a data-driven reliable model for nowcasting for smaller populations, in particular metro cities. Objective: The aim of this study is to analyze the transition of the epidemic from subexponential to exponential transmission in the Chennai metro zone and to analyze the probability of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) secondary infections while availing the public transport systems in the city. Methods: A single geographical zone "Chennai-Metro-Merge" was constructed by combining Chennai District with three bordering districts. Subexponential and exponential models were developed to analyze and predict the progression of the COVID-19 epidemic. Probabilistic models were applied to assess the probability of secondary infections while availing public transport after the release of the lockdown. Results: The model predicted that transition from subexponential to exponential transmission occurs around the eighth week after the reporting of a cluster of cases. 
The probability of secondary infections with a single index case in an enclosure of the city bus, the suburban train general coach, and the ladies coach was found to be 0.192, 0.074, and 0.114, respectively. Conclusions: Nowcasting at the early stage of the epidemic predicts the probable time point of the exponential transmission and alerts the public health system. After the lockdown release, public transportation will be the major source of SARS-CoV-2 transmission in metro cities, and appropriate strategies based on nowcasting are needed. UR - https://publichealth.jmir.org/2020/3/e21152 UR - http://dx.doi.org/10.2196/21152 UR - http://www.ncbi.nlm.nih.gov/pubmed/32609621 ID - info:doi/10.2196/21152 ER - TY - JOUR AU - Churches, Timothy AU - Jorm, Louisa PY - 2020/9/18 TI - Flexible, Freely Available Stochastic Individual Contact Model for Exploring COVID-19 Intervention and Control Strategies: Development and Simulation JO - JMIR Public Health Surveill SP - e18965 VL - 6 IS - 3 KW - COVID-19 KW - epidemic curve KW - infection dynamics KW - public health interventions N2 - Background: Throughout March 2020, leaders in countries across the world were making crucial decisions about how and when to implement public health interventions to combat the coronavirus disease (COVID-19). They urgently needed tools to help them to explore what will work best in their specific circumstances of epidemic size and spread, and feasible intervention scenarios. Objective: We sought to rapidly develop a flexible, freely available simulation model for use by modelers and researchers to allow investigation of how various public health interventions implemented at various time points might change the shape of the COVID-19 epidemic curve. Methods: "COVOID" (COVID-19 Open-Source Infection Dynamics) is a stochastic individual contact model (ICM), which extends the ICMs provided by the open-source EpiModel package for the R statistical computing environment. 
To demonstrate its use and inform urgent decisions on March 30, 2020, we modeled similar intervention scenarios to those reported by other investigators using various model types, as well as novel scenarios. The scenarios involved isolation of cases, moderate social distancing, and stricter population "lockdowns" enacted over varying time periods in a hypothetical population of 100,000 people. On April 30, 2020, we simulated the epidemic curve for the three contiguous local areas (population 287,344) in eastern Sydney, Australia that recorded 5.3% of Australian cases of COVID-19 through to April 30, 2020, under five different intervention scenarios and compared the modeled predictions with the observed epidemic curve for these areas. Results: COVOID allocates each member of a population to one of seven compartments. The number of times individuals in the various compartments interact with each other and their probability of transmitting infection at each interaction can be varied to simulate the effects of interventions. Using COVOID on March 30, 2020, we were able to replicate the epidemic response patterns to specific social distancing intervention scenarios reported by others. The simulated curve for three local areas of Sydney from March 1 to April 30, 2020, was similar to the observed epidemic curve in terms of peak numbers of cases, total numbers of cases, and duration under a scenario representing the public health measures that were actually enacted, including case isolation and ramp-up of testing and social distancing measures. Conclusions: COVOID allows rapid modeling of many potential intervention scenarios, can be tailored to diverse settings, and requires only standard computing infrastructure. 
It replicates the epidemic curves produced by other models that require highly detailed population-level data, and its predicted epidemic curve, using parameters simulating the public health measures that were enacted, was similar in form to that actually observed in Sydney, Australia. Our team and collaborators are currently developing an extended open-source COVOID package comprising a suite of tools to explore intervention scenarios using several categories of models. UR - https://publichealth.jmir.org/2020/3/e18965 UR - http://dx.doi.org/10.2196/18965 UR - http://www.ncbi.nlm.nih.gov/pubmed/32568729 ID - info:doi/10.2196/18965 ER - TY - JOUR AU - Adnan, Mehnaz AU - Gao, Xiaoying AU - Bai, Xiaohan AU - Newbern, Elizabeth AU - Sherwood, Jill AU - Jones, Nicholas AU - Baker, Michael AU - Wood, Tim AU - Gao, Wei PY - 2020/9/17 TI - Potential Early Identification of a Large Campylobacter Outbreak Using Alternative Surveillance Data Sources: Autoregressive Modelling and Spatiotemporal Clustering JO - JMIR Public Health Surveill SP - e18281 VL - 6 IS - 3 KW - Campylobacter KW - disease outbreaks KW - forecasting KW - spatio-temporal analysis N2 - Background: Over one-third of the population of Havelock North, New Zealand, approximately 5500 people, were estimated to have been affected by campylobacteriosis in a large waterborne outbreak. Cases reported through the notifiable disease surveillance system (notified case reports) are inevitably delayed by several days, resulting in slowed outbreak recognition and delayed control measures. Early outbreak detection and magnitude prediction are critical to outbreak control. It is therefore important to consider alternative surveillance data sources and evaluate their potential for recognizing outbreaks at the earliest possible time. 
Objective: The first objective of this study is to compare and validate the selection of alternative data sources (general practice consultations, consumer helpline, Google Trends, Twitter microblogs, and school absenteeism) for their temporal predictive strength for Campylobacter cases during the Havelock North outbreak. The second objective is to examine spatiotemporal clustering of data from alternative sources to assess the size and geographic extent of the outbreak and to support efforts to attribute its source. Methods: We combined measures derived from alternative data sources during the 2016 Havelock North campylobacteriosis outbreak with notified case report counts to predict suspected daily Campylobacter case counts up to 5 days before cases reported in the disease surveillance system. Spatiotemporal clustering of the data was analyzed using Local Moran's I statistics to investigate the extent of the outbreak in both space and time within the affected area. Results: Models that combined consumer helpline data with autoregressive notified case counts had the best out-of-sample predictive accuracy for 1 and 2 days ahead of notified case reports. Models using Google Trends and Twitter typically performed the best 3 and 4 days before case notifications. Spatiotemporal clusters showed spikes in school absenteeism and consumer helpline inquiries that preceded the notified cases in the city primarily affected by the outbreak. Conclusions: Alternative data sources can provide earlier indications of a large gastroenteritis outbreak compared with conventional case notifications. Spatiotemporal analysis can assist in refining the geographical focus of an outbreak and can potentially support public health source attribution efforts. Further work is required to assess the location of such surveillance data sources and methods in routine public health practice. 
UR - http://publichealth.jmir.org/2020/3/e18281/ UR - http://dx.doi.org/10.2196/18281 UR - http://www.ncbi.nlm.nih.gov/pubmed/32940617 ID - info:doi/10.2196/18281 ER - TY - JOUR AU - Turicchi, Jake AU - O'Driscoll, Ruairi AU - Finlayson, Graham AU - Duarte, Cristiana AU - Palmeira, L. A. AU - Larsen, C. Sofus AU - Heitmann, L. Berit AU - Stubbs, James R. PY - 2020/9/11 TI - Data Imputation and Body Weight Variability Calculation Using Linear and Nonlinear Methods in Data Collected From Digital Smart Scales: Simulation and Validation Study JO - JMIR Mhealth Uhealth SP - e17977 VL - 8 IS - 9 KW - weight variability KW - weight fluctuation KW - weight cycling KW - weight instability KW - imputation KW - validation KW - digital tracking KW - smart scales KW - body weight KW - energy balance N2 - Background: Body weight variability (BWV) is common in the general population and may act as a risk factor for obesity or diseases. The correct identification of these patterns may have prognostic or predictive value in clinical and research settings. With advancements in technology allowing for the frequent collection of body weight data from electronic smart scales, new opportunities to analyze and identify patterns in body weight data are available. Objective: This study aims to compare multiple methods of data imputation and BWV calculation using linear and nonlinear approaches. Methods: In total, 50 participants from an ongoing weight loss maintenance study (the NoHoW study) were selected to develop the procedure. We addressed the following aspects of data analysis: cleaning, imputation, detrending, and calculation of total and local BWV. To test imputation, missing data were simulated at random and using real patterns of missingness. A total of 10 imputation strategies were tested. Next, BWV was calculated using linear and nonlinear approaches, and the effects of missing data and data imputation on these estimates were investigated. 
Results: Body weight imputation using structural modeling with Kalman smoothing or an exponentially weighted moving average provided the best agreement with observed values (root mean square error range 0.62%-0.64%). Imputation performance decreased with missingness and was similar between random and nonrandom simulations. Errors in BWV estimations from missing simulated data sets were low (2%-7% with 80% missing data or a mean of 67, SD 40.1 available body weights) compared with that of imputation strategies where errors were significantly greater, varying by imputation method. Conclusions: The decision to impute body weight data depends on the purpose of the analysis. Directions for the best performing imputation methods are provided. For the purpose of estimating BWV, data imputation should not be conducted. Linear and nonlinear methods of estimating BWV provide reasonably accurate estimates under high proportions (80%) of missing data. UR - http://mhealth.jmir.org/2020/9/e17977/ UR - http://dx.doi.org/10.2196/17977 UR - http://www.ncbi.nlm.nih.gov/pubmed/32915155 ID - info:doi/10.2196/17977 ER - TY - JOUR AU - Mehta, Mihir AU - Julaiti, Juxihong AU - Griffin, Paul AU - Kumara, Soundar PY - 2020/9/11 TI - Early Stage Machine Learning-Based Prediction of US County Vulnerability to the COVID-19 Pandemic: Machine Learning Approach JO - JMIR Public Health Surveill SP - e19446 VL - 6 IS - 3 KW - COVID-19 KW - coronavirus KW - prediction model KW - county-level vulnerability KW - machine learning KW - XGBoost N2 - Background: The rapid spread of COVID-19 means that government and health services providers have little time to plan and design effective response policies. It is therefore important to quickly provide accurate predictions of how vulnerable geographic regions such as counties are to the spread of this virus. 
Objective: The aim of this study is to develop county-level prediction around near future disease movement for COVID-19 occurrences using publicly available data. Methods: We estimated county-level COVID-19 occurrences for the period March 14 to 31, 2020, based on data fused from multiple publicly available sources inclusive of health statistics, demographics, and geographical features. We developed a three-stage model using XGBoost, a machine learning algorithm, to quantify the probability of COVID-19 occurrence and estimate the number of potential occurrences for unaffected counties. Finally, these results were combined to predict the county-level risk. This risk was then used as an estimated after-five-day-vulnerability of the county. Results: The model predictions showed a sensitivity over 71% and specificity over 94% for models built using data from March 14 to 31, 2020. We found that population, population density, percentage of people aged >70 years, and prevalence of comorbidities play an important role in predicting COVID-19 occurrences. We observed a positive association at the county level between urbanicity and vulnerability to COVID-19. Conclusions: The developed model can be used for identification of vulnerable counties and potential data discrepancies. Limited testing facilities and delayed results introduce significant variation in reported cases, which produces a bias in the model. 
UR - http://publichealth.jmir.org/2020/3/e19446/ UR - http://dx.doi.org/10.2196/19446 UR - http://www.ncbi.nlm.nih.gov/pubmed/32784193 ID - info:doi/10.2196/19446 ER - TY - JOUR AU - Jung, Young Se AU - Jo, Hyeontae AU - Son, Hwijae AU - Hwang, Ju Hyung PY - 2020/9/9 TI - Real-World Implications of a Rapidly Responsive COVID-19 Spread Model with Time-Dependent Parameters via Deep Learning: Model Development and Validation JO - J Med Internet Res SP - e19907 VL - 22 IS - 9 KW - epidemic models KW - SIR models KW - time-dependent parameters KW - neural networks KW - deep learning KW - COVID-19 KW - modeling KW - spread KW - outbreak N2 - Background: The COVID-19 pandemic has caused major disruptions worldwide since March 2020. The experience of the 1918 influenza pandemic demonstrated that decreases in the infection rates of COVID-19 do not guarantee continuity of the trend. Objective: The aim of this study was to develop a precise spread model of COVID-19 with time-dependent parameters via deep learning to respond promptly to the dynamic situation of the outbreak and proactively minimize damage. Methods: In this study, we investigated a mathematical model with time-dependent parameters via deep learning based on forward-inverse problems. We used data from the Korea Centers for Disease Control and Prevention (KCDC) and the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University for Korea and the other countries, respectively. Because the data consist of confirmed, recovered, and deceased cases, we selected the susceptible-infected-recovered (SIR) model and found approximated solutions as well as model parameters. Specifically, we applied fully connected neural networks to the solutions and parameters and designed suitable loss functions. Results: We developed an entirely new SIR model with time-dependent parameters via deep learning methods. 
Furthermore, we validated the model with the conventional Runge-Kutta fourth order model to confirm its convergent nature. In addition, we evaluated our model based on the real-world situation reported from the KCDC, the Korean government, and news media. We also crossvalidated our model using data from the CSSE for Italy, Sweden, and the United States. Conclusions: The methodology and new model of this study could be employed for short-term prediction of COVID-19, which could help the government prepare for a new outbreak. In addition, from the perspective of measuring medical resources, our model has powerful strength because it assumes all the parameters as time-dependent, which reflects the exact status of viral spread. UR - http://www.jmir.org/2020/9/e19907/ UR - http://dx.doi.org/10.2196/19907 UR - http://www.ncbi.nlm.nih.gov/pubmed/32877350 ID - info:doi/10.2196/19907 ER - TY - JOUR AU - Sambaturu, Prathyush AU - Bhattacharya, Parantapa AU - Chen, Jiangzhuo AU - Lewis, Bryan AU - Marathe, Madhav AU - Venkatramanan, Srinivasan AU - Vullikanti, Anil PY - 2020/9/4 TI - An Automated Approach for Finding Spatio-Temporal Patterns of Seasonal Influenza in the United States: Algorithm Validation Study JO - JMIR Public Health Surveill SP - e12842 VL - 6 IS - 3 KW - epidemic data analysis KW - summarization KW - spatio-temporal patterns KW - transactional data mining N2 - Background: Agencies such as the Centers for Disease Control and Prevention (CDC) currently release influenza-like illness incidence data, along with descriptive summaries of simple spatio-temporal patterns and trends. However, public health researchers, government agencies, as well as the general public, are often interested in deeper patterns and insights into how the disease is spreading, with additional context. Analysis by domain experts is needed for deriving such insights from incidence data. 
Objective: Our goal was to develop an automated approach for finding interesting spatio-temporal patterns in the spread of a disease over a large region, such as regions which have specific characteristics (eg, high incidence in a particular week, those which showed a sudden change in incidence) or regions which have significantly different incidence compared to earlier seasons. Methods: We developed techniques from the area of transactional data mining for characterizing and finding interesting spatio-temporal patterns in disease spread in an automated manner. A key part of our approach involved using the principle of minimum description length for representing a given target set in terms of combinations of attributes (referred to as clauses); we considered both positive and negative clauses, relaxed descriptions which approximately represent the set, and used integer programming to find such descriptions. Finally, we designed an automated approach, which examines a large space of sets corresponding to different spatio-temporal patterns, and ranks them based on the ratio of their size to their description length (referred to as the compression ratio). Results: We applied our methods using minimum description length to find spatio-temporal patterns in the spread of seasonal influenza in the United States using state level influenza-like illness activity indicator data from the CDC. We observed that the compression ratios were over 2.5 for 50% of the chosen sets, when approximate descriptions and negative clauses were allowed. Sets with high compression ratios (eg, over 2.5) corresponded to interesting patterns in the spatio-temporal dynamics of influenza-like illness. Our approach also outperformed description by solution in terms of the compression ratio. Conclusions: Our approach, which is an unsupervised machine learning method, can provide new insights into patterns and trends in the disease spread in an automated manner. 
Our results show that the description complexity is an effective approach for characterizing sets of interest, which can be easily extended to other diseases and regions beyond influenza in the US. Our approach can also be easily adapted for automated generation of narratives. UR - http://publichealth.jmir.org/2020/3/e12842/ UR - http://dx.doi.org/10.2196/12842 UR - http://www.ncbi.nlm.nih.gov/pubmed/32701458 ID - info:doi/10.2196/12842 ER - TY - JOUR AU - Bendtsen, Marcus PY - 2020/8/27 TI - The P Value Line Dance: When Does the Music Stop? JO - J Med Internet Res SP - e21345 VL - 22 IS - 8 KW - sample size KW - randomized controlled trial KW - Bayesian analysis KW - P value KW - dichotomization KW - dichotomy KW - error KW - uncertainty UR - http://www.jmir.org/2020/8/e21345/ UR - http://dx.doi.org/10.2196/21345 UR - http://www.ncbi.nlm.nih.gov/pubmed/32852275 ID - info:doi/10.2196/21345 ER - TY - JOUR AU - Abdulaal, Ahmed AU - Patel, Aatish AU - Charani, Esmita AU - Denny, Sarah AU - Mughal, Nabeela AU - Moore, Luke PY - 2020/8/25 TI - Prognostic Modeling of COVID-19 Using Artificial Intelligence in the United Kingdom: Model Development and Validation JO - J Med Internet Res SP - e20259 VL - 22 IS - 8 KW - COVID-19 KW - coronavirus KW - machine learning KW - deep learning KW - modeling KW - artificial intelligence KW - neural network KW - prediction N2 - Background: The current severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak is a public health emergency and the case fatality rate in the United Kingdom is significant. Although there appear to be several early predictors of outcome, there are no currently validated prognostic models or scoring systems applicable specifically to patients with confirmed SARS-CoV-2. Objective: We aim to create a point-of-admission mortality risk scoring system using an artificial neural network (ANN). 
Methods: We present an ANN that can provide a patient-specific, point-of-admission mortality risk prediction to inform clinical management decisions at the earliest opportunity. The ANN analyzes a set of patient features including demographics, comorbidities, smoking history, and presenting symptoms and predicts patient-specific mortality risk during the current hospital admission. The model was trained and validated on data extracted from 398 patients admitted to hospital with a positive real-time reverse transcription polymerase chain reaction (RT-PCR) test for SARS-CoV-2. Results: Patient-specific mortality was predicted with 86.25% accuracy, with a sensitivity of 87.50% (95% CI 61.65%-98.45%) and specificity of 85.94% (95% CI 74.98%-93.36%). The positive predictive value was 60.87% (95% CI 45.23%-74.56%), and the negative predictive value was 96.49% (95% CI 88.23%-99.02%). The area under the receiver operating characteristic curve was 90.12%. Conclusions: This analysis demonstrates an adaptive ANN trained on data at a single site, which demonstrates the early utility of deep learning approaches in a rapidly evolving pandemic with no established or validated prognostic scoring systems. 
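The performance measures quoted in the abstract above all derive from a single confusion matrix. As a hedged illustration, the sketch below uses hypothetical counts (14 true positives, 2 false negatives, 55 true negatives, 9 false positives, ie, an assumed 80-patient validation set) chosen only because they are consistent with the reported percentages; the abstract does not state these counts, and the class and method names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary classification counts; positive = patient died during admission."""
    tp: int  # predicted death, died
    fn: int  # predicted survival, died
    tn: int  # predicted survival, survived
    fp: int  # predicted death, survived

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Positive predictive value: share of predicted deaths that occurred.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Negative predictive value: share of predicted survivals that occurred.
        return self.tn / (self.tn + self.fn)

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)


# ASSUMPTION: hypothetical counts, not taken from the paper.
cm = ConfusionMatrix(tp=14, fn=2, tn=55, fp=9)

for name in ("accuracy", "sensitivity", "specificity", "ppv", "npv"):
    print(f"{name}: {getattr(cm, name)():.2%}")
```

With few deaths in the set, the positive predictive value sits well below the sensitivity, which is why the abstract reports both.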
UR - http://www.jmir.org/2020/8/e20259/ UR - http://dx.doi.org/10.2196/20259 UR - http://www.ncbi.nlm.nih.gov/pubmed/32735549 ID - info:doi/10.2196/20259 ER - TY - JOUR AU - Liu, Dianbo AU - Clemente, Leonardo AU - Poirier, Canelle AU - Ding, Xiyu AU - Chinazzi, Matteo AU - Davis, Jessica AU - Vespignani, Alessandro AU - Santillana, Mauricio PY - 2020/8/17 TI - Real-Time Forecasting of the COVID-19 Outbreak in Chinese Provinces: Machine Learning Approach Using Novel Digital Data and Estimates From Mechanistic Models JO - J Med Internet Res SP - e20285 VL - 22 IS - 8 KW - COVID-19 KW - coronavirus KW - digital epidemiology KW - modeling KW - modeling disease outbreaks KW - emerging outbreak KW - machine learning KW - precision public health KW - machine learning in public health KW - forecasting KW - digital data KW - mechanistic model KW - hybrid simulation KW - hybrid model KW - simulation N2 - Background: The inherent difficulty of identifying and monitoring emerging outbreaks caused by novel pathogens can lead to their rapid spread; and if left unchecked, they may become major public health threats to the planet. The ongoing coronavirus disease (COVID-19) outbreak, which has infected over 2,300,000 individuals and caused over 150,000 deaths, is an example of one of these catastrophic events. Objective: We present a timely and novel methodology that combines disease estimates from mechanistic models and digital traces, via interpretable machine learning methodologies, to reliably forecast COVID-19 activity in Chinese provinces in real time. Methods: Our method uses the following as inputs: (a) official health reports, (b) COVID-19-related internet search activity, (c) news media activity, and (d) daily forecasts of COVID-19 activity from a metapopulation mechanistic model. 
Our machine learning methodology uses a clustering technique that enables the exploitation of geospatial synchronicities of COVID-19 activity across Chinese provinces and a data augmentation technique to deal with the small number of historical disease observations characteristic of emerging outbreaks. Results: Our model is able to produce stable and accurate forecasts 2 days ahead of the current time and outperforms a collection of baseline models in 27 out of 32 Chinese provinces. Conclusions: Our methodology could be easily extended to other geographies currently affected by COVID-19 to aid decision makers with monitoring and possibly prevention. UR - http://www.jmir.org/2020/8/e20285/ UR - http://dx.doi.org/10.2196/20285 UR - http://www.ncbi.nlm.nih.gov/pubmed/32730217 ID - info:doi/10.2196/20285 ER - TY - JOUR AU - Warin, Thierry PY - 2020/8/11 TI - Global Research on Coronaviruses: An R Package JO - J Med Internet Res SP - e19615 VL - 22 IS - 8 KW - COVID-19 KW - SARS-CoV-2 KW - coronavirus KW - R package KW - bibliometric KW - virus KW - infectious disease KW - reference KW - informatics N2 - Background: In these trying times, we developed an R package about bibliographic references on coronaviruses. Working with reproducible research principles based on open science, disseminating scientific information, providing easy access to scientific production on this particular issue, and offering a rapid integration in researchers' workflows may help save time in this race against the virus, notably in terms of public health. Objective: The goal is to simplify the workflow of interested researchers, with multidisciplinary research in mind. With more than 60,500 medical bibliographic references at the time of publication, this package is among the largest about coronaviruses. Methods: This package could be of interest to epidemiologists, researchers in scientometrics, biostatisticians, as well as data scientists broadly defined. 
This package collects references from PubMed and organizes the data in a data frame. We then built functions to sort through this collection of references. Researchers can also integrate the data into their pipeline and implement them in R within their code libraries. Results: We provide a short use case in this paper based on a bibliometric analysis of the references made available by this package. Classification techniques can also be used to go through the large volume of references and allow researchers to save time on this part of their research. Network analysis can be used to filter the data set. Text mining techniques can also help researchers calculate similarity indices and help them focus on the parts of the literature that are relevant for their research. Conclusions: This package aims at accelerating research on coronaviruses. Epidemiologists can integrate this package into their workflow. It is also possible to add a machine learning layer on top of this package to model the latest advances in research about coronaviruses, as we update this package daily. It is also the only one of this size, to the best of our knowledge, to be built in the R language. 
UR - http://www.jmir.org/2020/8/e19615/ UR - http://dx.doi.org/10.2196/19615 UR - http://www.ncbi.nlm.nih.gov/pubmed/32730218 ID - info:doi/10.2196/19615 ER - TY - JOUR AU - Zhan, Choujun AU - Tse, Kong Chi AU - Lai, Zhikang AU - Chen, Xiaoyun AU - Mo, Mingshen PY - 2020/7/3 TI - General Model for COVID-19 Spreading With Consideration of Intercity Migration, Insufficient Testing, and Active Intervention: Modeling Study of Pandemic Progression in Japan and the United States JO - JMIR Public Health Surveill SP - e18880 VL - 6 IS - 3 KW - pandemic spreading KW - SEICR model KW - COVID-19 KW - prediction KW - effect of intervention N2 - Background: The coronavirus disease (COVID-19) began to spread in mid-December 2019 from Wuhan, China, to most provinces in China and over 200 other countries through an active travel network. Limited by the ability of the country or city to perform tests, the officially reported number of confirmed cases is expected to be much smaller than the true number of infected cases. Objective: This study aims to develop a new susceptible-exposed-infected-confirmed-removed (SEICR) model for predicting the spreading progression of COVID-19 with consideration of intercity travel and the difference between the number of confirmed cases and actual infected cases, and to apply the model to provide a realistic prediction for the United States and Japan under different scenarios of active intervention. Methods: The model introduces a new state variable corresponding to the actual number of infected cases, integrates intercity travel data to track the movement of exposed and infected individuals among cities, and allows different levels of active intervention to be considered so that a realistic prediction of the number of infected individuals can be performed. Moreover, the model generates future progression profiles for different levels of intervention by setting the parameters relative to the values found from the data fitting. 
Results: By fitting the model with the data of the COVID-19 infection cases and the intercity travel data for Japan (January 15 to March 20, 2020) and the United States (February 20 to March 20, 2020), model parameters were found and then used to predict the pandemic progression in 47 regions of Japan and 50 states (plus a federal district) in the United States. The model revealed that, as of March 19, 2020, the number of infected individuals in Japan and the United States could be 20-fold and 5-fold as many as the number of confirmed cases, respectively. The results showed that, without tightening the implementation of active intervention, Japan and the United States will see about 6.55% and 18.2% of the population eventually infected, respectively, and with a drastic 10-fold elevated active intervention, the number of people eventually infected can be reduced by up to 95% in Japan and 70% in the United States. Conclusions: The new SEICR model has revealed the effectiveness of active intervention for controlling the spread of COVID-19. Stepping up active intervention would be more effective for Japan, and raising the level of public vigilance in maintaining personal hygiene and social distancing is comparatively more important for the United States. UR - https://publichealth.jmir.org/2020/3/e18880 UR - http://dx.doi.org/10.2196/18880 UR - http://www.ncbi.nlm.nih.gov/pubmed/32589145 ID - info:doi/10.2196/18880 ER - TY - JOUR AU - Turk, J. Philip AU - Chou, Shih-Hsiung AU - Kowalkowski, A. Marc AU - Palmer, P. Pooja AU - Priem, S. Jennifer AU - Spencer, D. Melanie AU - Taylor, J. Yhenneko AU - McWilliams, D. 
Andrew PY - 2020/6/19 TI - Modeling COVID-19 Latent Prevalence to Assess a Public Health Intervention at a State and Regional Scale: Retrospective Cohort Study JO - JMIR Public Health Surveill SP - e19353 VL - 6 IS - 2 KW - COVID-19 KW - public health surveillance KW - novel coronavirus 2019 KW - pandemic KW - forecasting KW - SIR model KW - detection probability KW - latent prevalence N2 - Background: Emergence of the coronavirus disease (COVID-19) caught the world off guard and unprepared, initiating a global pandemic. In the absence of evidence, individual communities had to take timely action to reduce the rate of disease spread and avoid overburdening their health care systems. Although a few predictive models have been published to guide these decisions, most have not taken into account spatial differences and have included assumptions that do not match the local realities. Access to reliable information that is adapted to local context is critical for policy makers to make informed decisions during a rapidly evolving pandemic. Objective: The goal of this study was to develop an adapted susceptible-infected-removed (SIR) model to predict the trajectory of the COVID-19 pandemic in North Carolina and the Charlotte Metropolitan Region, and to incorporate the effect of a public health intervention to reduce disease spread while accounting for unique regional features and imperfect detection. Methods: Three SIR models were fit to infection prevalence data from North Carolina and the greater Charlotte Region and then rigorously compared. One of these models (SIR-int) accounted for a stay-at-home intervention and imperfect detection of COVID-19 cases. We computed longitudinal total estimates of the susceptible, infected, and removed compartments of both populations, along with other pandemic characteristics such as the basic reproduction number. 
Results: Prior to March 26, disease spread was rapid at the pandemic onset with the Charlotte Region doubling time of 2.56 days (95% CI 2.11-3.25) and in North Carolina 2.94 days (95% CI 2.33-4.00). Subsequently, disease spread significantly slowed with doubling times increased in the Charlotte Region to 4.70 days (95% CI 3.77-6.22) and in North Carolina to 4.01 days (95% CI 3.43-4.83). Reflecting spatial differences, this deceleration favored the greater Charlotte Region compared to North Carolina as a whole. A comparison of the efficacy of intervention, defined as 1 − the hazard ratio of infection, gave 0.25 for North Carolina and 0.43 for the Charlotte Region. In addition, early in the pandemic, the initial basic SIR model had good fit to the data; however, as the pandemic and local conditions evolved, the SIR-int model emerged as the model with better fit. Conclusions: Using local data and continuous attention to model adaptation, our findings have enabled policy makers, public health officials, and health systems to proactively plan capacity and evaluate the impact of a public health intervention. Our SIR-int model for estimated latent prevalence was reasonably flexible, highly accurate, and demonstrated efficacy of a stay-at-home order at both the state and regional level. Our results highlight the importance of incorporating local context into pandemic forecast modeling, as well as the need to remain vigilant and informed by the data as we enter into a critical period of the outbreak. UR - http://publichealth.jmir.org/2020/2/e19353/ UR - http://dx.doi.org/10.2196/19353 UR - http://www.ncbi.nlm.nih.gov/pubmed/32427104 ID - info:doi/10.2196/19353 ER - TY - JOUR AU - Chabata, T. Sungai AU - Fearon, Elizabeth AU - Webb, L. Emily AU - Weiss, A. Helen AU - Hargreaves, R. James AU - Cowan, M. 
Frances PY - 2020/6/15 TI - Assessing Bias in Population Size Estimates Among Hidden Populations When Using the Service Multiplier Method Combined With Respondent-Driven Sampling Surveys: Survey Study JO - JMIR Public Health Surveill SP - e15044 VL - 6 IS - 2 KW - service multiplier method KW - respondent-driven sampling KW - population size estimation KW - female sex workers KW - key populations KW - HIV KW - Zimbabwe N2 - Background: Population size estimates (PSEs) for hidden populations at increased risk of HIV, including female sex workers (FSWs), are important to inform public health policy and resource allocation. The service multiplier method (SMM) is commonly used to estimate the sizes of hidden populations. We used this method to obtain PSEs for FSWs at 9 sites in Zimbabwe and explored methods for assessing potential biases that could arise in using this approach. Objective: This study aimed to guide the assessment of biases that arise when estimating the population sizes of hidden populations using the SMM combined with respondent-driven sampling (RDS) surveys. Methods: We conducted RDS surveys at 9 sites in late 2013, where the Sisters with a Voice program (the program), which collects program visit data of FSWs, was also present. Using the SMM, we obtained PSEs for FSWs at each site by dividing the number of FSWs who attended the program, based on program records, by the RDS-II weighted proportion of FSWs who reported attending this program in the previous 6 months in the RDS surveys. Both the RDS weighting and SMM make a number of assumptions, potentially leading to biases if the assumptions are not met. 
To test these assumptions, we used convergence and bottleneck plots to assess seed dependence of RDS-II proportion estimates, chi-square tests to assess if there was an association between the characteristics of FSWs and their knowledge of program existence, and logistic regression to compare the characteristics of FSWs attending the program with those recruited to RDS surveys. Results: The PSEs ranged from 194 (95% CI 62-325) to 805 (95% CI 456-1142) across 9 sites from May to November 2013. The 95% CIs for the majority of sites were wide. In some sites, the RDS-II proportion of women who reported program use in the RDS surveys may have been influenced by the characteristics of selected seeds, and we also observed bottlenecks in some sites. There was no evidence of association between characteristics of FSWs and knowledge of program existence, and in the majority of sites, there was no evidence that the characteristics of the populations differed between RDS and program data. Conclusions: We used a series of rigorous methods to explore potential biases in our PSEs. We were able to identify the biases and their potential direction, but we could not determine the ultimate direction of these biases in our PSEs. We have evidence that the PSEs in most sites may be biased and a suggestion that the bias is toward underestimation, and this should be considered if the PSEs are to be used. These tests for bias should be included when undertaking population size estimation using the SMM combined with RDS surveys. 
UR - http://publichealth.jmir.org/2020/2/e15044/ UR - http://dx.doi.org/10.2196/15044 UR - http://www.ncbi.nlm.nih.gov/pubmed/32459645 ID - info:doi/10.2196/15044 ER - TY - JOUR AU - Her, Qoua AU - Malenfant, Jessica AU - Zhang, Zilu AU - Vilk, Yury AU - Young, Jessica AU - Tabano, David AU - Hamilton, Jack AU - Johnson, Ron AU - Raebel, Marsha AU - Boudreau, Denise AU - Toh, Sengwee PY - 2020/6/4 TI - Distributed Regression Analysis Application in Large Distributed Data Networks: Analysis of Precision and Operational Performance JO - JMIR Med Inform SP - e15073 VL - 8 IS - 6 KW - distributed regression analysis KW - distributed data networks KW - privacy-protecting analytics KW - pharmacoepidemiology KW - PopMedNet N2 - Background: A distributed data network approach combined with distributed regression analysis (DRA) can reduce the risk of disclosing sensitive individual and institutional information in multicenter studies. However, software that facilitates large-scale and efficient implementation of DRA is limited. Objective: This study aimed to assess the precision and operational performance of a DRA application comprising a SAS-based DRA package and a file transfer workflow developed within the open-source distributed networking software PopMedNet in a horizontally partitioned distributed data network. Methods: We executed the SAS-based DRA package to perform distributed linear, logistic, and Cox proportional hazards regression analysis on a real-world test case with 3 data partners. We used PopMedNet to iteratively and automatically transfer highly summarized information between the data partners and the analysis center. We compared the DRA results with the results from standard SAS procedures executed on the pooled individual-level dataset to evaluate the precision of the SAS-based DRA package. We computed the execution time of each step in the workflow to evaluate the operational performance of the PopMedNet-driven file transfer workflow. 
Results: All DRA results were precise (<10^-12), and DRA model fit curves were identical or similar to those obtained from the corresponding pooled individual-level data analyses. All regression models required less than 20 min for full end-to-end execution. Conclusions: We integrated a SAS-based DRA package with PopMedNet and successfully tested the new capability within an active distributed data network. The study demonstrated the validity and feasibility of using DRA to enable more privacy-protecting analysis in multicenter studies. UR - https://medinform.jmir.org/2020/6/e15073 UR - http://dx.doi.org/10.2196/15073 UR - http://www.ncbi.nlm.nih.gov/pubmed/32496200 ID - info:doi/10.2196/15073 ER - TY - JOUR AU - Huang, Yihao AU - Li, Mingtao PY - 2020/5/27 TI - Optimization of Precontrol Methods and Analysis of a Dynamic Model for Brucellosis: Model Development and Validation JO - JMIR Med Inform SP - e18664 VL - 8 IS - 5 KW - brucellosis KW - dynamic model KW - protective measures KW - precontrol methods N2 - Background: Brucella is a gram-negative, nonmotile bacterium without a capsule. The infection scope of Brucella is wide. The major source of infection is mammals such as cattle, sheep, goats, pigs, and dogs. Currently, human beings do not transmit Brucella to each other. When humans eat Brucella-contaminated food or contact animals or animal secretions and excretions infected with Brucella, they may develop brucellosis. Although brucellosis does not originate in humans, its diagnosis and cure are very difficult; thus, it has a huge impact on humans. Even with the rapid development of medical science, brucellosis is still a major problem for Chinese people. Currently, the number of patients with brucellosis in China is 100,000 per year. In addition, due to the ongoing improvement in the living standards of Chinese people, the demand for meat products has gradually increased, and increased meat transactions have greatly promoted the spread of brucellosis. 
Therefore, many researchers are concerned with investigating the transmission of Brucella as well as the diagnosis and treatment of brucellosis. Mathematical models have become an important tool for the study of infectious diseases. Mathematical models can reflect the spread of infectious diseases and be used to study the effect of different inhibition methods on infectious diseases. The effect of control measures to obtain effective suppression can provide theoretical support for the suppression of infectious diseases. Therefore, it is the objective of this study to build a suitable mathematical model for brucellosis infection. Objective: We aimed to study the optimized precontrol methods of brucellosis using a dynamic threshold-based microcomputer model and to provide critical theoretical support for the prevention and control of brucellosis. Methods: By studying the transmission characteristics of Brucella and building a Brucella transmission model, the precontrol methods were designed and presented to the key populations (Brucella-susceptible populations). We investigated the utilization of protective tools by the key populations before and after precontrol methods. Results: An improvement in the amount of glove-wearing was evident and significant (P<.001), increasing from 51.01% before the precontrol methods to 66.22% after the precontrol methods, an increase of 15.21%. However, the amount of hat-wearing did not improve significantly (P=.95). Hat-wearing among the key populations increased from 57.3% before the precontrol methods to 58.6% after the precontrol methods, an increase of 1.3%. Conclusions: By demonstrating the optimized precontrol methods for a brucellosis model built on a dynamic threshold-based microcomputer model, this study provides theoretical support for the suppression of Brucella and the improved usage of protective measures by key populations. 
UR - https://medinform.jmir.org/2020/5/e18664 UR - http://dx.doi.org/10.2196/18664 UR - http://www.ncbi.nlm.nih.gov/pubmed/32459180 ID - info:doi/10.2196/18664 ER - TY - JOUR AU - Huang, Yihao AU - Li, Mingtao PY - 2020/5/27 TI - Application of a Mathematical Model in Determining the Spread of the Rabies Virus: Simulation Study JO - JMIR Med Inform SP - e18627 VL - 8 IS - 5 KW - rabies KW - computer model KW - suppression measures KW - basic reproductive number N2 - Background: Rabies is an acute infectious disease of the central nervous system caused by the rabies virus. The mortality rate of rabies is almost 100%. For some countries with poor sanitation, the spread of rabies among dogs is very serious. Objective: The objective of this paper was to study the ecological transmission mode of rabies to make theoretical contributions to the suppression of rabies in China. Methods: A mathematical model of the transmission mode of rabies was constructed using relevant data from the literature and officially published figures in China. Using this model, we fitted the data of the number of patients with rabies and predicted the future number of patients with rabies. In addition, we studied the effectiveness of different rabies suppression measures. Results: The results of the study indicated that the number of people infected with rabies will rise in the first stage, and then decrease. The model forecasted that in about 10 years, the number of rabies cases will be controlled within a relatively stable range. According to the prediction results of the model reported in this paper, the number of rabies cases will eventually plateau at approximately 500 people every year. Relatively effective rabies suppression measures include controlling the birth rate of domestic and wild dogs as well as increasing the level of rabies immunity in domestic dogs. Conclusions: The basic reproductive number of rabies in China is still greater than 1. 
That is, China currently has insufficient measures to control rabies. The research on the transmission mode of rabies and control measures in this paper can provide theoretical support for rabies control in China. UR - http://medinform.jmir.org/2020/5/e18627/ UR - http://dx.doi.org/10.2196/18627 UR - http://www.ncbi.nlm.nih.gov/pubmed/32459185 ID - info:doi/10.2196/18627 ER - TY - JOUR AU - Yeng, Kandabongee Prosper AU - Woldaregay, Zebene Ashenafi AU - Solvoll, Terje AU - Hartvigsen, Gunnar PY - 2020/5/26 TI - Cluster Detection Mechanisms for Syndromic Surveillance Systems: Systematic Review and Framework Development JO - JMIR Public Health Surveill SP - e11512 VL - 6 IS - 2 KW - sentinel surveillance KW - space-time clustering KW - aberration detection N2 - Background: The time lag in detecting disease outbreaks remains a threat to global health security. The advancement of technology has made health-related data and other indicator activities easily accessible for syndromic surveillance of various datasets. At the heart of disease surveillance lies the clustering algorithm, which groups data with similar characteristics (spatial, temporal, or both) to uncover significant disease outbreak. Despite these developments, there is a lack of updated reviews of trends and modelling options in cluster detection algorithms. Objective: Our purpose was to systematically review practically implemented disease surveillance clustering algorithms relating to temporal, spatial, and spatiotemporal clustering mechanisms for their usage and performance efficacies, and to develop an efficient cluster detection mechanism framework. Methods: We conducted a systematic review exploring Google Scholar, ScienceDirect, PubMed, IEEE Xplore, ACM Digital Library, and Scopus. Between January and March 2018, we conducted the literature search for articles published to date in English in peer-reviewed journals. 
The main eligibility criteria were studies that (1) examined a practically implemented syndromic surveillance system with cluster detection mechanisms, including over-the-counter medication, school and work absenteeism, and disease surveillance relating to the presymptomatic stage; and (2) focused on surveillance of infectious diseases. We identified relevant articles using the title, keywords, and abstracts as a preliminary filter with the inclusion criteria, and then conducted a full-text review of the relevant articles. We then developed a framework for cluster detection mechanisms for various syndromic surveillance systems based on the review. Results: The search identified a total of 5936 articles. Removal of duplicates resulted in 5839 articles. After an initial review of the titles, we excluded 4165 articles, with 1674 remaining. Reading of abstracts and keywords eliminated 1549 further records. An in-depth assessment of the remaining 125 articles resulted in a total of 27 articles for inclusion in the review. The result indicated that various clustering and aberration detection algorithms have been empirically implemented or assessed with real data and tested. Based on the findings of the review, we subsequently developed a framework to include data processing, clustering and aberration detection, visualization, and alerts and alarms. Conclusions: The review identified various algorithms that have been practically implemented and tested. These results might foster the development of effective and efficient cluster detection mechanisms in empirical syndromic surveillance systems relating to a broad spectrum of space, time, or space-time. 
UR - http://publichealth.jmir.org/2020/2/e11512/ UR - http://dx.doi.org/10.2196/11512 UR - http://www.ncbi.nlm.nih.gov/pubmed/32357126 ID - info:doi/10.2196/11512 ER - TY - JOUR AU - Huang, Qiangsheng AU - Kang, Sunny Yu PY - 2020/5/25 TI - Mathematical Modeling of COVID-19 Control and Prevention Based on Immigration Population Data in China: Model Development and Validation JO - JMIR Public Health Surveill SP - e18638 VL - 6 IS - 2 KW - COVID-19 KW - 2019-ncov KW - epidemic control and prevention KW - epidemic risk time series model KW - incoming immigration population KW - new diagnoses per day N2 - Background: At the end of February 2020, the spread of coronavirus disease (COVID-19) in China had drastically slowed and appeared to be under control compared to the peak data in early February of that year. However, the outcomes of COVID-19 control and prevention measures varied between regions (ie, provinces and municipalities) in China; moreover, COVID-19 has become a global pandemic, and the spread of the disease has accelerated in countries outside China. Objective: This study aimed to establish valid models to evaluate the effectiveness of COVID-19 control and prevention among various regions in China. These models also targeted regions with control and prevention problems by issuing immediate warnings. Methods: We built a mathematical model, the Epidemic Risk Time Series Model, and used it to analyze two sets of data, including the daily COVID-19 incidence (ie, newly diagnosed cases) as well as the daily immigration population size. Results: Based on the results of the model evaluation, some regions, such as Shanghai and Zhejiang, were successful in COVID-19 control and prevention, whereas other regions, such as Heilongjiang, yielded poor performance. The evaluation result was highly correlated with the basic reproduction number (R0) value, and the result was evaluated in a timely manner at the beginning of the disease outbreak. 
Conclusions: The Epidemic Risk Time Series Model was designed to evaluate the effectiveness of COVID-19 control and prevention in different regions in China based on analysis of immigration population data. Compared to other methods, such as R0, this model enabled more prompt issue of early warnings. This model can be generalized and applied to other countries to evaluate their COVID-19 control and prevention. UR - http://publichealth.jmir.org/2020/2/e18638/ UR - http://dx.doi.org/10.2196/18638 UR - http://www.ncbi.nlm.nih.gov/pubmed/32396132 ID - info:doi/10.2196/18638 ER - TY - JOUR AU - Avoundjian, Tigran AU - Dombrowski, C. Julia AU - Golden, R. Matthew AU - Hughes, P. James AU - Guthrie, L. Brandon AU - Baseman, Janet AU - Sadinle, Mauricio PY - 2020/4/30 TI - Comparing Methods for Record Linkage for Public Health Action: Matching Algorithm Validation Study JO - JMIR Public Health Surveill SP - e15917 VL - 6 IS - 2 KW - medical record linkage KW - public health surveillance KW - public health practice KW - data management N2 - Background: Many public health departments use record linkage between surveillance data and external data sources to inform public health interventions. However, little guidance is available to inform these activities, and many health departments rely on deterministic algorithms that may miss many true matches. In the context of public health action, these missed matches lead to missed opportunities to deliver interventions and may exacerbate existing health inequities. Objective: This study aimed to compare the performance of record linkage algorithms commonly used in public health practice. Methods: We compared five deterministic (exact, Stenger, Ocampo 1, Ocampo 2, and Bosh) and two probabilistic record linkage algorithms (fastLink and beta record linkage [BRL]) using simulations and a real-world scenario. 
We simulated pairs of datasets with varying numbers of errors per record and the number of matching records between the two datasets (ie, overlap). We matched the datasets using each algorithm and calculated their recall (ie, sensitivity, the proportion of true matches identified by the algorithm) and precision (ie, positive predictive value, the proportion of matches identified by the algorithm that were true matches). We estimated the average computation time by performing a match with each algorithm 20 times while varying the size of the datasets being matched. In a real-world scenario, HIV and sexually transmitted disease surveillance data from King County, Washington, were matched to identify people living with HIV who had a syphilis diagnosis in 2017. We calculated the recall and precision of each algorithm compared with a composite standard based on the agreement in matching decisions across all the algorithms and manual review. Results: In simulations, BRL and fastLink maintained a high recall at nearly all data quality levels, while being comparable with deterministic algorithms in terms of precision. Deterministic algorithms typically failed to identify matches in scenarios with low data quality. All the deterministic algorithms had a shorter average computation time than the probabilistic algorithms. BRL had the slowest overall computation time (14 min when both datasets contained 2000 records). In the real-world scenario, BRL had the lowest trade-off between recall (309/309, 100.0%) and precision (309/312, 99.0%). Conclusions: Probabilistic record linkage algorithms maximize the number of true matches identified, reducing gaps in the coverage of interventions and maximizing the reach of public health action. 
UR - http://publichealth.jmir.org/2020/2/e15917/ UR - http://dx.doi.org/10.2196/15917 UR - http://www.ncbi.nlm.nih.gov/pubmed/32352389 ID - info:doi/10.2196/15917 ER - TY - JOUR AU - de Lusignan, Simon AU - Lopez Bernal, Jamie AU - Zambon, Maria AU - Akinyemi, Oluwafunmi AU - Amirthalingam, Gayatri AU - Andrews, Nick AU - Borrow, Ray AU - Byford, Rachel AU - Charlett, André AU - Dabrera, Gavin AU - Ellis, Joanna AU - Elliot, J. Alex AU - Feher, Michael AU - Ferreira, Filipa AU - Krajenbrink, Else AU - Leach, Jonathan AU - Linley, Ezra AU - Liyanage, Harshana AU - Okusi, Cecilia AU - Ramsay, Mary AU - Smith, Gillian AU - Sherlock, Julian AU - Thomas, Nicholas AU - Tripathy, Manasa AU - Williams, John AU - Howsam, Gary AU - Joy, Mark AU - Hobbs, Richard PY - 2020/4/2 TI - Emergence of a Novel Coronavirus (COVID-19): Protocol for Extending Surveillance Used by the Royal College of General Practitioners Research and Surveillance Centre and Public Health England JO - JMIR Public Health Surveill SP - e18606 VL - 6 IS - 2 KW - general practice KW - medical record systems KW - computerized KW - sentinel surveillance KW - coronavirus KW - COVID-19 KW - SARS-CoV-2 KW - surveillance KW - infections KW - pandemic KW - records as topic KW - serology N2 - Background: The Royal College of General Practitioners (RCGP) Research and Surveillance Centre (RSC) and Public Health England (PHE) have successfully worked together on the surveillance of influenza and other infectious diseases for over 50 years, including three previous pandemics. With the emergence of the international outbreak of the coronavirus infection (COVID-19), a UK national approach to containment has been established to test people suspected of exposure to COVID-19. At the same time and separately, the RCGP RSC's surveillance has been extended to monitor the temporal and geographical distribution of COVID-19 infection in the community as well as assess the effectiveness of the containment strategy. 
Objectives: The aims of this study are to surveil COVID-19 in both asymptomatic populations and ambulatory cases with respiratory infections, ascertain both the rate and pattern of COVID-19 spread, and assess the effectiveness of the containment policy. Methods: The RCGP RSC, a network of over 500 general practices in England, extracts pseudonymized data weekly. This extended surveillance comprises five components: (1) Recording in medical records of anyone suspected to have or who has been exposed to COVID-19. Computerized medical record suppliers created new codes to support this within a week of the request. (2) Extension of current virological surveillance and testing people with influenza-like illness or lower respiratory tract infections (LRTI), with the caveat that people suspected to have or who have been exposed to COVID-19 should be referred to the national containment pathway and not seen in primary care. (3) Serology sample collection across all age groups. This will be an extra blood sample taken from people who are attending their general practice for a scheduled blood test. The 100 general practices currently undertaking annual influenza virology surveillance will be involved in the extended virological and serological surveillance. (4) Collecting convalescent serum samples. (5) Data curation. We have the opportunity to escalate the data extraction to twice weekly if needed. Swabs and sera will be analyzed in PHE reference laboratories. Results: General practice clinical system providers have introduced a new emergency set of clinical codes to support COVID-19 surveillance. Additionally, practices participating in current virology surveillance are now taking samples for COVID-19 surveillance from low-risk patients presenting with LRTIs. Within the first 2 weeks of setup of this surveillance, we have identified 3 cases: 1 through the new coding system and the other 2 through the extended virology sampling. 
Conclusions: We have rapidly converted the established national RCGP RSC influenza surveillance system into one that can test the effectiveness of the COVID-19 containment policy. The extended surveillance has already seen the use of new codes with 3 cases reported. Rapid sharing of this protocol should enable scientific critique and shared learning. International Registered Report Identifier (IRRID): DERR1-10.2196/18606 UR - https://publichealth.jmir.org/2020/2/e18606 UR - http://dx.doi.org/10.2196/18606 UR - http://www.ncbi.nlm.nih.gov/pubmed/32240095 ID - info:doi/10.2196/18606 ER - TY - JOUR AU - Le, Giang AU - Khuu, Nghia AU - Tieu, Thu Van Thi AU - Nguyen, Duy Phuc AU - Luong, Yen Hoa Thi AU - Pham, Duy Quang AU - Tran, Phuc Hau AU - Nguyen, Vu Thuong AU - Morgan, Meade AU - Abdul-Quader, S. Abu PY - 2019/01/29 TI - Population Size Estimation of Venue-Based Female Sex Workers in Ho Chi Minh City, Vietnam: Capture-Recapture Exercise JO - JMIR Public Health Surveill SP - e10906 VL - 5 IS - 1 KW - population size estimation KW - venue-based KW - female sex workers KW - Ho Chi Minh City KW - capture-recapture N2 - Background: There are limited population size estimates of female sex workers (FSWs) in Ho Chi Minh City (HCMC), the largest city in Vietnam. Only 1 population size estimation among venue-based female sex workers (VFSWs) was conducted in 2012 in HCMC. Appropriate estimates of the sizes of key populations are critical for resource allocation to prevent HIV infection. Objective: The aim of this study was to estimate the population size of the VFSWs from December 2016 to January 2017 in HCMC, Vietnam. Methods: A multistage capture-recapture study was conducted in HCMC. 
The capture procedures included selection of districts using stratified probability proportional to size, mapping to identify venues, approaching all VFSWs to screen their eligibility, and then distribution of a unique object (a small pink makeup bag) to all eligible VFSWs in all identified venues. The recapture exercise included equal probability random selection of a sample of venues from the initial mapping and then approaching FSWs in those venues to determine the number and proportion of women who received the unique object. The number of objects distributed was then divided by this proportion and its associated confidence bounds, calculated using sampling weights and accounting for study design, to calculate the number of VFSWs in the selected districts. This was then multiplied by the inverse of the proportion of districts selected to calculate the number of VFSWs in HCMC as a whole. Results: Out of 24 districts, 6 were selected for the study. Mapping identified 573 venues across which 2317 unique objects were distributed in the first capture. During the recapture round, 103 venues were selected and 645 VFSWs were approached and interviewed. Of those, 570 VFSWs reported receiving the unique object during the capture round. The total estimated number of VFSWs in the 6 selected districts was 2616 (95% CI 2445-3014); accounting for the fact that only 25% (6/24) of all districts were selected gives an overall estimate of 10,465 (95% CI 9782-12,055) VFSWs in HCMC. Conclusions: The capture-recapture exercise provided an estimated number of VFSWs in HCMC. However, for planning HIV prevention and care service needs among all FSWs, studies are needed to assess the number of sex workers who are not venue-based, including those who use social media platforms to sell services. 
UR - http://publichealth.jmir.org/2019/1/e10906/ UR - http://dx.doi.org/10.2196/10906 UR - http://www.ncbi.nlm.nih.gov/pubmed/30694204 ID - info:doi/10.2196/10906 ER - TY - JOUR AU - Talaei-Khoei, Amir AU - Wilson, M. James AU - Kazemi, Seyed-Farzan PY - 2019/01/15 TI - Period of Measurement in Time-Series Predictions of Disease Counts from 2007 to 2017 in Northern Nevada: Analytics Experiment JO - JMIR Public Health Surveill SP - e11357 VL - 5 IS - 1 KW - autocorrelation KW - disease counts KW - prediction KW - public health surveillance KW - time-series analysis N2 - Background: The literature in statistics presents methods by which autocorrelation can identify the best period of measurement to improve the performance of a time-series prediction. The period of measurement plays an important role in improving the performance of disease-count predictions. However, from the operational perspective in public health surveillance, there is a limitation to the length of the measurement period that can offer meaningful and valuable predictions. Objective: This study aimed to establish a method that identifies the shortest period of measurement without significantly decreasing the prediction performance for time-series analysis of disease counts. Methods: The data used in this evaluation include disease counts from 2007 to 2017 in northern Nevada. The disease counts for chlamydia, salmonella, respiratory syncytial virus, gonorrhea, viral meningitis, and influenza A were predicted. Results: Our results showed that autocorrelation could not guarantee the best performance for prediction of disease counts. However, the proposed method with the change-point analysis suggests a period of measurement that is operationally acceptable and performance that is not significantly different from the best prediction. Conclusions: The use of change-point analysis with autocorrelation provides the best and most practical period of measurement. 
UR - http://publichealth.jmir.org/2019/1/e11357/ UR - http://dx.doi.org/10.2196/11357 UR - http://www.ncbi.nlm.nih.gov/pubmed/30664479 ID - info:doi/10.2196/11357 ER - TY - JOUR AU - Aladağ, Emre Ahmet AU - Muderrisoglu, Serra AU - Akbas, Berfu Naz AU - Zahmacioglu, Oguzhan AU - Bingol, O. Haluk PY - 2018/06/21 TI - Detecting Suicidal Ideation on Forums: Proof-of-Concept Study JO - J Med Internet Res SP - e215 VL - 20 IS - 6 KW - suicide KW - suicidal ideation KW - suicidality KW - detection KW - prevention KW - classification model KW - text mining KW - machine learning KW - artificial intelligence KW - suicidal surveillance N2 - Background: In 2016, 44,965 people in the United States died by suicide. It is common to see people with suicidal ideation seek help or leave suicide notes on social media before attempting suicide. Many prefer to express their feelings with longer passages on forums such as Reddit and blogs. Because these expressive posts follow regular language patterns, potential suicide attempts can be prevented by detecting suicidal posts as they are written. Objective: This study aims to build a classifier that differentiates suicidal and nonsuicidal forum posts via text mining methods applied to post titles and bodies. Methods: A total of 508,398 Reddit posts longer than 100 characters and posted between 2008 and 2016 on SuicideWatch, Depression, Anxiety, and ShowerThoughts subreddits were downloaded from the publicly available Reddit dataset. Of these, 10,785 posts were randomly selected and 785 were manually annotated as suicidal or nonsuicidal. Features were extracted using term frequency-inverse document frequency, linguistic inquiry and word count, and sentiment analysis on post titles and bodies. Logistic regression, random forest, and support vector machine (SVM) classification algorithms were applied to the resulting corpus, and prediction performance was evaluated. 
Results: The logistic regression and SVM classifiers correctly identified the suicidality of posts with 80% to 92% accuracy and F1 score, depending on the data composition, closely followed by random forest, compared with the baseline ZeroR algorithm, which achieved 50% accuracy and a 66% F1 score. Conclusions: This study demonstrated that it is possible to detect people with suicidal ideation on online forums with high accuracy. The logistic regression classifier in this study can potentially be embedded on blogs and forums to make the decision to offer real-time online counseling in case a suicidal post is being written. UR - http://www.jmir.org/2018/6/e215/ UR - http://dx.doi.org/10.2196/jmir.9840 UR - http://www.ncbi.nlm.nih.gov/pubmed/29929945 ID - info:doi/10.2196/jmir.9840 ER - TY - JOUR AU - Fearon, Elizabeth AU - Chabata, T. Sungai AU - Thompson, A. Jennifer AU - Cowan, M. Frances AU - Hargreaves, R. James PY - 2017/09/14 TI - Sample Size Calculations for Population Size Estimation Studies Using Multiplier Methods With Respondent-Driven Sampling Surveys JO - JMIR Public Health Surveill SP - e59 VL - 3 IS - 3 KW - population surveillance KW - sample size KW - sampling studies KW - surveys and questionnaires KW - research design KW - data collection KW - sex workers KW - HIV N2 - Background: While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions. Objective: To guide the design of multiplier method population size estimation studies using respondent-driven sampling surveys to reduce the random error around the estimate obtained. Methods: The population size estimate is obtained by dividing the number of individuals receiving a service or the number of unique objects distributed (M) by the proportion of individuals in a representative survey who report receipt of the service or object (P). 
We have developed an approach to sample size calculation, interpreting methods to estimate the variance around estimates obtained using multiplier methods in conjunction with research into design effects and respondent-driven sampling. We describe an application to estimate the number of female sex workers in Harare, Zimbabwe. Results: There is high variance in estimates. Random error around the size estimate reflects uncertainty from M and P, particularly when the estimate of P in the respondent-driven sampling survey is low. As expected, sample size requirements are higher when the design effect of the survey is assumed to be greater. Conclusions: We suggest a method for investigating the effects of sample size on the precision of a population size estimate obtained using multiplier methods and respondent-driven sampling. Uncertainty in the size estimate is high, particularly when P is small, so balancing against other potential sources of bias, we advise researchers to consider longer service attendance reference periods and to distribute more unique objects, which is likely to result in a higher estimate of P in the respondent-driven sampling survey. UR - http://publichealth.jmir.org/2017/3/e59/ UR - http://dx.doi.org/10.2196/publichealth.7909 UR - http://www.ncbi.nlm.nih.gov/pubmed/28912117 ID - info:doi/10.2196/publichealth.7909 ER - TY - JOUR PY - 2011// TI - Roles of Health Literacy in Relation to Social Determinants of Health and Recommendations for Informatics-Based Interventions: Systematic Review JO - Online J Public Health Inform SP - e3602 VL - 3 IS - 2 UR - http://dx.doi.org/10.5210/ojphi.v3i2.3602 UR - http://www.ncbi.nlm.nih.gov/pubmed/23569612 ID - info:doi/10.5210/ojphi.v3i2.3602 ER -