<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.0 20040830//EN" "http://dtd.nlm.nih.gov/publishing/2.0/journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.0" xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">JPH</journal-id>
      <journal-id journal-id-type="nlm-ta">JMIR Public Health Surveill</journal-id>
      <journal-title>JMIR Public Health and Surveillance</journal-title>
      <issn pub-type="epub">2369-2960</issn>
      <publisher>
        <publisher-name>JMIR Publications</publisher-name>
        <publisher-loc>Toronto, Canada</publisher-loc>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="publisher-id">v10i1e53330</article-id>
      <article-id pub-id-type="pmid">38666756</article-id>
      <article-id pub-id-type="doi">10.2196/53330</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Original Paper</subject>
        </subj-group>
        <subj-group subj-group-type="article-type">
          <subject>Original Paper</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>A Comprehensive Youth Diabetes Epidemiological Data Set and Web Portal: Resource Development and Case Studies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="editor">
          <name>
            <surname>Mavragani</surname>
            <given-names>Amaryllis</given-names>
          </name>
        </contrib>
        <contrib contrib-type="editor">
          <name>
            <surname>Sanchez</surname>
            <given-names>Travis</given-names>
          </name>
        </contrib>
      </contrib-group>
      <contrib-group>
        <contrib contrib-type="reviewer">
          <name>
            <surname>El Khamlichi</surname>
            <given-names>Sokaina</given-names>
          </name>
        </contrib>
        <contrib contrib-type="reviewer">
          <name>
            <surname>Zhao</surname>
            <given-names>Chenhao</given-names>
          </name>
        </contrib>
        <contrib contrib-type="reviewer">
          <name>
            <surname>Su</surname>
            <given-names>Yujie</given-names>
          </name>
        </contrib>
      </contrib-group>
      <contrib-group>
        <contrib id="contrib1" contrib-type="author" equal-contrib="yes">
          <name name-style="western">
            <surname>McDonough</surname>
            <given-names>Catherine</given-names>
          </name>
          <degrees>MS</degrees>
          <xref rid="aff1" ref-type="aff">1</xref>
          <ext-link ext-link-type="orcid">https://orcid.org/0009-0009-8465-6632</ext-link>
        </contrib>
        <contrib id="contrib2" contrib-type="author" equal-contrib="yes">
          <name name-style="western">
            <surname>Li</surname>
            <given-names>Yan Chak</given-names>
          </name>
          <degrees>MPhil</degrees>
          <xref rid="aff1" ref-type="aff">1</xref>
          <ext-link ext-link-type="orcid">https://orcid.org/0000-0001-6554-5958</ext-link>
        </contrib>
        <contrib id="contrib3" contrib-type="author">
          <name name-style="western">
            <surname>Vangeepuram</surname>
            <given-names>Nita</given-names>
          </name>
          <degrees>MPH, MD</degrees>
          <xref rid="aff2" ref-type="aff">2</xref>
          <xref rid="aff3" ref-type="aff">3</xref>
          <ext-link ext-link-type="orcid">https://orcid.org/0000-0003-4848-4633</ext-link>
        </contrib>
        <contrib id="contrib4" contrib-type="author">
          <name name-style="western">
            <surname>Liu</surname>
            <given-names>Bian</given-names>
          </name>
          <degrees>PhD</degrees>
          <xref rid="aff3" ref-type="aff">3</xref>
          <ext-link ext-link-type="orcid">https://orcid.org/0000-0001-9166-693X</ext-link>
        </contrib>
        <contrib id="contrib5" contrib-type="author" corresp="yes">
          <name name-style="western">
            <surname>Pandey</surname>
            <given-names>Gaurav</given-names>
          </name>
          <degrees>PhD</degrees>
          <xref rid="aff1" ref-type="aff">1</xref>
          <address>
            <institution>Department of Genetics and Genomic Sciences</institution>
            <institution>Icahn School of Medicine at Mount Sinai</institution>
            <addr-line>1 Gustave L. Levy Pl</addr-line>
            <addr-line>New York, NY, 10029</addr-line>
            <country>United States</country>
            <phone>1 212 241 6500</phone>
            <email>gaurav.pandey@mssm.edu</email>
          </address>
          <ext-link ext-link-type="orcid">https://orcid.org/0000-0003-1939-679X</ext-link>
        </contrib>
      </contrib-group>
      <aff id="aff1">
        <label>1</label>
        <institution>Department of Genetics and Genomic Sciences</institution>
        <institution>Icahn School of Medicine at Mount Sinai</institution>
        <addr-line>New York, NY</addr-line>
        <country>United States</country>
      </aff>
      <aff id="aff2">
        <label>2</label>
        <institution>Department of Pediatrics</institution>
        <institution>Icahn School of Medicine at Mount Sinai</institution>
        <addr-line>New York, NY</addr-line>
        <country>United States</country>
      </aff>
      <aff id="aff3">
        <label>3</label>
        <institution>Department of Population Health Science and Policy</institution>
        <institution>Icahn School of Medicine at Mount Sinai</institution>
        <addr-line>New York, NY</addr-line>
        <country>United States</country>
      </aff>
      <author-notes>
        <corresp>Corresponding Author: Gaurav Pandey <email>gaurav.pandey@mssm.edu</email></corresp>
      </author-notes>
      <pub-date pub-type="collection">
        <year>2024</year>
      </pub-date>
      <pub-date pub-type="epub">
        <day>2</day>
        <month>7</month>
        <year>2024</year>
      </pub-date>
      <volume>10</volume>
      <elocation-id>e53330</elocation-id>
      <history>
        <date date-type="received">
          <day>5</day>
          <month>10</month>
          <year>2023</year>
        </date>
        <date date-type="rev-request">
          <day>9</day>
          <month>1</month>
          <year>2024</year>
        </date>
        <date date-type="rev-recd">
          <day>6</day>
          <month>2</month>
          <year>2024</year>
        </date>
        <date date-type="accepted">
          <day>26</day>
          <month>4</month>
          <year>2024</year>
        </date>
      </history>
      <copyright-statement>©Catherine McDonough, Yan Chak Li, Nita Vangeepuram, Bian Liu, Gaurav Pandey. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 02.07.2024.</copyright-statement>
      <copyright-year>2024</copyright-year>
      <license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
        <p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on https://publichealth.jmir.org, as well as this copyright and license information must be included.</p>
      </license>
      <self-uri xlink:href="https://publichealth.jmir.org/2024/1/e53330" xlink:type="simple"/>
      <abstract>
        <sec sec-type="background">
          <title>Background</title>
          <p>The prevalence of type 2 diabetes mellitus (DM) and pre–diabetes mellitus (pre-DM) has been increasing among youth in recent decades in the United States, prompting an urgent need for understanding and identifying their associated risk factors. Such efforts, however, have been hindered by the lack of easily accessible youth pre-DM/DM data.</p>
        </sec>
        <sec sec-type="objective">
          <title>Objective</title>
          <p>We aimed to first build a high-quality, comprehensive epidemiological data set focused on youth pre-DM/DM. Subsequently, we aimed to make these data accessible by creating a user-friendly web portal to share them and the corresponding codes. Through this, we hope to address this significant gap and facilitate youth pre-DM/DM research.</p>
        </sec>
        <sec sec-type="methods">
          <title>Methods</title>
          <p>Building on data from the National Health and Nutrition Examination Survey (NHANES) from 1999 to 2018, we cleaned and harmonized hundreds of variables relevant to pre-DM/DM (fasting plasma glucose level ≥100 mg/dL or glycated hemoglobin ≥5.7%) for youth aged 12-19 years (N=15,149). We identified individual factors associated with pre-DM/DM risk using bivariate statistical analyses and predicted pre-DM/DM status using our Ensemble Integration (EI) framework for multidomain machine learning. We then developed a user-friendly web portal named Prediabetes/diabetes in youth Online Dashboard (POND) to share the data and codes.</p>
        </sec>
        <sec sec-type="results">
          <title>Results</title>
          <p>We extracted 95 variables potentially relevant to pre-DM/DM risk organized into 4 domains (sociodemographic, health status, diet, and other lifestyle behaviors). The bivariate analyses identified 27 significant correlates of pre-DM/DM (<italic>P</italic>&lt;.001, Bonferroni adjusted), including race or ethnicity, health insurance, BMI, added sugar intake, and screen time. Among these factors, 16 factors were also identified based on the EI methodology (Fisher <italic>P</italic> of overlap=7.06×10<sup>-6</sup>). In addition to those, the EI approach identified 11 additional predictive variables, including some known (eg, meat and fruit intake and family income) and less recognized factors (eg, number of rooms in homes). The factors identified in both analyses spanned across all 4 of the domains mentioned. These data and results, as well as other exploratory tools, can be accessed on POND.</p>
        </sec>
        <sec sec-type="conclusions">
          <title>Conclusions</title>
          <p>Using NHANES data, we built one of the largest public epidemiological data sets for studying youth pre-DM/DM and identified potential risk factors using complementary analytical approaches. Our results align with the multifactorial nature of pre-DM/DM with correlates across several domains. Also, our data-sharing platform, POND, facilitates a wide range of applications to inform future youth pre-DM/DM studies.</p>
        </sec>
      </abstract>
      <kwd-group>
        <kwd>youth prediabetes and diabetes</kwd>
        <kwd>public data set</kwd>
        <kwd>NHANES</kwd>
        <kwd>web portal</kwd>
        <kwd>epidemiology</kwd>
        <kwd>biostatistics</kwd>
        <kwd>machine learning</kwd>
        <kwd>National Health and Nutrition Examination Survey</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec sec-type="introduction">
      <title>Introduction</title>
      <p>Type 2 diabetes mellitus (DM) is a complex disease influenced by several biological and epidemiological factors [<xref ref-type="bibr" rid="ref1">1</xref>,<xref ref-type="bibr" rid="ref2">2</xref>], such as obesity [<xref ref-type="bibr" rid="ref3">3</xref>], family history [<xref ref-type="bibr" rid="ref4">4</xref>], diet [<xref ref-type="bibr" rid="ref1">1</xref>,<xref ref-type="bibr" rid="ref5">5</xref>], physical activity level [<xref ref-type="bibr" rid="ref1">1</xref>,<xref ref-type="bibr" rid="ref6">6</xref>-<xref ref-type="bibr" rid="ref8">8</xref>], and socioeconomic status [<xref ref-type="bibr" rid="ref9">9</xref>-<xref ref-type="bibr" rid="ref11">11</xref>]. Prediabetes, characterized by blood glucose levels that are elevated but remain below the diabetes threshold, is a precursor condition to DM [<xref ref-type="bibr" rid="ref12">12</xref>]. The prevalence of pre–diabetes mellitus (pre-DM) and DM among youth has risen alarmingly both in the United States [<xref ref-type="bibr" rid="ref13">13</xref>-<xref ref-type="bibr" rid="ref19">19</xref>] and worldwide [<xref ref-type="bibr" rid="ref20">20</xref>,<xref ref-type="bibr" rid="ref21">21</xref>], and the number of youth newly diagnosed with pre-DM/DM is also expected to increase [<xref ref-type="bibr" rid="ref14">14</xref>,<xref ref-type="bibr" rid="ref20">20</xref>,<xref ref-type="bibr" rid="ref22">22</xref>]. The latest estimate based on nationally representative data showed that the prevalence of pre-DM among youth increased from 11.6% in 1999-2002 to 28.2% in 2015-2018 in the United States [<xref ref-type="bibr" rid="ref13">13</xref>]. This growth is particularly concerning because pre-DM/DM disproportionately affects racial and ethnic minority groups and those with low socioeconomic status [<xref ref-type="bibr" rid="ref9">9</xref>-<xref ref-type="bibr" rid="ref11">11</xref>,<xref ref-type="bibr" rid="ref22">22</xref>-<xref ref-type="bibr" rid="ref24">24</xref>], leading to significant health disparities. Having pre-DM/DM at a younger age also confers a higher health and economic burden, resulting from living with the condition for more years and from a higher risk of developing other cardiometabolic diseases [<xref ref-type="bibr" rid="ref25">25</xref>-<xref ref-type="bibr" rid="ref30">30</xref>]. This serious challenge calls for increased translational research into factors associated with pre-DM/DM among youth and how they can collectively affect disease risk and inform prevention strategies.</p>
      <p>In particular, the most critically needed research in this direction is exploring the collective impact of various risk factors across multiple health-related domains. While clinical factors, such as obesity, have been mechanistically linked to insulin resistance [<xref ref-type="bibr" rid="ref31">31</xref>], it is important to consider the broader perspective. There is an increasing recognition that social determinants of health (SDoH) play a significant role in amplifying the risk of pre-DM/DM and their related disparities. For example, factors such as limited access to health care, food and housing insecurity, and the neighborhood-built environment have been identified as influential contributors [<xref ref-type="bibr" rid="ref9">9</xref>-<xref ref-type="bibr" rid="ref11">11</xref>,<xref ref-type="bibr" rid="ref32">32</xref>]. However, to gain a comprehensive understanding, it is essential to delve into other less studied variables, such as screen time, acculturation, or frequency of eating out, and examine how they interact to increase the risk of pre-DM/DM among youth [<xref ref-type="bibr" rid="ref2">2</xref>].</p>
      <p>One of the major challenges that has limited translational research into youth pre-DM/DM risk factors is the lack of publicly available, easily accessible data that comprehensively profile interrelated epidemiological factors for young individuals [<xref ref-type="bibr" rid="ref2">2</xref>]. Specifically, most available public diabetes data portals focus on providing aggregated descriptive trends, such as pre-DM/DM prevalence for the entire population or subgroups stratified by race and ethnicity [<xref ref-type="bibr" rid="ref33">33</xref>-<xref ref-type="bibr" rid="ref36">36</xref>], which precludes in-depth examination of the relationships between multiple risk factors and pre-DM/DM risk using individual-level data. While a few individual-level public diabetes data sets do exist [<xref ref-type="bibr" rid="ref37">37</xref>-<xref ref-type="bibr" rid="ref41">41</xref>], they include mainly clinical measurements, and their coverage of other important risk factors, such as those related to diet, physical activity, and SDoH, is limited. In addition, these data sets focus on adult populations rather than on youth [<xref ref-type="bibr" rid="ref37">37</xref>,<xref ref-type="bibr" rid="ref39">39</xref>-<xref ref-type="bibr" rid="ref41">41</xref>]. Furthermore, these data sets are not accompanied by any user-friendly web-based portals that can help explore or analyze these data to reveal interesting knowledge about youth pre-DM/DM. Together, these limitations point to the lack of a comprehensive data set that includes multiple epidemiological variables for studying youth pre-DM/DM, along with easily usable functionalities to explore and analyze such data.</p>
      <p>To directly address this data gap, we turned to the National Health and Nutrition Examination Survey (NHANES), which offers a promising path for examining pre-DM/DM among the US youth population by providing a rich source of individual- and household-level epidemiological factors. As a result, NHANES has been a prominent data source for studying youth pre-DM/DM trends and associated factors [<xref ref-type="bibr" rid="ref18">18</xref>,<xref ref-type="bibr" rid="ref42">42</xref>-<xref ref-type="bibr" rid="ref45">45</xref>]. However, the use of NHANES data requires extensive data processing that is laborious and time-intensive [<xref ref-type="bibr" rid="ref46">46</xref>]. This represents a major challenge for the widespread use of these high-quality and extensive data for studying youth pre-DM/DM.</p>
      <p>In this work, we directly addressed the above challenges by processing NHANES data from 1999 to 2018 into a large-scale, youth diabetes–focused data set that covers a variety of relevant variable domains, namely, sociodemographic factors, health status indicators, diet, and other lifestyle behaviors. We also provided public access to this high-quality comprehensive youth pre-DM/DM data set, as well as functionalities to explore and analyze it, through the user-friendly Prediabetes/diabetes in youth Online Dashboard (POND) [<xref ref-type="bibr" rid="ref47">47</xref>]. We demonstrated the data set’s use and potential through 2 case studies that used statistical analyses and machine learning (ML) approaches, respectively, to identify important epidemiological factors that are associated with youth pre-DM/DM.</p>
      <p>Through this work, we aim to advance youth diabetes research by providing the most comprehensive epidemiological data set available through a public web portal and illustrating the value of these resources through our example case studies based on statistical analyses and ML. Our overarching goal is to enable researchers to investigate the multifactorial variables associated with youth pre-DM/DM, which may drive translational advances in prevention and management strategies.</p>
    </sec>
    <sec sec-type="methods">
      <title>Methods</title>
      <sec>
        <title>Overview</title>
        <p><xref rid="figure1" ref-type="fig">Figure 1</xref> [<xref ref-type="bibr" rid="ref48">48</xref>] shows the overall study design and workflow. In the following subsections, we detail the components of the workflow.</p>
        <fig id="figure1" position="float">
          <label>Figure 1</label>
          <caption>
            <p>Study design and workflow. We processed data from 10 survey cycles (1999-2018) from the National Health and Nutrition Examination Survey (NHANES), which yielded 15,149 youths with known pre-DM/DM status. We extracted 95 variables that were relevant to pre-DM/DM and organized them into 4 domains: sociodemographic, health status, diet, and other lifestyle behaviors. We made the data set easily accessible to the public through the user-friendly POND (Prediabetes/diabetes in youth Online Dashboard) web portal, enabling users to navigate, visualize, and download the data. In addition, we conducted 2 case studies with complementary statistical and machine learning methods, designed to illustrate the translational potential of our data set and portal. Both analyses identified predictive variables associated with youth diabetes, and the results can be explored in POND (some images in this figure were obtained from an open-source collection). DM: diabetes mellitus.</p>
          </caption>
          <graphic xlink:href="publichealth_v10i1e53330_fig1.png" alt-version="no" mimetype="image" position="float" xlink:type="simple"/>
        </fig>
      </sec>
      <sec>
        <title>Data Source and Study Population</title>
        <p>We built the youth pre-DM/DM data set based on publicly available NHANES data [<xref ref-type="bibr" rid="ref49">49</xref>] spanning the years from 1999 to 2018. Developed by the Centers for Disease Control and Prevention, NHANES is a serial cross-sectional survey that gathers comprehensive health-related information from nationally representative samples of the noninstitutionalized population in the United States. The survey uses a multistage probability sampling method and collects data through questionnaires, physical examinations, and biomarker analysis. Each year, approximately 5000 individuals are included in the survey, and the data are publicly released in 2-year cycles.</p>
        <p><xref rid="figure2" ref-type="fig">Figure 2</xref> details the process used to define our study population. Briefly, of the total 101,316 participants in 1999-2018 NHANES, we excluded individuals who (1) were not within the 12-19 years age range, (2) did not have either of the biomarkers used to define pre-DM/DM status, and (3) answered “Yes” to “Have you ever been told by a doctor or health professional that you have diabetes?” The youth pre-DM/DM outcome of this work was derived as follows: youth were considered at risk of pre-DM/DM if their fasting plasma glucose (FPG) was at or greater than 100 mg/dL, or their glycated hemoglobin (HbA<sub>1c</sub>) was at or greater than 5.7%, according to the current American Diabetes Association (ADA) pediatric clinical guidelines [<xref ref-type="bibr" rid="ref2">2</xref>].</p>
        <fig id="figure2" position="float">
          <label>Figure 2</label>
          <caption>
            <p>Flow chart showing the inclusion and exclusion criteria applied to 1999-2018 NHANES participants that yielded the study population included in our youth pre-DM/DM data set. Pre-DM/DM status was defined by the current American Diabetes Association (ADA) biomarker criteria, that is, elevated levels of 1 of 2 pre-DM/DM biomarkers (FPG ≥100 mg/dL or HbA1c ≥5.7%). DM: diabetes mellitus; FPG: fasting plasma glucose; HbA1c: glycated hemoglobin; NHANES: National Health and Nutrition Examination Survey.</p>
          </caption>
          <graphic xlink:href="publichealth_v10i1e53330_fig2.png" alt-version="no" mimetype="image" position="float" xlink:type="simple"/>
        </fig>
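        <p>The inclusion criteria and outcome definition described above can be sketched in a few lines of Python. This is a minimal illustration only and is not the study's SAS or R code: the function and parameter names are hypothetical, and the sketch assumes that criterion (2) excludes participants for whom both biomarkers are missing.</p>
        <preformat><![CDATA[
```python
from typing import Optional

# Thresholds stated in the text (ADA biomarker criteria).
FPG_THRESHOLD_MG_DL = 100.0  # fasting plasma glucose, mg/dL
HBA1C_THRESHOLD_PCT = 5.7    # glycated hemoglobin, %


def is_eligible(age_years: float,
                fpg: Optional[float],
                hba1c: Optional[float],
                told_has_diabetes: bool) -> bool:
    """Apply the three exclusion criteria (hypothetical helper).

    Assumption: a participant is excluded under criterion (2) only
    when BOTH biomarkers are missing.
    """
    if not (12 <= age_years <= 19):      # (1) outside the 12-19 age range
        return False
    if fpg is None and hba1c is None:    # (2) no biomarker available
        return False
    if told_has_diabetes:                # (3) self-reported diagnosed diabetes
        return False
    return True


def has_pre_dm_or_dm(fpg: Optional[float], hba1c: Optional[float]) -> bool:
    """Flag pre-DM/DM when either available biomarker meets its threshold."""
    fpg_high = fpg is not None and fpg >= FPG_THRESHOLD_MG_DL
    hba1c_high = hba1c is not None and hba1c >= HBA1C_THRESHOLD_PCT
    return fpg_high or hba1c_high
```
]]></preformat>
        <p>Treating a missing biomarker as "not elevated" means that a participant with only 1 measured biomarker is classified on that biomarker alone; other handling of missing values is possible and would change the counts.</p>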
      </sec>
      <sec>
        <title>Validation of the Study Population</title>
        <p>We estimated pre-DM/DM prevalence across the 10 survey cycles (1999-2018) by incorporating the NHANES design elements in the analysis and compared the general trend with those reported in the literature [<xref ref-type="bibr" rid="ref18">18</xref>,<xref ref-type="bibr" rid="ref19">19</xref>]. We also applied the analytical methods reported in a recent NHANES-based study [<xref ref-type="bibr" rid="ref13">13</xref>] to our study population to replicate the trends in pre-DM among youth in the United States from 1999 to 2018 reported in that analysis. Specifically, that study selected youth aged 12-19 years with a positive sampling weight from the fasting subsample (ie, nonzero and nonmissing Fasting Subsample 2 Year Mobile Examination Centers Weight [“WTSAF2YR”]; personal communication) and without self-reported physician-diagnosed DM. In addition, that study focused only on pre-DM, which was defined as an HbA<sub>1c</sub> level between 5.7% and 6.4% or an FPG level between 100 mg/dL and 125 mg/dL [<xref ref-type="bibr" rid="ref13">13</xref>].</p>
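        <p>As a rough illustration of the replication criteria, the following Python sketch applies the pre-DM definition and the positive fasting-weight filter, then computes a weighted point estimate of prevalence. The function names and the record layout are hypothetical. The sketch produces a point estimate only: it does not use the strata and sampling-unit information needed for design-based variance estimation, and the treatment of participants with 1 biomarker in the pre-DM range and the other above it is an assumption, as the text does not specify that case.</p>
        <preformat><![CDATA[
```python
from typing import Iterable, Optional, Tuple

# (fasting subsample weight, FPG in mg/dL, HbA1c in %); hypothetical layout.
Record = Tuple[Optional[float], Optional[float], Optional[float]]


def is_pre_dm(fpg: Optional[float], hba1c: Optional[float]) -> bool:
    """Pre-DM only: HbA1c 5.7%-6.4% or FPG 100-125 mg/dL (inclusive).

    Assumption: a participant with one marker in range is flagged even
    if the other marker lies above its range.
    """
    fpg_in_range = fpg is not None and 100.0 <= fpg <= 125.0
    hba1c_in_range = hba1c is not None and 5.7 <= hba1c <= 6.4
    return fpg_in_range or hba1c_in_range


def weighted_prevalence(records: Iterable[Record]) -> float:
    """Weighted share of pre-DM among records with a positive weight."""
    flagged_weight = 0.0
    total_weight = 0.0
    for weight, fpg, hba1c in records:
        # Keep only nonmissing, nonzero weights (cf. WTSAF2YR in the text).
        if weight is None or weight <= 0:
            continue
        total_weight += weight
        if is_pre_dm(fpg, hba1c):
            flagged_weight += weight
    if total_weight == 0:
        raise ValueError("no records with a positive sampling weight")
    return flagged_weight / total_weight
```
]]></preformat>
        <p>For example, with toy weights of 1000 for 1 youth with an FPG of 105 mg/dL and 3000 for 1 youth with normal biomarkers, the weighted estimate is 0.25, whereas the unweighted share would be 0.50.</p>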
      </sec>
      <sec>
        <title>Development of Youth Pre-DM/DM Data Set</title>
        <p>Based on the most recent ADA standard of care recommendations including factors related to pre-DM/DM risk and management [<xref ref-type="bibr" rid="ref2">2</xref>], we selected 27 potentially relevant NHANES questionnaires and grouped them into 4 domains: sociodemographic, health status, diet, and other lifestyle behaviors. For example, under the health status domain, BMI was included as a potential risk factor for youth pre-DM/DM [<xref ref-type="bibr" rid="ref2">2</xref>]. Similarly, lifestyle and behavioral variables included factors, such as diet and physical activity, that have been shown to be critical for pre-DM/DM prevention in both observational studies and randomized clinical trials [<xref ref-type="bibr" rid="ref50">50</xref>-<xref ref-type="bibr" rid="ref52">52</xref>]. Our sociodemographic domain included demographic, socioeconomic, and SDoH variables (eg, age, gender, poverty status, and food security). Except for commonly available clinical measurements, such as blood pressure and total cholesterol, we did not include laboratory data (eg, triglycerides, transferrin, C-reactive protein, interleukin-6, and white blood cells), since these measurements were not collected for all NHANES participants and were not commonly accessible for the general population.</p>
        <p>From the selected questionnaires, we identified a list of 95 variables based on the aforementioned methodology. The complete list of variables is provided in Table S1 in Section S1 of <xref ref-type="supplementary-material" rid="app1">Multimedia Appendix 1</xref> [<xref ref-type="bibr" rid="ref13">13</xref>,<xref ref-type="bibr" rid="ref49">49</xref>,<xref ref-type="bibr" rid="ref53">53</xref>-<xref ref-type="bibr" rid="ref62">62</xref>] and on our POND web portal [<xref ref-type="bibr" rid="ref47">47</xref>]. All the code developed, the processed data, and detailed descriptions of the variables are also available on the web portal [<xref ref-type="bibr" rid="ref47">47</xref>]. The process of extracting these variables involved extensive examination of the questions that were asked, consultation of the literature, and discussions to reach consensus within the study team. The details of this process are provided in Figure S1 and Section S2 of <xref ref-type="supplementary-material" rid="app1">Multimedia Appendix 1</xref>. We used SAS (version 9.4; SAS Institute) and R (version 4.2.2; R Core Team, 2022) in RStudio (version 4.2.2; R Core Team, 2022) for data processing and data set development.</p>
      </sec>
      <sec>
        <title>Building the POND</title>
        <p>To facilitate other researchers’ use of our youth pre-DM/DM data set and make our methodology transparent and reproducible, we developed POND to share our processed data set and enable users to understand and explore the data on their own. The web portal was developed using R <italic>markdown</italic> and the <italic>flexdashboard</italic> package [<xref ref-type="bibr" rid="ref63">63</xref>] and was published as a Shiny application [<xref ref-type="bibr" rid="ref64">64</xref>]. Table S2 and Section S3 in <xref ref-type="supplementary-material" rid="app1">Multimedia Appendix 1</xref> provide details of all the R packages used to develop POND, and the related code is available on the portal’s do