Self-reported symptoms of SARS-CoV-2 infection in a non-hospitalized population: results from the large Italian web-based EPICOVID19 cross-sectional survey

Background: Understanding the occurrence of Severe Acute Respiratory Syndrome-Coronavirus-2 (SARS-CoV-2)-like symptoms in a large non-hospitalized population, when the epidemic peak was occurring in Italy, is of paramount importance, but data are scarce. Objective: The aims of this study were to evaluate the association of self-reported symptoms with SARS-CoV-2 nasopharyngeal swab (NPS) test results in non-hospitalized individuals and to estimate the occurrence of COVID-19-like symptoms in a larger non-tested population. Methods: This is an Italian countrywide self-administered cross-sectional web-based survey of adult volunteers who completed an anonymous questionnaire in the period 13-21 April 2020. The associations between symptoms potentially related to SARS-CoV-2 infection and NPS results were calculated as adjusted odds ratios with 95% confidence intervals (aOR, 95%CI) by means of multiple logistic regression analysis controlling for age, sex, education, smoking habits, and the number of co-morbidities. Thereafter, for each symptom and for their combination, we calculated sensitivity, specificity, accuracy and AUC in a ROC analysis to estimate the occurrence of COVID-19-like infections in the non-tested population. Results: A total of 171,310 individuals responded to the survey (59.9% females, mean age 47.4 years). Out of the 4,785 respondents with a known NPS test result, 4,392 were not hospitalized. Among them, the NPS-positive respondents (n=856) most frequently reported myalgia (61.6%), olfactory and/or taste disorders (OTDs, 59.2%), cough (54.4%), and fever (51.9%), whereas 7.7% were asymptomatic. Multiple regression analysis showed that OTDs (aOR 10.3, 95%CI 8.4-12.7), fever (aOR 2.5, 95%CI 2.0-3.1), myalgia (aOR 1.5, 95%CI 1.2-1.8), and cough (aOR 1.3, 95%CI 1.0-1.6) were associated with NPS positivity. Having two to four of these symptoms increased the aOR from 7.4 (95%CI 5.6-9.7) to 35.5 (95%CI 24.6-52.2).
The combination of the four symptoms showed an AUC of 0.810 (95%CI 0.795-0.825) in classifying NPS-P, and was applied to the non-hospitalized and non-tested sample (n=165,782). We found that from 4.4% to 12.1% of respondents had experienced symptoms suggestive of COVID-19 infection. Conclusions: Our results suggest that self-reported symptoms may be reliable indicators of SARS-CoV-2 infection in a pandemic context. A not negligible part (up to 12.1%) of the symptomatic respondents were left undiagnosed and potentially contributed to the spread of the infection. (JMIR Preprints 27/06/2020:21866) DOI: https://doi.org/10.2196/preprints.21866
https://preprints.jmir.org/preprint/21866 [unpublished, non-peer-reviewed preprint] JMIR Preprints Adorni et al




INTRODUCTION
It is worth noting that only approximately 20% of SARS-CoV-2 infected patients required hospital care [4]. The vast majority experienced a mild or subclinical form of the disease that did not require hospital admission [5], and a relatively high percentage (40% to 45%) remained asymptomatic [6].
Fever, upper respiratory symptoms, myalgia, headache, and gastrointestinal disturbances [4,7], as well as olfactory and taste disorders (OTDs) [8], have been frequently reported by SARS-CoV-2 patients. Nevertheless, the prevalence of COVID-19-related symptoms in the non-hospitalized population is still poorly investigated [9,10]. Early recognition of the conditions attributable to the infection is of paramount importance. This is particularly relevant for promptly identifying not only cases with a severe clinical course but also those with milder symptomatology, who can spread the infection and need to be immediately quarantined while testing and contact tracing are carried out.
This study is based on EPICOVID19, an anonymised self-administered web-based survey aimed at estimating the number of suspected cases of COVID-19 and investigating the role of the potential determinants of SARS-CoV-2 infection in a large sample of respondents living in Italy during the lockdown (started in Italy on 9 March 2020). The aims of this paper are to evaluate the association of self-reported symptoms with SARS-CoV-2 nasopharyngeal swab (NPS) test results in non-hospitalized individuals, and to estimate the occurrence of COVID-19-like symptoms in the non-tested population.

Study design and setting
EPICOVID19 is a national Italian internet-based survey that was carried out using a cross-sectional research design by a working group dedicated to collaborative public health SARS-CoV-2 research.
The survey was launched on 13 April 2020 and targeted adult volunteers living in Italy during the lockdown.

Recruitment
In order to enrol as many subjects as possible, the survey was promoted using social media (Facebook, Twitter, Instagram, WhatsApp), press releases, internet pages, local radio and TV stations, and institutional websites that called upon volunteers to visit the study website [11]. The inclusion criteria were i) age of >18 years; ii) access to a mobile phone, computer, or tablet with internet connectivity; and iii) on-line consent to participate in the study.

Development of the web-based questionnaire
EPICOVID19 was developed by the working group after a literature review of existing research into COVID-19, starting with the WHO protocols [12], and of the standard and validated instruments previously used to investigate severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) [13,14].
The questionnaire was adapted to the national context and implemented using the European Commission's open-source official EUSurvey management tool [15]. The participants were asked to complete the self-administered 38-item questionnaire, which contained mainly mandatory and closed questions divided into six sections: i) socio-demographic data; ii) clinical evaluation; iii) personal characteristics and health status; iv) housing conditions; v) lifestyle; and vi) behaviours following the lockdown (see Annex 1).

Data collection and variables
For the purposes of this study, we analysed a sub-set of data collected between 13 and 21 April 2020.

Study groups definitions
For the aims of this study, we defined three study samples:
1. Sample A, including the total population of respondents (N=171,310);
2. Subsample B, including the non-hospitalized individuals who were NPS-tested and had a known result (n=4,392);
3. Subsample C, including the non-hospitalized and non-tested individuals (n=165,782).

Statistical analysis
The continuous variables were expressed as mean values with standard deviations (SD), and the categorical variables as counts and percentages. The χ² test and one-way analysis of variance (ANOVA) were used to compare the characteristics of respondents by NPS test result (sample A).
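As an illustration of the χ² comparison of a categorical characteristic by test result, the following is a minimal sketch for a 2x2 table. It assumes a Pearson statistic without continuity correction; the function name `chi_square_2x2` and the counts are hypothetical and are not taken from the survey.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = test positive / test negative,
# columns = female / male (illustrative only).
chi2, p_value = chi_square_2x2(30, 70, 50, 50)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

For tables larger than 2x2, or for the ANOVA on continuous variables, a statistics package would normally be used instead of this hand-rolled version.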
The geographical coverage of the sample was evaluated by calculating response rates by Italian region standardized by the number of residents aged >18 years on 1 January 2019 [16]. When analysing subsample B, we calculated the matrix of pairwise tetrachoric correlations of self-reported symptoms, given the dichotomous nature of these variables. Crude and adjusted logistic regression models, controlling for age, sex, education, smoking habits, and the number of co-morbidities, were applied to assess the association between self-reported symptoms and SARS-CoV-2 NPS positivity versus negativity by estimating odds ratios (aOR) and 95% confidence intervals (95%CI).
Six hundred and ten respondents (0.35%) reported that they had been hospitalised between 1 February and 21 April 2020, including 399 of the 5,317 who were tested for SARS-CoV-2 infection (7.5%) (Supplementary Table S1). Females and younger respondents were less likely to be NPS positive (NPS-P), whereas those with a lower level of education or retired were more frequently NPS-P. Current smokers were less prevalent among the subjects with a positive NPS test (9.5%).
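To make the crude odds ratio mentioned in the statistical analysis concrete, here is a minimal sketch that computes an unadjusted odds ratio with a Woolf (log-scale) 95% confidence interval from a 2x2 table. This is not the adjusted model used in the study, which requires multiple logistic regression; the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(sym_pos, sym_neg, nosym_pos, nosym_neg, z=1.96):
    """Crude odds ratio of a positive test for symptom present vs absent.

    sym_pos / sym_neg: test-positive / test-negative counts with the symptom.
    nosym_pos / nosym_neg: the same counts without the symptom.
    Returns (odds_ratio, lower, upper) using the Woolf log-scale interval.
    """
    counts = (sym_pos, sym_neg, nosym_pos, nosym_neg)
    if min(counts) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (sym_pos * nosym_neg) / (sym_neg * nosym_pos)
    se_log = math.sqrt(sum(1 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_, lower, upper = odds_ratio_ci(50, 20, 50, 80)
print(f"OR = {or_:.2f} (95%CI {lower:.2f}-{upper:.2f})")
```

An adjusted odds ratio would instead be obtained by exponentiating the symptom coefficient of a logistic regression that also includes age, sex, education, smoking habits, and the number of co-morbidities.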

Geographical coverage
Although it lacked a formal sampling strategy, the survey reached a large number of subjects throughout Italy. Figure 1 shows the standardized response rates and the incidence of SARS-CoV-2 infection per 100,000 inhabitants by Italian region as of 23 April 2020 [16,18]. As expected, response rates were higher in the northern regions (Lombardy and Piedmont) and reflected the incidence of confirmed cases at that time.
In the sensitivity analysis, excluding the respondents who reported that their first symptom appeared in February did not substantially change the results (Supplementary Table S3).

Self-reported symptoms
Tables 3 and 4 show the results of the sex- and age-stratified multiple regression analyses.
OTDs were more closely associated with the odds of a positive test in females.
After dichotomizing for the presence of two or more and of three or more symptoms, the resulting aORs were 12.17 (95%CI, 9.50-15.59) and 22.44 (95%CI, 16.93-29.75). When the four symptoms were analyzed individually, the largest AUC (0.749, 95%CI 0.730-0.767) was found for OTDs, which also had the best specificity (Sp=91.8%), whereas myalgia had the highest sensitivity (Se=61.6%) in classifying NPS-P. The combination of the four symptoms increased the AUC to 0.810 (95%CI 0.795-0.825), with higher sensitivity at the cut-off of two or more (Se=70.7%) and higher specificity at the cut-off of three or more (Sp=91.2%) (data not shown).
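The cut-off analysis above can be sketched as follows. This is a minimal illustration, assuming the classifier is simply the count of the four symptoms (0 to 4) and that AUC is computed as the Mann-Whitney probability that a positive respondent has a higher count than a negative one, with ties counted as one half. The function names and the toy data are hypothetical.

```python
def classification_metrics(symptom_counts, test_positive, cutoff):
    """Sensitivity, specificity and accuracy when 'count >= cutoff' flags a case."""
    tp = fp = tn = fn = 0
    for count, positive in zip(symptom_counts, test_positive):
        flagged = count >= cutoff
        if positive and flagged:
            tp += 1
        elif positive:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: number of the four symptoms reported, and swab result.
counts = [4, 3, 2, 1, 0, 2, 1, 0, 0, 0]
positive = [True] * 5 + [False] * 5

for cutoff in (2, 3):
    print(cutoff, classification_metrics(counts, positive, cutoff))
print("AUC:", auc(counts, positive))
```

On this toy data, raising the cut-off from two to three lowers sensitivity and raises specificity, which is the same trade-off the reported cut-offs show.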
As a final step, we quantified the number of probable SARS-CoV-2 infections in the non-hospitalized and non-tested population (subsample C) by calculating the frequencies of the combination of the four symptoms identified in the analysis of subsample B. We found that, with an accuracy of 77.2% and 83.0% respectively, 20,103 respondents (12.1%, 95%CI 12.0%-12.3%) had two or more and 7,739 (4.4%, 95%CI 4.3%-4.6%) had three or more symptoms suggestive of novel coronavirus disease.
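The confidence interval for the two-or-more proportion can be reproduced with a short sketch. It assumes a normal (Wald) approximation, which is an assumption here since the text does not state which interval method was used; the function name `proportion_ci` is hypothetical.

```python
import math


def proportion_ci(successes, total, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    se = math.sqrt(p * (1 - p) / total)
    return p, p - z * se, p + z * se


# Respondents in subsample C with two or more of the four symptoms.
p, lower, upper = proportion_ci(20_103, 165_782)
print(f"{p:.1%} (95%CI {lower:.1%}-{upper:.1%})")
```

With these counts the sketch gives 12.1% (95%CI 12.0%-12.3%), in agreement with the interval reported for two or more symptoms.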

DISCUSSION
This study, based on the responses of >170,000 persons to a web-based survey, outlined the COVID-19 symptom profile of cases that did not require hospitalisation during the outbreak of the epidemic in Italy. OTDs, myalgia, fever and cough were the symptoms associated with laboratory-proven SARS-CoV-2 infection. Among non-hospitalized and non-tested respondents, from 4.4% to 12.1% experienced symptoms suggestive of COVID-19 illness.
Although approximately 60% of the respondents reported at least one symptom compatible with the viral infection, only 3.4% of these had access to NPS testing for SARS-CoV-2. Respondents with at least one symptom accounted for 94% of NPS-P patients, 70% of NPS-N patients, and 75% of the patients with an unknown NPS test result. We report here that sub-groups with symptomatology similar to that of NPS-P subjects were not tested, a worrying finding that suggests a large number of cases may have remained undiagnosed or may not have been correctly quarantined [19]. Active case finding with prompt isolation and contact tracing would be a highly important means of ending the spread of SARS-CoV-2 infection [20], which otherwise is likely to continue through households [21]. The very limited number of respondents who were diagnosed based on NPS testing is a consequence of the decision by health authorities to reserve the use of diagnostics for clinically severe cases, which created suboptimal conditions for effective contact tracing.
A number of papers have described the clinical characteristics, symptoms and disease course of SARS-CoV-2 in-patients [22,23] and out-patients [24], but still little is known about the natural history of the infection and its clinical spectrum or rate of symptoms in non-hospitalized COVID-19 cases. In our analyses, we showed a strong association between OTDs and NPS-P, with NPS-P respondents having more than 10-fold higher odds of reporting OTDs. In line with our findings, OTDs have been reported as a symptom specific to SARS-CoV-2 infection in clinical [8,25] and non-clinical settings [9,26,27]. In 18,401 users of the "COVID symptom tracker mobile app" in the UK and US who underwent molecular testing, loss of smell in addition to fever and persistent cough was found to be a potential predictor of COVID-19 [9]. Similar results were recently reported from two other online surveys in the Italian [26] and French populations [27]. Consistent with the aforementioned population studies, we also found that other COVID-19-related symptoms such as fever, myalgia or cough were significantly associated with NPS-P, even though they were less specific than OTDs. Overall, the four above-mentioned symptoms demonstrated an additive effect that increases the probability of NPS-P.
Interestingly, our sub-set analyses revealed some associations between the respondents' symptoms and their demographic characteristics. The association between OTDs and NPS-P was stronger in younger patients, possibly because the known deterioration in the sense of smell during aging [28] means that younger subjects are more likely to notice its loss. We also found that NPS-P was more closely associated with OTDs in women and with fever in men, although both symptoms are significantly associated with NPS-P in both sexes. The association between female sex and OTDs has also been reported in hospitalized COVID-19 patients [8].
Notably, in the subpopulation of 165,782 participants who were not NPS-tested and not hospitalized, we calculated with accuracy close to 80% that 12.1% had two or more of these symptoms and 4.4% had three or more, corresponding to a considerable number of adults with COVID-19-like illness. Applying the most conservative criterion (presence of three or more symptoms at the same time), characterized by a specificity of 91.2%, we estimated that about 2.2 million Italian adults had a high probability of being symptomatic COVID-19 cases up to April 21, 2020.
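As a back-of-the-envelope check on the scale of this estimate, the sketch below assumes that the 4.4% proportion was applied directly to the Italian adult population. That reading is an assumption, since the extrapolation method is not described in the text, and the function name `implied_population` is hypothetical.

```python
def implied_population(estimated_cases, proportion):
    """Population size implied by applying a proportion to reach a case estimate."""
    if not 0 < proportion <= 1:
        raise ValueError("proportion must be in (0, 1]")
    return estimated_cases / proportion


# 2.2 million estimated cases at the three-or-more proportion of 4.4%.
adults = implied_population(2_200_000, 0.044)
print(f"Implied adult population: about {adults / 1e6:.0f} million")
```

Under this assumption the two figures are mutually consistent with a denominator of roughly 50 million adults.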
The estimation of the real proportion of the population infected is a fundamental indicator for public health policy makers in the ongoing COVID-19 pandemic. During the epidemic peak, model-based estimates [29] suggested that the ratio between notified and actual cases ranges from 1:5 to 1:20, but to date in Italy real-world data are limited to restricted local settings or are available only in the case of NPS testing of symptomatic patients with serious illness requiring intensive or sub-intensive medical care. This lack has led to a wide underestimation of the spread of the novel coronavirus among mildly symptomatic individuals and those with limited access to testing. For this reason, our results seem to be consistent with other surveys performed in large populations. A model combining symptoms to predict probable infection was applied to the data derived from the "COVID symptom tracker mobile app" in the UK and USA [9], indicating that 17.4% of users were likely to have COVID-19-like infection. Data from a nationally representative survey of Canadians indicated that about 8% of adults reported that they or someone in their household had symptoms suggestive of novel coronavirus disease in March 2020 [10].
These findings suggest that during a pandemic, when testing and contact tracing should be prioritized, the presence of such symptoms, also detected through a simple anamnestic investigation, may be an early indicator of SARS-CoV-2 infection in individuals who should be quarantined and molecularly tested.
It is also interesting to note that 7.7% of non-hospitalized patients with an NPS-P test reported no symptoms. A number of studies have suggested that asymptomatic patients may be virus spreaders [30,31]. According to the results from sixteen SARS-CoV-2 testing studies pooled by Oran and colleagues, asymptomatic persons accounted for approximately 40% to 45% of COVID-19 infections [6]. In the Italian population study carried out on about 2,500 residents in the municipality of Vo', the authors showed that the age-adjusted prevalence of asymptomatic COVID-19 cases was 43.2% (95%CI 32.2%-54.7%) [5]. The characteristics of our study make it unsuitable for precisely estimating the percentage of completely asymptomatic cases, and our lower-than-expected findings can be explained by the limited access to molecular testing for asymptomatic individuals and by the possible over-reporting of symptoms.
Our data concerning an apparently protective role of smoking in relation to NPS-P add new evidence to a panorama in which it has been suggested that this habit may have divergent clinical, prognostic and epidemiological effects in COVID-19 patients [32]. This issue will be investigated in more detail in a separate article in order to contribute further to the current debate [33].

Study limitations and strengths
Given the voluntary nature of the survey, it was not intended to assess a representative sample of the general population. However, extensive participation has allowed us to collect a sample that is quite balanced in terms of gender and age, although shifted towards younger subjects with a higher level of education, as can be expected in the case of an on-line questionnaire. The characteristics of a web-based survey may also have introduced a bias leading people with symptoms to respond more often than those without symptoms, and people who are health-conscious to exaggerate (over-report) their symptoms. In addition, some symptoms (e.g., OTDs) are more likely to be subject to recall bias due to media emphasis on their association with the disease.
At the date of the survey collection analysis the NPS "testing rate" among Italian adults (age ≥18) was 610 (0.36%); among these, 279 (0.16%) were NPS-P, in line with the hospitalization rates in the general population.
As the sample was self-selected, our results should be generalised with caution.
Lastly, a single self-reported negative test cannot exclude a possible SARS-CoV-2 infection.
On-line surveys have become an accepted, low-cost and scalable means [34,35] of efficiently and rapidly involving a large number of people regardless of geographical distance, thus making them in some respects preferable to more traditional, time-consuming and expensive methods, especially in an ongoing emergency situation. Further, in the context of this outbreak, the EPICOVID19 survey might have involved people who had no opportunity to report their symptoms in other ways. It is noteworthy that our survey achieved a satisfactory geographical coverage, proportional, as expected, to the distribution of COVID-19 infection and to the reasonable likelihood that communities living in more affected areas would be more willing to respond.
To the best of our knowledge, this is the largest Italian web-based survey of SARS-CoV-2 symptoms and, notably, carried out during the epidemic peak in Italy, when data at population level were unavailable. National authorities, healthcare workers and the public have known little about the real spread of the infection since it started. Our preliminary findings shed some light on paucisymptomatic or mild infections by novel coronavirus disease in Italy.

Conclusions
The adoption of effective strategies and ready-to-use digital tools, like the real-time reporting
Legend: Left: response rates per 100,000. Right: incidence rates of SARS-CoV-2 per 100,000.
Legend: Error bars are ±2*standard error (normal approximation).