Characteristics of SARS-CoV-2 positive and complicated COVID-19 patient cohorts in Israel: A comparative analysis

Reliably identifying patients at increased risk for COVID-19 complications could guide clinical decisions, public health policies, and preparedness efforts. The most globally accepted definitions of at-risk patients rely primarily on epidemiological characterization of hospitalized COVID-19 patients. However, such characterization overlooks, and fails to correct for, the prevalence of existing conditions in the wider SARS-CoV-2 positive population. Here, we use the complete medical records of 4,353 Israeli SARS-CoV-2 positive individuals, of whom 173 experienced moderate or severe symptoms of COVID-19, to identify the conditions that increase the risk of disease complications in various age and sex strata. Our analysis suggests that cardiovascular and kidney diseases, obesity, and hypertension are significant risk factors for COVID-19 complications, as previously reported. Interestingly, it also indicates that depression (e.g., odds ratio, OR, for males 65 years or older: 2.94, 95% confidence interval [1.55, 5.58]; P-value = 0.014) as well as cognitive and neurological disorders (e.g., OR for individuals ≥ 65 years old: 2.65 [1.69, 4.17]; P-value < 0.001) are significant risk factors, and that smoking and a background of respiratory diseases do not significantly increase the risk of complications. Adjusting existing risk definitions following these observations may improve their accuracy and impact the global pandemic containment efforts.


Introduction
As of May 7, 2020, close to four million people worldwide had contracted severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and more than 265,000 people had died of coronavirus disease 2019 (COVID-19) complications. This pandemic poses grave challenges to patients, healthcare providers, and policy makers. Many of these challenges may be better addressed with timely stratification of patients into risk groups, based on their past and current medical characteristics. For example, reliably identifying patients at increased (or decreased) risk could guide clinical decisions (e.g., hospitalization vs home care), public health policies (e.g., risk-based quarantine), and preparedness efforts (e.g., expected medical equipment required).
Dozens of studies providing epidemiological characterization of severe COVID-19 patients have been published in the last few months. For example, a meta-analysis of twenty Chinese studies [1] suggested that elderly males with a high body mass index (BMI), high breathing rate, and a combination of underlying comorbidities (e.g., hypertension, diabetes, cardiovascular disease, and chronic obstructive pulmonary disease) are at a high risk of developing severe complications; and a more recent study characterized patients hospitalized with COVID-19 in the New York City area and compared, side-by-side, the clinical measures for patients who were discharged alive, died, or remained in hospital [2].
Consequently, a few algorithms for identifying patients at risk for (severe) COVID-19 complications have been proposed. The Centers for Disease Control and Prevention (CDC) identified individuals 65 years and older, those living in a nursing home or long-term care facility, and those suffering from underlying medical conditions, particularly if not well controlled, as being at high risk for severe illness from COVID-19 [3]. Similarly, the European Centre for Disease Prevention and Control (ECDC) lists age above 70 years and some underlying conditions as risk factors for critical illness [4]. The United Kingdom National Health Service (NHS) included solid organ transplant recipients, patients with specific cancers or severe respiratory conditions, pregnant women with significant heart disease, and those with increased risk of infection (e.g., due to immunosuppression therapies) in the highest clinical COVID-19 risk group [5]. In April 2020, approximately 1.3 million people in this group were asked to "shield" by staying at home for a period of at least 12 weeks. In addition, patients over 70 years, pregnant women, and those with some underlying health conditions (e.g., chronic respiratory diseases or BMI ≥ 40) were considered part of a wider vulnerable group (also referred to as the "flu group"). Finally, a more quantitative risk model has been adopted by the Israeli Ministry of Health (MoH), assigning a point for each underlying condition from a predefined list, then considering age group and point count to identify high-risk patients.
The vast majority of published reports characterized hospitalized (typically, severe) COVID-19 patients and only a handful analyzed broader (though small) cohorts of symptomatic individuals (e.g., [6]). These studies excluded, by design, asymptomatic patients and those who did not require hospitalization. As such, it is unclear to what extent the prevalence of comorbidities in the studied population differs from that of SARS-CoV-2 positive patients of the same age (and sex); and, accordingly, whether these comorbidities are significant risk factors for severe COVID-19 or merely a reflection of comorbidity prevalence in the wider population. Unlike these studies, we compare here the prevalence of existing conditions in SARS-CoV-2 positive and complicated COVID-19 patients and identify those associated with COVID-19 complications in various age and sex strata. Our analysis highlights stratum-specific risk factors and may allow better identification of patients at risk in different subpopulations.

Maccabi COVID-19 data
Maccabi Health Services (MHS) is a nationwide health plan (payer-provider), representing a quarter of the Israeli population. The MHS database contains longitudinal data on a stable population of over 2.3 million people since 1993 (with annual attrition rate lower than 1%). Data are automatically collected and include comprehensive laboratory data from a single central lab, full pharmacy prescription and purchase data, and extensive demographic information on each patient.

Studied cohorts
SARS-CoV-2 polymerase chain reaction testing in Israel uses both nasopharyngeal and saliva samples. Individuals with a positive test result (until April 22, 2020) are included in the SARS-CoV-2 positive cohort. Positive patients whose disease status, as updated by Israeli hospitals, deteriorated to moderate or severe (at any point in time), or who were admitted to the intensive care unit or died, constitute the complicated COVID-19 cohort; the remaining SARS-CoV-2 positive patients (including asymptomatic patients, mild COVID-19 patients, and those with unknown status) constitute the non-complicated COVID-19 cohort.
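The cohort rule above can be illustrated with a minimal Python sketch. The record layout (`worst_status`, `icu_admitted`, `died`) and the status labels are hypothetical names chosen for illustration; they are not the MHS schema, and the sketch is not the code used in this study.

```python
from dataclasses import dataclass

# Hypothetical record layout for one SARS-CoV-2 positive patient.
@dataclass
class Patient:
    worst_status: str   # worst status reported at any point in time, e.g.
                        # "asymptomatic", "mild", "moderate", "severe", "unknown"
    icu_admitted: bool  # ever admitted to the intensive care unit
    died: bool


def assign_cohort(patient: Patient) -> str:
    """Apply the cohort rule from the text to one positive patient."""
    deteriorated = patient.worst_status in {"moderate", "severe"}
    if deteriorated or patient.icu_admitted or patient.died:
        return "complicated"
    # Asymptomatic, mild, and unknown-status patients all fall here.
    return "non-complicated"


if __name__ == "__main__":
    print(assign_cohort(Patient("mild", icu_admitted=False, died=False)))
    print(assign_cohort(Patient("unknown", icu_admitted=True, died=False)))
```

Under this rule, a patient with unknown status is counted as non-complicated unless an ICU admission or death is recorded.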

Existing conditions
Besides age and sex, we considered a set of existing conditions, comprising those included in the CDC, NHS, and Israeli MoH at-risk definitions, as well as a set of conditions showing significant association with flu and flu-like complications.
To identify each individual's existing conditions, we used, where available, registries created and maintained by MHS. These registries are based on validated inclusion and exclusion criteria (considering coded diagnoses, treatment, labs, and imaging, as applicable). The registries are continuously and retrospectively (since 1998) updated based on each patient's central medical record. Patients may be excluded from a registry when deemed misclassified by their primary physician. Linkage across registries and with other sources of information is performed via a unique national identification number. The MHS registries used are: cardiovascular diseases (including ischemic heart disease, congestive heart failure, peripheral vascular disease, cerebrovascular disease, and other cardiovascular diseases) [7], diabetes [8,9], hypertension [10], osteoporosis [11], chronic kidney disease [12], cognitive disorders, mental illness [13], cancer, immunosuppression, weight disorders (obesity, overweight, and underweight), smoking, and nursing home. For other conditions, we relied on previously grouped lists of diagnosis codes (Read codes or International Classification of Diseases, ICD, codes, ninth revision) [14][15][16]: deficiency anemia, fluid and electrolyte disorders, chronic obstructive pulmonary disease (COPD), chronic pulmonary disease, neurological disorders, end stage renal disease, rheumatoid arthritis, paralysis, hip fracture, lymphoma, aspiration pneumonia, pleural effusion, respiratory failure, and alcohol consumption.

Statistical analysis
We extracted the prevalence of the studied conditions in the non-complicated and complicated COVID-19 cohorts and measured the association between each condition and disease complications by computing the corresponding odds ratio and its estimated statistical significance (using Fisher's exact test). We conducted the analysis separately in three age groups: 18-50 years, 50-65 years, and 65 years and older; as well as four (age, sex) strata: male or female, younger or older than 65 years. Finally, to account for multiple testing, we controlled for the false discovery rate using Benjamini and Hochberg's method [17]. All analyses were performed using version 4.0.0 of the R programming language (R Project for Statistical Computing; R Foundation).
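As an illustration of the steps described above, the following pure-Python sketch computes, for one 2x2 table (condition present/absent by complicated/non-complicated), a sample odds ratio and a two-sided Fisher's exact P-value, and then applies the Benjamini and Hochberg adjustment across several P-values. The analysis itself was performed in R; the function names here are assumptions made for this sketch. Note also that R's `fisher.test` reports a conditional maximum likelihood estimate of the odds ratio, so the simple sample odds ratio below need not reproduce the reported values exactly.

```python
from math import comb, inf


def sample_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio (a*d)/(b*c) for the table [[a, b], [c, d]]."""
    return inf if b * c == 0 else (a * d) / (b * c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )
    return min(1.0, p_value)


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    # Walk from the largest P-value down, keeping a running minimum.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy counts for illustration only; these are not study data.
    print(sample_odds_ratio(3, 1, 1, 3))
    print(fisher_exact_two_sided(3, 1, 1, 3))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In a stratified analysis such as the one described here, the same functions would be applied to one table per condition within each age or (age, sex) stratum.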
The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. This version posted May 11, 2020.

Results
The Maccabi Health Services (MHS) cohort included 4,353 SARS-CoV-2 positive individuals, of whom 173 deteriorated to moderate or severe condition, were admitted to the intensive care unit, or died (complicated COVID-19 cohort; see Methods). Overall, patients in the complicated COVID-19 cohort are older, suffer from more comorbidities, and are more predominantly male (Table 1). Moreover, the prevalence of COVID-19 complications increases with age, and more steeply for men than for women (Figure 1); and the risk of COVID-19 complications in men under 70 years is significantly higher than in women (Table 2). Comparing the prevalence of existing conditions in three age groups between the complicated and non-complicated COVID-19 cohorts reveals multiple risk factors, including obesity for patients 18-50 years (OR: 11.09, 95% confidence interval, CI: [4.15, 32.67]; P-value < 10^-4), chronic kidney disease for patients 50-65 years (4.06 [1.89, 8.38]; P-value = 0.005), and neurological disorders for patients 65 years or older (2.65 [1.69, 4.17]; P-value < 0.001) (for a complete list, see Table 3 and Supplementary File 1).
Stratifying over age (below and above 65 years) and sex [...]

Table 2 footnotes: (a) Number of males in the complicated/non-complicated COVID-19 cohorts, followed by the corresponding numbers in females. (b) Odds ratios (ORs) and 95% confidence intervals (in brackets). ORs greater than 1 suggest an increased risk for COVID-19 complications in males. (c) P-values adjusted for multiple testing using Benjamini and Hochberg's method [17]. Rows are sorted in ascending order of P-value.

Table 3 footnotes: (a) Number of patients with the condition (cases) in the complicated/non-complicated COVID-19 cohorts, followed by the corresponding numbers in patients without the studied condition (controls). (b) Odds ratios (ORs) and P-values as defined in Table 2. ORs greater than 1 suggest an increased risk for COVID-19 complications in patients with the noted condition.

Discussion
We compared the prevalence of dozens of existing conditions in Israeli SARS-CoV-2 positive and complicated COVID-19 patient cohorts, to highlight those conditions associated with a high risk of complications. Our approach differs from most (if not all) previous studies, which focused on hospitalized or symptomatic patients and had no (or limited) access to out-of-hospital (likely asymptomatic and mildly sick) patients.
Many conditions highlighted by our analysis have been previously reported [1,2,6] and are part of commonly used at-risk definitions [3,5], including hypertension, obesity, and kidney and cardiovascular diseases. We do, however, identify a few additional risk factors, notably depression in patients 18-50 years old and in males 65 years or older, and cognitive and neurological disorders in patients 65 years or older. These additions may be, in part, associated with a different age distribution in the 65+ years group (median 76y, IQR [70-83.5y] versus 72y [68-78y] in the complicated and non-complicated COVID-19 cohorts, respectively) and rely on a small sample size (only seven 18-50y patients with depression in the complicated COVID-19 cohort; Table 3); nonetheless, they may deserve more consideration in future studies. Our analysis also points to a reduced importance of respiratory diseases and smoking. Both conditions appear as factors in most at-risk definitions [3,5]: chronic obstructive pulmonary disease has been associated with severe COVID-19 in multiple (though not all [6]) studies [18], while the role of smoking has been somewhat controversial [18,19]. The discrepancies between our analysis and previous reports likely stem from the different cohorts analyzed: SARS-CoV-2 positive individuals, ranging from asymptomatic to severe COVID-19, versus hospitalized COVID-19 patients, respectively.
Other study-related attributes, for example country-specific characteristics, may also contribute to the varying significance of the studied risk factors.
In parallel to the COVID-19 epidemiological characterization efforts, researchers have also attempted to use retrospective observational data to derive risk models for severe COVID-19 patients [20]. Such models require ample data on COVID-19 patients for both model training and performance assessment. As such data are currently scarce, some models resorted to using data on other diseases with a supposedly similar clinical trajectory and complications. For example, DeCaprio et al. [19] trained models on US Medicare claims data to predict inpatient visits with a primary diagnosis of pneumonia, influenza, acute bronchitis, or other specified upper respiratory infections as a proxy for COVID-19 complications. However, as previously reported (e.g., in [22]), and in agreement with our analysis, the characteristics of severe COVID-19 patients differ considerably from those of patients with other diseases, thus undermining the generalizability of such models to COVID-19.
Our study has several limitations. First and foremost, the number of complicated COVID-19 patients in the MHS data is below 200, limiting the statistical power of our analysis. Second, healthcare policies and, in particular, testing criteria may systematically bias the composition of the SARS-CoV-2 positive cohort. Third, asymptomatic and mild COVID-19 patients (currently in the non-complicated cohort) may deteriorate and eventually become part of the complicated cohort, potentially modifying the results of the analysis. Fourth, our analysis is univariate in nature, testing the association of individual conditions with COVID-19 complications; as such, it is unable to uncover more complex relations, e.g., interdependencies between conditions and COVID-19 complications. Finally, we focused on data from Israel; characteristics in other geographies may differ [22]. We attempted to mitigate some of these limitations by age and sex stratification and robust estimations of statistical significance. We also note that, at the current point in time, many of these shortcomings are shared by all published COVID-19 research work.
Notwithstanding these limitations, our work adopts a novel vantage point on the problem of identifying patients at increased risk for COVID-19 complications. Importantly, as SARS-CoV-2 containment efforts focus on patients at risk for severe complications (for example, shielding the vulnerable population in the UK [5]), changes in the list of considered conditions may have a huge effect on a large number of individuals, thus calling for continuous fine-tuning of the corresponding definitions.