Diet, Nutrition, Obesity, and Their Implications for COVID-19 Mortality: Development of a Marginalized Two-Part Model for Semicontinuous Data

Background: Nutrition is not a treatment for COVID-19, but it is a modifiable contributor to the development of chronic disease, which is highly associated with severe COVID-19 illness and deaths. A well-balanced diet and healthy patterns of eating strengthen the immune system, improve immunometabolism, and reduce the risk of chronic disease and infectious diseases. Objective: This study aims to assess the effect of diet, nutrition, obesity, and their implications for COVID-19 mortality among 188 countries by using new statistical marginalized two-part models. Methods: We globally evaluated the distribution of diet and nutrition at the national level while considering the variations between different World Health Organization regions. The effects of food supply categories and obesity on (as well as associations with) the number of deaths and the number of recoveries were reported globally by estimating coefficients and constructing color maps. Results: The findings show that a 1% increase in supplementation of pulses reduced the odds of having a zero death by 4-fold (OR 4.12, 95% CI 1.42-11.97). In addition, a 1% increase in supplementation of animal products and meat increased the odds of having a zero death by 1.076-fold (OR 1.076, 95% CI 1.01-1.15) and 1.13-fold (OR 1.13, 95% CI 1.0-1.28), respectively. Tree nuts reduced the odds of having a zero death, and vegetables increased the number of deaths. Globally, the results also showed that the consumption of eggs, cereals excluding beer, spices, and stimulants at the population (country) level had the greatest impact on the recovery of patients with COVID-19. In addition, populations that consume more meat, vegetal products, sugar and sweeteners, sugar crops, animal fats, and animal products were associated with more deaths and fewer recoveries in patients. The effect of consuming sugar products on mortality was considerable, and obesity was associated with increased death rates and reduced recovery rates.
Conclusions: Although there are differences in dietary patterns, overall, unbalanced diets are a health threat across the world and not only affect death rates but also the quality of life. To achieve the best results in preventing nutrition-related pandemic diseases, strategies and policies should fully recognize the essential role of both diet and obesity in determining good nutrition and optimal health. Policies and programs must address the need for change at the individual level and make modifications in society and the environment to make healthier choices accessible and preferable. (JMIR Public Health Surveill 2021;7(1):e22717) doi: 10.2196/22717


Introduction
Transmission of COVID-19 began in Wuhan, Hubei Province, China on December 31, 2019 [1,2]. According to the latest World Health Organization (WHO) report on July 3, 2020, there were 11,188,120 confirmed cases and 528,431 deaths worldwide, with 1505 total cases and 69.3 deaths per 1 million population [3]. The WHO named it a global pandemic because of the rapid outbreak of the disease worldwide [4,5].
The COVID-19 epidemic started during winter in areas of the world where the consumption of wildlife is not uncommon. Coronaviruses are among the viruses that cause the common cold, a disease for which there has never been a cure or any effective prevention or vaccine. However, relatively consistent data suggest that the risk of contracting the common cold is higher with inadequate sleep, psychosocial or physical stress (including exposure to cold temperatures), inadequate nutrition, and any condition that compromises the body's immune system [6].
A high percentage of COVID-19 deaths worldwide are associated with one or more chronic conditions. It is also evident that older people are at a higher risk for severe illness with this pandemic [7,8]. Nutrition is not a treatment for COVID-19, but it is a modifiable contributor to the development of chronic disease, which is highly associated with severe COVID-19 illness and deaths [9]. A well-balanced diet and healthy patterns of eating strengthen the immune system, improve immunometabolism, and reduce the risk of chronic disease and infectious diseases [10,11]. Furthermore, nutrition may have a positive or negative impact on COVID-19, as it may be a way to support people at higher risk for the disease (ie, older people and people with pre-existing noncommunicable diseases) [12].
It is clear in these challenging times that optimizing nutrition is important, not only for ourselves but also for every patient that goes through their own period of treatment. Every health system should be aware of the benefits of healthy eating and be able to provide sound nutritional guidance to their patients, especially those with chronic disease. Having knowledge about nutritional interventions that may help prevent chronic conditions and their associated risks is now more important than ever [13].
On the other hand, overweight and obesity are defined as excessive fat accumulation [14] that presents a risk to health [15]. Most of the world's population lives in countries where overweight and obesity kill more people than underweight. However, does obesity weaken the immune system or increase the severity of COVID-19? Does it raise the risk of infection and of mortality from COVID-19? This study aims to assess the effect of diet, nutrition, and obesity on COVID-19 mortality among 188 countries by using new statistical marginalized two-part (mTP) models. Hence, we globally evaluated the distribution of diet and nutrition at the national level while considering the variation between different regions. The effects of food supply categories and obesity on (as well as associations with) the number of deaths and the number of recoveries are reported worldwide by estimating coefficients and constructing color maps.

Overview
This section starts with a short description of the data set and information on relevant sources. We then introduce the conventional two-part (TP) regression model and the proposed mTP regression model for semicontinuous data. For the continuous part, we considered two flexible distributions: log-normal (LN) and beta prime (BP). We also describe their properties to assess the overall impact of covariates on the marginal mean and to compare the proposed model with the conventional model. Finally, the proposed mTP model was applied to the healthy diet data set on fat quantity and protein to investigate the effects of nutrition categories and obesity on the number of deaths and recoveries per 100 cases of COVID-19.

Dietary, Obesity, and COVID-19 Data
Food supply data is some of the most important data in both Food and Agriculture Organization (FAO)/WHO STAT [16]. In fact, this data is the basis for estimations of global and national undernourishment assessment when it is combined with parameters and other data sets. In addition, both businesses and governments use this data for economic analysis and policy setting, and the academic community also uses this data.
In this data set, we combined data on different types of food, world population obesity and undernourishment rates, and the global COVID-19 case counts from around the world (188 countries) to learn more about how a healthy eating style could help combat COVID-19. In addition, from the data set, we can gather information regarding diet patterns from countries with lower COVID-19 infection rates and adjust our own diet accordingly. The spread of the disease, deaths, recoveries, and their different distributions are shown in Figure 1, which can be evaluated according to the WHO regions. From the data sets, accessible as Google Sheets on GitHub [17], we used fat quantity and protein for different categories of food (all calculated as percentages of total intake amount). We also added the obesity rate (as a percentage) for comparison. The data sets also included the most up-to-date confirmed infections, deaths, recoveries, and active cases (also as percentages of the current population of each country). In this study, the response variables were the deaths per 100 cases and the recoveries per 100 cases, which were measured on a continuous scale (range 0 to 100) for 188 countries [18].
To account for interregional variations, data sets were grouped according to WHO regions (Multimedia Appendix 1), and an mTP analysis of deaths and recoveries was conducted using a random effects (region cluster) model. The food supply data are described in Multimedia Appendix 2. Both the fat quantity and protein data sets, each including 23 categories, were obtained from the FAO database [19] and were used to show the specific types of food that belong to each category for assessing influential effects of the fat quantity and protein supply.
Semicontinuous response variables such as mortality indexes are typically characterized by the presence of zeros and positive continuous outcomes that are often right skewed. In this paper, we propose a class of models for positive and zero responses by means of a zero-augmented mixed regression model. Under this class, we are particularly interested in studying positive responses whose distribution accommodates skewness. At the same time, responses can be zero, and therefore, we justified the use of a zero-augmented mixture model.

Conventional Two-Part Model
We began with a review of the conventional TP model presented in Cragg [20], Manning et al [21], Duan et al [22], and elsewhere. Let $Y_{ij}$ be a semicontinuous variable for the i-th (i=1,2, ..., n) subject at cluster j (j=1,2, ..., $n_i$). For nonnegative data ($Y_{ij} \ge 0$) consisting of independent observations clustered at the j-th level, the generic form of the conventional TP model can be written as:
$$f(y_{ij}) = (1-\pi_{ij})^{1(y_{ij}=0)} \times \left\{\pi_{ij}\, g(y_{ij} \mid y_{ij}>0)\right\}^{1(y_{ij}>0)}, \quad y_{ij} \ge 0 \tag{1}$$
where $\pi_{ij} = \Pr(Y_{ij}>0)$, $1(\cdot)$ is the indicator function, and $g(y_{ij} \mid y_{ij}>0)$ is any density function applicable to the positive values of $Y_{ij}$, although the LN density is often chosen. This model is parameterized following equations (2) and (3) for the zero and nonzero components, respectively:
$$\operatorname{logit}(\pi_{ij}) = Z_{ij}\alpha + b_{1i} \tag{2}$$
where $Z_{ij}$ is a 1 × q covariate (used as an explanatory variable) vector, $\alpha$ is a q × 1 regression coefficient vector, and $b_{1i}$ is the cluster-level random effect in the zero component. The location parameter $\mu_{ij}$ is modeled in the second part of the TP model assuming a log link:
$$\log(y_{ij} \mid y_{ij}>0) = \mu_{ij} + \varepsilon_{ij}, \qquad \mu_{ij} = X_{ij}\beta + b_{2i} \tag{3}$$
where $X_{ij}$ is a 1 × p covariate vector, $\beta$ is a p × 1 regression coefficient vector, and $b_{2i}$ is again the cluster-level random effect in the nonzero component. The error term $\varepsilon_{ij}$ is assumed to be normally distributed as $N(0, \sigma^2)$. Note that this TP mixed model can be extended to include additional random effects. For illustration purposes and simplicity, we restricted attention here to the TP mixed models with two levels; extensions to multilevel models are straightforward. When fitting this model to independent responses, the binary and conditionally continuous components of the likelihood are separable, and therefore, these two parts are fit separately. The binary component is often modeled using logistic regression, and the continuous component can be fit using standard regression models such as the BP [23], LN [24], gamma [25,26], and log-skew-normal [27].
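The marginal mean and variance of $Y_{ij}$ from a TP model can be derived as follows:
$$E(Y_{ij}) = \pi_{ij}\,E(Y_{ij} \mid Y_{ij}>0), \qquad \operatorname{Var}(Y_{ij}) = \pi_{ij}\operatorname{Var}(Y_{ij} \mid Y_{ij}>0) + \pi_{ij}(1-\pi_{ij})\left\{E(Y_{ij} \mid Y_{ij}>0)\right\}^2 \tag{4}$$
When LN is assumed in the continuous part, with $\mu_{ij}$ and $\sigma^2$ denoting the mean and variance of the positive responses on the log scale, the marginal mean is:
$$E(Y_{ij}) = \pi_{ij}\exp\left(\mu_{ij} + \frac{\sigma^2}{2}\right) \tag{5}$$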
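The two-part moment formulas can be checked numerically. The following is a minimal Python sketch, assuming a log-normal positive part; the function names and the input values are hypothetical illustrations and are not estimates from this study.

```python
import math

def ln_conditional_moments(mu, sigma2):
    """Mean and variance of Y | Y > 0 when log(Y | Y > 0) ~ N(mu, sigma2)."""
    mean = math.exp(mu + sigma2 / 2.0)
    var = (math.exp(sigma2) - 1.0) * math.exp(2.0 * mu + sigma2)
    return mean, var

def tp_marginal_moments(pi, cond_mean, cond_var):
    """Marginal mean and variance of a two-part outcome with Pr(Y > 0) = pi."""
    mean = pi * cond_mean
    var = pi * cond_var + pi * (1.0 - pi) * cond_mean ** 2
    return mean, var

# Hypothetical inputs (assumptions for illustration only).
pi, mu, sigma2 = 0.8, 1.0, 0.5
marginal_mean, marginal_var = tp_marginal_moments(
    pi, *ln_conditional_moments(mu, sigma2)
)
```

The marginal mean shrinks the conditional mean by the probability of a positive response, which is why a covariate can change the overall mean through either part of the model.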

Marginalized Two-Part Model
To obtain interpretable covariate effects on the marginal (unconditional) mean, we proposed the following mTP model that parameterizes the covariate effects directly in terms of the marginal mean, $\nu_{ij} = E(Y_{ij})$, on the original (ie, untransformed) data scale. The mTP model specifies the linear predictors:
$$\operatorname{logit}(\pi_{ij}) = Z_{ij}\alpha + b_{1i} \tag{6}$$
where $b_{1i}$ represents the random effect that accounts for the within-subject correlation pertaining to the clustered measures in the zero part, $b_{1i} \sim N(0, \sigma_{b_1}^2)$, and
$$\log(\nu_{ij}) = X_{ij}\beta + b_{2i} \tag{7}$$
where $b_{2i}$ represents the random effect that accounts for the within-subject correlation pertaining to the clustered measures in the continuous part, $b_{2i} \sim N(0, \sigma_{b_2}^2)$.
The two random intercepts $b_{1i}$ and $b_{2i}$ in the zero and nonzero processes are assumed to be independent.
$Z_{ij}$ is the vector of covariates for the i-th subject measured at the j-th cluster for the binary part, and $X_{ij}$ is the vector of covariates for the i-th subject measured at the j-th cluster used for the continuous part. The two parts might have common covariates or completely different ones. $\alpha$ is the vector of model coefficients corresponding to the binary part, and $\beta$ is the vector of coefficients corresponding to the continuous part, interpreted here as effects on the log of the marginal mean $\nu_{ij}$. The model can be easily extended to include higher-order random effects.

Marginalized Two-Part Log-Normal Model
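When modeling semicontinuous data, the continuous component is most frequently modeled using an LN distribution. The generic form of the marginalized two-part log-normal (mTP LN) model for independent responses can be written as in equation (1), with $g(y_{ij} \mid y_{ij}>0)$ taking the LN density function $LN(\cdot; \mu, \sigma^2)$ with mean $\mu$ and variance $\sigma^2$ on the log scale. The marginal mean and variance of $Y_{ij}$ are:
$$\nu_{ij} = E(Y_{ij}) = \pi_{ij}\exp\left(\mu_{ij} + \frac{\sigma^2}{2}\right) \tag{8}$$
$$\operatorname{Var}(Y_{ij}) = \pi_{ij}\exp\left(2\mu_{ij} + \sigma^2\right)\left\{\exp(\sigma^2) - \pi_{ij}\right\} \tag{9}$$
The likelihood (L), parameterized in terms of $\pi_{ij}$ and $\mu_{ij}$, is:
$$L = \prod_{i=1}^{n} \int\!\!\int \prod_{j=1}^{n_i} (1-\pi_{ij})^{1(y_{ij}=0)} \left\{ \frac{\pi_{ij}}{y_{ij}\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(\log y_{ij} - \mu_{ij})^2}{2\sigma^2}\right] \right\}^{1(y_{ij}>0)} \phi(b_{1i}, b_{2i})\, db_{1i}\, db_{2i} \tag{10}$$
where $\phi(b_{1i}, b_{2i})$ represents the bivariate normal distribution for the random effects with a mean vector of zeros and a variance-covariance matrix whose diagonal entries are the random-effect variances of the zero and nonzero parts, respectively (the unsubscripted $\pi$ under the square root is the mathematical constant).
To use this LN likelihood framework, the marginal mean in equation (8) can be rearranged to solve for $\mu_{ij}$, yielding:
$$\mu_{ij} = \log(\nu_{ij}) - \log(\pi_{ij}) - \frac{\sigma^2}{2} \tag{11}$$
Noting that:
$$\pi_{ij} = \frac{\exp(Z_{ij}\alpha + b_{1i})}{1 + \exp(Z_{ij}\alpha + b_{1i})} \tag{12}$$
and:
$$\nu_{ij} = \exp(X_{ij}\beta + b_{2i}) \tag{13}$$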
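The rearrangement of the marginal mean can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions: random effects are omitted (independent observations only), and the names `implied_mu` and `mtp_ln_loglik` are hypothetical rather than taken from the authors' SAS code.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def implied_mu(log_nu, pi, sigma2):
    """Log-scale location implied by the marginal mean nu and Pr(Y > 0) = pi."""
    return log_nu - math.log(pi) - sigma2 / 2.0

def mtp_ln_loglik(y, z, x, alpha, beta, sigma2):
    """Log-likelihood of an mTP LN model WITHOUT random effects (assumption).

    y: responses (>= 0); z, x: covariate rows for the zero and continuous
    parts; alpha, beta: coefficient lists; sigma2: log-scale variance.
    """
    total = 0.0
    for yi, zi, xi in zip(y, z, x):
        pi = expit(sum(a * v for a, v in zip(alpha, zi)))
        log_nu = sum(b * v for b, v in zip(beta, xi))  # log of marginal mean
        mu = implied_mu(log_nu, pi, sigma2)
        if yi == 0:
            total += math.log(1.0 - pi)
        else:
            log_y = math.log(yi)
            total += (math.log(pi) - log_y
                      - 0.5 * math.log(2.0 * math.pi * sigma2)
                      - (log_y - mu) ** 2 / (2.0 * sigma2))
    return total
```

Because `mu` depends on both `alpha` (through `pi`) and `beta`, the two parts cannot be maximized separately, which matches the non-separability noted for the mTP model.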

Marginalized Two-Part Beta Prime Model
The BP distribution [28,29], also known as the inverted beta distribution or the beta distribution of the second kind, is often the model of choice for fitting semicontinuous data where the response variable is measured continuously on the positive real line (Y>0) because of the flexibility it provides in terms of the variety of shapes it can accommodate. The probability density function of a BP-distributed random variable Y parameterized in terms of its mean $\mu$ and a precision parameter $\phi$ is given by:
$$f(y; \mu, \phi) = \frac{y^{\mu(1+\phi)-1}\,(1+y)^{-[\mu(1+\phi)+\phi+2]}}{B(\mu(1+\phi),\, \phi+2)}, \quad y>0 \tag{14}$$
where $B(\cdot,\cdot)$ is the beta function, $\mu>0$, and $\phi>0$, so that $E(Y)=\mu$ and $\operatorname{Var}(Y)=\mu(1+\mu)/\phi$. As in the general mTP model, the covariate effects are parameterized directly in terms of the marginal mean, $\nu_{ij} = E(Y_{ij})$, on the original (ie, untransformed) data scale. The mTP model with random (cluster) effects $Z_{1ij}$ and $Z_{2ij}$ for the zero and the continuous components, respectively, specifies the linear predictors of $\operatorname{logit}(\pi_{ij})$ and $\log(\nu_{ij})$ in equations (15) and (16), respectively [30,31], which have the same form as the linear predictors of the general mTP model. The covariate matrices of the zero and the continuous components have full rank p and q, respectively, and $\alpha_{(p+1)\times 1}$ and $\beta_{(q+1)\times 1}$ are the corresponding vectors of the regression coefficients. The error term $e_{ij}$ was also assumed to be normally distributed with mean zero and independent of the random effects.
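As a sanity check on the mean-precision form of the BP density, a short Python sketch can evaluate the log-density with standard-library functions only. It assumes the common mean-precision parameterization with shape parameters mu(1 + phi) and phi + 2; the function name `bp_logpdf` and the example values are hypothetical.

```python
import math

def bp_logpdf(y, mu, phi):
    """Log-density of the beta prime distribution with mean mu and precision phi.

    Assumes shape parameters a = mu * (1 + phi) and b = phi + 2, so that
    E(Y) = mu and Var(Y) = mu * (1 + mu) / phi.
    """
    if y <= 0 or mu <= 0 or phi <= 0:
        raise ValueError("y, mu, and phi must all be positive")
    a = mu * (1.0 + phi)
    b = phi + 2.0
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a - 1.0) * math.log(y) - (a + b) * math.log1p(y) - log_beta

# Example evaluation at hypothetical values.
example_density = math.exp(bp_logpdf(1.5, mu=2.0, phi=5.0))
```

Integrating this density numerically (for example with the substitution y = t / (1 - t)) should return a total mass close to 1 and a mean close to `mu`, which is a quick way to confirm the parameterization before using it in a likelihood.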
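Let $\psi_{ij} = I(y_{ij}>0)$ denote the indicator of $Y_{ij}$ being nonzero. The general form of the likelihood function for the i-th subject can be described as follows:
$$L_i = \int\!\!\int \exp\left\{\ell_{1i} + \ell_{2i}\right\} \phi(b_{1i}, b_{2i})\, db_{1i}\, db_{2i} \tag{17}$$
where $\phi(b_{1i}, b_{2i})$ is the bivariate normal density of the random effects, the log-likelihood for the binary part is:
$$\ell_{1i} = \sum_{j=1}^{n_i} \left\{ \psi_{ij}\log(\pi_{ij}) + (1-\psi_{ij})\log(1-\pi_{ij}) \right\} \tag{18}$$
and the log-likelihood for the continuous part is:
$$\ell_{2i} = \sum_{j=1}^{n_i} \psi_{ij}\log g(y_{ij} \mid y_{ij}>0) \tag{19}$$
with $\mu_{ij} = \nu_{ij}/\pi_{ij}$ for the BP density, because the marginal mean is the product of $\pi_{ij}$ and the conditional mean. This likelihood can be implemented in the SAS NLMIXED procedure by quasi-Newton optimization with adaptive Gaussian quadrature techniques [32]. With the conventional model, the likelihood and score equations can be separated into two independent components: one for the binary part and one for the continuous part. In contrast, note that the score equations for the mTP model are not separable, and thus, the binary and continuous parts are fit simultaneously. Model-based asymptotic standard errors are computed using the Fisher information matrix, $I(\alpha, \beta, \sigma)$, as:
$$\widehat{\operatorname{SE}} = \sqrt{\operatorname{diag}\left\{ I(\hat{\alpha}, \hat{\beta}, \hat{\sigma})^{-1} \right\}} \tag{20}$$
with the maximum likelihood estimates substituted for $\alpha$, $\beta$, and $\sigma$.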
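To give a feel for how a random intercept is integrated out of the likelihood, the sketch below uses a fixed three-point Gauss-Hermite rule in plain Python. This is a deliberately simplified stand-in: it is not adaptive, it handles only the binary (zero) part, and the function names are hypothetical; SAS NLMIXED uses adaptive quadrature with more nodes.

```python
import math

# Three-point Gauss-Hermite rule for the weight function exp(-x^2).
GH3_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH3_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)

def normal_expectation(f, sd):
    """Approximate E[f(b)] for b ~ N(0, sd^2) using the three-point rule."""
    total = 0.0
    for node, weight in zip(GH3_NODES, GH3_WEIGHTS):
        total += weight * f(math.sqrt(2.0) * sd * node)
    return total / math.sqrt(math.pi)

def marginal_lik_zero_part(is_positive, eta, sd_b1):
    """Marginal likelihood of one cluster's nonzero indicators.

    is_positive: booleans (y > 0) for the cluster; eta: fixed-effect linear
    predictors on the logit scale; sd_b1: random-intercept standard deviation.
    """
    def conditional_lik(b1):
        lik = 1.0
        for pos, e in zip(is_positive, eta):
            p = 1.0 / (1.0 + math.exp(-(e + b1)))
            lik *= p if pos else (1.0 - p)
        return lik
    return normal_expectation(conditional_lik, sd_b1)
```

With `sd_b1 = 0` the function reduces to the ordinary product of Bernoulli probabilities; a larger standard deviation spreads the evaluation points and averages the conditional likelihood over them.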

Results
In this section, the proposed mTP model was applied to the healthy diet data set on fat and protein to investigate the effects of supplementation categories on the number of deaths per 100 cases and recoveries per 100 cases of COVID-19. The estimations of mTP BP and mTP LN related to deaths and recoveries are shown in Tables 1 and 2, respectively. In these tables, the random-effect variances of the zero and nonzero (ie, positive) components show the variability of responses at level 2 (ie, among the WHO regions). Tables 1 and 2 show that almost all categories have the same effect on the number of deaths and recoveries per 100 cases. The number of deaths per 100 cases, number of recoveries per 100 cases, and the obesity rates until July 3, 2020, for all countries and split by the WHO regions are shown in Multimedia Appendix 3. Deaths were more common in Western and Southwest Europe (eg, Belgium, the United Kingdom, France, Italy, Hungary, Netherlands, and Spain), North America (eg, Mexico, Bahamas, Canada, Barbados, Belize, and the United States), and North Africa (eg, Western Sahara, Chad, Algeria, and Niger). The highest number of deaths occurred in Yemen (26.62 deaths per 100 cases), which could be due to the crises caused by the war and the poor health conditions in this country in recent years. In general, the northern regions of the world appear to have had more deaths, which may be due to temperature differences between the two hemispheres.
According to the results of Table 1, no category had a significant effect on the number of deaths except pulses in the fat quantity data and animal products, meat, tree nuts, and vegetables in the protein data. A 1% increase in supplementation of pulses reduced the odds of having a zero death by 4-fold (1 / exp(-1.417) ≈ 4.125). In addition, a 1% increase in supplementation of animal products and meat increased the odds of having a zero death by 1.076-fold (exp(0.0736) = 1.076) and 1.133-fold, respectively. Tree nuts reduced the odds of having a zero death, and vegetables increased the number of deaths.
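The odds ratios quoted above follow from exponentiating the logit-scale coefficients. A minimal Python sketch of that arithmetic, using the two coefficients stated in the text (the helper name `odds_ratio` is a hypothetical convenience):

```python
import math

def odds_ratio(coef):
    """Odds ratio for a one-unit covariate increase under a logit link."""
    return math.exp(coef)

# Coefficients as quoted in the text.
or_animal_products = odds_ratio(0.0736)      # about 1.076
or_pulses_inverted = 1.0 / odds_ratio(-1.417)  # about 4.125
```

Inverting a ratio below 1, as done for pulses, expresses the same effect as a fold reduction rather than a fold increase.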
Similarly, no category had a significant effect on the number of recoveries except animal fats, sugar and sweeteners, and tree nuts in the fat quantity data and animal fats and sugar crops in the protein data (Table 2). The effect of consuming sugar products on mortality was considerable. Every 1% increment in sugar and sweeteners decreased the number of recoveries by 98.17% (-9.68, 95% CI -12.6440 to -6.7098). Tree nuts in fat quantity also reduced the number of recoveries by 15.9% (-0.1732, 95% CI -0.3157 to -0.0307). In the protein data, sugar crops reduced the number of recoveries by 99.11% (1 - exp(-4.7273) = 0.9911). The world map related to sugar and sweetener supply is shown in Figure 2. Based on the results of the proposed model and estimates of the effects of sugar, our prediction for the coming days is that the countries of the Americas, with more sugar product intake, will probably face more deaths. For further evaluation, we calculated correlations between categories (plus obesity rate) and the number of deaths (Figure 3 A and C) and the number of recoveries (Figure 3 B and D) by using the bivariate Pearson correlation. Results of the correlations showed that, in the protein data, countries that consumed more spices, tree nuts, cereals, aquatic products, stimulants, vegetable oils, oil crops, pulses, fruit (wine), and alcoholic beverages (in order) had fewer deaths from COVID-19, and conversely, countries that consumed more meat, vegetables, vegetal products, sugar and sweeteners, animal products, animal fats, sugar crops, milk, fish, offals, miscellaneous, eggs, and starchy roots (in order) had more deaths from COVID-19. In the fat quantity data, countries that consumed more sugar and sweeteners, miscellaneous, tree nuts, meat, animal products, animal fats, offals, and fish had more deaths from COVID-19. Finally, consistent with the mTP model results, obesity was associated with increased death rates and reduced recovery rates in all correlation analyses (Figures 3 and 4).
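Under a log link on the marginal mean, a coefficient translates into a proportional change of exp(coefficient) - 1 per one-unit increase in the covariate. The Python sketch below reproduces the sugar crops figure under that assumption; the helper name `relative_change` is hypothetical.

```python
import math

def relative_change(coef):
    """Proportional change in the marginal mean per one-unit covariate
    increase when log(mean) is linear in the covariate."""
    return math.exp(coef) - 1.0

# Sugar crops coefficient as quoted in the text (protein data).
change_sugar_crops = relative_change(-4.7273)  # about -0.9911, a 99.11% drop
```

A negative result is a reduction, so the value for sugar crops corresponds to the 99.11% decrease in recoveries reported in the text.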

Principal Results
In this study, we proposed an mTP regression model for clustered semicontinuous diet and nutrition data. This model allows investigators to obtain covariate effects on the marginal mean of the outcome (eg, deaths and recoveries). It also has an unconditional interpretation of the covariate effect on the marginal mean. Our proposed mTP model had satisfactory performance in the diet and nutrition data analysis.
Findings of this study show that the consumption of eggs, cereals excluding beer, spices, and stimulants at the population (country) level had the greatest impact on the recovery of patients with COVID-19. In addition, populations that consumed more meat, vegetal products, sugar and sweeteners, sugar crops, animal fats, and animal products were associated with more deaths and fewer recoveries in patients. The effect of consuming sugar products on mortality was considerable. In addition, obesity was associated with increased death rates and reduced recovery rates.

Comparison With Prior Work
Healthy diets and physical activity are key to good nutrition and necessary for a long and healthy life and prevention of chronic disease [33]. Eating nutrient-dense foods and balancing energy intake with necessary physical activity to maintain a healthy weight are essential at all stages of life. Unbalanced consumption of foods high in energy (sugar, starch, and fat) and low in essential nutrients contributes to energy excess, being overweight, and being obese. The amount of energy consumed in relation to physical activity and the quality of food are key determinants of nutrition-related chronic disease [11]. In a review study from January 2020, Zhang and Liu [13] reviewed the importance of some nutrition interventions (vitamins, minerals, immunoenhancers) in infectious and respiratory diseases. The authors suggested that the nutritional status of each patient who was infected should be evaluated before the administration of general treatments, and the current children's RNA-virus vaccines, including the influenza vaccine, should be used for people who are not infected and health care workers. Moreover, the results of their review showed that all the potential interventions (nutritional or immunoenhancers) should be implemented to control COVID-19 if the infection is uncontrollable [13]. Our results also confirm these associations by introducing influential diet categories, including sugar and sweeteners, animal products, animal fats, sugar crops, miscellaneous, and tree nuts as more important risk factors for death or slowing of recovery in patients with COVID-19.
Recent studies point to obesity as a critical risk factor for being hospitalized or dying from COVID-19 [34-36]. Indeed, a high prevalence of obesity has been observed in patients with COVID-19 requiring invasive mechanical ventilation [37], a robust proxy of SARS-CoV-2 severity. In patients younger than 60 years, those with obesity were at almost double the risk of being admitted to critical care when compared with patients of a normal weight [38]. Results from this study confirm previous findings on the risk of obesity and add that obesity slows down patients' recovery and treatment.
People need to eat fewer prepared foods and more complex plant-based foods [11]. Although there are differences in dietary patterns, overall, unbalanced diets are a health threat across the world and do not just affect death rates but also the quality of life. To achieve best results in preventing nutrition-related pandemic diseases, strategies and policies should fully recognize the essential role of both diet and obesity in determining good nutrition and optimal health. Policies and programs must address the need for change at the individual level as well as the modifications in society and the environment to make healthier choices accessible and preferable.

Study Limitations
This study has some limitations related to the nutrition data sets. The study is based on observational data, and inevitably with 188 countries included, there were variations in how the data were collected. This study included 23 dietary attributes; some that are of interest to health, such as saturated and monounsaturated fatty acids and free sugars across the diet (not just those in drinks), were not included in the analysis. The study also did not take into account lifestyle factors, such as smoking and physical activity, that can have a significant impact on the risk of the disease outcomes used in the study.
Finally, we remind all our readers to take care of themselves during this pandemic, follow the guidelines of the Centers for Disease Control and Prevention [39], and eat healthy foods with sufficient amounts of fruits and vegetables as previously discussed.

Conclusions
Good nutrition is important before, during, and after an infection. The findings of this study show that the consumption of eggs, cereals excluding beer, spices, and stimulants at the population level had the greatest impact on the recovery of patients with COVID-19. In addition, populations that consumed more meat, vegetal products, sugar and sweeteners, sugar crops, animal fats, and animal products were associated with more deaths and fewer recoveries in patients. The effect of consuming sugar products on mortality was considerable. In addition, obesity was associated with increased death rates and reduced recovery rates.
Although there are differences in dietary patterns, overall, unbalanced diets are a health threat across the world and affect not only death rates but also the quality of life. To achieve best results in preventing nutrition-related pandemic diseases, strategies and policies should fully recognize the essential role of both diet and obesity in determining good nutrition and optimal health. Policies and programs must address the need for change at the individual level as well as the modifications in society and the environment to make healthier choices accessible and preferable.