This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on https://publichealth.jmir.org, as well as this copyright and license information must be included.

Understanding the patterns of disease importation through international travel is paramount for effective public health interventions and global disease surveillance. While global airline network data have been used to assist in outbreak prevention and effective preparedness, accurately estimating how these imported cases disseminate locally in receiving countries remains a challenge.

This study aimed to describe and understand the regional distribution of imported cases of dengue and malaria upon arrival in Spain via air travel.

We have proposed a method to describe the regional distribution of imported cases of dengue and malaria based on the computation of the “travelers’ index” from readily available socioeconomic data. We combined indicators representing the main drivers for international travel, including tourism, economy, and visits to friends and relatives, to measure the relative appeal of each region in the importing country for travelers. We validated the resulting estimates by comparing them with the reported cases of malaria and dengue in Spain from 2015 to 2019. We also assessed which motivation provided more accurate estimates for imported cases of both diseases.

The estimates provided by the best fitted model showed high correlation with notified cases of malaria (0.94) and dengue (0.87), with economic motivation being the most relevant for imported cases of malaria and visits to friends and relatives being the most relevant for imported cases of dengue.

Factual descriptions of the local movement of international travelers may substantially enhance the design of cost-effective prevention policies and control strategies, and essentially contribute to decision-support systems. Our approach contributes in this direction by providing a reliable estimate of the number of imported cases of nonendemic diseases, which could be generalized to other applications. Realistic risk assessments will be obtained by combining this regional predictor with the observed local distribution of vectors.

Throughout history, human mobility has been a key determinant for the spread of infectious diseases. From the 14th century bubonic plague pandemic to the 1918 Spanish flu as well as the more recent Ebola epidemic and COVID-19 pandemic, the way individuals travel across the globe has shaped the evolution and geographical dynamics of infectious diseases [

International mobility flows are especially relevant for the spread of vector-borne diseases (VBDs), which often receive less attention in routine epidemiological surveillance plans in countries where they are not endemic. In recent years, global warming and intensified urbanization processes have favored the establishment of previously foreign species around the globe, such as

A wide range of approaches have been used to model risks of imported cases of dengue [

We aimed to provide accurate descriptions of how infected travelers may distribute in a territory, which could be valuable input for local authorities in the design of cost-effective VBD prevention and control strategies. For this, we approximated the local distribution of travelers arriving at a specific country (or any other territory) in terms of readily available indicators, rather than considering travel information that is usually not quantified at a local scale. These statistics gauge the appeal of each region to foreign travelers, quantifying the number of imported cases each region may receive. We calibrated our model with the number of imported cases of dengue and malaria at each province in Spain from 2015 to 2018 and then performed validation by comparing our model’s estimates for the number of imported cases in 2019 with official data.

We first developed a theoretical framework to estimate how travelers distribute throughout the territory after their arrival in the country. See

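Summary of the rationale behind our approach. Infected travelers arrive at the importing country from each exporting country $k$ and distribute over its regions $i$. With $p_k$ the prevalence of the disease in country $k$, $a_k$ the number of travelers arriving from it, and $I_{ik}$ the travelers’ index, region $i$ is expected to receive $p_k a_k I_{ik}$ cases from country $k$ and $\sum_k p_k a_k I_{ik}$ cases in total.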

We used the following yearly statistics from 2015 to 2019 to compute the relative appeal of each province in Spain to international travelers, which are publicly available and curated by the Spanish National Statistics Institute [

Tourist indicators: For both hotels and tourist apartments, we used the variables total capacity, number of national travelers, number of foreign travelers, number of overnight stays by national travelers, and number of overnight stays by foreign travelers.

Economic indicators: We considered each province’s population, gross domestic product (GDP), GDP per capita, number of private limited companies (Sociedades Limitadas), and number of public limited companies (Sociedades Anónimas).

Indicators for visits to friends and relatives: For each country and province, we used the number of foreign residents by nationality, number of foreign residents by birthplace, and number of national residents by birthplace (other than Spain).

The following 3 additional data inputs were used in our approach:

Arrival data: We computed the yearly number of travelers arriving in Spain from 2015 to 2019 from each of the 100 origin airports with the largest flows of incoming travelers by aggregating publicly available monthly data provided by the public entity in charge of Spanish airports and air navigation (Aeropuertos Españoles y Navegación Aérea, AENA) [

Disease data: We used yearly prevalence estimates for malaria and dengue from 2015 to 2019 provided by Global Health Data Exchange (GHDx) [

Cases in Spain: We used the number of imported cases of malaria and dengue (including any type of infection by the dengue virus) reported by each province to the Spanish National Surveillance System (Red Nacional de Vigilancia Epidemiológica, RENAVE).

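We describe here the rationale used to combine the above statistics into an informed indicator that estimates the propensity of international travelers to move to a specific region. We considered a country that receives travelers from other countries of the world (the exporting countries, indexed by $k$) and that is divided into regions (indexed by $i$).

Motivations for international travel are usually classified into 3 major categories: tourism, business, and visits to friends and relatives. For each region $i$, we therefore considered a tourist indicator $T_i$, an economic indicator $E_i$, and an indicator for visits to friends and relatives $V_{ik}$.

The last indicator depends on the exporting country $k$. Each indicator measures the relative importance of region $i$ within the importing country, so that

$$\sum_i T_i = \sum_i E_i = \sum_i V_{ik} = 1 \quad (1)$$

We then computed the travelers’ index for each region $i$ and exporting country $k$ as the average of the 3 indicators:

$$I_{ik} = \frac{T_i + E_i + V_{ik}}{3} \quad (2)$$

It follows from equations 1 and 2 that for each exporting country $k$, $\sum_i I_{ik} = 1$.

Therefore, given an exporting country $k$, $I_{ik}$ can be read as the proportion of the $a_k$ travelers arriving from $k$ who move to region $i$, and the expected number of travelers reaching region $i$ is

$$\sum_k a_k I_{ik} \quad (3)$$

where the sum runs over all exporting countries $k$. The expected number of imported cases in region $i$ is then

$$C_i = \sum_k p_k \, a_k \, I_{ik} \quad (4)$$

where $p_k$ is the prevalence of the disease in exporting country $k$.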
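The computation described above can be illustrated with a minimal Python sketch. It assumes that each indicator is a regional statistic normalized to sum to 1, that the index is the simple average of the 3 indicators, and that prevalence is expressed per 100,000 population. The function names and the toy figures are hypothetical and are not taken from the study data.

```python
def normalize(values):
    """Scale raw regional statistics so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def travelers_index(tourism, economy, visits):
    """Simple average of the 3 normalized indicators for one exporting country."""
    t, e, v = normalize(tourism), normalize(economy), normalize(visits)
    return [(ti + ei + vi) / 3 for ti, ei, vi in zip(t, e, v)]


def expected_cases(tourism, economy, visits_by_country, arrivals, prevalence):
    """Expected imported cases per region, summed over exporting countries.

    visits_by_country[k] is the raw visits-to-friends-and-relatives statistic
    per region for exporting country k; prevalence[k] is cases per 100,000.
    """
    cases = [0.0] * len(tourism)
    for k, visits in visits_by_country.items():
        index = travelers_index(tourism, economy, visits)
        infected = arrivals[k] * prevalence[k] / 100_000
        for i, share in enumerate(index):
            cases[i] += infected * share
    return cases


# Toy example (hypothetical numbers): 3 regions and 2 exporting countries.
tourism = [50, 30, 20]
economy = [60, 30, 10]
visits = {"A": [10, 80, 10], "B": [70, 20, 10]}
arrivals = {"A": 200_000, "B": 50_000}
prevalence = {"A": 100.0, "B": 1000.0}
cases = expected_cases(tourism, economy, visits, arrivals, prevalence)
```

Because the index sums to 1 over regions for every exporting country, the regional estimates always add up to the gross number of infected arrivals; the index only decides how that total is shared among regions.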

In order to test the validity of our approach, we followed the pipeline depicted in

Summary of the model building process. Key steps include the fitting, selection, validation, and assessment of the model.

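We considered data from Spain (importing country) and its 52 provinces (regions over which the imported cases disseminate). We computed the relative importance of each province in terms of the tourist, economic, and visits to friends and relatives indicators ($T_i$, $E_i$, and $V_{ik}$).

We used equation 2 to construct the travelers’ index (1 for each possible combination of the indicators). We combined these with arrivals and prevalence data to obtain estimates for the expected number of cases at each province for 2015-2018 (equation 4), resulting from simple averages of the indicators $T_i$, $E_i$, and $V_{ik}$. We also considered weighted averages of the indicators:

$$I_{ik} = w_1 T_i + w_2 E_i + w_3 V_{ik}$$

where $0 \le w_j \le 1$ and $w_1 + w_2 + w_3 = 1$, so that each weight $w_j$ sets the relative contribution of the corresponding indicator $T_i$, $E_i$, or $V_{ik}$.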
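As a rough illustration of the weighted variant, the sketch below enumerates candidate weight triples and builds a weighted index. The grid step of 0.1 is an assumption (the reported weights are multiples of 0.1, but the search grid is not stated here), and the function names are hypothetical.

```python
def weight_grid(step=0.1):
    """All (w1, w2, w3) on a regular grid with 0 <= w_j <= 1 and w1 + w2 + w3 = 1."""
    n = round(1 / step)
    return [
        (a / n, b / n, (n - a - b) / n)
        for a in range(n + 1)
        for b in range(n + 1 - a)
    ]


def weighted_index(t, e, v, weights):
    """Weighted average of 3 already-normalized indicators for one exporting country."""
    w1, w2, w3 = weights
    return [w1 * ti + w2 * ei + w3 * vi for ti, ei, vi in zip(t, e, v)]


grid = weight_grid()
# Hypothetical normalized indicators for 3 regions.
index = weighted_index([0.5, 0.3, 0.2], [0.6, 0.3, 0.1], [0.1, 0.8, 0.1], (0.1, 0.7, 0.2))
```

Since the weights sum to 1 and each indicator sums to 1 over regions, the weighted index keeps the same interpretation as the simple average: a share of travelers per region.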

To identify which travelers’ index best approximates the number of imported cases per province, we computed the correlation between the estimates provided by each of our models (1 for each combination of indicators) and the actually reported cases of malaria and dengue at each province for the years 2015 to 2018. The model reporting the highest value for this correlation was selected as the best model. We followed the same procedure for the case of the weighted averages, and a different estimate was obtained for each choice of the indicators and of the weights $w_j$.

As a final test for accuracy, we computed the correlation between the best model’s estimates for 2019 (data not used during the fitting and selection process) and the officially reported cases for this year, both for the simple and weighted averages. If this correlation was high, we considered the model validated and proceeded to the next step.
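The selection and validation steps can be sketched as follows. The sketch assumes that every statistic listed in the data section is a candidate for its indicator, which gives 10 × 5 × 3 combinations; the helper names and the toy estimates are hypothetical.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


TOURIST = [
    f"{stat} ({place})"
    for place in ("hotels", "tourist apartments")
    for stat in (
        "total capacity",
        "national travelers",
        "foreign travelers",
        "overnight stays by national travelers",
        "overnight stays by foreign travelers",
    )
]
ECONOMIC = ["population", "GDP", "GDP per capita",
            "private limited companies", "public limited companies"]
VISITS = ["foreign residents by nationality", "foreign residents by birthplace",
          "national residents by birthplace"]

# One simple-average model per choice of (tourist, economic, visits) statistic.
combinations = list(product(TOURIST, ECONOMIC, VISITS))


def select_and_validate(fit_estimates, fit_reported, test_estimates, test_reported):
    """Pick the model with the highest fitting correlation, then score it on held-out data."""
    best = max(fit_estimates, key=lambda m: pearson(fit_estimates[m], fit_reported))
    return best, pearson(test_estimates[best], test_reported)


# Toy example (hypothetical numbers) with 2 candidate models and 4 provinces.
fit_reported = [1, 2, 3, 4]
fit_estimates = {"A": [2, 4, 6, 8.5], "B": [4, 1, 3, 2]}
test_reported = [2, 3, 5, 9]
test_estimates = {"A": [4, 6, 10, 18], "B": [9, 1, 4, 2]}
best, validation_r = select_and_validate(
    fit_estimates, fit_reported, test_estimates, test_reported
)
```

Keeping the 2019 data out of the selection step is what makes the final correlation an out-of-sample check rather than a measure of fit.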

We assessed 3 features of the resulting model. First, we fit a linear model explaining the estimated number of imported cases at each province in 2019 in terms of the officially reported cases. The coefficient of the linear model may be understood as the number of cases predicted by the model per officially reported case, and thus indicates how much the model overestimates or underestimates the reported cases. This refers only to the raw number of cases, as the accuracy of the distribution is captured by the validated correlation. Second, we ranked the contribution of each of the statistics considered in the model by computing the average correlation with 2019 official data of the estimates obtained from models including each particular variable. We also computed the average loss of accuracy associated with each variable as the difference in the average correlation between models including and not including each statistic. This allowed us to identify which statistics, among the choices made for each indicator (tourist $T_i$, economic $E_i$, and visits to friends and relatives $V_{ik}$) and each weight $w_j$, were most relevant for the accuracy of the estimates.
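The two assessment quantities described above can be sketched in a few lines of Python. The sketch assumes an ordinary least-squares fit with an intercept (the exact specification of the linear model is not given here), and the function names and toy values are hypothetical.

```python
def ols_slope(x, y):
    """Slope of the least-squares line y = a + b * x (intercept included)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def contribution(scores, variable):
    """Average correlation of the models using `variable`, and the loss without it.

    scores maps a tuple of variable names (one model) to that model's
    correlation with the reported data.
    """
    used = [r for combo, r in scores.items() if variable in combo]
    unused = [r for combo, r in scores.items() if variable not in combo]
    avg_used = sum(used) / len(used)
    return avg_used, avg_used - sum(unused) / len(unused)


# Toy example (hypothetical numbers): estimated cases per reported case.
reported = [2, 5, 10, 40]
estimated = [10 + 80 * r for r in reported]
slope = ols_slope(reported, estimated)

# Toy example (hypothetical numbers): 4 models built from 2 x 2 variables.
scores = {
    ("hotels", "gdp"): 0.90,
    ("hotels", "population"): 0.80,
    ("apartments", "gdp"): 0.70,
    ("apartments", "population"): 0.60,
}
avg_hotels, loss_hotels = contribution(scores, "hotels")
```

A slope well above 1 means that the model predicts many more cases than are reported, while a positive loss means that models including the variable correlate better, on average, than models that leave it out.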

Finally, we tested the validity of our results against a well-established model for human mobility, in which the probability of a trip decays with the distance $d$ traveled as a power law, $d^{-\gamma}$.

We then grouped the total number of expected cases at each province as the sum of those arriving at the province according to their final flight destination and those arriving from any other province by means of other transport modes reflected in a geographically bounded power law distribution. This model was then fitted to the data on the officially reported cases of dengue and malaria from 2015 to 2018 for values of

We followed an analogous philosophy for the model building and assessment process as in the travelers’ index model. The parameters leading to the highest correlation with the reported cases for the period of 2015 to 2018 were used to compute an estimate for 2019, and the correlation between this estimate and the 2019 official record was then computed to allow for comparison with the travelers’ index model. A linear model between the human mobility model’s estimate and the official 2019 record was also fitted to assess the underestimation or overestimation of the model.
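A minimal sketch of this kind of comparison model is given below. The functional form is an assumption: a proportion `stay` of the cases remains in the province of arrival, and the rest is spread over the other provinces with weights proportional to distance raised to the power −`gamma`. The names `redistribute`, `stay`, and `gamma`, and the toy distances, are hypothetical.

```python
def redistribute(arrivals, distance, stay, gamma):
    """Spread cases arriving at each province over the rest of the country.

    arrivals[j]: cases landing in province j (final flight destination).
    distance[j][i]: distance between provinces j and i (any positive unit).
    stay: proportion of cases that do not move after arrival.
    gamma: exponent of the power law; a move j -> i has weight d ** (-gamma).
    """
    n = len(arrivals)
    cases = [stay * a for a in arrivals]
    for j in range(n):
        weights = [distance[j][i] ** (-gamma) if i != j else 0.0 for i in range(n)]
        total = sum(weights)
        for i in range(n):
            if i != j:
                cases[i] += (1 - stay) * arrivals[j] * weights[i] / total
    return cases


# Toy example (hypothetical numbers): 3 provinces, distances in km.
arrivals = [100, 0, 20]
distance = [
    [0, 100, 500],
    [100, 0, 300],
    [500, 300, 0],
]
moved = redistribute(arrivals, distance, stay=0.6, gamma=2)
static = redistribute(arrivals, distance, stay=1.0, gamma=2)
```

Under this form, setting `stay` to 1 leaves every case at its arrival province, so the exponent no longer affects the estimate.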

Our study used publicly available aggregated secondary data with no characteristics that allowed for individual identification. There are no relevant data protection and privacy issues to report.

A preliminary analysis showed that among all statistics used, those concerning the same drivers were usually highly co-dependent, with some exceptions (eg, GDP per capita; see Figures S1 and S2 in

The 100 airports with the highest number of incoming travelers were located in 49 countries and accounted for 99.75% of the total incoming travelers to Spain from 2015 to 2019. Out of these countries, 10 were removed from our study as neither malaria nor dengue was present during the time span under study (according to prevalence data from GHDx), resulting in 39 exporting countries.

Exporting countries.

Country^{a} | Incoming travelers^{b} | Malaria prevalence (/100,000 population)^{b} | Dengue prevalence (/100,000 population)^{b} |

United States | 6,845,337 | 0 | 0.54 |

Brazil | 3,472,397 | 78.96 | 65.40 |

Colombia | 3,053,635 | 197.47 | 54.97 |

Argentina | 2,972,990 | 0 | 15.47 |

Peru | 2,101,286 | 323.49 | 41.37 |

Mexico | 2,063,577 | 7.26 | 37.29 |

Dominican Republic | 1,611,336 | 1.50 | 49.11 |

Algeria | 1,562,772 | 4.92 | 0 |

Venezuela | 1,241,154 | 1065.28 | 48.17 |

Cuba | 1,062,531 | 0 | 37.93 |

Cape Verde | 1,054,106 | 135.17 | 36.41 |

Ecuador | 1,027,244 | 86.97 | 37.84 |

Costa Rica | 810,214 | 0 | 67.74 |

Senegal | 630,192 | 2464.71 | 32.43 |

Panama | 566,765 | 115.70 | 56.47 |

Bolivia | 556,092 | 181.43 | 60.84 |

Gambia | 522,809 | 4237.62 | 33.78 |

Egypt | 466,705 | 0 | 10.88 |

Thailand | 365,318 | 68.10 | 58.10 |

Singapore | 360,619 | 0 | 68.44 |

Equatorial Guinea | 360,393 | 32981.86 | 39.18 |

China | 359,372 | 0.14 | 24.63 |

Pakistan | 306,321 | 536.27 | 41.90 |

Mauritania | 300,521 | 4095.70 | 24.57 |

El Salvador | 293,976 | 8.95 | 130.85 |

Republic of Korea | 254,457 | 15.80 | 0 |

Nigeria | 200,813 | 18792.47 | 38.73 |

Jordan | 182,431 | 0 | 12.45 |

Angola | 176,690 | 11182.79 | 27.97 |

Guatemala | 174,030 | 169.12 | 45.25 |

Saudi Arabia | 166,730 | 3.98 | 15.52 |

Ghana | 149,374 | 18512.96 | 40.65 |

Guinea | 78,545 | 30131.12 | 38.35 |

The Bahamas | 49,010 | 0 | 39.30 |

Gabon | 38,016 | 15756.90 | 44.25 |

Jamaica | 14,802 | 0 | 45.43 |

South Africa | 12,394 | 36.43 | 0 |

Cameroon | 8994 | 19904.66 | 34.96 |

Mali | 1165 | 16024.21 | 31.19 |

^{a}Countries with no malaria or dengue prevalence have been removed from the list.

^{b}Total incoming travelers and average malaria and dengue prevalences (total cases per 100,000 inhabitants) from 2015 to 2019 for the 39 exporting countries considered in our study, ranked by the number of travelers.
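To give a sense of scale for the gross estimate (product of prevalence and number of travelers), the sketch below applies it to the Brazil row of the table. This is illustrative arithmetic only: the table reports 2015-2019 totals and average prevalences, whereas the study worked with yearly figures, and the function name is hypothetical.

```python
def gross_imported_cases(travelers, prevalence_per_100k):
    """Gross estimate of infected arrivals: travelers times prevalence per 100,000."""
    return travelers * prevalence_per_100k / 100_000


# Brazil row of the table: travelers, malaria prevalence, dengue prevalence.
brazil_malaria = gross_imported_cases(3_472_397, 78.96)
brazil_dengue = gross_imported_cases(3_472_397, 65.40)
```

Treating travelers as a random sample of the population of the exporting country in this way is the simplification that drives the overestimation of the raw number of cases.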

High correlation values were found for both malaria (0.94) and dengue (0.86) between the best model’s estimates for 2019 and the notified cases. The models that provided the most accurate estimates included public limited companies, foreign travelers at hotels, and foreign residents by birthplace in the computation of the travelers’ index. The same variables led to the best estimates for both malaria and dengue. While considering weighted averages in the construction of the travelers’ index did not improve the accuracy of the models, different motivations were obtained for travelers carrying each of the diseases: economy seemed to best capture the appeal of each region for imported cases of malaria (relative weight of 0.7, with GDP being the most accurate indicator) and visits to friends and relatives seemed to be the main motivation for travelers with dengue (relative weight of 0.9, assigned to the number of foreign residents in the province by birthplace). Different proportions of overestimation were found for each disease (99% for malaria and 86.5% for dengue). A summary of the relevant features of the models provided by the fitting and selection process is presented in

Summary of the models that most accurately approximated the reported cases in 2015-2018.

Disease (model) | Economic indicator (weight^{a}) | Tourist indicator (weight^{a}) | Visits to friends and relatives indicator (weight^{a}) | Pearson correlation of model’s estimate with 2019 data | Overestimation |

Malaria^{b} (simple) | Public limited companies | Foreign travelers at hotels | Foreign residents by birthplace | 0.94 | 98.9% |

Malaria^{b} (weighted) | GDP^{c} (0.7) | Foreign travelers at hotels (0.1) | Foreign residents by birthplace (0.2) | 0.94 | 99.0% |

Dengue^{b} (simple) | Public limited companies | Foreign travelers at hotels | Foreign residents by birthplace | 0.86 | 86.5% |

Dengue^{b} (weighted) | No contribution (0) | Foreign travelers at hotels (0.1) | Foreign residents by birthplace (0.9) | 0.87 | 86.7% |

^{a}For the models including weighted averages, the weight $w_j$ assigned to each indicator is shown in parentheses.

^{b}Each row shows the statistics that provide the best estimate of imported cases of each disease, the correlation with the actually reported data in 2019, and the approximation for the proportion of overestimation as obtained from the linear models.

^{c}GDP: gross domestic product.

Summary of the best linear models for 2019 imported cases of malaria (top row) and dengue (bottom row). The left column shows the predictions of the models (in red), together with the number of reported cases (in blue) for 2019 at each province in Spain. The right column shows the fit between the estimates of the models and the official records (inset figures correspond to the fit after removing Madrid and Barcelona from the data set).

We performed a residual analysis to check for normality and autocorrelation of the residuals of the models. The malaria model showed close-to-normal residuals with no autocorrelation (statistically significant W=0.67 and DW=2.03 in the Shapiro-Wilk and Durbin-Watson tests, respectively). For the dengue model, a relevant deviation was caused by the estimate for Barcelona (

The models constructed using simple averages provided a unanimous choice of indicators associated with tourism, economy, and visits to friends and relatives. On the contrary, the best weighted models included different economic indicators. GDP provided the best estimate for imported malaria cases, while no influence of the economic indicator was considered in the best dengue model. In addition, different drivers were the most important ones for each disease, as shown by the much higher relative weight for economic motivations in malaria cases and for visits to friends and relatives in dengue cases (

When ranking the contribution of each of the variables to the accuracy of the models, similar results were found for both diseases, with some minor variations across variables (

Contribution of each variable to model accuracy.

Variable | Malaria^{a} | Malaria (loss)^{b} | Dengue^{a} | Dengue (loss)^{b} |

National travelers in hotels | 0.83 | 0.08 | 0.74 | 0.05 |

Overnight stays by national travelers in hotels | 0.83 | 0.07 | 0.73 | 0.05 |

Foreign travelers in hotels | 0.82 | 0.07 | 0.76 | 0.08 |

Total hotel capacity | 0.81 | 0.05 | 0.74 | 0.05 |

Public limited companies (Sociedades Anónimas) | 0.81 | 0.05 | 0.72 | 0.04 |

National travelers in tourist apartments | 0.79 | 0.03 | 0.70 | 0.01 |

Overnight stays by national travelers in tourist apartments | 0.79 | 0.03 | 0.70 | 0.02 |

GDP^{c} | 0.78 | 0.02 | 0.71 | 0.02 |

Private limited companies (Sociedades Limitadas) | 0.78 | 0.02 | 0.70 | 0.01 |

Foreign residents by country of nationality | 0.77 | 0.01 | 0.70 | 0.01 |

Foreign residents by country of birth | 0.77 | 0.01 | 0.70 | 0.01 |

Population | 0.76 | 0.00 | 0.69 | 0.00 |

Overnight stays by foreign travelers in hotels | 0.75 | −0.01 | 0.69 | 0.00 |

National residents by country of birth | 0.75 | −0.02 | 0.68 | −0.02 |

Total tourist apartment capacity | 0.71 | −0.06 | 0.64 | −0.06 |

GDP per capita | 0.69 | −0.09 | 0.63 | −0.07 |

Foreign travelers in tourist apartments | 0.67 | −0.10 | 0.62 | −0.08 |

Overnight stays by foreign travelers in tourist apartments | 0.63 | −0.15 | 0.58 | −0.13 |

^{a}The average correlation of the estimates of the models including each variable in their fit with the officially reported 2019 data.

^{b}The average difference in correlation between models including each variable in their fit and models not including each of the variables (variables ranked by the average correlation for predictions).

^{c}GDP: gross domestic product.

A similar procedure was followed for the weighted models. The average correlation between the models including each variable and the 2019 official data was computed in this case with stratification by the weight assigned to the variable (

Summary of each input variable’s performance on the estimates for malaria (A) and dengue (B). Each square in the figure is colored according to the average correlation between the official 2019 reports and the estimates provided by the weighted models including each of the variables, with the associated weight ranging from 0 (no contribution from the variable is assumed in the model) to 1 (the model only includes that variable). The variables are ranked from top to bottom according to the overall average correlation with 2019 data of the estimates of the models including each variable. GDP: gross domestic product; SA: Sociedades Anónimas; SL: Sociedades Limitadas.

For both dengue and malaria, the human mobility models ranked higher in terms of correlation with 2019 data for higher values of the assumed proportion of travelers who do not move from their destination province upon arrival (

In general, the estimates of the generic mobility model for the distribution of imported cases were less accurate when compared to actually reported cases than those resulting from the travelers’ index models. This was the case for both malaria and dengue (0.59 and 0.66 correlation with 2019 data, respectively;

Summary of the human mobility model that most accurately approximated the reported cases from 2015 to 2018 (including all provinces).

Model | Proportion of cases that do not move after arrival | Exponent of the power law distribution (γ) | Correlation with 2019 data | Overestimation |

Malaria^{a} | 1 | Any | 0.59 | 99.5% |

Dengue^{a} | 1 | Any | 0.66 | 95.2% |

^{a}Each row shows the parameters of the model that provide the best estimate of imported cases of each disease, the correlation with the actually reported data in 2019, and the overestimation of the models as obtained from the linear fit with official records.

We computed estimates for the number of imported cases of malaria and dengue at each province in Spain based on simple methodological assumptions. Our approach makes use of readily available data and provides approximations of the actually declared number of cases of the disease. This advance may contribute to the adequate modeling and monitoring of VBDs, which might be relevant for effective outbreak prevention strategies. More efficient resource allocation strategies for both vector control and disease prevention can be designed if reliable predictions of the geographical locations of imported cases are available. By circumventing the need for detailed large-scale data on human mobility or traveler behavior, this methodology is accessible and suitable to be used in countries lacking more exhaustive data infrastructure, for instance [

The high correlation found between our estimates and real data supports the validity of our approach, which is based on an a priori theoretical conceptualization. This agreement in trend suggests that our estimates are reliable enough to build, for instance, scale-less risk indicators. On the other hand, our estimates of the raw number of imported cases were simplistic (product of yearly prevalence and total number of travelers), which resulted in substantial overestimation of the number of imported cases. For the case of malaria, this is coherent with the epidemiology of the disease, being more severe unless treatment is available and having a higher incidence in economically deprived populations [

The proposed computation of the key indicators involved in our model (the travelers’ index _{ik}

A key finding in this direction is that while the impact of each particular indicator in the quality of the estimate was similar for both diseases, the relevant drivers for case importation were different (economic motivations for malaria cases and visits to friends and relatives for dengue cases). This may be due to the different nature of the motivation for international travel across countries in the world. Most malaria cases were imported from African countries, while travelers carrying dengue usually arrived from America or Asia (see Table S1 in

Further evidence of the appropriateness of our approach was provided by a comparison with the human mobility model. While the validity of this model has been established in many contexts and is widely acknowledged [

Future developments of our approach should cover the following improvements:

Coupling with postimportation dynamics: Our framework could be integrated into more complicated models incorporating transmission dynamics that involve the life cycle of the disease within the vector and the host [

Refining the gross estimate of imported cases: As mentioned above, we computed simple estimates for the total number of cases arriving to the importing country (product of yearly prevalence and total number of travelers). We focused on how these cases distribute over the regions of the importing country. Consideration of more elaborate estimates of these quantities or the local distribution of the disease in the exporting countries would probably yield more precise final estimates.

Extending the scope of the model to other diseases: A particular feature of the treated examples is that virtually all incoming streams of travelers into Spain from regions where malaria and dengue are endemic, which may result in transmission, may be assumed to be associated with air travel. This may not be the case for other diseases and countries, for which detailed data on the total traveler flow or further development of the proposed methodology could be necessary. Similarly, other importation phenomena that may depend on human behavior or allocation of resources could be analyzed under our assumptions, such as passive mobility of vectors by human means of transportation [

It should be noted that our model focused on countries with high dengue and malaria prevalences, which were therefore likely to export these diseases to Spain. However, this concept could be generalized to other types of risk-related importation scenarios, such as the transport of new vectors or exotic species (invasion biology), which is another crucial process in the spread of VBDs.

Several factors may be limiting the extent of our results. First, both malaria and dengue are diseases known to be subject to high underreporting [

We have shown the validity of the travelers’ index as a method to estimate the distribution of imported cases of malaria and dengue from endemic regions. This is an appropriate way to improve disease risk prediction on the basis of human mobility patterns. Our methodology adds value to available socioeconomic information relevant to public health. Nonetheless, human mobility is just 1 component of VBD risk models. The other key components that need to be added are vector (mosquito) distribution and suitability. Our work will be combined with multi-sourced presence/absence and suitability vector data in Spain, including both authoritative and citizen science data collections [

Supporting data and detailed results for the statistical analysis.

GDP: gross domestic product

GHDx: Global Health Data Exchange

VBD: vector-borne disease

The project leading to the present results has received funding from “la Caixa” Foundation (ID 100010434) under agreement HR-18-0036.

Indicators for tourism, economy, and visits to friends and relatives are available from the Spanish National Institute for Statistics [

None declared.