In this article, we propose a novel hurdle negative binomial (HNB) regression combined with a distributed lag nonlinear model (DLNM) to model the impact of weather factors on heat-related illness (HRI) in Singapore. The Akaike information criterion (AIC) is used to select a suitable combination of weather variables and to assess their lagged and nonlinear effects. The process of model selection and validation is demonstrated. The predicted occurrence rate is close to the observed one. The proposed combined model can be used to predict HRI cases in order to mitigate HRI occurrences, and to provide inputs for public health policy that takes the impact of climate change into account.

Heat-related illness (HRI) is a well-known hazard for workers in high-temperature environments and for people exercising in extreme heat. Understanding the association between weather factors and HRI is important for developing preventive measures and related policies in workplaces and schools, and during sports activities.

Several modeling studies have investigated the association between weather conditions and HRI [

Previous studies on HRI in Singapore have been documented [

Factors that affect HRI may include weather, social activity, and individual health status. In our study, we focused on modeling the impact of weather on HRI. Other factors were accounted for through categorical variables such as holidays and days of the week.

To model the impact of weather on HRI in Singapore, we employed hurdle negative binomial (HNB) models together with distributed lag non-linear models (DLNMs). As many zero counts were found in our HRI data, a hurdle was assumed at the point where the count of HRI cases turns from zero to non-zero. In the HNB model, a censored Poisson model (right-censoring at 0) governed the binary outcome, i.e., the impact of weather on the occurrence or non-occurrence of HRI cases. If the count of HRI cases was not zero, the conditional distribution of the counts was governed by a zero-truncated negative binomial model [

DLNMs were developed, and have been used, to estimate simultaneously the nonlinear and delayed effects of temperature (or air pollution) on mortality [

Singapore is a city-state in Southeast Asia, located at 01˚22'N, 103˚48'E. Because of its geographical location and maritime exposure, the climate of Singapore is tropical, with daily ambient temperature ranging from 25˚C to 33˚C and daily humidity between 60% and 95%. There are two main monsoon seasons, the northeast monsoon (December-early March) and the southwest monsoon (June-September), separated by two relatively short inter-monsoon periods (late March-May and October-November).

The HRI data, provided by the Singapore Ministry of Health, recorded the daily incidence of hospital admissions for HRI (categorized as 992.0, 992.3 and 992.5 under ICD-9-CM codes) from 1 January 1991 to 31 December 2010 in Singapore. The summary of the data is listed in

Counts of days corresponding to the number of illness cases.

| Number of daily HRI cases | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 12 | 15 | Proportion of days with HRI (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Overall | 5626 | 1222 | 289 | 88 | 40 | 23 | 5 | 5 | 2 | 3 | 1 | 1 | 22.98 |
| Public holiday | 236 | 16 | 3 | − | − | − | − | − | − | − | − | − | 7.46 |
| Sunday | 874 | 121 | 21 | 8 | 6 | 5 | 2 | 1 | 1 | 3 | 1 | | 16.20 |
| Monday | 830 | 142 | 47 | 16 | 6 | 1 | 1 | − | − | − | − | − | 20.42 |
| Tuesday | 785 | 192 | 46 | 14 | 3 | 3 | − | 1 | − | − | − | − | 24.81 |
| Wednesday | 792 | 193 | 37 | 15 | 3 | 4 | − | − | − | − | − | − | 24.14 |
| Thursday | 769 | 194 | 52 | 12 | 8 | 5 | 2 | 2 | − | − | − | − | 26.34 |
| Friday | 767 | 202 | 46 | 15 | 8 | 5 | − | 1 | − | − | − | − | 26.53 |
| Saturday | 809 | 178 | 40 | 8 | 6 | − | − | 1 | − | − | − | 1 | 22.44 |
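As a quick arithmetic check on the table above, the following minimal sketch recomputes the total number of days, the total number of HRI cases, the proportion of days with at least one case, and the mean daily count from the "Overall" row. The function and variable names are illustrative and not part of the original study.

```python
# Day counts from the "Overall" row: number of days with k daily HRI cases.
days_by_count = {0: 5626, 1: 1222, 2: 289, 3: 88, 4: 40, 5: 23,
                 6: 5, 7: 5, 8: 2, 9: 3, 12: 1, 15: 1}


def summarize(days_by_count):
    """Return (total days, total cases, % of days with >= 1 case, mean cases/day)."""
    total_days = sum(days_by_count.values())
    total_cases = sum(k * n for k, n in days_by_count.items())
    days_with_hri = total_days - days_by_count.get(0, 0)
    pct_days = 100.0 * days_with_hri / total_days
    return total_days, total_cases, pct_days, total_cases / total_days


total_days, total_cases, pct_days, mean_daily = summarize(days_by_count)
print(total_days, total_cases, round(pct_days, 2), round(mean_daily, 2))
# prints: 7305 2474 22.98 0.34
```

These values agree with the 22.98% reported in the table and with the daily mean of 0.34 cases reported in the text.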

| | MinT (˚C) | MeanT (˚C) | MaxT (˚C) | RH (%) | Wdsp (m/sec) | WBGT_MeanT | HRI |
|---|---|---|---|---|---|---|---|
| Min. | 20.2 | 23.1 | 23.6 | 66.0 | 0.1 | 26.1 | 0 |
| 1st Qu. | 24.0 | 26.9 | 30.8 | 79.8 | 1.1 | 28.8 | 0 |
| Median | 24.8 | 27.7 | 31.8 | 83.1 | 1.7 | 29.5 | 0 |
| 3rd Qu. | 25.7 | 28.6 | 32.6 | 86.8 | 2.5 | 30.1 | 0 |
| Max. | 29.1 | 30.9 | 36.0 | 99.5 | 5.9 | 32.3 | 15 |
| Mean (sd) | 24.9 (1.2) | 27.7 (1.2) | 31.6 (1.6) | 83.4 (5.0) | 1.9 (1.0) | 29.4 (0.9) | 0.34 (0.80) |

1st Qu. and 3rd Qu. represent the 25th and 75th percentiles, respectively.

Wet-bulb globe temperature index (WBGT) is a common index for measuring environmental heat stress and is used widely as an assessment for the risks of HRI [

$$\text{WBGT} \approx 0.567 \times T_a + 0.393 \times e + 3.94 \quad (1)$$

where $e$ is the environmental water vapor pressure, calculated from temperature and relative humidity as $e = (RH/100) \times 6.105 \times \exp[17.27 \times T_a / (273.7 + T_a)]$, $RH$ is the relative humidity (%), and $T_a$ is the dry bulb temperature (˚C). In this study, $T_a$ was the daily mean temperature, and the corresponding approximated WBGT based on Formula (1) was denoted WBGT_MeanT. Formula (1) was first proposed by [
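Formula (1) can be evaluated directly. The sketch below is a minimal illustration that uses the constants exactly as printed above; the function names are our own and are not taken from the study.

```python
import math


def vapor_pressure(ta_c, rh_pct):
    """Water vapor pressure e from dry bulb temperature (deg C) and RH (%).

    Constants are copied as printed in the text, including 273.7.
    """
    return (rh_pct / 100.0) * 6.105 * math.exp(17.27 * ta_c / (273.7 + ta_c))


def wbgt_approx(ta_c, rh_pct):
    """Approximated WBGT following Formula (1)."""
    return 0.567 * ta_c + 0.393 * vapor_pressure(ta_c, rh_pct) + 3.94


# Evaluate at the study-period means of MeanT (27.7 deg C) and RH (83.4%).
print(round(wbgt_approx(27.7, 83.4), 1))  # prints: 29.4
```

Evaluating the formula at the mean of MeanT and the mean of RH is not the same as averaging the daily WBGT_MeanT values, but the result is close to the reported WBGT_MeanT mean of 29.4.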

A hurdle negative binomial (HNB) regression model combined with a distributed lag non-linear model (DLNM) was applied here to explore the association between daily HRI counts and weather conditions. The hurdle model was suggested by [

In practice, the occurrence of HRI reflects the combined impact of weather factors and other factors, such as physical activity, lack of environmental acclimatization, poor physical fitness, and illness [

On the other hand, the positive counts of HRI cases could arise from different sources. Hence, a zero-truncated negative binomial (ZTNB) distribution was applied to describe the counts, as an NB distribution is a mixture of Poisson distributions [

A DLNM was used to check whether weather conditions had a delayed adverse effect on the occurrence of HRI. The DLNM was also used to investigate whether these effects were linear or nonlinear.

For the zero hurdle part, a censored Poisson model (right-censoring at zero) was employed to model the impact of weather on the occurrence and non-occurrence of HRI cases. The Poisson mean on day $t$ was denoted $\mu_{0t}$. The positive counts part was modeled by a ZTNB model with negative binomial mean $\mu_{1t}$ on day $t$ and shape parameter $\theta$. The two parts could be modeled separately according to the work by [

$$\log(\mu_{kt}) = \alpha_k + \sum_j s_k(\text{Weather}_{tj}) + \delta_k \cdot \text{Dow}_t + \gamma_k \cdot \text{Holiday}_t + \log(\text{Pop}_t) + \text{trend}_{kt} \quad (2)$$

where $\alpha_k$ was the intercept and $s_k(\cdot)$ was a smoothing function, with $k = 0$ for the zero hurdle part and $k = 1$ for the positive counts part. The default knots were placed at equally spaced percentiles. The terms $s_k(\text{Weather}_{tj})$ were obtained by applying a DLNM to $\text{Weather}_{tj}$, the $j$-th weather predictor on day $t$ (i.e., MinT, MeanT, MaxT, RH, Wdsp and WBGT_MeanT). $\text{Dow}_t$ was the day of the week of day $t$, and $\delta_k$ was the corresponding vector of coefficients. $\text{Holiday}_t$ was a binary variable equal to 1 if day $t$ was a holiday, and $\gamma_k$ was its coefficient. $\text{Dow}_t$ and $\text{Holiday}_t$ were used to reflect the impact of social activities related to heat illness. For the offset term, $\text{Pop}_t$ was the total population of Singapore on day $t$. As only the mid-year population was available (from the Department of Statistics, Singapore), the daily population was approximated by linear interpolation. $\text{trend}_{kt}$ was determined through $n$-phase piecewise linear functions, i.e., piecewise linear functions with $n$ phases.

| Date | 12/8/2002 | 12/7/2003 | 12/5/2004 | 12/4/2005 | 12/3/2006 | 12/2/2007 | 12/7/2008 | 12/6/2009 | 12/5/2010 |
|---|---|---|---|---|---|---|---|---|---|
| HRI | 4 | 9 | 12 | 5 | 5 | 6 | 5 | 4 | 9 |
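The two-part structure described in this section can be illustrated with a minimal sketch of the probability mass function of a hurdle model. It assumes the standard formulation in which the probability of a zero count is the Poisson zero probability with mean mu0, and positive counts follow a negative binomial distribution with mean mu1 and shape theta truncated at zero. The function names and the parameter values in the usage lines are hypothetical and are not estimates from this study.

```python
import math


def nb_logpmf(y, mu, theta):
    """Log pmf of a negative binomial with mean mu and shape theta."""
    return (math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
            + theta * math.log(theta / (theta + mu))
            + y * math.log(mu / (theta + mu)))


def hurdle_pmf(y, mu0, mu1, theta):
    """P(Y = y) under a Poisson hurdle with a zero-truncated NB count part."""
    p_zero = math.exp(-mu0)  # Poisson probability of a zero count
    if y == 0:
        return p_zero
    nb_zero = math.exp(nb_logpmf(0, mu1, theta))
    # Cross the hurdle with probability 1 - p_zero, then renormalize the
    # NB pmf over the positive counts only.
    return (1.0 - p_zero) * math.exp(nb_logpmf(y, mu1, theta)) / (1.0 - nb_zero)


def hurdle_mean(mu0, mu1, theta):
    """Expected count E[Y] implied by the two parts."""
    nb_zero = math.exp(nb_logpmf(0, mu1, theta))
    return (1.0 - math.exp(-mu0)) * mu1 / (1.0 - nb_zero)


# Hypothetical parameter values, for illustration only.
probs = [hurdle_pmf(y, mu0=0.3, mu1=0.5, theta=2.0) for y in range(400)]
print(round(sum(probs), 6))
```

Because the zero probability depends only on mu0 and the shape of the positive counts depends only on mu1 and theta, the likelihood factorizes, which is why the two parts can be fitted separately.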

R software (version 3.1.1; R Development Core Team, 2014) was used to fit all models. The “dlnm” package was used to create the DLNM [

The modeling process was as follows. The first step was to determine the trend of HRI cases through n-phase piecewise linear functions, in which the intercept and slope might differ between phases. Candidate knots were chosen from the sequence of last days of every second month: “02/28/1991”, “04/30/1991”, …, “10/31/2010”, “12/31/2010”. The Akaike information criterion (AIC) was used to determine the best trend. For the zero hurdle part, a three-phase piecewise linear function with knots at “2/28/1994” and “6/30/2002” was selected. For the positive counts part, a two-phase piecewise linear function with a knot at “10/31/1998” was selected, i.e., a change point was detected between 1998 and 1999 (see

Before this point, the monthly HRI counts showed an apparent decreasing trend, with a higher number of cases and higher variance. After this point, the counts showed a relatively stable trend, with a lower number of cases and lower variance.
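To make the idea of an n-phase piecewise linear trend concrete, the sketch below builds one design-matrix row per day, with a separate intercept and slope for each phase. It is a generic illustration under our own assumptions: the helper names are hypothetical, days are indexed from 1 on 1 January 1991, and a day equal to a knot is assigned to the earlier phase. It is not the exact parameterization used in the fitted models.

```python
from datetime import date

START = date(1991, 1, 1)  # day index t = 1


def day_index(d):
    """Number of days counted from 1 January 1991 (t = 1 on that date)."""
    return (d - START).days + 1


def phase_index(t, knots):
    """Index of the phase containing day t; phase i ends at knots[i] (inclusive)."""
    for i, k in enumerate(knots):
        if t <= k:
            return i
    return len(knots)


def piecewise_design_row(t, knots):
    """Row [I(phase 0), t*I(phase 0), I(phase 1), t*I(phase 1), ...]."""
    n_phases = len(knots) + 1
    row = [0.0] * (2 * n_phases)
    p = phase_index(t, knots)
    row[2 * p] = 1.0           # phase-specific intercept
    row[2 * p + 1] = float(t)  # phase-specific slope term
    return row


knot = day_index(date(1998, 10, 31))
print(knot)                                # prints: 2861
print(piecewise_design_row(100, [knot]))   # prints: [1.0, 100.0, 0.0, 0.0]
print(piecewise_design_row(3000, [knot]))  # prints: [0.0, 0.0, 1.0, 3000.0]
```

Candidate knot sets could then be compared by fitting one model per set and keeping the set with the smallest AIC.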

The second step was to identify the weather predictors that best predicted the occurrence of HRI after adjusting for trend, holidays and day of the week. A DLNM was used to find the best weather predictors. A natural spline (ns) function with 3 degrees of freedom (df) and knots placed at equally spaced quantiles was applied to capture the nonlinear effect of each weather predictor.

In the censored Poisson model, both single lag effects and distributed lag effects were investigated, with a maximum lag of 14 days. For the single lag models, AIC was used to determine the best lag. In our study, the best lag for every weather predictor was 0. For the distributed lag model (DLNM), a separate ns function with 2 df was used to describe the effect across lags. However, a constrained DLNM does not exist when the maximum lag is less than 2, so an unconstrained DLNM was used for maximum lags of 0 and 1. In this process, we again found that the best maximum lag for every weather predictor was 0.
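A distributed lag model starts from a matrix of lagged exposures, in which each row holds the current value of a predictor and its values on the preceding days. The sketch below builds that matrix only. In the “dlnm” package, basis functions are then applied along both the predictor and the lag dimensions, which is not reproduced here. The function name and the toy temperature values are hypothetical.

```python
def lag_matrix(series, max_lag):
    """Return rows [x_t, x_{t-1}, ..., x_{t-max_lag}]; None where unavailable."""
    rows = []
    for t in range(len(series)):
        row = [series[t - lag] if t - lag >= 0 else None
               for lag in range(max_lag + 1)]
        rows.append(row)
    return rows


# Toy daily mean temperatures, for illustration only.
toy_temps = [27.1, 28.0, 29.3]
for row in lag_matrix(toy_temps, max_lag=2):
    print(row)
# prints:
# [27.1, None, None]
# [28.0, 27.1, None]
# [29.3, 28.0, 27.1]
```

With a maximum lag of 0, the matrix reduces to a single column holding the current-day values, which corresponds to the case selected for the censored Poisson model.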

In the ZTNB model, the same model selection procedure was conducted as described above for the zero hurdle (censored Poisson) model. For the single lag models, the current-day effect (lag 0) of every weather predictor considered had a smaller AIC value than any lagged effect. For the distributed lag models, under AIC, the best maximum lag was 0 for all weather predictors except approximated WBGT, for which a maximum lag of 3 gave the smallest AIC value. We then applied the Vuong test (in the “pscl” package, R software, version 3.1.1), which is designed for model selection between two non-nested models [
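The idea behind the Vuong test can be sketched as follows: the per-observation log-likelihood contributions of two non-nested models are differenced, and the scaled mean of the differences is compared with a standard normal distribution. This is a simplified illustration under our own assumptions. It uses the sample standard deviation and omits any correction for the number of parameters, so it may differ in detail from the implementation in the “pscl” package. The input values are toy numbers.

```python
import math
import statistics


def vuong_statistic(loglik_a, loglik_b):
    """Vuong z statistic from per-observation log-likelihoods of two models."""
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    n = len(diffs)
    return math.sqrt(n) * statistics.mean(diffs) / statistics.stdev(diffs)


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Toy per-observation log-likelihoods, for illustration only.
z = vuong_statistic([-1.0, -2.0, -3.0], [-2.0, -4.0, -6.0])
print(round(z, 4), round(two_sided_p(z), 4))
```

A large positive statistic favors the first model, a large negative one favors the second, and a value near zero indicates that neither model is significantly better.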

The third step was to explore whether the effect of each weather predictor on HRI was linear or nonlinear. The ns functions with df from 1 to 5 were compared. When df = 1, the ns function reduces to a linear function. AIC was again used to determine the best model.

The fourth step was to examine the best combination of weather predictors. The first weather predictor selected was the one associated with the smallest AIC value among all candidate weather predictors. After the first weather predictor was included, a second predictor was included if it reduced the AIC value of the new model. The process continued until no new entry could reduce the AIC value or all weather predictors were included in the model. During the selection procedure, the best ns function was selected and used for each new entry.

During the process, if a more complex model was first selected according to AIC, the significance of its improvement over the corresponding simpler model was further tested. If the improvement was not significant, the simpler model was chosen.
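The forward selection in the fourth step can be sketched generically. In the minimal illustration below, `aic_of` is a hypothetical callback that returns the AIC of a model fitted with a given tuple of predictors. Here it is replaced by a made-up toy scoring function so that the example runs on its own. The predictor names x1 to x4 and all numbers are placeholders, not results from this study. The sketch also omits the choice of the best ns function for each entry and the follow-up significance test.

```python
def forward_select(candidates, aic_of):
    """Greedy forward selection: add the predictor that lowers AIC the most."""
    selected = []
    best_aic = aic_of(tuple(selected))
    remaining = list(candidates)
    while remaining:
        scored = [(aic_of(tuple(selected + [c])), c) for c in remaining]
        new_aic, best_c = min(scored)
        if new_aic >= best_aic:
            break  # no remaining candidate reduces the AIC
        selected.append(best_c)
        remaining.remove(best_c)
        best_aic = new_aic
    return selected, best_aic


# Made-up toy scoring: each predictor lowers the deviance by a fixed gain,
# and each included predictor costs 2 AIC units.
TOY_GAINS = {"x1": 30.0, "x2": 5.0, "x3": 0.5, "x4": 4.0}


def toy_aic(predictors):
    return 100.0 - sum(TOY_GAINS[p] for p in predictors) + 2.0 * len(predictors)


print(forward_select(["x1", "x2", "x3", "x4"], toy_aic))
# prints: (['x1', 'x2', 'x4'], 67.0)
```

In the toy example, x3 is left out because its gain of 0.5 does not offset the penalty of 2 for an extra parameter.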

There were 2474 HRI cases admitted to hospital from 1991 to 2010 (7305 days), with a daily mean of 0.34 cases (see

In the study period, the average daily maximum temperature (MaxT) was 31.6˚C, mean temperature (MeanT) was 27.7˚C, minimum temperature (MinT) was 24.9˚C, mean relative humidity was 83.4%, and mean wind speed was 1.9 m/sec (see

Mean temperature and approximated WBGT were associated with lower AIC values (see

| | MeanT | MaxT | RH | Wdsp | WBGT_MeanT |
|---|---|---|---|---|---|
| MinT | 0.80 | 0.53 | −0.53 | 0.24 | 0.77 |
| MeanT | | 0.81 | −0.76 | 0.27 | 0.91 |
| MaxT | | | −0.66 | 0.17 | 0.71 |
| RH | | | | −0.50 | −0.41 |
| Wdsp | | | | | 0.05 |

All the correlation coefficients are statistically significant with p < 0.001.

| Weather predictor | MinT | MeanT | MaxT | RH | Wdsp | WBGT_MeanT |
|---|---|---|---|---|---|---|
| Zero hurdle part (censored Poisson model) | | | | | | |
| AIC^{a} | 7155.9 | 7041.6 | 7090.4 | 7205.1 | 7319.0 | 7047.5 |
| Best df^{b} | 3 | 1 | 3 | 2 | 3 | 1 |
| Non-zero counts part (ZTNB model) | | | | | | |
| AIC^{a} | 2958.3 | 2948.8 | 2952.2 | 2967.1 | 2974.8 | 2947.7 |
| Best df^{b} | 1 | 1 | 1 | 2 | 2 | 1 |

^{a}The AIC values were obtained for an ns function with the best df (df from 1 to 5), with only the current-day effect included. ^{b}The best df was selected from 1 to 5.

the significance of the difference. It showed that the model with approximated WBGT was not significantly better than the model associated with MeanT (p-value = 0.4). Hence, MeanT was first included in the model.

In the censored Poisson model for the occurrence or non-occurrence of HRI, mean temperature was chosen first, after which relative humidity and MaxT were further selected. For all the selected weather predictors, the linear model (ns with df = 1) had the lowest AIC value among all candidate ns functions (df from 1 to 5). The estimated coefficients and results were presented in

In the ZTNB model for positive counts of HRI cases, MeanT with a linear effect (ns with df = 1) was selected, as it was associated with the smallest AIC value. After that, wind speed with a linear effect was selected. However, the effect of wind speed was not significant (p-value = 0.11). Hence, only MeanT was included in the ZTNB model.

For the zero hurdle part of HRI (

High temperature is the key factor leading to the occurrence of HRI. Given the concern about global climate change, we may foresee an increase in temperature in both tropical and temperate regions. In IPCC Fourth Assessment Report [

Zero hurdle model coefficients (censored Poisson with log link, AIC = 7022.5)

| | Estimate | Std. Error | z value | Pr(>\|z\|) | RR^{a} (95% CI^{b}) |
|---|---|---|---|---|---|
| (Intercept) | −26.6893 | 1.6537 | −16.1388 | < 0.001 | − |
| MeanT | 0.3599 | 0.0459 | 7.8486 | < 0.001 | 1.43 (1.31, 1.57) |
| RH | 0.0286 | 0.0083 | 3.4361 | < 0.001 | 1.03 (1.01, 1.05) |
| MaxT | 0.1046 | 0.03 | 3.4892 | < 0.001 | 1.11 (1.05, 1.18) |
| Holi | −1.2769 | 0.2316 | −5.5131 | < 0.001 | 0.28 (0.18, 0.44) |
| Monday | 0.3625 | 0.1041 | 3.484 | < 0.001 | 1.44 (1.17, 1.76) |
| Tuesday | 0.5271 | 0.0997 | 5.2849 | < 0.001 | 1.69 (1.39, 2.06) |
| Wednesday | 0.5246 | 0.1008 | 5.2021 | < 0.001 | 1.69 (1.39, 2.06) |
| Thursday | 0.605 | 0.0988 | 6.122 | < 0.001 | 1.83 (1.51, 2.22) |
| Friday | 0.6205 | 0.0989 | 6.2769 | < 0.001 | 1.86 (1.53, 2.26) |
| Saturday | 0.3838 | 0.1018 | 3.7714 | < 0.001 | 1.47 (1.20, 1.79) |
| v1^{c} | 1.3e−04 | 2.17e−5 | 6.0064 | < 0.001 | − |
| v3^{d} | −7.8e−04 | 4.04e−5 | −17.7376 | < 0.001 | − |

Count model coefficients (ZTNB with log link, AIC = 2948.8)

| | Estimate | Std. Error | z value | Pr(>\|z\|) | RR^{a} (95% CI^{b}) |
|---|---|---|---|---|---|
| (Intercept) | −30.4068 | 510.4238 | −0.0596 | 0.9525 | − |
| MeanT | 0.2728 | 0.0544 | 5.013 | < 0.001 | 1.31 (1.18, 1.46) |
| Holi | −0.9665 | 0.7309 | −1.3224 | 0.186 | 0.38 (0.09, 1.59) |
| Monday | −1.1671 | 0.2459 | −4.7469 | < 0.001 | 0.31 (0.19, 0.50) |
| Tuesday | −1.3346 | 0.2379 | −5.6093 | < 0.001 | 0.26 (0.17, 0.42) |
| Wednesday | −1.5246 | 0.2455 | −6.211 | < 0.001 | 0.22 (0.13, 0.35) |
| Thursday | −1.1226 | 0.2335 | −4.8083 | < 0.001 | 0.33 (0.21, 0.51) |
| Friday | −1.2291 | 0.235 | −5.2291 | < 0.001 | 0.29 (0.18, 0.46) |
| Saturday | −1.366 | 0.2438 | −5.6021 | < 0.001 | 0.26 (0.16, 0.41) |
| a1^{e} | 1.8163 | 0.1978 | 9.1823 | < 0.001 | − |
| a3^{f} | −3.97e−04 | 1.20e−04 | −3.3139 | < 0.001 | − |
| Log (theta) | −14.8195 | 510.4217 | −0.029 | 0.9768 | − |

^{a}Relative risk. ^{b}Confidence interval. ^{c}v1 = t − 4199. ^{d}v3 = (t − 1155)*I (1155 < t < 4199) − 3044*I (t < 4199). ^{e}a1 = I (t < 2861). ^{f}a3 = t*I (t < 2861). Here t is the number of days starting from 1 Jan. 1991, in which t = 1, 1155, 2861 and 4199 correspond respectively to 1 Jan. 1991, 28 Feb. 1994, 31 Oct. 1998 and 30 Jun. 2002. I (x) is an indicator function equal to 1 when x is true and 0 otherwise.
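With a log link, a relative risk is the exponential of the corresponding coefficient. The sketch below reproduces the RR column for MeanT in the zero hurdle part, under the assumption that the intervals are 95% Wald intervals (estimate ± 1.96 standard errors on the log scale). That assumption is ours, although it does reproduce the tabulated values. The function name is illustrative.

```python
import math


def relative_risk(estimate, std_error, z=1.96):
    """Return (RR, lower, upper) from a log-link coefficient and its SE."""
    return (math.exp(estimate),
            math.exp(estimate - z * std_error),
            math.exp(estimate + z * std_error))


# MeanT in the zero hurdle part: estimate 0.3599, standard error 0.0459.
rr, lower, upper = relative_risk(0.3599, 0.0459)
print(f"{rr:.2f} ({lower:.2f}, {upper:.2f})")  # prints: 1.43 (1.31, 1.57)
```

Read this way, each 1˚C increase in daily mean temperature multiplies the Poisson mean of the zero hurdle part by about 1.43, with the other terms held fixed.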

assumption that the current demographic, social and environmental settings remain similar.

Sensitivity analysis. To evaluate whether the modeled impact of weather predictors was stable, we examined the estimated coefficients of the weather predictors when the model adopted different trends. For example, we exchanged the trend functions used in the zero hurdle (censored Poisson) model and in the ZTNB model. The model AIC values increased slightly, but the results for the impact of weather predictors remained similar. In addition, when we included the information of HRI of the Standard Chartered Marathon Singapore from 2002-2010 (see

Model validation. To validate the model, we used data from 1991 to 2008 as training data to estimate the model parameters, and then applied the trained model to predict HRI cases in 2009 and 2010 given the weather variables in those two years. The results were shown in

To the best of our knowledge, this study is the first mathematical modeling study to demonstrate the combined effect of weather conditions on the incidence of hospital admissions for HRI over 20 years in Singapore. It provides an extensive investigation of the association between weather conditions and the occurrence of HRI. The model developed can be used to predict HRI occurrences, supporting the mitigation of HRI and the preparation of onsite medical care. Further, based on the model constructed, we also estimated future HRI occurrences in Singapore under the impact of climate change, to provide a reference for future policy making.

In our study, a hurdle model showed that the impacts of weather factors on the occurrence or non-occurrence of HRI cases and on the positive counts of HRI cases were different, and that extra zeros existed. The extra zeros might arise because some people were vulnerable to HRI while others were not. In [

Besides weather factors, the impact of day-of-the-week (

The positive counts of HRI were described by a zero-truncated negative binomial (ZTNB) model, as NB distribution was a mixture of several Poisson distributions [

The tropical climate in Singapore will continue to pose a health risk of HRI, especially among those who undertake strenuous outdoor physical activity. With global climate change, temperature is on an increasing trend, which will further raise the health risk of HRI in tropical countries such as Singapore. This should alert public health policy makers to consider policies that mitigate the increase in HRI cases, and to use HRI prediction models to forecast HRI occurrences in order to prevent HRI cases or prepare onsite medical care.

Heat-related illness has been an event with low occurrence rates in Singapore since the late 1990s. To reveal the impact of weather factors on HRI and the future public health risk of HRI, we proposed a hurdle negative binomial regression combined with a distributed lag nonlinear model, since hurdle models can address simultaneously the probability that HRI occurs and the count when it occurs. The model constructed can be used to predict the occurrence of HRI cases and to evaluate the future public health burden of HRI under climate change, which can provide useful inputs for policy makers. The proposed modeling approach can also be applied to model and evaluate the lagged and nonlinear effects of environmental factors on other public health concerns based on count data with excess zeros.

Xu, H.-Y., Fu, X.J., Lim, C.L., Ma, S., Lim, T.K., Tambyah, P.A., Habibullah, M.S., Lee, G.K.K., Ng, L.C., Goh, K.T., Goh, R.S.M. and Lee, L.K.H. (2018) Weather Impact on Heat-Related Illness in a Tropical City State, Singapore. Atmospheric and Climate Sciences, 8, 97-110. https://doi.org/10.4236/acs.2018.81007