The link between low temperature and the incidence of disease has been studied by many researchers. What remains unclear is the exact nature of this relation, especially the mechanism by which a change of weather affects the onset of disease. The lag between exposure to temperature and its effect on mortality may reflect the nature of the onset of disease, so assessing lagged effects is potentially important. Most studies on lags have used lag-distributed Poisson regression and have treated extreme cases as random noise in order to obtain correlations. To assess the lagged effect, we propose a new approach that differs from the well-known regression models: a Hidden Markov Model by Self Organized Map (HMM by SOM). HMM by SOM is random by nature and encompasses the extreme cases that are neglected by auto-regression models. We used daily data on the number of patients transported by ambulance in Nagoya, Japan. SOM was used to classify the meteorological elements into six classes, and these classes served as the “states” of the HMM. The HMM describes a background process that might produce the time series of the incidence of disease; this background process is taken to be a random change of the weather states classified by SOM. We estimated the lagged effects of weather change on the onset of both cerebral infarction and ischemic heart disease. This is potentially important because, if one could trace a path in the chain of events leading from temperature change to death, one might be able to intervene and avert the fatal outcome.
The association between low temperature and morbidity of disease is well recognized (e.g., [
Many studies have sought to clarify the lagged effects, and most of them used a Poisson regression model with spline functions or some other smoothing technique. This approach led to all of the extreme cases being neglected in order to obtain a “statistically significant” correlation.
For example, in [
Thus, many studies used a Poisson regression model with smoothing functions to address the lagged effects. These methods inevitably treated extreme cases as exceptions or as random noise. However, such exceptional cases (e.g. a rise in temperature combined with an increase in risk) were recognized to be important and not negligible for cerebral infarction by [
This study used data from Nagoya city, Japan, which has more than 2,260,000 inhabitants. The city is situated in the middle of Japan, facing the Pacific Ocean. Its climate is a typical mild Japanese climate with four distinct seasons.
The daily number of patients was obtained from the Nagoya City Fire Department. The data contained the number of patients who were first transported by ambulance to a hospital and then diagnosed there with cerebral infarction, ischemic heart disease, myocardial infarction, angina pectoris and so on. The data covered all ages and were collected over two periods: from 2002 to 2005 and from 2009 to 2012.
For meteorological data, we used daily data supplied by the Japan Meteorological Agency. The data included temperature (mean, maximum and minimum), the hours of sunshine and so on.
Self-Organizing Map (SOM) is a kind of “cluster mapping”, and was first introduced by [
SOM uses an artificial neural network to find a continuous mapping from the input space (or layer) to a target layer, which is a lattice in two-dimensional space. The points of this lattice are regarded as “neurons” and are also called “units”. The map is constructed so that as much as possible of the original structure of the measurement vectors in n-dimensional space is preserved in the lattice structure in the plane. As a result, points that are “near” (or “distant”) in the original data are mapped to “near” (or “distant”) units in the plane. SOM thus visualizes the cluster tendency of the data.
A hidden Markov model is a tool for representing random changes of states over a time series of observations. The method has been applied broadly in many fields, for example, to DNA profiles ( [
The “states” in an HMM are regarded as a representation of a process in the “background”. One can suppose that there is a sort of “background” even for the incidence of diseases. Here we supposed that such background states were randomly changing weather states. In this article, the states were given by the classification obtained by applying a Self-Organized Map (SOM) to meteorological elements. This idea links the change of weather patterns to the change in the risk of cerebral infarction and ischemic heart disease. For basic elements of HMM, see [
In this article, SOM was applied to daily data on eight weather elements (such as maximum temperature, minimum temperature, precipitation, humidity, local pressure, wind velocity, the hours of daylight and solar radiation) in Nagoya city. The data were supplied by the Japan Meteorological Agency and were collected during two periods, i.e., from 2002 to 2005 and from 2009 to 2012. We used the so-called “standard” SOM, based on unsupervised neural learning algorithms. The resulting units, or classes, were used as the states of the Hidden Markov Models (described later).
For the target layer, a lattice of 3 × 2 units (6 units in total) was selected. We thus obtained a classification of the meteorological data into exactly six classes of “weather states”. See
These six classes were named, according to the character of each class, “(a) high pressure (warm), (b) high pressure (cold), (c) low pressure (cold, windy), (d) low pressure (rainy), (e) low pressure (warm), and (f) low pressure (humid)”.
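The classification step can be illustrated with a minimal sketch of a 3 × 2 SOM written with the Python standard library only. It is not the implementation used in this study: the function names `train_som` and `best_matching_unit`, the learning-rate and neighbourhood schedules, and the synthetic eight-dimensional input (standing in for standardized weather elements) are all assumptions made for illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean distance)."""
    dists = [sum((w - xi) ** 2 for w, xi in zip(unit, x)) for unit in weights]
    return dists.index(min(dists))

def train_som(data, rows=3, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM and return one weight vector per unit."""
    rng = random.Random(seed)
    # Lattice coordinates of the units: (0, 0), (0, 1), ..., (2, 1) for a 3 x 2 map.
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise every unit with a randomly chosen data vector.
    weights = [list(rng.choice(data)) for _ in coords]
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):        # one shuffled pass over the data
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                   # decaying learning rate (assumed schedule)
            sigma = sigma0 * (1.0 - frac) + 0.01      # shrinking neighbourhood radius (assumed)
            br, bc = coords[best_matching_unit(weights, x)]
            for k, (r, c) in enumerate(coords):
                d2 = (r - br) ** 2 + (c - bc) ** 2    # squared lattice distance to the winner
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                weights[k] = [w + lr * h * (xi - w) for w, xi in zip(weights[k], x)]
            step += 1
    return weights

# Synthetic stand-in for standardized daily weather vectors (8 elements per day).
rng = random.Random(1)
days = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
som = train_som(days)
states = [best_matching_unit(som, x) for x in days]   # one of six classes per day
```

Each day is assigned the index of its closest unit, so `states` plays the role of the sequence of weather states used later as HMM states.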
An HMM consists of two kinds of elements: a set of “states” and a series of “observations”. Both the “states” and the “observations” change randomly over time, and the “states” are supposed to generate the “observations” by some mechanism.
To understand the links between the incidence of diseases and the weather, the variability of weather can be thought of as a “background” that brings about the incidence of diseases such as stroke and ischemic heart disease. Here we supposed that such background weather states changed randomly and formed the set of states of an HMM. The “states” were the classes obtained by the above SOM, which express six weather patterns: “high pressure (warm), high pressure (cold), low pressure (cold, windy), low pressure (rainy), low pressure (warm), and low pressure (humid)”.
The “observation” was the daily number of patients who were transported by ambulance in Nagoya city and later diagnosed with cerebral infarction or ischemic heart disease.
The observation at time t (day) is represented by the variable R(t) (the number of patients). The observation R(t) at time t is generated randomly by a process whose state is S(t) (one of the six weather states given by SOM). The HMM assumes that the state S(t) is determined randomly from the state S(t − 1) of the previous day. Both random processes are assumed to be Markov processes. See
Each state is supposed to change to another state with some probability. The collection of all of these probabilities forms a “Transition Matrix” P, where the j-th state changes to the i-th state with probability Pij. Each state (e.g., the j-th state) at time t generates an observed value according to some distribution Qj. The collection of these distributions forms a “distribution matrix” Q, whose j-th column is equal to Qj. As a consequence, we have a set {S, R, P, Q},
where S and R are the sets of states and observations. This set defines a Hidden Markov Model. We calculated the two matrices P and Q by analyzing the data from January 2002 to December 2004 and, separately, the data from 2009 to 2012. See for details [
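When the daily weather states are already known from the SOM classification, one simple way to obtain P and Q is by frequency counting. The sketch below assumes this counting approach, which the text does not confirm as the estimation procedure actually used; the function names and the toy sequences are hypothetical. It follows the convention above, where P[i][j] is the probability that the j-th state changes to the i-th state.

```python
from collections import Counter

def estimate_transition(states, n_states):
    """P[i][j] = estimated probability that state j is followed by state i."""
    counts = [[0] * n_states for _ in range(n_states)]
    for prev, nxt in zip(states[:-1], states[1:]):
        counts[nxt][prev] += 1
    P = [[0.0] * n_states for _ in range(n_states)]
    for j in range(n_states):
        col_total = sum(counts[i][j] for i in range(n_states))
        for i in range(n_states):
            # A state that never occurs as a predecessor gets a uniform column (assumption).
            P[i][j] = counts[i][j] / col_total if col_total else 1.0 / n_states
    return P

def estimate_emission(states, obs, n_states):
    """Q[j] = empirical distribution {patient count: probability} observed in state j."""
    per_state = [Counter() for _ in range(n_states)]
    for s, r in zip(states, obs):
        per_state[s][r] += 1
    Q = []
    for c in per_state:
        total = sum(c.values())
        Q.append({r: n / total for r, n in c.items()} if total else {})
    return Q

# Toy example with two states and a week of hypothetical patient counts.
states = [0, 0, 1, 0, 1, 1, 0]
obs = [2, 3, 5, 2, 4, 5, 3]
P = estimate_transition(states, 2)
Q = estimate_emission(states, obs, 2)
```

Each column of `P` sums to one, and `Q[j]` is the empirical counterpart of the j-th column of the distribution matrix.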
Cold exposure is not generally associated with an immediate increase in patients or deaths from cerebral infarction and ischemic heart disease. There appears to be some interval between a temperature change and the onset of these diseases. This interval is called the “lag” or “delay”.
Many studies used Poisson regression and distributed lag models. In this article, we applied our hidden Markov model to find the “lag” or “delay”. For this purpose, the hidden Markov model was shifted according to the amount of delay. This procedure is illustrated by the comparison of
To estimate the “lag”, the shifted hidden Markov model for a given delay was simulated repeatedly (500 or 1000 times). The original observed values were compared with these simulated sequences by calculating the root mean square error (RMSE) between the two sequences. We first fixed the delay d (days) and considered the shifted HMM of delay d. Then, starting from a certain day (e.g. 15 January 2005), we let the HMM generate a simulated sequence of the risk R(t) over T days:
Here, T was taken to be 3, 5 or 7 days.
We compared this simulated sequence with the original sequence of the number of patients:
and calculated the Root Mean Square Error of these two sequences:
We repeated this process 500 times to get the 500 sequences of
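The simulation and comparison procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the “shifted” model is taken to mean that R(t) is generated from the weather state d days earlier, S(t − d); the distributions are estimated by simple counting; and all function names and the synthetic data (six states, a built-in lag of 3 days) are hypothetical.

```python
import math
import random

def conditional_dist(pairs, n_states):
    """For each state j, the empirical distribution of the value paired with j."""
    counts = [dict() for _ in range(n_states)]
    for j, v in pairs:
        counts[j][v] = counts[j].get(v, 0) + 1
    return [{v: n / sum(c.values()) for v, n in c.items()} for c in counts]

def sample(dist, rng):
    """Draw one key from a {value: probability} dictionary."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values])[0]

def rmse(a, b):
    """Root mean square error between two sequences of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def mean_rmse_for_delay(states, obs, n_states, d, start, T, n_runs, rng):
    """Average RMSE between the observed counts and n_runs simulated sequences.

    Assumes start >= d and that every state occurs often enough in the data
    for its transition and emission distributions to be non-empty.
    """
    P = conditional_dist(zip(states[:-1], states[1:]), n_states)            # S(t-1) -> S(t)
    Q = conditional_dist(zip(states[:len(states) - d], obs[d:]), n_states)  # S(t-d) -> R(t)
    observed = obs[start:start + T]
    total = 0.0
    for _ in range(n_runs):
        s = states[start - d]             # weather state d days before the start day
        simulated = []
        for _ in range(T):
            simulated.append(sample(Q[s], rng))   # generate R from the shifted state
            s = sample(P[s], rng)                 # move to the next weather state
        total += rmse(simulated, observed)
    return total / n_runs

# Synthetic data: six weather states and counts driven by the state 3 days earlier.
rng = random.Random(0)
n_states, n_days, true_lag = 6, 400, 3
states = [rng.randrange(n_states) for _ in range(n_days)]
obs = []
for t in range(n_days):
    driver = states[t - true_lag] if t >= true_lag else states[t]
    obs.append(2 * driver + rng.randrange(3))     # hypothetical daily patient counts
curve = {d: mean_rmse_for_delay(states, obs, n_states, d, start=200, T=5, n_runs=500, rng=rng)
         for d in range(8)}
```

Plotting `curve` against d gives an RMSE-versus-delay graph of the kind discussed below; a delay at which the mean RMSE dips is a candidate lag.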
For example, in
The next example was described in
The calculations of RMSE versus delay were performed similarly for the data from 2009 to 2012 for the incidence of both cerebral infarction (CI) and ischemic heart disease (IHD). We fixed a month and calculated the transition matrix and distribution function using the data for that month in every year from 2009 to 2012. We then constructed the HMM from this transition matrix and distribution. By shifting this HMM according to each delay, we obtained the graphs of RMSE versus delay for each month. The results are illustrated in
Both graphs show the existence of lags. For CI, lags of 4 to 6 days were observed in January, February, March, April, August, October and December. For IHD, lags of 2 to 6 days were observed in January, February, April, June, July, August, September, November and December.
Delays of 4 to 6 days were observed for months 1, 2, 3, 4, 8, 10, 11 and 12. No delay was found for months 5, 6, 7 and 9.
Delays of 2 to 6 days were observed for months 1, 2, 4, 6, 7, 8, 9, 11 and 12. No delay was found for months 3, 5 and 10.
The t-test and the Wilcoxon test were used to test the statistical significance of the lagged effects. To illustrate this process, we selected, as an example, the case of cerebral infarction in August of 2009-2012 from
The lag of 4 days was observed in
The lagged effects of weather state on the onset of cerebral infarction (CI) and ischemic heart disease (IHD) were investigated using a shifted hidden Markov model with weather states given by self-organized maps. We found a delay of 2 to 6 days for both CI and IHD. The existence of the delay was examined using graphs of root mean squared error (RMSE) versus delay, and its statistical significance was confirmed by the t-test and the Wilcoxon test.
Statistic | t-test | Wilcoxon test |
---|---|---|
p-value | 0.005 | 0.008 |
The existence of a delay was already well known, but most researchers used regression models that excluded exceptional cases as noise and smoothed the data with spline functions. The present paper proposes the use of hidden Markov models with weather states as a new method in this direction.
To compare regression models with our HMM, we also calculated the lagged effects with the usual Poisson regression models. For this purpose, we used the “Residual standard error (RSE)”, which is the standard index of error for auto-regression models.
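The idea of scanning RSE over delays can be sketched in a few lines. Note the substitution: for brevity, an ordinary least-squares line stands in here for the Poisson regression used in the comparison, so this shows only how RSE is computed for a lagged predictor, not the model actually fitted. The function name `lagged_rse` and the toy series are hypothetical.

```python
import math

def lagged_rse(temps, counts, d):
    """Residual standard error of the least-squares line counts[t] ~ a + b * temps[t - d]."""
    x = temps[:len(temps) - d]       # temperature d days earlier
    y = counts[d:]                   # patient counts on the current day
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                    # slope
    a = my - b * mx                  # intercept
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(rss / (n - 2))  # n - 2 degrees of freedom (two fitted parameters)

# Toy series in which the counts depend exactly on the temperature two days earlier.
temps = [5, 7, 3, 8, 2, 9, 4, 6, 1, 10]
counts = [12, 12] + [20 - t for t in temps[:-2]]
rse_by_delay = {d: lagged_rse(temps, counts, d) for d in range(4)}
```

In this constructed example the RSE is essentially zero at a delay of 2 days and larger at the other delays, which is the kind of minimum one looks for in real data.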
The delay of 3 - 4 days was seen from
This result showed that the regression models were not statistically significant unless they used spline or smoothing functions, whereas our HMM, being random in nature, included the exceptional cases that had been excluded by the regression models.
In [
Delay (days) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|---|
Intercept | 0.01 | 0.02 | 0.004 | 0.004 | 0.26 | 0.27 | 0.03 | 0.019 |
Mean temperature | 0.23 | 0.31 | 0.11 | 0.13 | 0.9 | 0.88 | 0.4 | 0.29 |
In [
In [
All this evidence suggests that cold exposure does not trigger the onset of CI or IHD immediately, and that more time is needed to lead to thrombosis through a state of increased plasma and whole blood viscosity.
In conclusion, our hidden Markov model is a more natural way than regression models to assess the lagged effects of weather states on the incidence of cerebral infarction and ischemic heart disease. While regression models are not statistically significant without spline or smoothing functions, our hidden Markov model encompasses the exceptional cases (as random possibilities) that are normally excluded by regression models. Our HMM showed the existence of lags in the effect of weather changes on cerebral infarction and ischemic heart disease. This finding may make it possible to take precautionary measures against a fatal outcome after heat shock, including cold exposure.
Hiroshi Morimoto (2016) Hidden Markov Models to Estimate the Lagged Effects of Weather on Stroke and Ischemic Heart Disease. Applied Mathematics, 07, 1415-1425. doi: 10.4236/am.2016.713122