Vol.3, No.12, 712-731 (2011) doi:10.4236/health.2011.312120
Copyright © 2011 SciRes. Openly accessible at http://www.scirp.org/journal/HEALTH/
Health

Spatial autocorrelation analysis of 13 leading malignant neoplasms in Taiwan: a comparison between the 1995-1998 and 2005-2008 periods

Pui-Jen Tsai1*, Cheng-Hwang Perng2

1Center for General Education, Aletheia University, New Taipei, Taiwan; *Corresponding Author: puijentsai@gmail.com
2Department of Statistics and Actuarial Science, Aletheia University, New Taipei, Taiwan.

Received 23 September 2011; revised 10 November 2011; accepted 21 November 2011.

ABSTRACT

Spatial autocorrelation methodologies, including Global Moran’s I and Local Indicators of Spatial Association statistic (LISA), were used to describe and map spatial clusters of 13 leading malignant neoplasms in Taiwan. A logistic regression fit model was also used to identify similar characteristics over time. Two time periods (1995-1998 and 2005-2008) were compared in an attempt to formulate common spatio-temporal risks. Spatial cluster patterns were identified using local spatial autocorrelation analysis. We found a significant spatio-temporal variation between the leading malignant neoplasms and well-documented spatial risk factors. For instance, in Taiwan, cancer of the oral cavity in males was found to be clustered in locations in central Taiwan, with distinct differences between the two time periods. Stomach cancer morbidity clustered in aboriginal townships, where the prevalence of Helicobacter pylori is high and even quite marked differences between the two time periods were found. A method which combines LISA statistics and logistic regression is an effective tool for the detection of space-time patterns with discontinuous data.
Spatio-temporal mapping comparison helps to clarify issues such as the spatial aspects of both time periods for leading malignant neoplasms. This helps planners to assess spatio-temporal risk factors, and to ascertain what would be the most advantageous types of health care policies for the planning and implementation of health care services. These issues can greatly affect the performance and effectiveness of health care services and also provide a clear outline for helping us to better understand the results in depth.

Keywords: Spatial Autocorrelation Analysis; Global Moran’s I Statistic; Local Indicators of Spatial Association Statistic; Logistic Regression; Malignant Neoplasm; Taiwan

1. INTRODUCTION

Spatial analytical techniques and models can identify spatial anomalies in the epidemiology of diseases, identify “hot spots” and locate spatio-temporal patterns. Cluster mapping clarifies issues of internal and external correlations, while logistic regression is a useful approach for the differentiation of spatial distribution patterns over time. Common spatial techniques for health research include disease mapping, clustering techniques, diffusion studies, identification of risk factors through comparisons, and regression analyses [1]. All of these methods are useful when assessing risk factors. They also facilitate the planning of health care policies and support the implementation of effective health care services.

Cuzick and Edwards (1990) [2] proposed three general methodologies for the detection of clustering. Spatial autocorrelation statistics, such as Moran’s I [3-6] and Geary’s C [3-5], are global methods used to estimate the overall degree of spatial autocorrelation in a dataset. However, the possibility of spatial heterogeneity suggests that the estimated degree of autocorrelation may vary significantly.
Local spatial autocorrelation statistics provide estimates disaggregated to the unit level, allowing the assessment of dependency relationships in different areas. LISA detects local spatial autocorrelation in aggregated data by dividing Moran’s I statistic into contributions for each area within a study region. These indicators can detect clusters of similar or dissimilar disease frequency values around a given observation [7]. Unlike Moran’s I statistic, which measures the correlation between attribute values in adjacent areas, the Gi(d) local statistic is an indicator of local clustering that measures the “concentration” of a spatially distributed attribute variable [8,9].

The analysis of spatio-temporal change is a major concern in geographical research. Analytical approaches include the Knox test [10], Mantel’s Z statistic [11], the Jacquez k nearest neighbor test [12], Kulldorff’s spatial scan statistic [13-15] and the Bayesian spatial scan statistic [16]. Herein, we are primarily interested in detecting clusters that emerge over time, and our goal is to detect emerging clusters as early as possible. For example, in the public health domain, our goal is to detect emerging clusters of disease indicative of naturally occurring disease outbreaks (such as influenza), bioterrorist attacks (such as anthrax release), or environmental hazards (such as a radiation leak). Clearly, the early detection of such clusters would contribute to a more rapid response, leading to lives being saved.

Cancer is a chronic disease with a multi-stage progression. Many studies examine cancer incidence at different times, under different environmental exposures and in different ethnic groups.
Cancer incidence changes over time for people of different ages, which may be due to variations in lifestyle, changing environmental exposure, etc. Cancer incidence also varies in different geographic locations [17-20]. Again, this may have various explanations, with environmental impact being a strong possibility.

The detection of spatio-temporal clustering generally requires continuous data. Discontinuous data, with different durations of disease surveillance at the same location, present a challenge. This study focuses on the use of a set of discontinuous data to detect changes in spatio-temporal clustering. We propose herein a method for ascertaining spatial clustering associated with the 13 leading malignant neoplasms, based on medical-care data collected by the Taiwan National Health Insurance and Taiwan Cancer Registry agencies. To test this approach, we have compared local clusters between two periods (1995-1998 and 2005-2008), looking for similarities. We have also investigated potential spatial risks that could contribute to these health care events, redefining epidemiologic and spatially referenced data.

2. MATERIALS AND METHODS

2.1. Study Area

The study area included the main island of Taiwan (excluding all surrounding islets) which, in the year 2000, had more than 22 million inhabitants living in an area of 36,000 km². A total of 350 local administrative government areas, including five main urban areas, two secondary urban areas, 162 rural townships, and 54 aboriginal townships on the plain and in mountainous regions, were assessed (Figure 1). According to a 2002 Ministry of Interior report, urban areas are classified as regions having at least one metropolitan centre, and they can include neighboring cities and townships that share socio-economic activities.
Main urban areas are defined as those with a population larger than one million, specifically, Taipei-Keelung, Kaohsiung, Taichung-Changhua, Jhongli-Taoyuan and Tainan. Secondary urban areas are defined as those with a residential population ranging from 0.3 to 1 million (e.g. Hsinchu and Chiayi).

Figure 1. Map of urban areas and aboriginal townships in the study area. Map of the study area divided into 350 administrative districts including seven urban areas and an integrated area of 54 plains and mountain aboriginal townships.

2.2. Data Collection and Management

The Taiwan National Health Insurance (NHI) program was initiated in 1995. The coverage rate of the program increased from 92.4% in 1995 to more than 96.2% in 2000, and to 98% after the inclusion of those active in the military forces in 2001. Once the NHI medical care data were properly collected and analyzed, a complete picture of population behaviors according to disease could be used for reference in the calculation of prevalence and incidence of various diseases.

At the beginning of 2004, the available NHI data relating to medical care, such as the leading causes of death, were reclassified and reprocessed in relation to smaller units or areas (for example, precincts or townships rather than the country as a whole). In addition, regional data from the statistical analysis system (SAS) program are now announced publicly by the NHI in regular annual reports (for example, NHI, 2005-2008 [21-24]). These reports provide an accurate and reliable data source for the investigation of health care issues in Taiwan.

Data were collected from contractual medical-care institutions, where the NHI covers the costs of prescription medicines and treatment at outpatient clinics.
Such facilities accumulate detailed databases on medical costs for inpatient care. Outpatient cases were classified in relation to disease codes, as defined in the 1975 edition of “The International Classification of Diseases, 9th Revision, Clinical Modification” (ICD 9 CM). Patients suffering from diseases that were difficult to classify into a given code, or who had mismatched ID numbers, were not included in the final statistical data set. Disease codes were classified according to gender and age. Cases with the same ID number but different diseases were counted as different instances.

Medical care data obtained from the 2005-2008 NHI reports were examined, and the morbidity rates of the 13 leading causes of death were calculated. Disease classifications (according to the ICD 9 CM) included the following (codes indicated within parentheses): trachea, bronchus, and lung cancer (ICD 162); liver and intrahepatic bile ducts cancer (ICD 155); colon and rectum cancer (ICD 153, 154); stomach cancer (ICD 151); oral cavity cancer (ICD 140, 141, 143-146, 148, 149); oesophagus cancer (ICD 150); pancreas cancer (ICD 157); non-Hodgkin’s lymphoma (ICD 200, 202, 203); gallbladder and extrahepatic bile ducts cancer (ICD 156); leukaemia (ICD 204-208); female breast cancer (ICD 174); cervix uteri cancer (ICD 179, 180); and prostate cancer (ICD 185).

Demographic information was provided by the Ministry of Interior [25]. The smallest administrative units coded for examination of the various disease cases or health care events were precincts and townships. Age-adjusted standardized morbidity rates, using the 1976 Segi (“world”) population as the standard [26], were then calculated for the leading causes of death for males and females in each township.
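The age adjustment described above can be illustrated with the direct standardization method, in which age-specific rates are weighted by each age band's share of a standard population. The following is a minimal sketch only: the function name `age_standardized_rate`, the three age bands, the weights and the case counts are all hypothetical, and they are not the Segi weights or any data from this study.

```python
def age_standardized_rate(cases, population, std_weights, per=100_000):
    """Directly age-standardized rate per `per` persons.

    cases, population, std_weights: one entry per age band, in the same order.
    std_weights are standard-population counts (or shares) for each band.
    """
    if not (len(cases) == len(population) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age band")
    total_weight = sum(std_weights)
    rate = 0.0
    for c, p, w in zip(cases, population, std_weights):
        # age-specific rate times the band's share of the standard population
        rate += (c / p) * (w / total_weight)
    return rate * per


# Hypothetical township with three age bands (young, middle, old).
cases = [2, 10, 30]
population = [20_000, 10_000, 5_000]
std_weights = [50_000, 35_000, 15_000]  # made-up standard, NOT the Segi weights

asr = age_standardized_rate(cases, population, std_weights)
crude = sum(cases) / sum(population) * 100_000
print(f"crude rate: {crude:.1f}, age-standardized rate: {asr:.1f} per 100,000")
```

In this made-up example the standardized rate differs from the crude rate because the township's age structure differs from the standard population's, which is the reason for standardizing before comparing townships.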
For the period from 1995 to 1998, data on age-adjusted malignancies by precinct and township were obtained from the Atlas of Cancer Mortality and Incidence in Taiwan, officially published by the Bureau of Health Promotion, Department of Health [27].

2.3. Statistics

The global Moran’s I spatial autocorrelation was used to assess the correlation among neighbouring observations and to identify patterns and levels of spatial clustering in neighbouring districts [28]. The Moran’s I statistic, similar to the Pearson correlation coefficient [29], was calculated by the following formula:

$$I = \frac{N}{S_0} \cdot \frac{\sum_i \sum_j w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2} \qquad (1)$$

where $N$ is the number of districts, $w_{ij}$ is the element in the spatial weight matrix corresponding to the observation pair $i$, $j$, and $x_i$ and $x_j$ are the observations for areas $i$ and $j$ with the mean $\bar{x}$, and:

$$S_0 = \sum_i \sum_j w_{ij} \qquad (2)$$

The weights were row-standardized ($\sum_j w_{ij} = 1$). The first step in the spatial autocorrelation analysis was to construct a spatial weight matrix that contained information about the neighbourhood structure for each location. Adjacency was defined as immediately neighbouring administrative districts, including the district itself. Non-neighbouring administrative districts were assigned a weight of zero.

Spatial contiguity for polygons is defined as the property of sharing a common boundary or vertex. Contiguity analysis is an important method for assessing unusual features in connectivity distribution [4,30]. The queen’s measure of contiguity combines the rook (shared boundary) and bishop (shared vertex) relationships into a single measure [30]. The administrative districts considered in this study were highly irregular in both shape and size. Tsai et al.
(2009) demonstrated that the first order queen polygon contiguity method is the most appropriate method for quantifying the spatial weights matrix for the analysis of connectivity. Based on this approach, the spatial weight/connectivity matrices were determined and utilized in conjunction with the global Moran’s I statistic and the subsequent LISA calculations [6].

Moran’s I values may range from –1 (dispersed) to +1 (clustered). A Moran’s I value of 0 suggests complete spatial randomness. A random permutation procedure recalculates a statistic many times by reshuffling the data values among the map units to generate a reference distribution. The statistic calculated from the observed spatial pattern is then compared to this reference distribution and a pseudo significance level (pseudo p-value) is computed. To verify that the value of Moran’s I was significantly different from the expected value, we applied a Monte Carlo randomisation test with 999 permutations, which allows pseudo p-values as small as 0.001. Data values were reassigned among the N locations, providing a randomised distribution against which one may judge the observed value. If the observed value of I fell within the tails of this distribution (a pseudo p-value smaller than 0.05), there was significant spatial autocorrelation in the data, and the assumption of independence among the observations could be rejected [31].

The LISA statistic provides information related to the location of spatial clusters and outliers and the types of spatial correlation. Local statistics are important because the magnitude of spatial autocorrelation is not necessarily uniform over the study area [7,32].
LISA allowed us to divide the study area into small locations, thus enabling the assessment of significant local spatial clustering around an individual location. In addition to the degree of spatial clustering, detailed variations of clustering in the locally defined geo-space were identified, as well as the locations of the spatial clusters. The local version of Moran’s I at location i is given by:

$$I_i = \frac{(x_i - \bar{x})}{\frac{1}{n} \sum_i (x_i - \bar{x})^2} \sum_j w_{ij} (x_j - \bar{x}) \qquad (3)$$

where $n$ indicates the total number of locations (350 townships in 1995-1998 and 349 townships in 2005-2008); $x_i$ denotes the value of the variable of interest, X, at location $i$; $x_j$ denotes the observation at neighboring locations $j$; and $\bar{x}$ is the sample average of X. $w_{ij}$ is the spatial weight matrix, which defines spatial interaction across study regions. In general, $w_{ij} = 1$ if location $i$ and location $j$ are neighboring (share a common boundary); otherwise, $w_{ij} = 0$. In this study, spatial contiguity was assessed as the first order queen’s contiguity, which defines spatial neighbors as those areas with shared borders and vertexes.

Significance was tested by comparison to a reference distribution obtained by random permutations [7]. This analysis used 999 permutations to determine differences between spatial units. A positive value for the local Moran’s I index ($I_i$) indicates that a feature has neighboring features with similarly high or low attribute values and is therefore part of a cluster. A negative value for $I_i$ indicates that a feature has neighboring features with dissimilar values; this feature is an outlier. In either instance, the p-value for the feature must be small enough for the cluster or outlier to be considered statistically significant.
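The global statistic of Equation (1), its local counterpart of Equation (3) and the permutation-based pseudo p-value can be sketched together in a few lines of Python. This is an illustrative sketch under stated assumptions, not the GeoDa procedure used in the study: the 6 x 6 lattice, the rates and all function names are hypothetical, self-neighbours are excluded from the weights (a common convention that differs from the adjacency definition given above), and only the global statistic is tested.

```python
import random


def queen_weights(rows, cols):
    """Row-standardized first-order queen contiguity on a rows x cols lattice.

    Assumption: a unit is not its own neighbour.
    """
    n = rows * cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            nbrs = [(r + dr) * cols + (c + dc)
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0)
                    and 0 <= r + dr < rows and 0 <= c + dc < cols]
            for j in nbrs:
                w[r * cols + c][j] = 1.0 / len(nbrs)
    return w


def deviations(x):
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def spatial_lag(z, w, i):
    """Weighted average of the neighbours' deviations for unit i."""
    return sum(w[i][j] * z[j] for j in range(len(z)))


def global_moran(x, w):
    """Equation (1): I = (N / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    z = deviations(x)
    s0 = sum(map(sum, w))
    num = sum(z[i] * spatial_lag(z, w, i) for i in range(len(z)))
    return (len(z) / s0) * num / sum(v * v for v in z)


def local_moran(x, w):
    """Equation (3): I_i = z_i / m2 * sum_j w_ij z_j, m2 = sum_i z_i^2 / n."""
    z = deviations(x)
    m2 = sum(v * v for v in z) / len(z)
    return [z[i] / m2 * spatial_lag(z, w, i) for i in range(len(z))]


def pseudo_p_global(x, w, permutations=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = random.Random(seed)
    observed = global_moran(x, w)
    shuffled = list(x)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # reassign values among the map units
        if global_moran(shuffled, w) >= observed:
            hits += 1
    return (hits + 1) / (permutations + 1)


def quadrants(x, w):
    """Label each unit HH, LL, HL or LH (own value, then neighbours' lag)."""
    z = deviations(x)
    return [("H" if z[i] > 0 else "L")
            + ("H" if spatial_lag(z, w, i) > 0 else "L")
            for i in range(len(z))]


# Hypothetical lattice: high rates in the left half, low rates in the right.
rows, cols = 6, 6
w = queen_weights(rows, cols)
rates = [10.0 if c < 3 else 1.0 for r in range(rows) for c in range(cols)]

global_i = global_moran(rates, w)
local_i = local_moran(rates, w)
p_value = pseudo_p_global(rates, w)
labels = quadrants(rates, w)
print(f"global I = {global_i:.3f}, pseudo p = {p_value:.3f}")
```

With row-standardized weights $S_0$ equals the number of units, so the global statistic is the average of the local values, which is one way to check that the two formulas agree. Local pseudo p-values follow the same permutation idea, typically holding the value at location i fixed while shuffling the remaining values; that step is omitted here for brevity.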
LISA distinguishes between a statistically significant (0.05 level) cluster of high values (HH), a cluster of low values (LL), an outlier in which a high value is surrounded primarily by low values (HL), and an outlier in which a low value is surrounded primarily by high values (LH). A feature with a z-score larger than +1.96 is defined as part of a cluster (HH or LL). A feature with a z-score less than –1.96 is considered an outlier (HL or LH). We consider that outliers may not represent the outcomes of the spatio-temporal pattern comparison stably and precisely, because it is difficult to determine how strongly an outlier is associated with disease risk. Therefore, only hot and cold spots are mapped on local Moran’s maps.

In addition to mapping, similarities between spatial distribution patterns for the two periods (1995-1998 and 2005-2008) were determined using logistic regression analysis. The binary response indicates whether there is significant autocorrelation between administrative districts or areas. The correlation is classed as higher if the z-score of the local Moran’s I statistic is larger than +1.96 (clusters with hot spots and cold spots); otherwise it is classed as lower. The model is expressed as:

$$\log\left[\frac{\Pr(\text{Higher correlation})}{\Pr(\text{Lower correlation})}\right] = \beta_0 + \beta_1 \times \text{Period} \qquad (4)$$

where Period is the explanatory variable in the logistic regression model and the two $\beta$ values are the logistic regression coefficients of the model. Pr(Higher correlation) and Pr(Lower correlation) denote the “Higher” and “Lower” correlation probabilities, respectively. In this study, two distinct precincts, the central and west precincts in Tainan city, merged into one single unified administrative unit in 2004. These unpaired data were omitted and the data from a total of 348 townships were tested using logistic regression.
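Because Period is a single binary variable, the model in Equation (4) has a closed-form maximum-likelihood fit: $\beta_0$ is the log odds of a “Higher” response in the first period and $\beta_1$ is the log odds ratio between the periods. The sketch below illustrates this with a Wald test for $\beta_1$. It is an assumption-laden illustration rather than the SPSS procedure used in the study: the function names and all counts are hypothetical, and the two periods are treated as independent samples.

```python
import math


def dichotomize(z_scores, threshold=1.96):
    """1 = 'Higher' correlation (z-score above the threshold), 0 = 'Lower'."""
    return [1 if z > threshold else 0 for z in z_scores]


def period_logit(high_p1, low_p1, high_p2, low_p2):
    """Fit log(Pr(Higher) / Pr(Lower)) = b0 + b1 * Period, with Period in {0, 1}.

    Arguments are the counts of 'Higher' and 'Lower' units in each period.
    Returns (b0, b1, standard error of b1, two-sided Wald p-value for b1).
    """
    if min(high_p1, low_p1, high_p2, low_p2) <= 0:
        raise ValueError("all four cell counts must be positive")
    b0 = math.log(high_p1 / low_p1)        # log odds in the first period
    b1 = math.log(high_p2 / low_p2) - b0   # log odds ratio between periods
    se_b1 = math.sqrt(1 / high_p1 + 1 / low_p1 + 1 / high_p2 + 1 / low_p2)
    z = b1 / se_b1
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return b0, b1, se_b1, p_value


# Hypothetical counts for 348 paired townships (not data from this study).
b0, b1, se_b1, p_value = period_logit(high_p1=60, low_p1=288,
                                      high_p2=90, low_p2=258)
print(f"b1 = {b1:.3f} (odds ratio {math.exp(b1):.2f}), p = {p_value:.4f}")

# Counts would normally come from dichotomized local z-scores, for example:
flags = dichotomize([2.5, 1.0, -3.0, 1.97])
```

A small p-value for $\beta_1$ would be read as a dissimilarity between the two periods, and a large one as a similarity, which is how the comparison results are described later in the paper.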
Modeling of the logistic regression was performed using SPSS 12. The global Moran’s I statistic and local Moran’s I statistic were calculated using Geoda (http://www.geoda.uiuc.edu/), an open source spatial analysis system, and visualized on LISA cluster maps using ArcMap 9.3.

3. RESULTS

Figure 2 displays the spatial clusters (hot spots and cold spots) obtained using the LISA statistic for the top 13 leading malignant neoplasms for both males and females in Taiwan during the two time periods (1995-1998 and 2005-2008).

Table 1 summarizes the results from global autocorrelation statistics for the top 13 leading malignant neoplasms according to gender and in the two time periods (1995-1998 and 2005-2008) in Taiwan. The results of the global Moran’s I tests for most cases related to the leading malignant neoplasms are statistically significant, having a pseudo p-value smaller than 0.05, and indicated spatial heterogeneity. However, opposing results (a pseudo p-value larger than 0.05) emerged in nine cases: pancreas cancer for males (1995-1998), non-Hodgkin’s lymphoma for males (1995-1998) and females (1995-1998 and 2005-2008), gallbladder and extrahepatic bile ducts cancer for males (1995-1998) and females (1995-1998 and 2005-2008), and leukaemia for males (2005-2008) and females (1995-1998).

Table 2 summarizes the typology patterns, as calculated using the LISA statistic, categorized as clusters or non-clusters at a z-score larger than +1.96. It also compares the top 13 leading malignant neoplasms during the two time periods (1995-1998 and 2005-2008).

Dissimilarities between the spatial distribution patterns during the two periods (1995-1998 and 2005-2008) are not statistically significant (p-value > 0.05) in males for six out of eleven spatial clusters, and in females for ten of twelve spatial clusters. In males, there are dissimilarities for stomach cancer, oral cavity cancer, pancreas cancer, non-Hodgkin’s lymphoma, and prostate cancer. In females, colon and rectum cancer, and pancreas cancer are dissimilar. Table 2 presents these findings.

Figure 2. Spatial clusters of the 13 leading malignant neoplasms in Taiwan. Maps showing the spatial clusters of the 13 leading malignant neoplasms in Taiwan: A indicates trachea, bronchus, and lung cancer; B, liver and intrahepatic bile ducts cancer; C, colon and rectum cancer; D, stomach cancer; E, oral cavity cancer; F, oesophagus cancer; G, pancreas cancer; H, non-Hodgkin’s lymphoma; I, gallbladder and extrahepatic bile ducts cancer; J, leukaemia; K, female breast cancer; L, cervix uteri cancer; M, prostate cancer. 1 indicates males within the period from 1995 to 1998; 2, males within the period from 2005 to 2008; 3, females within the period from 1995 to 1998; 4, females within the period from 2005 to 2008.

4. DISCUSSION

Locations in close proximity tend to share similar attributes. According to Tobler (1979), “everything is related to everything else, and nearby things are more closely related to nearby things than to distant things” [33]. In epidemiology, a cluster becomes apparent when a number of health events occur close together in space and/or time. The evaluation of spatial distributions as a measure of disease risk may provide etiological insights [34]. Spatial autocorrelation is the relation between the values of a single variable attributable to the geographic arrangement of areal units on a map and can be used to determine the degree of spatial clustering [35,36]. In this study, the local Moran’s I statistic was used to measure the degree of spatial clustering and map the geographic patterns of the areal units. Spatial clustering of the leading cause of death (also called hot spots and cold spots) was identified by a z-score value larger than +1.96.
In epidemiology, “hot spots” are considered interesting because of their correlation to aetiology. This study, therefore, focuses on the spatial locations of 13 leading malignant neoplasms. Information about spatial location is useful for detecting risk from a spatial point of view. A more detailed survey of these identified “hot spots” may provide important clues on risk factors for these diseases.

Table 1. Global autocorrelation analysis of data for the 13 leading malignant neoplasms in Taiwan, according to gender, during 1995-1998 and 2005-2008. Values are Moran’s I.

                                                          Male                  Female
Leading malignant neoplasms (ICD code)                    1995-1998  2005-2008  1995-1998  2005-2008
Trachea, bronchus, and lung cancer (ICD 162)              0.38*      0.46*      0.17*      0.17*
Liver and intrahepatic bile ducts cancer (ICD 155)        0.45*      0.59*      0.34*      0.42*
Colon and rectum cancer (ICD 153, 154)                    0.40*      0.52*      0.40*      0.49*
Stomach cancer (ICD 151)                                  0.34*      0.37*      0.22*      0.35*
Oral cavity cancer (ICD 140, 141, 143-146, 148, 149)      0.43*      0.68*      0.09*      0.68*
Oesophagus cancer (ICD 150)                               0.24*      0.22*      0.07*      0.25*
Pancreas cancer (ICD 157)                                 0.05       0.18*      0.07*      0.22*
Non-Hodgkin’s lymphoma (ICD 200, 202, 203)                0.02       0.07*      0.05       0.05
Gallbladder and extrahepatic bile ducts cancer (ICD 156)  0.06       0.14*      0.05       0.04
Leukaemia (ICD 204-208)                                   0.08*      0.04       0.01       0.08*
Female breast cancer (ICD 174)                            n.d.       n.d.       0.52*      0.53*
Cervix uteri cancer (ICD 179, 180)                        n.d.       n.d.       0.24*      0.26*
Prostate cancer (ICD 185)                                 0.12*      0.60*      n.d.       n.d.

n.d.: no detection. *: A pseudo p-value smaller than 0.05.

Table 2. Logistic regression model comparisons of the 13 leading malignant neoplasms in Taiwan, during 1995-1998 and 2005-2008.

                                                          Male                         Female
Leading malignant neoplasms (ICD code)                    p-value  description         p-value  description
Trachea, bronchus, and lung cancer (ICD 162)              0.245    similarity (a)      0.21     similarity (a)
Liver and intrahepatic bile ducts cancer (ICD 155)        0.505    similarity (a)      0.412    similarity (a)
Colon and rectum cancer (ICD 153, 154)                    0.492    similarity (a)      0.019    dissimilarity (a)
Stomach cancer (ICD 151)                                  0.034    dissimilarity (a)   0.053    similarity (a)
Oral cavity cancer (ICD 140, 141, 143-146, 148, 149)      0.007    dissimilarity (a)   0.229    similarity (a)
Oesophagus cancer (ICD 150)                               0.844    similarity (a)      0.266    similarity (a)
Pancreas cancer (ICD 157)                                 0.029    dissimilarity       0.047    dissimilarity (a)
Non-Hodgkin’s lymphoma (ICD 200, 202, 203)                0.006    dissimilarity       0.179    similarity
Gallbladder and extrahepatic bile ducts cancer (ICD 156)  0.409    similarity          0.197    similarity
Leukaemia (ICD 204-208)                                   0.137    similarity          0.781    similarity
Female breast cancer (ICD 174)                            n.d.                         0.182    similarity (a)
Cervix uteri cancer (ICD 179, 180)                        n.d.                         0.84     similarity (a)
Prostate cancer (ICD 185)                                 0.007    dissimilarity (a)   n.d.

n.d.: no detection. (a): A comparison of the two periods during which all of Moran’s test results are clusters (results based on Table 1).

The modifiable areal unit problem (MAUP) is a phenomenon whereby analysis of the same data provides different results when the data are grouped into different sets of areal units. The MAUP can be subdivided into two separate effects that usually occur simultaneously during the analysis of aggregated data. The scale effect causes variation in statistical results according to different levels of aggregation.
An association between variables, therefore, depends on the sizes of the areal units of the reported data. Generally, correlation increases as the size of the areal unit increases. The zone effect describes variations in correlation statistics caused by the regrouping of data into different configurations, but with the same scale. The MAUP occurs because spatial processes generating the observed data may exist within certain scales, and for particular areal units. These may be reflected more or less accurately by the boundaries in use [37]. Manley et al. (2006) concluded that MAUP is not really a problem, but rather, a resource. Data at different scale levels can enable the identification of processes operating within different scales. It is clear that it is not possible to define an ideal single census geography that captures all of the processes for all variables [37]. Furthermore, the internal composition of given areal units may not be homogeneous, particularly for disease distribution. Matisziw et al. (2008) have suggested that down-scaling the spatial structure of polygonal units could provide valuable information pertaining to the spatial distribution of disease [38]. In this study, the administrative government regions are nearly but not completely identical in the two periods (1995-1998 and 2005-2008). This was to some degree due to the merging of the central and west districts in Tainan city into one unit in 2004. The use of only one scale to estimate spatial distribution patterns, although still a cluster comparison, would be more convenient; however, bias could be caused by using a non-realistic spatial boundary.
An ideal process would be to calculate the spatial autocorrelation coefficients (such as the z-scores) based on realistic boundaries (two sets of shape files representing 350 townships in 1995-1998 and 349 townships in 2005-2008, respectively) and then omit the autocorrelation coefficients of non-paired units from the comparison of the two periods within the administrative regions.

The local spatial autocorrelation coefficients can be tested for statistical significance under two rather different model assumptions. The first is the classical statistical assumption of normality, whereby it is assumed that the observed value of the coefficient is the result of the set of z-score values being independent and identically distributed drawings from a normal distribution, implying that variances are constant across the region. The second model is one of randomization, whereby the observed pattern of the set of z-score values is assumed to be just one realization from all possible random permutations of the observed values across all the zones. Both models have important weaknesses. For example, there is an underlying population size variation and a lack of homogeneity of probabilities; however, these models are widely implemented in software packages to provide estimates of the significance of observed results. In the case of the randomization model, many software packages generate a set of N random permutations of the input data, where N is user specified. For each simulation run, index values are computed and the set of such values is used to provide a pseudo-probability distribution for the given problem, against which the observed value can be compared.
A z-transform of the coefficients under normality or randomization assumptions is distributed approximately as N(0, 1); hence, it may be compared to percentage points of the normal distribution to identify particularly high or low values [39]. In this study, the data for the two periods (1995-1998 and 2005-2008) came from the Taiwan Cancer Registry and the Taiwan National Health Insurance agencies, respectively. Both databases are reference sources with high validity and reliability, and in both, cases were defined with the same diagnostic criteria (ICD-9-CM) and morbidity rates were calculated using the 1976 world standard population. Nevertheless, the estimated morbidity rates derived from the two databases cannot be directly compared with one another. Our suggested resolution is to convert the morbidity rates into z-transforms by means of a spatial autocorrelation calculation with a randomization of 999 permutations, which makes a comparison of the two sets of z-transforms feasible. Binomial logistic regression models were used to distinguish the spatial distribution patterns of the two periods (1995-1998 and 2005-2008).

Z-scores from the LISA method were entered into the logistic regression model, and the results for the various leading malignant neoplasms during the two periods (1995-1998 and 2005-2008) were compared. However, the constraint for the spatial clustering comparison (such as the global Moran's I test indicating clustering in both periods) must be satisfied before the logistic regression is calculated for purposes of comparison. Based on this constraint, the results demonstrate statistically significant differences for stomach cancer (in males), oral cavity cancer (in males), prostate cancer (in males), colon and rectum cancer (in females), and pancreas cancer (in females). The other eleven compared cases were not significantly different. For these cases the null hypothesis is, therefore, accepted.
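
One plausible reading of this comparison can be sketched in code. The sketch assumes, and this is our assumption rather than a specification given in the text, that the period is the binary response (0 for 1995-1998, 1 for 2005-2008) and the township LISA z-score is the single predictor, so that a slope close to zero means the two periods cannot be told apart. The function names and the z-score values are hypothetical, and the model is fitted with a plain Newton-Raphson iteration rather than any particular statistics package.

```python
import math

def _gradient_and_information(b0, b1, x, y):
    """Score vector and information matrix entries of the logistic log-likelihood."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return g0, g1, h00, h01, h11

def fit_logistic(x, y, n_iter=25):
    """Fit logit P(y=1) = b0 + b1*x; return (b0, b1, Wald z statistic for b1)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0, g1, h00, h01, h11 = _gradient_and_information(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det   # Newton step: beta += inverse(info) * score
        b1 += (h00 * g1 - h01 * g0) / det
    _, _, h00, h01, h11 = _gradient_and_information(b0, b1, x, y)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return b0, b1, b1 / se_b1

def compare_periods(z_first, z_second):
    """Stack both periods' z-scores and regress the period indicator on them."""
    x = list(z_first) + list(z_second)
    y = [0] * len(z_first) + [1] * len(z_second)
    return fit_logistic(x, y)

# Hypothetical LISA z-scores for eight paired townships.
z_1995 = [-1.2, -0.5, 0.1, 0.4, 0.9, 1.5, 2.1, 2.8]
z_2005_shifted = [0.3, 0.8, 1.4, 1.9, 2.5, 3.0, 3.6, 4.2]

_, slope_same, wald_same = compare_periods(z_1995, z_1995)
_, slope_diff, wald_diff = compare_periods(z_1995, z_2005_shifted)
print(f"identical periods: slope = {slope_same:.3f}")
print(f"shifted periods:   slope = {slope_diff:.3f}, Wald z = {wald_diff:.2f}")
```

The Wald statistic would then be compared with normal percentage points; with perfectly separated z-scores the iteration would not converge, so a real analysis would need a check for that case.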
The accepted null hypotheses indicate that common spatial factor(s) may be at work in both periods.

Few previous ecological studies in Taiwan relate malignant neoplasms to risk factors, although oral cancer and stomach cancer have been documented and are discussed briefly below. It is hoped that this assessment of the spatial clustering of Taiwan's leading malignant neoplasms can contribute to the study of spatial epidemiology.

Two separate groups identified clusters of areas showing elevated mortality from oral cavity cancer in females in the aboriginal townships in eastern Taiwan. Cigarette smoking, alcohol drinking and betel nut chewing were more prevalent among aboriginal women in eastern Taiwan than among women in other regions [40,41]. Chiang et al. suggested that high-risk areas of oral cancer incidence in males closely coincided with the spatial distribution of heavy-metal pollution in soils (such as chromium and nickel) in central Taiwan [42]. In this study, oral cavity cancer clusters for each gender were calculated using the LISA statistic. The results identify clear spatial clustering in central Taiwan for males and, among the Taiwanese aboriginal townships in eastern Taiwan, for females. These observations, therefore, support the results described in previous studies. However, according to our results, the two periods (1995-1998 and 2005-2008) show dissimilarity in the spatial distribution of oral cavity cancer in males. Spatial risks affecting oral cancer morbidity in males reveal space-time changes. These findings could be interpreted as disease clusters changing over time because of changes in exposure to metal pollutants, resulting in a variation of virulence.
Further investigation is therefore warranted.

Several meta-analyses identified a strong and consistent association between H. pylori infection and non-cardia gastric cancer [43-46]. An ecological study in Taiwan suggests an association between this infection and gastric cancer. H. pylori infection in early childhood may be a key issue, and a long induction time appears to be required for gastric carcinogenesis. Areas of high gastric cancer mortality are clustered in the aboriginal townships, where the prevalence of H. pylori is high [40,47]. Our results are similar to those of these previous studies. Stomach cancer clusters for males and females are located in the Taiwanese aboriginal townships, and a new cluster was identified in the northern coastal region of Taiwan. This is worthy of further investigation. However, the two periods (1995-1998 and 2005-2008) show dissimilarity in the spatial distribution of gastric cancer in males. Spatial risks affecting gastric cancer morbidity in males reveal space-time changes. A possible reason for the change in disease clusters over time is a change in the geographic range of H. pylori prevalence or increased interference from other risks in the study area. Further investigation is therefore warranted.

5. CONCLUSIONS

A method which combines LISA statistics and logistic regression is an effective tool for the detection of space-time patterns with discontinuous data. Similarity between periods reflects unchanged conditions in disease risks. Conversely, dissimilarity is deemed to indicate a significant change of morbidity risks over the studied periods. This enables planners to assess spatial risk factors and to determine the most advantageous types of health care policies for the planning and implementation of health care services.
Addressing these issues can greatly improve the performance and effectiveness of health care services and also provides a clear outline for understanding the results in greater depth.

6. ACKNOWLEDGEMENTS

The authors would like to thank Taiwan's Department of Health for providing the National Health Insurance and Bureau of Health Promotion databases.

REFERENCES

[1] Gesler, W. (1986) The uses of spatial analysis in medical geography: A review. Social Science & Medicine, 23, 963-973. doi:10.1016/0277-9536(86)90253-4
[2] Cuzick, J. and Edwards, R. (1990) Spatial clustering for inhomogeneous populations. Journal of the Royal Statistical Society, 52, 73-104.
[3] Cressie, N.A.C. (1993) Statistics for spatial data. Wiley, New York.
[4] Legendre, P. and Legendre, L. (1998) Numerical ecology. 2nd English Edition, Elsevier, Amsterdam.
[5] Fortin, M.J. (1999) Spatial statistics in landscape ecology. In: Klopatek, J.M. and Gardner, R.H., Eds., Landscape Ecological Analysis: Issues and Applications, Springer-Verlag, New York, 253-279. doi:10.1007/978-1-4612-0529-6_12
[6] Tsai, P.J., Lin, M.L., Chu, C.M. and Perng, C.H. (2009) Spatial autocorrelation analysis of health care hotspots in Taiwan in 2006. BMC Public Health, 9, 464. doi:10.1186/1471-2458-9-464
[7] Anselin, L. (1995) The local indicators of spatial association - LISA. Geographical Analysis, 27, 93-115. doi:10.1111/j.1538-4632.1995.tb00338.x
[8] Getis, A. and Ord, J.K. (1992) The analysis of spatial association by use of distance statistics. Geographical Analysis, 24, 189-206. doi:10.1111/j.1538-4632.1992.tb00261.x
[9] Getis, A. and Ord, J.K. (1996) Local spatial statistics: An overview. In: Longley, P. and Batty, M., Eds., Spatial Analysis: Modeling in a GIS Environment, John Wiley & Sons, New York, 261-277.
[10] Knox, E.G. (1964) The detection of space-time interaction. Applied Statistics, 13, 25-29. doi:10.2307/2985220
[11] Mantel, N.
(1967) The detection of cancer clustering and the generalized regression approach. Cancer Research, 27, 209-220.
[12] Jacquez, G.M. (1996) A k nearest neighbor test for space-time interaction. Statistics in Medicine, 15, 1935-1949. doi:10.1002/(SICI)1097-0258(19960930)15:18<1935::AID-SIM406>3.0.CO;2-I
[13] Kulldorff, M. and Nagarwalla, N. (1995) Spatial disease clusters: Detection and inference. Statistics in Medicine, 14, 799-810. doi:10.1002/sim.4780140809
[14] Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics: Theory and Methods, 26, 1481-1496. doi:10.1080/03610929708831995
[15] Kulldorff, M. (1999) Spatial scan statistics: Models, calculations, and applications. In: Glaz, J. and Balakrishnan, N., Eds., Scan Statistics and Applications, Birkhäuser, Boston, 303-322. doi:10.1007/978-1-4612-1578-3_14
[16] Neill, D.B., Moore, A.W. and Cooper, G.F. (2006) A Bayesian spatial scan statistic. Advances in Neural Information Processing Systems, 18, 1003-1010.
[17] Greenlee, R.T., Murray, T., Bolden, S. and Wingo, P.A. (2000) Cancer statistics. CA: A Cancer Journal for Clinicians, 50, 7-33. doi:10.3322/canjclin.50.1.7
[18] Adami, H.O., Hunter, D. and Trichopoulos, D. (2002) Textbook of cancer epidemiology. Oxford University Press, New York.
[19] Parkin, D.M., Whelan, S.L., Ferlay, J., Teppo, L. and Thomas, D.B. (2002) Cancer incidence in five continents. IARC Scientific Publications, Lyon.
[20] Frank, S.A. (2007) Dynamics of cancer: Incidence, inheritance, and evolution. Princeton University Press, Princeton.
[21] National Health Insurance (2007) Statistical annual report of medical care 2005. National Health Insurance (Taiwan), Taipei.
[22] National Health Insurance (2008) Statistical annual report of medical care 2006.
National Health Insurance (Taiwan), Taipei.
[23] National Health Insurance (2009) Statistical annual report of medical care 2007. National Health Insurance (Taiwan), Taipei.
[24] National Health Insurance (2010) Statistical annual report of medical care 2008. National Health Insurance (Taiwan), Taipei.
[25] Ministry of the Interior (2009) The demographic database. http://www.moi.gov.tw/stat/index.aspx
[26] Ahmad, O.E., Boschi-Pinto, C., Lopez, A.D., Murray, C.J.L., Lozano, R. and Inoue, M. (2000) Age standardization of rates: A new WHO standard (GPE discussion paper series, No. 31). World Health Organization Press, Geneva.
[27] Liaw, Y.P., Chen, C.J., Lee, W.C. and Hsu, S.Y. (2003) The construction and use of the electric atlas of cancer mortality and incidence in Taiwan. Taiwan Journal of Public Health, 22, 227-236.
[28] Boots, B.N. and Getis, A. (1998) Point pattern analysis. Sage Publications, Newbury Park.
[29] Cliff, A.D. and Ord, J.K. (1973) Spatial autocorrelation. Pion Limited, London.
[30] Grubesic, T.H. (2008) Zip codes and spatial analysis: Problems and prospects. Socio-Economic Planning Sciences, 42, 129-149. doi:10.1016/j.seps.2006.09.001
[31] Cliff, A.D. and Ord, J.K. (1981) Spatial processes: Models and applications. Pion Limited, London.
[32] Ord, J.K. and Getis, A. (1995) Local spatial autocorrelation statistics: Distributional issues and an application. Geographical Analysis, 27, 286-306. doi:10.1111/j.1538-4632.1995.tb00912.x
[33] Tobler, W. (1979) Cellular geography. In: Gale, S. and Olsson, G., Eds., Philosophy in Geography, Reidel, Dordrecht, 379-386.
[34] Moore, D.A. and Carpenter, T.E. (1999) Spatial analytical methods and geographic information systems: Use in health research and epidemiology. Epidemiologic Reviews, 21, 143-161.
[35] Griffith, D.A. and Amrhein, C.G. (1991) Statistical analysis for geographers. Prentice Hall, Englewood Cliffs.
[36] Kitron, U.
and Kazmierczak, J.J. (1997) Spatial analysis of the distribution of Lyme disease in Wisconsin. American Journal of Epidemiology, 145, 558-566.
[37] Manley, D., Flowerdew, R. and Steel, D. (2006) Scales, levels and processes: Studying spatial patterns of British census variables. Computers, Environment and Urban Systems, 30, 143-160. doi:10.1016/j.compenvurbsys.2005.08.005
[38] Matisziw, T.C., Grubesic, T.H. and Wei, H. (2008) Downscaling spatial structure for the analysis of epidemiological data. Computers, Environment and Urban Systems, 32, 81-93.
[39] De Smith, M.J., Goodchild, M.F. and Longley, P.A. (2007) Geospatial Analysis: A comprehensive guide to principles, techniques and software tools. Matador, Leicester.
[40] Lin, J.T., Wang, L.Y., Wang, J.T., Wang, T.H. and Chen, C.J. (1995) Ecological study of association between Helicobacter pylori infection and gastric cancer in Taiwan. Digestive Diseases and Sciences, 40, 385-388. doi:10.1007/BF02065425
[41] Yang, Y.H., Lee, H.Y., Tung, S. and Shieh, T.Y. (2001) Epidemiological survey of oral submucous fibrosis and leukoplakia in aborigines of Taiwan. Journal of Oral Pathology & Medicine, 30, 213-219. doi:10.1034/j.1600-0714.2001.300404.x
[42] Chiang, C.T., Hwang, Y.H., Su, C.C., Tsai, K.Y., Lian, I.B., Yuan, T.H. and Chang, T.K. (2010) Elucidating the underlying causes of oral cancer through spatial clustering in high-risk areas of Taiwan with a distinct gender ratio of incidence. Geospatial Health, 4, 231-242.
[43] Huang, J.Q., Sridhar, S., Chen, Y. and Hunt, R.H. (1998) Meta-analysis of the relationship between Helicobacter pylori seropositivity and gastric cancer. Gastroenterology, 114, 1169-1179. doi:10.1016/S0016-5085(98)70422-6
[44] Eslick, G.D., Lim, L.L. and Byles, J. (1999) Association of Helicobacter pylori infection with gastric carcinoma: A meta-analysis. The American Journal of Gastroenterology, 94, 2373-2379.
doi:10.1111/j.1572-0241.1999.01360.x
[45] Xue, F.B., Xu, Y.Y. and Wan, Y. (2001) Association of Helicobacter pylori infection with gastric carcinoma: A meta-analysis. World Journal of Gastroenterology, 7, 801-804.
[46] Wang, C., Yuan, Y. and Hunt, R.H. (2007) The association between Helicobacter pylori infection and early gastric cancer: A meta-analysis. The American Journal of Gastroenterology, 102, 1789-1798. doi:10.1111/j.1572-0241.2007.01335.x
[47] Teh, B.H., Lin, J.T., Pan, W.H., Lin, S.H., Wang, L.Y., Lee, T.K. and Chen, C.J. (1994) Seroprevalence and associated risk factors of Helicobacter pylori infection in Taiwan. Anticancer Research, 14, 1389-1392. |