Open Access Research

Detection of arbitrarily-shaped clusters using a neighbor-expanding approach: A case study on murine typhus in South Texas

Zhijun Yao1*, Junmei Tang2 and F Benjamin Zhan3

Author Affiliations

1 Texas Center for Geographic Information Science, Department of Geography, Texas State University-San Marcos, 601 University Drive, San Marcos, TX, 78666, USA

2 The Department of Geography and Environmental Systems, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD, 21250, USA

3 School of Resource and Environmental Science, Wuhan University, Wuhan, 430079, China

International Journal of Health Geographics 2011, 10:23  doi:10.1186/1476-072X-10-23

Published: 31 March 2011

Abstract

Background

Kulldorff's spatial scan statistic is one of the most widely used statistical methods for automatic detection of clusters in spatial data. One limitation of this method is that it relies on scan windows with predefined shapes in the search process and therefore cannot detect clusters with arbitrary shapes. We employ a new neighbor-expanding approach and introduce two new algorithms to detect clusters with arbitrary shapes in spatial data: the maximum-likelihood-first (MLF) algorithm and the non-greedy growth (NGG) algorithm. We then compare the performance of these two algorithms with the spatial scan statistic (SaTScan), Tango's flexibly shaped spatial scan statistic (FlexScan), and Duczmal's simulated annealing (SA) method using two datasets. Furthermore, we apply the methods to examine clusters of murine typhus cases in South Texas from 1996 to 2006.
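
The general idea of growing a cluster by expanding into neighboring regions can be illustrated with a minimal sketch. It is not the MLF or NGG algorithm, whose details are not given in this abstract. It is a plain greedy expansion that scores candidate zones with the standard Poisson log-likelihood ratio used by Kulldorff's scan statistic. The function names (`poisson_llr`, `grow_cluster`), the `max_size` parameter, and the toy data are all hypothetical and chosen only for illustration.

```python
import math

def poisson_llr(c, e, total):
    """Poisson log-likelihood ratio for a zone with c observed and
    e expected cases, out of `total` cases in the study area.
    Zones without an excess of cases score 0."""
    if c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if total > c:  # the outside term vanishes when every case is inside
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr

def grow_cluster(seed, cases, expected, neighbors, max_size=None):
    """Greedy neighbor expansion (illustrative assumption, not the
    authors' method): starting from `seed`, repeatedly add the adjacent
    region that raises the likelihood ratio the most, and stop when no
    neighbor improves it."""
    total = sum(cases.values())
    cluster = {seed}
    c, e = cases[seed], expected[seed]
    best = poisson_llr(c, e, total)
    while max_size is None or len(cluster) < max_size:
        # regions adjacent to the current cluster but not yet in it
        frontier = {n for r in cluster for n in neighbors[r]} - cluster
        candidate = None
        for n in sorted(frontier):
            score = poisson_llr(c + cases[n], e + expected[n], total)
            if score > best:
                best, candidate = score, n
        if candidate is None:
            break
        cluster.add(candidate)
        c += cases[candidate]
        e += expected[candidate]
    return cluster, best

# Hypothetical toy data: five regions in a chain A-B-C-D-E with equal
# populations, so each region expects the same share of the 26 cases.
cases = {"A": 10, "B": 12, "C": 2, "D": 1, "E": 1}
expected = {r: sum(cases.values()) / len(cases) for r in cases}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
             "D": ["C", "E"], "E": ["D"]}

cluster, score = grow_cluster("A", cases, expected, neighbors)
print(sorted(cluster), round(score, 2))
```

In this toy example the cluster grows from A to include B and then stops, because adding C would lower the likelihood ratio. Because the zone follows adjacency rather than a fixed window, its shape is not constrained in advance. Assessing significance would additionally require Monte Carlo replication under the null hypothesis, which this sketch omits.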

Results

When compared with the SaTScan and FlexScan methods, the two new algorithms were more flexible and sensitive in detecting clusters with arbitrary shapes in the test datasets. Clusters detected by the MLF algorithm were statistically more significant than those detected by the NGG algorithm. However, the NGG algorithm appeared to be more stable when there were no extreme cluster patterns in the data. For the murine typhus data in South Texas, a large portion of the detected clusters were located in coastal counties where environmental conditions and the socioeconomic status of some population groups were at a disadvantage compared with those in counties with no clusters of murine typhus cases.

Conclusion

The two new algorithms are effective in detecting the location and boundary of spatial clusters with arbitrary shapes. Additional research is needed to better understand the etiology of the concentration of murine typhus cases in some counties in South Texas.