<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1476-072X-6-52</ui>
   <ji>1476-072X</ji>
   <fm>
      <dochead>Methodology</dochead>
      <bibl>
         <title>
            <p>Effect of spatial resolution on cluster detection: a simulation study</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Ozonoff</snm>
               <fnm>Al</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>aozonoff@bu.edu</email>
            </au>
            <au id="A2">
               <snm>Jeffery</snm>
               <fnm>Caroline</fnm>
               <insr iid="I2"/>
               <email>cjeffery@hsph.harvard.edu</email>
            </au>
            <au id="A3">
               <snm>Manjourides</snm>
               <fnm>Justin</fnm>
               <insr iid="I2"/>
               <email>jmanjour@hsph.harvard.edu</email>
            </au>
            <au id="A4">
               <snm>White</snm>
               <mnm>Forsberg</mnm>
               <fnm>Laura</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>lfwhite@bu.edu</email>
            </au>
            <au id="A5" ca="yes">
               <snm>Pagano</snm>
               <fnm>Marcello</fnm>
               <insr iid="I2"/>
               <email>pagano@hsph.harvard.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Biostatistics, Boston University School of Public Health, 715 Albany Street, Boston, MA 02118, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA</p>
            </ins>
         </insg>
         <source>International Journal of Health Geographics</source>
         <issn>1476-072X</issn>
         <pubdate>2007</pubdate>
         <volume>6</volume>
         <issue>1</issue>
         <fpage>52</fpage>
         <url>http://www.ij-healthgeographics.com/content/6/1/52</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18042281</pubid>
               <pubid idtype="doi">10.1186/1476-072X-6-52</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>07</day>
               <month>8</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>27</day>
               <month>11</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>27</day>
               <month>11</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Ozonoff et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Aggregation of spatial data is intended to protect privacy, but some effects of aggregation on spatial methods have not yet been quantified.</p>
            </sec>
            <sec>
               <st>
                  <p>Methods</p>
               </st>
               <p>We generated 3,000 spatial data sets and evaluated power of detection at 12 different levels of aggregation using the spatial scan statistic implemented in SaTScan v6.0.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Power to detect clusters decreased from nearly 100% when using exact locations to roughly 40% at the coarsest level of spatial resolution.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Aggregation has the potential for obfuscation.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>1 Introduction</p>
         </st>
         <p>The Centers for Disease Control and Prevention (CDC) define surveillance to be the ongoing, systematic collection, analysis, interpretation, and dissemination of data about a health-related event for use in public health action to reduce morbidity and mortality and to improve health <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. To control and prevent disease, it is surely important to be vigilant for infectious disease outbreaks or geographic areas of notably high chronic disease incidence. Indeed this is a primary aim of public health surveillance, and explains in part why surveillance plays an integral role in public health practice <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>.</p>
         <p>When caring for a single patient, the clinician understandably desires as much diagnostic information as possible, and at the highest possible level of precision. Analogously, a public health professional is concerned with diagnosing a public ailment, and should similarly desire all available information with the greatest possible level of precision. Thus it is noteworthy, in the context of public health surveillance, that for reasons of privacy, information is sometimes destroyed or intentionally degraded before being proffered to the analyst.</p>
         <p>The argument to protect patient data for reasons of privacy could also be used to shield these data from clinicians. In a clinical setting, we choose not to protect the privacy of the patient by hiding relevant information from the clinician, because it is patently silly to do so. However, we often suffer from a similarly framed argument to obscure population level data, even when addressing matters of concern to the public health.</p>
         <p>We argue that one reason to retain specific information such as precise location is that the "requisite" aggregation for privacy necessarily reduces the power available for outbreak detection. Against this cost, and the other troubles that aggregation causes for spatial analysis <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, one must weigh the benefit that aggregation does indeed make it more difficult to identify individual patients. This is crucial if the data are made publicly available or if there are other reasons to safeguard privacy, but it also makes an already challenging surveillance task even more difficult.</p>
         <p>A growing body of literature addresses statistical protection of privacy and its effects on analysis of surveillance data. Cox has written a useful survey of the general problem of confidentiality within small geographic areas, and the impacts of privacy concerns on public health policy and practice <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>.</p>
         <p>Armstrong et al. thoroughly discuss the design and implementation of several different approaches to protect privacy in the context of spatial analyses <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. Importantly, methods were evaluated both on their impact on analysis and on their effectiveness in preserving confidentiality. Yet the restriction of the quantitative assessment to the Cuzick-Edwards test statistic <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, which is no longer commonly used for spatial surveillance <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>, limits the application of this knowledge to a surveillance setting. Further, data with exact locations were not considered for this evaluation.</p>
         <p>Waller and colleagues have written extensively on factors that may influence power of cluster detection methods. For example, they have studied the effects of geographic scale on focused tests of clustering <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>, and the importance of cluster location amidst a heterogeneous underlying population <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Notably, this group has investigated more than one statistical method, using several different measures for evaluation. However, these studies generally use focused tests of clustering, where a putative exposure source has been identified <it>a priori</it>, whereas surveillance purposes typically require a general test of clustering <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>.</p>
         <p>Just as we trust clinicians and hospital personnel with sensitive and confidential information, so too, one can argue, we should find trustworthy individuals to handle surveillance data responsibly.</p>
         <p>Informatics-based approaches offer a potential compromise to the trade-off between privacy and surveillance utility. For example, development of automated surveillance algorithms might allow sensitive data to be analyzed without human intervention <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. But in order to evaluate the benefit that such an approach might provide, we must first better understand the costs in performance that the obfuscation or destruction of information may cause.</p>
         <p>We reported briefly <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> that there is an undesirable loss of power to detect disease outbreaks when the spatial information provided is degraded from a continuous scale of measurement to a coarser, aggregate level. For example, often only a patient's ZIP code is available to a surveillance system, instead of the patient's listed residential address. Similar results have appeared in contemporaneous work <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, and a recent paper by the same group further confirms this basic premise <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. However, those studies focused solely on exact locations compared to a single level of aggregation.</p>
         <p>In our present work, we add to these previous results by considering multiple levels of aggregation. Using synthetic data, we systematically quantify the loss of cluster detection performance as a function of spatial resolution, while limiting confounding influences from a variety of complex factors that affect spatial analyses. We may interpret these results relative to geographic scales we might encounter while surveilling a large metropolitan city. In this way, we attempt to clarify the price one pays for aggregation, and in turn to better inform future policy decision-makers.</p>
      </sec>
      <sec>
         <st>
            <p>2 Methods</p>
         </st>
         <sec>
            <st>
               <p>2.1 Data</p>
            </st>
            <p>We designed a simulation study to determine the effect of spatial aggregation on power to detect spatial clusters. Random samples of size 90 were drawn from an underlying uniform distribution on the unit disk (i.e. the disk of radius one). Atop this background sample, we then superimposed a simulated cluster consisting of 10 points uniformly distributed in a small square at a location randomly determined for each simulated data set (Figure <figr fid="F1">1</figr>). Thus each simulated data set consists of a total sample of 100 points. Although the clusters are not defined by circles, for ease of discussion we speak of a cluster "radius" to mean the radius of the circle inscribed within the square cluster boundary. In the occasional instance where the cluster center falls within one radius of the unit disk boundary, we require that all 10 cluster points lie within the intersection of the cluster boundary and the unit disk.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Illustration of a simulated cluster</p>
               </caption>
               <text>
                  <p><b>Illustration of a simulated cluster</b>. 90 points were distributed uniformly on the unit disk, and 10 additional "outbreak" points form the square "cluster" left of center.</p>
               </text>
               <graphic file="1476-072X-6-52-1"/>
            </fig>
            <p>We generated three separate sets of simulated data with cluster radii of 0.025, 0.05 and 0.10, corresponding to disease clusters with a geographical extent equal to 2.5%, 5%, or 10% respectively of the radius of the study area. Although this results in clusters of different intensities, the corresponding relative risks are quite large (greater than 10) for all simulations. For each cluster radius, we generated 1,000 data sets under these conditions, for a total of 3,000 data sets for the entire simulation study.</p>
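            <p>The data-generating scheme described above can be sketched roughly as follows. This is an illustrative reconstruction rather than the code used for the study: the function names, the use of rejection sampling, and the choice to draw the cluster center uniformly on the disk are all assumptions made for the example.</p>
            <p><![CDATA[
```python
import numpy as np

def sample_unit_disk(n, rng):
    """Draw n points uniformly from the unit disk by rejection sampling."""
    pts = np.empty((0, 2))
    while len(pts) < n:
        cand = rng.uniform(-1.0, 1.0, size=(2 * n, 2))
        cand = cand[(cand ** 2).sum(axis=1) <= 1.0]  # keep points inside the disk
        pts = np.vstack([pts, cand])
    return pts[:n]

def simulate_data_set(cluster_radius, rng, n_background=90, n_cluster=10):
    """One simulated data set: uniform background plus a small square cluster."""
    background = sample_unit_disk(n_background, rng)
    # Assumption: the cluster center is itself uniform on the unit disk.
    center = sample_unit_disk(1, rng)[0]
    cluster = np.empty((0, 2))
    while len(cluster) < n_cluster:
        # Uniform points in the square whose inscribed circle has the given radius.
        offsets = rng.uniform(-cluster_radius, cluster_radius, size=(4 * n_cluster, 2))
        cand = center + offsets
        # Near the boundary, keep only points that also fall in the unit disk.
        cand = cand[(cand ** 2).sum(axis=1) <= 1.0]
        cluster = np.vstack([cluster, cand])
    cluster = cluster[:n_cluster]
    points = np.vstack([background, cluster])
    is_cluster = np.concatenate([np.zeros(n_background, dtype=bool),
                                 np.ones(n_cluster, dtype=bool)])
    return points, is_cluster, center
```
]]></p>
            <p>Calling the hypothetical <it>simulate_data_set</it> 1,000 times for each of the three cluster radii would give a collection of the same size as the one described here.</p>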
            <p>To simulate spatial aggregation at different geographic scales, we use a sequence of 12 uniform grids of varying spacing, superimposed on the unit disk. The levels of aggregation are chosen according to their corresponding grid spacing, ranging from 15 grid squares per side (length of grid square 0.067) to four grid squares per side (length of grid square 0.25). Throughout, we use the average distance between grid points (equivalently, the average diameter of an aggregation region) as an index of the level of spatial aggregation (Figure <figr fid="F2">2</figr>).</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Illustration of spatial aggregation</p>
               </caption>
               <text>
                  <p><b>Illustration of spatial aggregation</b>. One of 12 levels of spatial aggregation used in this study. Grid lines define spatial regions of aggregation, and representative points are chosen randomly within each region. All simulated points are reassigned to the representative point of the appropriate region.</p>
               </text>
               <graphic file="1476-072X-6-52-2"/>
            </fig>
            <p>By assigning all simulated data points to the nearest grid point, these grids thereby define spatial regions of aggregation. Prior to analysis, we modified each grid by adding small amounts of bivariate jitter to each grid point (i.e. region center). Our purpose was to mitigate the high degree of spatial regularity across a uniform grid of assignment points, and in part to reflect the non-uniform nature of administrative regions as they appear in real systems. We note however that the use of a uniform population distribution implies constant population densities across administrative regions, something unlikely to be seen in a real system.</p>
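            <p>A minimal sketch of this aggregation step is given below, under several assumptions that are not stated in the text: the grid is taken to cover the bounding square [-1, 1] by [-1, 1], the jitter is uniform, and its size is controlled by a hypothetical parameter <it>jitter_frac</it> expressed as a fraction of the cell side. The cell side in this sketch is 2 divided by the number of squares per side, which need not match the normalization behind the grid-square lengths quoted above.</p>
            <p><![CDATA[
```python
import numpy as np

def aggregate(points, squares_per_side, jitter_frac, rng):
    """Reassign each point to the jittered representative point of its grid cell."""
    side = 2.0 / squares_per_side
    # Integer (column, row) index of the grid cell containing each point.
    idx = np.floor((points + 1.0) / side).astype(int)
    idx = np.clip(idx, 0, squares_per_side - 1)
    centers = -1.0 + side * (idx + 0.5)
    # One bivariate jitter vector per cell, shared by every point in that cell.
    jitter = rng.uniform(-jitter_frac * side, jitter_frac * side,
                         size=(squares_per_side, squares_per_side, 2))
    cell_jitter = jitter[idx[:, 0], idx[:, 1]]
    representatives = centers + cell_jitter
    region = idx[:, 0] * squares_per_side + idx[:, 1]
    return representatives, region
```
]]></p>
            <p>With <it>jitter_frac</it> below 0.5, each representative point stays inside its own cell, so all points in a region share a single location after aggregation.</p>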
         </sec>
         <sec>
            <st>
               <p>2.2 Statistical analysis</p>
            </st>
            <p>We use SaTScan version 6.0 (2005) with a purely spatial Bernoulli model, with cluster size constrained to be no greater than 25% of the population. Statistical significance of spatial clusters is determined using a nominal Type I error rate of 0.05.</p>
            <p>Our primary outcome is the proportion of simulated data sets, under each level of aggregation, for which SaTScan accurately detects the simulated cluster. We denote this proportion as the power to detect clusters. In order to ensure that the cluster detected by SaTScan is sufficiently close in space to the true cluster location, we record a detection as successful if and only if the identified cluster center is within one cluster radius of the true cluster center. We also record the proportion of false detections, defined as any cluster identification with center more than one cluster radius from the true cluster center, or failure of any identified cluster to achieve significance level (i.e. p-value) below 0.05.</p>
            <p>To measure the spatial accuracy of cluster detection, we further consider the identification (correctly or not) of individual data points in a significant disease cluster. Within each simulated data set, there were 10 points of 100 that comprised the simulated cluster. For these "cluster points", we calculate the proportion correctly included in a SaTScan-identified cluster with p-value below 0.05. Similarly for the remaining 90 "non-cluster points", we calculate the proportion incorrectly included in a statistically significant SaTScan-identified cluster. These proportions are analogous to traditional definitions of sensitivity and 1 minus specificity, respectively, where we compare the classification via SaTScan of points involved in a cluster to the "gold standard" of cluster status as determined by simulation design.</p>
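            <p>The success criterion and the point-level proportions defined above could be computed along the following lines. The sketch assumes that the center, p-value, and membership of the most likely cluster have already been read from the SaTScan output; the function and argument names are hypothetical, and no SaTScan interface is called.</p>
            <p><![CDATA[
```python
import numpy as np

def score_detection(detected_center, p_value, member_mask,
                    true_center, is_cluster, cluster_radius, alpha=0.05):
    """Score one simulated data set against the simulation design."""
    significant = p_value < alpha
    diff = np.asarray(detected_center, dtype=float) - np.asarray(true_center, dtype=float)
    dist = np.hypot(diff[0], diff[1])
    # Success: significant, and centered within one cluster radius of the truth.
    success = bool(significant and dist <= cluster_radius)
    # Points count as flagged only when the identified cluster is significant.
    flagged = member_mask if significant else np.zeros_like(member_mask)
    sensitivity = flagged[is_cluster].mean()               # cluster points included
    false_positive_fraction = flagged[~is_cluster].mean()  # 1 minus specificity
    return success, sensitivity, false_positive_fraction
```
]]></p>
            <p>Averaging the <it>success</it> flag over the simulated data sets at a given level of aggregation gives the power estimate for that level.</p>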
         </sec>
      </sec>
      <sec>
         <st>
            <p>3 Results</p>
         </st>
         <p>Figures <figr fid="F3">3</figr> through <figr fid="F6">6</figr> illustrate our results. For all three sets of simulations, power decreases as the size of aggregation regions increases. These simulated clusters are sufficiently large that the power to detect for all three cluster radii is nearly 100% when exact locations are used; this decreases to roughly 40% at the coarsest level of aggregation, so that the probability of successful detection is more than halved (Figure <figr fid="F3">3</figr>).</p>
         <fig id="F3">
            <title>
               <p>Figure 3</p>
            </title>
            <caption>
               <p>Effect of aggregation on power</p>
            </caption>
            <text>
               <p><b>Effect of aggregation on power</b>. As spatial data are aggregated, power to detect clusters decreases. Horizontal axis denotes level of spatial aggregation, determined by radius of aggregation region; vertical axis denotes proportion of simulated clusters correctly identified at significance level <it>&#945; </it>= 0.05.</p>
            </text>
            <graphic file="1476-072X-6-52-3"/>
         </fig>
         <fig id="F4">
            <title>
               <p>Figure 4</p>
            </title>
            <caption>
               <p>Effect of aggregation on false detection rate</p>
            </caption>
            <text>
               <p><b>Effect of aggregation on false detection rate</b>. Vertical axis denotes proportion of simulations where spurious clusters are detected.</p>
            </text>
            <graphic file="1476-072X-6-52-4"/>
         </fig>
         <fig id="F5">
            <title>
               <p>Figure 5</p>
            </title>
            <caption>
               <p>Effect of aggregation on sensitivity</p>
            </caption>
            <text>
               <p><b>Effect of aggregation on sensitivity</b>. Identification of cases involved in an outbreak becomes more difficult as data are aggregated. Vertical axis denotes proportion of cases falsely identified as outside the disease cluster (false negatives).</p>
            </text>
            <graphic file="1476-072X-6-52-5"/>
         </fig>
         <fig id="F6">
            <title>
               <p>Figure 6</p>
            </title>
            <caption>
               <p>Effect of aggregation on specificity</p>
            </caption>
            <text>
               <p><b>Effect of aggregation on specificity</b>. Vertical axis denotes proportion of cases falsely identified as inside the cluster (false positives).</p>
            </text>
            <graphic file="1476-072X-6-52-6"/>
         </fig>
         <p>Using exact locations, the false detection rate is approximately 2%. In the presence of any level of aggregation, the false detection rate increases to nearly 20% or higher in all of our simulations (Figure <figr fid="F4">4</figr>). This rate appears to increase slowly for greater levels of aggregation.</p>
         <p>We further evaluate the effect of aggregation on the sensitivity and specificity of SaTScan (Figures <figr fid="F5">5</figr> and <figr fid="F6">6</figr>). While performance is nearly ideal when using exact locations, the proportion of false negatives rises to almost 50% at the coarsest level of aggregation. In concordance with our earlier results, sensitivity tends to decrease as spatial aggregation increases, while the false positive fraction (1 minus specificity) follows an inverse and nearly monotonic association.</p>
      </sec>
      <sec>
         <st>
            <p>4 Discussion</p>
         </st>
         <p>Our results are noteworthy for a number of reasons. First, we have used more than two levels of aggregation in an effort to estimate the incremental effect of this aggregation on the power of cluster detection. Second, we have further investigated the effect of aggregation on the rate of false detection. Finally, when viewed in the context of similar studies, our results add to a body of evidence that the underlying relationships reported appear robust to differing geographies and population distributions.</p>
         <p>Our calculation of power and false detection differs in an important way from the conventional use of these measures. We expect a certain proportion of spurious "clusters" to arise by chance alone. Thus we have placed an additional requirement on what we denote a successful identification of a cluster, namely that the identified cluster be proximal to the true cluster as determined by the simulation design. Because our simulations involve only one cluster per data set, an identification far from the true cluster is genuinely spurious and must be considered a false detection in this context. Indeed, for practical purposes such an identification might divert resources for investigation to a geographic area not related to the true outbreak or cluster present in the data.</p>
         <p>To place our results in context, consider the metropolitan Boston area. The city and adjacent suburbs can be enclosed in a circle of radius roughly 7,500 meters. Although the size of city ZIP codes and census tracts varies, an approximate median radius for Boston ZIP codes is roughly 1,500 meters, or 20% of the region radius. Boston census tracts have an approximate median radius of 500 meters, or 6.7% of the region radius. Thus census tract and ZIP code aggregation of Boston data correspond roughly to our first and penultimate levels of aggregation respectively. Likewise, the simulated clusters of radii 0.025, 0.05, and 0.10 correspond to disease outbreaks smaller than one census tract, about one census tract, or several census tracts (perhaps a small ZIP code) respectively.</p>
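         <p>The scale conversion in this paragraph is simple arithmetic, and a short sketch may make it easier to repeat for another city. The three radii in meters are the approximate figures quoted above; the variable names are illustrative only.</p>
         <p><![CDATA[
```python
# Approximate radii quoted in the text, in meters.
region_radius_m = 7500.0
zip_radius_m = 1500.0
tract_radius_m = 500.0

# Aggregation-region radius as a fraction of the study-region radius.
zip_ratio = zip_radius_m / region_radius_m      # about 0.20
tract_ratio = tract_radius_m / region_radius_m  # about 0.067

# Simulated cluster radii converted back to meters at this scale
# (roughly 188, 375 and 750 m).
cluster_radii_m = [r * region_radius_m for r in (0.025, 0.05, 0.10)]
```
]]></p>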
         <p>The number of false detections rose well above the nominal alpha level when spatial data were aggregated. Interestingly, the level of aggregation does not appear to be a major contributor to false alarms; rather, there is an immediate increase upon aggregation above the nominal false alarm rate, with little additional increase for further aggregation. To our knowledge, this has not been reported previously. Since false alarms form a major limitation to the actionable consequences of cluster detection, this issue should be considered carefully. Even in situations where loss of power is not severe, the increase in false detection rates may impose further limits on the utility of spatial methods when using aggregated data.</p>
         <p>Our study is limited in several ways. We have only included an evaluation of SaTScan as a test of clustering, although we have seen similar results using other methods <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. The use of synthetic data is both helpful and harmful to generalizability of results. There are few populations that even approximate a homogeneous and uniform distribution, and thus the simulated data sets do not reflect a realistic surveillance scenario. However, using a homogeneous distribution removes some of the potentially confounding interactions between cluster location, geography, population distribution, and spatial methods. Thus despite its limitations, our study contributes to an understanding of the complex association between spatial resolution and power of detection.</p>
         <p>We chose not to investigate spatio-temporal methods (implemented for example with a space-time scan, also available using SaTScan). Space-time interactions imply greater complexity when considering effects of spatial aggregation (or indeed, temporal aggregation), and the potential parameter space of simulation studies increases greatly as well. For this and other reasons, the effect of aggregation on space-time cluster detection remains an area for further investigation.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The author(s) declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>AO and MP conceived of the study, participated in the design, and drafted the manuscript. AO, CJ, and JM were responsible for statistical programming and data analysis. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>Research partially supported by NIH grants R01-AI51164 and R01-EB006195.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <aug>
               <au>
                  <snm>Teutsch</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Churchill</snm>
                  <fnm>RE</fnm>
               </au>
            </aug>
            <source>Principles and Practice of Public Health Surveillance</source>
            <publisher>Oxford Univ Press</publisher>
            <pubdate>2000</pubdate>
         </bibl>
         <bibl id="B2">
            <aug>
               <au>
                  <snm>Brookmeyer</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Stroup</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <cnm>Eds</cnm>
               </au>
            </aug>
            <source>Monitoring the Health of Populations: Statistical principles and methods for public health surveillance</source>
            <publisher>Oxford Univ Press</publisher>
            <pubdate>2004</pubdate>
         </bibl>
         <bibl id="B3">
            <title>
               <p>On the use of ZIP codes and ZIP code tabulation areas (ZCTAs) for the spatial analysis of epidemiological data.</p>
            </title>
            <aug>
               <au>
                  <snm>Grubesic</snm>
                  <fnm>TH</fnm>
               </au>
               <au>
                  <snm>Matisziw</snm>
                  <fnm>TC</fnm>
               </au>
            </aug>
            <source>Int J Health Geogr</source>
            <pubdate>2006</pubdate>
            <volume>5</volume>
            <issue>58</issue>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">17166283</pubid>
                  <pubid idtype="pmcid">1762013</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Protecting confidentiality in small population health and environmental statistics</p>
            </title>
            <aug>
               <au>
                  <snm>Cox</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Statistics in Medicine</source>
            <pubdate>1996</pubdate>
            <volume>15</volume>
            <fpage>1895</fpage>
            <lpage>1905</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/(SICI)1097-0258(19960915)15:17&lt;1895::AID-SIM401>3.0.CO;2-W</pubid>
                  <pubid idtype="pmpid">8888482</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Geographically masking health data to preserve confidentiality</p>
            </title>
            <aug>
               <au>
                  <snm>Armstrong</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rushton</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Zimmerman</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Statistics in Medicine</source>
            <pubdate>1999</pubdate>
            <volume>18</volume>
            <fpage>497</fpage>
            <lpage>525</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/(SICI)1097-0258(19990315)18:5&lt;497::AID-SIM45>3.0.CO;2-#</pubid>
                  <pubid idtype="pmpid" link="fulltext">10209808</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Spatial clustering for inhomogeneous populations</p>
            </title>
            <aug>
               <au>
                  <snm>Cuzick</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>J Royal Statist Soc B</source>
            <pubdate>1990</pubdate>
            <volume>52</volume>
            <fpage>73</fpage>
            <lpage>104</lpage>
         </bibl>
         <bibl id="B7">
            <aug>
               <au>
                  <snm>Waller</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Gotway</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Applied Spatial Statistics for Public Health Data</source>
            <publisher>Wiley</publisher>
            <pubdate>2004</pubdate>
         </bibl>
         <bibl id="B8">
            <aug>
               <au>
                  <snm>Lawson</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Statistical Methods in Spatial Epidemiology</source>
            <publisher>Wiley</publisher>
            <edition>2</edition>
            <pubdate>2006</pubdate>
         </bibl>
         <bibl id="B9">
            <title>
               <p>The power of focused tests to detect disease clustering</p>
            </title>
            <aug>
               <au>
                  <snm>Waller</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Lawson</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Statistics in Medicine</source>
            <pubdate>1995</pubdate>
            <volume>14</volume>
            <fpage>2291</fpage>
            <lpage>2308</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/sim.4780142103</pubid>
                  <pubid idtype="pmpid">8711270</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Statistical power and design of focused clustering studies</p>
            </title>
            <aug>
               <au>
                  <snm>Waller</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Statistics in Medicine</source>
            <pubdate>1996</pubdate>
            <volume>15</volume>
            <fpage>765</fpage>
            <lpage>782</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/(SICI)1097-0258(19960415)15:7/9&lt;765::AID-SIM248&gt;3.0.CO;2-N</pubid>
                  <pubid idtype="pmpid">9132904</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>The geography of power: Statistical performance of tests of clusters and clustering in heterogeneous populations</p>
            </title>
            <aug>
               <au>
                  <snm>Waller</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hill</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rudd</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Statistics in Medicine</source>
            <pubdate>2006</pubdate>
            <volume>25</volume>
            <fpage>853</fpage>
            <lpage>865</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/sim.2418</pubid>
                  <pubid idtype="pmpid" link="fulltext">16453372</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <aug>
               <au>
                  <snm>Lawson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kleinman</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <cnm>Eds</cnm>
               </au>
            </aug>
            <source>Spatial and Syndromic Surveillance for Public Health</source>
            <publisher>Wiley</publisher>
            <pubdate>2005</pubdate>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Using software agents to preserve individual health data confidentiality in micro-scale geographic analyses</p>
            </title>
            <aug>
               <au>
                  <snm>Boulos</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Cai</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Padget</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Rushton</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Journal of Biomedical Informatics</source>
            <pubdate>2006</pubdate>
            <volume>39</volume>
            <fpage>160</fpage>
            <lpage>170</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jbi.2005.06.003</pubid>
                  <pubid idtype="pmpid" link="fulltext">16098819</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>The cost of obfuscation when reporting locations of cases in syndromic surveillance systems</p>
            </title>
            <aug>
               <au>
                  <snm>Jeffery</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ozonoff</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Forsberg</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Nuno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pagano</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Advances in Disease Surveillance</source>
            <pubdate>2006</pubdate>
            <volume>1</volume>
            <fpage>36</fpage>
         </bibl>
         <bibl id="B15">
            <title>
               <p>A novel, context-sensitive approach to anonymizing spatial surveillance data: impact on outbreak detection</p>
            </title>
            <aug>
               <au>
                  <snm>Cassa</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Grannis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Overhage</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mandl</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Advances in Disease Surveillance</source>
            <pubdate>2006</pubdate>
            <volume>1</volume>
            <fpage>10</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">16357353</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Privacy protection versus cluster detection in spatial epidemiology</p>
            </title>
            <aug>
               <au>
                  <snm>Olson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Grannis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mandl</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Am J Public Health</source>
            <pubdate>2006</pubdate>
            <volume>96</volume>
            <fpage>2002</fpage>
            <lpage>2008</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.2105/AJPH.2005.069526</pubid>
                  <pubid idtype="pmpid" link="fulltext">17018828</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>

