Geographic bias related to geocoding in epidemiologic studies
* Corresponding author: M Norman Oliver mno3p@virginia.edu
1 Department of Family Medicine, University of Virginia, Charlottesville, VA, USA
2 Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
3 Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, MD, USA
International Journal of Health Geographics 2005, 4:29 doi:10.1186/1476-072X-4-29
Published: 10 November 2005
Abstract
Background
This article describes the geographic bias that arises in GIS analyses when missing geocodes make the data unrepresentative, using as an example a spatial analysis of prostate cancer incidence among whites and African Americans in Virginia, 1990–1999. Statistical tests for clustering were performed and the resulting clusters were mapped. Patterns of missing census tract identifiers among the cases were examined with generalized linear regression models.
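The regression step can be illustrated with a minimal sketch. The abstract does not state the model family, link function, or software used, so this example substitutes ordinary least squares, the simplest special case of a generalized linear model. The county values are synthetic, and the names `fit_ols`, `county_covariates`, and `percent_missing` are hypothetical.

```python
# Illustrative sketch only: all county values below are synthetic, not study data.
# Ordinary least squares stands in for the unspecified generalized linear model.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitute from the last row upward.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(predictors, response):
    """Fit response = b0 + b1*x1 + ... by solving the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical counties: (percent over age 64, percent without a high school education)
county_covariates = [
    (10.0, 15.0), (12.0, 22.0), (15.0, 20.0),
    (18.0, 30.0), (20.0, 28.0), (22.0, 35.0),
]
# Synthetic response built from known coefficients so the fit can be checked.
percent_missing = [2.0 + 0.5 * age + 1.0 * edu for age, edu in county_covariates]

intercept, beta_age, beta_edu = fit_ols(county_covariates, percent_missing)
```

Because the synthetic response is generated from known coefficients, the fit should recover them up to floating-point error. A positive coefficient on each covariate corresponds to the kind of independent association with missing geocodes that the study examines.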
Results
The county of residence was known for all cases, and 26,338 (74%) of these cases were geocoded successfully to census tracts. Cluster maps showed markedly different patterns depending on whether all cases were used or only those geocoded to the census tract. Multivariate regression analysis showed that, in the most rural counties (where the missing data were concentrated), the percentage of a county's population over age 64 and the percentage with less than a high school education were each independently associated with a higher percentage of missing geocodes.
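The effect of uneven geocoding completeness on apparent incidence can be sketched with a small numeric example. The two counties and all counts below are synthetic, chosen only to make the arithmetic easy to follow, and the names `counties`, `rate_per_100k`, and `summary` are hypothetical.

```python
# Synthetic counts for illustration only; these are not data from the study.
counties = {
    "urban_a": {"population": 100_000, "cases": 200, "geocoded": 190},
    "rural_b": {"population": 20_000, "cases": 40, "geocoded": 20},
}


def rate_per_100k(count, population):
    """Crude incidence rate per 100,000 residents."""
    return 100_000 * count / population


summary = {}
for name, c in counties.items():
    summary[name] = {
        # Share of cases that could be placed in a census tract.
        "completeness": c["geocoded"] / c["cases"],
        # Rate using every case with a known county of residence.
        "rate_all": rate_per_100k(c["cases"], c["population"]),
        # Rate using only the cases that were geocoded to a tract.
        "rate_geocoded": rate_per_100k(c["geocoded"], c["population"]),
    }

for name, s in summary.items():
    print(name, s)
```

In this toy setup both counties have the same rate when all cases are counted, but the rural county appears to have half the incidence when only geocoded cases are used. That gap comes entirely from the difference in completeness, not from any difference in disease occurrence.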
Conclusion
We found statistically significant pattern differences resulting from spatially non-random differences in geocoding completeness across Virginia. Appropriate interpretation of maps, therefore, requires an understanding of this phenomenon, which we call "cartographic confounding."