Geocoding police collision report data from California: a comprehensive approach

John M Bigham1, Thomas M Rice1,2*, Swati Pande1, Junhak Lee1, Shin Hyoung Park1, Nicolas Gutierrez1 and David R Ragland1,3

Author Affiliations

1 Safe Transportation Research & Education Center, University of California, Berkeley, 2614 Dwight Way #7374, Berkeley, CA 94720-7374, USA

2 Department of Environmental Health Sciences, University of California at Berkeley, Berkeley, CA, USA

3 Department of Epidemiology, University of California at Berkeley, Berkeley, CA, USA

International Journal of Health Geographics 2009, 8:72 doi:10.1186/1476-072X-8-72

Published: 29 December 2009

Abstract

Background

Collision geocoding is the process of assigning geographic descriptors, usually latitude and longitude coordinates, to a traffic collision record. On California police reports, collision location is recorded relative to a highway postmile marker or a street intersection. The objective of this study was to create a geocoded database of all police-reported fatal and severe injury collisions in the California Statewide Integrated Traffic Records System (SWITRS) for the years 1997-2006 for use by public agencies.

Results

Geocoding was completed in a multi-step process. First, pre-processing was performed using a scripting language to clean and standardize street name information. A state highway network with postmile values was then created using a custom tool written in Visual Basic for Applications (VBA) in ArcGIS software. Custom VBA functionality was also used to incorporate the offset direction and distance recorded for each collision. Intersection and address geocoding was performed using ArcGIS, the StreetMap Pro 2003 digital street network, and Google Earth Pro. A total of 142,007 fatal and severe injury collisions were identified in SWITRS. The geocoding match rate was 99.8% for postmile-coded collisions and 86% for intersection-coded collisions. The overall match rate was 91%.
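
As a rough illustration of two of the steps described above (street name standardization and applying an offset direction and distance), the following Python sketch may help. It is not the authors' code: the study used an unnamed scripting language and custom VBA tools in ArcGIS. The suffix table, the function names standardize_street and apply_offset, and the straight-line offset along a cardinal direction are all assumptions made for illustration; the abstract does not say whether offsets were measured along the street network or as a straight line.

```python
import math
import re

# Hypothetical abbreviation table; the abstract does not list the actual rules.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD",
            "ROAD": "RD", "DRIVE": "DR"}


def standardize_street(name: str) -> str:
    """Upper-case, drop punctuation, collapse whitespace, abbreviate the suffix."""
    tokens = re.sub(r"[^A-Za-z0-9 ]", " ", name).upper().split()
    if not tokens:
        return ""
    # Only the final token is treated as a street-type suffix.
    tokens[-1] = SUFFIXES.get(tokens[-1], tokens[-1])
    return " ".join(tokens)


# Bearings in degrees clockwise from north (assumed cardinal directions only).
BEARINGS = {"N": 0.0, "E": 90.0, "S": 180.0, "W": 270.0}
EARTH_RADIUS_FT = 20_902_231  # mean Earth radius (about 6,371 km) in feet


def apply_offset(lat: float, lon: float, direction: str, distance_ft: float):
    """Shift a reference point by a distance along a cardinal direction.

    Uses a local flat-earth approximation, which is reasonable only for
    short offsets; it ignores the street geometry entirely.
    """
    bearing = math.radians(BEARINGS[direction])
    dlat = distance_ft * math.cos(bearing) / EARTH_RADIUS_FT
    dlon = distance_ft * math.sin(bearing) / (
        EARTH_RADIUS_FT * math.cos(math.radians(lat)))
    return lat + math.degrees(dlat), lon + math.degrees(dlon)


if __name__ == "__main__":
    print(standardize_street("  Main   Street. "))
    print(apply_offset(37.0, -122.0, "N", 100.0))
```

In a workflow like the one described, the standardized names would be passed to the intersection geocoder, and the offset would be applied to the matched intersection or postmile location to estimate the collision point.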

Conclusions

The availability of geocoded collision data will be beneficial to clinicians, researchers, policymakers, and practitioners in the fields of traffic safety and public health. Potential uses of the data include studies of collision clustering on the highway system and examinations of the associations between collision occurrence and a variety of environmental and social characteristics, including housing and personal demographics, alcohol outlets, schools, and parks. The ability to build maps may be useful in planning and conducting research and in delivering information to both technical and non-technical audiences.