
Modeling larval malaria vector habitat locations using landscape features and cumulative precipitation measures

Robert S McCann1,7*, Joseph P Messina2, David W MacFarlane3, M Nabie Bayoh4, John M Vulule4, John E Gimnig5 and Edward D Walker6

Author Affiliations

1 Department of Entomology, Michigan State University, East Lansing, MI, USA

2 Department of Geography, Michigan State University, East Lansing, MI, USA

3 Department of Forestry, Michigan State University, East Lansing, MI, USA

4 Centre for Global Health Research, Kenya Medical Research Institute/Centers for Disease Control and Prevention, Kisumu, Kenya

5 Division of Parasitic Diseases and Malaria, Centers for Disease Control and Prevention, Atlanta, GA, USA

6 Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA

7 Current address: Laboratory of Entomology, Wageningen University and Research Centre, PO Box 8031, Wageningen 6700 EH, Netherlands


International Journal of Health Geographics 2014, 13:17  doi:10.1186/1476-072X-13-17

Published: 6 June 2014



Predictive models of malaria vector larval habitat locations may provide a basis for understanding the spatial determinants of malaria transmission.


We used four landscape variables (topographic wetness index [TWI], soil type, land use-land cover, and distance to stream) and accumulated precipitation to model larval habitat locations in a region of western Kenya through two methods: logistic regression and random forest. Additionally, we used two separate data sets to account for variation in habitat locations across space and over time.
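As an illustration of the first landscape variable, the sketch below computes TWI under the commonly used definition TWI = ln(a / tan β), where a is the upslope contributing area per unit contour width and β is the local slope angle. The abstract does not state which formulation or software the authors used, so the function name, the minimum-slope floor, and the example cell values are all assumptions made for illustration only.

```python
import math


def topographic_wetness_index(specific_catchment_area_m, slope_degrees,
                              min_slope_degrees=0.1):
    """Return TWI = ln(a / tan(beta)) for a single grid cell.

    specific_catchment_area_m: upslope contributing area per unit
        contour width (metres); must be positive.
    slope_degrees: local slope angle in degrees.
    min_slope_degrees: hypothetical floor (an assumption of this sketch)
        that avoids division by zero on perfectly flat cells.
    """
    if specific_catchment_area_m <= 0:
        raise ValueError("specific catchment area must be positive")
    beta = math.radians(max(slope_degrees, min_slope_degrees))
    return math.log(specific_catchment_area_m / math.tan(beta))


# Two made-up cells, not taken from the study data.
valley_floor = topographic_wetness_index(5000.0, 1.0)   # large area, gentle slope
hillside = topographic_wetness_index(50.0, 15.0)        # small area, steep slope
print(f"valley floor TWI: {valley_floor:.2f}")
print(f"hillside TWI:     {hillside:.2f}")
```

Under this definition, a gently sloping cell that drains a large area receives a higher TWI than a steep cell draining a small area, which is why TWI is often used as a proxy for where water tends to accumulate.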


Larval habitats were more likely to be present in locations with a lower slope to contributing area ratio (i.e. a higher TWI), closer to streams, with agricultural land use relative to nonagricultural land use, and in friable clay/sandy clay loam soil and firm, silty clay/clay soil relative to friable clay soil. The probability of larval habitat presence increased with increasing accumulated precipitation. The random forest models were more accurate than the logistic regression models, especially when accumulated precipitation was included to account for seasonal differences in precipitation. The most accurate models for the two data sets had area under the curve (AUC) values of 0.864 and 0.871, respectively. TWI, distance to the nearest stream, and precipitation had the greatest mean decrease in Gini impurity in these models.
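The two metrics reported above can be made concrete with a small sketch. It computes AUC as the probability that a randomly chosen habitat location receives a higher predicted score than a randomly chosen non-habitat location, and it computes the decrease in Gini impurity for a single split on one variable. A random forest's mean decrease in Gini aggregates such decreases over every split on a variable across all trees; only one split is shown here. All function names, labels, scores, and the 100 m threshold are hypothetical and are not drawn from the study.

```python
def auc(labels, scores):
    """AUC by pairwise comparison of positives and negatives (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini_impurity(labels):
    """Gini impurity of a set of binary (0/1) labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def gini_decrease(labels, feature, threshold):
    """Impurity decrease from splitting on feature <= threshold."""
    left = [y for y, x in zip(labels, feature) if x <= threshold]
    right = [y for y, x in zip(labels, feature) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


# Made-up example: 1 = larval habitat present, 0 = absent.
habitat = [1, 1, 1, 0, 0, 0]
predicted_prob = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]        # hypothetical model output
distance_to_stream_m = [10, 20, 300, 250, 400, 500]    # hypothetical predictor

model_auc = auc(habitat, predicted_prob)
split_gain = gini_decrease(habitat, distance_to_stream_m, threshold=100)
print(f"AUC: {model_auc:.3f}")
print(f"Gini decrease for split at 100 m: {split_gain:.3f}")
```

The pairwise AUC calculation is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based formulation would be preferable for large data sets.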


This study demonstrates the usefulness of random forest models for larval malaria vector habitat modeling. TWI and distance to the nearest stream were the two most important landscape variables in these models. Including accumulated precipitation in our models improved the accuracy of larval habitat location predictions by accounting for seasonal variation in precipitation. Finally, the sampling strategy employed here for model parameterization could serve as a framework for creating predictive larval habitat models to assist in larval control efforts.

Random forest; Logistic regression; Anopheles gambiae; Larval habitats; Predictive models