Case Study 1: Predicting distributions of known and unknown species in Madagascar
This online guide has been adapted from a Network of Conservation Educators and Practitioners (NCEP) synthesis document developed by Richard G. Pearson, titled Species' distribution modeling for conservation educators and practitioners. The document, in its entirety, is also available for download in PDF format (677K).

Based on Raxworthy et al. 2003.

Our knowledge of the identity and distribution of species on Earth is remarkably poor, with many species yet to be described and catalogued. This problem has two key elements, which may be termed the ‘Linnean’ and ‘Wallacean’ shortfalls (Whittaker et al. 2005). The Linnean shortfall refers to our lack of knowledge of how many, and what kind, of species exist. The term is a reference to Carl Linnaeus, who laid the foundations of modern taxonomy in the 18th century. The Linnean shortfall thus concerns our highly incomplete knowledge of the diversity of life that exists on Earth.

The Wallacean shortfall refers to our inadequate knowledge of the distributions of species. This term is a reference to Alfred Russel Wallace who, as well as contributing to the early development of evolutionary theory, was an expert on the geographical distribution of species (he is sometimes referred to as ‘the father of biogeography’). The Wallacean shortfall thus refers to our poor knowledge of the biogeography of most species. Species distribution modeling offers a powerful tool to address both the Linnean and Wallacean shortfalls, as demonstrated in a study by Raxworthy et al. (2003).

Raxworthy et al. (2003) modeled the distributions of 11 species of chameleon that are endemic to the island of Madagascar. They used species occurrence records from recent surveys and from older specimens deposited in the collections of natural history museums. No observed absence records were available for building the models. Environmental variables were derived from remote sensing data, from a digital elevation model, and from weather station data that had been interpolated to a grid (i.e. converted from point vector to raster data). In all, 25 GIS layers were used in the modeling, including environmental variables describing temperature, precipitation, land cover and elevation. All analyses were undertaken at a resolution of 1 km². The modeling algorithm used was GARP (see GARP), which generated an output ranging from 0 to 10 in increments of 1. Two alternative thresholds of occurrence were used: threshold = 1 (termed by the authors “any model predicts”), and threshold = 10 (termed “all models predict”).
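The two thresholds can be illustrated with a minimal Python sketch. It assumes that each grid cell's score counts how many of ten model runs predict presence there, which is consistent with the authors' terms but is not spelled out above. The grid values and the function names `apply_threshold` and `predicted_fraction` are hypothetical, chosen only for illustration.

```python
from typing import List


def apply_threshold(agreement: List[List[int]], threshold: int) -> List[List[bool]]:
    """Mark a cell as predicted present when its score is >= threshold."""
    if not 1 <= threshold <= 10:
        raise ValueError("threshold must be between 1 and 10")
    return [[score >= threshold for score in row] for row in agreement]


def predicted_fraction(presence: List[List[bool]]) -> float:
    """Fraction of grid cells predicted present."""
    cells = [cell for row in presence for cell in row]
    return sum(cells) / len(cells)


# Hypothetical 3 x 4 grid of scores (0 to 10), not data from the study.
agreement = [
    [0, 1, 4, 10],
    [0, 3, 10, 10],
    [0, 0, 7, 9],
]

any_model = apply_threshold(agreement, 1)    # "any model predicts"
all_models = apply_threshold(agreement, 10)  # "all models predict"

print(predicted_fraction(any_model))
print(predicted_fraction(all_models))
```

The lower threshold yields the more inclusive map and the higher threshold the more conservative one, so the choice trades a lower risk of missing occupied sites against a smaller predicted area.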

Predictive performance of the models was first evaluated by splitting the available data into two parts, 50% for calibrating the model and 50% for testing the model. The authors calculated the number of test localities at which the species was correctly predicted to be present and tested the statistical significance of the results using a chi-square test. Performance of the 11 models was generally good, with overall prediction success as high as 83%. Predictions usually were better than random. A second evaluation tested model performance using independent test data from herpetological surveys undertaken at 11 sites after the models had been built. In this case, model evaluation was based on both presence and absence records, since surveys were sufficiently thorough that detection probability was high. The success of these predictions was more than 70% and levels of statistical significance were uniformly high.
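The first, presence-only evaluation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' exact procedure: it assumes the chance expectation for a correctly predicted test locality equals the fraction of the study area predicted present, and it uses a chi-square statistic with one degree of freedom. The counts, the area fraction, and the function names are hypothetical.

```python
import math
from typing import Tuple


def prediction_success(n_correct: int, n_test: int) -> float:
    """Proportion of test localities correctly predicted as present."""
    return n_correct / n_test


def chi_square_vs_random(n_correct: int, n_test: int,
                         area_fraction: float) -> Tuple[float, float]:
    """Compare observed hits and misses with those expected by chance.

    area_fraction is the (assumed) proportion of the study area predicted
    present; a random prediction covering that area would capture about
    that share of the test localities.
    """
    if not 0 < area_fraction < 1:
        raise ValueError("area_fraction must lie strictly between 0 and 1")
    expected_hits = n_test * area_fraction
    expected_misses = n_test - expected_hits
    observed_misses = n_test - n_correct
    chi2 = ((n_correct - expected_hits) ** 2 / expected_hits
            + (observed_misses - expected_misses) ** 2 / expected_misses)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical numbers: 16 of 20 test localities fall in predicted-present
# cells, and 40% of the study area is predicted present.
success = prediction_success(16, 20)
chi2, p_value = chi_square_vs_random(16, 20, 0.4)
print(success, chi2, p_value)
```

The statistic alone does not indicate direction, so a model counts as better than random only if the result is significant and the observed number of correct predictions also exceeds the chance expectation.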

Raxworthy et al. (2003) thus demonstrated the potential for species’ distribution models to be used to guide new field surveys toward areas in which the probability of species presence was high. This approach takes advantage of the type of model prediction illustrated by area 2 in Figure 3: the model identifies an area that is environmentally similar to where the species has already been found, but for which no occurrence data are available. The models can thus help to address the Wallacean shortfall, by improving our knowledge of the distributions of known species.

Raxworthy et al. (2003) also demonstrated that the models can help to address the Linnean shortfall by guiding field surveys toward areas where species new to science are most likely to be discovered. In this case the approach makes use of the type of model prediction illustrated by area 3 in Figure 3: areas are identified that are unoccupied by the species being modeled, but where closely related species that occupy similar environmental space are most likely to be found. By surveying sites identified by the distribution models for known species, Raxworthy et al. (2003) discovered seven new species, a number considerably greater than would usually be expected from similar survey effort across a less-targeted area.