Species Distribution Model: Classification and Regression Trees
Model Category: Classification and Regression Tree (CART)
Model Description: Classification and regression trees work by successively splitting the data at nodes so that the resulting groups are more homogeneous with respect to the response variable. Splits can be chosen by a variety of criteria, including minimizing the sum of squares about the group means for regression trees and minimizing the Shannon-Wiener diversity index (entropy) for categorical responses. Each split is often summarized by the proportion of the total sum of squares it explains; for classification trees, the misclassification rate is used instead.
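The split search described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the implementation used by any particular R package; the function names and the elevation, abundance, and presence values are all hypothetical. It tries every threshold on one continuous predictor and keeps the one that minimizes the summed impurity of the two child groups, using the sum of squares for a regression tree and the Shannon-Wiener index (scaled by group size) for a classification tree.

```python
import math
from collections import Counter

def sum_of_squares(values):
    """Sum of squared deviations about the mean (regression criterion)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def entropy_total(labels):
    """Group size times the Shannon-Wiener index (classification criterion)."""
    n = len(labels)
    return -sum(c * math.log(c / n) for c in Counter(labels).values())

def best_split(x, y, impurity):
    """Return (threshold, child_impurity) for the best binary split on x."""
    pairs = sorted(zip(x, y))
    best_threshold, best_cost = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical predictor values
        left = [resp for _, resp in pairs[:i]]
        right = [resp for _, resp in pairs[i:]]
        cost = impurity(left) + impurity(right)
        if cost < best_cost:
            # Place the threshold midway between the two neighbouring values.
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_cost = cost
    return best_threshold, best_cost

def misclassification_rate(x, y, threshold):
    """Share of points not in the majority class of their side of the split."""
    left = [lab for xv, lab in zip(x, y) if xv < threshold]
    right = [lab for xv, lab in zip(x, y) if xv >= threshold]
    errors = sum(len(side) - Counter(side).most_common(1)[0][1]
                 for side in (left, right) if side)
    return errors / len(y)

# Hypothetical survey data: one predictor, two kinds of response.
elevation = [100, 200, 300, 400, 500, 600]
abundance = [2.0, 3.0, 2.5, 9.0, 10.0, 9.5]
presence = ["absent", "absent", "absent", "present", "present", "present"]

# Regression tree: minimize the sum of squares within the two groups.
reg_threshold, reg_cost = best_split(elevation, abundance, sum_of_squares)
explained = 1 - reg_cost / sum_of_squares(abundance)

# Classification tree: minimize the size-weighted Shannon-Wiener index.
cls_threshold, cls_cost = best_split(elevation, presence, entropy_total)
error_rate = misclassification_rate(elevation, presence, cls_threshold)

print(reg_threshold, round(explained, 3), cls_threshold, error_rate)
```

A full tree repeats this search within each resulting group, and across every available predictor, until a stopping rule is met.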
When “growing” the tree, nodes continue to be added until the improvement from an additional node is less than some specified threshold. This leaves the final tree structure sensitive to an arbitrarily specified cutoff and can result in overgrown or overly simplified trees. A common remedy is to set a liberal threshold, grow a large tree, and then prune it back: find the best explanatory tree of each size, then use cross-validation to find the tree size with the best predictive power.
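The idea of choosing tree size by cross-validation can be illustrated with a small sketch. It makes several simplifying assumptions: it is plain Python rather than R, it uses a single predictor, it uses maximum depth as a stand-in for tree size, and it regrows the tree at each depth instead of pruning a large tree back. The data are simulated and every name is hypothetical.

```python
import random

def sse(ys):
    """Sum of squared deviations about the mean."""
    mean = sum(ys) / len(ys)
    return sum((v - mean) ** 2 for v in ys)

def grow(data, depth):
    """Grow a regression tree on (x, y) pairs with at most `depth` split levels."""
    ys = [y for _, y in data]
    leaf = {"mean": sum(ys) / len(ys)}
    if depth == 0 or len(data) < 2:
        return leaf
    data = sorted(data)
    best = None
    for i in range(1, len(data)):
        if data[i][0] == data[i - 1][0]:
            continue  # no threshold fits between identical x values
        cost = sse([y for _, y in data[:i]]) + sse([y for _, y in data[i:]])
        if best is None or cost < best[0]:
            best = (cost, i)
    if best is None:
        return leaf
    i = best[1]
    return {"threshold": (data[i - 1][0] + data[i][0]) / 2,
            "left": grow(data[:i], depth - 1),
            "right": grow(data[i:], depth - 1)}

def predict(tree, x):
    """Follow the splits down to a leaf and return its mean response."""
    while "threshold" in tree:
        tree = tree["left"] if x < tree["threshold"] else tree["right"]
    return tree["mean"]

def cv_error(data, depth, k=5):
    """Mean squared prediction error from k-fold cross-validation."""
    total = 0.0
    for fold in range(k):
        test = data[fold::k]  # every k-th point is held out
        train = [p for j, p in enumerate(data) if j % k != fold]
        tree = grow(train, depth)
        total += sum((y - predict(tree, x)) ** 2 for x, y in test)
    return total / len(data)

# Simulated data: a three-level step pattern plus noise (an assumption).
random.seed(1)
data = []
for x in range(60):
    signal = 2.0 if x < 20 else (8.0 if x < 40 else 5.0)
    data.append((x, signal + random.gauss(0, 1)))

# Compare candidate tree sizes by their cross-validated error.
errors = {depth: cv_error(data, depth) for depth in range(6)}
best_depth = min(errors, key=errors.get)
print(best_depth, errors)
```

The depth with the lowest cross-validated error is the one to keep. A tree with no splits cannot represent the step pattern, so its held-out error is far higher than that of a tree with a few splits.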
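In regression trees, splits are usually chosen by a least-squares criterion. Unlike parametric least-squares regression, however, trees do not assume normally distributed predictors, and the fitted tree is unchanged by monotonic transformations of continuous predictors. The sum-of-squares criterion works best when the response is roughly symmetric with similar variance across groups, so strongly skewed responses or outliers may warrant transforming the response.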
Model Response Data:
These algorithms can work with a variety of response variables, including presence-absence, abundance, and survival data. Presence-only data will not work, because a tree needs at least two response classes (or a varying response) to split on.
Model Explanatory Data:
These models are well suited to both categorical and continuous explanatory variables, and a single tree can mix the two. The difference between classification trees and regression trees lies in the response: classification trees model a categorical response, and regression trees model a continuous response.
However, as the criteria used to create nodes are specific to the response type, a single tree models either a categorical or a continuous response, not both.
Model Links and Use with R:
R package tree: Classification and regression trees
R package rpart: Recursive partitioning for classification, regression and survival trees
De’ath & Fabricius (2000) give a great introduction to CARTs for ecologists and model coral distributions using the method.
De’ath, G., and K. E. Fabricius. 2000. Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81:3178–3192.
Lawler et al. (2006) review several methods for predicting distribution shifts, including regression trees.
Lawler, J. J., D. White, R. P. Neilson, and A. R. Blaustein. 2006. Predicting climate-induced range shifts: model differences and model reliability. Global Change Biology 12.
Example with R: