Species Distribution Model: Classification and Regression Trees

Model Category: Classification and Regression Tree (CART)

Model Description: Classification and regression trees work by successively splitting the data at nodes, using the explanatory variables to create groups that are more homogeneous with respect to the response. Splits can be chosen by a variety of criteria, including minimizing the sum of squares about the group means for regression trees and minimizing the Shannon-Wiener diversity index for categorical responses. Each split is often summarized by the proportion of the total sum of squares it explains; for classification trees, the misclassification rate is used instead. When “growing” the tree, nodes continue to be added until the improvement from an additional node falls below some specified threshold. This leaves the final tree structure sensitive to an arbitrarily specified cutoff and can result in overgrown or overly simplified trees. A common remedy is to set a liberal threshold and then prune the tree, for example by finding the best explanatory tree of each size and then using cross-validation to find the tree size with the best predictive power.

Model Assumptions: In regression trees, splits are often chosen by a least-squares criterion, so these trees are most reliable when the response roughly meets the usual least-squares conditions, such as approximately normal errors and homogeneity of variances between data points. The predictors themselves do not need to be normally distributed, because splits depend only on the ordering of their values.

Model Response Data: These algorithms can work with a variety of response variables, including presence-absence, abundance, and survival data, but presence-only data will not work.

Model Explanatory Data: These models are well suited to using both categorical and continuous explanatory variables, and a single tree can combine the two. The difference between classification trees and regression trees lies in the response: classification trees model a categorical response and regression trees model a continuous response. However, because the splitting criterion is specific to the type of response, a single tree models either a categorical or a continuous response, not both.

Model Links and Use with R:
R package tree: Classification and regression trees
R package rpart: Recursive partitioning for classification, regression and survival trees
https://cran.r-project.org/web/packages/rpart/index.html

Example Papers:
De’ath & Fabricius (2000) give a great introduction to CARTs for ecologists and model coral distributions using the method.
De’ath, G. and Fabricius, K. E. (2000), Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology, 81:3178–3192.
Lawler et al. (2006) review several methods for predicting distribution shifts, including regression trees.
Lawler, J. J., White, D., Neilson, R. P. and Blaustein, A. R. (2006), Predicting climate-induced range shifts: model differences and model reliability. Global Change Biology, 12.

Example with R:
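The grow-then-prune workflow from the Model Description can be sketched with rpart roughly as follows. This is a minimal illustration under stated assumptions, not a recommended analysis: the kyphosis data set that ships with rpart stands in for presence-absence data, and the control values (cp, minsplit, xval) are arbitrary choices made only to grow a deliberately large tree.

```r
library(rpart)

# Assumption: kyphosis (bundled with rpart) is used as a stand-in for
# presence-absence data; Kyphosis is a two-level factor (absent/present).
data(kyphosis)
set.seed(1)  # cross-validation folds are random

# Grow a large classification tree using a liberal stopping threshold.
# The control values below are illustrative, not tuned.
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data    = kyphosis,
             method  = "class",
             control = rpart.control(cp = 0.001, minsplit = 5, xval = 10))

# Table of tree sizes with their cross-validated error (column "xerror").
printcp(fit)

# Choose the complexity parameter with the lowest cross-validated error
# and prune the tree back to that size.
best_cp <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]
pruned  <- prune(fit, cp = best_cp)

print(pruned)

# Predicted class for each observation from the pruned tree.
pred <- predict(pruned, type = "class")
table(observed = kyphosis$Kyphosis, predicted = pred)
```

For a continuous response such as abundance, the same steps apply with method = "anova", which uses the sum-of-squares criterion instead of a classification criterion.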
