Species Distribution Model: Boosted Regression (Decision) Trees

Model Category: Hybrid

Model Description: Boosted regression trees combine regression trees with boosting. A regression tree partitions the space of explanatory variable values and produces a prediction for each region of that space. Boosting is an ensemble method that combines the predictions of a large pool of models. Here the pool is built sequentially: each new tree is fit to the residuals left over from the trees already in the model. In other words, each new tree attempts to explain variation in the data that the existing trees leave unexplained.

Model Assumptions: If explanatory variables are highly collinear, the standard caveats about not being able to disentangle their effects apply. Boosted regression trees will (semi-arbitrarily) choose one of the collinear variables to use. Standard assumptions about data accuracy also apply. Learning rate and tree complexity are both important tuning parameters that will influence results. Appropriate values for them can be determined using a tuning set.

Model Response Data: Should cover the range of values you want to predict. In the case of categorical values (decision trees), this means that there should be some data points from all classes. In the case of SDMs specifically, this means that both presence data and absence data are required. Boosted regression trees can also handle continuous response data, such as species abundances.

Model Explanatory Data: Can be either categorical or continuous. The modeling process should automatically weed out uninformative variables.

Model Links and Use with R: R gbm package, Introduction to Boosted Regression Trees, Another useful reference

Example Papers:

Elith, J., Leathwick, J. R. and Hastie, T. (2008), A working guide to boosted regression trees. Journal of Animal Ecology, 77: 802–813. doi: 10.1111/j.1365-2656.2008.01390.x

Veran, S., Piry, S., Ternois, V., Meynard, C. N., Facon, B. and Estoup, A. (2015), Modeling spatial expansion of invasive alien species: relative contributions of environmental and anthropogenic factors to the spreading of the harlequin ladybird in France. Ecography. doi: 10.1111/ecog.01389

Example with R: From the dismo boosted regression tree vignette.

Full model:
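The vignette's listing and output are not reproduced here. As a rough stand-in, the sketch below fits a model on all available predictors using the gbm package directly. The data are simulated, and the column names (bio1, bio5, bio6, bio12, bio16, pa) are hypothetical placeholders, not the vignette's variables. In gbm, `shrinkage` plays the role of the learning rate and `interaction.depth` the role of tree complexity. The last 20% of rows are held out as a tuning set for choosing the number of trees.

```r
library(gbm)

set.seed(42)
n <- 1000

# Simulated, hypothetical bioclim-style predictors (NOT the vignette's data)
dat <- data.frame(
  bio1  = rnorm(n, mean = 20,   sd = 5),
  bio5  = rnorm(n, mean = 30,   sd = 4),
  bio6  = rnorm(n, mean = 10,   sd = 4),
  bio12 = rnorm(n, mean = 1500, sd = 400),
  bio16 = rnorm(n, mean = 600,  sd = 150)
)

# Simulated presence (1) / absence (0), driven by bio1, bio5 and bio12 only
p_presence <- plogis(0.25 * (dat$bio1 - 20) -
                     0.20 * (dat$bio5 - 30) +
                     0.002 * (dat$bio12 - 1500))
dat$pa <- rbinom(n, size = 1, prob = p_presence)

# Full model: every predictor column enters the formula
full_fit <- gbm(
  pa ~ .,
  data              = dat,
  distribution      = "bernoulli",  # binary presence/absence response
  n.trees           = 2000,
  shrinkage         = 0.01,         # learning rate
  interaction.depth = 3,            # tree complexity
  bag.fraction      = 0.5,
  train.fraction    = 0.8           # first 80% of rows fit, last 20% tune
)

# Number of trees that minimises deviance on the held-out rows
best_full <- gbm.perf(full_fit, method = "test", plot.it = FALSE)

# Relative influence of each predictor at that number of trees
rel_inf <- summary(full_fit, n.trees = best_full, plotit = FALSE)
print(rel_inf)

# Predicted probability of presence
pred <- predict(full_fit, newdata = dat, n.trees = best_full, type = "response")
```

Predictors that carry no signal would be expected to show low relative influence in `rel_inf`, which is one way to decide which variables to drop.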
Reduced model (bio1, bio5, and bio12 as predictors):
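The original listing for this model is likewise not reproduced here. A minimal sketch of a reduced fit follows, restricted to bio1, bio5, and bio12 and using the gbm package. The data are simulated, bio6 and pa are hypothetical column names, and the parameter values are illustrative assumptions, not the settings used in the post.

```r
library(gbm)

set.seed(42)
n <- 1000

# Simulated stand-in data (hypothetical values, NOT the post's data)
dat <- data.frame(
  bio1  = rnorm(n, mean = 20,   sd = 5),
  bio5  = rnorm(n, mean = 30,   sd = 4),
  bio6  = rnorm(n, mean = 10,   sd = 4),   # uninformative by construction
  bio12 = rnorm(n, mean = 1500, sd = 400)
)
p_presence <- plogis(0.25 * (dat$bio1 - 20) -
                     0.20 * (dat$bio5 - 30) +
                     0.002 * (dat$bio12 - 1500))
dat$pa <- rbinom(n, size = 1, prob = p_presence)

# Reduced model: only the three named predictors enter the formula
reduced_fit <- gbm(
  pa ~ bio1 + bio5 + bio12,
  data              = dat,
  distribution      = "bernoulli",
  n.trees           = 2000,
  shrinkage         = 0.01,   # learning rate
  interaction.depth = 3,      # tree complexity
  bag.fraction      = 0.5,
  train.fraction    = 0.8     # last 20% of rows held out for tuning
)

# Best number of trees and the held-out deviance at that point
best_reduced <- gbm.perf(reduced_fit, method = "test", plot.it = FALSE)
reduced_dev  <- reduced_fit$valid.error[best_reduced]
print(c(trees = best_reduced, validation_deviance = reduced_dev))
```

Comparing this validation deviance with that of a model fit to all predictors on the same split gives a rough sense of how much the dropped variables contributed.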
Spatial Ecology @ MSU: R code compiled by the Zarnetske Spatial & Community Ecology Lab and students in MSU's Spatial Ecology graduate course (FOR870/FW870).