Species Distribution Model: Boosted Regression (Decision) Trees

Model Category: Hybrid

Model Description: Boosted regression trees combine regression trees with boosting. A regression tree partitions the space of explanatory variable values and produces a prediction for each region of that space. Boosting is an ensemble method that derives its answer from a large pool of models. Here, the pool is built sequentially: each new tree is fitted to the residuals left over from the trees already in the model. In other words, each tree attempts to explain the variation in the data that the existing trees leave unexplained.

Model Assumptions: If explanatory variables are highly collinear, the standard caveats about not being able to disentangle their effects apply; boosted regression trees will (semi-arbitrarily) choose one of the collinear variables to use. Standard assumptions about data accuracy also apply. Learning rate and tree complexity are both important tuning parameters that influence results. Appropriate values for them can be determined using a tuning set.

Model Response Data: Should cover the range of values you want to predict. For categorical responses (decision trees), this means there should be some data points from every class. For SDMs specifically, this means that both presence data and absence data are required. Boosted regression trees can also handle continuous response data, such as species abundances.

Model Explanatory Data: Can be either categorical or continuous. The modeling process should automatically weed out uninformative variables.

Model Links and Use with R: R gbm package, Introduction to Boosted Regression Trees, Another useful reference

Example Papers:
Elith, J., Leathwick, J. R. and Hastie, T. (2008), A working guide to boosted regression trees. Journal of Animal Ecology, 77: 802–813. doi: 10.1111/j.1365-2656.2008.01390.x
Veran, S., Piry, S., Ternois, V., Meynard, C. N., Facon, B. and Estoup, A. (2015), Modeling spatial expansion of invasive alien species: relative contributions of environmental and anthropogenic factors to the spreading of the harlequin ladybird in France. Ecography. doi: 10.1111/ecog.01389

Example with R: From the dismo boosted regression tree vignette.

Full model:
Reduced model (bio1, bio5, and bio12 as predictors):
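As a side note to the Model Description, the idea of growing each new tree on the residuals of the existing ones can be sketched in a few lines. The block below is a minimal illustration in plain Python, not the gbm or dismo code behind the models above. It assumes squared-error loss, a single predictor, and depth-one trees (stumps); the toy data and every function name (fit_stump, stump_value, boost, predict) are hypothetical.

```python
# Toy boosting sketch: depth-one regression trees (stumps) fitted to residuals.
# Assumptions: squared-error loss, one predictor, made-up data.
# All function names here are hypothetical and exist only for illustration.

def fit_stump(x, residuals):
    """Return (threshold, left_mean, right_mean) for the split with the lowest squared error."""
    values = sorted(set(x))
    if len(values) < 2:
        raise ValueError("need at least two distinct predictor values to split")
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def stump_value(stump, xi):
    """Prediction of a single stump for one predictor value."""
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def boost(x, y, n_trees=100, learning_rate=0.1):
    """Fit stumps one after another, each to the residuals of the current ensemble."""
    baseline = sum(y) / len(y)  # start from the mean response
    fitted = [baseline] * len(y)
    stumps = []
    mse = [sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / len(y)]
    for _ in range(n_trees):
        # Residuals are the part of the response the ensemble has not explained yet.
        residuals = [yi - fi for yi, fi in zip(y, fitted)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Add only a fraction (the learning rate) of the new stump's prediction.
        fitted = [fi + learning_rate * stump_value(stump, xi)
                  for xi, fi in zip(x, fitted)]
        mse.append(sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / len(y))
    model = {"baseline": baseline, "stumps": stumps, "learning_rate": learning_rate}
    return model, mse


def predict(model, xi):
    """Sum the baseline and the shrunken contributions of all stumps."""
    total = model["baseline"]
    for stump in model["stumps"]:
        total += model["learning_rate"] * stump_value(stump, xi)
    return total


# Made-up example: a response that is high only at intermediate predictor values.
x = list(range(20))
y = [1.0 if 7 <= xi <= 14 else 0.0 for xi in x]

model, history = boost(x, y, n_trees=100, learning_rate=0.1)
print("training MSE before boosting:", history[0])
print("training MSE after 100 trees:", history[-1])
print("prediction at x = 10:", predict(model, 10))
print("prediction at x = 0:", predict(model, 0))
```

Lowering learning_rate shrinks each tree's contribution, so more trees are needed to reach the same fit; this is the kind of trade-off a tuning set helps resolve. Real boosted regression tree fits also vary tree complexity and often subsample the data at each step, neither of which this sketch attempts.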
4 Comments
Rose
5/23/2016 04:38:00 am
Hi. I would like to know how I can generate the presence-absence data for the BRT.
spatial ecology & R
8/12/2016 11:28:05 am
The dismo R package contains a dataset on a sloth species (Bradypus variegatus); see https://cran.r-project.org/web/packages/dismo/vignettes/sdm.pdf
sb charles
1/2/2017 02:56:04 am
Hi, I need your help. I have data in .csv format and would like to know how I can bring it in for BRT analysis.
Nick
9/24/2017 01:39:04 pm
Do the color ramp and lat/lon ticks come up when 'plot' is called?
Spatial Ecology @ MSU
R code compiled by the Zarnetske Spatial & Community Ecology Lab and students in MSU's Spatial Ecology graduate course (FOR870/FW870).