Do Random Forest-Driven Climate Envelope Models Require Variable Selection? A Case Study on Crustulina guttata (Theridiidae: Araneae)

基于随机森林的气候包络模型是否需要变量选择?以斑点小蜘蛛(球蛛科:蜘蛛目)为例

阅读:1

Abstract

Climate Envelope Models (CEMs) commonly employ 19 bioclimatic variables to predict species distributions, yet selecting which variables to include remains a critical challenge. Although it seems logical to select ecologically relevant variables, the biological responses of many target species are poorly understood. Random Forest (RF), a popular method in CEMs, can effectively handle correlated and nonlinear variables. In light of these strengths, this study explores the full model hypothesis, which involves using all 19 bioclimatic variables in an RF model, using Crustulina guttata (Theridiidae: Araneae) as a test case. Four model variants-a simplified model with two variables, an ecologically selected model with seven variables, a statistically selected model with ten variables, and a full model with nineteen variables-were compared against a thousand randomly assembled models with matching variable counts. All models achieved high performance, though results varied based on the number of variables employed. Notably, the full model consistently produced stronger predictions than models with fewer variables. Moreover, specifying particular variables did not yield a significant advantage over random selections of equally sized sets, indicating that omitting variables may risk the loss of important information. Although the final model suggests that C. guttata may have dispersed beyond its native European range through artificial means, this study examined only a single species. Thus, caution is warranted in generalizing these findings, and additional research is needed to determine whether the full model hypothesis extends to other taxa and environmental contexts. In scenarios where ecological knowledge is limited, however, using all available variables in an RF model may preserve potentially significant predictors and enhance predictive accuracy.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。