Feature-ranked self-growing forest: a tree ensemble based on structure diversity for classification and regression.

Bibliographic Details
Title: Feature-ranked self-growing forest: a tree ensemble based on structure diversity for classification and regression.
Authors: Carino-Escobar, Ruben I.; Alonso-Silverio, Gustavo A.; Alarcón-Paredes, Antonio; Cantillo-Negrete, Jessica (jcantillo@inr.gob.mx)
Source: Neural Computing & Applications. May 2023, Vol. 35 Issue 13, p9285-9298. 14p.
Subject Terms: *RANDOM forest algorithms, *REGRESSION trees, *TREE growth, *COMPUTATIONAL complexity, *MACHINE learning, *PYTHON programming language
Abstract: Tree ensemble algorithms, such as random forest (RF), are some of the most widely applied methods in machine learning. However, an important hyperparameter, the number of classification or regression trees within the ensemble, must be specified in these algorithms. The number of trees can adversely affect bias or computational cost and should ideally be adapted for each task. For this reason, a novel tree ensemble is described, the feature-ranked self-growing forest (FSF), which grows a tree ensemble automatically based on the structural diversity of the first two levels of the trees' nodes. The algorithm's performance was tested with 30 classification and 30 regression datasets and compared with RF. The computational complexity was also analyzed theoretically and experimentally. Compared with RF, FSF had significantly higher performance for 57% and equivalent performance for 27% of classification datasets, and higher performance for 70% and equivalent performance for 7% of regression datasets. The computational complexity of FSF was competitive with that of other tree ensembles, depending mainly on the number of observations in the dataset. Therefore, FSF is a suitable out-of-the-box approach with potential as a tool for feature ranking and dataset complexity analysis, using the number of trees computed for a particular task. A MATLAB and Python implementation of the algorithm and a working example for classification and regression are provided for academic use. [ABSTRACT FROM AUTHOR]
Copyright of Neural Computing & Applications is the property of Springer Nature and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Academic Search Premier
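The abstract's central idea, growing the ensemble until new trees stop adding structural diversity in their first two levels, can be illustrated with a minimal Python sketch. This is not the authors' published code: the function names (`top_structure_signature`, `grow_forest`) are hypothetical, and the fitting of an actual decision tree is stubbed out by randomly sampling the features that would occupy the three internal nodes of the tree's top two levels.

```python
# Hypothetical sketch of the self-growing stopping criterion described in
# the abstract: keep adding trees while each new tree contributes a novel
# top-two-level structure; stop when a structure repeats.
# NOTE: real tree fitting is replaced here by random feature sampling.
import random


def top_structure_signature(features, n_top_nodes=3):
    # The first two levels of a binary tree have 3 internal nodes
    # (root + its two children). The "structure" is summarized by
    # the tuple of features chosen at those nodes (simulated here).
    return tuple(random.choices(features, k=n_top_nodes))


def grow_forest(n_features, max_trees=500, seed=0):
    random.seed(seed)
    features = list(range(n_features))
    seen, forest = set(), []
    while len(forest) < max_trees:
        sig = top_structure_signature(features)
        if sig in seen:          # no new structure: stop growing
            break
        seen.add(sig)
        forest.append(sig)       # stand-in for a fitted tree
    return forest


forest = grow_forest(n_features=5)
```

In this toy version the ensemble size emerges from the data-independent signature space; in the actual FSF, the signatures come from fitted trees, so harder datasets (more diverse top-level structures) yield larger forests, which is what makes the tree count usable as a complexity indicator.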