Deep Forest (gcForest, or multi-Grained Cascade forest) is a novel decision tree ensemble approach whose performance is highly competitive with deep neural networks.
This method generates a deep forest ensemble with a cascade structure, which enables gcForest to do representation learning. Its representation learning ability can be further enhanced by multi-grained scanning when the inputs are high-dimensional, potentially enabling gcForest to be contextually or structurally aware.
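The two ideas above can be illustrated with a minimal sketch. It assumes that multi-grained scanning slides a fixed-size window over the raw feature vector, and that each cascade level receives the class-probability vectors produced by the previous level's forests, concatenated with the original features. Neither detail is spelled out in the summary above, and the function names `sliding_windows` and `cascade_level_input` are hypothetical, chosen only for illustration. The forests themselves are left out; the sketch only shows how the data would be reshaped around them.

```python
from typing import List, Sequence

Vector = List[float]


def sliding_windows(x: Sequence[float], window: int, stride: int = 1) -> List[Vector]:
    """Cut one raw feature vector into overlapping windows (assumed scanning step).

    Each window would be treated as a separate, lower-dimensional instance.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    # The last valid start index is len(x) - window; shorter inputs yield no windows.
    return [list(x[i:i + window]) for i in range(0, len(x) - window + 1, stride)]


def cascade_level_input(original: Sequence[float],
                        class_vectors: Sequence[Sequence[float]]) -> Vector:
    """Build the input of the next cascade level (assumed cascade step).

    Concatenates the original features with the class-probability vectors
    produced by each forest of the previous level.
    """
    augmented = list(original)
    for vec in class_vectors:
        augmented.extend(vec)
    return augmented


if __name__ == "__main__":
    # A 400-dimensional input scanned with a 100-dimensional window, stride 1.
    raw = [float(i) for i in range(400)]
    print(len(sliding_windows(raw, window=100)))

    # Two forests, each emitting a 2-class probability vector.
    print(cascade_level_input([0.5, 1.0], [[0.2, 0.8], [0.4, 0.6]]))
```

In this sketch the augmented vector grows by (number of forests) × (number of classes) entries at each level, so later levels see both the raw input and the earlier levels' predictions.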
Recognizing that the key to deep learning lies in representation learning and large model capacity, in this paper we attempt to endow tree ensembles with these properties and propose the gcForest method. Compared with deep neural networks, gcForest achieved highly competitive or even better performance in our experiments. More importantly, gcForest has far fewer hyper-parameters and is less sensitive to parameter settings; in our experiments, excellent performance is obtained across various domains using the same parameter setting, and the method works well on both large-scale and small-scale data. Moreover, as a tree-based approach, gcForest should be easier to analyze theoretically than deep neural networks, although this is beyond the scope of this paper. The deep forest source code will be available soon.
Deep Forest (gcForest) Datasets:
- MNIST http://yann.lecun.com/exdb/mnist/
- GTZAN Genre Collection http://marsyasweb.appspot.com/download/data_sets/
- sEMG for Basic Hand movements https://archive.ics.uci.edu/ml/datasets/sEMG+for+Basic+Hand+movements
- The IMDB dataset https://www.kaggle.com/deepmatrix/imdb-5000-movie-dataset
- UCI-datasets https://archive.ics.uci.edu/ml/datasets.html
Related work to Deep Forest:
- [Zhou, 2012] Z.-H. Zhou. Ensemble Methods: Foundations and Algorithms. CRC, Boca Raton, FL, 2012.
- [Peter et al., 2015] K. Peter, F. Madalina, C. Antonio, and R. Samuel. Deep neural decision forests. In IEEE International Conference on Computer Vision, pages 1467–1475, 2015.
- [Breiman, 1996] L. Breiman. Stacked regressions. Machine Learning, 24(1):49–64, 1996.
- [Wolpert, 1992] D. H. Wolpert. Stacked generalization. Neural Networks, 5(2):241–260, 1992.
- [Ting and Witten, 1999] K. M. Ting and I. H. Witten. Issues in stacked generalization. Journal of Artificial Intelligence Research, 10:271–289, 1999.
- [Wei and Zhou, 2016] X.-S. Wei and Z.-H. Zhou. An empirical study on image bag generators for multi-instance learning. Machine Learning, 105(2):155–198, 2016.