Some features speak loud, but together they all speak louder: A study on the correlation between classification error and feature usage in decision-tree classification ensembles

© 2017 Elsevier Ltd. While diversity has been argued to be the rationale for the success of an ensemble of classifiers, little has been said about how uniform use of the feature space influences classification error. According to an observation from a recent result, published elsewhere, among several ensembles of decision trees, those with a more uniform feature-use frequency also have a smaller classification error. This paper provides further support for this hypothesis. To test it, we conducted experiments over 60 classification datasets, using 42 different types of decision tree ensembles. Our results validate the hypothesis, prompting the design of ensemble construction methods that make a more uniform use of features for classification problems of low and medium dimensionality.
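To make the notion of feature-use frequency concrete, the following is a minimal sketch and not the paper's method. It assumes each tree is summarized as the list of feature indices tested at its split nodes, and it uses normalized Shannon entropy as one possible uniformity score; the abstract does not state which measure the authors use. The function names and the two toy ensembles are hypothetical.

```python
from collections import Counter
from math import log


def feature_use_frequency(ensemble, n_features):
    """Relative frequency with which each feature is tested across all split nodes.

    `ensemble` is a list of trees; each tree is a list of feature indices,
    one per split node (an assumed, simplified representation).
    """
    counts = Counter(f for tree in ensemble for f in tree)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("ensemble contains no split nodes")
    return [counts.get(j, 0) / total for j in range(n_features)]


def uniformity(freqs):
    """Normalized Shannon entropy of the frequencies (an assumed measure).

    Equals 1 when every feature is used equally often and 0 when a
    single feature accounts for all splits.
    """
    if len(freqs) < 2:
        return 1.0
    entropy = -sum(p * log(p) for p in freqs if p > 0)
    return entropy / log(len(freqs))


# Hypothetical toy ensembles over 4 features, 3 trees each.
skewed = [[0, 0, 1], [0, 0, 0], [0, 2, 0]]      # dominated by feature 0
balanced = [[0, 1, 2], [3, 0, 1], [2, 3, 0]]    # features used almost evenly

skewed_score = uniformity(feature_use_frequency(skewed, 4))
balanced_score = uniformity(feature_use_frequency(balanced, 4))
print(skewed_score, balanced_score)
```

Under the hypothesis stated above, an ensemble with the higher score would be expected to show the lower classification error; checking that would require pairing such a score with measured error on real datasets.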

Academic Article in Scopus