AcademicArticleSCO_85042760285

abstract

  • 1089-778X © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Support vector machines (SVMs) are among the most powerful learning algorithms for classification tasks. However, they incur a high computational cost during the training phase, which can limit their application to large-scale datasets. Moreover, their effectiveness is known to depend heavily on the hyper-parameters used to train the model. To address both issues, this paper introduces an evolutionary multiobjective model and instance selection (IS) approach for SVMs with a Pareto-based ensemble. Its goals are to optimize the size of the training set and the classification performance attained by the selection of instances, which can be done using either a wrapper or a filter approach. Because of the nature of multiobjective evolutionary algorithms, several Pareto optimal solutions can be found. We study several ways of using this information to perform a classification task. To accomplish this, our proposal post-processes the Pareto solutions in order to combine them into a single ensemble. This is done in five different ways, based on: 1) a global Pareto ensemble; 2) error reduction; 3) a complementary error reduction; 4) maximized margin distance; and 5) boosting. Through a comprehensive experimental study, we evaluate the suitability of the proposed approach and the Pareto processing, and we show its advantages over a single-objective formulation, traditional IS techniques, and learning algorithms.
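
The two-objective setting described in the abstract (training-set size versus classification error) can be pictured with a small sketch. This is not the paper's implementation: the function names, the toy objective values, and the use of plain majority voting as the combination rule are all assumptions made for illustration, since the abstract does not specify how the ensemble members are combined.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical representation: each candidate solution is scored as
# (training_set_size, validation_error); both objectives are minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: Sequence[Objectives]) -> List[int]:
    """Return the indices of the non-dominated points."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model label predictions by plurality vote for each sample.

    Assumption: a simple vote stands in for whatever rule the paper uses.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy data (invented for this sketch): five candidate training subsets.
candidates = [(100, 0.10), (60, 0.12), (60, 0.15), (30, 0.20), (120, 0.10)]
front = pareto_front(candidates)

# Toy predictions of the three front members on four test samples.
member_predictions = [[1, 0, 1, 1], [1, 1, 0, 1], [0, 0, 1, 1]]
ensemble_labels = majority_vote(member_predictions)
```

In this sketch every non-dominated solution would contribute one trained classifier to the ensemble; the other four strategies listed in the abstract would replace or filter this voting step rather than the dominance check.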