Machine Learning, Missing Values, and Algorithm Selectors: The Untold Story
Academic Article in Scopus
Overview
abstract
This paper studies the potential benefits of incorporating missing values into the training of algorithm selectors powered by machine learning algorithms, particularly those used for classification. It analyzes several scenarios in which some of the data available for training is omitted, and measures the performance of the resulting algorithm selectors to estimate how robust they are to missing values in the training data. The experiments open a new perspective on training algorithm selectors: computational resources can be saved by omitting some calculations, which reduces the effort needed to produce a selector without significantly harming its performance on unseen instances. For example, the results show that, given a proper training set and with the runs to omit chosen completely at random, some machine learning strategies such as Neural Networks, Naïve Bayes classifiers, and Support Vector Machines can operate correctly as algorithm selectors with up to 50% of the data about the candidate solvers missing, without any further treatment of the missing values. © 2025 Instituto Politecnico Nacional. All rights reserved.
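The setup described in the abstract, omitting solver runs completely at random before training a selector, can be illustrated with a minimal sketch. This is not the paper's code: the function names (`mask_runs_mcar`, `best_observed_solver`, `training_labels`), the runtime-matrix layout, and the rule of labelling each instance with the fastest solver among its observed runs are all assumptions made for illustration.

```python
import random

def mask_runs_mcar(perf, rate, seed=0):
    """Copy a runtime matrix (instances x solvers), replacing each run with
    None independently with probability `rate` (missing completely at random)."""
    rng = random.Random(seed)
    return [[None if rng.random() < rate else t for t in row] for row in perf]

def best_observed_solver(row):
    """Index of the fastest solver among the observed runs of one instance,
    or None when every run is missing. Ties go to the lowest index."""
    observed = [(t, i) for i, t in enumerate(row) if t is not None]
    return min(observed)[1] if observed else None

def training_labels(features, masked_perf):
    """Pair each instance's features with its label (assumed rule: best
    observed solver), dropping instances that have no observed run."""
    pairs = []
    for x, row in zip(features, masked_perf):
        label = best_observed_solver(row)
        if label is not None:
            pairs.append((x, label))
    return pairs

# Hypothetical toy data: 3 instances, 3 solvers, one feature per instance.
perf = [[1.0, 3.0, 2.0], [4.0, 0.5, 6.0], [2.0, 2.5, 0.1]]
features = [[0.1], [0.5], [0.9]]
masked = mask_runs_mcar(perf, rate=0.5, seed=1)
pairs = training_labels(features, masked)
```

The resulting `(features, label)` pairs could then be passed to any classifier, such as the ones named in the abstract. How the paper itself derives labels and feeds data with missing entries to each model is not stated here.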