Process monitoring for quality - a feature selection method for highly unbalanced data (Academic Article in Scopus)

abstract

  • © 2022, General Motors under exclusive licence to Springer-Verlag France SAS. Generating only a few defects per million opportunities is the state of the art in manufacturing science. Detecting these defects is one of the most critical modern intellectual challenges. Process Monitoring for Quality is a big data-driven quality philosophy aimed at rare quality event detection through binary classification and empirical knowledge discovery through feature interpretation. Feature selection methods enable the latter. These analytical tools help identify the system's driving features, which, in manufacturing, are then used to plan and design randomized experiments to find the optimal levels of process parameters. A new filter-type feature selection method based on the separation between classes is presented. As several studies have shown, the predictive ability of a classifier is strongly correlated with the distribution of its margins. Since manufacturing-derived data sets for binary classification of quality tend to be highly or ultra-unbalanced, the proposed method is designed to analyze these data structures effectively. To demonstrate its properties and ability to select high-quality features, three case studies are presented: (1) seven virtual features are created to explain the agenda of the method, (2) a manufacturing-derived data set is analyzed, where the most relevant feature is identified and used for process redesign, and (3) a comparative analysis with ReliefF (one of the most widely used methods) is presented, as well as a study using Correlation-based Feature Selection. According to empirical results, the features selected by the proposed method exhibited significantly better prediction ability.
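The idea of a filter-type score based on class separation can be illustrated with a minimal sketch. The abstract does not give the formula of the proposed method, so the code below is not that method: it uses a generic Fisher-like ratio as a stand-in, and the function name `separation_scores`, the toy data, and the 0/1 label encoding are all assumptions made for illustration. Each class is summarized separately, so the few defective samples are not swamped by the many good ones.

```python
import numpy as np


def separation_scores(X, y, eps=1e-12):
    """Score each feature by how far apart the two classes sit.

    Illustrative stand-in only (a Fisher-like ratio), not the method
    proposed in the article. Assumes binary labels: 0 = good, 1 = defective.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    neg, pos = X[y == 0], X[y == 1]
    # Per-class statistics keep the rare class from being averaged away.
    gap = np.abs(pos.mean(axis=0) - neg.mean(axis=0))
    spread = pos.std(axis=0) + neg.std(axis=0) + eps
    return gap / spread


# Toy unbalanced data (hypothetical): 6 good units, 2 defective units.
# Feature 0 separates the classes; feature 1 does not.
X = np.array([
    [0.0, 1.0], [0.1, 2.0], [-0.1, 3.0],
    [0.0, 4.0], [0.1, 5.0], [-0.1, 6.0],
    [5.0, 3.0], [5.2, 4.0],
])
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])

scores = separation_scores(X, y)
ranking = np.argsort(-scores)  # feature indices, best separation first
print(ranking)
```

Because a filter score like this is computed independently of any classifier, the resulting ranking can be inspected directly, which fits the abstract's emphasis on feature interpretation.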

publication date

  • January 1, 2022