Conformal prediction in multi-user settings: an evaluation Academic Article in Scopus

abstract

  • Typically, machine learning classifiers are trained and evaluated without making any distinction between users (e.g., using traditional hold-out and cross-validation). However, this produces inaccurate performance metric estimates in multi-user settings, that is, situations where the data were collected by multiple users with different characteristics (e.g., age, gender, height, etc.), which is very common in human-computer interaction and medical applications. For these types of scenarios, model evaluation strategies that provide better performance estimates have been proposed, such as mixed, user-independent, user-dependent, and user-adaptive models. Although those strategies are better suited for multi-user systems, they are typically assessed with respect to performance metrics that capture the overall behavior of the models and do not provide any performance guarantees for individual predictions, nor do they provide any feedback about the predictions' uncertainty. In order to overcome those limitations, in this work we evaluate the conformal prediction framework in several multi-user settings. Conformal prediction is a model-agnostic method that provides confidence guarantees on the predictions, thus increasing the trustworthiness and robustness of the models. We propose a new type of benchmark model (user-calibrated) and conduct extensive experiments using different evaluation strategies, finding significant differences in terms of conformal performance measures. Our results show the importance of taking into account different evaluation strategies in multi-user systems. We also propose several visualizations based on matrices, graphs, and charts that capture different aspects of the prediction sets. These visualizations allow for a more fine-grained analysis compared to traditional plots such as confusion matrices. © The Author(s), under exclusive licence to Springer Nature B.V. 2025.
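
  The conformal prediction framework the abstract refers to can be illustrated with a minimal split-conformal classification sketch. This is a generic illustration of the idea (calibrating any classifier's scores on held-out data to produce prediction sets with a coverage guarantee), not the authors' specific method; the toy data, score choice, and function names are assumptions for the example.

  ```python
  import numpy as np

  rng = np.random.default_rng(0)

  def conformal_quantile(cal_scores, alpha):
      """Finite-sample-corrected quantile of calibration nonconformity scores."""
      n = len(cal_scores)
      q_level = np.ceil((n + 1) * (1 - alpha)) / n
      return np.quantile(cal_scores, min(q_level, 1.0), method="higher")

  def prediction_set(probs, qhat):
      """All labels whose nonconformity score 1 - p(y|x) does not exceed qhat."""
      return np.where(1.0 - probs <= qhat)[0]

  # Simulated class probabilities for a 3-class toy problem; the true class
  # of every calibration example is 0 here purely for illustration.
  n_cal = 500
  cal_probs = rng.dirichlet(alpha=[5, 1, 1], size=n_cal)
  cal_labels = np.zeros(n_cal, dtype=int)

  # Nonconformity score: 1 minus the probability assigned to the true class.
  cal_scores = 1.0 - cal_probs[np.arange(n_cal), cal_labels]

  alpha = 0.1  # target miscoverage: sets contain the true label >= 90% of the time
  qhat = conformal_quantile(cal_scores, alpha)

  # Prediction set for one new test example's class probabilities.
  test_probs = np.array([0.70, 0.20, 0.10])
  print(prediction_set(test_probs, qhat))
  ```

  The guarantee is marginal: over exchangeable data, the set contains the true label with probability at least 1 - alpha regardless of the underlying classifier, which is what makes the framework model-agnostic. How that guarantee behaves per user under the mixed, user-independent, user-dependent, user-adaptive, and user-calibrated strategies is the question the article evaluates.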

publication date

  • March 1, 2025