HRegBERT-CNN: Multi-Class Regret Detection in Hindi Devanagari Script
Academic Article

abstract

  • Regret is a complex negative emotion often associated with feelings of remorse, self-blame, and disappointment regarding past actions or decisions. It plays a significant role in business and decision-making contexts and also affects the health of individuals. This work addresses the detection of regret, one of the most important emotions. Existing research on regret detection has been predominantly limited to English content. It has been observed that people find it easier to communicate their feelings effectively in their native or code-mixed languages. However, no existing work focuses on detecting regret in text written in these languages. To address this gap, this paper first presents a novel dataset of posts and comments written in Hindi or Hindi Roman script, collected from multiple sources and labeled using both manual and automated annotation techniques to improve the quality and consistency of the labels. It then proposes a multi-class regret detection framework that detects regret and classifies its domain. The proposed framework, HRegBERT-CNN, integrates a fine-tuned BERT(regret) model for Hindi with a CNN over N-gram word embeddings, enabling it to capture local contextual features and complex patterns in the text effectively. Experimental results show that the HRegBERT-CNN model outperforms state-of-the-art models on the Hindi regret dataset by at least 3% for regret detection and at least 5% for domain identification, in terms of macro F1-score. © 2025 John Wiley & Sons Ltd.
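
The abstract reports its results as macro F1-score, which averages the per-class F1 values with equal weight so that rare classes count as much as frequent ones. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the class labels in the usage note are hypothetical and are not taken from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0  # no data: define the score as 0.0 rather than divide by zero

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); treated as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

For example, with the hypothetical labels `y_true = ["none", "none", "action", "inaction"]` and `y_pred = ["none", "action", "action", "inaction"]`, the per-class F1 values are 2/3, 2/3, and 1, so `macro_f1(y_true, y_pred)` evaluates to 7/9.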

publication date

  • January 1, 2026