Assessing the Effect of Writing Style Features in Detecting Depression on Social Media (Chapter)

abstract

  • Depression detection in social media has received increasing interest from researchers because of the significant impact depression has on the lives of those who suffer from it and on society globally. The objective of this paper is to verify whether using capital letters and repeating characters as features in a Machine Learning model can improve the detection of depression in social media. Using the dataset from the second task of the 2022 edition of the eRisk Lab, five models were created and compared using four classifiers: Logistic Regression, Random Forest, Support Vector Machine and Neural Network. The best-performing model overall used the ratio of capitals in users' comments and posts with a Random Forest classifier, achieving an F1-score of 0.57, which was 0.08 better than the baseline model. Performance increased over the baseline model when these features were used, indicating that they had an impact, even if a small one. Further research is necessary to fully evaluate the impact of capitalization and repeating characters on the detection of depression and other mental disorders. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
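
The two writing style features named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions used in the chapter, so everything here is an assumption: the capital ratio is taken as uppercase letters divided by all alphabetic characters, a "repeating character" is taken as a run of three or more identical characters, and the function names (`capital_ratio`, `repeated_char_runs`, `style_features`) are hypothetical.

```python
import re

# Assumption: a "repeating character" means the same character
# three or more times in a row (e.g. "sooo", "!!!").
_REPEAT_RUN = re.compile(r"(.)\1{2,}")


def capital_ratio(text: str) -> float:
    """Share of alphabetic characters that are uppercase (0.0 if no letters)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch.isupper() for ch in letters) / len(letters)


def repeated_char_runs(text: str) -> int:
    """Number of runs of three or more identical consecutive characters."""
    return sum(1 for _ in _REPEAT_RUN.finditer(text))


def style_features(posts: list[str]) -> dict[str, float]:
    """Aggregate both features over all of one user's posts and comments."""
    # Joining with newlines keeps runs from spanning two posts,
    # because "." in the pattern does not match a newline.
    joined = "\n".join(posts)
    return {
        "capital_ratio": capital_ratio(joined),
        "repeated_runs": float(repeated_char_runs(joined)),
    }


if __name__ == "__main__":
    print(style_features(["I am SO tired", "sooo tired!!!"]))
```

Per-user values like these could be appended to a baseline text representation before training a classifier such as a Random Forest; how the chapter actually combined them with its baseline features is not stated in the abstract.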

publication date

  • January 1, 2025