Exploring Learning Techniques in Language Models for Classifying Hate and Offensive Speech in Portuguese

  • Gabriel Assis, Universidade Federal Fluminense
  • Annie Amorim, Universidade Federal Fluminense
  • Jonnathan Carvalho, Instituto Federal Fluminense
  • Mariza Ferro, Universidade Federal Fluminense
  • Daniel de Oliveira, Universidade Federal Fluminense
  • Daniela Vianna, JusBrasil
  • Aline Paes, Universidade Federal Fluminense
Keywords: transformers, classification, hate speech

Abstract

 Social media platforms, now central to public debate and communication, face the challenge of managing a vast and disorderly volume of hateful content and disinformation. This work examines the detection of hate speech in Portuguese, taking into account the language's unique linguistic and cultural nuances. Using Transformer-based models and different training and activation strategies, we evaluate nine models that vary in architecture, size, and pre-training corpora. Our findings show that, although large generative models with enhanced prompts exhibit promising results, tuned small language models remain superior for this task.

Published
2024-12-27
How to Cite
Assis, G., Amorim, A., Carvalho, J., Ferro, M., de Oliveira, D., Vianna, D., & Paes, A. (2024). Exploring Learning Techniques in Language Models for Classifying Hate and Offensive Speech in Portuguese. Linguamática, 16(2), preprint. Retrieved from https://linguamatica.com/index.php/linguamatica/article/view/446
Section
PROPOR 2024 | Invited Articles