Extraction and analysis of suicide causes through linguistic markers in news reports

  • José Alejandro Reyes Ortiz Universidad Autónoma Metropolitana
  • Mireya Tovar, Dra. BUAP

Abstract

The automatic analysis of suicide-related texts has become a challenge for the computational linguistics research field. Tools are increasingly needed to help reduce suicide rates, for example by extracting suicide causes in order to support their early detection. Linguistic features of Spanish texts, such as cue phrases or parts of speech, can help in this task. This paper therefore presents a computational approach to the extraction and analysis of suicide causes from news reports in Spanish. Suicide causes are extracted automatically through linguistic markers based on verbs, connectors, prepositions and conjunctions. The extracted causes are then analyzed in two ways: a) an analysis of verb and noun phrases that studies the presence of negation; b) a frequency analysis of word unigrams and bigrams. Both analyses show promising and correlated results, which are useful for recognizing the suicide causes reported in Mexico in a given period. Finally, the work yields a corpus of 581 suicide causes.
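
As a rough illustration of the two steps named in the abstract (marker-based extraction, then n-gram frequency counting), the following Python sketch may help. It is not the authors' implementation: the cue phrases in `CAUSAL_MARKERS`, the helper names `extract_causes` and `ngram_counts`, and the sample sentences are all assumptions made for this example, and the paper's actual marker inventory and parsing steps are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical Spanish cue phrases (an assumption for illustration only;
# the paper's real marker list is not given in the abstract).
CAUSAL_MARKERS = ["debido a", "a causa de", "porque", "tras"]

# Match a marker, then capture the text up to the next punctuation mark.
_MARKER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(m) for m in CAUSAL_MARKERS) + r")\s+([^.;,]+)",
    flags=re.IGNORECASE,
)


def extract_causes(text):
    """Return the text spans that follow a causal marker."""
    return [m.group(1).strip() for m in _MARKER_RE.finditer(text)]


def ngram_counts(causes, n):
    """Count word n-grams (as tuples) over a list of cause strings."""
    counts = Counter()
    for cause in causes:
        tokens = re.findall(r"\w+", cause.lower())
        # zip over n shifted copies of the token list yields the n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Invented sample sentences, not taken from the paper's corpus.
sample = (
    "El joven se quitó la vida debido a problemas económicos. "
    "La mujer se suicidó porque tenía problemas económicos y depresión."
)

causes = extract_causes(sample)
unigrams = ngram_counts(causes, 1)
bigrams = ngram_counts(causes, 2)
```

In this toy setting the bigram `("problemas", "económicos")` appears in both extracted causes, which is the kind of recurring pattern a frequency analysis would surface. A regular expression is used only to keep the sketch short; the negation analysis over verb and noun phrases described in the abstract would need a syntactic parser and is not attempted here.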

Published
2020-01-04
How to Cite
Reyes Ortiz, J. A., & Tovar, M. (2020). Extraction and analysis of suicide causes through linguistic markers in news reports. Linguamática, 11(2), 67-77. https://doi.org/10.21814/lm.11.2.276
Section
New Perspectives