Compilation and analysis of textual metrics of an essay's corpus

  • Átila Augusto Soares Vital, Laboratório de Estudos Empíricos e Experimentais da Linguagem (LEEL), Universidade Federal de Minas Gerais (UFMG), Brazil
Keywords: Essays, Corpus linguistics, Textual complexity

Abstract

 The writing test of the National High School Exam (Enem) plays a decisive role in securing students a place in undergraduate institutions in Brazil. Between 2010 and 2020, the number of texts awarded the maximum grade (one thousand points) dropped sharply: in 2011, 3,694 texts received 1,000 points, whereas in 2020 only 28 did. This research presents a corpus of texts graded one thousand points by Enem's team, describes them, and offers brief considerations about their characteristics across the historical series from 2010 to 2020. The corpus was compiled manually from texts available on the internet. We analyzed the texts with Orange: Data Mining and the NILC-Metrix textual complexity analyzer. The results suggest a substantial increase in the number of words and a decrease in the type/token ratio over the period. Finally, the syntactic metrics we measured confirmed the increase in textual complexity.
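
For readers unfamiliar with the two lexical metrics mentioned above, the following is a minimal sketch of how a word count and a type/token ratio can be computed for a single text. It is not the NILC-Metrix or Orange implementation: the function name `type_token_ratio` and the simple regular-expression tokenizer are assumptions made for illustration only, and the tools used in the study may tokenize and normalize text differently.

```python
import re


def type_token_ratio(text: str) -> tuple[int, float]:
    """Return (number of tokens, type/token ratio) for a text.

    Assumption: tokens are maximal runs of word characters, lowercased.
    This is an illustrative tokenizer, not the one used by NILC-Metrix.
    """
    # \w is Unicode-aware for str patterns, so accented letters are kept.
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0, 0.0
    types = set(tokens)  # distinct word forms
    return len(tokens), len(types) / len(tokens)


# Hypothetical sample sentence, not taken from the corpus.
n_tokens, ttr = type_token_ratio("the exam is hard and the essay is long")
print(n_tokens, round(ttr, 3))
```

In the sample sentence, "the" and "is" each occur twice, so the nine tokens contain seven distinct types and the ratio is 7/9.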

Published
2023-07-08
How to Cite
Soares Vital, Átila A. (2023). Compilation and analysis of textual metrics of an essay’s corpus. Linguamática, 15(1), 131-140. https://doi.org/10.21814/lm.15.1.393
Section
New Perspectives