Compilation and analysis of textual metrics of an essay's corpus

Authors

  • Átila Augusto Soares Vital, Laboratório de Estudos Empíricos e Experimentais da Linguagem (LEEL), Universidade Federal de Minas Gerais (UFMG/Brasil)

DOI:

https://doi.org/10.21814/lm.15.1.393

Keywords:

Essays, Corpus linguistics, Textual complexity

Abstract

The writing test of the National High School Exam (Enem) is very important for securing students a place at undergraduate institutions in Brazil. Between 2010 and 2020, the number of essays awarded the maximum grade (1,000 points) dropped sharply: in 2011, 3,694 essays received 1,000 points, whereas in 2020 only 28 did. This study presents a corpus of essays graded 1,000 points by the Enem team, describes them, and briefly discusses their characteristics across the 2010 to 2020 historical series. The corpus was compiled manually from the internet. The analysis used Orange: Data Mining and the NILC-Metrix textual complexity analyzer. The results suggest a marked increase in the number of words and a decrease in the type/token ratio over the period. Finally, the syntactic metrics that were measured confirmed the increase in textual complexity.
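
For readers unfamiliar with the type/token ratio mentioned in the abstract, the following is a minimal sketch of how word count and type/token ratio can be computed for a single text. It is not the NILC-Metrix implementation: the function names `tokenize` and `type_token_ratio`, the regex-based tokenizer, and the sample sentence are all assumptions made for illustration.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens.

    Assumption: a simple regex tokenizer; real analyzers use more
    careful, language-specific tokenization.
    """
    return re.findall(r"\w+", text.lower())


def type_token_ratio(text: str) -> float:
    """Return distinct word forms (types) divided by total words (tokens)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0  # avoid division by zero for empty input
    return len(set(tokens)) / len(tokens)


# Hypothetical sample sentence, not taken from the corpus.
sample = "a redação do enem avalia a escrita do candidato"
print(len(tokenize(sample)))               # number of words (tokens)
print(round(type_token_ratio(sample), 3))  # type/token ratio
```

The ratio tends to fall as texts get longer, because frequent words repeat, so a rising word count and a falling type/token ratio can plausibly appear together.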

Published

2023-07-08

Issue

Vol. 15 No. 1 (2023)

Section

New Perspectives

How to Cite

Compilation and analysis of textual metrics of an essay’s corpus. (2023). Linguamática, 15(1), 131-140. https://doi.org/10.21814/lm.15.1.393