Automatic Lexical Adaptation in Brazilian Portuguese Informative Texts for Elementary Education

  • Nathan Siegle Hartmann USP
  • Sandra Maria Aluísio
Keywords: text adaptation, lexical simplification, lexical elaboration, reading aid for children

Abstract

Text Adaptation is a broad research area in Natural Language Processing (NLP), well known as an educational practice, with two main approaches: Text Simplification and Text Elaboration. Little work in the NLP literature addresses all phases of Lexical Adaptation with system implementation in mind. Several works deal with the Lexical Simplification and Lexical Elaboration tasks independently and make only partial contributions, since each task has its own challenges. This work proposes a pipeline for Lexical Adaptation and presents contributions to three of its four stages: (i) the proposal and evaluation of methods for the Complex Word Identification task; (ii) a corpus analysis to survey word-definition patterns for Lexical Elaboration; (iii) the SIMPLEX-PB 3.0 corpus, whose new version contains manually revised short definitions extracted from dictionaries, annotations of technical terms extracted from a dictionary, and linguistic metrics of lexical complexity; and (iv) the proposal and evaluation of methods for Lexical Simplification, establishing a new state of the art (SOTA) for the task in Brazilian Portuguese.

Published
2020-12-31
How to Cite
Hartmann, N. S., & Aluísio, S. M. (2020). Automatic Lexical Adaptation in Brazilian Portuguese Informative Texts for Elementary Education. Linguamática, 12(2), 3-27. https://doi.org/10.21814/lm.12.2.323
Section
Research Articles