Arbitrary Portuguese text style transfer

Keywords: natural language generation, arbitrary style transfer, paraphrases, sequence-to-sequence, large language models

Abstract

In natural language generation, arbitrary style transfer models aim to rewrite a text with any desired new set of stylistic features. For Portuguese, however, the resources required to develop such models remain considerably scarcer than those available for English. As a first step towards more advanced methods of this kind, the present work investigates arbitrary style transfer with the aid of paraphrases in Portuguese, using neural models built on sequence-to-sequence architectures and on the fine-tuning of a number of large language models. In addition to the text rewriting models themselves, the study presents novel resources for the task: a corpus of paraphrases and an embedding model validated on both sentence similarity and simplification tasks, with results comparable to the state of the art.

Author Biography

Ivandré Paraboni, Escola de Artes, Ciências e Humanidades (EACH), Universidade de São Paulo (USP)
Professor (Professor-doutor) at the Escola de Artes, Ciências e Humanidades (EACH) of the Universidade de São Paulo (USP) in São Paulo, Brazil.
Published
2023-12-30
How to Cite
Botton da Costa, P., & Paraboni, I. (2023). Arbitrary Portuguese text style transfer. Linguamática, 15(2), 19-36. https://doi.org/10.21814/lm.15.2.410
Section
Research Articles