Challenges and advantages of the automatic identification of character gender and professions in DIP
Abstract
The development of systems for automatic identification of characters and some of their characteristics is the central objective of the Character Identification Challenge (DIP) project developed in conjunction with Linguateca. Among these characteristics, 2 this article will focus on the identification of gender and professions of the characters. Firstly, we will justify our choice to work with these two data sets, presenting the different paths we have taken to establish guidelines for their identification. Manual identification of gender and profession is exhaustive and susceptible to errors, making the use of computer systems increasingly common for this task. The analysis of professions would allow reflection on issues such as the definition of a profession, its frequency in Brazilian and Portuguese works, and possible relationships with literary genres. We present some results from distant and close reading of a group of works, contrast these results and comment on the challenges and advantages we encountered throughout this task, which seem to reinforce our hypothesis of a preference for a combined effort of automatic systems and human interpretation in character identification.
Copyright (c) 2023 Emanoel Pires, Marcia Caetano Langfeldt, Rebeca Schumacher Fuão
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).