AIA-BDE: a corpus of Portuguese Questions, Variations and other Annotations

  • Hugo Gonçalo Oliveira, CISUC, Universidade de Coimbra
  • Ana Alves

Abstract

 We present the AIA-BDE corpus, whose main goal is the evaluation of computational systems that attempt to match information needs, expressed in natural language, to questions with known answers (i.e., FAQs). The corpus includes several questions in the domain of the Portuguese Public Administration, together with their answers. For 855 of those questions, alternative ways of asking them were added, both manually and automatically. We call these alternatives variations, and they can be used to simulate interactions with human users. These questions are classified according to their source, with four possible values, and also have a question type, based on the opinion of five human annotators. Besides presenting AIA-BDE, we illustrate how it can be used through three experiments, whose results may serve as baselines for future improvements, namely: assignment of variations to the original questions; automatic identification of the questions according to their source; and automatic classification of the questions according to their type.

Published
2021-12-30
How to Cite
Gonçalo Oliveira, H., & Alves, A. (2021). AIA-BDE: a corpus of Portuguese Questions, Variations and other Annotations. Linguamática, 13(2), 19-35. https://doi.org/10.21814/lm.13.2.350
Section
Research Articles