Automatic identification of information units in tests based on narrative retelling using semantic similarity methods

evaluating semantic similarity methods

Keywords: neuropsychological tests, narrative retellings, semantic similarity methods

Abstract

Diagnoses of Alzheimer's Disease (AD) and Mild Cognitive Impairment (CCL) are based on the analysis of the patient's cognitive functions by administering cognitive and neuropsychological assessment batteries. The use of retelling narratives is common to help identify and quantify the degree of dementia. In general, one point is awarded for each unit recalled, and the final score represents the number of units recalled. In this paper, we evaluated two clinical tasks: the automatic identification of which elements of a retold narrative were recalled; and the binary classification of the narrative produced by a patient, having the units identified as attributes, aiming at an automatic screening of patients with cognitive impairment. We used two transcribed retelling data sets in which sentences were divided and manually annotated with the information units. These data sets were then made publicly available. They are: the Arizona Battery for Communication and Dementia Disorders (ABCD) that contains narratives of patients with CCL and Healthy Controls and the Avaliação da Linguagem no Envelhecimento (BALE), which includes narratives of patients with AD and CCLs as well as Healthy Controls. We evaluated two methods based on semantic similarity, referred to here as STS and Chunking, and transformed the multi-label problem of identifying elements of a retold narrative into binary classification problems, finding a cutoff point for the similarity value of each information unit. In this way, we were able to overcome two baselines for the two datasets in the SubsetAccuracy metric, which is the most punitive for the multi-label scenario. In binary classification, however, not all six machine learning methods evaluated performed better than the baselines methods. For ABCD, the best methods were Decision Trees and KNN, and for BALE, SVM with RBF kernel stood out.

Published
2020-01-04
How to Cite
Borges dos Santos, L. dos, & Aluísio, S. M. (2020). Automatic identification of information units in tests based on narrative retelling using semantic similarity methods: evaluating semantic similarity methods. Linguamática, 11(2), 47-63. https://doi.org/10.21814/lm.11.2.304
Section
Research Articles