Exploration of annotation strategies for entailment-based Automatic Short Answer Grading
Date: 2023-06-30
Author: Egaña Azpiazu, Aner

Abstract
Recent work has shown that Automatic Short Answer Grading can be effectively
reformulated as a Textual Entailment problem. In this work we show that this
reformulation is also effective in zero-shot and few-shot settings, with few-shot
results close to state-of-the-art performance. More importantly, we show that the
annotation strategy can have a significant impact on performance. When annotating
few examples, empirical results show that increasing the variability on the question
side, at the cost of annotating fewer answers per question, is preferable to spending
the same annotation budget on fewer questions with more answers each. With this
annotation strategy, using only 10% of the full training set, our model matches
state-of-the-art systems on the SciEntsBank dataset.
Finally, experiments over the SciEntsBank and Beetle domains show that the use of
out-of-domain annotated question-answer examples can be harmful: task-aware models
fine-tuned on another domain obtain significantly lower results than task-agnostic
general-purpose inference models, at least on the domains employed in this work.
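
Below is a minimal sketch of the entailment reformulation, assuming an off-the-shelf
NLI model from the Hugging Face hub (roberta-large-mnli here) and a simple rule that
maps an entailment prediction to a "correct" grade; the model choice and the
premise/hypothesis ordering are illustrative assumptions, not the thesis's actual setup.

```python
# Grade a student answer by asking a generic NLI model whether it
# entails the reference answer (illustrative sketch, not the thesis's setup).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "roberta-large-mnli"  # assumption: any off-the-shelf NLI model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
model.eval()

def grade(reference_answer: str, student_answer: str) -> str:
    # Premise = student answer, hypothesis = reference answer: if the
    # student's answer entails the reference answer, grade it "correct".
    inputs = tokenizer(student_answer, reference_answer,
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    label = model.config.id2label[int(logits.argmax(dim=-1))]
    return "correct" if label.upper() == "ENTAILMENT" else "incorrect"

print(grade("Evaporation separates the salt from the water.",
            "You can boil off the water and the salt stays behind."))
```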
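
The annotation strategy reported as preferable can be sketched as a sampling routine:
given a fixed annotation budget, cover as many distinct questions as possible with only
a few answers each. The function below is a hypothetical illustration; the field names
and the default cap of two answers per question are assumptions, not details from the thesis.

```python
# Spread a fixed annotation budget across many questions with few answers
# each, rather than many answers for few questions (illustrative sketch).
import random
from collections import defaultdict

def sample_for_annotation(pool, budget, answers_per_question=2, seed=0):
    """pool: list of (question_id, student_answer) pairs awaiting labels;
    returns up to `budget` pairs covering as many distinct questions as
    the per-question cap allows."""
    rng = random.Random(seed)
    by_question = defaultdict(list)
    for question_id, answer in pool:
        by_question[question_id].append((question_id, answer))
    questions = list(by_question)
    rng.shuffle(questions)
    sample = []
    for q in questions:
        if len(sample) >= budget:
            break
        answers = by_question[q]
        rng.shuffle(answers)
        take = min(answers_per_question, budget - len(sample))
        sample.extend(answers[:take])
    return sample
```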