Exploration of annotation strategies for entailment-based Automatic Short Answer Grading

Egaña Azpiazu, Aner

dc.contributor.advisor	López de Lacalle Lecuona, Oier
dc.contributor.advisor	Aldabe Arregi, Itziar
dc.contributor.author	Egaña Azpiazu, Aner
dc.date.accessioned	2023-06-30T14:52:48Z
dc.date.available	2023-06-30T14:52:48Z
dc.date.issued	2023-06-30
dc.identifier.uri	http://hdl.handle.net/10810/61822
dc.description.abstract	[EN] Recent work has shown that Automatic Short Answer Grading can be effectively reformulated as a Textual Entailment problem. In this work we show that this reformulation is also effective in zero-shot and few-shot settings, and we report competitive results close to state-of-the-art performance in the few-shot setting. More importantly, we show that the annotation strategy can have a significant impact on performance. When only a few examples are annotated, empirical results show that increasing the variability on the question side, at the cost of decreasing the number of annotated answers per question, is preferable to having the same number of annotated examples with fewer questions and more answers. With this annotation strategy, using only 10% of the full training set, our model is on par with state-of-the-art systems on the SciEntsBank dataset. Finally, experiments over the SciEntsBank and Beetle domains show that the use of out-of-domain annotated question-answer examples can be harmful: task-aware fine-tuned models obtain significantly lower results than task-agnostic general-purpose inference models, at least for the domains employed in this work.	es_ES
dc.description.abstract	[EU] Research in recent years on automatic short answer grading has shown that the task can be effectively reformulated, in particular as a textual inference task. In this work, we show that the reformulation is also effective in few-shot and zero-shot scenarios. More importantly, we show that the strategy used to annotate examples for the task has a notable effect on model performance. When only a few examples are annotated, empirical results show that increasing the variability on the question side, at the cost of reducing the number of annotated answers per question, is better than having the same number of annotated examples with fewer questions and more answers. Following this annotation strategy and using 10% of the full training set, the model is on par with state-of-the-art systems on the SciEntsBank dataset. Finally, experiments carried out on the Beetle and SciEntsBank domains show that out-of-domain question-answer example pairs can be harmful to performance, leading to the conclusion that systems that have learned the task in another domain tend to yield lower results than systems that have not seen the task, at least in the domains examined.	es_ES
dc.language.iso	eng	es_ES
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	automatic short answer grading	es_ES
dc.subject	fine-tuning
dc.subject	transfer learning
dc.subject	task reformulation
dc.subject	zero-shot
dc.subject	few-shot
dc.subject	cross-domain learning
dc.title	Exploration of annotation strategies for entailment-based Automatic Short Answer Grading	es_ES
dc.type	info:eu-repo/semantics/masterThesis
dc.date.updated	2022-09-08T07:38:06Z
dc.language.rfc3066	es
dc.rights.holder	© 2022, the author
dc.contributor.degree	Máster Universitario en Análisis y Procesamiento del Lenguaje
dc.contributor.degree	Hizkuntzaren Azterketa eta Prozesamendua Unibertsitate Masterra
dc.identifier.gaurregister	126779-844573-11	es_ES
dc.identifier.gaurassign	138134-844573	es_ES

Files in this item

Name: MasterThesis_Aner_Egaña.pdf
Size: 663.7Kb
Format: PDF
Description: Master_Thesis

This item appears in the following collection(s)

Máster Universitario en Análisis y Procesamiento del Lenguaje