dc.contributor.advisor: López de Lacalle Lecuona, Oier
dc.contributor.advisor: Aldabe Arregi, Itziar
dc.contributor.author: Egaña Azpiazu, Aner
dc.date.accessioned: 2023-06-30T14:52:48Z
dc.date.available: 2023-06-30T14:52:48Z
dc.date.issued: 2023-06-30
dc.identifier.uri: http://hdl.handle.net/10810/61822
dc.description.abstract: [EN] Recent work has shown that Automatic Short Answer Grading can be effectively reformulated as a Textual Entailment problem. In this work we show that this reformulation is also effective in zero-shot and few-shot settings, and we report competitive results, close to state-of-the-art performance, in the few-shot setting. More importantly, we show that the annotation strategy can have a significant impact on performance. When annotating only a few examples, empirical results show that increasing the variability on the question side, at the cost of fewer annotated answers per question, is preferable to having the same number of annotated examples with fewer questions and more answers. With this annotation strategy, using only 10% of the full training set, our model matches state-of-the-art systems on the SciEntsBank dataset. Finally, experiments on the SciEntsBank and Beetle domains show that using out-of-domain annotated question-answer examples can be harmful: task-aware fine-tuned models obtain significantly lower results than task-agnostic general-purpose inference models, at least in the domains used in this work. (es_ES)
dc.description.abstract: [EU, English translation] Research in recent years on automatic short answer grading has shown that the task can be effectively reformulated, in particular as a textual inference task. In this work, we show that the reformulation is also effective in few-shot and zero-shot scenarios. More importantly, we show that the strategy used to annotate examples for the task has a notable effect on model performance. When annotating only a few examples, empirical results show that increasing the variability on the question side, at the cost of fewer annotated answers per question, is better than having the same number of annotated examples with fewer questions and more answers. Following this annotation strategy and using 10% of the full training set, performance matches that of state-of-the-art systems on the SciEntsBank dataset. Finally, experiments on the Beetle and SciEntsBank domains show that out-of-domain question-answer example pairs can harm performance, leading to the conclusion that systems that have learned the task from another domain tend to obtain lower results than systems that have not learned the task, at least in the domains studied. (es_ES)
dc.language.iso: eng (es_ES)
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: automatic short answer grading (es_ES)
dc.subject: fine-tuning
dc.subject: transfer learning
dc.subject: task reformulation
dc.subject: zero-shot
dc.subject: few-shot
dc.subject: cross-domain learning
dc.title: Exploration of annotation strategies for entailment-based Automatic Short Answer Grading (es_ES)
dc.type: info:eu-repo/semantics/masterThesis
dc.date.updated: 2022-09-08T07:38:06Z
dc.language.rfc3066: es
dc.rights.holder: © 2022, the author
dc.contributor.degree: Máster Universitario en Análisis y Procesamiento del Lenguaje
dc.contributor.degree: Hizkuntzaren Azterketa eta Prozesamendua Unibertsitate Masterra
dc.identifier.gaurregister: 126779-844573-11 (es_ES)
dc.identifier.gaurassign: 138134-844573 (es_ES)

