Show simple item record

dc.contributor.advisor: Agirre Bengoa, Eneko
dc.contributor.advisor: Labaka Intxauspe, Gorka
dc.contributor.author: Artetxe Zurutuza, Mikel
dc.contributor.other: Lenguajes y Sistemas Informáticos; Hizkuntza eta Sistema Informatikoak [es]
dc.date.accessioned: 2016-05-24T06:34:55Z
dc.date.available: 2016-05-24T06:34:55Z
dc.date.issued: 2016-05-24
dc.date.submitted: 2016-05-17
dc.identifier.uri: http://hdl.handle.net/10810/18297
dc.description.abstract: [EU] In this work, we study the use of distributional semantics and machine learning to improve statistical machine translation. To that end, we propose a machine learning model based on logistic regression to model the translation probability of phrases dynamically. We prove that the proposed model is a generalization of the usual translation probabilities of statistical machine translation, and we use it to incorporate both contextual and distributional semantic information through lexical features, word clusters and word embeddings. In addition, we develop another approach for integrating distributional semantic knowledge into statistical machine translation: using bilingual word embeddings to model the similarity of phrase translations. Our experiments show the usefulness of the proposed models, obtaining promising results over a strong baseline system. Likewise, our work makes important contributions regarding bilingual embedding mappings and word embedding based phrase similarity measures, which have an intrinsic value in the field of distributional semantics beyond machine translation. [es]
dc.description.abstract: [EN] In this work, we explore the use of distributional semantics and machine learning to improve statistical machine translation. For that purpose, we propose the use of a logistic regression based machine learning model for dynamic phrase translation probability modeling. We prove that the proposed model can be seen as a generalization of the standard translation probabilities used in statistical machine translation, and use it to incorporate context and distributional semantic information through lexical, word cluster and word embedding features. Apart from that, we explore the use of word embeddings for phrase translation probability scoring as an alternative approach to incorporate distributional semantic knowledge into statistical machine translation. Our experiments show the effectiveness of the proposed models, achieving promising results over a strong baseline. At the same time, our work makes important contributions in relation to bilingual word embedding mappings and word embedding based phrase similarity measures, which go beyond machine translation and have an intrinsic value in the field of distributional semantics. [es]
dc.language.iso: eng [es]
dc.rights: info:eu-repo/semantics/openAccess [es]
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/ [*]
dc.subject: machine translation [es]
dc.subject: machine learning [es]
dc.title: Distributional semantics and machine learning for statistical machine translation [es]
dc.type: info:eu-repo/semantics/masterThesis [es]
dc.rights.holder: Attribution-NonCommercial 4.0 International [*]




Except where otherwise noted, this item's license is described as Attribution-NonCommercial 4.0 International