dc.contributor.author: Iñurrieta Urmeneta, Usoa
dc.contributor.author: Aduriz, Itziar
dc.contributor.author: Díaz de Ilarraza Sánchez, Arantza
dc.contributor.author: Labaka Intxauspe, Gorka
dc.contributor.author: Sarasola Gabiola, Kepa Mirena
dc.date.accessioned: 2021-01-21T13:27:30Z
dc.date.available: 2021-01-21T13:27:30Z
dc.date.issued: 2020-08-27
dc.identifier.citation: PLOS ONE 15(8) : (2020) // Article ID e0237767 [es_ES]
dc.identifier.issn: 1932-6203
dc.identifier.uri: http://hdl.handle.net/10810/49828
dc.description.abstract: Multiword Expressions (MWEs) are idiosyncratic combinations of words which pose important challenges to Natural Language Processing. Some kinds of MWEs, such as verbal ones, are particularly hard to identify in corpora, due to their high degree of morphosyntactic flexibility. This paper describes a linguistically motivated method to gather detailed information about verb+noun MWEs (VNMWEs) from corpora. Although the main focus of this study is Spanish, the method is easily adaptable to other languages. Monolingual and parallel corpora are used as input, and data about the morphosyntactic variability of VNMWEs is extracted. This information is then tested in an identification task, obtaining an F score of 0.52, which is considerably higher than related work. [es_ES]
dc.description.sponsorship: This work was funded by the Basque Government, who qualified the IXA research group (of which the authors of this article are members) as an A type research group (IT1343-19). It is also part of the project entitled "MODENA: advanced neural modeling for high-quality translation" (KK-2018/00087). [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: Public Library of Science [es_ES]
dc.rights: info:eu-repo/semantics/openAccess [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/es/ [*]
dc.title: Learning about phraseology from corpora: A linguistically motivated approach for Multiword Expression identification [es_ES]
dc.type: info:eu-repo/semantics/article [es_ES]
dc.rights.holder: 2020 Inurrieta et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. [es_ES]
dc.rights.holder: Attribution 3.0 Spain [*]
dc.relation.publisherversion: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0237767 [es_ES]
dc.identifier.doi: 10.1371/journal.pone.0237767
dc.departamentoes: Lenguajes y sistemas informáticos [es_ES]
dc.departamentoeu: Hizkuntza eta sistema informatikoak [es_ES]


Except where otherwise noted, this item's license is described as 2020 Inurrieta et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.