dc.contributor.authorLastra Díaz, Juan José
dc.contributor.authorGoikoetxea Salutregi, Josu
dc.contributor.authorTaieb, Mohamed Ali Hadj
dc.contributor.authorGarcía Serrano, Ana
dc.contributor.authorBen Aouicha, Mohamed
dc.contributor.authorAgirre Bengoa, Eneko
dc.date.accessioned2020-01-17T13:05:54Z
dc.date.available2020-01-17T13:05:54Z
dc.date.issued2019-10-26
dc.identifier.citationData In Brief 26 : (2019) // Article ID UNSP 104432es_ES
dc.identifier.issn2352-3409
dc.identifier.urihttp://hdl.handle.net/10810/38598
dc.description.abstractThis data article introduces a reproducibility dataset that allows the exact replication of all experiments, results and data tables presented in our companion paper (Lastra-Diaz et al., 2019), which reports the largest experimental survey in the literature on ontology-based semantic similarity methods and Word Embeddings (WE) for word similarity. The implementation of all our experiments, as well as the gathering of all raw data derived from them, was based on the software implementation and evaluation of all methods in the HESML library (Lastra-Diaz et al., 2017), and their subsequent recording with Reprozip (Chirigati et al., 2016). The raw data consists of a collection of data files gathering the raw word-similarity values returned by each method for each word pair evaluated in any benchmark. Raw data files were processed by running an R-language script that computes all evaluation metrics reported in (Lastra-Diaz et al., 2019), such as Pearson and Spearman correlation, harmonic score and statistical significance p-values, and that automatically generates all data tables shown in our companion paper. Our dataset provides all input data files, resources and complementary software tools needed to reproduce from scratch all our experimental data, statistical analysis and reported data. Finally, our reproducibility dataset provides a self-contained experimentation platform for running new word similarity experiments that include methods or word similarity benchmarks not considered in the survey. (c) 2019 The Authors. Published by Elsevier Inc.es_ES
dc.description.sponsorshipThis work has been partially supported by the Spanish Ministry of Economy and Competitiveness VEMODALEN project (TIN2015-71785-R), the UPV/EHU (excellence research group) and the Spanish Research Agency LIHLITH project (PCIN-2017-118/AEI) in the framework of EU ERA-Net CHIST-ERA.es_ES
dc.language.isoenges_ES
dc.publisherElsevieres_ES
dc.relationinfo:eu-repo/grantAgreement/MINECO/TIN2015-71785-Res_ES
dc.rightsinfo:eu-repo/semantics/openAccesses_ES
dc.rights.urihttp://creativecommons.org/licenses/by/3.0/es/*
dc.subjectontology-based semantic similarity measureses_ES
dc.subjectword embedding modelses_ES
dc.subjectinformation content modelses_ES
dc.subjectwordnetes_ES
dc.subjectexperimental surveyes_ES
dc.subjectHESMLes_ES
dc.subjectreprozipes_ES
dc.titleReproducibility dataset for a large experimental survey on word embeddings and ontology-based methods for word similarityes_ES
dc.typeinfo:eu-repo/semantics/articlees_ES
dc.rights.holderThis is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). Data in Brief 26 (2019) 104432es_ES
dc.rights.holderAtribución 3.0 España*
dc.relation.publisherversionhttps://www.sciencedirect.com/science/article/pii/S2352340919307875?via%3Dihub#sec1es_ES
dc.identifier.doi10.1016/j.dib.2019.104432
dc.departamentoesLenguajes y sistemas informáticoses_ES
dc.departamentoeuHizkuntza eta sistema informatikoakes_ES


