EHME: A new word database for research in Basque language
The Spanish Journal of Psychology 17 : 1-23
Abstract
[EN] This article presents EHME, the frequency dictionary of Basque structure, an online program that enables researchers in psycholinguistics to extract word and nonword stimuli, based on a broad range of statistics concerning the properties of Basque words. The database consists of 22.7 million tokens, and properties available include morphological structure frequency and word-similarity measures, apart from classical indexes: word frequency, orthographic structure, orthographic similarity, bigram and biphone frequency, and syllable-based measures. Measures are indexed at the lemma, morpheme and word level. We include reliability and validation analysis. The application is freely available, and enables the user to extract words based on concrete statistical criteria, as well as to obtain statistical characteristics from a list of words .