LexFindR: A fast, simple, and extensible R package for finding similar words in a lexicon
Date
2022
Author
Li, ZhaoBin
Crinnion, Anne Marie
Magnuson, James S.
Li, Z., Crinnion, A.M. & Magnuson, J.S. LexFindR: A fast, simple, and extensible R package for finding similar words in a lexicon. Behav Res 54, 1388–1402 (2022). https://doi.org/10.3758/s13428-021-01667-6
Behavior Research Methods
Abstract
Language scientists often need to generate lists of related words, such as potential competitors. They may do this for purposes
of experimental control (e.g., selecting items matched on lexical neighborhood but varying in word frequency), or to test
theoretical predictions (e.g., hypothesizing that a novel type of competitor may impact word recognition). Several online
tools are available, but most are constrained to a fixed lexicon and fixed sets of competitor definitions, and may not give the
user full access to or control of source data. We present LexFindR, an open-source R package that can be easily modified
to include additional, novel competitor types. LexFindR is easy to use. Because it can leverage multiple CPU cores and
uses vectorized code when possible, it is also extremely fast. In this article, we present an overview of LexFindR usage,
illustrated with examples. We also explain the details of how we implemented several standard lexical competitor types used
in spoken word recognition research (e.g., cohorts, neighbors, embeddings, rhymes), and show how “lexical dimensions”
(e.g., word frequency, word length, uniqueness point) can be integrated into LexFindR workflows (for example, to calculate
“frequency-weighted competitor probabilities”), for both spoken and visual word recognition research.