LexFindR: A fast, simple, and extensible R package for finding similar words in a lexicon

Li, ZhaoBin; Crinnion, Anne Marie; Magnuson, James S.

Date

2022

Li, Z., Crinnion, A.M. & Magnuson, J.S. LexFindR: A fast, simple, and extensible R package for finding similar words in a lexicon. Behav Res 54, 1388–1402 (2022). https://doi.org/10.3758/s13428-021-01667-6
Behavior Research Methods

URI

http://hdl.handle.net/10810/58617

Abstract

Language scientists often need to generate lists of related words, such as potential competitors. They may do this for purposes of experimental control (e.g., selecting items matched on lexical neighborhood but varying in word frequency), or to test theoretical predictions (e.g., hypothesizing that a novel type of competitor may impact word recognition). Several online tools are available, but most are constrained to a fixed lexicon and fixed sets of competitor definitions, and may not give the user full access to or control of source data. We present LexFindR, an open-source R package that can be easily modified to include additional, novel competitor types. LexFindR is easy to use. Because it can leverage multiple CPU cores and uses vectorized code when possible, it is also extremely fast. In this article, we present an overview of LexFindR usage, illustrated with examples. We also explain the details of how we implemented several standard lexical competitor types used in spoken word recognition research (e.g., cohorts, neighbors, embeddings, rhymes), and show how “lexical dimensions” (e.g., word frequency, word length, uniqueness point) can be integrated into LexFindR workflows (for example, to calculate “frequency-weighted competitor probabilities”), for both spoken and visual word recognition research.

Collections

BCBL-Publications