Multi-LEX: a database of multi-word frequencies for French and English - Institut Langage, Communication et Cerveau
Journal article, Behavior Research Methods, Year: 2022

Multi-LEX: a database of multi-word frequencies for French and English

Abstract

Written word frequency is a key variable used in many psycholinguistic studies and is central in explaining visual word recognition. Indeed, methodological advances on single word frequency estimates have helped to uncover novel language-related cognitive processes, fostering new ideas and studies. In an attempt to support and promote research on a related emerging topic, visual multi-word recognition, we extracted from the exhaustive Google Ngram datasets a selection of millions of multi-word sequences and computed their associated frequency estimate. Such sequences are presented with Part-of-Speech information for each individual word. An online behavioral investigation making use of the French 4-gram lexicon in a grammatical decision task was carried out. The results show an item-level frequency effect of word sequences. Moreover, the proposed datasets were found useful during the stimulus selection phase, allowing more precise control of the multi-word characteristics.
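
As a rough illustration of how a frequency estimate can be derived from n-gram count records, here is a minimal Python sketch. It is not the authors' pipeline: the tab-separated record layout (n-gram, year, match count, volume count), the `word_POS` token convention, the per-million normalisation, and the function names `split_pos`, `aggregate_counts`, and `per_million` are all assumptions made for the example, and the sample records are invented.

```python
from collections import defaultdict


def split_pos(token):
    """Split a 'word_POS' token into (word, pos); pos is None if untagged.

    The 'word_POS' convention is an assumption for this sketch.
    """
    word, sep, pos = token.rpartition("_")
    if not sep:
        return token, None
    return word, pos


def aggregate_counts(lines, min_year=None):
    """Sum match counts per n-gram across years.

    Each record is assumed to be: ngram <TAB> year <TAB> match_count <TAB> volume_count.
    Records older than min_year are skipped when min_year is given.
    """
    totals = defaultdict(int)
    for line in lines:
        ngram, year, match_count, _volume_count = line.rstrip("\n").split("\t")
        if min_year is not None and int(year) < min_year:
            continue
        totals[ngram] += int(match_count)
    return dict(totals)


def per_million(totals):
    """Convert raw counts to occurrences per million n-grams in the sample."""
    corpus_size = sum(totals.values())
    return {ngram: count * 1_000_000 / corpus_size for ngram, count in totals.items()}


# Invented sample records, for illustration only.
records = [
    "the_DET cat_NOUN is_VERB here_ADV\t1990\t30\t12",
    "the_DET cat_NOUN is_VERB here_ADV\t2000\t50\t20",
    "a_DET dog_NOUN was_VERB there_ADV\t2000\t20\t9",
]

totals = aggregate_counts(records)
frequencies = per_million(totals)
tagged_words = [split_pos(token) for token in records[0].split("\t")[0].split(" ")]
```

Normalising by the total count in the sample keeps estimates comparable across corpora of different sizes; a real corpus-wide total would replace `corpus_size` in practice.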
Main file
Armando.al.Ngram.BehavResMethods.2022.pdf (766.85 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03840777, version 1 (04-01-2023)

Cite

Marjorie Armando, Jonathan Grainger, Stephane Dufau. Multi-LEX: a database of multi-word frequencies for French and English. Behavior Research Methods, 2022, ⟨10.3758/s13428-022-02018-9⟩. ⟨hal-03840777⟩
