Identification of Ambiguous Multiword Expressions Using Sequence Models and Lexical Resources

Manon Scholivet 1, Carlos Ramisch 1
1 TALEP - Traitement Automatique du Langage Ecrit et Parlé
LIS - Laboratoire d'Informatique et Systèmes
Abstract : We present a simple and efficient tagger capable of identifying highly ambiguous multiword expressions (MWEs) in French texts. It is based on conditional random fields (CRF), using local context information as features. We show that, in some cases, this approach obtains results close to those of more sophisticated parser-based MWE identification methods, without requiring syntactic trees from a treebank. Moreover, we study how well the CRF can take into account external information coming from a lexicon.
Document type :
Conference papers

Cited literature [16 references]

https://hal-amu.archives-ouvertes.fr/hal-01795903
Contributor : Carlos Ramisch
Submitted on : Wednesday, May 23, 2018 - 4:35:24 PM
Last modification on : Friday, May 25, 2018 - 1:40:11 AM
Long-term archiving on : Friday, August 24, 2018 - 4:17:08 PM

File

W17-1723.pdf
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-01795903, version 1

Citation

Manon Scholivet, Carlos Ramisch. Identification of Ambiguous Multiword Expressions Using Sequence Models and Lexical Resources. Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017), 2017, Valencia, Spain. pp. 167-175. ⟨hal-01795903⟩
