ECM - École Centrale de Marseille : UMR7316 (Pôle de l'étoile - Technopole de Château-Gombert - 38 rue Frédéric Joliot-Curie - 13013 Marseille - France)
Abstract : In this paper, we investigate 39 variable selection procedures to give practitioners an overview of the existing literature. "Let the data speak for themselves" has become the motto of many applied researchers since the amount of available data has grown significantly. Automatic model selection has been motivated by the search for data-driven theories for quite a long time now. However, while great extensions have been made on the theoretical side, basic procedures such as Stepwise Regression are still used in most empirical work. Some reviews of variable selection are already available in the literature, but they always focus on a specific topic such as linear regression, groups of variables or smoothly varying coefficients. Here we provide a review of the main methods and state-of-the-art extensions, as well as a typology of them, over a wide range of model structures (linear, grouped, additive, partially linear and non-parametric). We explain which methods to use for different modelling purposes and what the key differences among them are. We also review two methods for improving variable selection in the general sense.
https://hal-amu.archives-ouvertes.fr/hal-01954386
Contributor : Elisabeth Lhuillier
Submitted on : Monday, June 8, 2020 - 4:54:34 PM
Last modification on : Wednesday, August 5, 2020 - 3:08:26 AM