Skip to Main content Skip to Navigation
Preprints, Working Papers, ...

Nonsmooth automatic differentiation: a cheap gradient principle and other complexity results

Abstract : We provide a simple model to estimate the computational costs of the backward and forward modes of algorithmic differentiation for a wide class of nonsmooth programs. Prominent examples are the famous relu and convolutional neural networks together with their standard loss functions. Using the recent notion of conservative gradients, we then establish a "nonsmooth cheap gradient principle" for backpropagation encompassing most concrete applications. Nonsmooth backpropagation's cheapness contrasts with concurrent forward approaches which have, at this day, dimensional-dependent worst case estimates. In order to understand this class of methods, we relate the complexity of computing a large number of directional derivatives to that of matrix multiplication. This shows a fundamental limitation for improving forward AD for that task. Finally, while the fastest algorithms for computing a Clarke subgradient are linear in the dimension, it appears that computing two distinct Clarke (resp. lexicographic) subgradients for simple neural networks is NP-Hard.
Complete list of metadata
Contributor : Ryan Boustany Connect in order to contact the contributor
Submitted on : Tuesday, May 31, 2022 - 5:45:58 PM
Last modification on : Thursday, September 22, 2022 - 10:44:04 AM


Files produced by the author(s)


  • HAL Id : hal-03683640, version 1


Jérôme Bolte, Ryan Boustany, Edouard Pauwels, Béatrice Pesquet-Popescu. Nonsmooth automatic differentiation: a cheap gradient principle and other complexity results. 2022. ⟨hal-03683640⟩



Record views


Files downloads