Deterministic and probabilistic backward error analysis of neural networks in floating-point arithmetic
Abstract
The use of artificial neural networks is now widespread across a wide variety of tasks. In this context of very rapid development, issues related to the storage and computational performance of these models emerge, since networks are sometimes very deep and comprise up to billions of parameters. For all these reasons, the use of reduced precision is increasingly being considered, although until now its accuracy and robustness have been approached mostly from a practical standpoint or verified by software. The aim of this work is to provide formal tools to better understand, explain, and predict the accuracy and stability of neural networks when using floating-point arithmetic. To this end, we first extend to neural networks some well-known concepts from numerical linear algebra, such as the condition number and the backward error. We then apply a rounding error analysis based on existing tools in numerical linear algebra to obtain both forward and backward error bounds.
This includes both deterministic worst-case bounds and probabilistic bounds that are sharper on average. These bounds both help ensure the proper functioning of neural networks once trained, and provide recommendations on architectures and training methods to enhance the robustness of neural networks.
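As an illustrative sketch (not code from the paper), the gap between deterministic worst-case bounds and probabilistic bounds can be observed on the basic building block of a neural network layer, a dot product computed in float32: the classical deterministic forward error bound grows like nu, whereas the probabilistic bound grows only like √n·u, which is why it is sharper on average. The vectors, sizes, and the float64 reference below are assumptions chosen for the demonstration.

```python
import numpy as np

# Random input vectors; a float64 dot product serves as the "exact" reference.
rng = np.random.default_rng(0)
n = 100_000
x = rng.standard_normal(n)
y = rng.standard_normal(n)

exact = np.dot(x, y)
approx = float(np.dot(x.astype(np.float32), y.astype(np.float32)))

u = np.finfo(np.float32).eps / 2        # unit roundoff of float32
cond_term = np.sum(np.abs(x * y))       # conditioning term sum_i |x_i y_i|

err = abs(approx - exact)
det_bound = n * u * cond_term           # deterministic worst-case bound ~ n*u
prob_bound = np.sqrt(n) * u * cond_term # probabilistic bound ~ sqrt(n)*u

# The worst-case bound always holds; the probabilistic bound holds with high
# probability and is smaller by a factor sqrt(n), hence much closer to the
# observed error.
print(err <= det_bound, err <= prob_bound, det_bound / prob_bound)
```

In this setting the observed error typically sits well below even the probabilistic bound, illustrating why worst-case analyses can be very pessimistic for large n.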
Origin: Files produced by the author(s)