Minimax Theory and its Applications 07 (2022), No. 1, 079--108
Copyright Heldermann Verlag 2022

Backtracking Gradient Descent Method and some Applications in Large Scale Optimisation. Part 1: Theory

Tuyen Trung Truong
Matematikk Institut, Universitetet i Oslo, Blindern - Oslo, Norway
tuyentt@math.uio.no

Hang-Tuan Nguyen
Axon AI Research
hnguyen@axon.com

Deep Neural Networks (DNN) are essential in many realistic applications, including Data Science. At the core of DNN is numerical optimisation, in particular gradient descent methods (GD). The purpose of this paper is twofold. First, we prove some new results on the backtracking variant of GD under very general situations. Second, we present a comprehensive comparison of our new results to those previously known in the literature, discussing the pros and cons of each method. To illustrate the efficiency of backtracking line search, we present experimental results (on validation accuracy, training time, and so on) on CIFAR10, based on implementations developed in another paper by the authors. Source codes for the experiments are available on GitHub.

Keywords: Backtracking, deep learning, global convergence, gradient descent, line search method, optimisation, random dynamical systems.

MSC: 65Kxx, 68Txx, 49Mxx, 68Uxx.