Thackeray 325
Abstract or Additional Information
Stochastic gradient descent (SGD) and its momentum variants are the dominant methods for solving large-scale finite-sum optimization problems due to their efficiency and scalability. While theoretical research often focuses on i.i.d. sampling (with replacement), most practical machine learning libraries rely on shuffling-based methods (sampling without replacement).
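The distinction between the two sampling schemes can be made concrete in a few lines of NumPy (an illustrative sketch; the variable names are not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8  # number of component functions f_1, ..., f_n (illustrative)

# i.i.d. sampling (with replacement): each step draws an index uniformly,
# so within n steps some indices may repeat and others may never appear.
iid_indices = rng.integers(0, n, size=n)

# Shuffling (without replacement): one random permutation per epoch,
# so every component is visited exactly once per pass over the data.
shuffled_indices = rng.permutation(n)

print(iid_indices)               # may contain duplicates
print(sorted(shuffled_indices))  # always [0, 1, ..., n-1]
```

The per-epoch dependence between shuffled indices is exactly what makes the standard i.i.d. analysis inapplicable and motivates the techniques discussed in the talk.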
In this talk, we discuss some technical challenges in analyzing the convergence of shuffling gradient methods. We provide insights into why such methods converge and explain the intuition behind our algorithm, the Nesterov Accelerated Shuffling Gradient method. Our method achieves improved complexity bounds compared to existing shuffling algorithms and demonstrates strong empirical performance across a range of benchmarks.
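To fix intuition, the following is a minimal, hypothetical sketch of how Nesterov-style momentum can be combined with shuffled per-epoch gradient passes on a least-squares finite sum. The update structure, step size, and momentum schedule here are illustrative assumptions, not the exact algorithm presented in the talk:

```python
import numpy as np

# Least-squares finite sum: min_x (1/n) * sum_i (a_i^T x - b_i)^2
rng = np.random.default_rng(1)
n, d = 50, 5
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_i(x, i):
    """Gradient of the i-th component (a_i^T x - b_i)^2."""
    return 2.0 * (A[i] @ x - b[i]) * A[i]

x = np.zeros(d)
y = x.copy()            # extrapolation sequence
lr = 0.002              # per-component step size (illustrative)
momentum = 0.9          # fixed momentum parameter (illustrative)

for epoch in range(300):
    perm = rng.permutation(n)       # fresh shuffle each epoch
    x_prev = x.copy()
    z = y.copy()
    for i in perm:                  # one pass: each component used once
        z = z - lr * grad_i(z, i)
    x = z
    y = x + momentum * (x - x_prev)  # Nesterov-style extrapolation per epoch

# Compare against the exact least-squares solution
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x - x_star))
```

The design choice sketched here, applying momentum once per epoch rather than once per component step, reflects the general idea of accelerating shuffling methods at the epoch level; the talk's actual parameter choices and guarantees may differ.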
To conclude, we briefly discuss our recent work on multi-objective optimization with alternating block coordinate and function minimization, as well as developments in non-monotone methods for derivative-free optimization.