New Study Explains Why Neural Networks Prefer “Flatter” Solutions

A new research paper published in the journal Neural Networks sheds light on a long-standing mystery in artificial intelligence: why deep learning systems trained with gradient descent often settle on stable, high-performing solutions.

[Figure: graph showing how training instabilities favour flatter solutions in gradient descent]

The paper adds to ongoing work in the field aimed at demystifying deep learning, suggesting that what once appeared to be unstable behaviour is a key ingredient in AI’s effectiveness.

Neural networks, core tools in modern AI for tasks from image recognition to language processing, are known for their ability to learn complex patterns, but their training dynamics remain poorly understood. This research contributes to a growing effort to uncover the mathematical principles behind their success. The study investigates the training instabilities that occur as neural networks learn, and finds that these instabilities play a constructive role. Rather than being problematic, they guide models toward “flatter” regions of the loss landscape, which are associated with better performance on new data.
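The intuition behind “flatter is better” can be sketched with a toy one-dimensional example. The landscape, minima, and perturbation scale below are illustrative assumptions, not the paper’s actual experiments: a flat minimum has low curvature, so small parameter perturbations (a stand-in for the shift between training and new data) barely change the loss, while a sharp minimum is far more sensitive.

```python
import numpy as np

# Toy 1-D "loss landscape" with a sharp minimum near x = -2
# and a flat minimum near x = +2 (illustrative only).
def loss(x):
    sharp = 50.0 * (x + 2.0) ** 2   # high curvature
    flat = 0.5 * (x - 2.0) ** 2     # low curvature
    return np.minimum(sharp, flat)

def curvature(f, x, h=1e-3):
    # Central finite-difference estimate of the second derivative.
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

def perturbed_loss(f, x, sigma=0.1, n=1000, seed=0):
    # Average loss under small random parameter perturbations --
    # a crude proxy for sensitivity to distribution shift.
    rng = np.random.default_rng(seed)
    return f(x + sigma * rng.normal(size=n)).mean()

sharp_min, flat_min = -2.0, 2.0
print("curvature at sharp min:", curvature(loss, sharp_min))
print("curvature at flat min: ", curvature(loss, flat_min))
print("perturbed loss, sharp: ", perturbed_loss(loss, sharp_min))
print("perturbed loss, flat:  ", perturbed_loss(loss, flat_min))
```

Under the same perturbation, the flat minimum incurs a much smaller average loss than the sharp one, which is the sense in which flatness is associated with robustness.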

“What’s exciting here is that it turns an abstract observation about the orientation of dominant curvature into a concrete mechanism that helps explain the strong generalisation performance of neural networks.”

Lead author, Dr. Lawrence Wang


Central to the paper is a newly identified mechanism, the Rotational Polarity of Eigenvectors. This concept describes how the dominant directions of curvature in the loss landscape rotate during training. During training instabilities, this rotational mechanism gives rise to a coupled dynamical system that captures the intricate dynamics of learning and helps explain how gradient descent navigates the complex, very high-dimensional loss landscapes of modern deep learning models. This connection also helps account for the strong generalisation performance observed in modern deep neural networks despite their vast numbers of parameters.
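The quantity being tracked here can be illustrated on a tiny model. The sketch below is a minimal assumption-laden stand-in for the paper’s analysis (the two-parameter model, data, and learning rate are all invented for illustration): it follows the top eigenvector of the loss Hessian across gradient-descent steps and measures how far it rotates between consecutive steps.

```python
import numpy as np

# Minimal sketch: track rotation of the dominant curvature direction
# (top Hessian eigenvector) during gradient descent on a tiny
# two-parameter model y_hat = w1 * tanh(w0 * x).
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 1))
y = np.sin(2.0 * X[:, 0])

def loss(w):
    pred = w[1] * np.tanh(w[0] * X[:, 0])
    return np.mean((pred - y) ** 2)

def grad(w, h=1e-5):
    # Finite-difference gradient (2 parameters).
    g = np.zeros(2)
    for i in range(2):
        e = np.zeros(2); e[i] = h
        g[i] = (loss(w + e) - loss(w - e)) / (2 * h)
    return g

def hessian(w, h=1e-4):
    # Finite-difference 2x2 Hessian, symmetrised.
    H = np.zeros((2, 2))
    for i in range(2):
        e = np.zeros(2); e[i] = h
        H[:, i] = (grad(w + e) - grad(w - e)) / (2 * h)
    return 0.5 * (H + H.T)

def top_eigvec(H):
    vals, vecs = np.linalg.eigh(H)
    return vecs[:, np.argmax(vals)]

w = np.array([0.5, 0.5])
lr = 0.5  # illustrative choice of step size
v_prev = top_eigvec(hessian(w))
for step in range(50):
    w = w - lr * grad(w)
    v = top_eigvec(hessian(w))
    v = v if v @ v_prev >= 0 else -v  # resolve sign ambiguity
    angle = np.degrees(np.arccos(np.clip(v @ v_prev, -1.0, 1.0)))
    v_prev = v
print(f"rotation of top eigenvector on final step: {angle:.2f} degrees")
```

In a realistic network the Hessian is far too large to form explicitly; practitioners instead estimate its top eigenpair with iterative methods such as power iteration on Hessian-vector products.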

The findings could have practical implications for improving training stability and designing more efficient algorithms. By better understanding how instabilities shape learning, researchers may be able to build models that are both more reliable and easier to train.

“By better understanding how large AI models learn, we pave the way to models that are more reliable, easier to train, less power-consuming and safer.”

Co-author, Professor Stephen Roberts


Lawrence Wang and Stephen J. Roberts (2026). “Training instabilities favor flatter solutions in gradient descent.” Neural Networks, Volume 201, 108874. ISSN 0893-6080. https://doi.org/10.1016/j.neunet.2026.108874 (https://www.sciencedirect.com/science/article/pii/S0893608026003357)