
Foerster Lab for AI Research awarded 2023 Amazon Research Award for Machine Learning

Associate Professor Jakob Foerster and research students awarded $70,000 in cash and $50,000 in AWS credits for the proposal "Compute-only Scaling of Large Language Models"

Foerster Group, Department of Engineering Science, University of Oxford

Associate Professor Jakob Foerster and his DPhil students Jonathan Cook, Eltayeb Ahmed, and Thomas Foster have been awarded an Amazon Research Award for a project aiming to improve Large Language Models (LLMs), the core backbone behind the GenAI revolution that is currently unfolding. The Amazon Research Awards program offers flexible funds and AWS Promotional Credits to support research at academic institutions and non-profit organisations in areas that align with Amazon's mission to advance customer-obsessed science.

Professor Foerster’s research team aims to revolutionise LLMs by unlocking compute-only scaling, currently one of the most crucial real-world frontiers in artificial intelligence (AI). So far, jointly increasing (“scaling”) the amount of compute and training data has been a crucial factor in improving the capabilities of these models. However, as hardware capabilities continue to improve, sufficient high-quality training data is becoming a limiting factor.

Instead, the researchers aim to explore scaling compute alone, without relying on additional training data. By leveraging search-based methods inspired by AlphaZero and incorporating the concept of intermediate reasoning in language models, they seek to enable LLMs to engage in contemplation and reasoning before generating output for challenging predictions, much like humans can “think harder” when it matters. AlphaZero, which integrates search during both training and testing, was the first machine learning algorithm to achieve superhuman play in chess from scratch.
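The idea of spending extra inference-time compute can be illustrated with a deliberately simple toy (not the team's actual method): sample several candidate answers and keep the one a verifier scores highest, a basic "best-of-N" form of test-time search. Here `propose` and `score` are hypothetical stand-ins for a language model's sampler and a verifier.

```python
import random

# Toy sketch of compute-only scaling via best-of-N sampling.
# `propose` and `score` are hypothetical stand-ins, not real LLM APIs.

TRUTH = 17 * 24  # the "hard prediction" our toy model is guessing at

def propose(rng):
    # Stand-in for sampling one candidate answer from a noisy model.
    return TRUTH + rng.randint(-5, 5)

def score(candidate):
    # Stand-in verifier: higher means closer to the true answer.
    return -abs(candidate - TRUTH)

def best_of_n(n, seed=0):
    # Spend n model calls (more compute) and keep the best-scoring one.
    rng = random.Random(seed)
    return max((propose(rng) for _ in range(n)), key=score)

# With a fixed seed, drawing more samples can only match or improve
# the best score found - better answers from compute alone, with no
# extra training data.
print(best_of_n(1), best_of_n(50))
```

The point of the sketch is the scaling knob: `n` trades compute for answer quality at inference time, analogous in spirit to how AlphaZero's search deepens with more compute.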

The research will be carried out by the DPhil students and Professor Foerster in the Foerster Lab for AI Research, including proof-of-concept results and large-scale experiments using AWS computing infrastructure unlocked by this grant.

Watch Professor Foerster's explainer videos, including 'What is a Large Language Model?'