GLASS
GLASS: Scaling AI Systems
An ARIA-funded Scaling Compute Project
GLASS is a revolutionary new project, rethinking the way we build AI training systems.
The project aims to solve the existing communication bottleneck in AI training, and introduce a new open interconnect for it. The project rethinks system design at multiple levels, from the low-level physical design to the highest-level application. It combines innovation in photonics, networking, computer architecture, memory systems, hardware/software co-design, distributed systems and more. GLASS will address real-world challenges such as resilience and recovery, manufacturability, and operational constraints.
The Vision
Or how we plan to change computing as you know it
The goal of the project is to develop an AI training system at 1/1000 of the cost. It aims to fit an AI system with hundreds to thousands of AI accelerators within a single server. A single server with that much acceleration power will enable more researchers to do frontier AI research, and will allow industry and universities to afford developing and using cutting-edge AI algorithms.
The project aims not only to cut costs and widen access to resources, but also to make AI more sustainable. It will provide a significant power reduction, coupled with reduced use of physical resources and reduced embodied carbon.
The project will use optical interconnect through glass to connect all the accelerators, removing the existing communications bottleneck. By putting all the accelerators within a single server, we can rethink what a system is: what should be the memory hierarchy? How should it be managed? How to prevent, detect and recover from failures?
This is going to completely change the way we do AI research: from the availability of affordable compute resources, through the rethinking of AI models for systems, to the use of acceleration across many scientific computing domains.
The Team
Prof Amro Awad
Prof Amro Awad investigates new use cases enabled by emerging memory technologies and architectures, and also explores novel memory architecture optimizations for important workloads such as AI training.
Prof Dominic O'Brien
Prof Dominic O’Brien has a wide range of optoelectronic systems integration projects, successfully delivering high-data-rate connectivity.
Prof Martin Booth
Prof Martin Booth's research covers a wide range of optical systems engineering, from biomedical imaging through to laser-based precision manufacturing.
Prof Nick McKeown
Prof Nick McKeown was a networking professor at Stanford, serial entrepreneur in Silicon Valley and executive at Intel.
Prof Noa Zilberman
Prof Noa Zilberman's research focuses on the integration of micro-level architectures and macro-level, large-scale networked systems.
Dr Patrick Salter
Dr Patrick Salter is passionate about UK manufacturing and translating ideas from academia through to impact in industry.
Join The Team!
The GLASS team will include 16 researchers at its initial stage, including PhD (DPhil) students, postdoctoral researchers and a program manager.
Below are the current and recent open positions in the team.
Reach out if you want to learn more about these positions!
We are looking for a Senior Researcher to join the team, working on system design for AI systems.
You will lead the development of the overall AI system solution, ranging from orchestration and management to the integration of all system components. You will be focused on the performance of AI training workloads, and will collaborate with all other researchers in the project, providing innovation across multiple domains.
You will typically be several years post-PhD, with significant experience in distributed systems, hardware/software co-design, or related areas. You are expected to have publications at related top-tier venues such as SOSP, EuroSys or OSDI. Previous experience of leading a team or relevant industrial experience is desired.
This position will open soon. Meanwhile, you can contact noa.zilberman@eng.ox.ac.uk for more details or to register your interest.
We are looking for a Postdoctoral Researcher to join the team, working on a novel interconnect for AI systems.
You will develop a new interconnect for AI systems, covering aspects ranging from topology and routing to congestion control and resilience. You will collaborate with networking researchers, as well as researchers in systems, photonics and memory systems.
You should hold a PhD in Computer Science or Engineering, with significant experience in wired computer networks. You are expected to have publications at related top-tier venues such as NSDI or SIGCOMM. Previous industrial experience is an advantage.
This position will open soon. Meanwhile, you can contact noa.zilberman@eng.ox.ac.uk for more details or to register your interest.
We are seeking a Project Manager to provide management support and guidance.
You will need to understand research funding arrangements, advise on budgets and expenditure, financial forecasting and reporting, help with onboarding of new staff/students and navigate governance issues such as data protection, export control and ethics. You will work with a variety of key stakeholders, external partners and other university staff.
You should be educated to degree level or equivalent and have experience of administration or project management in a complex organisation. Proven financial aptitude with the ability to produce financial reports is essential, as are strong communication and relationship building skills.
For more details, contact louise.bristow@eng.ox.ac.uk
We expect to recruit up to 5 PhD students to work on the GLASS project.
Applications must be submitted through Oxford's graduate admissions system. The application deadline is December 3rd, 2024 (noon UK time).
The list of studentships is provided below. We do not expect to recruit to the project in areas not listed.
- PhD (DPhil) studentship in Engineering Science - High performance interconnect for AI systems (1 position).
You will research a novel interconnect for AI systems, covering aspects of topology, routing, congestion control, communication protocols and resilience.
You should have excellent knowledge of computer networks, with previous experience in network algorithms, simulation tools or prototyping network devices.
See this link for more details.
Contact noa.zilberman@eng.ox.ac.uk to discuss the studentship.
- PhD (DPhil) studentship in Engineering Science - Advanced memory architectures for AI systems (2 positions).
You will research an advanced memory architecture for AI systems, covering aspects such as memory management, hardware prefetching and near-memory computing.
You should have excellent knowledge of computer architecture, with previous experience in hardware design or memory simulation tools.
See this link for more details.
Contact amro.awad@eng.ox.ac.uk to discuss the studentship.
- PhD (DPhil) studentship in Engineering Science - High performance AI Systems (1 position).
You will research aspects of management, resilience and sustainability for frontier AI models in the developed AI system.
You should have excellent knowledge of networked systems, with previous experience in distributed algorithms, virtualization, and workload monitoring and optimization.
See this link for more details.
Contact noa.zilberman@eng.ox.ac.uk to discuss the studentship.
- PhD (DPhil) studentship in Engineering Science - Modelling optical systems (1 position).
You will research the modelling of optical interconnect for AI systems, covering aspects ranging from volumetric interconnect and processing devices through to system-level models.
You should have excellent knowledge of optical systems, with previous experience of modelling optical links and using relevant simulation tools.
Contact dominic.obrien@eng.ox.ac.uk to discuss the studentship.
For all studentships, you must hold a first-class Bachelor's degree in Electrical Engineering, Computer Science or a related field. A Master's degree is desired, as well as previous experience in the relevant research field (e.g., final year project, internships).
Studentship applications require a research proposal of 1000-1500 words, and it is recommended to include a preliminary draft of your proposal when contacting the potential supervisor.
Note that for all studentships, using AI/ML for system development is NOT in scope.
If applying to one of these studentships, on the application form make sure to preface your research proposal title with "GLASS" and to list one of the team members as the potential supervisor.
This position has already been filled.