Machine Learning
Machine Learning Engineer
Artificial Intelligence
United Kingdom
£70000 - £100000
Posted on Tue, 4 Nov 2025
Location: Oxford (hybrid working available)
Type: Full-time, permanent
Salary: Competitive with comprehensive benefits
About the Organisation
This world-leading AI research firm is reimagining how science and innovation translate into real-world impact. Through interdisciplinary collaboration and cutting-edge facilities, the institute develops end-to-end solutions that tackle humanity’s most complex challenges, from sustainable agriculture and healthcare to climate change and artificial intelligence.
They provide state-of-the-art laboratories and collaborative environments designed to accelerate discovery and application at scale.
The Role
The institute is seeking an experienced ML Infrastructure Engineer to join its growing compute and platform engineering team. You’ll play a pivotal role in developing and operating the high-performance cloud and compute backbone that powers large-scale machine learning and scientific discovery. This is a hands-on, high-impact role where you’ll design and optimise GPU infrastructure, improve performance across compute and storage layers, and ensure scalability, resilience, and security across AI research environments.
What You’ll Do
Design, deploy, and operate high-performance GPU compute clusters for large-scale ML training and inference.
Engineer reliable, high-throughput data paths, optimising I/O performance, caching, and storage locality.
Benchmark and troubleshoot compute, network, and orchestration bottlenecks to maximise performance.
Implement observability, automation, and security practices that support compliant, resilient environments.
Collaborate with research and data teams to forecast capacity, manage resources, and streamline ML experimentation pipelines.
Support the transition from traditional HPC systems to modern, containerised, and cloud-native infrastructure.
What We’re Looking For
Required
Proven experience designing, building, and maintaining large-scale ML or HPC compute infrastructure.
Deep understanding of GPU architecture, distributed training, and high-speed networking.
Expertise with high-throughput or parallel storage systems for ML/HPC workloads.
Solid grasp of IaC and CI/CD tooling (e.g. Terraform, Argo CD).
Proactive, self-directed approach with strong systems design and problem-solving skills.
Nice to Have
Familiarity with Lustre or similar distributed file systems.
Experience with performance benchmarking, profiling, and cost optimisation.
Background in scientific or research computing environments.
Why Join?
Help build the infrastructure driving breakthrough AI and scientific research.
Work in a collaborative, forward-thinking environment that values innovation and inclusion.
Access modern facilities and advanced technology in a rapidly growing institute.
Competitive salary and comprehensive benefits, including:
Enhanced holiday pay
Pension, life assurance, and income protection
Private medical insurance and therapy services
Electric car scheme and wellbeing perks