DevOps Engineer IV

$100,000 - $150,000
Permanent

Posted on Tue, 17 Jun 2025

Framework Track: Technical

Framework Band: C-Upper

 

Purpose and Scope

  • Designs, develops, modifies, adapts, and implements short- and long-term solutions to EIT through new and existing applications, systems architecture, network systems and applications infrastructure with a focus on Kubernetes and cloud-native solutions.
  • Responsible for providing application and infrastructure solutions that support complex, compute-intensive workloads, parallel filesystems, low-latency networks, and agile project management/DevOps.
  • Partners with scientists, engineers, and other HPC infrastructure and application experts to provide HPC solutions that fully leverage computing resources in the cloud.

 

Key Responsibilities

  • Design and implement efficient workflows and tools within HPC environments that leverage modern tools and practices (private/public cloud computing, cloud-native principles, Kubernetes, CI/CD, GitOps, and observability) to meet our performance and scalability requirements.
  • Collaborate with other teams in the organization, such as IMG R&D, data science, acquisition, data management, and others, to understand their HPC requirements and ensure that the HPC environment can support their work.
  • Develop and maintain procedures for deploying, configuring, and monitoring HPC infrastructure (both software and hardware).
  • Continually optimize the HPC environment to improve performance, scalability, and cost-effectiveness by using DevOps methodologies and automation tools.
  • Support the development and execution of the cloud, container, and Kubernetes implementation strategy within the organization’s existing IT infrastructure.
  • Collaborate with colleagues in HPC Computing, Information Technology, and Upstream/Downstream business units to design and deliver HPC solutions in the cloud.
  • Effectively and efficiently automate and optimize business solutions, and troubleshoot and resolve system problems.
  • Ensure all technical solutions are aligned with the vision, principles, architecture, and standards defined by the relevant architecture, security, and data teams.
  • Demonstrate experience with four or more of the following technologies: Linux, parallel filesystems, networking, cloud, containerization, Kubernetes, Python, Perl, Bash, Ansible, and PowerShell.
  • Support HPC infrastructures, including multiple sub-environments such as visualisation, compute, storage, interconnects, code, and systems management.
  • Maintain system software, utilising debugging tools for problem isolation; perform software builds, software upgrades, and patch installation as needed.
  • Use and share knowledge of HPC software development techniques, including compilers, languages, and programming models.

 

Key Competencies

  • Experienced in managing large projects or leading development of a product, and in researching solutions to technical problems and challenges.
  • Acts as a key technical advisor to a project team.
  • Able to provide direct input to resource planning, and constructively contributes to strategic technology planning.
  • Strong ability to perform advanced analysis of complex technical issues and determine their impact and implications on current practices within own discipline.
  • Continually improves existing technology/solutions, practices, and approaches.
  • Effectively communicates highly technical information to diverse internal and external audiences and stakeholders.
  • Able to recognise technical issues and change the way problems are approached and solved in the future.
  • Can effectively translate technical objectives/practices into local procedures and workflows.
  • Maintains a significant breadth and/or depth of experience in a technical area or field while having sound knowledge of related disciplines.
  • Maintains a high level of internal and external visibility as a technical expert within multiple technologies.
  • Coaches and guides colleagues in areas and topics that promote growth in the team.

 

Required Experience

  • 5+ years of DevOps engineering experience
  • 5+ years of AWS cloud engineering experience
  • Mastery of the Linux operating system, CI/CD, automated deployments, Infrastructure-as-Code, and configuration management (GitHub/Azure DevOps)
  • Proficiency in IaC and configuration management tools (Ansible, Terraform)
  • Proficiency in scripting/programming languages (Python, Bash, Go)
  • Fundamental knowledge of container orchestration, Kubernetes, and cloud-native architecture
  • Fundamental knowledge of large-scale networks and parallel file systems (Lustre, GPFS, BeeGFS, NFS, etc.)
  • Strong communicator – both verbal and written

 

For immediate consideration please apply today. For more information about this and other roles in the DevOps space, please reach out directly.

    Advertised By:
    James Husher