Cloud and GPU Infrastructure Manager

London 1 month ago Remote Full-time External
540.6k - 632.0k / yr
The Department of Computing is seeking a dedicated and skilled individual to manage our internal large-scale GPU infrastructure and cloud facilities.

About the role

The Department of Computing wishes to recruit a Cloud and GPU Infrastructure Manager to be responsible for the administration, maintenance, and optimization of both our internal GPU infrastructure and our cloud-based systems. This hybrid role requires a deep understanding of GPU technologies, cloud platforms, and high-performance computing. You will work closely with various teams to support their computing needs, ensuring that our infrastructure remains robust, scalable, and efficient.

What you would be doing

Some of the duties include:

• Internal GPU Infrastructure Management: Oversee the setup, configuration, and maintenance of large-scale GPU clusters within our data centres, ensuring optimal performance and reliability.
• Strategy Development: Develop, execute, and evangelize a GPU strategy that considers the platform, architecture, security, and commercial aspects of adopting technologies to support wider departmental aims.
• Commercial Management: Define, seek agreement on, and adhere to available budgets, taking both short-term and long-term needs into consideration, based on wider requirements and constraints.
• Cloud Facilities Administration: Manage our cloud infrastructure on platforms such as AWS, GCP, or Azure, with a focus on GPU and compute-intensive resources.
• Performance Optimization: Continuously monitor and optimize the performance of both internal and cloud-based systems, implementing best practices for resource utilization.
• Automation and Scripting: Develop and maintain automation scripts to streamline infrastructure management tasks, reduce manual intervention, and enhance system efficiency.

The ideal candidate will be educated to degree level in Computer Science or a closely related subject, or have equivalent experience.
You will have proven experience in managing large-scale GPU infrastructure and cloud environments, and in-depth knowledge of GPU hardware and software (NVIDIA, AMD), cloud platforms (AWS, GCP, Azure), and high-performance computing environments.

Further information

Our preferred method of application is online via our website. Please go to http://www.imperial.ac.uk/job-applicants/ and search using the vacancy reference number starting with ENG, or click the 'Apply' button above.

Should you have any queries about the post, please contact: marina.hall@imperial.ac.uk

£57,785 to £67,557 per annum