Site Reliability Engineer (Lead Level)
West London (1-2 days per week in the office)
90-100k
Bonus up to 24%
Other outstanding benefits
Xpertise's client is a large, complex enterprise currently undergoing a major technology and SRE transformation and building a brand-new SRE function from the ground up. They’re looking for an experienced SRE who can help shape what “good” looks like across multiple high-impact platforms - from web and mobile to payments, CRM, and cloud infrastructure.
The role involves working on AWS-first infrastructure, modernising legacy systems, and embedding observability, automation, and CI/CD practices.
It also requires experience with containerisation (Docker or Kubernetes) and monitoring tools (Datadog, Splunk, etc.).
This is an opportunity to shape and lead a greenfield SRE capability within a large-scale digital environment. The role offers no on-call work, plenty of technical variety, and real influence over how the organisation adopts modern engineering practices.
What you will do:
• Drive SRE adoption across product teams
• Work on AWS-first infra with IaC + CI/CD
• Improve reliability, availability + performance
• Build observability + automation
• Help define SLOs/SLIs and mentor engineers
What we are looking for:
• 5 years' SRE / reliability engineering experience at Senior or Lead level
• AWS, Terraform, CI/CD
• Strong SRE fundamentals (monitoring, incident response, automation)
• Docker/K8s + tools like Datadog/Splunk
• Someone who can influence, not just build