Senior DevOps / Site Reliability Engineer (SRE)
We are looking for a Senior DevOps / Site Reliability Engineer (SRE) with strong expertise in infrastructure automation, CI/CD optimization, observability, and cloud reliability engineering.
The role involves building scalable DevOps solutions on Azure, ensuring high availability, resilience, and security of mission-critical systems.
The ideal candidate will have hands-on experience in Infrastructure as Code (IaC), container orchestration, monitoring frameworks, and incident response, while driving DevSecOps alignment across development, QA, and architecture teams.
Key Responsibilities
CI/CD Pipeline Development – Build and manage scalable CI/CD pipelines for web and mobile applications, enabling automated build, test, and deployment workflows.
Pull Request Validation Workflows – Implement automated PR pipelines with linting, static analysis, unit testing, and integration checks to enforce code quality.
Security & Code Quality Automation – Integrate SonarQube, SCA (Software Composition Analysis), and vulnerability scanning tools to enforce compliance and security.
Environment-Specific Deployments – Configure deployment strategies with approval gates, rollback mechanisms, and environment-specific variables.
Infrastructure as Code (IaC) – Automate infrastructure provisioning using Terraform and Helm charts.
Azure Cloud Management – Ensure availability, scalability, and resilience of applications hosted on Azure (AKS, App Services, VMs, Functions, App Gateway, VNets, Key Vault).
Observability & Monitoring – Implement monitoring with Azure Monitor, Grafana, Prometheus, and Application Insights, and set up custom alerts and dashboards.
Secrets Management – Manage and secure secrets via Azure Key Vault and integrate them with CI/CD pipelines.
Incident Response & SRE Practices – Establish on-call rotations, conduct postmortems, and apply reliability engineering practices for system stability.
Collaboration – Work closely with development, QA, and architecture teams to align with DevSecOps best practices.
Capacity & Reliability Planning – Contribute to scalability, cost optimization, and long-term infrastructure planning.
Must-Have Skills
Strong expertise in Azure DevOps (Pipelines, Repos, Artifacts).
Deep knowledge of Terraform and Helm for IaC and Kubernetes management.
Hands-on experience with Azure Kubernetes Service (AKS) and related Azure services (Functions, App Gateway, VNets, Key Vault).
Proficiency in observability tools – Azure Monitor, Application Insights, Prometheus, Grafana.
Solid understanding of Linux, Docker, Kubernetes, and CI/CD workflows.
DevOps Tech Stack
Category – Tools / Technologies
CI/CD Pipelines – Azure DevOps, GitHub Actions, GitLab CI, Jenkins, Bitrise
Version Control – Git, GitHub, GitLab, Bitbucket
Infrastructure as Code – Terraform, Ansible, Helm, Bicep
Containerization & Orchestration – Docker, Kubernetes, AKS/EKS/GKE, Dapr
Code Quality & Security – SonarQube, Snyk, Trivy, Checkmarx, ESLint, Prettier
Monitoring & Logging – Prometheus, Grafana, ELK Stack, Azure Monitor, App Insights
Artifact Management – JFrog Artifactory, Nexus, GitHub Packages
Mobile Build Automation – Fastlane, Bitrise, App Center, Firebase App Distribution
Release Management – Azure DevOps Releases, GitHub Environments, Argo CD
Secrets Management – Azure Key Vault, HashiCorp Vault, AWS Secrets Manager
Good to Have
Experience with GitOps, ArgoCD, and Service Mesh (Istio/Linkerd).
Knowledge of security tools – Snyk, AquaSec, Trivy.
Familiarity with FinOps practices for cloud cost monitoring and optimization.
Soft Skills & Competencies
Strong problem-solving and analytical abilities.
Ability to manage complex projects and multiple environments.
Excellent communication and collaboration skills.
Passion for automation, reliability, and continuous improvement.
This is an on-site role in Abu Dhabi, UAE, within a fast-paced enterprise digital transformation environment.
The candidate will be at the center of mission-critical projects, collaborating with cross-functional teams to deliver secure, resilient, and scalable DevOps and SRE solutions.