Roles & Responsibilities
Key Responsibilities
Design, implement, and maintain CI/CD pipelines to enable fast, reliable releases
Automate infrastructure provisioning using Infrastructure as Code (IaC) tools
Manage and optimise cloud infrastructure (AWS/Azure/GCP)
Deploy, scale, and manage containerised applications using Docker & Kubernetes
Monitor system health, availability, and performance using observability tools
Implement SRE best practices: SLIs, SLOs, SLAs, error budgets
Perform incident response, root cause analysis (RCA), and post-mortems
Improve system reliability, scalability, and cost efficiency
Collaborate with development, QA, and security teams
Implement DevSecOps practices and security automation
Required Skills & Qualifications
Strong experience in DevOps/SRE roles
Proficiency in Linux/Unix system administration
Experience with CI/CD tools (Jenkins, GitHub Actions, GitLab CI, Azure DevOps)
Hands-on experience with Docker & Kubernetes
Strong scripting skills in Python and shell (Bash)
Experience with IaC tools (Terraform, CloudFormation, ARM templates)
Knowledge of monitoring & logging tools (Prometheus, Grafana, ELK, Datadog)
Understanding of networking, security, and cloud architecture
Good to Have (Preferred Skills)
Experience with SRE frameworks and reliability engineering
Exposure to DevSecOps tools (Snyk, Trivy, SonarQube)
Experience in high-availability, large-scale systems
Cloud certifications (AWS/Azure/GCP)
Knowledge of service mesh (Istio, Linkerd)
Skills
Scalability
Kubernetes
Azure
Pipelines
Root Cause Analysis
Scripting
Reliability
Logging
Networking
Python
Docker
Ansible
Linux