Site Reliability Engineer (SRE) - OpenShift & Release

Singapore | 17 days ago | Full-time | External
Negotiable
Overview

We are looking for an experienced DevOps & Release Management Engineer to manage patching, infrastructure upgrades, OpenShift platforms, and enterprise-scale release processes. The ideal candidate will have strong expertise in Linux, container platforms, automation, CI/CD, and cross-functional collaboration to ensure secure, stable, and efficient application delivery.

Key Responsibilities

Patch & Infrastructure Management
• Implement and manage comprehensive patch management strategies for operating systems, applications, and network devices to ensure security and compliance.
• Plan, execute, and oversee infrastructure upgrades across hardware, software, and network components while minimizing downtime and ensuring compatibility.
• Develop and enhance pre- and post-patching processes to ensure zero-disruption execution and risk-based patch prioritization.
• Support Linux environment scaling, tuning, automation, patching, and compliance auditing.
• Administer and maintain operating systems, network infrastructure, and security patches.

OpenShift (OCP) Administration
• Deploy, manage, and maintain OpenShift Container Platform (OCP) clusters, including installation, configuration, scaling, and troubleshooting.
• Perform OpenShift cluster maintenance such as upgrades, patching, monitoring, and performance optimization.
• Monitor cluster health and ensure high availability, reliability, and compliance with enterprise standards.

Network & Systems Operations
• Monitor and manage network performance, capacity, and security.
• Troubleshoot network, hardware, and software-related issues to ensure business continuity.
• Ensure all network changes and upgrades comply with security policies, best practices, and organizational standards.

Release Management
• Lead the planning, coordination, and execution of software releases across environments and teams.
• Develop and manage release plans, schedules, and budgets aligned with business goals.
• Design and implement automated build, test, and deployment pipelines to accelerate software delivery.
• Manage and maintain version control systems (e.g., Git), ensuring proper branching, merging, and tagging strategies.
• Coordinate release activities with Development, QA, and Operations to ensure smooth and timely deployments.
• Troubleshoot build and deployment failures, identifying root causes and implementing preventive actions.
• Maintain clear and updated documentation of release processes, pipelines, and tools.
• Implement and enforce CI/CD best practices to ensure consistency and reliability.
• Monitor production environments post-release to ensure stability and address immediate issues when required.
• Manage risks impacting release scope, schedule, and quality, escalating issues when necessary.
• Ensure adherence to technical standards and governance requirements, including prototype vehicle build support when applicable.
• Lead complex deployments in distributed, load-balanced, and service-oriented architectures.

Required Skills & Experience
• Strong experience in OpenShift or Kubernetes administration
• Hands-on experience with patch management and infrastructure updates
• Good understanding of Linux systems, networking, and security concepts
• Expertise in CI/CD pipelines, automation, and DevOps tools
• Proficiency in Git and version control management
• Strong troubleshooting and problem-solving skills
• Ability to work in cross-functional teams and manage release cycles end-to-end