SRE/Ops Engineer

New York, 24 days ago, Full-time, External
Negotiable
Title: SRE/Ops Engineer
Location: Englewood, NJ

• Support and enhance observability (monitoring, logging, alerting) across production systems
• Help maintain SLIs/SLOs for key services
• Participate in evaluating services for production readiness
• Collaborate with development teams to identify reliability risks and improve system architecture
• Contribute to automation of operations, including CI/CD pipelines, incident response, and infrastructure provisioning
• Participate in incident response and on-call rotations for critical services
• Contribute to post-incident analysis and drive reliability improvements
• Partner with security, infrastructure, and product teams to support performance, compliance, and operational excellence

Must-Haves
• Willingness to work onsite and participate in a 24/7 on-call rotation as needed
• 5+ years of experience managing and supporting high-traffic digital platforms
• Strong experience with CI/CD pipelines and deployment automation
• Experience with cloud platforms such as AWS and/or GCP
• Solid scripting skills (e.g., Python, Bash, Groovy)
• Hands-on experience with observability and monitoring tools such as Datadog, New Relic, AppDynamics, or similar
• Understanding of web, mobile, and OTT architectures
• Experience supporting large-scale websites, mobile and OTT applications, microservices, APIs, and distributed systems
• Experience with infrastructure-as-code tools such as Ansible, Terraform, or Chef
• Familiarity with performance testing tools such as JMeter or k6
• Hands-on experience with debugging tools such as Charles Proxy or Fiddler

Preferred Qualifications
• Experience working with CDNs (e.g., Akamai) and reverse proxies (e.g., NGINX, Varnish)
• Exposure to video streaming platforms
• Familiarity with application/infrastructure security controls and best practices
• Certifications in SRE, DevOps, or Performance Engineering are a plus