🔧 What You’ll Do:
• Lead reliability and observability strategy across high-traffic systems
• Architect end-to-end monitoring using New Relic (dashboards, SLOs, alerts, synthetic monitoring)
• Design infrastructure for high availability using Kubernetes, Docker, and Terraform/CloudFormation
• Manage incident response and conduct blameless postmortems
• Collaborate with Dev, QA, and Product teams on performance and chaos testing
• Mentor junior SREs and establish operational best practices
• Optimize system behavior during high-traffic events (e.g., Black Friday)
✅ Must-Have Experience:
• 8+ years in SRE, DevOps, or Platform Engineering
• New Relic expertise (hands-on and at the architectural level)
• Experience with large-scale retail or eCommerce platforms
• Proficiency in Python, Bash, or Go
• Deep knowledge of AWS/GCP/Azure and infrastructure as code tools
• Excellent communication and leadership skills
🌟 Nice to Have:
• Experience with Shopify or headless commerce
• Familiarity with caching, autoscaling, and edge optimization
• Prior work on distributed teams
BayOne is an Equal Opportunity Employer and does not discriminate against any employee or applicant for employment on the basis of race, color, sex, age, religion, sexual orientation, gender identity, veteran status, disability, or any other class protected by federal, state, or local law.
This job posting represents the general duties and requirements necessary to perform this position and is not an exhaustive statement of all responsibilities, duties, and skills required. Management reserves the right to revise or alter this job description.