Site Reliability Engineer; SRE - Platform Infrastructure team; Remote

Chicago 8 days ago Remote Full-time External
695.1k - 868.9k / yr
Position: Site Reliability Engineer (SRE) - Platform Infrastructure team (100% Remote - USA)

About the job
Hopper is looking for a Senior Site Reliability Engineer to join our Platform Infrastructure team, the group that builds and operates the cloud foundation powering products used by millions of travelers worldwide. Our mission is to empower engineers across Hopper to ship fast, stay resilient, and scale effortlessly. If you care about automation, scalability, and developer experience, and want to make a tangible impact on a growing travel tech company, this could be the perfect role for you.

You'll help evolve a large-scale, multi-region infrastructure running in Google Cloud, supporting hundreds of engineers and dozens of product teams. You'll contribute to building automated, self-service platform tools, ensuring the foundation is secure, reliable, cost-efficient, and easy to use.

• Thrive on automating repetitive work and turning best practices into platform-level solutions.
• Take pride in enabling product teams by providing intuitive tools and interfaces for infrastructure and deployment.
• Have a strong bias for practical, reliable solutions over complexity and over-engineering.
• Care deeply about operational excellence: scalable systems, high availability, performance, and cost optimization.
• See developer experience as a product, and continuously look for ways to improve it.

What would your day-to-day look like
• Improve and evolve platform tooling to support a growing number of services and teams across Hopper.
• Design infrastructure workflows that are simple, consistent, and scalable, enabling engineers to build and deploy with confidence.
• Drive automation across key infrastructure components, reducing manual work and increasing reliability.
• Adapt and scale infrastructure offerings to meet the needs of product teams while maintaining a cohesive and maintainable platform.
• Participate in incident response for platform-level issues as part of a globally distributed, sustainable on-call rotation (with team coverage across the Americas and Europe).
• Support engineering teams by troubleshooting platform issues, answering infrastructure-related questions, and reviewing pull requests that affect core systems.
• Collaborate with a small, high-impact team of SREs, focused on operational excellence, performance, and developer experience.

An ideal candidate has
• Professional experience in SRE, DevOps, Software Engineering, or Systems Engineering, with a passion for building reliable, scalable infrastructure.
• Strong troubleshooting and incident response skills across distributed systems and cloud-native environments.
• Solid system design and analytical thinking, with a focus on simplicity, performance, and maintainability.
• Clear and effective communication skills, with the ability to collaborate across engineering teams.

Cloud & Infrastructure Expertise
• Hands-on experience with major cloud platforms, ideally Google Cloud Platform (GCP).
• Deep familiarity with Infrastructure as Code, preferably using Terraform.
• Experience building and operating with containers and Kubernetes, and tools like Helm or Kustomize.
• Working knowledge of service mesh technologies, preferably Istio.

Networking & Security
• Solid understanding of networking fundamentals: DNS, TLS, certificates, ingress controllers, etc.
• Knowledge of cloud and infrastructure security best practices, including IAM, RBAC, and network segmentation.
• Familiarity with authentication and authorization protocols and technologies.

Observability & Tooling
• Experience with observability stacks: logs, metrics, tracing, and APM (preferably using Datadog).
• Practical knowledge of CI/CD pipelines and deployment automation.
• Exposure to database technologies, both SQL and NoSQL.
Scripting & Automation
• Comfortable writing scripts in Bash, Python, or similar scripting languages to automate routine tasks and build tooling.

Perks and Benefits
• Well-funded and proven startup with large ambitions, competitive salary and the upsides of pre-IPO equity packages.
• Unlimited PTO.
• Carrot Cash travel stipend.
• Access to co-working space on demand through Flex Desk and a work-from-home stipend.
• Please ask us about our very generous parental leave, well above industry standards!
• Entrepreneurial culture where…