We are looking for an experienced and motivated hands-on leader for a high-visibility position within our client’s core infrastructure team. This role will lead a variety of critical tasks across DevOps, MLOps, and security/IT, with cross-functional duties across multiple teams. This is a unique opportunity to shape the direction of the infrastructure that powers our models, enabling groundbreaking discoveries and accelerating real-world applications with our partners.
What you need / Qualifications:
• Bachelor’s degree in Computer Science, Information Technology, or a related field.
• 3+ years of experience in Kubernetes administration and development.
• Hands-on experience in managing machine learning model lifecycles, model serving, and distributed training.
• Proven experience in automation and strong systems engineering skills.
• Solid understanding of cybersecurity principles and best practices.
• Strong sense of ownership and responsibility, with the ability to handle urgent operational issues to ensure smooth team development and deployment.
• MLOps Expertise: Proficiency in MLOps frameworks and tools, such as ModelDB, Kubeflow, Pachyderm, and Data Version Control (DVC).
• Domain Knowledge: Basic understanding of biology or chemistry; experience in the pharmaceutical or biotech industry is a plus.
• Software Engineering Background: Strong foundation in software engineering, followed by a transition into technical operations roles.
• Security Awareness: Ability to identify security risks and implement effective mitigation measures.
• Start-up Experience: Demonstrated experience in start-up environments, showcasing adaptability and problem-solving skills.
• Production Support: Experience supporting both production systems and machine learning pipelines.
• Education: Master’s/Ph.D. in Computer Science, a related technical field, or equivalent practical experience.
• Leadership Skills: Proven ability to lead and develop engineering teams of 3-5 or more members.
What you'll do:
• Manage and maintain infrastructure, including Kubernetes GPU resource scheduling, scaling, monitoring, logging, and performance optimization.
• Develop and maintain automated machine learning pipelines using tools such as Kubeflow and Ray.
• Manage and support collaboration and productivity platforms, including Google Workspace, GitLab, and related CI/CD integrations.
• Implement and enforce cybersecurity policies and best practices to safeguard sensitive data and systems.
• Provide technical support to team members by troubleshooting hardware, software, and network issues.
ONE Technology Services
We specialize in staffing and software development services, including custom app development, custom web development, staff augmentation, managed and cloud solutions, technical recruitment, and IT consultancy.
ONE Technology Services is also a product-based company. Our products, a Workshop solution and an ERP solution for service-based businesses, are well tested and built to solve business problems precisely. These products are highly customizable to address the specific needs of our clients. Our 20+ years of industry experience with top US and Canadian companies enable us to know exactly how to solve our clients’ business challenges.