Senior MLOps Engineer

Toronto · 23 months ago · Full-time · External
695.0k - 801.9k
Our client, a leading private equity firm in Canada, is looking for a Senior MLOps/DevOps Engineer.

Role-Specific Accountabilities:

Cloud Solutions:
• Design, deploy, and maintain scalable solutions across major cloud platforms, with a preference for Google Cloud (GCP), AWS, or Azure.
• Ensure availability, performance, and readiness of core infrastructure, applications, and services.

Microservices and ML Pipelines:
• Implement and maintain end-to-end microservices and machine learning pipelines.
• Manage the entire ML lifecycle, from data ingestion to monitoring in production.

CI/CD Implementation:
• Apply CI/CD principles to streamline code deployments and software updates.
• Automate the deployment of ML algorithms into production using Infrastructure as Code tools.

Cross-Functional Collaboration:
• Collaborate with software developers, DL/ML engineers, and other teams for efficient delivery.
• Enhance the overall software delivery pipeline and ensure infrastructure health.

Monitoring and Security:
• Implement and manage monitoring tools for system health and performance.
• Enforce security best practices and vulnerability management standards.
• Ensure compliance with regulations and company policies for ML models and data pipelines.

Continuous Improvement:
• Stay up to date with emerging trends and tools in DevOps and MLOps.
• Proactively seek and recommend opportunities for improvement.

Education, Experience & Capabilities:
• Bachelor's or higher degree in Computer Science, Engineering, or a related field.
• 7 years of experience in DevOps roles, with at least 1 year specifically in MLOps or handling ML in production.
• Mastery of scripting languages such as Python, Shell, or equivalent.
• Deep expertise in major cloud platforms, especially Google Cloud Platform (GCP).
• Hands-on experience with containerization technologies (Docker) and orchestration tools (Kubernetes).
• Proven experience with Infrastructure as Code tools (Cloud Deployment Manager, CloudFormation, or Terraform).
• Experience with version control systems (Git) and collaboration platforms (GitHub or GitLab).
• Familiarity with MLOps tools like TensorFlow, TFX, MLflow, or Kubeflow (GCP).
• Experience in deploying ML models into production and understanding model architectures.

Documentation and Security:
• Creating comprehensive documentation for ML and infrastructure workflows.
• Understanding of network architectures, VPC designs, and security best practices in cloud environments.
• Relevant certifications such as Google Cloud Professional DevOps Engineer are a plus.