We’re looking for a Senior MLOps Engineer to design, build, and operate scalable machine learning infrastructure in AWS. This role partners closely with Data Scientists and platform teams to productionize models, establish robust ML platforms, and ensure reliable, observable, and secure model lifecycle management. The ideal candidate brings strong software engineering fundamentals, deep AWS expertise, and a pragmatic approach to enabling data science teams to move quickly from experimentation to production. Experience building new ML infrastructure, rather than only operating within existing infrastructure, is a strong advantage.
Job Responsibilities / Typical Day in the Role
Enable production deployment of machine learning models
• Partner with Data Scientists to prepare development code for production deployment, including refactoring, packaging, standardization, and performance optimization
• Build and maintain CI/CD pipelines for model training, validation, and deployment
• Support batch and real-time inference workflows using scalable AWS-native services
• Develop and maintain model APIs to be integrated into user products
Build and operate core MLOps platform capabilities
• Design and implement a centralized model registry to track versions, metadata, lineage, and promotion stages
• Build and maintain a feature store to support consistent feature computation for training and inference
• Establish standardized ML pipelines for data ingestion, training, evaluation, deployment, and monitoring
• Define infrastructure-as-code patterns to provision and manage ML environments reliably
Ensure reliability, monitoring, and governance of ML systems
• Implement monitoring for model performance, data drift, and operational health
• Establish alerting and rollback strategies for production model failures
• Partner with security and platform teams to ensure compliance, access controls, and auditability
Collaborate across product, data, and platform teams
• Work closely with Data Scientists to align experimentation workflows with production constraints
• Collaborate with data engineers and architects to ensure feature availability, freshness, and quality
• Support agile product pods by communicating API design and delivery timelines for integration into user products
Must Have Skills / Requirements
1) Experience in MLOps, ML Engineering, or backend software engineering with ML systems
a. 4+ years of experience
2) Strong experience building and operating ML systems in AWS (e.g., SageMaker, ECS, Lambda, Step Functions, S3, IAM)
a. 4+ years of experience
3) Proficiency in Python and experience with ML frameworks and tooling used in production environments
a. 4+ years of experience
Nice to Have Skills / Preferred Requirements
1) None
Soft Skills:
1) Strong understanding of the machine learning lifecycle and how data science workflows translate to production systems
2) Ability to collaborate effectively with Data Scientists and translate experimental work into robust production solutions
3) Strong communication skills and comfort working across technical and non-technical stakeholders
Technology Requirements:
1) Strong experience building and operating ML systems in AWS (e.g., SageMaker, ECS, Lambda, Step Functions, S3, IAM)
2) Proficiency in Python and experience with ML frameworks and tooling used in production environments
3) Experience building APIs and backend services for model inference
4) Hands-on experience with CI/CD, infrastructure as code (e.g., Terraform, CloudFormation), and containerization
Education / Certifications
1) None required.
“Mindlance is an Equal Opportunity Employer and does not discriminate in employment on the basis of Minority/Gender/Disability/Religion/LGBTQI/Age/Veterans.”
If you are interested, I would be happy to set up some time to chat more about your background and career interests to see if there could be a possible match. Please feel free to call me at 732-806-7467 or email me at nirajk@mindlance.com.
Regards,
Niraj Kumar