We are looking for an Advanced Machine Learning Engineer (Remote / Telecommute) for our client in Toronto, ON.
Job Title: Advanced Machine Learning Engineer - Remote / Telecommute
Job Location: Toronto, ON
Job Type: Contract
Job Description:
• Collaborate with colleagues across multiple teams (Data Science and Data Engineering) on unique machine learning system challenges at scale.
• Leverage distributed training systems to build scalable machine learning pipelines for model training and deployment in the IT/OT products space.
• Design and implement solutions to optimize distributed training execution in terms of model hyperparameter optimization, model training/inference latency and system-level bottlenecks.
• Research and implement state-of-the-art LLMs for different business use cases, including fine-tuning and serving the LLMs.
• Ensure ML model performance, uptime, and scale while maintaining high standards of code quality, thoughtful design, and monitoring.
• Optimize integration between popular machine learning libraries and cloud ML and data processing frameworks.
• Build deep learning models and algorithms with optimal parallelism and performance on CPUs/GPUs.
• MS or Ph.D. in Computer Science, Software Engineering, Electrical Engineering, or related fields.
• 3+ years of industry experience with Python in a programming-intensive role.
• 2+ years of experience with one or more of the following machine learning topics: classification, clustering, optimization, recommendation systems, graph mining, deep learning.
• 3+ years of industry experience with distributed computing frameworks such as Spark and the Kubernetes ecosystem.
• 3+ years of industry experience with popular ML frameworks such as Spark MLlib, Keras, TensorFlow, PyTorch, and HuggingFace Transformers, and libraries such as scikit-learn, spaCy, gensim, and CoreNLP.
• 3+ years of industry experience with major cloud computing services.
• Background or experience in building and scaling Generative AI applications, specifically with frameworks and tools like LangChain, pgvector, Pinecone, and Azure ML.
• Prior experience building data products and an established track record of innovation would be a big plus.
Qualifications:
• Proficient in Python/PySpark coding.
• Proficient in containerization services.
• Proficient in deploying models with Azure ML.
• Experience working within a CI/CD framework.
• Motivation to make downstream modelers' work smoother.