Data Scientist (Machine Learning) (IT)

Toronto · Full-time
405.8k - 659.5k / yr
This role is hybrid and requires you to be at our Client's office a minimum of 4 days per week, subject to change at any time.

We're looking for an AI/ML Engineer with hands-on experience building and deploying production-grade models, including Generative AI solutions. You will design, train, evaluate, and operationalize models on modern cloud data platforms, implement robust MLOps/LLMOps practices, and collaborate with data, platform, and product teams to drive end-to-end delivery.

Responsibilities:

- Model Development: Design, train, fine-tune, and evaluate ML and GenAI models (supervised/unsupervised, NLP, CV, and LLM-based use cases).
- Model Deployment: Package and deploy models to production using containers and CI/CD; implement scalable serving with REST/gRPC, batch, and streaming pipelines.
- MLOps/LLMOps: Establish automated training, evaluation, model registry, feature store integration, monitoring (data drift, model drift, latency, cost), and safe rollback.
- Generative AI: Build prompts, retrieval pipelines (RAG), and model adapters/LoRA.
- Data Engineering Collaboration: Partner with data engineering on schema design, data contracts, and lineage.
- Cloud Platforms: Operate on AWS and/or Snowflake for storage, compute, orchestration, and governance; optimize cost/performance.
- Observability & Governance: Instrument models and data pipelines with logging, tracing, metrics, and alerting; adhere to security, privacy, and responsible AI guidelines.

Qualifications:

- AI/ML: Proficiency in Python and common ML stacks; strong with TensorFlow and/or PyTorch for training, fine-tuning, and inference.
- Generative AI: Experience with LLMs or diffusion models; prompt engineering, RAG, evaluation frameworks, and safety/guardrail techniques.
- MLOps: Hands-on with CI/CD for ML (e.g., GitHub Actions/GitLab CI), model packaging (Docker), model registries, feature stores, and monitoring.
- Cloud Data Platforms: Practical experience with AWS and Snowflake, including Snowpark ML, external functions, and UDFs for in-database ML.
- Data Pipelines: Building and operating ETL/ELT; schema management and data quality checks.
- Collaboration: Ability to work with cross-functional teams (Data Eng, Platform, Product, Security) and communicate trade-offs clearly.
- Vector/RAG: Experience with vector databases.
- Serving & Scaling: Kubernetes (EKS) for model serving and autoscaling.
- Testing: Unit/integration tests for data/model pipelines.
- Security & Compliance: Secrets management, IAM, PII handling, and Responsible AI practices.

Tech Stack:

- Languages/Frameworks: Python, TensorFlow, PyTorch, scikit-learn, Transformers
- Cloud/Data: AWS (S3, ECR/ECS/EKS, SageMaker), Snowflake (Snowpark), SQL
- Pipelines & Orchestration: Productionized ML/GenAI services with defined SLAs/SLOs; automated training and deployment pipelines with traceable experiment lineage; reliable data and model monitoring (quality, drift, performance, cost)

The determination of this range includes factors such as skill set level, geographic market, experience and training, and licenses and certifications.

At CGI, we value the strength that diversity brings and are committed to fostering a workplace where everyone belongs. We collaborate with our clients to build more inclusive communities and empower all CGI partners to thrive. Come join our team, one of the largest IT and business consulting services firms in the world.