LLM Inference Systems Architect Real-Time Zero-to-One

Toronto

8 days ago

Full-time

External

Negotiable

adaption

A forward-thinking AI company in Toronto is seeking an experienced individual to design and build LLM inference systems from the ground up. In this role, you will explore advanced techniques for low-latency serving and collaborate closely with model developers. Ideal candidates will have strong experience optimizing inference frameworks, a deep performance mindset, and solid programming skills in Python. This role provides flexible working arrangements and opportunities for professional development.