A forward-thinking AI company in Toronto is seeking an experienced individual to design and build LLM inference systems from the ground up. In this role, you will explore advanced techniques for low-latency serving and collaborate closely with model developers. Ideal candidates will have strong experience optimizing inference frameworks, a deep performance mindset, and solid programming skills in Python. This role provides flexible working arrangements and opportunities for professional development.