Job Description
Our client is looking for an LLM Engineer/Researcher. They have a DGX cluster with 8 x H100 GPUs and are actively looking to fine-tune, and eventually develop, their own LLMs.
Responsibilities
● Train and fine-tune foundation LLMs (e.g. using PEFT, LoRA, QLoRA) to meet business needs
● Build and maintain LLM applications and infrastructure to meet business needs
● Design LLM inference infrastructure to deploy LLMs at scale within existing infrastructure constraints
● Research and adopt best-in-class tools in the LLM ecosystem (e.g. vector databases, LlamaIndex)
● Keep up with the latest research on LLMs (e.g. sparse models, hardware-specific LLMs)
● Research and keep up with the latest LLM use cases (e.g. RAG, agents)
● Collaborate closely with LLM research teams on foundation model research, specifically training productivity-focused LLMs
Requirements
● Experience with LLMs, including popular foundation models such as Llama 2 and MPT
● Experience training and fine-tuning foundation LLMs
● Experience with quantization techniques and tooling (e.g. llama.cpp, GPTQ)
● Experience with LLM application development (e.g. LlamaIndex, LangChain, vector databases, prompt engineering)
● [Plus, but not required] Experience running LLMs in production (e.g. Triton Inference Server)
Benefits
● Compensation is an “all-in” package; you cover your own insurance/medical from this amount
● 14 days leave (and unlimited sick days)
● Annual equipment budget (once the 2-month probation has been completed)