Senior AI Research Engineer, Model Inference (100% Remote)

Abu Dhabi | Tax Free | 13 days ago | Full-time | External
Negotiable
Job Description: As a Senior AI Research Engineer, you will play a critical role in pushing the boundaries of desktop and on-device inference and fine-tuning performance for next-generation SLMs/LLMs. You will extend the inference framework to support inference and fine-tuning for language models, with a strong focus on mobile and integrated GPU acceleration (Vulkan).

Responsibilities:
• Implement and optimize custom inference and fine-tuning kernels for small and large language models across multiple hardware backends.
• Implement and optimize full and LoRA fine-tuning for small and large language models across multiple hardware backends.
• Design and extend datatype and precision support (int, float, mixed precision, ternary QTypes, etc.).
• Design, customize, and optimize Vulkan compute shaders for quantized operators and fine-tuning workflows.
• Investigate and resolve GPU acceleration issues on Vulkan and integrated/mobile GPUs.
• Architect and prepare support for advanced quantization techniques to improve efficiency and memory usage.
• Debug and optimize GPU operators (e.g., int8, fp16, fp4, ternary).
• Integrate and validate quantization workflows for training and inference.
• Conduct evaluation and benchmarking (e.g., perplexity testing, fine-tuned adapter performance).
• Conduct GPU testing across desktop and mobile devices.
• Collaborate with research and engineering teams to prototype, benchmark, and scale new model optimization methods.
• Deliver production-grade, efficient language model deployment for mobile and edge use cases.
• Work closely with cross-functional teams to integrate optimized serving and inference frameworks into production pipelines designed for edge and on-device applications.
• Define clear success metrics, such as improved real-world performance, low error rates, robust scalability, and optimal memory usage, and ensure continuous monitoring and iterative refinement for sustained improvement.