AI/LLM Engineer

Hong Kong · Full-time · External
Salary: Negotiable
We are partnering with a leading global technology company at the forefront of AI innovation to recruit for a pivotal role on their AI team. This is a unique opportunity to advance the practical application of Large Language Models (LLMs) and multimodal AI within critical industrial verticals, working alongside top-tier research talent.

About the Role:
You will be instrumental in the end-to-end lifecycle of cutting-edge large models, from research and training through optimization and real-world deployment. Your work will bridge the gap between state-of-the-art AI research and impactful industrial solutions, directly shaping how next-generation AI integrates into complex business environments.

Key Responsibilities:
• Lead the training, supervised fine-tuning (SFT), and system deployment of vertical, domain-specific large models.
• Research and implement advanced model compression and optimization techniques (pruning, quantization, distillation) to improve inference efficiency.
• Design and develop algorithms for Retrieval-Augmented Generation (RAG) and Agent modules to strengthen AI reasoning in dynamic scenarios.
• Drive the application of multimodal (vision-language) understanding technologies, such as Large Vision Models (LVMs), to industrial use cases.
• Architect and build large-scale, high-quality industry datasets to support model pre-training, fine-tuning, and evaluation.
• Develop and maintain robust validation, evaluation, and performance monitoring frameworks for AI systems.
• Contribute to the development of large model application platforms and microservices to improve modularity and usability.

What We're Looking For (Must-Have):
• Master's or PhD in Computer Science, Mathematics, Electrical Engineering, Data Science, or a related field.
• Solid foundation in machine learning and deep learning, with an in-depth understanding of Transformer architectures and hands-on experience in end-to-end LLM training and development.
• Proficiency in Python and the PyTorch ecosystem (e.g., Hugging Face, DeepSpeed, PEFT).
• Practical experience with large model compression techniques (pruning, quantization) and inference optimization frameworks (e.g., vLLM, Triton).
• Strong capability in large-scale data processing and familiarity with big data tools (e.g., Spark).
• Chat BI (conversational business intelligence) experience is preferred.

Bonus Points (Preferred):
• Hands-on experience developing RAG systems and AI Agent modules.
• Knowledge of CUDA programming, distributed training, or hardware acceleration (GPU/TPU).
• Publications at top-tier AI conferences (NeurIPS, ICLR, CVPR, etc.).
• Experience with Docker, FastAPI, and enterprise-level model deployment pipelines.

What We Offer:
• The chance to work on groundbreaking AI projects with tangible real-world impact.
• Collaboration with leading academic research teams on cutting-edge explorations.
• A culture that values technical excellence, innovation, and professional growth.
• Competitive compensation, comprehensive benefits, and a supportive work environment.