Job Description:
• Key Responsibilities:
• Track how the industry's large-model training frameworks and their key features are evolving.
• Plan the roadmap for AI frameworks and software features across scenarios such as large-model pre-training, post-training, and integrated training and inference.
• Lead the team in building key technologies such as low-precision training, parallel strategy tuning, and training resource optimization.
• Apply systems engineering and software-hardware co-design to improve the computing efficiency of AI clusters.
• Identify high-quality academic resources in the field of large-model training.
• About the Ideal Candidate:
• Degree in artificial intelligence, computer science, software, automation, physics, mathematics, electronics, microelectronics, information technology, or a related field, with more than 5 years of R&D experience in large-model training and optimization.
• Proficient in the architectures of common large models such as DeepSeek and Llama.
• Deep technical expertise in large-model training and inference optimization in areas such as LLMs, MoE, and multimodal learning.
• Familiar with the hardware architecture and programming systems of AI accelerators such as GPUs and NPUs.
• Enjoys research, learns quickly, communicates well, and works well in a team.