Expert Engineer, Backend

Singapore 25 days ago Full-time External
Negotiable
Department: Engineering and Technology
Level: Experienced (Individual Contributor)
Location: Singapore

The Engineering and Technology team is at the core of the Shopee platform development. The team is made up of a group of passionate engineers from all over the world, striving to build the best systems with the most suitable technologies. Our engineers do not merely solve problems at hand; we build foundations for a long-lasting future. We don't limit ourselves on what we can or can't do; we take matters into our own hands even if it means drilling down to the bottom layer of the computing platform. Shopee's hyper-growing business scale has transformed most "innocent" problems into huge technical challenges, and there is no better place to experience it first-hand if you love technologies as much as we do.

About the Team:
The EGO team is dedicated to building an industry-leading machine learning platform to effectively support the implementation of algorithms across various business domains such as recommendation, search, and advertising. It focuses on extreme optimization for CTR/CVR prediction in large-scale sparse parameter scenarios, ensuring maximum performance in e-commerce applications and delivering greater value to the company.
The EGO platform covers the entire deep machine learning workflow, from sample organization and training to model building and publishing, and on to online model loading and inference services. It comes with a user-friendly Web UI and RESTful API, providing an end-to-end, one-stop machine learning platform.

Job Description:
Develop distributed Parameter Server (PS) systems for large-scale sparse model training and inference platforms in the search, advertising, and recommendation domains. The system should support high-throughput parameter read/write and update operations, handle hundreds of billions of features and TB-level sparse models, enable online real-time learning, and meet algorithmic needs such as feature admission and expiration.
Participate in the development of the one-stop machine learning platform, integrating the PS system into the platform to provide a user-friendly, stable, high-performance, and platform-level distributed parameter service system.
Enhance the platform’s efficiency and usability, accelerating the model iteration process for algorithm teams.

Requirements:
Bachelor’s degree or above in Computer Science, Electronics, Automation, Software Engineering, or related fields.
At least 6 years of relevant hands-on experience.
Proficient in C++ programming with strong low-level technical skills; adept at multi-threaded programming, lock optimization, memory pools, thread pools, template programming, GDB debugging, performance tuning, and RPC frameworks.
Familiarity with distributed PS systems, distributed system backend optimization, high-performance in-memory KV systems, KV storage systems based on NVMe SSDs, and high-performance client-server architecture systems is a plus.
Highly passionate about computer technology, proactive in learning, with a strong spirit of in-depth research and hands-on practice.
Maintains high standards and strict requirements for delivered code; works with rigor and attention to detail.
Strong team player with excellent continuous learning ability.