Job Description:
• Develop, test, and maintain data processing applications and pipelines using PySpark and related technologies.
• Perform data extraction, transformation, and loading (ETL) from multiple sources into target systems.
• Ensure data quality, consistency, and performance across data workflows.
• Participate in code reviews, documentation, and continuous improvement of data processes.
• Troubleshoot and resolve issues in data processing and integration environments.
• Support the deployment, monitoring, and maintenance of data solutions in production.
Requirements:
• 1-3 years of experience in Python, including data structures, algorithms, and libraries for data manipulation (e.g., Pandas).
• Deep understanding of Apache Spark, its architecture, and components (RDDs, DataFrames, Datasets).
• Strong knowledge of SQL for data querying and manipulation.
• Experience in ETL (Extract, Transform, Load) processes using PySpark.
• Ability to analyze and interpret complex datasets and derive insights.
• Strong analytical skills to troubleshoot issues in data processing pipelines.
• Good command of both spoken and written English and Cantonese. Ability to speak Putonghua is an advantage.
Click 'Apply Now' to apply for this position or call Stella Tang at +852 3180 4977 for a confidential discussion. All information collected will be kept in strict confidence and will be used for recruitment purposes only.