[12-month contract, renewable]
What you will be working on:
● Translate data requirements from business users and data scientists into technical specifications.
● Collaborate with the partner agency’s IT teams on the following tasks:
○ Architect and build ingestion pipelines to collect, clean, merge, and harmonize data from different source systems.
○ Monitor databases and ETL systems day to day, e.g., database capacity planning, maintenance, and performance tuning; diagnose issues and deploy measures to prevent recurrence; and ensure maximum database uptime.
○ Construct, test, and update useful and reusable data models based on the data needs of end users.
○ Design and build secure mechanisms for end users and systems to access data in the data warehouse.
○ Research, propose, and develop new technologies and processes to improve the agency’s data infrastructure.
○ Collaborate with data stewards to establish and enforce data governance policies, best practices, and procedures.
○ Maintain the data catalogue to document data assets, metadata, and lineage.
○ Implement data quality checks and validation processes to ensure data accuracy and consistency.
○ Implement and enforce data security best practices, including access control, encryption, and data masking, to safeguard sensitive data.
What we are looking for:
● A Bachelor’s Degree, preferably in Computer Science, Software Engineering, Information Technology, or related disciplines.
● At least 4 years of relevant experience.
● Deep understanding of system design, data structures and algorithms, data modelling, data access, and data storage.
● Experience using AI for code development in a production environment.
● Proficiency in writing SQL for databases such as Postgres and MSSQL.
● Demonstrated ability to use cloud technologies such as AWS, Azure, and Google Cloud.
● Experience with orchestration frameworks such as Airflow and Azure Data Factory.
● Experience with distributed data technologies such as Spark and Hadoop.
● Proficiency in programming languages such as Python, Java, or Scala.
● Familiarity with building and using CI/CD pipelines.
● Familiarity with DevOps tools such as Docker, Git, and Terraform.
Preferred requirements:
● Experience in architecting data and IT systems.
● Experience in designing, building, and maintaining batch and real-time data pipelines.
● Experience with Databricks.
● Experience with implementing technical processes to enforce data security, data quality, and data governance.
● Familiarity with government systems and government policies relating to data governance, data management, data infrastructure, and data security.