Key Responsibilities
• Design, build, install, test, and maintain highly scalable data architectures.
• Develop and maintain databases by acquiring data from primary and secondary sources.
• Use Python and SQL to manipulate data and build data products for analysis.
• Work with stakeholders to assist with data-related technical issues and support their data infrastructure needs.
Requirements
• Proficiency in Python and SQL
• Experience with data warehouses (e.g., Snowflake, Redshift) and lakehouse architectures (e.g., Databricks)
• Experience managing semi-structured data sources (e.g., JSON)
• Experience in creating and managing data architectures and pipelines
• Familiarity with AWS, particularly S3, Lambda, EMR, and EC2
• Knowledge of Apache Spark, Hadoop, and Kafka
• Experience building CDC ingestion and streaming pipelines (e.g., with Kafka)
• Commitment to CI/CD, modular coding, and code documentation
• Excellent teamwork and communication skills
Desired Skills
• Experience with Databricks and Mage
• Experience with additional languages (e.g., SAS, R, Scala, C++)
• A passion for data and information analysis
• Experience working within an Agile environment
• Interest in machine learning or data science
• Strong numerical and analytical skills
Preferred Qualifications
• A degree in Computer Science or another STEM field