Key Responsibilities
• Architect and build a robust data platform on AWS using best practices in cloud infrastructure, data engineering, and security.
• Design and implement data pipelines, data lakes, and data warehouses to support analytics, reporting, and machine learning use cases.
• Collaborate with stakeholders across engineering, analytics, and business teams to understand data requirements and translate them into scalable solutions.
• Establish data governance frameworks including data quality, metadata management, lineage, and access controls.
• Optimize performance and cost-efficiency of data storage and processing solutions.
• Evaluate and integrate third-party tools and services to extend the data platform's capabilities.
• Mentor junior engineers and contribute to building a high-performing data engineering team.
Required Qualifications
• Proven experience in designing and implementing data platforms on AWS (e.g., S3, Glue, Redshift, Athena, EMR, Lambda, Kinesis).
• Strong proficiency in data modeling, ETL/ELT processes, and distributed data processing frameworks (e.g., Apache Spark, Apache Flink).
• Hands-on experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation).
• Solid understanding of data security, compliance, and governance in cloud environments.
• Proficiency in Python, SQL, and other relevant programming languages.
• Experience with CI/CD pipelines and DevOps practices in data engineering.
• Excellent problem-solving, communication, and stakeholder management skills.
Preferred Qualifications
• AWS Certified Solutions Architect or AWS Certified Data Analytics – Specialty certification.
• Experience with real-time data streaming and event-driven architectures.
• Familiarity with data cataloging tools (e.g., AWS Glue Data Catalog, Amundsen).
• Exposure to machine learning workflows and their integration with data platforms.