Overview
Position Name – Advanced Data Engineer (Databricks, Python, PySpark)
Type of hiring – Full-time
Location – Remote (Canada)
Seniority – Strong senior-level profiles required.
Job Description
Must have:
- Programming Languages: Proficient in Python and PySpark, with a strong understanding of software engineering best practices.
- Cloud Computing: Utilize Azure cloud-based data platforms, specifically Databricks and Delta Live Tables, for data engineering tasks, while effectively using services for storage, compute, and security.
- Data Pipelines: Design, build, and maintain robust, scalable, and automated pipelines for batch and streaming data ingestion and processing with Databricks Workflows (a brief sketch follows this list).
- Data Architecture and Modeling: Design and implement robust data models and architectures that align with business requirements and support efficient data processing, analysis, and reporting.
- Orchestration: Utilize workflow orchestration tools to automate data pipeline execution and dependency management.
- Monitoring and Alerting: Integrate monitoring and alerting mechanisms to track pipeline health, identify performance bottlenecks, and proactively address issues.
- Agile: Apply Agile development methodologies, actively participating in sprint planning, daily stand-ups, sprint reviews, and retrospectives; be flexible and adaptable to changing requirements and priorities throughout the project lifecycle.
- Unity Catalog
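For context on the Data Pipelines requirement above, the following is a minimal, illustrative Delta Live Tables sketch, not a reference implementation. The source path, table names, and schema are hypothetical, and the code assumes the Databricks DLT runtime (which supplies the spark session).

import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw orders ingested incrementally with Auto Loader.")
def orders_raw():
    # Auto Loader picks up new files as they arrive; the path is hypothetical.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/landing/orders/")
    )

@dlt.table(comment="Cleansed orders for downstream reporting.")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")
def orders_clean():
    # DLT drops rows that fail the expectation declared above.
    return dlt.read_stream("orders_raw").withColumn("ingested_at", F.current_timestamp())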
Nice to Have:
- GitHub Actions
- Data Quality: Implement data quality checks and balances throughout the data pipeline, including profiling, validation, and root cause analysis, to ensure data accuracy, completeness, and consistency (see the sketch after this list).
- CI/CD: Implement continuous integration and continuous delivery (CI/CD) practices for automated testing and deployment of data pipelines.
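As a rough illustration of the Data Quality item above, here is a small stand-alone PySpark validation sketch; the Delta table path, column names, and rules are assumptions made for the example, not details from this posting.

from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

def validate_orders(df: DataFrame) -> None:
    """Raise if basic completeness and consistency rules fail."""
    total = df.count()
    # Completeness: every row must carry an order_id.
    null_ids = df.filter(F.col("order_id").isNull()).count()
    if null_ids > 0:
        raise ValueError(f"{null_ids} of {total} rows have a null order_id")
    # Consistency: order amounts must be non-negative.
    bad_amounts = df.filter(F.col("amount") < 0).count()
    if bad_amounts > 0:
        raise ValueError(f"{bad_amounts} of {total} rows have a negative amount")

if __name__ == "__main__":
    spark = SparkSession.builder.appName("dq-checks").getOrCreate()
    # Hypothetical silver-layer Delta table.
    validate_orders(spark.read.format("delta").load("/mnt/silver/orders"))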