Role: Senior Big Data Engineer
Location: Toronto (hybrid)
Technical Skills
• Databricks / Spark (SQL & PySpark): Delta Lake, Structured Streaming, performance tuning.
• Snowflake: SQL, warehouses, tasks/streams, dynamic tables, role-based access.
• Databases (Oracle, SQL Server): strong SQL development, data extraction, CDC/migration patterns.
• Data Modeling: dimensional modeling, 3NF, SCD types, time-series/event modeling.
• Orchestration: Databricks Workflows, Airflow, ADF.
• Security & Governance: IAM, RBAC, data masking/tokenization, encryption practices.
Responsibilities
Data Engineering & Pipeline Development
• Build, maintain, and optimize ETL/ELT pipelines (batch and streaming) using Databricks, Spark, SQL, and PySpark.
• Implement Delta Lake–based architectures, CDC patterns, and reusable pipeline frameworks (configuration-driven I/O, logging, metrics, error handling).
• Develop and maintain streaming data pipelines using Structured Streaming or other streaming frameworks (see the sketch after this list).
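A minimal sketch of the kind of streaming pipeline described above, assuming PySpark on Databricks with a Delta Lake sink; the source path, schema, checkpoint location, and target table (bronze.orders) are illustrative placeholders:

```python
# Minimal streaming ingestion into a Delta table with checkpointing.
# Paths, schema, and table names below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, current_timestamp

spark = SparkSession.builder.appName("orders_stream").getOrCreate()

# Read newly arriving JSON files as a stream.
raw = (
    spark.readStream
    .format("json")
    .schema("order_id STRING, amount DOUBLE, event_ts TIMESTAMP")
    .load("/landing/orders/")
)

# Light cleanup plus an ingestion audit column.
clean = (
    raw.filter(col("order_id").isNotNull())
       .withColumn("ingested_at", current_timestamp())
)

# Append to a Delta table; the checkpoint gives exactly-once file processing.
(
    clean.writeStream
    .format("delta")
    .option("checkpointLocation", "/chk/orders/")
    .outputMode("append")
    .toTable("bronze.orders")
)
```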
Data Modeling & Quality
• Implement conceptual, logical, and physical data models provided by the Data Architect.
• Apply modeling patterns such as dimensional modeling (star/snowflake), SCDs, and 3NF (an SCD Type 2 sketch follows this list).
• Build data quality checks, profiling routines, schema validation, and monitoring.
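To illustrate the SCD handling above, a sketch of a Type 2 merge into a Delta dimension table; dim_customer, staged_updates, and the tracked column (address) are hypothetical names:

```python
# SCD Type 2 upsert via Delta Lake MERGE: changed rows close out the
# current version and insert a new one; new keys are simply inserted.
spark.sql("""
    MERGE INTO dim_customer AS t
    USING (
        -- Keyed rows match and close existing current versions; the
        -- NULL-keyed duplicates of changed rows never match, so they
        -- fall through to the INSERT clause as the new current versions.
        SELECT s.customer_id AS merge_key, s.* FROM staged_updates s
        UNION ALL
        SELECT NULL AS merge_key, s.*
        FROM staged_updates s
        JOIN dim_customer t
          ON s.customer_id = t.customer_id
        WHERE t.is_current = TRUE AND s.address <> t.address
    ) AS u
    ON t.customer_id = u.merge_key AND t.is_current = TRUE
    WHEN MATCHED AND t.address <> u.address THEN
      UPDATE SET t.is_current = FALSE, t.end_date = u.effective_date
    WHEN NOT MATCHED THEN
      INSERT (customer_id, address, effective_date, end_date, is_current)
      VALUES (u.customer_id, u.address, u.effective_date, NULL, TRUE)
""")
```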
Data Platforms & Integration
• Develop and deploy pipelines across Databricks, Snowflake, and relational databases (Oracle, SQL Server).
• Implement ingestion frameworks for APIs, files, databases, and streaming sources (a config-driven example follows this list).
• Work with orchestration tools such as Databricks Workflows, Airflow, and ADF.
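One shape such an ingestion framework might take, assuming PySpark reading relational sources over JDBC into Delta; the connection string, credentials, and table list are placeholders:

```python
# Config-driven batch ingestion: one loop, many source tables.
# Connection details and table names below are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc_ingest").getOrCreate()

SOURCES = [
    {"dbtable": "SALES.ORDERS",    "target": "bronze.orders"},
    {"dbtable": "SALES.CUSTOMERS", "target": "bronze.customers"},
]

JDBC_URL = "jdbc:sqlserver://db-host:1433;databaseName=sales"  # placeholder

for src in SOURCES:
    df = (
        spark.read.format("jdbc")
        .option("url", JDBC_URL)
        .option("dbtable", src["dbtable"])
        .option("user", "etl_user")   # in practice, read from a secret scope
        .option("password", "***")
        .load()
    )
    # Full snapshot load; a CDC/incremental variant would filter on a watermark.
    df.write.format("delta").mode("overwrite").saveAsTable(src["target"])
```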
Security, IAM & Compliance
• Apply IAM principles including RBAC, fine-grained access control, and secure data handling.
• Implement data masking, tokenization, and encryption based on organizational standards (a masking sketch follows this list).
• Ensure compliance with regulatory/security frameworks (GDPR, DPDP, PCI, KYC/AML awareness).
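A minimal sketch of column-level masking and tokenization in PySpark, as one possible approach; the PII columns (email, card_number) are assumptions, and platform-native controls such as Snowflake masking policies are often used instead:

```python
# Mask/tokenize PII columns before publishing a dataset.
# email and card_number are hypothetical column names.
from pyspark.sql.functions import col, concat, lit, sha2, substring

def mask_pii(df):
    return (
        df
        # Deterministic token: the same email always hashes to the same
        # value, so downstream joins on the token still work. In practice
        # a salted/keyed hash or a tokenization service is preferable.
        .withColumn("email_token", sha2(col("email"), 256))
        # Redact the card number, keeping only the last four digits.
        .withColumn(
            "card_masked",
            concat(lit("****-****-****-"), substring(col("card_number"), -4, 4)),
        )
        .drop("email", "card_number")
    )
```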
Collaboration & DevOps
• Work with Data Architects to align pipeline designs with reference architecture (lakehouse, streaming, CDC).
• Contribute to CI/CD pipelines for automated deployments and dataset/catalog registrations (a deployment sketch follows this list).
• Collaborate with analysts, scientists, and business stakeholders for data delivery and enhancements.
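As one example of the CI/CD contribution above, a sketch of a deployment step that registers a Databricks job from a spec versioned in the repo, assuming the Databricks Jobs API 2.1; the file path and environment variables are hypothetical:

```python
# CI step: create a Databricks job from a JSON spec kept in the repo.
import json
import os

import requests

host = os.environ["DATABRICKS_HOST"]    # workspace URL, injected by CI
token = os.environ["DATABRICKS_TOKEN"]  # service principal token, injected by CI

with open("jobs/orders_pipeline.json") as f:  # hypothetical job spec path
    job_spec = json.load(f)

resp = requests.post(
    f"{host}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {token}"},
    json=job_spec,
    timeout=30,
)
resp.raise_for_status()
print("Created job:", resp.json()["job_id"])
```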
Soft Skills
• Strong problem-solving and debugging skills.
• Ability to work in cross-functional teams with architects, analysts, and business stakeholders.
• Excellent communication and documentation abilities.