We are looking for an experienced Databricks Data Engineer with strong DevOps expertise to join our data engineering team. The ideal candidate will design, build, and optimize large-scale data pipelines on the Databricks Lakehouse Platform while implementing robust CI/CD and deployment practices. The role calls for deep proficiency in PySpark, SQL, Azure cloud services, and modern DevOps tooling, and you will collaborate with cross-functional teams to deliver scalable, secure, and high-performance data solutions.
Technical Skills
• Strong hands-on experience with Databricks, including:
  • Delta Lake
  • Unity Catalog
  • Lakehouse architecture
  • Delta Live Tables (DLT)
  • Databricks Runtime
  • Table update triggers
• Proficiency in Apache Spark (especially PySpark) and advanced SQL.
• Expertise with Azure cloud services (ADLS Gen2, Azure Data Factory, Key Vault, Azure Functions, etc.).
• Experience with relational databases and data warehousing concepts.
• Strong understanding of DevOps tools:
  • Git/GitLab
  • CI/CD pipelines
  • Databricks Asset Bundles
• Familiarity with infrastructure-as-code (Terraform is a plus).
Key Responsibilities
1. Data Pipeline Development
• Design, build, and maintain scalable ETL/ELT pipelines using Databricks.
• Develop data processing workflows using PySpark/Spark and SQL for large‑volume datasets.
• Integrate data from ADLS, Azure Blob Storage, and relational/non-relational data sources.
• Implement Delta Lake best practices, including schema evolution, ACID transactions, OPTIMIZE, ZORDER, and performance tuning (see the sketch after this list).
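For illustration, a minimal PySpark sketch of the kind of Delta Lake work described above. The storage path, the table name `sales.orders`, and the columns `event_date`/`customer_id` are hypothetical placeholders, not part of this posting; `OPTIMIZE`/`ZORDER` require a Databricks runtime.

```python
from pyspark.sql import SparkSession

# On Databricks `spark` is predefined; created here for self-containment.
spark = SparkSession.builder.getOrCreate()

# Hypothetical ADLS landing path, for illustration only.
raw = spark.read.format("json").load(
    "abfss://landing@myaccount.dfs.core.windows.net/orders/"
)

# Append with schema evolution: new source columns are merged into the
# Delta table schema instead of failing the write.
(raw.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .saveAsTable("sales.orders"))

# Compact small files and co-locate frequently filtered columns.
spark.sql("OPTIMIZE sales.orders ZORDER BY (event_date, customer_id)")
```

Here `mergeSchema` handles additive schema evolution, while OPTIMIZE with ZORDER addresses the small-file and data-skipping aspects of performance tuning.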
2. DevOps & CI/CD
• Implement CI/CD pipelines for Databricks using Git, GitLab, Azure DevOps, or similar tools.
• Build and manage automated deployments using Databricks Asset Bundles.
• Manage version control for notebooks, workflows, libraries, and configuration artifacts.
• Automate cluster configuration, job creation, and environment provisioning (see the sketch after this list).
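In practice, Databricks Asset Bundles declare jobs and environments in a `databricks.yml` and deploy them with `databricks bundle deploy`. The sketch below shows the programmatic equivalent using the Databricks SDK for Python; the job name, notebook path, and cluster ID are placeholders.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

# Auth is resolved from the environment (DATABRICKS_HOST/DATABRICKS_TOKEN)
# or ~/.databrickscfg; nothing is hard-coded here.
w = WorkspaceClient()

# Hypothetical job definition, for illustration only.
created = w.jobs.create(
    name="nightly-orders-etl",
    tasks=[
        jobs.Task(
            task_key="ingest",
            notebook_task=jobs.NotebookTask(notebook_path="/Repos/data/etl/ingest"),
            existing_cluster_id="<cluster-id>",
        )
    ],
)
print(f"Created job {created.job_id}")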
3. Collaboration & Business Support
• Work with data analysts and BI teams to prepare datasets for reporting and dashboarding.
• Collaborate with product owners, business partners, and engineering teams to translate requirements into scalable data solutions.
• Document data flows, architecture, and deployment processes.
4. Performance & Optimization
• Tune Databricks clusters, jobs, and pipelines for cost efficiency and high performance.
• Monitor workflows, debug failures, and ensure pipeline stability and reliability.
• Implement job instrumentation and observability using logging/monitoring tools (see the sketch after this list).
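A minimal, tool-agnostic instrumentation sketch in plain Python; the `instrumented` helper and the stage name are hypothetical, and the structured log lines would surface in cluster driver logs or whatever log-delivery target is configured.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

@contextmanager
def instrumented(stage: str):
    """Hypothetical helper: logs the duration and outcome of a pipeline stage."""
    start = time.monotonic()
    try:
        yield
        log.info("stage=%s status=ok duration_s=%.1f", stage, time.monotonic() - start)
    except Exception:
        log.exception("stage=%s status=failed duration_s=%.1f", stage, time.monotonic() - start)
        raise

# Usage inside a job task:
with instrumented("compact_orders"):
    pass  # e.g. spark.sql("OPTIMIZE sales.orders")
```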
5. Governance & Security
• Implement and manage data governance using Unity Catalog (see the GRANT sketch after this list).
• Enforce access controls, data security, and compliance with enterprise policies.
• Ensure best practices around data quality, lineage, and auditability.
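Unity Catalog permissions are plain SQL GRANT statements on securables, so access controls can be scripted and version-controlled. A sketch, assuming a Databricks notebook or job where `spark` is predefined; the catalog, schema, table, and group names are illustrative placeholders.

```python
# Assumes a Unity Catalog-enabled Databricks workspace where `spark` is predefined.
# Catalog, schema, table, and group names are illustrative placeholders.
spark.sql("GRANT USE CATALOG ON CATALOG main TO `data_engineers`")
spark.sql("GRANT USE SCHEMA ON SCHEMA main.sales TO `data_engineers`")
spark.sql("GRANT SELECT ON TABLE main.sales.orders TO `bi_analysts`")
```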
Preferred Experience
• Knowledge of streaming technologies such as Structured Streaming or the legacy Spark Streaming (DStreams) API (see the sketch after this list).
• Experience building real-time or near real-time pipelines.
• Exposure to advanced Databricks runtime configurations and tuning.
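A minimal Structured Streaming sketch using Databricks Auto Loader (`cloudFiles`), assuming a Databricks notebook where `spark` is predefined; the storage paths, checkpoint location, and target table are illustrative placeholders.

```python
# Incrementally ingest new files from a hypothetical ADLS landing zone.
stream = (
    spark.readStream.format("cloudFiles")  # Databricks Auto Loader
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "abfss://meta@acct.dfs.core.windows.net/schemas/orders")
    .load("abfss://landing@acct.dfs.core.windows.net/orders/")
)

(stream.writeStream
    .option("checkpointLocation", "abfss://meta@acct.dfs.core.windows.net/checkpoints/orders")
    .trigger(availableNow=True)  # incremental batch; omit for continuous processing
    .toTable("sales.orders_bronze"))
```

The `availableNow` trigger gives near-real-time incremental batches; removing it runs the same pipeline continuously.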
Certifications (Optional)
• Databricks Certified Data Engineer Associate / Professional
• Azure Data Engineer Associate