Sr. Backend Python Developer (DAG Structures)

Toronto | 13 days ago | Full-time | External
Negotiable
Our client, a large investment bank, is urgently looking to hire a Sr. Backend Python Developer (DAG Structures). In this role you will lead the design, development, and optimization of Directed Acyclic Graph (DAG) based data orchestration systems and drive innovation in scheduling, latency reduction, and system efficiency. The role calls for proven experience building production-grade custom DAG server solutions.

Responsibilities

DAG Architecture & Solution Development
- Architect and implement large-scale Python-based DAG orchestration systems for data and compute workflows
- Own the end-to-end development lifecycle of a home-grown DAG server, including the core engine, scheduling, and execution logic

System Performance & Latency Optimization
- Analyze and continuously improve system throughput, latency, and resource utilization for mission-critical workloads
- Design for reliability, high concurrency, and minimal downtime

Scalability & Efficiency Enhancements
- Scale DAG server solutions to handle growing data volumes and task dependencies while ensuring efficient parallel execution
- Introduce, benchmark, and implement innovations that optimize scheduling, dependency resolution, and error recovery

Technical Leadership & Best Practices
- Mentor and provide technical guidance to engineering teams on workflow design, Python best practices, and system debugging
- Establish and promote code quality, architecture, and documentation standards

Collaboration & Stakeholder Engagement
- Work with data engineering, analytics, and platform teams to gather requirements and integrate DAG systems into the broader architecture
- Communicate designs, trade-offs, and results to technical and business audiences

Required Skills & Experience
- 8+ years of Python engineering, with a strong focus on backend and system architecture
- Deep expertise in DAG structures, workflow scheduling, and high-performance system design
- Proven experience designing and building custom (home-grown) DAG engines/servers, not limited to off-the-shelf solutions (e.g., Airflow, Luigi)
- Prior work optimizing for system latency, efficiency, and resource management
- Strong problem-solving skills, an analytical mindset, and experience profiling, tuning, and debugging large Python codebases