Software Engineer, Data Engineer

New York · Posted 2 days ago · Full-time · External
Compensation: Negotiable
Location: New York

Overview
Design, develop, and maintain scalable data pipelines for logistics and telematics datasets; build and support APIs in Python and Go; and integrate third-party platforms into the Catenas ecosystem to ensure interoperability and data fidelity.

Responsibilities
• Design, develop, and maintain scalable data pipelines for ingesting and processing logistics and telematics datasets.
• Build and support APIs using Python and Go, enabling secure and efficient communication across internal and external systems.
• Integrate third-party platforms, including Telematics Service Providers (TSPs) and Transportation Management Systems (TMS), into the Catenas ecosystem to ensure interoperability and data fidelity.
• Apply advanced entity resolution techniques to unify disparate records, detect duplicates, and establish accurate relationships across large datasets.
• Design data models that reflect real-world supply chain structures, balancing normalization, denormalization, and extensibility for analytics.
• Contribute to the orchestration of data workflows using tools such as Airflow or Prefect to automate ETL (Extract, Transform, Load) and ELT processes.
• Deploy, monitor, and maintain containerized applications using Docker and Kubernetes across cloud environments.
• Use streaming messaging platforms such as Kafka or Redpanda to support real-time data processing pipelines.
• Collaborate with the CTO and other engineering team members to define technical specifications, code review standards, and DevOps procedures.
• Write technical documentation and manage tasks using Git-based workflows and project management tools such as Jira or Plane.
• Perform other software engineering and data integration duties as assigned.

Job Requirements
A bachelor's degree or its foreign equivalent in Computer Science, Mathematics, Software Engineering, or a closely related field, plus 2 years of experience as a Software Engineer, Data Engineer, or in a related occupation. The required prior experience must include:
• 2 years of experience programming in Python and SQL.
• 2 years of experience building APIs using Python frameworks.
• 2 years of experience developing data pipelines using open-source orchestration tools such as Airflow, Prefect, Dagster, or Astronomer.
• 2 years of experience applying data modeling principles to design relational and semi-structured schemas.
• 2 years of experience with version control and issue tracking using Git and tools such as Jira or Plane.
• 2 years of experience with cloud-specific and cloud-agnostic databases and services.
• 2 years of experience deploying and managing containerized applications using Docker and Kubernetes.
• 2 years of experience implementing streaming data pipelines using platforms such as Kafka, Solace, or Redpanda.
• 2 years of experience designing systems using monolithic and service-oriented architectures.
• 2 years of experience conducting entity resolution and integrating third-party supply chain and telematics systems.

To apply, send a resume or CV to jobs and reference SE.