At Sepal AI, we're pushing the boundaries of AI testing by developing some of the most challenging assessments grounded in real-world software systems. We're looking for a Data Engineer with 3+ years of experience and a strong systems perspective to join our team. You will play a key role in building evaluation environments for AI in high-throughput log analysis settings.
What You'll Do:
• Design analytical schemas and data pipelines using high-performance engines such as BigQuery, ClickHouse, Snowflake, and Redshift.
• Write and run complex distributed queries over large log and telemetry datasets.
• Develop and manage synthetic datasets that emulate real-world DevOps, observability, or cloud infrastructure logs.
• Optimize distributed query execution plans to reduce timeouts and improve scan efficiency.
Who You Are:
• 3+ years of experience in data engineering or backend systems roles.
• Deep expertise in analytical databases and OLAP engines, with a focus on large-scale query optimization and performance tuning.
• Experience with log ingestion pipelines (e.g., Fluent Bit, Logstash, Vector) and schema design for observability systems.
• Strong SQL skills with the ability to diagnose performance issues and identify inefficient query patterns.
• Bonus: Experience with Python, Docker, or synthetic data generation.
Pay: $50-$85/hr, depending on experience
Remote, flexible hours
Project timeline: 5-6 weeks