Sepal AI builds the world's hardest tests for AI, grounded in real-world software systems.
We're hiring a Data Engineer with 3+ years of experience and a strong systems mindset to help us build evaluation environments for AI in high-throughput log analysis contexts.
What You'll Do
• Design and implement analytical schemas and pipelines on BigQuery, ClickHouse, Snowflake, Redshift, and other high-performance columnar databases.
• Write and debug complex distributed queries over massive log and telemetry datasets.
• Create and manage synthetic datasets that simulate real-world DevOps, observability, or cloud infrastructure logs (see the sketch after this list).
• Tune and optimize distributed query execution plans to avoid timeouts and reduce over-scanning.
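
To give a flavor of the synthetic-data work, here is a minimal sketch of a generator that emits JSON-lines records shaped like web-service logs. It is illustrative only, not our actual tooling; every field name and value pool below is an assumption made for the example.

    # Minimal synthetic log generator (illustrative; all fields are assumptions).
    import json
    import random
    from datetime import datetime, timedelta, timezone

    SERVICES = ["auth", "billing", "ingest", "search"]
    LEVELS = ["INFO", "INFO", "INFO", "WARN", "ERROR"]   # skewed toward INFO
    STATUSES = [200, 200, 200, 201, 404, 500]

    def synthetic_logs(n, start):
        """Yield n log records with monotonically increasing timestamps."""
        ts = start
        for _ in range(n):
            ts += timedelta(milliseconds=random.randint(1, 500))
            yield {
                "ts": ts.isoformat(),
                "service": random.choice(SERVICES),
                "level": random.choice(LEVELS),
                "status": random.choice(STATUSES),
                "latency_ms": round(random.lognormvariate(3.0, 0.8), 1),
            }

    if __name__ == "__main__":
        for record in synthetic_logs(5, datetime(2024, 1, 1, tzinfo=timezone.utc)):
            print(json.dumps(record))

In practice the interesting part is matching realistic distributions (error bursts, heavy-tailed latencies, label cardinality), which is exactly the judgment this role exercises.
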
Who You Are
• 3+ years of experience in data engineering or backend systems roles.
• Deep expertise in analytical databases and OLAP engines, with a focus on large-scale query optimization, schema design, and performance tuning.
• Hands-on with log ingestion pipelines (e.g., Fluent Bit, Logstash, Vector) and schema design for observability systems.
• Strong SQL skills: you know how to reason through performance problems and spot inefficient query patterns (see the sketch after this list).
• Bonus: Experience with Python, Docker, or synthetic data generation.
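
As an example of the query-pattern reasoning mentioned above, here is a small sketch contrasting an over-scanning query with a partition-pruned rewrite. The table logs.events and its partitioning on ts are hypothetical; the pattern itself (narrow projection plus a range predicate directly on the partition column) is what carries over to BigQuery, ClickHouse, Snowflake, and similar engines.

    # Two BigQuery-style queries over a hypothetical table logs.events,
    # assumed to be date-partitioned on ts. Shown as Python strings so the
    # contrast is easy to lint or test against.

    # Anti-pattern: SELECT * reads every column, and wrapping the partition
    # column in a function defeats partition pruning, so the whole table is
    # scanned.
    OVER_SCANNING = """
    SELECT *
    FROM logs.events
    WHERE FORMAT_TIMESTAMP('%Y-%m-%d', ts) = '2024-01-01'
    """

    # Better: project only the needed columns and filter with a plain range
    # predicate on the partition column, so the engine prunes partitions and
    # reads a fraction of the bytes.
    PRUNED = """
    SELECT service, status, latency_ms
    FROM logs.events
    WHERE ts >= TIMESTAMP '2024-01-01'
      AND ts <  TIMESTAMP '2024-01-02'
    """

    if __name__ == "__main__":
        print(OVER_SCANNING, PRUNED, sep="\n")
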
Pay
$50-$85/hr, depending on experience
Remote, flexible hours
Project timeline: 5-6 weeks