Data Engineer for AI Evaluation

Los Angeles, 1 day ago, Full-time, External
Negotiable
Join Sepal AI, where we craft the most challenging tests for AI based on real-world software systems! We are looking for a skilled Data Engineer with over 3 years of experience and a strong systems mindset to assist in building evaluation environments for AI in dynamic log analysis contexts.

What You'll Do:
• Design and implement analytical schemas and pipelines utilizing tools such as BigQuery, ClickHouse, Snowflake, Redshift, and other high-performance columnar databases.
• Work on complex, distributed queries across large log and telemetry datasets.
• Create and manage synthetic datasets that reflect real-world DevOps, observability, or cloud infrastructure logs.
• Tune and optimize distributed query execution plans to prevent timeouts and minimize over-scanning.

Who You Are:
• 3+ years of experience in data engineering or backend systems roles.
• Deep expertise in analytical databases and OLAP engines with a specialization in large-scale query optimization, schema design, and performance tuning.
• Experienced with log ingestion pipelines such as FluentBit, Logstash, or Vector, and skilled in schema design for observability systems.
• Strong SQL skills: able to analyze performance issues and identify inefficient query patterns.
• Bonus: Experience with Python, Docker, or synthetic data generation.

Pay: $50-85/hr based on experience
Remote, flexible hours
Project timeline: 5-6 weeks