About the Job
We're seeking experienced AI Infrastructure Engineers to design and implement robust, scalable pipelines for massive data workloads. Join Tether's applied research team, where you'll contribute to high-impact projects that run across thousands of GPUs and drive cutting-edge video generation foundation model development.
Responsibilities:
• Build and scale high-throughput data infrastructure optimized for video and multimodal content processing across large GPU clusters (e.g., H100/H200).
• Design core preprocessing algorithms for video, audio, text, and image modalities, enabling efficient extraction, synchronization, and normalization of temporal data.
• Build automated acquisition pipelines for sourcing large-scale video datasets, handling diverse formats, frame rates, annotations, and embedded audio.
• Architect robust systems for scalable evaluation and annotation, including prompt-based scoring, perceptual metrics, caption generation, and retrieval-based diagnostics.
• Collaborate with model researchers to co-design video model architectures (e.g., DiTs, VAEs, spatio-temporal transformers) and training schedules across pretraining and fine-tuning stages.
• Optimize distributed data loading and pipeline throughput for training at scale, ensuring robustness across model variants and modality combinations.
• Manage infrastructure to support experiment tracking, model versioning, and cross-team deployment workflows, integrating with production and research platforms.
• Support backend engineering across research, product, and creative teams to ensure seamless integration of data and model workflows from prototyping to inference.