Title: Senior Computer Vision Engineer – VLMs, Multimodal ML, Edge AI (Jetson) | Relocation Support

Tech Stack: Python, PyTorch, TensorRT, DeepStream, OpenCV, Jetson, CLIP, LLaVA, VLMs, Transformers, MLOps

What You’ll Do:
You’ll join an elite R&D team at a rapidly scaling industrial tech company building the future of smart job sites. Your mission: design, train, and deploy production-grade computer vision and vision-language systems that operate in challenging, real-world conditions. You’ll work on cutting-edge models like CLIP, SAM, and LLaVA, and deploy to the edge using NVIDIA Jetson, DeepStream, and TensorRT. Expect to build detection, OCR, segmentation, and retrieval pipelines that don’t just work in the lab; they solve safety, tracking, and optimization problems for live deployments.

Company Details:
They're building an integrated platform that brings together wearable IoT, edge AI, and computer vision to improve safety, productivity, and real-time decision-making on massive work sites. With operations across the GCC and global recognition for innovation, they’re trusted by some of the world’s largest contractors.

Benefits:
• Salary up to 40,000 SAR/month (tax-free)
• Relocation support to Riyadh or flexible remote setup (must currently be in KSA or ready to relocate there within 1 month)
• Opportunity to own end-to-end systems used at massive scale in the real world
• Collaborate with hardware, embedded, and data science teams in a fast-growing company
• Fast track to technical leadership if you're the right fit
• Work on CV + GenAI projects with meaningful industry impact

Requirements:
• 7–10+ years of experience in computer vision and deep learning
• Proven hands-on work with object detection, segmentation, and classification using models like YOLOv8, UNet, Mask R-CNN
• Hands-on experience with Vision-Language Models (VLMs) like CLIP, BLIP, or LLaVA (fine-tuning or prompt-based retrieval)
• Strong grasp of Multimodal ML: integrating image + text or sensor fusion
• Production deployment experience on NVIDIA Jetson (TX2, Xavier) using DeepStream and TensorRT
• Comfortable with Python, PyTorch, and media pipelines (e.g. GStreamer)
• Understanding of MLOps: retraining, model drift, CI/CD, monitoring
• Track record of solving real-world problems in logistics, infrastructure, safety, or similar domains
• Ideally, publications, patents, OSS contributions, or technical talks

Senior Computer Vision Engineer

UMATR