Machine Learning Engineer

Los Angeles | 2 days ago | Full-time | External
Negotiable
JOB TITLE
Senior Applied Machine Learning Engineer (audio / music generation)

ABOUT THE ROLE
We're building an AI-powered music system focused on commercial-ready audio generation. Our initial priority is getting the music generation quality right: structure, musicality, consistency, and production readiness.

We are looking for a Senior Applied ML Engineer to own the end-to-end audio generation pipeline for our MVP. This role is hands-on and pragmatic: you'll fine-tune open-source music models, integrate inference pipelines, and work closely with audio and backend engineers to deliver usable results quickly and efficiently.

This role starts as a contract engagement (details below), with a path to a full-time position for the right fit.

ROLE DETAIL
Terms: Fixed-term (5 months) | Potential full-time conversion
Compensation: $30,000 (full 5-month term)
Location: Hybrid/On-site (Monrovia, CA)

WHAT YOU'LL WORK ON
- Fine-tuning open-source music generation models.
- Implementing conditioning controls (beats per minute, key, mood, section, density).
- Training and deploying parameter-efficient fine-tunes (LoRA / adapters).
- Building reference-conditioned generation.
- Supporting long-form generation via chunking and continuation.
- Integrating with backend inference pipelines and APIs.
- Collaborating with audio DSP engineers to ensure outputs are production-ready.

REQUIRED QUALIFICATIONS
- Strong experience with Python and PyTorch.
- Hands-on experience with audio or speech generation models.
- Familiarity with diffusion or autoregressive generative models.
- Experience using or fine-tuning open-source ML models, including familiarity with Hugging Face (HF) interfaces.
- Understanding of audio representations.
- Experience deploying ML models to production or API environments.

NICE-TO-HAVE SKILLS
- Familiarity with CLAP / audio embeddings or retrieval-assisted generation.
- Experience working with LoRA / PEFT methods.
- Basic understanding of audio production workflows (tempo, key, stems, loudness).
- Experience optimizing inference cost and latency.

ROLE GOALS & OBJECTIVES
- Reliably generate musically coherent, commercial-friendly cues (30 to 120 seconds).
- The model responds correctly to conditioning inputs such as tempo, key, and mood.
- Outputs are stable, repeatable, and usable downstream by post-production tools.
- The system is modular and ready to be integrated with downstream models.

Seniority level: Mid-Senior level
Employment type: Contract
Job function: Engineering and Information Technology
Industries: IT System Custom Software Development