Credflow AI
LinkedIn · Posted 16d ago
Sr. ML Engineer
We are hiring a Senior ML Systems Engineer to own the execution and optimisation of PrimaLabs' platform on real customer hardware. This role sits at the core of our delivery engine, focusing on tuning inference runtimes, running large-scale benchmarks, and integrating optimisation pipelines. You will work directly on customer deployments across modern accelerators and act as a key technical counterpart in performance-critical engagements.
Responsibilities
- Own and execute optimisation of ML workloads on customer hardware and software stacks (NVIDIA, AMD, CUDA).
- Tune and optimise inference runtimes such as vLLM and SGLang.
- Design and run large-scale benchmarking and performance evaluation pipelines.
- Build and manage configuration sweep infrastructure for performance exploration.
- Integrate and extend optimisation pipelines (DeepHyper or similar frameworks).
- Profile system performance and identify bottlenecks across compute, memory, and I/O.
- Work closely with customers to deliver measurable performance improvements.
- Collaborate with research and infrastructure teams to productionise optimisations.
Requirements
- 5+ years of experience in ML infrastructure, ML systems, or performance engineering.
- Strong experience with model inference systems and runtime optimisation.
- Hands-on experience with profiling tools and performance tuning.
- Deep understanding of GPU/accelerator-based systems and ML workloads.
- Proficiency in Python and system-level debugging.
- Experience working with large-scale benchmarking or performance testing systems.
- Ability to work directly with customers and translate requirements into solutions.
Nice to have
- Experience with large-scale distributed systems or model serving platforms.
- Familiarity with low-level performance optimisation (memory, compute, and I/O bottlenecks).
- Experience working with hardware-software co-design or system-level tuning.
- Background in high-performance computing (HPC).
- Contributions to open-source ML systems or infrastructure projects.
- Experience in customer-facing or solution engineering roles.