Strativ Group · LinkedIn · Posted 1mo ago

Machine Learning Engineer

San Francisco, California, United States


Indexed description

We are looking for a high-potential Junior Machine Learning Engineer to join a fast-moving AI team building the infrastructure that powers large-scale model training, evaluation, and inference.


This role is best suited to someone early in their career who has already demonstrated strong technical ability through a top academic background, excellent internships, or early commercial experience. You should be excited by the systems behind modern AI: distributed training, inference optimisation, model serving, GPU performance, data pipelines, and the tooling that enables researchers and engineers to move quickly.


Responsibilities

  • You will work closely with senior ML engineers and researchers to build, scale, and improve the infrastructure used to train and serve large models.
  • The work leans heavily toward ML systems, inference, and training pipelines rather than product-focused applied ML.

You may contribute to:

  • Building and improving training pipelines for large-scale models
  • Supporting distributed training workflows across GPU clusters
  • Optimising inference performance, latency, throughput, and cost
  • Developing model serving infrastructure for production workloads
  • Improving evaluation, experiment tracking, and model deployment tooling
  • Debugging training and serving bottlenecks across compute, networking, memory, and data
  • Working with frameworks and tools such as PyTorch, JAX, CUDA, Triton, vLLM, Ray, Kubernetes, or similar systems
  • Collaborating with research and engineering teams to turn new model ideas into scalable systems


Qualifications

  • Degree from a top-tier university in Computer Science, Machine Learning, Mathematics, Engineering, or a related technical field


Required Skills

  • Strong Python and software engineering skills
  • Experience with PyTorch, JAX, TensorFlow, or similar ML frameworks
  • Exposure to large model training pipelines, distributed training, or GPU-based workloads
  • Interest in inference optimisation, model serving, and production ML systems
  • Strong understanding of data structures, algorithms, systems, and performance trade-offs
  • Excellent internships, research experience, or early commercial experience in ML engineering, infrastructure, distributed systems, or AI platforms
  • Ability to learn quickly, operate independently, and work with ambiguity



Preferred Skills & Experience

  • Experience with distributed training frameworks such as DeepSpeed, FSDP, Megatron-LM, Ray, or similar
  • Exposure to inference tooling such as vLLM, TensorRT-LLM, Triton Inference Server, CUDA, or Kubernetes-based serving
  • Experience working with LLMs, transformers, or large-scale recommendation/search models
  • Familiarity with GPU performance, memory optimisation, batching, quantisation, caching, or serving latency
  • Previous internship or work experience at a strong AI lab, ML infrastructure company, big tech ML team, or high-calibre startup
  • 6 months to 2 years of commercial experience at a high-performing startup


The role is based in San Francisco. Salary is up to $250,000 per annum and would include significant equity.
