Strativ Group LinkedIn · Posted 2mo ago

Machine Learning Engineer

San Francisco, California, United States

Indexed description

We are looking for a high-potential Junior Machine Learning Engineer to join a fast-moving AI team building the infrastructure that powers large-scale model training, evaluation, and inference.

This role is best suited to someone early in their career who has already demonstrated strong technical ability through a top academic background, excellent internships, or early commercial experience. You should be excited by the systems behind modern AI: distributed training, inference optimisation, model serving, GPU performance, data pipelines, and the tooling that enables researchers and engineers to move quickly.

Responsibilities

You will work closely with senior ML engineers and researchers to build, scale, and improve the infrastructure used to train and serve large models.
The work will sit heavily on the ML systems, inference, and training pipeline side rather than product-focused applied ML.

You may contribute to:

Building and improving training pipelines for large-scale models
Supporting distributed training workflows across GPU clusters
Optimising inference performance, latency, throughput, and cost
Developing model serving infrastructure for production workloads
Improving evaluation, experiment tracking, and model deployment tooling
Debugging training and serving bottlenecks across compute, networking, memory, and data
Working with frameworks and tools such as PyTorch, JAX, CUDA, Triton, vLLM, Ray, Kubernetes, or similar systems
Collaborating with research and engineering teams to turn new model ideas into scalable systems

Qualifications

Degree from a top-tier university in Computer Science, Machine Learning, Mathematics, Engineering, or a related technical field

Required Skills

Strong Python and software engineering skills
Experience with PyTorch, JAX, TensorFlow, or similar ML frameworks
Exposure to large model training pipelines, distributed training, or GPU-based workloads
Interest in inference optimisation, model serving, and production ML systems
Strong understanding of data structures, algorithms, systems, and performance trade-offs
Excellent internships, research experience, or early commercial experience in ML engineering, infrastructure, distributed systems, or AI platforms
Ability to learn quickly, operate independently, and work with ambiguity

Preferred Skills & Experience

Experience with distributed training frameworks such as DeepSpeed, FSDP, Megatron-LM, Ray, or similar
Exposure to inference tooling such as vLLM, TensorRT-LLM, Triton Inference Server, CUDA, or Kubernetes-based serving
Experience working with LLMs, transformers, or large-scale recommendation/search models
Familiarity with GPU performance, memory optimisation, batching, quantisation, caching, or serving latency
Previous internship or work experience at a strong AI lab, ML infrastructure company, big tech ML team, or high-calibre startup
6 months to 2 years of commercial experience at a high-performing startup

The role is based in San Francisco. Salary is up to $250,000 per annum, plus significant equity.

Strativ Group Company profile preview

Source: LinkedIn
Location: San Francisco, California, United States
Compensation: Not listed
Open on Caio: 17 roles

Company stats

Current index details for Strativ Group, based on roles Caio has indexed from public sources.

17 open roles · 1 source · 5 markets · Latest role posted 11d ago