Machine Learning Engineer
We are looking for a high-potential Junior Machine Learning Engineer to join a fast-moving AI team building the infrastructure that powers large-scale model training, evaluation, and inference.
This role is best suited to someone early in their career who has already demonstrated strong technical ability through a top academic background, excellent internships, or early commercial experience. You should be excited by the systems behind modern AI: distributed training, inference optimisation, model serving, GPU performance, data pipelines, and the tooling that enables researchers and engineers to move quickly.
Responsibilities
- You will work closely with senior ML engineers and researchers to build, scale, and improve the infrastructure used to train and serve large models.
- The work will sit heavily on the ML systems, inference, and training pipeline side rather than product-focused applied ML.
You may contribute to:
- Building and improving training pipelines for large-scale models
- Supporting distributed training workflows across GPU clusters
- Optimising inference performance, latency, throughput, and cost
- Developing model serving infrastructure for production workloads
- Improving evaluation, experiment tracking, and model deployment tooling
- Debugging training and serving bottlenecks across compute, networking, memory, and data
- Working with frameworks and tools such as PyTorch, JAX, CUDA, Triton, vLLM, Ray, Kubernetes, or similar systems
- Collaborating with research and engineering teams to turn new model ideas into scalable systems
Qualifications
- Degree from a top-tier university in Computer Science, Machine Learning, Mathematics, Engineering, or a related technical field
Required Skills
- Strong Python and software engineering skills
- Experience with PyTorch, JAX, TensorFlow, or similar ML frameworks
- Exposure to large model training pipelines, distributed training, or GPU-based workloads
- Interest in inference optimisation, model serving, and production ML systems
- Strong understanding of data structures, algorithms, systems, and performance trade-offs
- Excellent internships, research experience, or early commercial experience in ML engineering, infrastructure, distributed systems, or AI platforms
- Ability to learn quickly, operate independently, and work with ambiguity
Preferred Skills & Experience
- Experience with distributed training frameworks such as DeepSpeed, FSDP, Megatron-LM, Ray, or similar
- Exposure to inference tooling such as vLLM, TensorRT-LLM, Triton Inference Server, CUDA, or Kubernetes-based serving
- Experience working with LLMs, transformers, or large-scale recommendation/search models
- Familiarity with GPU performance, memory optimisation, batching, quantisation, caching, or serving latency
- Previous internship or work experience at a strong AI lab, ML infrastructure company, big tech ML team, or high-calibre startup
- 6 months to 2 years of commercial experience at a high-performing startup
The role is SF-based. Compensation includes a salary of up to $250,000 per annum plus significant equity.