Machine Learning Engineer

Canada



Location: Bay Area (frequent customer interaction)

Team: Inference & Reinforcement Learning Platform

About the Role

We’re looking for a Machine Learning Engineer (MLE) to work directly with customers and partners to design, deploy, and validate inference and reinforcement learning (RL) proofs of concept (POCs) on GMI’s GPU infrastructure.

This is a high-impact, hybrid engineering role that sits at the intersection of platform engineering, applied ML, and customer success. You’ll be embedded with customers during early-stage deployments—turning research ideas, datasets, and business requirements into working, performant systems on real GPU clusters.

If you enjoy being close to users, debugging real systems, and shipping results fast (not just writing docs), this role is for you.

What You’ll Do

Own customer POCs end-to-end

Deploy and optimize LLM inference, RL training, and post-training workflows on GMI clusters
Translate customer requirements into concrete system designs and experiments

Forward-deploy with customers

Work hands-on with research teams, startups, and enterprise customers
Debug performance, stability, and correctness issues in real environments

Inference deployment

Stand up and tune inference stacks (e.g. vLLM / SGLang / Ray Serve–style architectures)
Optimize latency, throughput, GPU utilization, and cost efficiency

RL & post-training POCs

Support RLHF / RFT / SFT workflows using customer-provided datasets
Integrate SDKs, training APIs, and cluster resources to shorten “idea → experiment” cycles

Performance & reliability

Diagnose GPU, networking, and distributed system bottlenecks
Run benchmarks, profiling, and stress tests on multi-GPU / multi-node setups

Feedback loop to product

Feed real-world customer learnings back into GMI’s platform, SDKs, and APIs
Help shape reference architectures, cookbooks, and best practices

What We’re Looking For

Core Requirements

Strong software engineering background (Python required; Go / Rust a plus)
Hands-on experience with ML inference or training systems
Familiarity with distributed systems and GPUs (multi-GPU, multi-node)
Comfort working directly with customers and ambiguous requirements
Ability to debug end-to-end systems (code, infra, networking, performance)

Nice to Have

Experience with:
LLM inference frameworks (vLLM, SGLang, Ray Serve, Triton, etc.)
RL or post-training workflows (RLHF, RFT, SFT)
PyTorch, DeepSpeed, Megatron-LM, or similar
Kubernetes-based ML platforms
GPU performance profiling and optimization
Prior experience as:
Forward Deployed Engineer
Solutions Engineer
ML Platform Engineer
Applied Research Engineer

What Makes This Role Special

You’re close to real users and real GPUs—not abstract roadmaps
You’ll work on cutting-edge inference and RL workloads, not toy demos
You’ll influence product direction through direct customer feedback
Fast iteration, high ownership, and visible impact

Who Thrives Here

Engineers who like shipping over theorizing
People who enjoy being the “last mile” problem solver
Builders who want exposure to both deep systems and applied ML
Those excited by early-stage POCs that turn into real production systems


GMI Cloud Company profile preview

Source: LinkedIn
Location: Canada
Compensation: Not listed
Open on Caio: 5 roles


Company stats

Current index details for GMI Cloud, based on roles Caio has indexed from public sources.

5 open roles, 1 source, 1 market. Latest role posted 14d ago.