AI Inference Infrastructure Software Engineer (Kubernetes / Cloud)
Seattle, WA (Hybrid – 3 days/week onsite)
Compensation: Targeting $170K–$210K, plus meaningful startup equity
A fast‑moving AI engineering team is looking for an Inference Infrastructure Software Engineer to build and operate the Kubernetes and cloud backbone behind large‑scale accelerated inference workloads. If you thrive at the intersection of distributed systems, cloud infrastructure, and high‑performance AI, this role puts you right at the core of next‑generation inference platforms. If you want to help push AI inference to its performance limits and build the infrastructure that makes it possible, we’d love to connect.
What You’ll Do
- Build and operate Kubernetes infrastructure powering large‑scale inference services
- Run accelerated workloads with strict latency, throughput, and reliability requirements
- Manage AWS, GCP, and on‑prem environments across networking, storage, IAM, and observability
- Develop automation and tooling in Python, Bash, and Go to streamline deployments and scaling
- Partner with ML, runtime, and hardware teams to productionize new inference capabilities
- Contribute to capacity planning, cost optimization, and reliability engineering
- Participate in on‑call rotation for critical services
What You Bring
- 3–5 years of hands‑on Kubernetes experience (EKS, GKE, or self‑hosted)
- 2–3 years operating production workloads on AWS or GCP
- Experience running ML or accelerated inference services at scale
- Strong skills in Python, Bash, and Go
- Deep understanding of GPU/accelerator scheduling, device plugins, and cluster performance
- Experience with IaC (Terraform/Pulumi), config management (Ansible/Puppet/Salt), and GitOps (Argo CD/Flux)
- Comfortable operating in fast‑moving, early‑stage environments
Bonus Points
- Experience with inference servers (Triton, vLLM, TGI)
- Exposure to non‑GPU accelerators (FPGAs, ASICs)
- Background in SRE, observability, or performance engineering
- Experience building customer‑facing API platforms
Prime Team Partners is an equal opportunity employer. Prime Team Partners does not discriminate on the basis of race, color, religion, national origin, pregnancy status, gender, age, marital status, disability, medical condition, sexual orientation, or any other characteristics protected by applicable state or federal civil rights laws. For contract positions, hired candidates will be employed by Prime Team for the duration of the contract period and be eligible for our company benefits. Benefits include medical, dental, and vision coverage; employees are covered at 75%. We offer a 401(k) after 6 months. We do not provide paid holidays or PTO; sick time is offered in accordance with local laws. This position is open until filled.