Software Engineer - Kubernetes
Please note: The first step in the interview process requires candidates to join a Microsoft Teams meeting with video turned on.
Why Cadre5?
- Working with highly talented team members
- 3 weeks’ vacation
- Excellent medical insurance, including employer-paid benefits
We are seeking Software Engineers with deep Kubernetes expertise to design and develop custom Kubernetes Operators that extend the orchestration of high-performance workloads and secure data workflows at scale. These roles are central to enabling AmSC’s AI and HPC platforms, ensuring that containerized research applications run seamlessly across heterogeneous compute and data environments.
Key Responsibilities:
- Custom Kubernetes operator development
- Design, implement, maintain, modify, and test custom Kubernetes operators written in Go and/or Ansible
- Enhance existing software development processes, practices, and standards.
- Build test environments to evaluate tooling based on performance, feature set, and maintainability, especially for components that must work reliably with on-premise hardware and OS requirements.
- Support the use and understanding of in-house Kubernetes operators and serve as a maintainer for those controllers.
- Architecture & Infrastructure as Code and Tooling
- Develop and implement an Architecture as Code process for the Slate platform
- Write and maintain infrastructure and deployment code using tools such as ArgoCD (GitOps), Puppet (OS management), Go, Python, Bash, Ansible, Terraform, and GitLab CI.
- Engage with development teams to understand platform needs and tailor the cluster experience to meet evolving requirements.
- Technical Leadership for Software Engineering
- Provide software development guidance, code reviews, and pair programming support to a team of 11 engineers.
- Contribute to onboarding, team documentation, and process improvement initiatives.
- Act as a go-to technical expert for all Kubernetes custom operator questions across the engineering organization.
- Collaboration
- Partner closely with internal cybersecurity and development teams to ensure the platform's custom operators meet security, compliance, and usability expectations.
- Participate in cross-functional projects related to platform enhancements, cluster lifecycle automation and infrastructure provisioning.
- Experience with the following key technologies and tools:
- Languages: Go, Python, Bash
- CI/CD: GitLab CI, ArgoCD
- IaC/Config Management: Puppet, Helm, Ansible
- Kubernetes & Ecosystem: On-prem K8s, Custom Operators, Service Mesh, k8s architecture
- Operating Systems: Linux-based OS management at the hardware level, strong Linux sysadmin skills
- The ability to obtain and maintain a Department of Energy "Q" clearance may be required. This requires US Citizenship.
- Prior Istio operator development or service mesh integration experience.
- Familiarity with WebAssembly plugin development for Istio or Kubernetes.
- Background in HPC platforms, GPU-based AI training environments, or large-scale distributed systems.
- Exposure to DOE computing ecosystems (ALCF, OLCF, NERSC, ESnet, HPDF).
- Experience with containerized scientific workflows and secure data-sharing architectures.
Cadre5 is an equal opportunity employer. All qualified applicants, including individuals with disabilities and protected veterans, are encouraged to apply. Cadre5 is an E-Verify Employer.