Lead Associate Principal, Cloud Engineering
Primary Duties and Responsibilities:
To perform this job successfully, an individual must be able to perform each primary duty satisfactorily.
- Reports to the Director of Platform Automation and Cloud Engineering
- Design, configure, implement and manage Kubernetes clusters and maintain a fully automated workflow for provisioning and managing a complex, highly available container orchestration environment using infrastructure as code
- Develop and maintain Kubernetes operators, controllers, and custom resources to extend cluster functionality and automate application lifecycle management
- Manage DevOps development activities and complex development tasks that will involve working with tools such as Docker, Kafka, container runtimes, and Kubernetes ecosystem tools
- Lead and participate in Kubernetes cluster build-outs, upgrades, software installation, maintenance and support, including but not limited to, patches, security fixes, end-of-life preparation, and version upgrades
- Implement and manage Kubernetes networking solutions, service mesh architectures, runtime security policies, and RBAC configurations to ensure secure and efficient cluster operations
- Ensure the reliability of the Kubernetes platforms and containerized services that your area of responsibility provides, and manage them to both specific and implied SLAs, to help the organization meet internal and external quality standards for the cloud platform
- Assess and plan for capacity needs within Kubernetes clusters and the underlying cloud platform and forecast accordingly
- Implement and manage initiatives within your assigned area of responsibility with accountability for results and compliance with all controls and security requirements
- Lead in the development of technology roadmaps and end-of-life technology plans for Kubernetes versions, container runtimes, and related cloud-native technologies
- Write and maintain documentation of relevant Kubernetes architectures, systems, procedures and processes
- Effectively communicate project and operational service issues to senior management promptly with observations, decisions, and recommendations for corrective measures
- Manage and participate in the implementation of production changes during defined maintenance windows and support the on-call rotation
- Maintain appropriate work/personal balance within your team
- Serve as a point of escalation within the team for Kubernetes and containerization support issues
- Implement and manage rotational support schedules for after-hours and weekend work for your area of responsibility
- Foster an atmosphere of trust, respect, and high performance while displaying strong ethics and integrity
- Manage project and daily task planning and prioritization, meeting project deadlines while maintaining a high quality of work
- Institute corrective actions to address audit and other regulatory or compliance findings
- Operate within budget; establish and assure adherence to schedules, work plans, and performance requirements
- Other duties as assigned
- None
- [Required] Strong consultative, communication, teamwork, and analytical skills, as you will regularly interact with various teams distributed across the US
- [Required] Working knowledge of Kubernetes architecture, container orchestration, and cloud-native infrastructure design and components, such as: etcd, networking, storage, and container runtimes
- [Required] Extensive hands-on experience with Kubernetes cluster creation, maintenance, support, and administration in production environments
- [Required] Deep understanding and practical implementation experience with Kubernetes networking (CNI plugins, service types, ingress controllers), runtime security (Pod Security Standards, OPA/Gatekeeper, network policies), and Role-Based Access Control (RBAC)
- [Required] Experience with architecting, implementing and maintaining highly available mission critical Kubernetes environments for 24/7 availability
- [Required] Experience working in an environment with a defined production change control process
- [Required] Demonstrated history of meeting deadlines and ability to work well under pressure
- [Required] Production-level hands-on experience with AWS cloud services and implementing Kubernetes on AWS (EKS or self-managed clusters)
- [Required] Extensive experience with Infrastructure as Code using Terraform for provisioning and managing cloud infrastructure and Kubernetes resources
- [Required] Strong hands-on development skills with demonstrable coding experience in Go or Python (Go strongly preferred for Kubernetes operator/controller development). Candidates must be able to provide specific examples of production code they have written.
- [Required] Hands-on experience with Kubernetes ecosystem tools including: Helm, kubectl, container runtimes (containerd, CRI-O), and monitoring/observability tools
- [Required] Experience with CI/CD tools such as Jenkins, GitLab CI, or GitHub Actions
- [Required] Experience with version control using GitHub or similar platforms
- [Required] Experience with configuration management tools such as Ansible, Puppet, or Chef
- [Strongly Preferred] Hands-on experience with Kubernetes operator/controller development using operator frameworks (Kubebuilder, Operator SDK, or similar). This can be demonstrated through either contributions to open-source Cloud Native Computing Foundation (CNCF) projects or development of in-house Kubernetes operators/controllers. Note: If you have contributed to open-source CNCF projects, please include your GitHub profile link or links to notable pull requests in your resume.
- [Preferred] Experience with Rancher and RKE2 (Rancher Kubernetes Engine 2) Kubernetes distribution
- [Preferred] Experience with service mesh technologies (Istio, Linkerd) and Envoy proxy configuration and management
- [Preferred] Experience designing and implementing multi-tenancy architectures in Kubernetes environments
- [Preferred] Experience with GitOps-based continuous deployment using FluxCD, ArgoCD, or Rancher Fleet
- [Preferred] Experience with Kafka and event-driven architectures
- [Preferred] CKA and CKS certifications strongly desired
- [Preferred] AWS Solutions Architect Associate Certification or higher
- [Preferred] Relevant industry certifications such as Microsoft Azure or Google Cloud Platform
- [Required] Bachelor's degree, preferably in a technical discipline (Computer Science, Mathematics, Engineering, etc.), or equivalent combination of education and experience
- [Required] 7+ years' experience in IT systems installation, operations, administration, and maintenance of cloud systems / virtualized servers, with demonstrated significant experience in Kubernetes and container orchestration platforms
- [Preferred] Experience working in a financial services or highly regulated environment
Benefits
A highly collaborative and supportive environment developed to encourage work-life balance and employee wellness. Some of these components include:
- A hybrid work environment, up to 2 days per week of remote work
- Tuition Reimbursement to support your continued education
- Student Loan Repayment Assistance
- Technology Stipend allowing you to use the device of your choice to connect to our network while working remotely
- Generous PTO and Parental leave
- 401k Employer Match
- Competitive health benefits including medical, dental and vision
Compensation
- The salary range listed for any given position is exclusive of fringe benefits and potential bonuses. If hired at OCC, your final base salary compensation will be determined by factors such as skills, experience and/or education.
- In addition, we believe in the importance of pay equity and consider internal equity of our current team members as part of any final offer.
- We typically do not hire at the maximum of the range in order to allow for future and continued salary growth. We also offer a substantial benefits package as noted on www.theocc.com/careers.
- All employees may be eligible for a discretionary bonus. Discretionary bonuses are based on various factors, including, but not limited to, company and individual performance and are not guaranteed.
Incentive Range
8% to 15%
This position is eligible for an annual discretionary incentive compensation award, for which the target range is listed above (see Incentive Range). The amount of such award, if any, will be based on various factors, including without limitation, both individual and company performance.
Step 1
When you find a position you're interested in, click the 'Apply' button. Please complete the application and attach your resume.
Step 2
You will receive an email notification to confirm that we've received your application.
Step 3
If you are called in for an interview, a representative from OCC will contact you to set up a date, time, and location.
OCC is an Equal Opportunity Employer