Staff Software Engineer, Compute Architecture
About The Role
As a Staff Software Engineer within our Compute Architecture organization, you will help build the software systems that operate the backbone of our large-scale GPU data centers. The METALDEV team builds Go-based distributed services that bring new infrastructure online, manage hardware lifecycle workflows, monitor production health, and automate safe operations across fleets of GPU servers and rack-scale systems. This is a software-first role at the intersection of distributed systems, production reliability, and hardware-aware automation, where your work directly improves the reliability, safety, and scalability of real-world infrastructure.
What You’ll Do
- Design, build, and operate Go-based services that manage the lifecycle of large-scale GPU data center infrastructure.
- Build automation for data center bring-up, hardware discovery, health monitoring, remediation, and production operations.
- Develop reliable APIs, services, and workflows for managing BMCs, firmware state, server health, and rack-level infrastructure.
- Improve observability, alerting, and operational tooling so production issues can be detected, understood, and resolved quickly.
- Translate incidents and hardware failure modes into software improvements that make the platform more resilient.
- Partner with hardware-adjacent, infrastructure, operations, and software teams to design systems that work safely at fleet scale.
- Provide technical leadership through design reviews, code reviews, architectural guidance, and mentorship.
- Make pragmatic architecture decisions that balance reliability, simplicity, scalability, and operational burden.
Qualifications
- B.S., M.S., or PhD in Computer Science or related field, or equivalent experience.
- 8+ years of software engineering experience with a strong focus on infrastructure, cloud engineering, and distributed databases, particularly within large-scale data center and cloud environments.
- Expertise in Go and proven experience building REST/gRPC APIs for mission-critical platforms.
- Strong background in architecting and scaling cloud-native Kubernetes infrastructure and distributed services.
- Proven success in mentoring engineers, leading technical projects, and influencing engineering strategy across teams.
- Experience contributing to and collaborating with open source communities.
- Skilled in applying a data-driven approach to reliability, optimization, and continuous improvement.
- Excellent communicator able to work effectively with both technical and non-technical stakeholders.
- Hands-on experience with observability stacks (Prometheus, Grafana, PromQL), CI/CD pipelines, and operating large fleets of GPU servers.
- Track record of leading incident response, postmortems, and driving robust service reliability.
- Working knowledge of Kafka, ClickHouse, and CockroachDB (CRDB).
- Familiarity with DMTF standards, Redfish APIs, and GPU servers.
- Be Curious at Your Core
- Act Like an Owner
- Empower Employees
- Deliver Best-in-Class Client Experiences
- Achieve More Together
The base salary range for this role is $188,000 to $275,000. The starting salary will be determined based on job-related knowledge, skills, experience, and market location. We strive for both market alignment and internal equity when determining compensation. In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).
What We Offer
The range we’ve posted represents the typical compensation range for this role. To determine actual compensation, we review the market rate for each candidate, which can reflect a variety of factors, including qualifications, experience, interview performance, and location.
In addition to a competitive salary, we offer a variety of benefits to support your needs, including:
- Medical, dental, and vision insurance - 100% paid for by CoreWeave
- Company-paid Life Insurance
- Voluntary supplemental life insurance
- Short and long-term disability insurance
- Flexible Spending Account
- Health Savings Account
- Tuition Reimbursement
- Ability to Participate in Employee Stock Purchase Program (ESPP)
- Mental Wellness Benefits through Spring Health
- Family-Forming support provided by Carrot
- Paid Parental Leave
- Flexible, full-service childcare support with Kinside
- 401(k) with a generous employer match
- Flexible PTO
- Catered lunch each day in our office and data center locations
- A casual work environment
- A work culture focused on innovative disruption
California Consumer Privacy Act - California applicants only
CoreWeave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information.
As part of this commitment and consistent with the Americans with Disabilities Act (ADA), CoreWeave will ensure that qualified applicants and candidates with disabilities are provided reasonable accommodations for the hiring process, unless such accommodation would cause an undue hardship. If reasonable accommodation is needed, please contact: [email protected].
Export Control Compliance
This position requires access to export controlled information. To conform to U.S. Government export regulations applicable to that information, applicant must either be (A) a U.S. person, defined as a (i) U.S. citizen or national, (ii) U.S. lawful permanent resident (green card holder), (iii) refugee under 8 U.S.C. § 1157, or (iv) asylee under 8 U.S.C. § 1158, (B) eligible to access the export controlled information without a required export authorization, or (C) eligible and reasonably likely to obtain the required export authorization from the applicable U.S. government agency. CoreWeave may, for legitimate business reasons, decline to pursue any export licensing process.