OpenAI · Ashby · Posted 2d ago

Software Engineer, Build Systems / CI

San Francisco, CA, United States · Full-time

Applied AI · Full-time · Ashby


About the Role

The Engineering Acceleration team builds and operates the foundational systems that engineers use to build, test, and ship ChatGPT, the API, and OpenAI's infrastructure. We are looking for an engineer to help evolve OpenAI's build and continuous integration systems for a fast-growing engineering organization.

This role sits at the intersection of developer productivity, build systems, distributed infrastructure, and software quality. You will work on the systems that determine how quickly and confidently engineers can move: Bazel-based builds, Buildkite pipelines, test selection, remote caching and execution, CI observability, and tooling that helps engineers understand and fix failures quickly.

Our mission is to make OpenAI one of the most productive engineering organizations in the world while preserving a high bar for correctness, reliability, and safety. The best version of this work is invisible when it succeeds: builds are fast, tests are trusted, CI failures are understandable, and engineers can focus on shipping useful systems instead of fighting infrastructure.

In This Role, You Will

- Own and evolve Bazel-based build and test workflows across a large, polyglot monorepo.
- Design and maintain Starlark rules, macros, toolchains, and integrations that make builds reproducible, hermetic, and easy for product teams to adopt.
- Improve CI performance and reliability across Buildkite pipelines, including queue time, build time, cache hit rates, test sharding, retry behavior, and flake isolation.
- Build systems that reduce unnecessary CI work through affected-target detection, dependency graph analysis, test selection, caching, batching, and smarter scheduling.
- Improve local development workflows so engineers can reproduce CI behavior, debug build failures, and iterate quickly without learning every detail of the build stack.
- Operate and optimize build infrastructure across Docker/OCI images, Kubernetes-based runners, cloud resources, and remote cache/execution systems.
- Instrument build and CI systems with metrics, logs, traces, dashboards, and analytics so we can measure speed, reliability, cost, and developer impact.
- Partner directly with product, infrastructure, and research engineering teams to understand pain points, onboard projects, debug hard build issues, and remove systemic bottlenecks.
- Use modern AI tools to rethink CI failure analysis, flaky test debugging, PR triage, automatic remediation, and developer-facing explanations.
- Own the reliability of the systems you build, including participating in an on-call rotation for critical developer infrastructure.

Technologies Commonly Used In This Environment Include

- Bazel and Starlark for build and test workflows
- Buildkite for CI orchestration
- Docker and OCI images for build and runtime packaging
- Kubernetes for CI runners and infrastructure orchestration
- Python, Go, TypeScript, Rust, C++, and other languages in a large monorepo
- Terraform for infrastructure as code
- Remote caching, remote execution, artifact storage, and build telemetry systems
- Postgres, Kafka, and internal services used to power engineering platforms

You May Be A Strong Fit If You

- Have 5+ years of software engineering experience, including significant experience building infrastructure or tooling for developers.
- Have hands-on experience with Bazel, Buck, Pants, Gradle, or similar build systems, and understand the tradeoffs of hermetic builds, dependency graphs, caching, sandboxing, and remote execution.
- Have built or operated CI systems at scale, especially in environments where build time, queue time, test flakiness, and developer trust materially affect engineering velocity.
- Are comfortable writing production software for internal platforms, not just configuring tools. We expect this role to involve code, design, debugging, operations, and long-term ownership.
- Can debug distributed build and CI failures across source control, dependency management, containers, runners, remote caches, test frameworks, and service infrastructure.
- Care deeply about developer experience and have empathy for the small sources of friction that slow teams down or create operational toil.
- Are pragmatic about platform adoption: you know how to build paved paths that teams want to use because they are faster, clearer, and more reliable.
- Communicate clearly across teams and can turn ambiguous productivity problems into concrete technical plans.
- Are excited to apply AI to developer infrastructure in ways that make engineers faster without weakening quality, reliability, or safety.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic. For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, for US-based candidates.
For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
