Jobgether · Lever · Posted 27d ago

Senior Site Reliability Engineer - GCP

US · Full-time

IT Security & IT Lever


This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer - GCP in the United States.

This role sits at the core of building and scaling highly reliable, cloud-native systems that power complex, data-driven applications used at enterprise scale. You will help design and evolve autonomous reliability systems that reduce operational friction and ensure performance, security, and availability across production environments. Working within a cross-functional engineering organization, you will influence architecture decisions, improve CI/CD and observability frameworks, and drive automation-first reliability practices. The environment is fast-evolving, with foundational SRE capabilities actively being built and refined, offering significant impact and ownership. You will act as a technical leader and mentor while shaping how reliability is engineered across the full software lifecycle. This is a highly hands-on role for someone who thrives on solving systemic infrastructure challenges in modern cloud ecosystems, particularly within Google Cloud Platform environments.

Accountabilities

In this role, you will be responsible for defining and advancing site reliability engineering practices across cloud infrastructure and application systems. You will design scalable, automated frameworks that ensure high availability, performance, and resilience while reducing operational toil. You will collaborate closely with engineering teams to embed reliability into every stage of the software lifecycle and ensure systems are observable, secure, and recoverable.

    • Design and maintain autonomous systems for deployment, testing, monitoring, and operations of production environments
    • Act as a reliability authority across the SDLC, ensuring best practices are embedded in engineering workflows
    • Enhance CI/CD pipelines, automation tooling, and operational playbooks to improve speed and reliability
    • Build and maintain observability systems including monitoring, logging, dashboards, and alerting frameworks
    • Proactively identify and resolve performance, scalability, availability, and security risks
    • Participate in incident response and on-call rotations, ensuring rapid mitigation of production issues
    • Mentor engineers and contribute to technical leadership across reliability initiatives
    • Document architectures, processes, and operational standards to improve engineering efficiency

Requirements

This role requires deep hands-on expertise in site reliability engineering, infrastructure automation, and cloud-native system design, with a strong focus on Google Cloud Platform. You should be comfortable operating in complex distributed environments and driving reliability through automation, observability, and engineering discipline. Strong communication and leadership skills are essential to influence cross-functional teams and guide technical decisions.

    • 8+ years of experience in software engineering, infrastructure, or operations, including 4+ years in SRE roles
    • Strong expertise in Google Cloud Platform (GCP), including GKE, Compute Engine, IAM, Logging, and Monitoring
    • Proficiency in scripting and automation using Python, Bash, PowerShell, or similar tools
    • Experience building autonomous systems for CI/CD, deployment, testing, and production operations
    • Deep understanding of observability, incident response, capacity planning, and performance optimization
    • Experience reducing operational toil through automation and scalable engineering solutions
    • Ability to make architectural decisions balancing reliability, scalability, and security
    • Strong collaboration skills and ability to mentor engineers in fast-paced environments
    • Bachelor’s degree in Computer Science or equivalent experience; cloud certifications are a plus

Benefits

    • Competitive compensation package ($130,000 – $180,000 base salary range depending on experience and location)
    • Comprehensive medical, dental, and vision insurance (for eligible full-time employees)
    • Flexible remote work environment for engineering roles
    • Paid time off, parental leave, and disability coverage
    • 401(k) retirement plan options
    • Opportunities for continuous learning, certifications, and leadership development
    • Hackathons and innovation initiatives
    • Dynamic, fast-growing environment focused on large-scale technical impact

How Jobgether works

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team. We appreciate your interest and wish you the best!

Data Privacy Notice

By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.