Software Engineer, SRE
Minimum qualifications
- Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
- 3 years of experience with software development in one or more programming languages.
- Experience in one or more of the following: C, C++, Java, Python or Go.
Preferred qualifications
- Master's degree in Computer Science or Engineering, or a related field.
- Experience in analyzing and troubleshooting large-scale distributed systems, cloud computing, and large databases.
- Knowledge of database internals and Google infrastructure.
Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you’ll have the opportunity to manage the complex challenges of scale that are unique to Google, while using your expertise in coding, algorithms, complexity analysis and large-scale system design.
SRE's culture of intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.
To learn more: check out our books on Site Reliability Engineering or read a career profile about why a Software Engineer chose to join SRE.
Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running. From developing and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio possible. We're proud to be our engineers' engineers and love voiding warranties by taking things apart so we can rebuild them. We keep our networks up and running, ensuring our users have the best and fastest experience possible.
Responsibilities
- Lead the design, implementation, and testing of reliability-focused improvements to systems and processes. Identify and carry out improvements to automation, monitoring/alerting, and infrastructure.
- Create, influence and review ongoing design, architecture, standards and methods for services and systems.
- Write postmortems and lead incident analysis, with a focus on broad patterns and potential fixes.
- Triage, mitigate, and resolve common incidents, and coordinate incident response for complex ones.
- Work effectively with other Site Reliability Engineers (SREs), developers, and cross-functional teams. Assist in training new team members on operational procedures and best practices.