Mid SRE – Cloud Product Reliability
Role Overview
The mission of a Product Reliability Engineer is to ensure that cloud products and services are reliable, observable, scalable, secure, and resilient, so that customers can depend on EMS consistently while engineering teams continue to innovate. You will protect the customer experience through high availability, fast response times, and consistent performance with minimal service disruption. Success is measured by uptime, latency, reliability, and customer satisfaction.
What You Will Do
Build reliability into the product. You will identify reliability risks early and influence architectural decisions, establish reliability requirements, and validate system behavior under failure conditions. You will also build operational excellence, increase system resilience, and drive continuous improvement.
Why It Might Be a Fit
We are looking for a candidate with experience in observability, knowledge of distributed system architecture, and a background supporting high-availability environments. The ideal candidate has designed and implemented reliability engineering practices for cloud-native applications and has worked with cloud platforms (AWS and GCP).
Requirements
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience)
- 2–5 years of experience in IT operations, Site Reliability Engineering (SRE), DevOps, Platform Engineering, Production Engineering, or Cloud Operations
- Experience designing and implementing reliability engineering practices for cloud-native applications
- Experience with cloud platforms (AWS and GCP)
- Familiarity with operating systems (Linux and/or Windows)
- Basic scripting skills (PowerShell, Python, Bash, etc.)
- Experience working with incident management processes and tools
- Understanding of distributed systems and cloud-based architectures
- Knowledge of system performance tuning and troubleshooting techniques
- Experience with reliability tools
- Strong troubleshooting, data analysis, and technical communication skills
- Experience with Infrastructure as Code tools such as Terraform
- Knowledge of monitoring and observability platforms
- Advanced English
Benefits
- Equal opportunity employer
- Affirmative action employer
- Disability accommodation
- Diverse and inclusive work environment
- Affinity groups for underrepresented groups
- Professional development opportunities
Originally posted on Himalayas