Experian Himalayas · Posted today

Mid SRE – Cloud Product Reliability

Brazil · Full time



Role Overview

The mission of a Product Reliability Engineer is to ensure cloud products and services are reliable, observable, scalable, secure, and resilient, so customers can depend on EMS consistently while engineering teams can innovate. Protect the customer through high availability, fast response times, and consistent performance with minimal service disruptions. Success is measured by uptime, latency, reliability, and customer satisfaction.

What You Will Do

Build reliability into the product: identify reliability risks early and influence architectural decisions. Establish reliability requirements and validate system behavior under failure conditions. Create operational excellence, increase system resilience, and drive continuous improvement.

Why It Might Be a Fit

We are looking for a candidate with experience in observability, knowledge of distributed system architecture, and experience supporting high-availability environments. The ideal candidate will also have experience designing and implementing reliability engineering practices for cloud-native applications and working with cloud platforms (AWS and GCP).

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience)
  • 2–5 years of experience in IT operations, Site Reliability Engineering (SRE), DevOps, Platform Engineering, Production Engineering, or Cloud Operations
  • Experience designing and implementing reliability engineering practices for cloud-native applications
  • Experience with cloud platforms (AWS and GCP)
  • Familiarity with operating systems (Linux and/or Windows)
  • Basic scripting skills (PowerShell, Python, Bash, etc.)
  • Experience working with incident management processes and tools
  • Understanding of distributed systems and cloud-based architectures
  • Knowledge of system performance tuning and troubleshooting techniques
  • Experience with reliability tools
  • Strong troubleshooting, data analysis, and technical communication skills
  • Experience with Infrastructure as Code tools such as Terraform
  • Knowledge of monitoring and observability platforms
  • Advanced English

Benefits

  • Equal opportunity employer
  • Affirmative action employer
  • Disability accommodation
  • Diverse and inclusive work environment
  • Affinity groups for underrepresented groups
  • Professional development opportunities

Originally posted on Himalayas
