Site Reliability Engineer - Observability
We’re proud to be an ambitious, fast-growing technology scale-up with a dynamic and diverse international team representing more than 20 nationalities. Collaboration, flexibility, and continuous learning are part of our DNA.
At CluePoints, you’ll find a culture where you can grow, make an impact, and have fun along the way. Guided by our values of Care, Passion, and Smart Disruption, we’re united by a shared mission: to create smarter ways to run efficient clinical trials and deliver AI-powered insights that improve human outcomes worldwide.
Role:
The Site Reliability Engineer, Observability & RUM is responsible for improving end-to-end observability across our platforms and customer-facing applications, with a particular focus on frontend and Real User Monitoring (RUM). This role combines core SRE practices with ownership of monitoring, logging, tracing, alerting, and user-experience telemetry in production.
You will help evolve our observability capabilities across Azure and Kubernetes environments, improve incident detection and diagnosis, and support decisions around managed versus self-managed observability tooling. You will partner closely with Engineering, Support, QA, and Security teams to ensure systems ship with actionable telemetry, dashboards, alerts, and operational runbooks.
Job Requirements
- 5+ years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or Observability Engineering roles.
- Strong hands-on experience with observability and monitoring platforms, including several of the following: Elastic, Grafana, Prometheus, OpenTelemetry, Sentry, monitoring agents, and managed APM/observability platforms.
- Experience implementing and supporting Real User Monitoring (RUM) and frontend/application observability in production environments.
- Ability to work across frontend, backend, and platform teams to improve telemetry, alerting, and incident diagnosis.
- Experience evaluating or operating managed observability platforms and understanding the trade-offs versus self-managed stacks.
- Experience supporting ML, AI, or LLM-backed services in production (RAG, LangSmith, Arize Phoenix, LangChain, LangGraph, Azure OpenAI, OpenAI, or Anthropic APIs).
Responsibilities
- Own and improve Real User Monitoring (RUM) for customer-facing applications, including browser performance, client-side errors, user journeys, and frontend service dependencies.
- Partner with frontend, product, and engineering teams to improve visibility into user experience, JavaScript/runtime failures, page performance, and customer-impacting issues.
- Establish and maintain end-to-end observability across frontend, backend, infrastructure, and Kubernetes environments using metrics, logs, traces, dashboards, and alerting.
- Evaluate, implement, and operate managed and self-managed observability solutions, helping guide the evolution of the observability stack.
- Support and improve observability tooling such as Sentry, Elastic, Grafana, Prometheus, OpenTelemetry, monitoring agents, and related APM platforms.
- Define and maintain SLIs, SLOs, and alerting strategies that improve service reliability, reduce noise, and enable faster detection of production issues.
- Lead or support incident detection, alert triage, live production troubleshooting, and service restoration across outage, latency, batch, file transfer, and degradation scenarios, in partnership with Support and Production teams.
Benefits
- Comprehensive Health Insurance (medical, dental, and online consultations, 100% employee coverage)
- Life Insurance through UNUM
- Cafeteria Plan with flexible monthly credits for wellness, entertainment, and travel
- MultiSport Card, co-financed 50/50
- Employee Capital Plans (PPK) with 4% employer contribution
- A hub-based hybrid model that blends flexibility with purpose — connecting teams through collaboration, learning, and a vibrant social culture.
Your personal data will be processed by CluePoints for recruitment purposes in accordance with the Regulation (EU) 2016/679 (GDPR).
If you wish for your data to be retained for future opportunities, please include the following statement in your CV:
“I consent to the processing of my personal data by CluePoints for the purposes of future recruitment processes.”