Senior TechOps Engineer
For more information about Momentum Group, please visit our website at www.mgh.ae.
Job Overview
We are seeking a Senior TechOps Engineer to join the Technical Operations team at Momentum Corporate Services. This role is responsible for maintaining and operating our regulated platforms, working within a 24/7 NOC environment to ensure system reliability and rapid incident response while meeting compliance requirements.
Key Responsibilities
- Build, operate, and support UAT and production environments on Microsoft Azure
- Perform hands-on RHEL administration, including hardening, patching, tuning, and troubleshooting
- Act as the senior escalation point for L2/L3 production incidents, driving root cause analysis and permanent fixes
- Support and troubleshoot Java microservices platforms built on Spring Boot
- Diagnose JVM-level performance issues (memory, GC, threads, latency) using metrics and logs
- Support service-to-service communication frameworks (e.g. Dubbo), including service discovery and client-side load balancing
- Drive security vulnerability remediation across OS, middleware, and cloud infrastructure
- Collaborate with Security, SOC, and NOC teams on incident response and post-incident reviews
- Support and mentor L1 engineers, ensuring quality handover and resolution
- Participate in 24/7 on-call rotation
- Manage incidents, problems, and changes using Jira Service Management
- Build and maintain runbooks, dashboards, and operational documentation
- Improve platform reliability through automation, CI/CD pipelines, and scripting
Requirements
- 4+ years of experience in TechOps / Infrastructure / SRE / Production Operations
- Strong hands-on Azure and RHEL administration experience
- Proven experience supporting Java microservices and middleware platforms in production
- Advanced Elasticsearch and JVM observability experience
- Hands-on experience with CI/CD pipelines, automation, and Python scripting
- Strong incident management and root cause analysis capability
- Familiarity with ITIL processes (Incident, Problem, Change)
- Comfortable operating in 24/7 production environments
- Experience managing, developing, and supporting observability platforms (e.g., Prometheus, Grafana, ELK) for system reliability and performance monitoring
- Business-level proficiency in English is required
- Proficiency in Mandarin or Chinese is desirable but not mandatory