CommIT Remotejobs · Posted 7d ago

Tech Ops - Production Support & Reliability (AWS)

Remote · Full-time

Description

We are looking for a Tech Ops - Production Support & Reliability Lead.

Front-line production support for Braviant's AWS multi-account stack: monitor systems, triage alerts, execute runbooks, and escalate cleanly to developers. This is a defensive ownership role, not a developer role, despite "Lead" in the title.

Stack:

- AWS - VPC, ECS, Lambda (SAM/CloudFormation), IAM, NAT, security groups
- PostgreSQL on Amazon RDS (~15 instances)
- Datadog + CloudWatch (APM, logs, alerting)
- Java microservices / API-heavy app stacks
- Jira (ITSM) + Slack (ops channels)
- Nice-to-have: AWS data services (Glue, S3, Athena, EventBridge), Metaplane

Requirements

Must-have:

- 3+ years production support / SRE / NOC / ops engineering
- Hands-on AWS - EC2/ECS, VPC networking, IAM
- Operational PostgreSQL / RDS - slow query reading, basic tuning, vacuum awareness
- Incident triage across infra + app layers
- Structured incident response (ITIL, NIST, or equivalent)
- SLA management in a ticketed environment (Jira or similar)
- Strong written English for escalation + post-incident write-ups

Nice-to-have:

- Datadog / CloudWatch fluency
- AWS data services (Glue, S3, Athena, EventBridge)
- Basic IaC (CloudFormation, SAM, Terraform)
- Financial services or other regulated-environment background
- AWS SysOps Administrator or Solutions Architect cert
- Scripting / automation
