Software Development Engineer
Pay rate: $45.00-$50.00/hr.
Summary:
We are seeking a Senior Operations / Reliability Engineer to support live operations, service reliability, release stability, and prototype device monitoring for new hardware and software products. This role will focus on monitoring telemetry, diagnosing live issues, validating software releases, supporting incident response, and helping improve operational readiness across services, applications, and prototype device environments.
The role will be strongly supported by experienced engineers on the team, who will provide technical guidance on service architecture, prototype device workflows, telemetry interpretation, release processes, and complex debugging. The engineer will collaborate closely with these senior team members while taking ownership of day-to-day monitoring, release validation, live issue triage, documentation, and operational reporting.
Responsibilities:
- Live Monitoring & Telemetry
- Monitor telemetry from services, applications, and prototype devices to assess operational health.
- Observe dashboards, alerts, logs, and metrics to identify anomalies, failures, performance degradation, or emerging reliability risks.
- Analyze real-time metrics and logs to support troubleshooting across cloud, on-premises, and prototype device environments.
- Triage operational issues and communicate findings clearly to engineering, QA, PM, and product teams.
- Provide actionable insights based on telemetry trends, system behavior, and recurring failure patterns.
- Help improve monitoring coverage, alert quality, dashboard usefulness, and operational visibility.
- Release & Service Operations
- Support software releases by validating deployments, monitoring live systems, and assessing post-deployment stability.
- Track service health during rollouts, ring deployments, updates, and release validation windows.
- Identify, debug, and help resolve live issues affecting services, devices, internal users, or product readiness.
- Partner with engineering teams to support mitigations, fixes, rollbacks, or follow-up validation.
- Assist with post-release verification and stabilization reporting.
- Document release observations, risks, incidents, and readiness concerns.
- Incident Response & Reliability Support
- Support incident response by gathering data, summarizing impact, identifying suspected causes, and tracking mitigation progress.
- Participate in post-incident reviews and help document lessons learned.
- Recommend improvements to monitoring, alerting, operational procedures, and service reliability practices.
- Maintain clear records of incidents, recurring issues, known risks, and follow-up actions.
- Help reduce operational toil by identifying repeatable troubleshooting steps, documentation gaps, and automation opportunities.
- On-Site Hardware & Environment Support
- Perform in-person troubleshooting for self-hosted systems, prototype devices, or test environments when telemetry or dashboards indicate issues.
- Assist with device configuration, deployment, validation, and live verification.
- Run smoke checks or readiness checks to confirm device, service, and environment health.
- Maintain documentation of hardware configurations, operational procedures, environment setup, and observed issues.
- Coordinate with engineering and infrastructure teams to resolve environmental or device-level reliability problems.
- Collaboration & Communication
- Work closely with software, QA, infrastructure, PM, and product teams to support operational readiness and release reliability.
- Communicate operational status, risks, and technical findings clearly and promptly.
- Provide concise summaries of system health, release readiness, incident status, and recommended next steps.
- Operate independently on assigned areas while escalating appropriately when issues require deeper engineering involvement.
- Deliverables
- Real-time telemetry dashboards, monitoring views, and actionable alerting improvements.
- Release verification and stabilization reports.
- Incident reports, issue summaries, and operational analysis for live events.
- Documentation of hardware configurations, device workflows, operational procedures, and troubleshooting steps.
- Service health summaries, risk assessments, and recommendations for reliability improvements.
- Clear communication of live issues, suspected causes, mitigation status, and follow-up actions.
- Recommendations for improving monitoring, alerting, release validation, and operational readiness.
Requirements:
- Bachelor’s degree in Computer Science, Computer Engineering, Software Engineering, or a related technical field, or equivalent practical experience.
- 5-7 years of relevant experience in software engineering, DevOps, SRE, production operations, infrastructure, service reliability, or related technical operations roles.
- Experience monitoring live services, applications, infrastructure, or device environments.
- Experience using dashboards, alerts, logs, metrics, and telemetry to diagnose system health and troubleshoot issues.
- Experience supporting software releases, deployments, production validation, or service rollouts.
- Ability to investigate technical issues, summarize findings, and communicate risks clearly to engineering and product teams.
- Experience documenting incidents, operational procedures, known issues, and troubleshooting steps.
- Familiarity with CI/CD workflows, cloud or hybrid infrastructure, release validation, and incident response practices.
- Strong problem-solving skills, communication skills, and ability to work independently in a fast-moving engineering environment.
Pay Transparency: The typical base pay for this role across the U.S. is $45.00-$50.00/hour. Non-exempt positions are eligible for overtime at a rate of 1.5 times the base hourly rate for all hours worked in excess of 40 in a work week, or as required by state or local law. Final offer amounts, within the base pay set forth above, are determined by factors including your relevant skills, education, and experience. Full-time employees are eligible to select from different benefits packages. Packages may include medical, dental, and vision benefits, health savings accounts with qualified medical plan enrollment, 10 paid days off, 3 days paid bereavement leave, 401(k) plan participation with employer match, life and disability insurance, commuter benefits, dependent care flexible spending account, accident insurance, critical illness insurance, hospital indemnity insurance, accommodations and reimbursement for work travel, and discretionary performance or recognition bonus. Sick leave and mobile phone reimbursement are provided based on state or local law.
Consent to Communication and Use of AI Technology: By submitting your application for this position and providing your email address(es) and/or phone number(s), you consent to receive text (SMS), email, and/or voice communication whether automated (including auto telephone dialing systems or automatic text messaging systems), pre-recorded, AI-assisted, or individually initiated from Aditi Consulting, our agents, representatives, or affiliates at the phone number and/or email address you have provided. These communications may include information about potential opportunities and information. Message and data rates may apply. Message frequency may vary.
You represent and warrant that the email address(es) and/or telephone number(s) you provided to us belong to you and that you are permitted to receive calls, text (SMS) messages, and/or emails at these contacts. You also acknowledge and agree to Aditi Consulting LLC’s use of AI technology during the sourcing process, including calls from an AI Voice Recruiter. AI is used solely to gather data and does not replace human-based decision-making in employment decisions. Calls may be recorded.
Consent is not a condition of purchasing any property, goods, or services. You may revoke your consent at any time by replying “STOP” to messages or by contacting [email protected].
For information about our collection, use, and disclosure of applicants' personal information, as well as applicants' rights over their personal information, please see our Privacy Policy.
#AditiConsulting
#26-02862