Senior Software Engineer
Your Role And Responsibilities
- Design, build, and operate production-grade Python backend services using FastAPI, Pydantic, and Uvicorn (ASGI architecture) with a focus on low-latency, high-reliability systems.
- Own services end-to-end in production, including architecture design, implementation, testing with pytest, deployment, monitoring, and incident response.
- Design API contracts and service boundaries for distributed systems, ensuring clear data flow, backward compatibility, and scalable integration patterns.
- Build and operate LLM and agent-based production systems, integrating frameworks such as LangChain/LangGraph and model providers or gateways (e.g., AWS Bedrock, LiteLLM) into real workflows.
- Maintain a clear separation between experimental and production AI systems, ensuring safe rollout of model-driven features.
- Design and evolve CI/CD pipelines that enforce automated testing, security validation, and deterministic, rollback-safe deployments.
- Debug and resolve production incidents in distributed systems using logs, metrics, and distributed tracing (OpenTelemetry / ddtrace patterns), with clear ownership of service health.
- Operate cloud-native services on AWS, including secure configuration, secrets management, IAM-controlled access, and production runtime reliability.
- Design and maintain containerized workloads using Docker and Kubernetes, including rollout strategies (blue/green or rolling), health checks, autoscaling, and resource tuning.
- Build systems to observability standards (metrics, tracing, logging) so that reliability and performance are measurable and failures can be diagnosed.
- Participate in an on-call or production support rotation, with ownership of service-level outcomes and incident resolution.
- Collaborate in Agile teams to translate stakeholder requirements into production systems, with emphasis on technical design ownership and delivery accountability.
Required Technical And Professional Expertise
- Experience leading technical design discussions, influencing architecture decisions, and mentoring engineers on production system design and operations.
- Proven experience building and operating production backend services in Python using FastAPI, Pydantic, and ASGI-based architectures.
- Demonstrated ownership of systems in production, including deployment, monitoring, on-call support, and incident resolution, as well as maintaining CI/CD pipelines with automated testing, release gating, and rollback-safe deployment strategies.
- Strong experience designing and integrating distributed systems, including API contracts, service boundaries, and data flow between components.
- Experience debugging complex production issues using observability tooling (logs, metrics, distributed tracing such as OpenTelemetry or ddtrace).
- Experience building or integrating LLM-based or AI-driven systems, including orchestration frameworks and external model providers.
- Strong testing and code quality discipline, including use of pytest, linting, and automated quality enforcement in CI pipelines.
- Experience operating LLM-based systems in production, including agent orchestration, tool integration, and safe rollout of model-driven features.
- Experience designing event-driven or asynchronous architectures, including message queues, background processing, and workflow orchestration.
- Experience contributing to or owning internal platforms or developer tooling, improving engineering velocity and system reliability.
- Experience implementing security best practices in production, including least-privilege access, secrets rotation, and secure service-to-service communication.