IQuest Solutions Corporation
LinkedIn · Posted 3mo ago
AI Engineer - Autonomous Agents & Model Infrastructure
Location: Hyderabad, India (US Timings) | Experience Level: 3-4 years (AI/Data Science) + 2 years (MLOps/LLMOps/AIOps)
About The Role
We're seeking an experienced AI Engineer to design, deploy, and manage autonomous agent systems on proprietary infrastructure. You'll own the full lifecycle, from optimizing model weights to building production-grade agents with fine-tuning and reinforcement learning on on-premises or private cloud environments.
Key Responsibilities
- Design and deploy autonomous agent architectures on AWS VPC and on-premises environments
- Manage model weights and optimize for inference; implement LoRA and QLoRA fine-tuning for domain-specific tasks
- Develop reinforcement learning pipelines for agent training with reward modeling and policy optimization
- Implement MLOps/LLMOps infrastructure: model versioning, A/B testing, rollbacks, and evaluation frameworks
- Architect RAG systems integrating vector databases with proprietary and fine-tuned models
- Optimize model serving infrastructure (vLLM, TorchServe, TensorRT) for production inference
- Build monitoring and observability systems for agent behavior and RL training quality
- Ensure model security, data privacy, and audit compliance in enterprise deployments
Requirements
- 3-4 years hands-on experience in AI/ML/Data Science with at least 2 projects shipped to production
- 2+ years dedicated experience in MLOps, LLMOps, or AIOps (model deployment, inference optimization, pipeline automation, model management)
- AWS proficiency across AI services: EC2, VPC, S3, IAM, SageMaker, Bedrock, Lambda, or custom ML infrastructure
- Strong software engineering fundamentals: containerization (Docker), orchestration (Kubernetes), CI/CD, and API design
- Hands-on experience deploying and serving large language models or foundation models in production environments
- Practical experience with LoRA (Low-Rank Adaptation) and QLoRA (Quantized LoRA) fine-tuning techniques for efficient model adaptation
- Understanding of reinforcement learning fundamentals and experience implementing RL-based training: policy gradients, reward shaping, or preference-based optimization
- Working knowledge of vector databases and RAG implementation
- Solid understanding of model optimization techniques and inference constraints (GPU memory, latency, throughput)
- Experience building autonomous agents with RL methods (DPO, PPO, RLHF) and fine-tuning frameworks (Hugging Face Transformers, PEFT)
- QLoRA experience on consumer-grade GPUs in memory-constrained environments
- Migration experience from cloud APIs (OpenAI, Anthropic) to self-hosted models
- On-premises or VPC-only deployment experience
- Familiarity with agent frameworks (LangChain, LlamaIndex, AutoGen) and MLOps tools (MLflow, W&B, DVC)
- Strong debugging and systems thinking approach with evidence-based problem-solving