EXL Linkedin · Posted 10d ago

Senior Machine Learning Engineer

Gurugram, Haryana, India

About the Company



We are looking for an experienced LLM Ops Engineer to own the end-to-end lifecycle of LLM applications in production - from model selection and pipeline design through fine-tuning, deployment, observability, and continuous improvement. This role sits at the intersection of ML Engineering, DevOps, and Data Engineering, and is critical to ensuring that GenAI systems are reliable, cost-efficient, and scalable in enterprise environments. You will partner closely with AI Research, Product, Platform, and Data Engineering teams.



About the Role



We are looking for an experienced LLM Ops Engineer to own the end-to-end lifecycle of LLM applications in production.



Responsibilities



  • Design, build, and maintain end-to-end LLM pipelines - from data ingestion and pre-processing through model training, fine-tuning, and deployment into production.
  • Implement and manage CI/CD pipelines for ML/LLM workflows using tools such as MLflow, Kubeflow, GitHub Actions, etc., ensuring reproducibility and fast iteration cycles.
  • Own model lifecycle management: versioning, A/B testing, canary deployments, rollbacks, and governance - ensuring models are always production-safe.
  • Architect and operate LLM serving infrastructure on cloud or on-premises with high availability, low latency, and cost efficiency.
  • Build robust monitoring, observability, and alerting frameworks for model drift, hallucinations, latency, token costs, and quality regressions (LangSmith, Weights & Biases, others).
  • Build RAG pipelines backed by vector databases, and drive model fine-tuning initiatives for domain-specific applications.
  • Establish and enforce LLMOps best practices including prompt versioning, evaluation frameworks, guardrails, PII policies, and audit trails.
  • Manage AI Gateway and model routing across multiple LLM providers (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, Vertex AI) with unified auth, rate limiting, and fallback logic.
  • Optimise inference costs through quantisation, batching strategies, hardware (GPU/TPU) optimisation, and model compression.
  • Mentor junior engineers and contribute to internal documentation and platform tooling.


Qualifications



  • B.Tech / M.Tech in CS, AI/ML, Mathematics or equivalent.


Required Skills



  • Languages: Python (advanced)
  • Frameworks: LangChain, LangGraph, Hugging Face, PyTorch, TensorFlow
  • MLOps / Pipeline Tools: MLflow, Kubeflow, Apache Airflow, Prefect
  • DevOps / Infra: Docker, Kubernetes, GitHub Actions
  • Cloud Platforms: AWS Bedrock, Azure OpenAI, Google Vertex AI


Preferred Skills



  • Experience with RAG and vector databases, fine-tuning (LoRA, PEFT), LLM observability (LangSmith, Weights & Biases, others), and prompt evaluation.
  • Good to have: security governance (LLM red-teaming, PII redaction, AI safety guardrails) and streaming (event-driven architecture).


Experience



  • 6 – 10+ years overall in software / ML engineering
  • 3+ years hands-on with the production LLM/ML lifecycle



Equal Opportunity Statement



We are committed to diversity and inclusivity.
