EXL · LinkedIn · Posted 1mo ago

Senior Machine Learning Engineer

Gurugram, Haryana, India

About the Role

We are looking for an experienced LLM Ops Engineer to own the end-to-end lifecycle of LLM applications in production, from model selection and pipeline design through fine-tuning, deployment, observability, and continuous improvement. This role sits at the intersection of ML Engineering, DevOps, and Data Engineering, and is critical to ensuring that GenAI systems are reliable, cost-efficient, and scalable in enterprise environments. You will partner closely with AI Research, Product, Platform, and Data Engineering teams.

Responsibilities

Design, build, and maintain end-to-end LLM pipelines, from data ingestion and pre-processing through model training, fine-tuning, and deployment into production.
Implement and manage CI/CD pipelines for ML/LLM workflows using tools such as MLflow, Kubeflow, and GitHub Actions, ensuring reproducibility and fast iteration cycles.
Own model lifecycle management (versioning, A/B testing, canary deployments, rollbacks, and governance) so that models are always production-safe.
Architect and operate LLM serving infrastructure on cloud or on-premises with high availability, low latency, and cost efficiency.
Build robust monitoring, observability, and alerting frameworks for model drift, hallucinations, latency, token costs, and quality regressions (LangSmith, Weights & Biases, others).
Work with RAG pipelines backed by vector databases, and drive model fine-tuning initiatives for domain-specific applications.
Establish and enforce LLMOps best practices including prompt versioning, evaluation frameworks, guardrails, PII policies, and audit trails.
Manage AI Gateway and model routing across multiple LLM providers (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, Vertex AI) with unified auth, rate limiting, and fallback logic.
Optimise inference costs through quantisation, batching strategies, hardware (GPU/TPU) optimisation, and model compression.
Mentor junior engineers and contribute to internal documentation and platform tooling.

Qualifications

B.Tech / M.Tech in CS, AI/ML, Mathematics or equivalent.

Required Skills

Languages: Python (advanced)
Frameworks: LangChain, LangGraph, Hugging Face, PyTorch, TensorFlow
MLOps / Pipeline Tools: MLflow, Kubeflow, Apache Airflow, Prefect
DevOps / Infra: Docker, Kubernetes, GitHub Actions
Cloud Platforms: AWS Bedrock, Azure OpenAI, Google Vertex AI

Preferred Skills

Experience with RAG and vector databases, fine-tuning (LoRA, PEFT), LLM observability (LangSmith, Weights & Biases, and others), and prompt evaluation.
Good to have: security governance (LLM red-teaming, PII redaction, AI safety guardrails) and streaming (event-driven architecture).

Experience

6 to 10+ years overall in software / ML engineering

3+ years of hands-on production LLM/ML lifecycle experience

Equal Opportunity Statement

We are committed to diversity and inclusivity.

Source: LinkedIn
Location: Gurugram, Haryana, India
Compensation: Not listed