GL Global · LinkedIn · Posted 2mo ago

Senior GPU & LLM Infrastructure Engineer (NVIDIA, vLLM, OpenShift AI)

New Caledonia

Our banking client is building a large-scale private GenAI environment and is seeking experienced engineers to support enterprise-grade on-prem inference platforms powered by NVIDIA H200 GPU clusters and OpenShift AI. This role focuses entirely on high-performance LLM inference and runtime optimization, not on model training or fine-tuning.

What You’ll Do

Optimize large-scale LLM inference performance across NVIDIA GPU environments.
Drive runtime efficiency across token generation pipelines, including KV cache and prefill/decode optimization.
Deploy and operate modern inference frameworks including vLLM and TensorRT-LLM.
Manage GPU throughput, batching strategies, latency tuning, and workload orchestration using RunAI and Kubernetes.
Oversee the full Hugging Face model lifecycle including onboarding, deployment, versioning, and retirement.
Operate and maintain OpenShift AI as the core enterprise GenAI platform.
Support production-grade self-hosted open-source LLM environments, including Llama models.

Experience

Strong background in AI infrastructure, GPU platforms, or LLM runtime engineering.
Hands-on experience with NVIDIA H200 GPU clusters and large-scale inference optimization.
Deep understanding of KV cache management, token serving pipelines, and inference latency optimization.
Expertise with OpenShift AI, Kubernetes GPU orchestration, and RunAI.
Strong experience with vLLM and TensorRT-LLM in production environments.
Proven experience managing Hugging Face model deployment lifecycles.

Source: LinkedIn
Location: New Caledonia
Compensation: Not listed