ML Engineer

New York City, New York, United States

Indexed description

Title: Machine Learning Engineer

Location: Remote

Client: Global newswire and media organization. Their content reaches more than half the global population daily. The tech org is modern and investing heavily in ML infrastructure to support large-scale media processing across text, image, and video.

Role Description

This is a senior hands-on ML engineering job focused on building and optimizing inference systems that run in production at scale. You'd be working across text, image, and video pipelines - processing millions of media assets to power news intelligence products. Think DistilBERT for NER, SBERT for embeddings, TransNetV2 for video shot detection, and external multimodal APIs for captioning.

This is not an MLOps or platform role. They need someone who can profile a transformer, rewrite its serving path for a 2-3x latency improvement, tune an HNSW index, and make smart infrastructure decisions on SageMaker instance selection to hit p95 targets at the lowest cost. If your background is primarily Terraform, Kubernetes admin, or CI/CD pipelines - this isn't the right fit.

You'll partner closely with MLOps, platform engineering, data scientists, and product teams - but ownership of model performance, inference logic, and pipeline efficiency lives here.

Required Skills

5+ years building production ML inference systems
Python - core to everything in this role
PyTorch (TorchScript, ONNX, FastAPI/TorchServe) and TensorFlow (SavedModel, tf.data, XLA, TFLite) - both required
Deep hands-on experience with transformer-based models (BERT family - DistilBERT, SBERT, etc.) in production
Inference optimization at scale - quantization, distillation, compilation, kernel/profile-level performance work
AWS infrastructure - EC2, Batch, Lambda, SageMaker across different media workload types
Hybrid search architecture experience - BM25 + vector search + cross-encoder reranking
Asynchronous processing systems - reliability, caching, deduplication, observability
Data pipeline and workflow orchestration (Airflow or similar)
Video frameworks - FFmpeg, large-scale frame-level inference
Must have experience in the media industry
Must have experience working with large amounts of data, including text, images and videos

Nice to Have

Experience with TransNetV2 or similar video shot boundary detection
Familiarity with HuggingFace open source LLMs
OpenAI API or other foundation model provider experience
Hybrid CPU/GPU environment experience at scale

Compensation

Base salary up to 150,000.00 + 15% bonus target

For additional information or to apply, please contact Bethany Moulthrop at [email protected]

Talener Company profile preview

Source: LinkedIn
Location: New York City, New York, United States