Talener Linkedin · Posted 19d ago

ML Engineer

New York City, New York, United States

Title: Machine Learning Engineer

Location: Remote

Client: Global newswire and media organization. Their content reaches more than half the global population daily. The tech org is modern and investing heavily in ML infrastructure to support large-scale media processing across text, image, and video.

Role Description

This is a senior hands-on ML engineering job focused on building and optimizing inference systems that run in production at scale. You'd be working across text, image, and video pipelines - processing millions of media assets to power news intelligence products. Think DistilBERT for NER, SBERT for embeddings, TransNetV2 for video shot detection, and external multimodal APIs for captioning.

This is not an MLOps or platform role. They need someone who can profile a transformer, rewrite its serving path for a 2-3x latency improvement, tune an HNSW index, and make smart infrastructure decisions on SageMaker instance selection to hit p95 targets at the lowest cost. If your background is primarily Terraform, Kubernetes admin, or CI/CD pipelines - this isn't the right fit.

You'll partner closely with MLOps, platform engineering, data scientists, and product teams - but ownership of model performance, inference logic, and pipeline efficiency lives here.

Required Skills

  • 5+ years building production ML inference systems
  • Python - core to everything in this role
  • PyTorch (TorchScript, ONNX, FastAPI/TorchServe) and TensorFlow (SavedModel, tf.data, XLA, TFLite) - both required
  • Deep hands-on experience with transformer-based models (BERT family - DistilBERT, SBERT, etc.) in production
  • Inference optimization at scale - quantization, distillation, compilation, kernel/profile-level performance work
  • AWS infrastructure - EC2, Batch, Lambda, SageMaker across different media workload types
  • Hybrid search architecture experience - BM25 + vector search + cross-encoder reranking
  • Asynchronous processing systems - reliability, caching, deduplication, observability
  • Data pipeline and workflow orchestration (Airflow or similar)
  • Video frameworks - FFmpeg, large-scale frame-level inference
  • Must have experience in the media industry
  • Must have experience working with large amounts of data, including text, images and videos

Nice to Have

  • Experience with TransNetV2 or similar video shot boundary detection
  • Familiarity with Hugging Face open-source LLMs
  • OpenAI API or other foundation model provider experience
  • Hybrid CPU/GPU environment experience at scale

Compensation

Base salary up to 150,000.00 + 15% bonus target

For additional information or to apply, please contact Bethany Moulthrop at [email protected]
