Machine Learning Engineer

New York, United States

Hi,

Good day!

If you are interested in the job role below, please reply with your updated resume and contact details.

Role: ML Engineer

Hybrid: 3 Days onsite – NYC, NY

Type: Contract

Duration: Long Term

Candidates whose primary background is MLOps platform work (DAG orchestration, Terraform, Kubernetes administration, generic CI/CD pipelines) will not be a fit. We need a senior-level engineer who can profile a transformer, rewrite its serving path for a 2–3x latency reduction, tune an HNSW index, and tell us which SageMaker instance type will hit our p95 target at the lowest cost.

Roles & Responsibilities

Design, build, and scale ML-powered inference systems that process large volumes of text, image, and video data to power news-based intelligence products.
Productionize and optimize state-of-the-art models and inference pipelines. These models include, but are not limited to:
DistilBERT for Named Entity Recognition (NER) over hundreds of thousands of search queries/day
TransNetV2 for video shot boundary detection at scale, both on archival video and in real time
SBERT for embedding generation from textual descriptions
External multimodal APIs for image/video captioning
Support hybrid search architectures by defining embedding/re-ranking interfaces, evaluation metrics, and inference performance requirements; partner with search/platform engineers on index configuration, sharding, and cluster tuning.
Design and implement scalable data processing pipelines across hybrid CPU/GPU environments to handle millions of media assets.
Partner with MLOps and platform engineering to enable the deployment and operation of ML systems reliably, contributing to:
Distributed inference architectures
Cloud-based execution (e.g., AWS EC2, Batch, Lambda, SageMaker)
Efficient resource utilization across workloads
Optimize inference latency and throughput across distributed workloads using cloud-based resources (AWS EC2, Batch, Lambda, SageMaker, etc.)
Build resilient asynchronous processing systems for large-scale workloads, ensuring:
Reliability (retries, fault tolerance)
Efficiency (caching, deduplication)
Observability (metrics, logging, traceability)
Work closely with data scientists and product teams to iterate on models, improve performance, and deliver measurable impact in production.

Requirements:

8+ years of experience building ML inference systems.
Demonstrated ownership of deep-learning inference optimization in production (quantization, distillation, compilation, kernel/profile-level performance work) for transformer NLP and/or CV models.
Experience with TensorFlow (SavedModel, tf.data, XLA, TFLite) & PyTorch (TorchScript, ONNX, FastAPI/TorchServe)
Hands-on experience optimizing inference pipelines on AWS infrastructure, across different types of media assets.
Experience with video frameworks/tools (e.g., FFmpeg) and working with large-scale frame-level inference.
Demonstrated experience monitoring and debugging model latency, memory, and pipeline throughput.
Experience with hybrid search architectures (BM25 + vector search + cross-encoder reranking).
Familiarity with OpenAI APIs or other foundation model providers.
Familiarity with open-source Hugging Face LLMs.
Experience with data pipeline and workflow orchestration tools (e.g., Airflow)

ALIS Software LLC

Source: LinkedIn
Location: New York, United States
Compensation: Not listed

Latest role posted 2 months ago