WOW Remote Teams Himalayas · Posted yesterday

AI Engineer - Classifiers, Media Intelligence & Voice R&D

USD · Full time · Remote

AI Engineering · Machine Learning Engineering · Computer Vision Engineering · Voice AI Engineering


This is a remote position.

Our client is looking for an innovative and driven AI Engineer to join their team. A leader in media intelligence and AI-driven content creation, they have recently expanded their work in AI voice and image technologies, driving the development of the next generation of cutting-edge products. This role will focus on the creation, classification, and organization of massive volumes of AI-generated media, along with spearheading R&D into AI voice and audio generation and advanced image intelligence capabilities.

Job Description

Responsibilities:

  • Design, train, and deploy classification models for the content pipeline, including style detection, quality scoring, content moderation, filtering, and semantic categorization of generated media.
  • Develop and maintain automated tagging and organization systems for the media library: extracting attributes, detecting visual features, clustering similar content, and enabling intelligent search.
  • Build and optimize training data pipelines: create annotation tooling, curate datasets, establish active learning loops, and ensure high-quality labeled data.
  • Lead R&D into AI voice and audio generation, including voice cloning, text-to-speech, and audio synthesis; prototype integrations and create a production-ready pathway from research to features.
  • Research and prototype image intelligence technologies such as face/body analysis, pose estimation, style transfer, and image-to-image consistency.
  • Develop evaluation frameworks to measure the accuracy of classifiers, the quality of generation models, and model drift over time.
  • Optimize inference pipelines for performance, cost, and latency, using batching, quantization, caching, and model serving strategies.
  • Integrate with GPU compute infrastructure and deliver models via production APIs.

Requirements:

  • 3+ years of experience building and deploying machine learning models in production, particularly in classification, tagging, or content understanding.
  • Hands-on experience with model training, including dataset curation, experimenting with architectures, tuning hyperparameters, and debugging.
  • Strong background in image classification and computer vision techniques (e.g., CNNs, vision transformers, CLIP).
  • Experience or demonstrated interest in voice/audio AI (e.g., text-to-speech, voice cloning, audio classification).
  • Proficiency in Python, with experience in PyTorch or TensorFlow.
  • Experience with building data labeling pipelines, annotation workflows, or active learning systems.
  • Understanding of model serving in production environments, including REST APIs and latency optimization.

Qualifications:

  • Bachelor’s degree or higher in Computer Science, Engineering, or a related field.
  • Experience in AI/ML, particularly in content classification, tagging, and media organization systems.
  • Proven experience with Python and ML frameworks like PyTorch or TensorFlow.
  • Strong communication skills to collaborate with R&D teams and integrate new technologies into production.

Benefits:

  • 100% remote, full-time role.
  • Flexible work hours.
  • Competitive salary and comprehensive benefits package.
  • Opportunity for career advancement and personal development.
  • Work on high-impact projects with cutting-edge AI technologies.

Originally posted on Himalayas
