Rise Technical Linkedin · Posted 19d ago

Machine Learning Engineer (Model Evaluation)

Canada


ML Engineer (Model Evaluation & Benchmarking)

San Francisco, California (Hybrid)

$160,000 - $220,000 + Equity + Healthcare + 401(k) + PTO


Are you a Machine Learning Engineer passionate about AI reliability, looking to build the critical systems that ensure multimodal generative models are production-ready?


This is an opportunity to join a high-growth AI startup at the forefront of multimodal technology. As we move from research into production, this role is dedicated to building the sophisticated evaluation frameworks that measure realism, consistency, and performance for next-generation vision models.


In this role, you will sit at the intersection of applied science, infrastructure, and product. You will design and deploy automated benchmarking pipelines that act as the definitive gate for model quality, ensuring that our generative and vision-based architectures behave predictably at scale.


This role would suit an engineer who enjoys solving complex "black-box" testing problems and building the production-ready infrastructure that defines how quality is measured in generative AI.


The Role:

*Build automated evaluation pipelines to validate multimodal AI models at scale.

*Develop benchmarking workflows tailored to diffusion models and other generative systems.

*Integrate evaluation tooling into CI/CD workflows to detect regressions across model checkpoints.

*Collaborate with Research and Infrastructure teams to bridge the gap between experiment and deployment.


The Person:

*Strong programming skills in Python with experience in machine learning frameworks.

*Experience with ML experimentation workflows and model validation techniques.

*Familiarity with generative AI (diffusion models such as Stable Diffusion, or VLMs) and image/video benchmarking.

*Solid software engineering fundamentals, including OOP and scalable data structures.

*US citizen or Green Card holder.
