Cynnovative Linkedin · Posted 1mo ago

ML Engineer (Evaluation and Experimentation)

Vatican City State (Holy See)

Company Overview

At Cynnovative, we leverage machine learning, computer science, and software engineering to address high-impact problems in the cyber domain, specifically those which are critical to U.S. national security. We primarily extend fundamental research to invent, design, develop, and deploy prototype solutions that address persistent problems in this domain.


Job Overview

As a Machine Learning Engineer (Evaluation & Experimentation) at Cynnovative, you will build and maintain systems that run large-scale experiments and evaluate LLM outputs. This role is crucial to rapid, experiment-driven iteration on LLM systems in support of U.S. national security efforts.


NOTE: This role requires an active TS/SCI security clearance and is located on-site in Northern Virginia.


Responsibilities \ May Include


Design and implement evaluation pipelines for LLM experimentation

  • Implement and apply metrics over model outputs at scale
  • Build automated evaluation workflows across large experiment sets
  • Execute statistical analysis and testing over experimental results
  • Ensure consistency and comparability of results across runs, configurations, and datasets

Develop experiment tracking and logging specifications

  • Define schemas for capturing prompts, perturbations, outputs, and configurations
  • Specify and validate logging of token-level probabilities, scores, and derived metrics
  • Ensure experiment data is structured, complete, and queryable for downstream analysis

Build and maintain datasets and evaluation inputs

  • Curate prompt sets, perturbation strategies, and test cases provided by the research team
  • Maintain versioned datasets and experiment inputs
  • Enable rapid iteration on experiment configurations and evaluation coverage

Collaborate cross-functionally

  • Work closely with ML systems engineers to ensure correct data capture at scale
  • Provide feedback on experiment execution, data quality, and metric behavior
  • Support interpretation of experimental results through reliable measurement


Requirements \ Must Have


  • B.S. in Computer Science, Data Science, or related field (M.S. or Ph.D. preferred)
  • Strong communication skills and ability to collaborate cross-functionally
  • Proficiency in Python and data processing
  • Experience building experiment, evaluation, or analytics pipelines
  • Familiarity with experiment tracking tools (MLflow or similar)
  • Experience working with large-scale or batch data processing workflows
  • Understanding of statistical methods
  • Experience working with structured and semi-structured data
  • Experience with version control systems, particularly Git
  • U.S. Citizenship and active TS/SCI security clearance


Desired Skills \ Nice To Have


  • Familiarity with prompt sensitivity, perturbation analysis, or robustness testing
  • Prior experience in a research-to-product environment
  • Understanding of A/B testing and large-scale experimentation
  • Familiarity with cyber-related data, tools, and techniques

