ML Engineer (Evaluation and Experimentation)

Vatican City State (Holy See)

Company Overview

At Cynnovative, we leverage machine learning, computer science, and software engineering to address high-impact problems in the cyber domain, specifically those which are critical to U.S. national security. We primarily extend fundamental research to invent, design, develop, and deploy prototype solutions that support persistent problems in this domain.

Job Overview

As a Machine Learning Engineer (Evaluation & Experimentation) at Cynnovative, you will build and maintain systems that run large-scale experiments and evaluate LLM outputs. This role is crucial to rapid, experiment-driven iteration on LLM systems in support of U.S. national security efforts.

NOTE: This role requires an active TS/SCI security clearance and is located on-site in Northern Virginia.

Responsibilities \ May Include

Design and implement evaluation pipelines for LLM experimentation

Implement and apply metrics over model outputs at scale
Build automated evaluation workflows across large experiment sets
Execute statistical analysis and testing over experimental results
Ensure consistency and comparability of results across runs, configurations, and datasets

Develop experiment tracking and logging specifications

Define schemas for capturing prompts, perturbations, outputs, and configurations
Specify and validate logging of token-level probabilities, scores, and derived metrics
Ensure experiment data is structured, complete, and queryable for downstream analysis

Build and maintain datasets and evaluation inputs

Curate prompt sets, perturbation strategies, and test cases provided by the research team
Maintain versioned datasets and experiment inputs
Enable rapid iteration on experiment configurations and evaluation coverage

Collaborate cross-functionally

Work closely with ML systems engineers to ensure correct data capture at scale
Provide feedback on experiment execution, data quality, and metric behavior
Support interpretation of experimental results through reliable measurement

Requirements \ Must Have

B.S. in Computer Science, Data Science, or related field (M.S. or Ph.D. preferred)
Strong communication skills and ability to collaborate cross-functionally
Proficiency in Python and data processing
Experience building experiment, evaluation, or analytics pipelines
Familiarity with experiment tracking tools (MLflow or similar)
Experience working with large-scale or batch data processing workflows
Understanding of statistical methods
Experience working with structured and semi-structured data
Experience with version control systems, particularly Git
U.S. Citizenship and active TS/SCI security clearance

Desired Skills \ Nice To Have

Familiarity with prompt sensitivity, perturbation analysis, or robustness testing
Prior experience in a research-to-product environment
Understanding of A/B testing and large-scale experimentation
Familiarity with cyber-related data, tools, and techniques

Source: Linkedin
Location: Vatican City State (Holy See)
Compensation: Not listed
