Gramian Consulting · LinkedIn · Posted 2mo ago

AI Evaluation Engineer - Mathematics & Algorithms

Colombia, Huila, Colombia

Indexed description

About Us

Gramian Consultancy is a boutique consultancy specializing in IT professional services and engineering talent solutions. With a strong background in software engineering and leadership, we help companies build high-performing teams by matching them with professionals who truly fit their needs.

Role Overview

We are looking for a highly analytical and computationally strong professional with a solid research background in mathematics or quantitative fields.

In this role, you will design advanced benchmark tasks for multi-agent AI systems, focusing on complex mathematical reasoning, algorithmic problem-solving, and verifiable computational outputs. You will contribute by crafting challenging problems, building validation systems, and structuring tasks that require decomposition into coordinated sub-solutions.

Commitment required: 8 hours per day, with 4 hours overlapping PST.

Employment type: Contractor assignment (no medical/paid leave)

Duration of contract: 4 weeks+

Location: Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Indonesia, Kenya, Nigeria, Turkey, Vietnam

Interview: Take-home assessment (60 min) + short interview

Responsibilities

Design and build multi-agent benchmark tasks requiring multi-step mathematical reasoning and algorithmic problem-solving
Create complex, decomposable problems across domains such as:

Competition mathematics
Numerical analysis
Combinatorial optimization
Statistical inference

Develop verification scripts to validate:

Numerical outputs (with tolerance thresholds)
Proof correctness and logical steps
Algorithmic outputs and constraints

Write clear, structured problem statements with precise notation and defined outputs
Design task decomposition strategies for parallel or multi-agent execution
Implement computational solutions and validation pipelines using Python
Work with containerized environments (Docker) for reproducibility and evaluation
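
As an illustration of the "numerical outputs (with tolerance thresholds)" item above, a verification script might look something like the sketch below. It is not taken from the posting: the function name `check_numeric`, the default tolerances, and the sample task are all hypothetical, and a real benchmark would set its thresholds per problem.

```python
import math


def check_numeric(submitted, expected, rel_tol=1e-9, abs_tol=1e-12):
    """Return True if `submitted` matches `expected` within the tolerances.

    Hypothetical helper: rel_tol and abs_tol are illustrative defaults only.
    """
    return math.isclose(submitted, expected, rel_tol=rel_tol, abs_tol=abs_tol)


# Hypothetical task: report sqrt(2) to roughly nine significant digits.
reference = math.sqrt(2)
print(check_numeric(1.4142135624, reference))  # accurate enough to pass
print(check_numeric(1.4142, reference))        # too coarse, should fail
```

Combining a relative and an absolute tolerance keeps the check meaningful both for large values and for expected values near zero, where a purely relative test would reject almost everything.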

Requirements

5+ years in mathematics, quantitative research, or computational science
Strong Python skills for scientific computing (NumPy, SciPy, SymPy or similar)
Experience solving or designing complex mathematical / algorithmic problems
Ability to create precise, verifiable outputs (no subjective problems)
Experience with mathematical proofs or formal reasoning
Familiarity with AI benchmarks or evaluation frameworks (e.g., SWE-bench)
Comfortable working with Docker environments
Solid understanding of numerical methods (precision, convergence, error bounds)

Gramian Consulting Company profile preview

Source: LinkedIn
Location: Colombia, Huila, Colombia
Compensation: Not listed
Open on Caio: 15 roles