Python Infrastructure Engineer - Model Evaluation

Colombia

Python Infrastructure Engineer — Model Evaluation (AI Training)

About The Role

What if your Python expertise could directly shape how the world's most advanced AI models are built, tested, and improved? We're looking for a senior Python engineer to design and build the data pipelines, evaluation harnesses, and annotation tooling that sit at the heart of cutting-edge AI development.

This is a fully remote, flexible contract role working alongside leading AI research labs on real production systems. If you're a strong Python engineer who wants to do meaningful, high-impact work at the frontier of AI, this is the role for you.

Organization: Alignerr
Type: Hourly Contract
Location: Remote
Commitment: 20–40 hours/week

What You'll Do

Design, build, and optimize high-performance Python systems supporting AI data pipelines and model evaluation workflows
Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control
Build and maintain evaluation harnesses that integrate with ML inference frameworks
Improve reliability, performance, and safety across existing Python codebases
Instrument systems with observability and metrics collection to monitor reliability and model performance
Identify bottlenecks and edge cases in data and system behavior, and implement scalable fixes
Collaborate with data, research, and engineering teams to support model training and evaluation workflows
Participate in synchronous design reviews to iterate on architecture and implementation decisions

Who You Are

Native or fluent English speaker with clear written and verbal communication skills
Full-stack developer with a strong systems programming background
3–5+ years of professional experience writing production-grade Python
Experienced in building evaluation harnesses for ML models and integrating them with inference frameworks
Solid background in observability, metrics collection, and monitoring for production systems
Self-motivated and reliable — able to commit 20–40 hours per week

Nice to Have

Prior experience with data annotation, data quality, or evaluation systems
Familiarity with AI/ML workflows, model training, or benchmarking pipelines
Experience with distributed systems or developer tooling
Background in MLOps or AI infrastructure

Why Join Us

Work directly on cutting-edge AI projects alongside leading research labs
Fully remote and flexible — structure your work week around your life
Freelance autonomy with the depth and consistency of meaningful, long-term technical work
Make a tangible impact on how next-generation AI models are evaluated and improved
Potential for ongoing work and contract extension as new projects launch

Source: LinkedIn
Location: Colombia
Compensation: Not listed