Applied AI Engineer – Systems & Reliability (remote/Berlin-based)
We work with some of the world's leading brands, including the NFL, Zapier, Celonis, and DAZN, and are backed by leading investors and operators such as Moonfire founder Mattias Ljungman, Capnamic, Cherry, André Christ (LeanIX, an SAP company), Mirko Novakovic (founder of Instana/Dash0), Micha Hernandez (Fiberplane), and others.
We’re hiring an Applied AI Engineer to build the backbone of how we ensure quality, reliability, and trust in our AI systems as we scale toward $10M ARR and beyond.
You’ll work directly with founders and play a central role in making sure our AI products are robust, measurable, and ready for enterprise production use. This role is for people who care deeply about quality, enjoy working on hard system problems, and want to build AI that actually works in the real world.
We are an extremely lean team and plan to reach $10M ARR with fewer than 20 people. Every hire materially changes the company. This role has direct exposure to founders and real responsibility from day one.
What you’ll do
Own evaluation systems and quality standards
- Build and maintain evaluation pipelines for core AI workflows across screening, interviews, assessments, and references
- Define metrics, benchmarks, and acceptance criteria for AI outputs
- Track performance over time (quality trends, drift, regressions) and make results visible across the team
- Identify issues across prompts, workflows, and data pipelines using both quantitative analysis and deep dives into real cases
- Design and implement improvements across:
  - prompting strategies
  - model selection, configuration, and fine-tuning
  - input data quality and preprocessing
  - orchestration and workflow design
- Push new systems from “working” (80%) to reliable and high-quality (95%+)
- Build and improve monitoring for AI systems (e.g. dashboards, alerts, tracing)
- Detect and prevent failure modes, breakdown risks, and performance degradation
- Monitor usage, rate limits, and capacity to ensure stable operation at scale
- Integrate AI and prompt testing into CI (e.g. regression tests, golden datasets, staging environments)
- Define standards and tooling so product and engineering teams can safely ship without introducing regressions
- Act as a quality gate for AI-related changes
- Prepare and support internal and external audits (e.g. SOC 2 and beyond)
- Provide evidence, documentation, and artifacts for AI system behavior and controls
- Translate audit findings into concrete improvements in systems and processes
- Build and productionize AI workflows that meet defined quality and reliability standards
- Support product and engineering teams in integrating AI cleanly into product logic and user experience
- Ensure new AI capabilities are robust, measurable, and maintainable before release
Requirements
- 100% alignment with our Ops Principles (if you feel this isn’t you, do not apply)
- Excitement for building in Go
- Experience working with AI/ML systems, LLMs, or data-intensive applications
- High ownership mindset and attention to detail
- Strong interest in quality, reliability, and system performance, not just building features
- Ability to debug complex systems across prompts, models, and data pipelines
- Clear communication and documentation skills
- Comfort improving systems and processes, not just using them
- Experience with evaluation methods, metrics, or experimentation is a strong plus
- Familiarity with monitoring, CI/CD, and production systems is a plus
Strong candidates often come from:
- AI/ML engineering or applied AI roles
- Backend or systems engineering roles with exposure to AI/ML
- Data science roles with strong engineering and production experience
- Other paths that demonstrate building and improving real-world systems with rigor
This role is remote or on-site in our Berlin office. We do not offer visa support for Germany at this time.
Benefits
- Direct ownership of one of the most critical parts of the company: AI quality and reliability
- Work closely with founders on core product and technical decisions
- Competitive salary and meaningful stock options
- Educational stipend to support ongoing learning and development
- The best team to work with (true story!)
Hiring process
- Step 1: AI Application Screen (immediate)
- Step 2: AI Recruiter Interview (right after successful AI Application Screen)
- Step 3: AI Skills-Assessment (right after successful AI Recruiter Interview)
- Step 4: Interview with Co-founder
- Step 5: Interview with the team (incl. Live Case Study)
- Step 6: References + Offer
- Duration: 1 week, end-to-end