turbalance · LinkedIn · Posted 2mo ago

AI Trace Generation Engineer

Heidelberg, Baden-Württemberg, Germany

Your mission

To support our growing team, we are looking for an experienced AI Trace Generation Engineer to join us. In this role, you will take both a strategic and hands-on approach to designing and building systems that enable deep visibility into distributed AI workloads. This includes developing trace collection, instrumentation, and simulation capabilities that help optimize performance across large-scale, multi-GPU environments. You will work at the intersection of machine learning and systems engineering, contributing to the core infrastructure powering next-generation AI workloads.

Design and implement a trace collection system for distributed LLM workloads, capturing compute operations, communication primitives, memory usage, and cluster topology across multi-GPU and multi-node setups
Validate that collected traces accurately reflect real workload behavior by verifying operation completeness, timing consistency, and data integrity across inference and training pipelines
Integrate with and instrument major LLM frameworks (vLLM, TensorRT-LLM, DeepSpeed, Megatron-LM, and others) to extract meaningful execution data without degrading performance
Use collected traces as input to discrete event simulations that model and replay distributed AI workload behavior at scale
Analyze trace data to surface bottlenecks and inefficiencies across the stack, from individual kernel execution to cluster-wide communication patterns

Your profile

3+ years of experience in AI systems, ML infrastructure, or a closely related area
Hands-on experience with at least one major LLM serving or training framework
Strong proficiency in Python and C++, with a solid understanding of GPU architecture, memory bandwidth, and the difference between compute-bound and memory-bound operations
Solid understanding of distributed communication
Familiarity with parallelism strategies and how they shape execution behavior across large clusters

Nice to have

Open source contributions or published research in relevant areas
Experience in startup environments, with the ability to move quickly, navigate ambiguity, and take ownership

Why us?

Build something big: Help build and scale a fast-growing AI infrastructure startup
Pay & perks: Competitive compensation with a performance-based incentive, subsidized Deutschlandticket, and access to a discount portal
Work your way: Flexible hours with hybrid and remote-friendly options
Fast lanes, no red tape: Flat hierarchies and rapid decision-making mean ideas ship quickly
Global team: Work with a diverse, international team across Germany and the USA
Modern headquarters: Well-stocked office near the Heidelberg Hauptbahnhof, available on a hybrid basis or as a place to connect during our quarterly team workshops
Top setup: Your choice of high-quality hardware and equipment
Relocation support: We’ll help make your move to join us as smooth as possible

About Us

turbalance is an innovative, emerging startup that transforms AI laws. We are a team of passionate problem-solvers who believe in what we’re building. We constantly push boundaries and embrace our inner nerds as we find new ways to tackle complex challenges. You will find a dynamic work environment here, with flat or even non-existent hierarchies and the chance to take on responsibility from day one.
