Mercor · Himalayas · Posted 9d ago

CUDA Kernel Optimization Specialist - AI Trainer

USD · Full time · Remote

AI Performance Optimization Engineer, AI Optimization Engineer, GPU Programming Engineer, GPU Software Engineer

Role Overview

Analyze and optimize GPU kernels for performance, efficiency, and hardware utilization. Use profiler metrics to guide kernel improvements. Review GPU kernel implementations to identify bottlenecks without needing extensive algorithmic background.

What You Will Do

Write, modify, and reason about C++17, Python, and GPU programming code. Apply CUDA, HIP, and shader programming expertise to improve performance outcomes. Document optimization decisions clearly.

Why It Might Be a Fit

You have at least 1 year of professional or graduate-level research experience with GPUs, a strong understanding of GPU profiler performance metrics for kernel optimization, and the ability to optimize GPU kernels without deep prior context on every algorithm.

Requirements

  • Available to work at least 20 hrs/wk.
  • Fluent in core C++ features through C++17.
  • Working knowledge of Python and Git.
  • Fluent in at least one GPU programming model like CUDA, HIP, Slang, HLSL, or GLSL.
  • At least 1 year of professional or graduate-level research experience with GPUs.
  • Strong understanding of GPU profiler performance metrics for kernel optimization.
  • Ability to optimize GPU kernels without deep prior context on every algorithm.

Originally posted on Himalayas
