Software Engineer – AI Inference Engine
We are looking for an exceptional engineer with a strong foundation in GPU programming and compiler infrastructure. The ideal candidate enjoys pushing performance boundaries and has experience supporting production-scale machine learning applications.
Key Responsibilities
- Design and optimize custom GPU kernels for AI workloads (e.g., transformer and diffusion models)
- Contribute to the development of FriendliAI’s kernel compiler, memory planner, runtime, and other core components
- Collaborate with cloud and infrastructure engineers to ensure end-to-end inference performance
- Analyze performance bottlenecks across the software and hardware stack, and implement targeted optimizations
- Drive support for new model architectures and tensor compute patterns
- Maintain production-grade performance infrastructure, including profiling, benchmarking, and validation tools
Qualifications
- 5+ years of experience in production or high-impact research environments
- Production-level expertise in Python and C++
- Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Electrical Engineering, or an equivalent field
- Experience developing machine learning frameworks or performance-critical runtime systems
- Hands-on experience writing and optimizing GPU kernels
- Hands-on experience profiling GPU kernels
- Experience working with generative AI models such as transformer and diffusion models
- Experience developing machine learning compilers or code generation systems
- Familiarity with dynamic shape compilation, memory planning, and kernel fusion
- Contributions to inference engines, compilers, or high-performance numerical libraries
- Understanding of multi-GPU and distributed inference strategies
Benefits
- Flexible working hours
- Daily lunch and dinner provided; unlimited snacks and beverages
- Supportive and highly collaborative work environment
- Health check-up support and top-tier equipment/hardware support
- A front-row seat to the generative AI infrastructure revolution
- Competitive compensation, startup equity, health insurance, and other benefits
We are a small, fast-moving team doing work that matters at one of the most exciting moments in the history of technology. With our world-class inference engine, we are building a platform that the AI industry can actually rely on.