Software Engineering Manager, Cloud ML Compute Services (Mandarin, English)
- Bachelor's degree or equivalent practical experience.
- 8 years of experience in software development.
- 5 years of experience leading ML design and optimizing ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging, fine-tuning).
- 3 years of experience in a technical leadership role.
- 2 years of experience in a people management or team leadership role.
- Ability to communicate in Mandarin and English fluently to partner with local clients and regional partners.
- Master’s degree or PhD in Engineering, Computer Science, or a related technical field.
- 3 years of experience working in a complex, matrixed organization involving cross-functional or cross-business projects.
- Experience collaborating with customers and field teams.
- Experience developing internal quality and repro testing to cover critical user journeys.
- Experience working in AI/ML and Infrastructure technologies.
- Ability to drive product improvement through bug fixes and feature enhancements.
With technical and leadership expertise, you manage engineers across multiple teams and locations, manage a large product budget, and oversee the deployment of large-scale projects across multiple sites internationally.
The AI and Infrastructure team is redefining what’s possible. We empower Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include Googlers, Google Cloud customers, and billions of Google users worldwide.
We're the driving force behind Google's groundbreaking innovations, empowering the development of our cutting-edge AI models, delivering unparalleled computing power to global services, and providing the essential platforms that enable developers to build the future. From software to hardware, our teams are shaping the future of world-leading hyperscale computing, with key teams working on the development of our TPUs, Vertex AI for Google Cloud, Google Global Networking, Data Center operations, systems research, and much more.
Responsibilities
- Recruit for the team and provide mentorship and technical guidance, including the roadmap and direction for team deliverables and their efficient execution.
- Partner with customers to optimize the performance of their AI/ML models on Google Cloud infrastructure. Lead performance profiling, debugging, and troubleshooting of customer training and inference workloads.
- Collaborate with internal infrastructure and ML teams to improve Google Cloud's ability to support AI workloads.
- Develop and deliver training materials and demos to empower customers and internal teams.
- Contribute to the continuous improvement of our products by identifying and reporting bugs and suggesting enhancements. Proactively identify and address technical bottlenecks hindering customer success.