AI/ML Engineer, AI Infrastructure

Canada


About Infrinia.ai, powered by SoftBank: SoftBank is making significant investments in infrastructure for AI. Through its wholly owned US subsidiary, SoftBank Corp. has established the Infrinia team in Silicon Valley, focused on infrastructure software for AI and AI foundations for mobile networks. Our goals are to challenge the norms and create products that make use of our SOTA infrastructure (such as NVIDIA GB200 NVL72, MGX, and DGX Grace & Hopper platforms) and cloud-native software. These products are geared towards centralized AI data centers as well as distributed AI Radio Access Network (AI RAN) data centers. We are looking for experienced practitioners who are inspired to bring innovation and build transformative products.

Minimum Qualifications:

Bachelor's degree in Computer Science, Electrical Engineering, Mathematics, Statistics, or related field.
3+ years of experience in machine learning, deep learning, and software engineering.
Proficiency in Python and experience with C/C++.
Experience with major AI/ML frameworks such as PyTorch, TensorFlow, or JAX.
Solid understanding of data structures, algorithms, and software design principles.

Preferred Qualifications:

Master's or PhD in a relevant field (CS, AI/ML, etc.).
Experience with Large Language Models (LLMs), Generative AI, or Computer Vision.
Familiarity with distributed training techniques and tools (e.g., Ray, DeepSpeed, Megatron).
Experience optimizing models for GPU inference (TensorRT, Triton Inference Server).
Knowledge of MLOps practices and tools (Kubeflow, MLflow).

Role: Be a key member of the AI engineering team responsible for developing and optimizing advanced AI models and workloads that run on our high-performance GPU systems. You will leverage our SOTA infrastructure to train, fine-tune, and serve large-scale models. Drive innovation in model architecture and training efficiency to maximize performance and resource utilization. Work closely with infrastructure engineers, product management, and researchers to bridge the gap between hardware capabilities and AI application requirements.

Responsibilities:

Design, implement, and train state-of-the-art machine learning models for various applications (e.g., NLP, Computer Vision, Network Optimization).
Optimize AI workloads for performance and scalability on large-scale GPU clusters such as GB200 NVL72, using tools such as Dynamo and vLLM.
Collaborate with the team to co-design software and hardware solutions for efficient AI processing.
Develop tools and pipelines for data processing, model evaluation, and deployment.
Stay up-to-date with the latest advancements in AI research and technology.
Contribute to product definition (PRD) and program execution (sprint planning).
Model and foster a culture of humility and innovation in product delivery.

Salary: The base salary for this position ranges from $150,000 to $250,000, with an additional biannual bonus and benefits.


SB Telecom America Corp.

Source: LinkedIn
Location: Canada
