SB Telecom America Corp. Linkedin · Posted 19d ago

AI/ML Engineer, AI Infrastructure

Canada

About Infrinia.ai, powered by SoftBank: SoftBank is making significant investments in infrastructure for AI. Through its wholly owned US subsidiary, SoftBank Corp. has established the Infrinia team in Silicon Valley, focused on infrastructure software for AI and AI foundations for mobile networks. Our goals are to challenge the norms and create products that make use of our SOTA infrastructure (such as Nvidia GB200 NVL72, MGX and DGX Grace & Hopper platforms) and cloud-native software. These products are geared towards centralized AI data centers as well as distributed AI Radio Access Network (AI RAN) data centers. We are looking for experienced practitioners who are inspired to innovate and build transformative products.


Minimum Qualifications:

  • Bachelor's degree in Computer Science, Electrical Engineering, Mathematics, Statistics, or related field.
  • 3+ years of experience in machine learning, deep learning, and software engineering.
  • Proficiency in Python and experience with C/C++.
  • Experience with major AI/ML frameworks such as PyTorch, TensorFlow, or JAX.
  • Solid understanding of data structures, algorithms, and software design principles.


Preferred Qualifications:

  • Master's or PhD in a relevant field (CS, AI/ML, etc.).
  • Experience with Large Language Models (LLMs), Generative AI, or Computer Vision.
  • Familiarity with distributed training techniques and tools (e.g., Ray, DeepSpeed, Megatron).
  • Experience optimizing models for GPU inference (TensorRT, Triton Inference Server).
  • Knowledge of MLOps practices and tools (Kubeflow, MLflow).


Role: Be a key member of the AI engineering team responsible for developing and optimizing advanced AI models and workloads that run on our high-performance GPU systems. You will leverage our SOTA infrastructure to train, fine-tune, and serve large-scale models. Drive innovation in model architecture and training efficiency to maximize performance and resource utilization. Work closely with infrastructure engineers, product management, and researchers to bridge the gap between hardware capabilities and AI application requirements.


Responsibilities:

  • Design, implement, and train state-of-the-art machine learning models for various applications (e.g., NLP, Computer Vision, Network Optimization).
  • Optimize AI workloads for performance and scalability on large-scale GPU clusters such as GB200 NVL72, using tools such as Dynamo and vLLM.
  • Collaborate with the team to co-design software and hardware solutions for efficient AI processing.
  • Develop tools and pipelines for data processing, model evaluation, and deployment.
  • Stay up-to-date with the latest advancements in AI research and technology.
  • Contribute to product definition (PRD) and program execution (sprint) planning.
  • Model and foster a culture of humility and innovation in product delivery.


Salary: The base salary for this position ranges from $150,000 to $250,000, plus an attractive biannual bonus and benefits.
