Senior Software Engineer - Computer Vision Deployment
This is a deeply technical role at the intersection of machine learning, distributed systems, and cloud infrastructure. You'll design scalable GPU compute clusters, build robust orchestration pipelines, and optimize model serving for low-latency inference at scale. You'll work closely with our research scientists, computer vision engineers, and product teams to bridge the gap between experimental models and production-ready systems that operate across diverse warehouse environments. We've found tremendous value in collaborative problem-solving, so our team works from our SF office three days a week.
Responsibilities
- Develop and maintain distributed cloud GPU infrastructure for large-scale world model training and low-latency inference.
- Build end-to-end computer vision pipelines — from data ingestion and preprocessing through model training, evaluation, and deployment — and integrate them into core product workflows.
- Deploy and optimize state-of-the-art machine learning models, including VLMs and VLAs, in the cloud using model serving platforms and inference optimization techniques.
- Design and operate orchestration systems that enable both engineers and non-engineers to build and manage data and ML pipelines.
- Establish monitoring, benchmarking, and evaluation frameworks to ensure model performance and reliability in production environments.
Qualifications
- B.S. / M.S. in Computer Science, Robotics, or a similar technical field, or equivalent practical experience.
- 7+ years of professional software engineering experience, with at least 3 years in machine learning infrastructure — developing, scaling, training, deploying, and optimizing large-scale ML systems from data to model.
- Track record of deploying computer vision models in production environments with real-world constraints.
- Experience with distributed messaging and compute systems (Kafka, gRPC, ROS2, or similar).
- Strong programming skills in Python with solid software engineering practices.
- Experience developing, running, and managing orchestration systems (Flyte, Temporal, Airflow, or similar) for ML and data pipelines.
- Proficiency with ML frameworks (PyTorch, TensorFlow, DeepSpeed) and model serving platforms (TorchServe, TensorFlow Serving, NVIDIA Triton Inference Server, or similar).
- Deep understanding of state-of-the-art machine learning models such as auto-regressive transformers and familiarity with inference optimization techniques (TensorRT, quantization, custom kernels).
- Experience with C++ or CUDA programming for GPU acceleration.
- Prior experience working at autonomous vehicle or robotics companies.
Benefits
At Claryo, we offer a competitive benefits package that supports your health and well-being, including top-tier medical, dental, and vision coverage, 401(k) with employer matching, parental leave, and unlimited vacation.
Compensation Range: $170K - $190K