Drafted Linkedin · Posted 4mo ago

Senior Backend / ML Ops Engineer

Canada

About Drafted


Drafted is unlocking creativity in the physical world. We’re building foundational models and generative pipelines that create floor plans and renderings instantly, so anyone can imagine their dream home. Starting with single-family homes, we plan to verticalize across all dimensions of the pre-construction stack.


Generative architecture is a prime domain for applied research with abundant data, verifiable constraints, and a clear value proposition. Our team of second-time founders, engineers, and designers pairs exceptional product taste with deep technical rigor to turn real-world buildability constraints into an intuitive, creative experience.


Drafted's Values


We're a small team working fully in-person in San Francisco. We value high-ownership builders who want to be a part of a talented, highly motivated team. We're guided by the following values:


  • Own the mission. We take agency, act like owners, and see problems through to real outcomes.
  • Build in the open. We value direct feedback, fast learning, and growth through honest collaboration.
  • Move with care and speed. We iterate quickly while staying deeply respectful of our teammates.
  • Seek the why. We challenge assumptions, think from first principles, and never stop asking questions.
  • Design for everyone. We believe anyone should be able to design and build a home they love.
  • Solve what matters. We embrace hard problems and create new paths forward when none exist.


The Role


As a full stack engineer, you’ll work across our entire software stack, from model pipelines and infrastructure to scaling and optimizing user experiences. You’ll work closely with teammates across the engineering and research teams.


Example Projects


  • Building parallel generation pipelines in which multiple workers race to fill output slots, with dynamic filtering based on post-processing results. This includes claim coordination to prevent duplicate work, fallback logic that uses the best available generations when retry limits are hit, and caching that reuses generations across jobs (for example, the same user regenerating with the same prompt).
  • Developing coordination mechanisms for capacity-constrained pipelines where maximum concurrency is fixed (reserved GPU instances, instance quotas, API rate limits) and peak demand exceeds available capacity. This means implementing backpressure, admission control, and retry logic so that downstream consumers are not overwhelmed.
  • Implementing timeout and cleanup policies that are neither overly conservative nor prone to terminating legitimately slow work. These policies must account for (1) high variance in computational complexity (p99 is 10x p50) and (2) variable parallelism, where completion time depends on the concurrent worker count, which fluctuates with queue dynamics and capacity constraints.
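
To make the first project above more concrete, here is a minimal Python sketch of workers racing to fill a fixed number of output slots, with a post-processing filter and a best-available fallback once the attempt budget runs out. It illustrates the general pattern only and is not Drafted's implementation. Every name in it (SlotBoard, fill_slots, the generate and score_fn callables, the threshold) is hypothetical, and it assumes a single asyncio event loop, so claims need no lock. A multi-process version would need an atomic claim in a shared store.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SlotBoard:
    """Shared state for one job (hypothetical name, illustration only)."""
    num_slots: int
    max_attempts: int
    accepted: list = field(default_factory=list)
    rejected: list = field(default_factory=list)  # (score, item) fallback pool
    attempts: int = 0

    def claim_attempt(self) -> bool:
        """Claim one generation attempt; False means no more work is needed."""
        if len(self.accepted) >= self.num_slots:
            return False  # every slot is already filled
        if self.attempts >= self.max_attempts:
            return False  # retry budget exhausted
        self.attempts += 1
        return True

    def submit(self, item, score: float, passed: bool) -> None:
        """Record a finished generation; surplus or failing items become fallbacks."""
        if passed and len(self.accepted) < self.num_slots:
            self.accepted.append(item)
        else:
            self.rejected.append((score, item))

    def results(self) -> list:
        """Accepted items, topped up with the best-scoring rejects if needed."""
        missing = self.num_slots - len(self.accepted)
        ranked = sorted(self.rejected, key=lambda pair: pair[0], reverse=True)
        return self.accepted + [item for _, item in ranked[:missing]]


async def worker(board: SlotBoard, generate, score_fn, threshold: float) -> None:
    # Each worker keeps claiming attempts until slots are full or budget is spent.
    while board.claim_attempt():
        item = await generate()
        score = score_fn(item)
        board.submit(item, score, passed=score >= threshold)


async def fill_slots(generate, score_fn, *, num_slots, num_workers,
                     max_attempts, threshold) -> list:
    board = SlotBoard(num_slots=num_slots, max_attempts=max_attempts)
    await asyncio.gather(
        *(worker(board, generate, score_fn, threshold) for _ in range(num_workers))
    )
    return board.results()
```

In this sketch, workers may over-generate slightly: an attempt claimed just before the last slot fills still runs, and its output lands in the fallback pool. Trading a little wasted work for lower latency is one possible design choice; a stricter version could count in-flight attempts before granting a claim.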


Ideal Experience


  • Building and scaling GPU-based inference services, optimizing for both low latency and high resource utilization
  • Job orchestration and load balancing with parallel generations, heterogeneous resource constraints (GPU, CPU, I/O), and multi-tiered queues
  • Implementing observability for latency attribution and failure diagnosis across multi-stage, asynchronous, and cross-platform pipelines
  • Designing fan-out architectures where upstream job completion triggers multiple independent downstream consumers that have mixed criticality, with some consumers blocking and others best-effort
  • Familiarity with modern cloud infrastructure: managed databases, job queues, edge compute/CDN, and PaaS deployment platforms
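
As an illustration of the fan-out item in the list above, the following Python sketch delivers one upstream result to consumers of mixed criticality. Blocking consumers must succeed or the error propagates to the caller, while best-effort consumers run under a timeout and their failures are only reported. The function name fan_out, the consumer dictionaries, and the timeout value are all assumptions made for this example, not part of any described system.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Consumer = Callable[[dict], Awaitable[None]]


async def fan_out(result: dict,
                  blocking: Dict[str, Consumer],
                  best_effort: Dict[str, Consumer],
                  best_effort_timeout: float = 1.0) -> List[str]:
    """Deliver result to all consumers; return descriptions of best-effort failures."""

    async def guarded(name: str, consumer: Consumer):
        # Best-effort path: swallow errors and timeouts, report them as strings.
        try:
            await asyncio.wait_for(consumer(result), timeout=best_effort_timeout)
            return None
        except Exception as exc:  # a timeout also lands here
            return f"{name}: {exc!r}"

    # Start best-effort consumers first so they overlap with the blocking ones.
    background = [
        asyncio.create_task(guarded(name, consumer))
        for name, consumer in best_effort.items()
    ]
    try:
        # Blocking path: any exception here propagates to the caller.
        await asyncio.gather(*(consumer(result) for consumer in blocking.values()))
    finally:
        # Collect best-effort outcomes; the timeout bounds how long this waits.
        outcomes = await asyncio.gather(*background)
    return [outcome for outcome in outcomes if outcome is not None]
```

One simplification to note: the caller waits for best-effort consumers up to the timeout before returning. A production design might instead hand them to a separate queue so they never delay the critical path.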


Desired Experience


  • 5+ years of coding experience
  • Vibe coding, with critical judgment about how, when, and how much to use it
  • A variety of backends (servers in Python, TypeScript, Rust)
  • Deployments to different infrastructures (AWS, Cloudflare, Railway, etc.)
  • Tried fine-tuning an ML model
  • Knowledge of training infrastructure, especially distributed GPU training across multiple nodes


Salary Range

  • $150k - $300k + 0.5-1.5% equity