Lead Robotics Data Engineer
The Lead Robotics Data Engineer owns the technical soul of the operation: designing how data is captured, setting the standards every dataset must meet, and translating those standards into protocols a scaled team in Juárez can execute reliably.
This is not a research role and not a pure engineering role. It sits at the intersection of both: hands-on experimental design, rigorous data standards, and the operational discipline to make a cross-border collection pipeline actually work.
Responsibilities:
>Experiment Design:
- Author capture protocols for human task, manipulation, navigation, and interaction datasets
- Define dataset schemas, metadata standards, versioning conventions, and acceptance criteria for every dataset family
- Run 20–100 pilot demonstrations per task family in El Paso, review failure modes, and iterate
- Operate egocentric (head/chest-mounted), exocentric multi-camera, motion capture suit, and UMI-style demonstration rigs
- Calibrate and maintain all sensor systems before each capture session
- Implement and refine human teleoperation setups as the program matures
- Convert El Paso pilot protocols into operator-ready SOPs for the Juárez scaled capture team
- Train the Juárez Supervisor on SOPs, certify the capture process, and approve equipment kits
- Monitor protocol drift and recertify operators as dataset families evolve
- Support training and evaluation of VLMs and VLAs on robot arms and other platforms in El Paso
- Close the loop between data collection and model evaluation outcomes, and quickly stop collecting low-value data
- Chair weekly protocol review: new task families and failure analysis
- Collaborate with the Data Engineer / MLOps to build ingestion, versioning, metadata integrity, and dataset packaging workflows
- Monitor metadata completeness and time-from-capture-to-packaged-dataset KPIs
- Contribute to partner-grade dataset documentation and train/validation splits
- Present experimental findings internally and to external partners
- Produce technical reports and maintain a clear dataset release log
- Contribute to talks, demos, and video presentations as Voxelmaps grows its external profile
Skills and qualifications:
- Hands-on robotics experiment design & protocol authorship
- Capture architecture: egocentric, exocentric, mocap, UMI-style
- Python — strong, production-quality scripting
- ROS / ROS2 — practical experience
- Dataset standards: metadata schemas, versioning, QA acceptance criteria
- Ability to translate protocols into Juárez operator-ready SOPs
- Cross-functional communication: engineers, operators, partners
- Self-directed; owns outcomes without close supervision
- Interview, select, and build the robotics division, which will consist of two teams: an experimentation research and definition team, and a data acquisition team. Assume full ownership of team performance, execution, and overall success
- MSc in Robotics, CS, Mechanical Engineering, or a related field (strong industry background accepted)
- Imitation learning / robot learning pipeline experience
- Human teleoperation system operation
- Familiarity with open-source VLM/VLA models (LLaVA, OpenVLA, RT-X)
- Motion capture suit calibration & management
- Experience with humanoid robot platforms
- Prior experience writing training or QA documentation for operators
Success criteria:
- Three protocol families are running with documented SOPs and measurable acceptance criteria
- Pilot datasets have been reviewed and validated against quality standards
- The Juárez Supervisor has been trained and certified on at least one protocol
- Operating dashboard shows stable throughput and first-pass acceptance rate trends
- The Data Engineer / MLOps has a clear and usable ingestion and packaging pipeline to work against
What we offer:
- First technical hire: architect the data factory, not just operate it
- Work at the frontier of Physical AI and humanoid robotics
- Partner access to some of the largest technology and robotics companies in the world
- Hands-on with robot arms, motion capture, and multimodal AI pipelines
- Meaningful equity in a company building the data infrastructure layer for Physical AI
- Direct line to the VP of Data Collection: decisions are made quickly, with no layers in between
Reports To: Director, Robotics and Data Acquisition
Pay: $110,000 to $130,000 per year, depending on experience, paid semi-monthly
Location:
- El Paso, TX (office-based, required)
- Relocation may be offered for the right candidate
- Must be able to commute to office daily