Data Engineer
We're hiring a Data Engineer to join our team in the US. You'll be part of the team that builds and maintains the data infrastructure — the pipelines, platforms, and systems that keep data flowing reliably from source to insight.
What You Will Do:
- Design, build, and maintain data pipelines that move and transform large volumes of data reliably, at scale, every day
- Embed AI into your engineering work — whether that's RAG pipelines, LLM-driven workflows, or model scoring built directly into the data systems you own
- Build the data infrastructure that Data Scientists need to train, retrain, and run ML models in production
- Work with cloud platforms (AWS, Azure, or GCP) to design and operate cloud-native data solutions
- Use orchestration tools like Apache Airflow to schedule, monitor, and manage pipeline workflows
- Set up data quality checks and automated testing so problems are caught before they reach downstream systems
- Collaborate closely with Data Scientists, Engineers, and Product teams to turn business requirements into data infrastructure
- Contribute to a codebase that is production-grade — reviewed, tested, documented, and built to last
What You Will Bring:
- Hands-on Data Engineering experience building and owning production pipelines
- Strong Python skills used for real pipeline and data engineering work, not just analysis
- PySpark or Apache Spark experience for processing data at scale
- Solid SQL skills — schema design, query optimisation, working with relational databases
- Experience with at least one major cloud platform (AWS, Azure, or GCP), including its data services, not just compute
- A track record of embedding AI in your data engineering work: RAG pipelines, LLM-driven workflows, ML model infrastructure, or similar. This is required, not a bonus.
- Hands-on experience with Databricks or Microsoft Fabric (either is fine — we care more about platform depth than the specific tool)
- Apache Airflow for pipeline orchestration
- MLOps experience — model versioning, monitoring, and deployment
- Exposure to LangChain, vector databases, or similar GenAI tooling
- A BS or MS in Computer Science, Engineering, or a related field
Additional Compensation: Eligible for a performance-based bonus tied to individual and company performance.
Benefits: Comprehensive health coverage (medical, dental, and vision), 401(k) plan with a 3% company match, paid time off, company holidays, and professional development opportunities.