RoseBerry Linkedin · Posted 1mo ago

Data Engineer

Israel



We’re looking for a strong Data Engineer to join our team and build the data infrastructure that powers our product, analytics, and future AI capabilities.


The Challenge

We have a successful, high-scale product generating massive amounts of B2C data. We are now at the pivotal stage of designing and building our formal Data Hub and infrastructure from the ground up. You will be the technical lead responsible for defining the architecture that replaces ad-hoc processes with a robust, centralized Source of Truth.


What you’ll do

● Foundational Architecture: Define and implement the blueprint for our new Data Hub, establishing the standards for data modeling, storage, and retrieval across the organization.

● End-to-End Pipeline Engineering: Design and orchestrate high-throughput ETL/ELT pipelines using Python, Apache Airflow, GCP, and Kafka to ingest data from diverse, multi-channel sources.

● Multi-Layered Data Strategy: Build the logic for Bronze-to-Gold processing, ensuring data is correctly partitioned, cleaned, and aggregated for real-time, daily/weekly/monthly, and retro-conciliation use cases.

● GCP Ecosystem Ownership: Architect our BigQuery environment for maximum performance and extreme cost-efficiency, managing high-volume datasets without compromising speed.

● Data Integrity & Governance: Implement rigorous validation frameworks to ensure data reliability and consistency, acting as the guardian of our "Source of Truth."

● Scalable AI Support: Design the infrastructure today that will support the AI-driven features of tomorrow, ensuring high-quality data availability for ML model training and inference.


What we’re looking for

● Architectural Vision: Proven experience designing and implementing data infrastructure from scratch in a high-scale environment.

● Stack Expertise: Deep mastery of GCP (BigQuery, Pub/Sub) and significant experience with Kafka and Apache Airflow.

● Software Engineering Mindset: Advanced Python skills with a focus on writing clean, modular, and production-grade code for data applications.

● Data Modeling Expert: Strong command of different modeling techniques (Data Vault, Star Schema, etc.) and their trade-offs in a B2C context.

● High-Volume Experience: Experience handling "big data" challenges: managing multi-terabyte datasets, optimizing query costs, and handling complex retro-conciliation.

● Delivery Driven: The ability to take an existing product with raw data and transform it into a structured, reliable data ecosystem.


Nice to have

● Streaming & Real-Time: Experience with Flink, Spark Streaming, or similar for low-latency processing.

● Modern Tooling: Familiarity with dbt for transformation layers and Terraform for Infrastructure as Code (IaC).

● Startup DNA: Experience working in fast-paced environments where you need to build while the "plane is flying."


If you’re a builder who thrives on taking ownership of architecture and turning data chaos into a scalable system, we’d love to hear from you.
