Data Engineer

Bengaluru, Karnataka, India

Role - Data Engineer

Experience - 3-6 yrs

Location - Bangalore

We are looking for a Data Engineer II (SDE-2) to join our data team. The ideal candidate will play a key role in developing a high-performance, scalable data lakehouse, moving us toward a world of sub-minute data latency and unified batch/streaming compute. This is an engineering-heavy role where you will manage complex CDC flows, optimize distributed query engines, and leverage AI to accelerate our development lifecycle.

Technical Priorities

Real-time CDC: Owning high-throughput ingestion from RDBMS sources into the lakehouse using Debezium and PeerDB.
Lakehouse Architecture: Designing and optimizing table formats (Iceberg, Delta, Hudi) for both performance and storage efficiency.
Unified Compute: Developing robust ETL/ELT frameworks in PySpark and Flink (handling both batch and streaming workloads).
Infrastructure & Ops: Managing data workloads on AWS (EMR, EKS, MSK, S3) and automating everything via GitLab/GitHub Actions.
Query & BI: Tuning Trino or ClickHouse to power real-time dashboards in Metabase, Superset, and Power BI.

Requirements

Experience: 3–5 years in Data Engineering, specifically with distributed systems and cloud-native architectures.
Coding: Expert-level Python/PySpark and SQL.
Familiarity with Go, Java, or Scala is a plus.
Infrastructure: Hands-on experience with AWS (S3, EKS, MSK) and Infrastructure-as-Code.
Orchestration: Experience with Airflow or Temporal for complex workflow management.
AI-Native: Proficiency in using AI tools (Claude, Codex, Copilot) to write, test, and document code efficiently.
Systems Thinking: Ability to explain the trade-offs between different storage formats and processing frameworks.
Tech Leadership: Drive key tech initiatives by preparing TRDs and actively participating in design reviews.
Domain Modelling: Hands-on experience designing domain models for OLAP, including fact and dimension tables, the SCD types, and OBT-pattern tables.
Self-Starter: Lead the team technically and bring in new ideas that contribute to the growth of the charter.
Customer First: Work with Product and key stakeholders, adding value to their business workflows through data and analytics.

Our Tech Stack

Ingestion: Debezium, PeerDB, Olake
Storage: Delta, Iceberg, Hudi (S3-based Lakehouse)
Compute: PySpark, Flink, EMR, EKS
Streaming: MSK (Kafka)
Query Engines: Trino, ClickHouse
Orchestration: Airflow, Temporal
DevOps: GitLab, GitHub Actions, Terraform
Visualization: Metabase, Superset, Tableau, Power BI


Company: Recro

Source: LinkedIn
Location: Bengaluru, Karnataka, India
Compensation: Not listed

