Lead Data Engineer
Title: Lead Data Engineer
Contract Duration: 12+ Months
Location: Dublin, Ireland (Hybrid)
We are seeking an experienced Lead Data Engineer to join a high-impact team supporting financial institutions. This role is central to building resilient, governed, and scalable data platforms that power advanced analytics and detection capabilities.
You will play a key role in designing and evolving a Databricks + AWS lakehouse, enabling investigators, data scientists, and product teams to uncover criminal behaviour and act with confidence. This is a hands-on leadership role combining deep technical expertise with strong ownership, mentoring, and stakeholder collaboration.
Responsibilities:
- Own the end-to-end design, build, optimisation, and support of scalable Spark / PySpark pipelines on Databricks (batch & streaming).
- Define and enforce Lakehouse & Medallion architecture standards (Bronze/Silver/Gold), including governance, lineage, quality SLAs, and cost controls.
- Architect and maintain secure, compliant AWS data infrastructure (S3, IAM, Glue, Lake Formation, KMS, Lambda, Step Functions, EKS/EC2).
- Lead data ingestion using Apache NiFi, APIs, and SFTP/FTPS, onboarding diverse internal and external datasets.
- Implement robust orchestration using Airflow, Databricks Workflows, and Step Functions, with strong observability and reliability patterns.
- Champion data quality, reliability, and observability, including expectations, anomaly detection, SLIs/SLOs, alerting, and runbooks.
- Embed metadata and lineage (Unity Catalog, Glue, OpenLineage) to support auditability and regulatory transparency.
- Drive CI/CD and Infrastructure as Code practices for data assets across environments.
- Mentor engineers on Spark performance, Delta Lake optimisation, partitioning strategies, and cost/performance trade-offs.
- Collaborate closely with data science, product, security, and compliance teams to deliver trusted, production-grade data solutions.
- Lead technical design reviews, code reviews, incident response, and continuous improvement initiatives.
Experience Required:
- Expert-level SQL skills with strong hands-on experience in Databricks, Snowflake, Python, and PySpark.
- Proven production experience building and optimising large-scale Spark pipelines (Delta Lake, Photon, cluster tuning).
- Strong AWS data ecosystem expertise, including security, networking, encryption, and cost optimisation.
- Hands-on orchestration experience with Airflow, Databricks Workflows, and Step Functions.
- Solid experience with CI/CD, Git workflows, and IaC (Terraform / CloudFormation).
- Deep understanding of data governance, lineage, and compliance (PII/PCI, retention, access controls).
- Demonstrated ability to lead, mentor, and influence, working effectively with both technical and non-technical stakeholders.
- Pragmatic, delivery-focused mindset with experience in incident management and on-call readiness.
- Bonus: Financial Crime domain exposure, Python packaging, OpenTelemetry, advanced observability practices.