DATA ENGINEER (Data Science & Big Data Analytics)
FUNCTIONS AND RESPONSIBILITIES OF THE JOB:
- Design, build, and maintain data pipelines (batch and streaming) that ingest data from heterogeneous sources into data lakes and warehouses, including metadata and lineage tracking.
- Contribute to the development of federated query and discovery systems over distributed datasets (UNCAN.eu), working with engines such as Trino and integrating query optimizers that comply with privacy requirements.
- Contribute to the deployment of European data spaces (DeployEMDS) using standard building blocks from IDSA, Gaia-X, and FIWARE, including data catalogues, brokers, and connectors.
- Build and maintain orchestration workflows using Airflow or Dagster, following software engineering best practices (tests, code review, CI/CD).
- Package and deploy services using Docker and Docker Compose, or similar tools.
- Support Machine Learning projects with data storage, serving, and versioning infrastructure (object storage, SQL/NoSQL databases, feature stores).
- Collaborate on multi-cloud and on-premise deployments (e.g. Hetzner, Azure, bare metal) and contribute to infrastructure-as-code practices.
- Support the preparation of technical sections in EU-funded project proposals (Horizon Europe and similar), and contribute to scientific dissemination (papers, prototypes, demos).
Studies
MSc in Computer Science, Data Engineering, Mathematics, Physics, or a related technical field. A PhD or specialised Master's will be highly valued.
Experience
At least 2 years of professional experience as a Data Engineer or in a closely related role.
Technical Skills
- Strong Python proficiency, including modern tooling for clean code (type hints, linters/formatters such as Ruff, testing with pytest).
- Solid SQL skills and experience with relational databases (PostgreSQL, MySQL)
- Experience with at least one NoSQL or document database (Redis, Elasticsearch, or similar)
- Experience building ETL/ELT data pipelines (Airflow, Dagster or similar)
- Working knowledge of object storage (S3, MinIO) and common serialization formats (Parquet, JSONL, Avro, BSON).
- Comfort on Linux and with the command line
- Docker and Docker Compose for packaging and local development
- Git and CI/CD workflows (GitHub Actions, GitLab CI, or similar)
- Understanding of batch vs. streaming paradigms and event-driven architectures
- Understanding of the difference between Data Lake and Data Warehouse architectures, and when to use each.
- Excellent written and spoken English
- Knowledge of Catalan and/or Spanish is a plus
Desirable Skills
- Experience with distributed query engines (Trino, Presto, Dremio) and the concept of federated queries over heterogeneous data sources.
- Familiarity with European data space initiatives: IDSA, Gaia-X, FIWARE, DSSC, Eclipse Dataspace Components; data catalogues (CKAN), brokers, and connectors.
- Big Data ecosystem: Apache Spark, Flink, Kafka, RabbitMQ, Hadoop
- Kubernetes and Helm for production deployments
- Infrastructure as Code with Terraform, Ansible, or similar
- Observability stacks: OpenTelemetry, Prometheus + Grafana, Loki, or equivalents
- Experience with cloud providers (Azure, AWS, GCP, Hetzner): serverless functions, managed storage, IAM.
- Graph databases (Neo4j) or time-series databases
- Machine Learning fundamentals and familiarity with ML lifecycle tooling (MLflow, feature stores, model versioning).
- Concurrency and backend knowledge: async programming, multithreading, actor model, message-driven systems.
- Additional programming languages: Java, Scala, Go, or Rust
- Participation in EU-funded research projects (Horizon Europe, Digital Europe) or scientific publications / conference presentations.
- Relevant certifications (cloud providers, Kubernetes CKA/CKAD, data platforms)
What We Offer
- Permanent contract.
- Hybrid work (home office and on-site).
- Flexible schedule.
- Shorter workday on Fridays and a summer schedule.
- Flexible remuneration package (health insurance, transport, lunch, studies and training, and kindergarten).
- Eurecat employees can join Eurecat Academy courses.
- Language courses (English, Catalan and Spanish).