Senior AI & Data Engineer
We are looking for a Senior AI & Data Engineer to join our AI team. This is a hands-on individual contributor role for someone who thinks in systems, builds with AI-native tools, and sees automation as a design discipline, not an afterthought.
The right person for this role sits at the intersection of three things: building and maintaining the data pipelines and infrastructure that power our AI platform; designing and shipping AI-enabled automations that remove friction from real workflows; and engineering AI agents and integrations that connect our data assets to decision-makers across the company. You will work closely with the Sr. Manager of AI Engineering and a small, high-velocity team to deliver production systems across Alamar, including but not limited to commercial, R&D, and operations.
This is a role for someone who builds with AI as a core part of the engineering stack. You should have strong opinions about where AI belongs in a pipeline, when to automate versus when to keep a human in the loop, and how to build systems that are observable, auditable, and easy to iterate on.
Responsibilities
Data Engineering & Lakehouse
- Build and maintain ETL/ELT pipelines ingesting structured, semi-structured, and unstructured data from internal and external systems, including our CRM, ERP, and ELN
- Implement and optimize data workflows on our data lakehouse (Delta tables, workflows, Unity Catalog)
- Write clean, testable SQL and Python for data transformation, enrichment, and delivery to downstream consumers of data including BI tools and AI agents
- Apply data governance and lineage practices from day one, including tagging, access control, and metadata management
- Work with the broader data team to evolve the lakehouse schema, ensure data quality, and reduce pipeline fragility
- Contribute to the architectural direction of the data platform as it scales to new sources and use cases
AI-Enabled Automations
- Design and build AI-enabled automations that eliminate manual steps from high-friction workflows across sales, operations, R&D, and more
- Identify automation opportunities through direct engagement with end users and translate workflow pain points into executable builds
- Build and deploy automations using modern tooling including LLM APIs, MCP integrations, workflow orchestration frameworks, and low-code/no-code layers where appropriate
- Instrument automations with observability hooks so usage, failures, and performance are visible from day one
- Maintain and iterate on deployed automations based on real usage data rather than assumptions, using an ROI framework to deliver measurable value to business stakeholders
- Design for reuse: build components and patterns that can be composed into future automations rather than one-off tools
AI Agents & Integrations
- Build and maintain AI agents that surface structured data to end users through natural language interfaces, covering use cases in commercial intelligence, R&D discovery, and operational reporting
- Implement RAG pipelines, tool-calling integrations, and memory/state patterns that make agents reliable and contextually accurate
- Integrate agents with internal systems through APIs, MCPs, and custom connectors as required
- Write evaluation frameworks and regression tests to measure agent accuracy, reliability, and drift over time
- Contribute to the Alamar AI Hub: shared libraries, governance tooling, versioning standards, and deployment patterns used across the agent portfolio
- Collaborate with all AI team members and stakeholders across Alamar on agent requirements, scoping, and delivery timelines
Engineering Practices
- Write production-quality Python across all workstreams: data pipelines, agent logic, automation scripts, and API services
- Operate with AI-native development practices: use AI coding assistants, generative tooling, and prompt engineering as standard parts of your workflow, not occasional shortcuts
- Contribute to code reviews, documentation, and engineering standards for the AI & Data team
- Stay current on the LLM and agent ecosystem; bring new tools and techniques to the team with concrete proposals for where they apply
- Debug and resolve production issues across the data and AI stack with appropriate urgency and rigor
Qualifications
- 3–5 years of hands-on experience in data engineering, AI/ML engineering, or a closely related software engineering role
- Strong Python programming skills; able to write clean, well-structured, production-grade code
- Practical experience building and maintaining data pipelines with modern orchestration tools (dbt, Airflow, Dagster, or equivalent)
- Hands-on experience with cloud data platforms (Databricks, Snowflake, BigQuery, or equivalent)
- Experience building with LLM APIs (Anthropic, OpenAI, Gemini, or equivalent) including prompt design, tool-calling, and RAG implementation
- Demonstrated ability to ship working automations or integrations against real systems, not just prototypes
- Solid understanding of REST APIs, data modeling, and the fundamentals of distributed systems
- Comfortable with ambiguity: able to scope work, prioritize independently, and ask the right questions without waiting for fully defined requirements
- Strong communication skills with non-technical stakeholders; able to translate workflow problems into engineering specs and explain technical tradeoffs in plain language
Preferred Qualifications
- Experience working with agentic frameworks (LangChain, LangGraph, CrewAI, AutoGen, or equivalent)
- Familiarity with vector databases and embedding-based retrieval (Pinecone, pgvector, or equivalent)
- Background in life sciences, biotech, or scientific software; familiarity with LIMS and/or ELN platforms
- Experience integrating CRM and ERP systems as data sources or automation targets
- Exposure to governance, audit logging, or compliance requirements for production AI systems
- Experience building internal tools or lightweight applications that end users actually adopt and use
- Contributions to open-source AI or data engineering projects