Senior AI Platform Engineer
We are hiring a Senior AI Platform Engineer to build and scale a next-generation AI serving layer for enterprise use cases. You will join a team of senior engineers and architects to take proven low-latency patterns into production.
The current architecture includes:
- DuckDB-based embedded cache
- FastAPI-powered serving layer
- Delta Lake Change Data Feed (CDF) sync from Lakehouse
This role is highly hands-on, focused on building production-grade systems, from pipelines to APIs and CI/CD.
Key Responsibilities
Platform Development
- Build and scale serving stores, sync pipelines, and API layers
- Develop high-performance REST APIs using Python (FastAPI or similar)
- Configure end-to-end solutions (data sources, schema, sync schedules, integrations)
- Deliver low-latency, high-availability systems
Data Pipelines & Streaming
- Build and manage batch and real-time pipelines (Kafka or similar)
- Implement Change Data Capture (CDC) and incremental data processing
- Ensure data freshness, consistency, and reliability
Caching & Serving Systems
- Design key-value serving layers (Redis/Valkey or similar)
- Implement cache invalidation and TTL strategies
- Optimize hot-path serving for sub-millisecond latency
AI/ML Data Infrastructure
- Build infrastructure supporting:
  - RAG (Retrieval-Augmented Generation)
  - Agent-based systems
  - Feature serving platforms
- Work with vector databases and knowledge graphs (Pinecone, Weaviate, pgvector, Neo4j)
- Design embedding and retrieval pipelines
Data Platform (Lakehouse)
- Work with:
  - Azure Databricks
  - Delta Lake (CDF, time travel, transaction logs)
  - Unity Catalog
- Design efficient data models for transactional and analytical workloads
DevOps & Quality
- Build and maintain CI/CD pipelines (GitHub Actions / GitLab)
- Implement automated testing for:
  - Data pipeline validation
  - API contracts
  - Performance/latency benchmarks
- Ensure adherence to SLAs (latency, availability, freshness)
Collaboration & Enablement
- Partner with business teams (Marketing, Growth, Revenue, Analytics)
- Translate business requirements into scalable technical solutions
- Create reusable frameworks, blueprints, and documentation
- Communicate effectively with architects, product managers, and stakeholders
Innovation & Continuous Improvement
- Design scalable platform patterns and reusable architectures
- Improve system performance, cost, and scalability
- Explore emerging technologies including AI agents and advanced serving patterns
Required Skills & Experience
Core Technical Skills
- Strong experience in Python API development (FastAPI or similar)
- Expertise in:
  - Kafka / streaming systems
  - CDC / incremental data pipelines
- Hands-on experience with:
  - Redis / Valkey (caching systems)
  - Low-latency serving architectures
AI & Data Engineering
- Experience building AI/ML data infrastructure (RAG, agents, feature stores)
- Knowledge of vector databases / knowledge graphs
- Strong SQL and data modeling skills
Lakehouse & Cloud
- Experience with:
  - Azure Databricks
  - Delta Lake (CDF, transaction logs, time travel)
  - Unity Catalog
DevOps & Engineering Practices
- CI/CD experience (GitHub Actions / GitLab)
- Strong understanding of:
  - SDLC
  - Agile (Scrum/Kanban)
  - Distributed systems (fault tolerance, scalability, idempotency)
Professional Skills
- Strong problem-solving and analytical thinking
- Ability to work independently and deliver end-to-end solutions
- Excellent communication and stakeholder management
- Ability to work in fast-paced, high-pressure environments
Nice-to-Have
- Experience with DuckDB or embedded analytical engines
- Exposure to Microsoft Fabric / OneLake / Power BI Semantic Models
- Experience defining and managing SLAs/SLOs
- Familiarity with AI agent tooling (MCP-style systems)
- Experience building enterprise-scale data serving platforms
Key Expectations
- Develop high-quality, scalable, and maintainable code
- Optimize system performance, cost, and efficiency
- Identify and resolve defects using RCA
- Contribute to system design (HLD/LLD) and architecture decisions
- Support release and deployment processes
Performance Metrics
- Adherence to coding standards and timelines
- Code quality and defect reduction
- System performance (latency, availability, freshness)
- Contribution to reusable assets and documentation
- Stakeholder satisfaction and collaboration
About Brickred Systems:
Brickred Systems is a global leader in next-generation technology, consulting, and business process services. We enable clients to navigate their digital transformation, delivering a range of consulting services across multiple industries around the world. Our practices employ highly skilled and experienced individuals with a client-centric passion for innovation and delivery excellence.
With ISO 27001 and ISO 9001 certification and over a decade of experience in managing the systems and workings of global enterprises, we harness the power of cognitive computing, hyper-automation, robotics, cloud, analytics, and emerging technologies to help our clients adapt to the digital world and succeed in it. Our always-on learning agenda drives our clients' continuous improvement by building and transferring digital skills, expertise, and ideas from our innovation ecosystem.