Senior Data Engineer (with AI/ML experience), India
We are looking for a skilled Senior Data Engineer (with AI/ML exposure) to join our BI team and take an active role in designing, building, and maintaining the end-to-end data pipelines and architecture that power our warehouse, LLM-driven applications, and AI-based BI. If you're looking for a company that will give you maximum flexibility in choosing where you work, this opportunity is for you!
* The interview process will be conducted in English.
Responsibilities:
- Design, develop, and maintain scalable data pipelines to support ingestion, transformation, and delivery into centralized feature stores, model-training workflows, and real-time inference services.
- Design, optimize, and maintain robust ETL/ELT pipelines and data structures within our cloud data warehouse (Snowflake/Redshift) to support core Business Intelligence.
- Build and optimize workflows for extracting, storing, and retrieving semantic representations of unstructured data to enable advanced search and retrieval patterns.
- Architect and implement lightweight analytics and dashboarding solutions that deliver natural language query experience and AI-backed insights.
- Define and execute processes for managing prompt engineering techniques, orchestration flows, and model fine-tuning routines to power conversational interfaces.
- Oversee vector data stores and develop efficient indexing methodologies to support retrieval-augmented generation (RAG) workflows.
- Partner with data stakeholders to gather requirements for language-model initiatives and translate them into scalable solutions.
- Create and maintain comprehensive documentation for all data processes, workflows and model deployment routines.
- Stay informed about and learn emerging methodologies in data engineering, MLOps, and LLM operations.
Requirements:
- Senior Data Engineer with 6-7+ years of experience in building scalable data infrastructure and 1-2 years of hands-on exposure to AI/ML workflows.
- Excellent English communication skills.
- Hands-on experience with big data technologies including Apache Spark, Hadoop, and Kafka for distributed processing and real-time data ingestion.
- Experience designing complex data pipelines that extract data from RDBMS, JSON, API, and flat-file sources.
- Demonstrated skills in SQL and PL/SQL programming, with advanced mastery of Business Intelligence and data warehouse methodologies, along with hands-on experience in one or more relational database systems and cloud-based database services such as Snowflake/Redshift.
- Effective oral and written communication skills with the BI team and user community.
- Demonstrated experience in using Python for data engineering tasks, including transformation, advanced data manipulation, and large-scale data processing.
- Deep understanding of vector databases and RAG architectures, and how they drive semantic retrieval workflows.
- Skilled at integrating open-source LLM frameworks into data engineering workflows for end-to-end model training, customization, and scalable inference.
- Experience with cloud platforms like AWS or Azure Machine Learning for managed LLM deployments.
- Understanding of software engineering principles, experience working on Unix/Linux/Windows operating systems, and experience with Agile methodologies.
- Proficiency in version control systems, with experience in managing code repositories, branching, merging, and collaborating within a distributed development environment.
- Interest in business operations and comprehensive understanding of how robust BI systems drive corporate profitability by enabling data-driven decision-making and strategic insights.
Pluses:
- Experience with vector databases such as DataStax AstraDB, and with developing LLM-powered applications using popular open-source frameworks like LangChain and LlamaIndex, including prompt engineering, retrieval-augmented generation (RAG), and orchestration of intelligent workflows.
- Familiarity with evaluating and integrating open-source LLM frameworks (such as Hugging Face Transformers/LLaMA-4) across end-to-end workflows, including fine-tuning and inference optimization.
- Knowledge of MLOps tooling and CI/CD pipelines to manage model versioning and automated deployments.
What we offer:
- Remote work opportunity!
- B2B Employment ($, gross) or full-time employment option.
- Stable job with long-term growth prospects.
- Competitive salary with annual performance review.
- Really good hardware.
- An exciting and challenging job with talented people around.
- Continuous learning and career growth opportunities.
- Compensation for professional training, seminars, and conferences.
- Referral program – get rewarded for helping us grow the team with talented people.
- Company-supported English classes to enhance your professional growth.