Overview
We are seeking a Data Engineer (AI/ML) to design and operate the data infrastructure that powers our RAG systems and ML-driven agents. You will own the full journey from raw, messy SME and enterprise source data through to clean, structured, retrieval-ready assets — enabling reliable, production-grade AI solutions.
Responsibilities
- Design and maintain scalable ETL/ELT pipelines ingesting structured and unstructured data from SME and enterprise sources (databases, APIs, documents, CRMs, ERPs)
- Transform raw business data into retrieval-optimised formats for RAG pipelines — including chunking strategies, embedding preparation, and vector store population
- Build and maintain data feeds that power ML-driven agents: tool-use context, knowledge bases, and real-time retrieval stores
- Analyse, sanitise, and normalise noisy source data to meet quality standards required for LLM fine-tuning and inference
- Manage data lakes and warehouses; automate ingestion, refresh, and validation workflows
- Enforce data governance, lineage tracking, and documentation across all pipeline stages
- Collaborate with ML engineers and product teams to iterate on data schemas as agent and RAG requirements evolve
- Monitor pipeline health, data drift, and retrieval quality metrics in production
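To illustrate the kind of retrieval-preparation work this role covers, here is a minimal sketch of a fixed-size overlapping text chunker — one common chunking strategy for RAG pipelines. The function name and parameters are hypothetical examples, not part of our stack:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks with overlap, a simple
    strategy for preparing documents for embedding and retrieval.
    Overlap preserves context that would otherwise be cut at
    chunk boundaries (names, parameters are illustrative only)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # how far each new chunk advances
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # final chunk reached the end of the text
    return chunks
```

In production this naive splitter would typically be replaced with sentence- or token-aware chunking, but it captures the core trade-off (chunk size vs. overlap) the role is expected to tune.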
Experience
- 3+ years in data engineering or data analysis
- Hands-on experience building pipelines for ML or LLM workflows
- Exposure to RAG architectures or vector databases (e.g. Pinecone, Weaviate, pgvector)
- Experience working with heterogeneous enterprise data sources
Education & Certification
- Degree in Computer Science, AI, Data Engineering, or related field
- Cloud or data certifications preferred (AWS, GCP, Azure, dbt)
Preferred Skills
1. Pipeline & Infrastructure:
- Python, SQL
- Apache Spark / Hadoop
- Kafka / Flink (streaming)
- Airflow / Prefect (orchestration)
- Docker, Kubernetes
- AWS / Azure / GCP
- Snowflake / BigQuery
2. RAG & LLM Tooling:
- LangChain / LlamaIndex
- RAG evaluation frameworks
- Vector databases: Pinecone, Weaviate, pgvector
- Embedding pipeline design and optimisation
3. ML & MLOps:
- TensorFlow / PyTorch
- MLOps tooling and model deployment
- Fine-tuning dataset preparation
- Data visualisation tools (Power BI, Tableau)
Compensation & Other Benefits
- Two weekly holidays (Saturday & Sunday)
- Attractive salary, paid on time
- Annual salary review
- Annual Tour
- Two festival bonuses
- Performance bonus for extraordinary performance
- Consistent growth opportunities
- Direct impact on production RAG and agentic AI systems
- Career growth in a fast-moving AI engineering team
- Flexible and collaborative environment
How to Apply
Interested candidates are requested to submit their updated resume to aiml.data@tulip-tech.odoo.com, along with links to their portfolio and GitHub showcasing relevant projects they are working on or have previously completed. Please mention "Data Engineer" in the subject line.
Office Location
Level 6, Holland Centre, Pragati Sarani, Badda, Dhaka, Bangladesh.
Work Duration
Candidates will work UK hours, corresponding to 2:00 PM to 11:00 PM local Bangladesh time.