Foundations of AI Native Data Engineering

Intela

About this course

This course introduces the foundations of AI-native data engineering for experienced data practitioners who need to adapt traditional data platforms and pipelines for AI workloads. It explains how analytics-oriented systems evolve to support embeddings, vector retrieval, LLM integration, retrieval-augmented generation, semantic search, and broader AI lifecycle responsibilities. Learners examine how data engineers contribute across preparation, retrieval, evaluation, monitoring, governance, access control, drift management, and operational ownership. By the end of the course, learners will be able to analyze AI workload requirements and redesign a legacy pipeline into an AI-native architecture with documented tradeoffs, non-functional requirements, and governance considerations.

Level: Intermediate  ·  Total duration: 15h 52m

What you'll learn

  • Describe AI-native data engineering and explain how it differs from traditional data engineering.
  • Explain core AI, ML, generative AI, LLM, embedding, vector database, and RAG concepts from a data engineering perspective.
  • Map AI data lifecycle stages to practical data engineering responsibilities, including preparation, feature engineering, embedding, retrieval, evaluation, monitoring, governance, and drift management.
  • Evaluate LLM integration choices, including prompting, hosting, cost, latency, scale, fine-tuning, and RAG.
  • Design RAG and semantic search workflows that include source preparation, chunking, embeddings, vector retrieval, metadata, evaluation, governance, and access control.
  • Redesign an analytics-oriented pipeline into an AI-native target architecture with documented tradeoffs, non-functional requirements, observability, governance, security, and ownership boundaries.

Skills you'll gain

  • AI-native architecture (level 3)
  • Architecture diagramming (level 3)
  • AI workload analysis (level 3)
  • Embedding pipeline design (level 3)
  • Vector retrieval (level 3)
  • RAG pipeline design (level 3)
  • Semantic search design (level 3)
  • Chunking and metadata design (level 3)
  • LLM hosting tradeoff analysis (level 3)
  • Cost and latency evaluation (level 3)
  • Governance and security controls (level 3)
  • Access control (level 3)
  • Drift monitoring (level 3)
  • Retrieval quality monitoring (level 2)
  • Architecture report writing (level 2)