AI-Driven Software Engineering for Data Science

Intela

AI-Driven Software Engineering for Data Scientists

A fully self-paced, practice-first course that bridges the gap between Junior and Middle Data Scientists by simulating real company work, without needing any other people. You will build and ship a production-style ML service while applying AI‑DSE (AI-Driven Software Engineering): AI is integrated into every SDLC stage as a productive co-executor, and you remain the Accountable decision-maker.

A key mechanism is the Independent AI Auditor: a separate verification step that checks artifacts produced by both you and the working AI for consistency, hallucinations, vulnerabilities, and missing tests. Each week you ship a production increment, as a real team would.

Who this course is for

Junior Data Scientists who can train models but lack production and team workflow experience
Early-career ML practitioners targeting ML Engineer / Applied Scientist responsibilities
Analysts transitioning into production ML work

Literature and methodological foundation

This course is informed by foundational software engineering literature, including:

SWEBOK Guide (Software Engineering Body of Knowledge), which provides a structured view of generally accepted software engineering knowledge across process, quality, testing, configuration management, and professional practice.
Software Development Lifecycle Models, which provides context on major SDLC approaches such as waterfall, spiral, V-model, RAD, and incremental development.

These references support the course perspective that a DS/ML repository should be treated not merely as an experimentation space, but as a controlled software-engineering system with traceability, quality gates, reproducibility, and auditability.

Prerequisites

Python: functions, classes, pandas, NumPy
Basic ML: train/test split, overfitting, common metrics
Basic Git: clone / commit / push

What you will build

Reproducible training pipeline + experiment tracking
Your own dataset and project theme — chosen in Week 1 and used throughout the entire course as the foundation for every artifact you build
Data validation tests + leakage checks
Architecture diagram (C4-style) of the ML service — designed by you, generated with Working AI assistance, verified by the Auditor
Inference API (FastAPI) + Docker
CI pipeline (tests / lint / build)
Monitoring hooks + drift signals
AI‑DSE audit trail (auditor reports + decision logs)
Promptbook — a living collection of approved prompt patterns, refined throughout the course based on feedback from production, incidents, and audits
Post-release artifact: tech debt backlog + product evolution plan
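
To make the "data validation tests + leakage checks" item above more concrete, here is a minimal sketch of two common leakage checks: rows shared between the training and test splits, and feature names that appear to contain the target. It is an illustration in plain Python rather than the course's own tooling; the function names and the name-matching heuristic are assumptions, and a real project would more likely run equivalent checks on pandas DataFrames.

```python
from __future__ import annotations

from typing import Hashable, Iterable, Sequence


def find_row_overlap(train_ids: Iterable[Hashable], test_ids: Iterable[Hashable]) -> set:
    """Return the identifiers that appear in both the training and test split."""
    return set(train_ids) & set(test_ids)


def find_target_leak(feature_names: Sequence[str], target_name: str) -> list[str]:
    """Return feature names that textually contain the target name.

    This is only a naming heuristic (an assumption of this sketch); it cannot
    detect a leaking feature that has an unrelated name.
    """
    target = target_name.lower()
    return [name for name in feature_names if target in name.lower()]


def check_split(
    train_ids: Iterable[Hashable],
    test_ids: Iterable[Hashable],
    feature_names: Sequence[str],
    target_name: str,
) -> list[str]:
    """Collect human-readable problems; an empty list means both checks passed."""
    problems = []
    overlap = find_row_overlap(train_ids, test_ids)
    if overlap:
        problems.append(f"{len(overlap)} id(s) appear in both train and test")
    leaks = find_target_leak(feature_names, target_name)
    if leaks:
        problems.append(f"features that may leak the target: {sorted(leaks)}")
    return problems
```

A check like this could run as an ordinary unit test, so that a split with overlapping rows fails before any model is trained.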

Learning outcomes

Translate vague objectives into testable requirements and acceptance criteria
Design an evaluation plan with metrics, slicing, and rollback conditions
Build reproducible pipelines for data prep, training, and inference
Track experiments and justify model choice with a decision memo
Design the architecture of an ML service: choose the component structure, document trade-offs, and have the result audited for compliance with NFRs
Serve a model via a stable API contract, containerize it, and ship with CI
Add monitoring + drift detection and respond to incidents with a runbook
Use AI copilots effectively while enforcing Independent AI Audits and quality gates
Maintain an auditable trail of decisions, risks, and evidence
Apply AI-RACI in practice: correctly assign Accountable / Responsible / Consulted / Informed roles across all key SDLC activities
Manage the post-release lifecycle: identify tech debt, assess change impact, and plan product evolution with AI assistance
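
As one possible shape for the "drift detection" outcome above, the sketch below computes the Population Stability Index (PSI) between a reference sample (for example, a feature at training time) and a current sample (the same feature in production). PSI is a widely used drift signal, but the course text does not say which drift metric it uses, so treat the choice of PSI, the equal-width binning, and the function name as assumptions.

```python
from __future__ import annotations

import math
from typing import Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI = sum over bins of (cur_share - ref_share) * ln(cur_share / ref_share)."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread; cannot build bins")
    width = (hi - lo) / bins  # equal-width bins derived from the reference range

    def shares(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref_shares = shares(reference)
    cur_shares = shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))
```

A common rule of thumb reads PSI below 0.1 as stable and above 0.25 as a significant shift; these thresholds are a convention, not a guarantee, and a runbook would normally tune them per feature.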

Format & time budget

8 weeks × 10 h/week: 6 h mandatory core + up to 4 h optional stretch
Each week: Sprint Planning → Guided Lab → Build & PR → Audit & Gate
Gates: Requirements, Model Readiness, Merge, Release
Every PR must include: CI evidence, audit report, decision log update
Assessment: pass/fail gates + rubric-scored artifacts
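
The rule that every PR must include CI evidence, an audit report, and a decision log update lends itself to an automated check. The sketch below shows one way such a check might look: given the list of files changed in a PR, it reports which required artifacts are missing. The directory layout and file patterns are hypothetical, since the course text does not prescribe where these artifacts live.

```python
from __future__ import annotations

from fnmatch import fnmatchcase

# Hypothetical repository layout (an assumption of this sketch):
# each required artifact is recognised by a glob pattern on the changed paths.
REQUIRED_EVIDENCE = {
    "CI evidence": "evidence/ci/*",
    "audit report": "audits/*.md",
    "decision log update": "docs/decision_log.md",
}


def missing_evidence(changed_files: list[str]) -> list[str]:
    """Return the names of required PR artifacts with no matching changed file."""
    return [
        name
        for name, pattern in REQUIRED_EVIDENCE.items()
        if not any(fnmatchcase(path, pattern) for path in changed_files)
    ]


if __name__ == "__main__":
    # Example: a PR that updates the decision log but attaches nothing else.
    gaps = missing_evidence(["src/train.py", "docs/decision_log.md"])
    print("Merge gate blocked, missing:" if gaps else "Merge gate passed", gaps)
```

Run as a CI step, a check of this kind would let the Merge gate fail fast when a PR arrives without its evidence, before a human or the Auditor spends time on it.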
