The School of AI

EAG V3 Curriculum

How to build an autonomous, interoperable, production-ready agentic AI platform. 19 sessions. Three protocol layers. One integrated platform.

MCP · A2A · A2UI / AG-UI · DAG Architecture · Multi-Agent · Browser & Desktop Autonomy · Event-Driven · Arcturus 2.0

The Five-Act Arc

Act 1: Foundations (Sessions 1–4)
Act 2: Intelligence (Sessions 5–8)
Act 3: Capabilities (Sessions 9–12)
Act 4: Interoperability (Sessions 13–15)
Act 5: Autonomy & Production (Sessions 16–19)

Session-by-Session

Act 1 — Foundations
01 Foundations of Transformer Architecture

Understand neural networks, attention mechanisms, and positional encoding that power modern AI. You can't debug what you don't understand.

Core Topics
  • Neural networks and the Universal Approximation Theorem
  • Backpropagation walkthrough
  • The attention mechanism: why it works, what it computes
  • Multi-head attention and what different heads learn
  • Positional encoding: why position matters in sequences
  • Embeddings: from discrete tokens to continuous space
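
As a rough illustration of what the attention mechanism computes, here is a minimal single-head sketch in plain Python. It assumes no masking and no learned projection matrices, and the names `softmax` and `attention` are chosen for this example only.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy input: one query that matches the first key more closely than the second
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
result = attention(Q, K, V)
print(result)
```

Because the query aligns with the first key, the output leans toward the first value vector; multi-head attention repeats this computation with several independent projections.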
Sample Assignment

Chrome extension that interacts with an LLM API.

02 Modern LLM Internals & The 2026 Model Landscape

Learn tokenization, scaling laws, RLHF alignment, and the current state of reasoning and multi-modal models.

Core Topics
  • Tokenization deep dive (BPE, SentencePiece)
  • Scaling laws: Chinchilla, compute-optimal training
  • Causal language modeling, pre-training objectives
  • SFT, RLHF, DPO — aligning models to follow instructions
  • Emergent abilities: phase transitions at scale thresholds
  • NEW The 2026 model landscape — reasoning models, multi-modal models, local models
  • NEW What models can and cannot do — setting realistic expectations
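
The BPE item above can be sketched as a single merge step: count adjacent symbol pairs across a corpus, then fuse the most frequent pair. This is a toy version under simplifying assumptions (a tiny hand-written word-frequency table, no end-of-word markers); `most_frequent_pair` and `merge_pair` are illustrative names, not a real tokenizer API.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged

# Toy corpus: each word is a tuple of characters mapped to its frequency
corpus = {("l", "o", "w"): 5, ("l", "o", "t"): 2, ("n", "e", "w"): 3}
best = most_frequent_pair(corpus)
corpus_after = merge_pair(corpus, best)
print(best, corpus_after)
```

Repeating this step until a target vocabulary size is reached gives the learned merge table.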
Sample Assignment

Enhanced Chrome extension using Gemini Flash / Claude Haiku with streaming.

03 Developer Foundations & Your First Agent

Build Python and Node.js skills, then create your first goal-directed agent with a working web UI.

Core Topics
  • Python essentials: async/await, decorators, type hints, dataclasses
  • NEW Node.js/NPM basics: project setup, package.json, Express server
  • NEW UI fundamentals: React (or vanilla JS) + Vite
  • Three pillars of agency: goal-directed behavior, interactive capacity, autonomous decision-making
  • LLM vs RAG vs Agents — the spectrum
  • Build your first agent step by step (perception → decision → action)
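
The perception → decision → action loop can be sketched without any model at all. In this minimal example the `decide` step is a hard-coded rule standing in for an LLM call, and the single `add` tool is hypothetical; everything here is an assumption made for illustration.

```python
import re
from dataclasses import dataclass

@dataclass
class Perception:
    goal: str
    numbers: list

def perceive(goal: str) -> Perception:
    """Perception: turn raw input into structured facts."""
    return Perception(goal=goal, numbers=[int(n) for n in re.findall(r"-?\d+", goal)])

def add_tool(numbers):
    """Hypothetical tool: sum a list of integers."""
    return sum(numbers)

TOOLS = {"add": add_tool}

def decide(p: Perception):
    """Decision: pick a tool name, or None. A real agent would ask an LLM here."""
    wants_sum = any(word in p.goal.lower() for word in ("add", "sum"))
    return "add" if wants_sum and p.numbers else None

def act(tool_name, p: Perception):
    """Action: run the chosen tool, or report that nothing fits."""
    if tool_name is None:
        return "No suitable tool for this goal."
    return TOOLS[tool_name](p.numbers)

def run_agent(goal: str):
    p = perceive(goal)
    return act(decide(p), p)

print(run_agent("Add 2, 3 and 5"))
print(run_agent("Tell me a joke"))
```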
Sample Assignment

Full-stack agent with a working web UI — backend in Python, frontend in Node.js/React. Agent takes a goal and executes it with at least one tool.

04 MCP — The Tool Protocol

Master the Model Context Protocol for tool registration, discovery, and invocation across servers.

Core Topics
  • The journey: Foundation LLMs → Function Calling → Agentic AI → MCP
  • MCP Server/Client architecture
  • stdio vs SSE transport, JSON-RPC communication
  • Tool registration, invocation, and result handling
  • Building MCP servers in Python AND TypeScript
  • Dynamic tool discovery: summary layer, hint-based filtering
  • NEW MCP as the first layer of the protocol stack (MCP → A2A → A2UI/AG-UI)
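
To give a feel for tool registration and invocation over JSON-RPC, here is a toy dispatcher that answers `tools/list` and `tools/call` messages. It is a sketch, not an MCP SDK: it skips the initialization handshake and transport entirely, and the `get_weather` tool, the `TOOLS` registry, and `handle_message` are hypothetical names. A real server would normally be built on an official SDK.

```python
import json

def get_weather(arguments):
    """Hypothetical tool handler returning a canned forecast."""
    return f"Sunny in {arguments['city']}"

# Tool registry: metadata for discovery plus the handler for invocation
TOOLS = {
    "get_weather": {
        "description": "Return a canned forecast for a city.",
        "inputSchema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "handler": get_weather,
    }
}

def handle_message(line: str) -> str:
    """Handle one JSON-RPC 2.0 request and return the JSON response."""
    req = json.loads(line)
    method, params = req["method"], req.get("params", {})
    if method == "tools/list":
        result = {"tools": [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]}
    elif method == "tools/call":
        tool = TOOLS[params["name"]]
        text = tool["handler"](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "error": error})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})

request = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
           "params": {"name": "get_weather", "arguments": {"city": "Pune"}}}
print(handle_message(json.dumps(request)))
```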
Sample Assignment

Build a custom MCP server wrapping a real-world API (weather, stocks, email, calendar — each student picks different). Connect it to your agent from Session 3.

Act 2 — Intelligence
05 Planning, Reasoning & Structured Prompting

Implement Chain-of-Thought, ReAct patterns, and self-validating task decomposition for intelligent agents.

Core Topics
  • Chain-of-Thought: activating latent reasoning circuits
  • ReAct: interleaving reasoning with tool actions
  • Structured prompting formats: input-output templates, step-labeled reasoning
  • Self-validation: agents that check their own work
  • Task decomposition and dependency-aware planning
  • NEW When to let the model reason vs. when to enforce structure
Sample Assignment

Multi-step agent that decomposes a complex goal, plans tool usage, executes, and self-validates results.

06 Cognitive Architecture & Adaptive Planning

Design the 4-layer cognitive pipeline with strategy profiles and adaptive retry loops.

Core Topics
  • 4-layer cognitive pipeline: Perception → Memory → Decision → Action
  • Pydantic for typed data flow between layers
  • Strategy profiles: Conservative, Exploratory, Fallback
  • Agent-written Python plans: LLM generates solve() functions
  • Adaptive planning loops: controlled retry, tool switching, bounded retries
  • Memory-aware planning and planner introspection
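
One possible shape for an adaptive planning loop with bounded retries and tool switching is sketched below. The session names Pydantic for typed data flow; this example uses a standard-library dataclass instead to stay dependency-free. The `StrategyProfile` fields and both tools are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class StrategyProfile:
    name: str
    max_retries_per_tool: int
    tool_order: list  # tools are tried in this order

def flaky_search(query):
    """Hypothetical tool that always fails, to exercise the retry path."""
    raise TimeoutError("search backend unavailable")

def cached_lookup(query):
    """Hypothetical fallback tool."""
    return f"cached answer for {query!r}"

TOOLS = {"flaky_search": flaky_search, "cached_lookup": cached_lookup}

def run_with_strategy(query, profile):
    """Try each tool up to the profile's retry bound, then switch tools."""
    attempts = []
    for tool_name in profile.tool_order:
        for attempt in range(1, profile.max_retries_per_tool + 1):
            try:
                result = TOOLS[tool_name](query)
            except Exception as exc:
                attempts.append((tool_name, attempt, f"failed: {exc}"))
                continue
            attempts.append((tool_name, attempt, "ok"))
            return result, attempts
    return None, attempts  # every tool exhausted its retries

conservative = StrategyProfile("Conservative", 2, ["flaky_search", "cached_lookup"])
answer, trace = run_with_strategy("mcp", conservative)
print(answer)
for step in trace:
    print(step)
```

The returned trace is what a planner could inspect on the next iteration to avoid tools that just failed.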
Sample Assignment

4-module agent with user preference input, strategy selection, and adaptive retry across different problem types.

07 Memory Systems & Modern RAG

Build 3-tier memory (preferences, episodic, factual) with hybrid retrieval and semantic chunking.

Core Topics
  • Hybrid retrieval (semantic + BM25 + RRF) — modernized beyond basic FAISS
  • When RAG beats huge context windows (and when it doesn't)
  • Embedding models (Gemini, Nomic, Ollama-based)
  • Semantic chunking strategies
  • 3-tier memory: REMME (Preferences), Episodic (Recipes), Factual (Knowledge)
  • Memory injection into agent prompts at planning time
  • NEW MarkItDown, Trafilatura, PyMuPDF4LLM for document processing
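
The fusion step in hybrid retrieval can be illustrated with Reciprocal Rank Fusion, which combines ranked lists using only rank positions. This is a minimal sketch: the document IDs and the two input rankings are made up, and `k=60` is the constant commonly used with RRF rather than a value given in this curriculum.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each document scores sum(1 / (k + rank))."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism
    return sorted(scores, key=lambda d: (-scores[d], d))

semantic_hits = ["d2", "d1", "d3"]  # e.g. from an embedding index
bm25_hits = ["d1", "d4", "d2"]      # e.g. from a keyword index
fused = reciprocal_rank_fusion([semantic_hits, bm25_hits])
print(fused)
```

Documents that appear high in both lists rise to the top without any score normalization between the two retrievers.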
Sample Assignment

Agent with 3-tier memory that learns preferences, recalls past workflows, and retrieves facts. Demonstrate persistence across sessions.

08 Multi-Agent Systems & DAG Architecture

Coordinate multiple agents using directed acyclic graphs with parallel execution and session persistence.

Core Topics
  • Single vs multi-agent trade-offs
  • Coordination patterns: Parallel, Sequential, Loop, Router
  • “Don't build loops; build graphs” — the paradigm shift
  • NetworkX DiGraph as the execution substrate
  • Topological sorting for parallel-safe execution
  • Blackboard architecture via shared session state
  • NEW Fallback strategies — 3x code_variants per step, fallback nodes
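
Topological sorting for parallel-safe execution can be sketched as grouping steps into batches whose dependencies are already complete. The session names NetworkX as the substrate; this example swaps in the standard-library `graphlib` module (Python 3.9+) so it runs without extra installs. The step names in `plan` are hypothetical.

```python
from graphlib import TopologicalSorter

def parallel_batches(dependencies):
    """Group nodes into batches; every node in a batch can run concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # nodes whose predecessors are done
        batches.append(ready)
        sorter.done(*ready)
    return batches

# Each key maps a step to the set of steps it depends on
plan = {
    "research": set(),
    "fetch_prices": set(),
    "analyze": {"research", "fetch_prices"},
    "report": {"analyze"},
}
batches = parallel_batches(plan)
print(batches)
```

Here `research` and `fetch_prices` land in the same batch and could be dispatched in parallel, while `analyze` waits for both.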
Sample Assignment

Multi-agent DAG executor with 3+ agent types supporting parallel execution, session persistence, and resumption after interruption.

Act 3 — Capabilities
09 Browser Agents & Autonomous Web

Automate web browsing with Playwright, vision-capable navigation, and multi-source research pipelines.

Core Topics
  • Playwright-based browser automation (headless and headed)
  • Chrome DevTools Protocol (CDP) integration
  • Waterfall search strategy across 5+ engines
  • Triple extraction: Trafilatura vs Readability vs BeautifulSoup
  • Vision-capable browsing: screenshots + VLM analysis
  • Autonomous navigation: form filling, multi-page workflows
  • NEW Browser profiles, session persistence, cookie management
  • NEW Anti-detection patterns and ethical scraping
Sample Assignment

Agent that autonomously researches a topic across 5+ sources, extracts structured data, compares results, and generates a synthesis report.

10 Computer Use & Desktop Agents

Control desktop applications using screen understanding, accessibility trees, and OS-level automation.

Core Topics
  • Anthropic Computer Use API — the standard for screen interaction
  • Screen understanding with VLMs
  • UI element detection: YOLO/ONNX model for buttons, text fields, menus
  • Accessibility tree integration
  • Multi-modal perception pipeline: screenshot → YOLO → VLM → action
  • NEW Cross-platform considerations (macOS, Linux, Windows)
  • NEW Application-specific automation patterns
Sample Assignment

Agent that operates a desktop application to complete a real task using vision + accessibility tree, not just scripted clicks.

11 Channel Architecture, Voice & Gateway New

Connect agents to WhatsApp, Slack, Discord, voice, and 20+ channels through a unified adapter pattern.

Core Topics
  • The channel adapter pattern: one interface, many implementations
  • Gateway architecture: WebSocket control plane, session management
  • Building adapters for: WhatsApp, Telegram, Slack, Discord, Signal, Teams, LINE, IRC, Matrix, and more
  • Voice as first-class modality: STT, TTS, real-time bidirectional conversation
  • Multi-channel inbox: unified message handling
  • Daemon installation: launchd (macOS), systemd (Linux)
  • NEW Device nodes — companion apps exposing camera, screen, location
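
The channel adapter pattern (one interface, many implementations) might look like the sketch below. The two adapters handle invented payload formats, not the real message shapes of any listed platform, and `echo_agent` stands in for the agent pipeline; all names are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class InboundMessage:
    channel: str
    sender: str
    text: str

class ChannelAdapter(ABC):
    """One interface that every channel implementation must satisfy."""
    name: str

    @abstractmethod
    def parse_inbound(self, raw) -> InboundMessage: ...

    @abstractmethod
    def format_outbound(self, recipient: str, text: str): ...

class JsonWebhookAdapter(ChannelAdapter):
    """Hypothetical channel that delivers dicts like {'from': ..., 'body': ...}."""
    name = "json_webhook"

    def parse_inbound(self, raw):
        return InboundMessage(self.name, raw["from"], raw["body"])

    def format_outbound(self, recipient, text):
        return {"to": recipient, "body": text}

class PlainTextAdapter(ChannelAdapter):
    """Hypothetical channel that delivers lines like 'sender: text'."""
    name = "plain_text"

    def parse_inbound(self, raw):
        sender, _, text = raw.partition(": ")
        return InboundMessage(self.name, sender, text)

    def format_outbound(self, recipient, text):
        return f"{recipient}: {text}"

def echo_agent(message: InboundMessage) -> str:
    """Stand-in for the agent pipeline."""
    return f"You said: {message.text}"

def handle(adapter: ChannelAdapter, raw):
    """Ingress, agent call, and reply routing, independent of the channel."""
    message = adapter.parse_inbound(raw)
    return adapter.format_outbound(message.sender, echo_agent(message))

print(handle(JsonWebhookAdapter(), {"from": "asha", "body": "hi"}))
print(handle(PlainTextAdapter(), "ravi: hello"))
```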
Sample Assignment

Each student/group picks a different channel to integrate. Build a complete adapter that connects to your agent pipeline with message ingress, formatting, and reply routing.

12 Error Correction, Safety & Container Isolation

Implement circuit breakers, JSON repair, and Docker-based sandboxing for safe agent execution.

Core Topics
  • JSON repair pipeline: fenced/balanced/json_repair
  • Code variants resilience: 3 attempts per step
  • State machine design: pending → running → completed | failed | stopped
  • Circuit breaker pattern: CLOSED → OPEN → HALF_OPEN
  • Container-first isolation: Docker, Apple Container
  • Per-agent isolated filesystem with explicit mount policies
  • Cost management: threshold enforcement, budget-aware execution
  • NEW Security logging and audit trails
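
The circuit breaker transitions listed above (CLOSED → OPEN → HALF_OPEN) can be sketched as a small wrapper around any callable. The threshold, timeout, and injectable `clock` parameter are choices made for this example, not values from the course.

```python
import time

class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock  # injectable so tests do not need to sleep
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            if self.clock() - self.opened_at >= self.reset_timeout:
                self.state = "HALF_OPEN"  # let one trial call through
            else:
                raise CircuitOpenError("circuit is open; call rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        # A failed trial call, or too many failures, opens the circuit
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"
            self.opened_at = self.clock()

    def _on_success(self):
        self.state = "CLOSED"
        self.failures = 0
```

A caller would wrap each outbound tool or model call, for example `breaker.call(fetch_page, url)` where `fetch_page` is whatever function may fail.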
Sample Assignment

Container-isolated agent execution with circuit breaker. Demonstrate that a misbehaving agent cannot access the host system.

Act 4 — Interoperability
13 A2A — Agent-to-Agent Protocol New

Enable cross-vendor agent collaboration using Google's Agent2Agent protocol with capability discovery and task delegation.

Core Topics
  • Why A2A? Cross-vendor agent coordination
  • Google's Agent2Agent protocol (50+ partners, Linux Foundation governance)
  • Agent Cards: JSON capability advertisements for discovery
  • JSON-RPC 2.0 over HTTP(S) communication
  • Three interaction modes: synchronous, streaming (SSE), async push
  • Building A2A servers and clients
  • Federated agent systems across organizations
  • NEW gRPC support, signed security cards
Sample Assignment

Build an A2A-compliant agent that can be discovered and invoked by other students' agents. Demonstrate cross-agent task delegation to at least 2 other agents.

14 A2UI / AG-UI — Agent-to-User Interface New

Build agents that generate dynamic, interactive UIs at runtime using declarative and event-based protocols.

Core Topics
  • The third protocol layer: MCP + A2A + A2UI/AG-UI
  • A2UI (Google): declarative components, native rendering, security-first
  • AG-UI (CopilotKit/Oracle/Microsoft): event-based streaming, ~16 event types
  • Generative UI patterns: agents creating interfaces at runtime
  • Canvas/live visual runtime: WebSocket-synchronized surfaces
  • A2UI vs AG-UI — when to use which
  • NEW The Vercel v0 model — agents as UI generators
Sample Assignment

Agent that generates dynamic, interactive UIs — e.g., custom dashboards or comparison tables with interactive filters using A2UI or AG-UI protocol.

15 Model Routing, Agent Economics & Observability New

Implement intelligent multi-model routing with cost tracking, budget controls, and OpenTelemetry instrumentation.

Core Topics
  • Multi-model landscape: frontier (Opus, GPT-5), mid-tier (Sonnet, Flash), local (Llama, Phi, Qwen)
  • Role-based model selection and ModelManager
  • Intelligent routing: task complexity → automatic model selection
  • Cost tracking: per-request, per-agent, per-session metering
  • Prompt caching strategies for cost/latency optimization
  • OpenTelemetry: spans, traces, Jaeger visualization
  • NEW Budget-aware autonomous agents that optimize their own cost
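
A budget-aware router could be sketched as follows: estimate task complexity, pick a tier, and downgrade when the remaining budget cannot cover the expected cost. The tier names, prices, keyword list, and complexity heuristic are all made up for illustration and do not reflect real model pricing.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelTier:
    name: str
    usd_per_1k_tokens: float  # invented prices, cheapest first

TIERS = [
    ModelTier("local", 0.0),
    ModelTier("mid-tier", 0.003),
    ModelTier("frontier", 0.03),
]
HARD_WORDS = {"plan", "refactor", "prove", "debug"}

def complexity(task: str) -> int:
    """Crude heuristic: count 'hard' keywords, add one for long prompts."""
    words = re.findall(r"[a-z]+", task.lower())
    score = len(HARD_WORDS.intersection(words))
    if len(words) > 40:
        score += 1
    return min(score, len(TIERS) - 1)

@dataclass
class CostTracker:
    budget_usd: float
    spent_usd: float = 0.0

    def remaining(self) -> float:
        return self.budget_usd - self.spent_usd

    def record(self, tier: ModelTier, tokens: int) -> float:
        cost = tier.usd_per_1k_tokens * tokens / 1000
        self.spent_usd += cost
        return cost

def route(task: str, tracker: CostTracker, expected_tokens: int = 1000) -> ModelTier:
    """Pick a tier by complexity, stepping down while it would exceed the budget."""
    index = complexity(task)
    while index > 0 and TIERS[index].usd_per_1k_tokens * expected_tokens / 1000 > tracker.remaining():
        index -= 1
    return TIERS[index]

tracker = CostTracker(budget_usd=1.0)
print(route("What is 2+2?", tracker).name)
print(route("Plan and refactor the billing module", tracker).name)
```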
Sample Assignment

Intelligent model router with cost dashboard. Auto-select between 3+ models based on task complexity and demonstrate cost savings compared with always using a frontier model.

Act 5 — Autonomy & Production
16 Event-Driven Autonomous Agents

Shift from reactive to proactive agents that monitor event streams, evaluate relevance, and act autonomously.

Core Topics
  • From reactive to proactive: “ask → answer” to “event → decide → act”
  • Cron jobs, webhooks, Gmail Pub/Sub
  • Event bus architecture: publish-subscribe with history replay
  • Autonomous decision-making: evaluate relevance, decide whether to act
  • Karpathy's autoresearch principles: “never stop” autonomy, constraint design, markdown-as-code
  • Fixed metrics for evaluation, accept/reject with git
  • NEW Real-time telemetry streaming during autonomous operation
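
A publish-subscribe bus with history replay, combined with an "event → decide → act" handler, can be sketched in a few lines. This is an in-memory toy: the topic name, the relevance rule in `is_relevant`, and the audit log format are assumptions for this example.

```python
from collections import defaultdict

class EventBus:
    """In-memory publish-subscribe bus that keeps history for replay."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self._history = []

    def publish(self, topic, payload):
        event = {"seq": len(self._history), "topic": topic, "payload": payload}
        self._history.append(event)
        for handler in list(self._subscribers[topic]):
            handler(event)

    def subscribe(self, topic, handler, replay=False):
        if replay:
            # Deliver past events on this topic before going live
            for event in self._history:
                if event["topic"] == topic:
                    handler(event)
        self._subscribers[topic].append(handler)

def is_relevant(event):
    """Hypothetical relevance rule: only pushes to main deserve action."""
    return event["payload"].get("branch") == "main"

audit_log = []

def on_push(event):
    """Event -> decide -> act, with every decision recorded for review."""
    decision = "act" if is_relevant(event) else "ignore"
    audit_log.append((event["seq"], decision))

bus = EventBus()
bus.publish("git.push", {"branch": "main"})      # published before anyone subscribes
bus.subscribe("git.push", on_push, replay=True)  # late subscriber catches up
bus.publish("git.push", {"branch": "feature-x"})
print(audit_log)
```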
Sample Assignment

Agent that monitors real event streams (GitHub webhooks, email, or custom) and acts autonomously for at least 1 hour, producing a human-reviewable audit log.

17 Agentic Coding & Markdown-as-Code Skills

Build coding agents with System 2 reasoning, codebase navigation, and markdown-driven skill injection.

Core Topics
  • How coding agents work: Claude Code, Cursor, Windsurf architecture
  • File system awareness, diff generation, test running, git integration
  • Context management across large codebases
  • System 2 Reasoning Engine: Draft-Verify-Refine loop
  • Markdown-as-Code Skills: GenericSkill reads SKILL.md files
  • Karpathy: “You are programming the program.md”
  • NEW JitRL Query Optimizer — rewriting queries before planning
Sample Assignment

Coding agent that reads a codebase, identifies a bug, generates a fix, runs tests, and iterates until tests pass using System 2 reasoning and SKILL.md.

18 Agent Evaluation, Benchmarking & Capstone Prep

Design custom eval harnesses, run GAIA/SWE-bench benchmarks, and prepare capstone proposals.

Core Topics
  • GAIA benchmarks: multi-step reasoning evaluation
  • SWE-bench: software engineering task evaluation
  • Custom eval harnesses for domain-specific benchmarks
  • Regression testing for agent behavior
  • A/B testing: planning strategies, model configs, prompt variations
  • Measuring what matters: accuracy, cost, latency, safety
  • Capstone requirements: 30-day project, arXiv-style paper, public demo
  • NEW Agent safety evaluation — prompt injection, tool misuse, cost runaway
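
A custom eval harness with regression detection can start very small. The sketch below scores two configurations on the same cases using exact-match grading and reports cases that passed before but fail now. The test cases and both "agents" are toy stand-ins invented for this example.

```python
def run_eval(agent, cases):
    """Run every case; a crash counts as a failure. Returns {case_id: passed}."""
    results = {}
    for case in cases:
        try:
            output = agent(case["input"])
        except Exception:
            output = None
        results[case["id"]] = (output == case["expected"])
    return results

def accuracy(results):
    return sum(results.values()) / len(results)

def regressions(baseline, candidate):
    """Case ids that passed under the baseline but fail under the candidate."""
    return sorted(cid for cid, passed in baseline.items()
                  if passed and not candidate.get(cid, False))

CASES = [
    {"id": "c1", "input": "a", "expected": "A"},
    {"id": "c2", "input": " b", "expected": " B"},
    {"id": "c3", "input": "c ", "expected": "C"},
]

def baseline_agent(text):
    return text.upper()

def candidate_agent(text):
    return text.upper().strip()

base = run_eval(baseline_agent, CASES)
cand = run_eval(candidate_agent, CASES)
print(accuracy(base), accuracy(cand), regressions(base, cand))
```

Both configurations score the same accuracy here, yet the candidate breaks a case the baseline handled, which is exactly what a regression check is meant to surface.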
Sample Assignment

Custom eval harness with 20+ test cases, automated scoring, regression detection, and a report comparing two configurations. Plus: capstone proposal draft.

19 Arcturus 2.0 — Full Integration & The Complete Platform

Integrate all three protocol layers (MCP + A2A + A2UI/AG-UI) into a production-ready agentic platform.

Core Topics
  • Complete protocol stack in one system: MCP + A2A + A2UI/AG-UI
  • Arcturus 2.0 architecture: how all 19 sessions integrate
  • Production deployment: Docker Compose, health checks, restart policies
  • Gateway API platform: auth, rate limiting, metering, webhooks
  • Live demo: task → DAG → agents → MCP → A2UI → channel
  • What's next: agent economies, self-improving systems, governance
Sample Assignment

Finalize capstone with GitHub Projects plan, or submit a PR to Arcturus 2.0 implementing a course concept. 2-minute lightning preview of capstone idea.

20 Capstone Pitches & Arcturus 2.0 Contributions

Student presentations and the beginning of the 4-week capstone execution window.

Format
  • 5-minute pitch per team + 2–3 minutes Q&A
  • GitHub Projects reviewed
  • Go/no-go decision on proposals
  • 4-week execution window begins

“Best in the World” Checklist

By Session 19, every student will be able to:

1 Build a multi-agent DAG executor from scratch
2 Wire it to any channel (WhatsApp, Slack, Discord, voice) in a day
3 Give it browser autonomy and computer use
4 Give it sandboxed code execution in containers
5 Make it remember, learn preferences, and recall workflows
6 Deploy with observability and cost controls
7 Evaluate and benchmark against GAIA-level tasks
8 Expose it as a platform API that other agents call (A2A)
9 Have it generate dynamic UIs for users (A2UI/AG-UI)
10 Have it operate autonomously on event streams
11 Design constraints that enable safe autonomous operation
12 Build a coding agent with System 2 reasoning