COMPSCI 714 · AI Architecture and Design
This course covers the design and implementation of modern AI systems: transformer architectures, large language models, training dynamics, and system-level considerations in deep learning.
These notes emphasize:
- architectural reasoning: why design choices are made, not just what they are
- implementation-level understanding: building models from scratch to expose their internals
- connecting theory to real systems (GPT-2, attention mechanisms, tokenization)
Linked notes:
- Deep Learning Explained with Mathematics
- Build GPT-2 from Scratch
- Transformers
- AI Systems
- Mathematical Foundations
- COMPSCI 713
Bridges Back And Outward
- Deep Learning Explained with Mathematics connects this course back to probability, optimization, and representation learning
- Build GPT-2 from Scratch carries the architecture material forward into systems and implementation
- A Simple CPU-based Whirl Shader Experiment is a useful detour if you want a more geometric and computational perspective