Tuesdays 11:45 AM - 1:25 PM and Thursdays 2:50 PM - 4:30 PM, Shillman Hall 420. Class meetings are in person and will sometimes be held on Zoom at this link.
Professor: David Bau davidbau@northeastern.edu. Office Hours: before class, 10:30am Tuesdays, near the lecture hall
TAs:
Eric Todd todd.er@northeastern.edu. Help hours: Wednesdays 10am-1pm, Fridays 3:30-4:30pm (zoom link)
Arnab Sen Sharma sensharma.a@northeastern.edu. Help hours: Mondays 4-6pm (zoom link)
Sign up for Piazza here. Hand in reading questions (the day before class) and class notebooks (after class) on Canvas here. Homework is submitted on Gradescope here.
Class participation: 40 points. In-class participation and notebooks. Points are awarded for submitting questions on the nightly (before-class) reading and for submitting in-class notebooks, about 20 points each.
Homework: 40 points. Programming and calculation exercises. Homeworks are Jupyter notebooks to be worked through individually and submitted online every two weeks, by 11 AM on the due date, before class. Late submissions will be accepted but will lose points for each day late, and will earn no points after a week. 10 points per homework.
Midterm: 40 points. A written exam about foundational methods. Closed-book.
Final Project: 40 points. A project to solve a problem by training and evaluating a deep network. Done by groups of 2 students (3 with permission). We may choose a contest activity for this.
There is no textbook for the class. Slides of material we have covered are linked below. There are original-research paper readings, linked below.
For those looking for a reference book covering similar material, here are two that I recommend. Both are available online.
Tuesday, January 9, 2024 | What is deep learning, a bit of history in three traditions | slides - 1958 In-class Reading - Perceptrons Notebook (hand-in) - HW1 out |
Thursday, January 11, 2024 | Continuing history: gradient descent, universal computation, what is an MLP | slides - Read Rumelhart 1986 (hand-in) - Notebook on How to Read Pytorch (hand-in) |
Tuesday, January 16, 2024 | How to program multilayer perceptrons: PyTorch and modular machine learning | slides - Read Bottou 1990 (hand-in) - Notebooks: Pytorch-3-Autograd Pytorch-4-Modules (no new hand-in) - HW2 out |
Thursday, January 18, 2024 | Backprop I | slides - Read Solla 1988 - Notebook: cross-entropy Notebook: computation graphs (hand-in) - HW1 due - HW3 out |
Tuesday, January 23, 2024 | Backprop II, SGD, and Momentum | slides - Read Kingma 2015 (ADAM) - Notebook: backprop (hand-in) |
Thursday, January 25, 2024 | Practical training techniques: ways to improve gradients and ways to regularize | slides - Read Glorot 2010 (Xavier init) - Notebook: zigzag Notebook: stuck optimization (hand-in) - HW2 due |
Tuesday, January 30, 2024 | Bau Lab Research - Large Model Interpretability | slides - [no notebook or reading hand-in] |
Thursday, February 1, 2024 | The practice of training (2). Initialization. Then: Generalization and regularization | slides - Read Krogh 1991 (Weight decay) (hand-in) - Notebook: training dynamics (hand-in) |
Tuesday, February 6, 2024 | Words and images: representing concepts as vectors; time-invariant and translation-invariant weight-sharing | slides - Read LeCun 1989 (CNNs) (hand-in) - HW3 due |
Thursday, February 8, 2024 | Stacking convolutions; AlexNet; receptive fields | slides - Read AlexNet 2012 (hand-in) - Notebook: convolutions (hand-in) - Polo CNN explainer - HW4 out |
Tuesday, February 13, 2024 | Snow day | (University closed.) |
Thursday, February 15, 2024 | Equivariance, Batchnorm, Residual connections, Dropout | slides - Read ResNet 2015 (hand-in) - Notebook: Debugging CNNs (hand-in) |
Tuesday, February 20, 2024 | Debugging CNNs in detail | Notebook: Debugging CNNs (hand-in (repeated)) - Read Elman 1990 (hand-in) |
Thursday, February 22, 2024 | Midterm review | slides - practice midterm from 2022 - HW4 due |
Tuesday, February 27, 2024 | Midterm exam | (closed-book) |
Thursday, February 29, 2024 | Post-midterm review | slides - HW5 out |
Tuesday, March 5, 2024 | Spring break | |
Thursday, March 7, 2024 | Spring break | |
Tuesday, March 12, 2024 | Language Modeling | slides - Notebook: Language Models (hand-in) - Read Vaswani 2017 (Transformer) (hand-in (due 3/14)) |
Thursday, March 14, 2024 | Transformers | slides - Notebook: Single-Head Transformer Training (hand-in) - Weekend Neural Net Workshop signup |
Tuesday, March 19, 2024 | Generating images: GANs, VAEs | slides - Notebook: Training a GAN (hand-in) - Final project rubric out - Read Goodfellow 2014 (hand-in) |
Thursday, March 21, 2024 | Diffusion models | slides |
Tuesday, March 26, 2024 | Self-supervised representation learning, using pretrained models. | slides - Notebook: CLIP and BERT (hand-in) - HW5 due |
Thursday, March 28, 2024 | Topic: The policy debate over open and closed AI ecosystems | slides |
Tuesday, April 2, 2024 | Adversarial attack and adversarial robustness. | slides - Notebook: Adversarial Attack (hand-in) |
Thursday, April 4, 2024 | Tuning a chatbot, and emergence of humanlike behavior | slides - Notebook: ICL (hand-in) |
Tuesday, April 9, 2024 | Interpreting large models: mechanisms of ICL. | slides - Notebook: Function Vectors (hand-in) |
Thursday, April 11, 2024 | Extra lecture or gap day | |
Tuesday, April 16, 2024 | Project presentations | |