DS 4440 - Spring 2024 - Practical Deep Networks

Tuesdays 11:45 AM - 1:25 PM and Thursdays 2:50 PM - 4:30 PM, Shillman Hall 420. Class meetings are in person, and will sometimes be on Zoom at this link.

Professor: David Bau davidbau@northeastern.edu. Office Hours: before class, 10:30am Tuesdays, near the lecture hall

TAs:
Eric Todd todd.er@northeastern.edu. Help hours: Wednesdays 10am-1pm, Fridays 3:30-4:30pm (zoom link)
Arnab Sen Sharma sensharma.a@northeastern.edu. Help hours: Mondays 4-6pm (zoom link)

Sign up for Piazza here. Hand in reading questions (the day before class) and class notebooks (after class) on Canvas here. Homeworks will go to Gradescope here.

Summary

In this course we will learn the principles and practice of deep learning methods. We will cover the capabilities of deep networks, the main methods for training them effectively, and the common architectures and techniques for using a deep network to process and produce images and natural language text. In addition to getting experience using these methods, we will discuss the evolution of the field and some of the main questions and debates that have emerged in deep learning research, and explore some current research topics. There will be a final project where you work with a partner to implement a solution to a problem using deep network methods.

Grading

160 points total.

Class participation: 40 points. In-class participation and notebooks. There will be points for submitting questions on nightly (before-class) reading, and for submitting in-class notebooks, about 20 each.

Homework: 40 points. Programming and calculation exercises. Homeworks will be Jupyter notebooks to be worked through by students individually. They are due every two weeks, submitted online at 11 AM on the due date, before class. Late homework submissions will be accepted but will lose points for each day late, with no points after a week. 10 points per homework.

Midterm: 40 points. A written exam about foundational methods. Closed-book.

Final Project: 40 points. A project to solve a problem by training and evaluating a deep network. Done by groups of 2 students (3 with permission). We may choose a contest activity for this.

Books

There is no textbook for the class. Slides of material we have covered are linked below. There are original-research paper readings, linked below.

For those looking for a good reference book covering similar material, here are two I recommend. Both are available online.

Calendar

Tuesday, January 9, 2024	What is deep learning, a bit of history in three traditions	slides - 1958 In-class Reading - Perceptrons Notebook (hand-in) - HW1 out
Thursday, January 11, 2024	Continuing history: gradient descent, universal computation, what is an MLP	slides - Read Rumelhart 1986 (hand-in) - Notebook on How to Read Pytorch (hand-in)
Tuesday, January 16, 2024	How to program a multilayer perceptron: pytorch and modular machine learning	slides - Read Bottou 1990 (hand-in) - Notebooks: Pytorch-3-Autograd Pytorch-4-Modules (no new hand-in) - HW2 out
Thursday, January 18, 2024	Backprop I	slides - Read Solla 1988 - Notebook: cross-entropy Notebook: computation graphs (hand-in) - HW1 due - HW3 out
Tuesday, January 23, 2024	Backprop II, SGD, and Momentum	slides - Read Kingma 2015 (ADAM) - Notebook: backprop (hand-in)
Thursday, January 25, 2024	Practical training techniques: ways to improve gradients and ways to regularize	slides - Read Glorot 2010 (Xavier init) - Notebook: zigzag Notebook: stuck optimization (hand-in) - HW2 due
Tuesday, January 30, 2024	Bau Lab Research - Large Model Interpretability	slides - [no notebook or reading hand-in]
Thursday, February 1, 2024	The practice of training (2). Initialization. Then: Generalization and regularization	slides - Read Krogh 1991 (Weight decay) (hand-in) - Notebook: training dynamics (hand-in)
Tuesday, February 6, 2024	Words and images: representing concepts as vectors; time-invariant and translation-invariant weight-sharing	slides - Read LeCun 1989 (CNNs) (hand-in) - HW3 due
Thursday, February 8, 2024	Stacking convolutions; AlexNet; receptive fields	slides - Read AlexNet 2012 (hand-in) - Notebook: convolutions (hand-in) - Polo CNN explainer - HW4 out
Tuesday, February 13, 2024	Snow day	(University closed.)
Thursday, February 15, 2024	Equivariance, Batchnorm, Residual connections, Dropout	slides - Read ResNet 2015 (hand-in) - Notebook: Debugging CNNs (hand-in)
Tuesday, February 20, 2024	Debugging CNNs in detail	Notebook: Debugging CNNs (hand-in (repeated)) - Read Elman 1990 (hand-in)
Thursday, February 22, 2024	Midterm review	slides - practice midterm from 2022 - HW4 due
Tuesday, February 27, 2024	Midterm exam	(closed-book)
Thursday, February 29, 2024	Postmidterm review	slides - HW5 out
Tuesday, March 5, 2024	Spring break
Thursday, March 7, 2024	Spring break
Tuesday, March 12, 2024	Language Modeling	slides - Notebook: Language Models (hand-in) - Read Vaswani 2017 (Transformer) (hand-in (due 3/14))
Thursday, March 14, 2024	Transformers	slides - Notebook: Single-Head Transformer Training (hand-in) - Weekend Neural Net Workshop signup
Tuesday, March 19, 2024	Generating images: GANs, VAEs	slides - Notebook: Training a GAN (hand-in) - Final project rubric out - Read Goodfellow 2014 (hand-in)
Thursday, March 21, 2024	Diffusion models	slides
Tuesday, March 26, 2024	Self-supervised representation learning, using pretrained models.	slides - Notebook: CLIP and BERT (hand-in) - HW5 due
Thursday, March 28, 2024	Topic: The policy debate over open and closed AI ecosystems	slides
Tuesday, April 2, 2024	Adversarial attack and adversarial robustness.	slides - Notebook: Adversarial Attack (hand-in)
Thursday, April 4, 2024	Tuning a chatbot, and emergence of humanlike behavior	slides - Notebook: ICL (hand-in)
Tuesday, April 9, 2024	Interpreting large models: mechanisms of ICL.	slides - Notebook: Function Vectors (hand-in)
Thursday, April 11, 2024	Extra lecture or gap day
Tuesday, April 16, 2024	Project presentations