DS 4440 - Spring 2024 - Practical Deep Networks

Tuesdays 11:45 AM - 1:25 PM and Thursdays 2:50 PM - 4:30 PM, Shillman Hall 420. Class meetings are in person and will sometimes be on Zoom at this link.

Professor: David Bau davidbau@northeastern.edu. Office Hours: before class, 10:30am Tuesdays, near the lecture hall

TAs:
Eric Todd todd.er@northeastern.edu. Help hours: Wednesdays 10am-1pm, Fridays 3:30-4:30pm (zoom link)
Arnab Sen Sharma sensharma.a@northeastern.edu. Help hours: Mondays 4-6pm (zoom link)

Sign up for Piazza here. Hand in reading questions (the day before class) and class notebooks (after class) on Canvas here. Homeworks will go to Gradescope here.

Summary

In this course we will learn the principles and practice of deep learning methods. We will cover the capabilities of deep networks, the main methods for training them effectively, and the common architectures and techniques for using a deep network to process and produce images and natural language text. In addition to getting experience using these methods, we will discuss the evolution of the field, examine some of the main questions and debates that have emerged in deep learning research, and explore some current research topics. There will be a final project where you work with a partner to implement a solution to a problem using deep network methods.

Grading

160 points total.

Class participation: 40 points. In-class participation and notebooks. There will be points for submitting questions on nightly (before-class) reading, and for submitting in-class notebooks, about 20 each.

Homework: 40 points. Programming and calculation exercises. Homeworks will be Jupyter notebooks to be worked through by students individually. They are due every two weeks and must be submitted online by 11 AM on the due date, before class. Late homework submissions will be accepted but will lose points for each day late, and will receive no points after a week. 10 points per homework.

Midterm: 40 points. A written exam about foundational methods. Closed-book.

Final Project: 40 points. A project to solve a problem by training and evaluating a deep network. Done by groups of 2 students (3 with permission). We may choose a contest activity for this.

Books

There is no textbook for the class. Slides of material we have covered are linked below. There are original-research paper readings, linked below.

For those looking for a reference book covering similar material, here are two that I recommend. Both are available online.

Calendar

Tuesday, January 9, 2024 What is deep learning, a bit of history in three traditions slides - 1958 In-class Reading - Perceptrons Notebook (hand-in) - HW1 out
Thursday, January 11, 2024 Continuing history: gradient descent, universal computation, what is an MLP slides - Read Rumelhart 1986 (hand-in) - Notebook on How to Read Pytorch (hand-in)
Tuesday, January 16, 2024 How to program a multilayer perceptron: pytorch and modular machine learning slides - Read Bottou 1990 (hand-in) - Notebooks: Pytorch-3-Autograd Pytorch-4-Modules (no new hand-in) - HW2 out
Thursday, January 18, 2024 Backprop I slides - Read Solla 1988 - Notebook: cross-entropy Notebook: computation graphs (hand-in) - HW1 due - HW3 out
Tuesday, January 23, 2024 Backprop II, SGD, and Momentum slides - Read Kingma 2015 (ADAM) - Notebook: backprop (hand-in)
Thursday, January 25, 2024 Practical training techniques: ways to improve gradients and ways to regularize slides - Read Glorot 2010 (Xavier init) - Notebook: zigzag Notebook: stuck optimization (hand-in) - HW2 due
Tuesday, January 30, 2024 Bau Lab Research - Large Model Interpretability slides - [no notebook or reading hand-in]
Thursday, February 1, 2024 The practice of training (2). Initialization. Then: Generalization and regularization slides - Read Krogh 1991 (Weight decay) (hand-in) - Notebook: training dynamics (hand-in)
Tuesday, February 6, 2024 Words and images: representing concepts as vectors; time-invariant and translation-invariant weight-sharing slides - Read LeCun 1989 (CNNs) (hand-in) - HW3 due
Thursday, February 8, 2024 Stacking convolutions; AlexNet; receptive fields slides - Read AlexNet 2012 (hand-in) - Notebook: convolutions (hand-in) - Polo CNN explainer - HW4 out
Tuesday, February 13, 2024 Snow day (University closed.)
Thursday, February 15, 2024 Equivariance, Batchnorm, Residual connections, Dropout slides - Read ResNet 2015 (hand-in) - Notebook: Debugging CNNs (hand-in)
Tuesday, February 20, 2024 Debugging CNNs in detail Notebook: Debugging CNNs (hand-in (repeated)) - Read Elman 1990 (hand-in)
Thursday, February 22, 2024 Midterm review slides - practice midterm from 2022 - HW4 due
Tuesday, February 27, 2024 Midterm exam (closed-book)
Thursday, February 29, 2024 Post-midterm review slides - HW5 out
Tuesday, March 5, 2024 Spring break
Thursday, March 7, 2024 Spring break
Tuesday, March 12, 2024 Language Modeling slides - Notebook: Language Models (hand-in) - Read Vaswani 2017 (Transformer) (hand-in (due 3/14))
Thursday, March 14, 2024 Transformers slides - Notebook: Single-Head Transformer Training (hand-in) - Weekend Neural Net Workshop signup
Tuesday, March 19, 2024 Generating images: GANs, VAEs slides - Notebook: Training a GAN (hand-in) - Final project rubric out - Read Goodfellow 2014 (hand-in)
Thursday, March 21, 2024 Diffusion models slides
Tuesday, March 26, 2024 Self-supervised representation learning, using pretrained models. slides - Notebook: CLIP and BERT (hand-in) - HW5 due
Thursday, March 28, 2024 Topic: The policy debate over open and closed AI ecosystems slides
Tuesday, April 2, 2024 Adversarial attack and adversarial robustness. slides - Notebook: Adversarial Attack (hand-in)
Thursday, April 4, 2024 Tuning a chatbot, and emergence of humanlike behavior slides - Notebook: ICL (hand-in)
Tuesday, April 9, 2024 Interpreting large models: mechanisms of ICL. slides - Notebook: Function Vectors (hand-in)
Thursday, April 11, 2024 Extra lecture or gap day
Tuesday, April 16, 2024 Project presentations