CMU 18786/18780

Course Overview

Neural networks have increasingly taken over various AI tasks, and currently produce the state of the art in many AI tasks ranging from computer vision and planning for self-driving cars to playing computer games. Basic knowledge of NNs, known currently in the popular literature as "deep learning", familiarity with various formalisms, and knowledge of tools, is now an essential requirement for any researcher or developer in most AI and NLP fields. This course is a broad introduction to the field of neural networks and their "deep" learning formalisms. The course traces some of the development of neural network theory and design through time, leading quickly to a discussion of various network formalisms, including simple feedforward, convolutional, recurrent, and probabilistic formalisms, the rationale behind their development, and challenges behind learning such networks and various proposed solutions. We subsequently cover various extensions and models that enable their application to various tasks such as computer vision, speech recognition, machine translation and playing games.

Prerequisites

This course is intended for graduate students and qualified undergraduate students with a strong mathematical and programming background. Undergraduate level training or coursework in algorithms, linear algebra, calculus, probability, and statistics is suggested. A background in programming will also be necessary for the problem sets; students are expected to be familiar with python or learn it during the course.

Textbooks

There will be no required textbooks, though we suggest the following to help you to study (all available online):

An online textbook: https://www.deeplearningbook.org/

We will provide suggested readings from these books in the schedule below.

Piazza

We will use Piazza for class discussions. Please go to the course Piazza site to join the course forum (note: you must use a cmu.edu email account to join the forum). We strongly encourage students to post on this forum rather than emailing the course staff directly (this will be more efficient for both students and staff). Students should use Piazza to:

Ask clarifying questions about the course material.
Share useful resources with classmates (so long as they do not contain homework solutions).
Look for students to form study groups.
Answer questions posted by other students to solidify your own understanding of the material.

The course Academic Integrity Policy must be followed on the message boards at all times. Do not post or request homework solutions! Also, please be polite.

Grading Policy

The grading policy will depend on what class you are signed up for.:

18-786
- 5 Homeworks (60%)
- Final Project (40%)
18-780
- 4 Homeworks (100%)

Gradescope: We will use Gradescope to collect PDF submissions of each problem set. Upon uploading your PDF, Gradescope will ask you to identify which page(s) contains your solution for each problem – this is a great way to double check that you haven’t left anything out. The course staff will manually grade your submission, and you’ll receive feedback explaining your final marks.

Regrade Requests: If you believe an error was made during grading, you’ll be able to submit a regrade request on Gradescope. For each homework, regrade requests will be open for only 1 week after the grades have been published. This is to encourage you to check the feedback you’ve received early!

Late submissions will not be accepted. There is one exception to this rule: You are given 6 “late days” (self-granted 24-hr extensions) which you can use to give yourself extra time without penalty. At most two late days can be used per assignment. This will be monitored automatically via Gradescope.

Academic Integrity Policy

Group studying and collaborating on problem sets are encouraged, as working together is a great way to understand new material. Students are free to discuss the homework problems with anyone under the following conditions:

Students must write their own solutions and understand the solutions that they wrote down.
Students must list the names of their collaborators (i.e., anyone with whom the assignment was discussed).
Students may not use old solution sets from other classes under any circumstances, unless the instructor grants special permission.

Students are encouraged to read CMU's Policy on Cheating and Plagiarism.

Using LaTeX

Students are strongly encouraged to use LaTeX for problem sets. LaTeX makes it simple to typeset mathematical equations, and is extremely useful for graduate students to know. Most of the academic papers you read were written with LaTeX, and probably most of the textbooks too. Here is an excellent LaTeX tutorial and here are instructions for installing LaTeX on your machine.

Acknowledgments

This course is based in part on material developed by Bhiksha Raj (CMU) and Chinmay Hedge (NYU). The course website follows the template of 18661.

Schedule (Subject to Change)

Date	Topics	HW
1/14	Intro + FeedForward NN
1/16	Backpropagation
1/17	Recitation	HW 1 Release
1/21	Torch Tutorial + Optimization
1/23	Practicalities
1/24	Recitation	HW 1 Due HW 2 Release
1/28	Introduction to Computer Vision
1/30	Convolutional Neural Networks
1/31	Recitation
2/4	Image Classification
2/6	Image Classification
2/7	Recitation
2/11	Object Detection	HW 3 Released
2/13	Object Detection
2/14	Recitation	HW 2 Due
2/18	Segmentation
2/20	Segmentation
2/21	Recitation	HW 4 Release
2/25	Language Transformers	HW 3 Due
2/27	Vision Transformers
2/28	Recitation
3/4	Spring Break
3/6	Spring Break
3/9	Spring Break	HW 4 Due (18-780)
3/11	Segmentation using Transformers	Project Release
3/13	LLM Intro
3/14	Recitation	HW 4 Due (18-786) HW 5 Release
3/18	LLM Finetuning
3/20	LLM Chain of Thought
3/21	Recitation
3/25	Reinforcement Learning 1
3/27	Reinforcement Learning 2
3/28	Recitation
4/1	Reinforcement Learning Applications
4/3	Carnival
4/4	Carnival
4/8	Generative 1 - diffusion models, GANs	HW 5 Due
4/10	Generative 2	Midterm projects due
4/11	Recitation
4/15	Special Topic 1
4/17	Special Topic 2
4/18	Recitation
4/22	Project Presentations
4/24	Project Presentations
4/25	Project Presentations	Final Reports Due

Prof. Andrea Zanette (azanette)	After class outside the lecture hall
Prof. Marios Savvides (marioss)	After class outside the lecture hall
Omar Alama (oalama)	Wednesday: 3:00 pm - 4:00 pm	Zoom
Hrishikesh Gokhale (hgokhale)	Wednesday: 5:00 pm - 6:00 pm	Zoom
Ritarka Samanta (rsamanta)	Tuesday: 4:00 pm - 5:00 pm	Posner 146
Chang Shu (cathyshu)	Tuesday: 5:00 pm - 6:00 pm	Zoom
Xin Shu (xinshu)	Monday: 11:00 am - 12:00 pm	Zoom
Haoran Zhang (haoranz5)	Thursday: 2:00 pm - 3:00 pm	Porter Hall B48 (MS study suite)	Zoom
Yingsi Qin (yingsiq)	Tuesday: 1:00 pm - 2:00 pm	Zoom
Yizhou Zhao (yizhouz)	Tuesday: 2:00 pm - 3:00 pm	Zoom
Zhantao (zhantaoy)	Friday: 3:00 pm - 4:00 pm	Zoom
Alex Liao (jiangmel)	Wednesday: 2:00 pm - 3:00 pm	Zoom

18-786/18-780: Introduction to Deep Learning