CS-552: Modern Natural Language Processing

Course Description

Natural language processing is ubiquitous in modern intelligent technologies, serving as a foundation for language translators, virtual assistants, search engines, and many other applications. In this course, we cover the foundations of modern methods for natural language processing, such as word embeddings, recurrent neural networks, transformers, and pretraining, and show how they can be applied to important tasks in the field, such as machine translation and text classification. We also cover shortcomings of these state-of-the-art approaches (e.g., in robustness, interpretability, and sensitivity), identify their failure modes in different NLP applications, and discuss analysis and mitigation techniques for these issues.
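To give a flavor of one of the first topics, word embeddings represent words as vectors so that related words end up geometrically close. The sketch below is purely illustrative (the vectors are made up, not learned, and real embeddings have hundreds of dimensions); it only assumes NumPy:

```python
import numpy as np

# Toy 4-dimensional word embeddings (made-up values for illustration;
# real embeddings are learned from data).
embeddings = {
    "king":  np.array([0.8, 0.6, 0.1, 0.2]),
    "queen": np.array([0.7, 0.7, 0.1, 0.3]),
    "apple": np.array([0.1, 0.0, 0.9, 0.8]),
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: close to 1.0 means similar."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Related words get more similar vectors than unrelated ones.
print(cosine_similarity(embeddings["king"], embeddings["queen"]))
print(cosine_similarity(embeddings["king"], embeddings["apple"]))
```

How such vectors are actually learned from raw text is covered in the first weeks of the course.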


| Platform | Where & when |
|---|---|
| Lectures | Wednesdays 9:15-11:00am [CM2] & Thursdays 1:15-2:00pm [CE1] |
| Exercise Sessions | Thursdays 2:15-4:00pm [CE1] |
| Project Assistance (not every week) | Wednesdays 11:15am-12:00pm [CM2] |
| Forum | Ed Forum [link] |
| Moodle | Announcements [link] |

All lectures will be given in person and live-streamed on Zoom. The Zoom link is available on the course Moodle page. Lectures will be recorded and uploaded to SwitchTube.

Lecture Schedule

| Week | Date | Topic | Recordings | Instructor(s) |
|---|---|---|---|---|
| Week 1 | 22 Feb | Introduction + Building a simple neural classifier | Lectures 1-3 | Antoine Bosselut |
| | 23 Feb | Neural LMs: word embeddings [slides, readings] | | |
| Week 2 | 1 Mar | Classical and Fixed-context Language Models | Lectures 4-6 | Antoine Bosselut |
| | 2 Mar | Recurrent Neural Networks [slides, readings] | | |
| Week 3 | 8 Mar | LSTMs and Sequence-to-sequence models | Lectures 7-9 | Antoine Bosselut, Gail Weiss |
| | 9 Mar | Theoretical properties of RNNs [slides, readings] | | |
| Week 4 | 15 Mar | Attention + Transformers | Lectures 10-12 | Antoine Bosselut |
| | 16 Mar | Transformers [slides, readings] | | |
| Week 5 | 22 Mar | Pretraining: ELMo, BERT | Lectures 13-15 | Antoine Bosselut |
| | 23 Mar | Transfer Learning: Introduction [slides, readings] | | |
| Week 6 | 29 Mar | Transfer Learning: Dataset Biases | Lectures 16-18 | Antoine Bosselut |
| | 30 Mar | Text Generation [slides] | | |
| Week 7 | 5-6 Apr | Text Generation [slides, readings] | Lectures 19-21 | Antoine Bosselut |
| Week 8 | 19 Apr | In-Context Learning | Lectures 23-24 | Antoine Bosselut |
| | 20 Apr | Project Description [slides] | | |
| Week 9 | 26 Apr | Scaling Laws + Model Compression | Lectures 25-26 | Antoine Bosselut, Mohammadreza Banaei |
| | 27 Apr | No class [slides, readings] | | |
| Week 10 | 3 May | Ethics in NLP | Lecture 27 | Antoine Bosselut |
| | 4 May | No class [slides] | | |
| Week 11 | 10 May | Interpretability & Analysis of Language Models | Lecture 31 | Antoine Bosselut |
| | 11 May | No class [slides] | | |
| Week 12 | 17 May | Reading Comprehension & Open-domain QA | Lectures 34-35 | Antoine Bosselut, Angelika Romanou |
| | 18 May | No class [slides, readings] | | |
| Week 13 | 24 May | Tokenization + Multilingual LMs | | Negar Foroutan |
| | 25 May | No class [slides, readings] | | |
| Week 14 | 31 May | Language & Vision | | Syrielle Montariol, Antoine Bosselut |
| | 1 Jun | Language & Vision + Wrap-up [slides] | | |

Exercise Schedule

| Week | Date | Topic | Instructor(s) |
|---|---|---|---|
| Week 1 | 23 Feb | Setup + Word embeddings [code] | Angelika Romanou, Sepideh Mamooler, Simin Fan |
| Week 2 | 2 Mar | Word embeddings review; Classical & Fixed-context Language Models [code] | Angelika Romanou, Mohammadreza Banaei, Sepideh Mamooler |
| Week 3 | 9 Mar | Language models review; Sequence-to-sequence models [code] | Mohammadreza Banaei, Sepideh Mamooler, Simin Fan |
| Week 4 | 16 Mar | Sequence-to-sequence models review; Attention + Transformers [code] | Sepideh Mamooler, Mete Ismayil, Simin Fan |
| Week 5 | 23 Mar | Transformers review; Pretraining: ELMo, BERT [code] | Simin Fan, Sepideh Mamooler, Molly Petersen |
| Week 6 | 30 Mar | Pretraining review; Transfer Learning: Dataset Biases [code] | Molly Petersen, Mete Ismayil, Sepideh Mamooler |
| Week 7 | 6 Apr | Transfer Learning review; Text Generation [code] | Molly Petersen, Deniz Bayazit, Sepideh Mamooler |
| Week 8 | 13 Apr | EASTER BREAK | |
| Week 9 | 20 Apr | Text Generation review; In-context Learning [code] | Deniz Bayazit, Silin Gao, Sepideh Mamooler |
| Week 10 | 27 Apr | In-context Learning review; Milestone 1 discussion | Silin Gao; TA meetings on demand |
| Week 11 | 4 May | Project | TA meetings on demand |
| Week 12 | 11 May | Milestone 2 discussion | Silin Gao; TA meetings on demand |
| Week 13 | 18 May | Project | TA meetings on demand |
| Week 14 | 25 May | Milestone 3 discussion | Deniz Bayazit; TA meetings on demand |
| Week 15 | 1 Jun | Project | TA meetings on demand |

Exercise Session Format

Note: Please make sure you have completed the setup prerequisites needed to run the coding parts of the exercises. You can find the instructions here.


Grading

Your grade in the course will be computed according to the following guidelines:

Assignments (40%):

There will be three assignments throughout the course. They will be released and due according to the following schedule:

Assignment 1 (10%)

Link for the assignment here.

Assignment 2 (15%)

Link for the assignment here.

Assignment 3 (15%)

Link for the assignment here.

Assignments will be released and announced on Moodle and Ed.

Project (60%):

The project will involve using large-scale (100B+ parameters) and medium-scale (300M parameters) language models in the domain of education. The project will be divided into two milestones and a final submission. Each milestone will be worth 15% of the final grade, with the remaining 30% allocated to the final report. Each team will be supervised by one of the course TAs or AEs.

Registration details can be found in the announcement here.

Milestone 1 (15%):

Milestone 2 (15%):

Final Deliverable (30%):

Late Days Policy

All assignments and milestones are due at 11:59 PM on their due date. Because circumstances can make it challenging to meet these deadlines, you will receive 6 late days over the course of the semester, to be allocated across the assignments and project milestones as you see fit. No further extensions will be granted. The only exception is the final report, code, and data, for which no extensions will be granted beyond June 15th.


Lecturer: Antoine Bosselut.

Teaching assistants: Mohammadreza Banaei, Deniz Bayazit, Zeming (Eric) Chen, Simin Fan, Silin Gao, Molly Petersen, Angelika Romanou

Please contact us for any organizational questions or questions related to the course content.