CS6120: Practical Natural Language Processing

Instructor: Kenneth Church (office hours before and after class on most Mondays, but not when Jiaji is teaching). Jiaji Huang will cover two lectures (see the schedule below).
TA: Ferran Sulaiman (office hours: 3-6 PM PST most Tuesdays and Fridays, but not the first Friday, in person or by Zoom; best by appointment)

Please feel free to contact us by email and/or chat; additional office hours by Zoom can be arranged by appointment.

Previous versions of this class

  1. Spring 2023 in San Jose
  2. Spring 2023 in Boston

Textbooks

  1. JM3
  2. E (optional; maybe)
  3. Deep Learning with Python (optional; maybe)

Computers

  1. Your laptop; will need to install python (and more)
  2. Khoury Cloud: this and this
  3. Discovery Cluster
  4. GitHub (please request an account)

The syllabus below is modeled on the previous versions of this class, though I hope to select material based more on current priorities and less on how the field got to where it is.

Syllabus

9/11
  Topic: Introduction, Tools, Machine Learning, Statistics
  Teacher: Kenneth Church
  Slides: ppt, pdf, Lecture from Last Term
  Readings:
    1. JM3: Chapter 6
    2. My opinion piece on Word2vec
    3. Church and Hanks (1990)
    4. Latent Semantic Indexing
  Assignments: Better Together (Due 9/23); Solution
  Projects (Suggestions):
    1. Author Ids
    2. ProMed
    3. negation

9/18
  Topic: Bag of Words, tf/idf, Naive Bayes, PMI, Word2vec, Man is to Woman as..., Linear Algebra, Rotations, Bilingual Lexicon Induction (BLI), BERT
  Teacher: Jiaji Huang
  Slides: ppt, pdf, Lecture from Last Term
  Readings:
    1. General Fine-Tuning
    2. HuggingFace tutorial on inference
  Assignments: No Assignment

9/25
  Topic: Deep Nets: Inference
  Teacher: Kenneth Church
  Slides: ppt, pdf, Lecture from Last Term
  Readings: JM3: Chapter 14
  Assignments: Assignment 2 (Due 10/14)

10/2
  Topic: More Deep Nets: Question Answering, Machine Translation, Part of Speech Tagging, Sequence Modeling
  Teacher: Kenneth Church
  Slides: ppt, pdf, Lecture from Last Term
  Readings: No Readings (Study for Exam)
  Assignments: No Assignment (Study for Exam)

10/9
  Holiday

10/16
  Topic: Exam
  Teacher: Kenneth Church
  Slides: No Lecture
  Readings:
    1. JM3: Chapter 10
    2. A Gentle Introduction to Fine-Tuning
    3. HuggingFace tutorial on fine-tuning
    4. Zoo of Checkpoints
  Assignments: Write proposal for final project

10/23
  Topic: Transformers; Zoo of checkpoints; Deep Nets: Fine-Tuning, Distillation
  Teacher: Jiaji Huang
  Slides: ppt, pdf
  Readings:
    1. JM3 Chapter 15
    2. GPT-3: What’s it good for?
    3. Weizenbaum’s nightmares: how the inventor of the first chatbot turned against AI
    4. Prompt Engineering
    5. I am a Student. You have No Idea How much we're Using ChatGPT
  Assignments: Assignment 3 (canceled)

10/30
  Topic: ChatGPT
  Teacher: Kenneth Church
  Slides: ppt, pdf
  Readings:
    1. No Language Left Behind
    2. ArtELingo
    3. COCO
    4. Papers with Code (for COCO)
    5. LibriSpeech
    6. Speech to Speech Translation
    7. IGLUE
  Assignments: Assignment 4 (Due 11/11)

11/6
  Topic: Multi-modality: NLP, vision, speech, emotion, translation, etc.
  Teacher: Jiaji Huang
  Slides: ppt, pdf
  Readings: TBD
  Assignments: No Assignment 5

11/13
  Topic: Homework and Midterms
  Teacher: Kenneth Church
  Slides: ppt, pdf
  Readings: TBD
  Assignments: Progress report on your project (Due Nov 18)

11/20
  Topic: Lexical Semantics
  Teacher: Kenneth Church
  Slides: ppt, pdf, JM25
  Readings: TBD
  Assignments: Yet another progress report on your project (Due Nov 26)

11/27
  Topic: Topics that may become great (again)
  Teacher: Kenneth Church
  Slides: ppt, pdf
  Readings: TBD
  Assignments: No assignments

12/4
  Topic: Final Presentations; Web Search
  Teacher: Kenneth Church
  Slides: ppt, pdf
  Readings: TBD
  Assignments: Presentations by Teams 1-3

12/11
  Topic: Final Presentations; Speech
  Teacher: Kenneth Church
  Slides: Slides
  Readings: TBD
  Assignments: Presentations by Teams 4-7

Stand-By
  Topic: Benchmarking
  Teacher: Kenneth Church
  Slides: Slides
  Readings:
    1. SOTA-Chasing
    2. Benchmarking Workshop
  Assignments: Assignment 12

Team Assignments

  1. Jing Luo and Zihan Chen
  2. Jiali Han
  3. ‘Ingrid’ Xiaoying Liu
  4. Runshi Gao, Haoping Lin and Yang Yao
  5. Bolang Yu, Ross Newman, Qiushi Xia and Ruitong Xu
  6. Raman Lnu and Edward Burke
  7. Rui Liu