DTSA 5748: Deep Learning for Natural Language Processing

  • Specialization: Natural Language Processing: Deep Learning Meets Linguistics
  • Instructor: Katharina Kann
  • Prior knowledge needed: TBD

 View on Coursera

Learning Outcomes

  • Define feedforward networks, recurrent neural networks, convolutional neural networks, attention, and transformers. 
  • Implement and train feedforward networks, recurrent neural networks, convolutional neural networks, attention, and transformers. 
  • Describe the idea behind backpropagation. 
  • Describe the idea behind transfer learning and frequently used transfer learning algorithms. 
  • Design and implement your own neural network architectures for natural language processing tasks.

Course Content

Duration: 9h

This first module introduces the fundamental concepts of feedforward and recurrent neural networks (RNNs), focusing on their architectures, mathematical foundations, and applications in natural language processing (NLP). We will begin with an exploration of feedforward networks and their role in sentence embeddings and sentiment analysis. We then progress to RNNs, covering sequence modeling techniques such as LSTMs, GRUs, and bidirectional RNNs, along with their implementation in Python. Finally, you will examine training techniques, gaining hands-on experience in optimizing neural language models.
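As a taste of the kind of implementation work this module involves, below is a minimal sketch of a bidirectional LSTM sentiment classifier. PyTorch, the class name, and the toy hyperparameters are assumptions for illustration only, not the course's own starter code.

```python
import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    """Embed tokens, run a bidirectional LSTM, classify from the final hidden states."""
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)             # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.lstm(embedded)             # hidden: (2, batch, hidden_dim)
        sentence_repr = torch.cat([hidden[0], hidden[1]], dim=-1)  # concat both directions
        return self.classifier(sentence_repr)

# Toy usage: a batch of two padded sequences of random token ids
model = SentimentLSTM(vocab_size=5000)
logits = model(torch.randint(0, 5000, (2, 12)))
print(logits.shape)  # torch.Size([2, 2])
```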

Duration: 6h

This week we'll explore sequence-to-sequence models in natural language processing (NLP), beginning with recurrent neural network (RNN)-based architectures and the introduction of attention mechanisms for improved alignment in tasks like machine translation. The module also covers best practices for training neural networks, including regularization, optimization strategies, and efficient model training. At the end of the week, you will gain practical experience in implementing and training sequence-to-sequence models. 
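To make the attention idea concrete, here is a minimal sketch of dot-product attention over encoder states, the building block that lets a decoder align with relevant source positions during translation. PyTorch and the function name are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def dot_product_attention(decoder_state, encoder_states):
    """Compute a context vector as an attention-weighted sum of encoder states.

    decoder_state:  (batch, hidden)          current decoder hidden state
    encoder_states: (batch, src_len, hidden) all encoder hidden states
    """
    # Alignment scores: dot product between the decoder state and each encoder state
    scores = torch.bmm(encoder_states, decoder_state.unsqueeze(-1)).squeeze(-1)   # (batch, src_len)
    weights = F.softmax(scores, dim=-1)                                           # attention distribution
    context = torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)          # (batch, hidden)
    return context, weights

# Toy usage with random states
context, weights = dot_product_attention(torch.randn(2, 128), torch.randn(2, 7, 128))
print(context.shape, weights.shape)  # torch.Size([2, 128]) torch.Size([2, 7])
```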

Duration: 7h

This module explores transfer learning techniques in NLP, focusing on pretraining, finetuning, and multilingual models. You will first examine the role of pretrained language models like GPT, GPT-2, and BERT, along with the challenges of using them. We then explore multitask training and data augmentation, highlighting strategies like parameter sharing and loss weighting to improve model generalization across tasks. Finally, you will dive into crosslingual transfer learning, exploring methods like translate-train vs. translate-test, as well as zero-shot, one-shot, and few-shot learning for multilingual NLP.
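As an illustration of the pretrain-then-finetune idea, the following sketch runs one finetuning step of a pretrained BERT classifier. The Hugging Face transformers library, the checkpoint name, and the toy data are assumptions for illustration, not course materials.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a pretrained encoder plus a freshly initialized classification head
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

texts = ["a great movie", "a boring movie"]   # toy labeled examples
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)   # forward pass through encoder + new head
outputs.loss.backward()                   # backpropagate through the whole model (full finetuning)
optimizer.step()
print(float(outputs.loss))
```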

Duration: 6h

This final module introduces large language models (LLMs) and how they can be effectively used through techniques like prompt engineering, in-context learning, and parameter-efficient finetuning. You will explore language-and-vision models, understanding how multimodal architectures extend beyond text to integrate visual and other data modalities. We will also examine non-functional properties of LLMs, including challenges such as hallucinations, fairness, resource efficiency, privacy, and interpretability. 
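To give a simple sense of prompt engineering and in-context learning, here is a sketch that assembles a few-shot classification prompt. The demonstrations are toy data, and query_llm is a hypothetical placeholder for whichever LLM API is used, not a real library call.

```python
# Few-shot demonstrations: the task is specified entirely in the prompt, with no weight updates.
FEW_SHOT_EXAMPLES = [
    ("The plot was clever and the acting superb.", "positive"),
    ("I walked out halfway through.", "negative"),
]

def build_prompt(review: str) -> str:
    """Assemble a few-shot classification prompt from demonstrations plus the new input."""
    lines = ["Classify each movie review as positive or negative.", ""]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    lines.append(f"Review: {review}\nSentiment:")
    return "\n".join(lines)

def query_llm(prompt: str) -> str:
    """Hypothetical placeholder for an actual LLM API call; returns the model's completion."""
    raise NotImplementedError("wire this up to an LLM of your choice")

print(build_prompt("Surprisingly heartfelt and funny."))
```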

Duration: 1h

TBD

Note: This page is periodically updated. Course information on the Coursera platform supersedes the information on this page. Click the View on Coursera button above for the most up-to-date information.