Dlnlp2019
UBC Deep Learning for Natural Language Processing Course (2019)
CPSC 532P / LING 530A: Deep Learning for Natural Language Processing (DL-NLP)
The University of British Columbia
Year: Winter Session I 2019
Time: Tue & Thu, 14:00-15:30.
Location: The Leon and Thea Koerner University Centre (UCLL) (Right by Rose Garden at 6331 Crescent Road V6T 1Z1). Room 109.
Instructor: Dr. Muhammad Abdul-Mageed
Office location: Totem Field Studios 224 (Department of Linguistics: 2613 West Mall V6T 1Z4)
Office phone: (Apologies, I do not use an office phone. Please email me.)
Office hours: Tue. 12:00-14:00 @Totem Field Studios 224, or by appointment. (I can also handle inquiries via email or, in limited cases, Skype.)
E-mail address: muhammad.mageed@ubc.ca
Student Portal: http://canvas.ubc.ca
1. Course Rationale & Goal:
Rationale/Background: Deep learning is a class of machine learning methods inspired by information processing in the human brain, whereas natural language processing (NLP) is the field focused on teaching computers to understand and generate human language. Emotion detection, where a program identifies the type of emotion expressed in language, is an example of language understanding. Dialog systems, where the computer interacts with humans (such as the Amazon Echo), are an instance of both language understanding and generation, as the machine identifies the meaning of questions and generates meaningful answers. Other examples of NLP include speech processing and machine translation. Deep learning of natural language is transformative and has recently broken records on several NLP tasks. The field is also in its infancy, with fascinating breakthroughs ahead. Solving NLP problems directly contributes to the development of pervasive technologies with significant social and economic impacts and the potential to enhance the lives of millions of people. Given the central role that language plays in our lives, this research has implications across almost all fields of science and technology, as well as other disciplines, as NLP and deep learning are instrumental for making sense of the ever-growing data collected in these fields.
Goal: This course provides a graduate-level introduction to deep learning, with a focus on NLP problems and applications. The goal of the course is to familiarize students with the major deep learning methods and practices. This includes, for example, how neural networks are trained, the core neural network architectures, and the primary deep learning methods being developed to solve language problems. This includes problems at various linguistic levels (e.g., word and sub-word, phrase, clause, and discourse). For example, we will cover unsupervised, distributed representations and supervised deep learning methods across these different linguistic levels. Through homework and a final project, the course also provides a context for hands-on experience in using deep learning software to develop advanced solutions for NLP problems.
Potential audiences for this course are:
- People with a linguistics, computer science, and/or engineering background interested in learning novel deep learning and NLP methods.
- People with other machine learning backgrounds interested in deep learning and/or NLP.
2. Course Objectives:
Upon completion of this course students will be able to:
- identify the core principles of training and designing artificial neural networks
- identify the inherent ambiguity in natural language, and appreciate challenges associated with teaching machines to understand and generate it
- become aware of the major deep learning methods being developed for solving NLP problems, and be in a position to apply this deepened understanding in critical, creative, and novel ways
- become aware of a core of NLP problems, and demonstrate how these are relevant to the lives of diverse individuals, communities and organizations.
- collaborate effectively with peers through course assignments
- identify an NLP problem (existing or novel) and apply deep learning methods to develop a novel solution for it
3. Course Topics:
- word, phrase, and sentence meaning
- feedforward networks
- recurrent neural networks
- convolutional neural networks
- language models
- seq2seq models
- attention & Transformers
- deep generative models (auto-encoders & generative adversarial networks)
Applications
- machine translation
- controlled language generation
- summarization
- image and video captioning
- morphosyntax (e.g., POS tagging and morphological disambiguation)
- text classification (e.g., sentiment analysis, emotion detection, language)
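To give a flavor of the hands-on work these topics lead to, here is a toy sketch (not part of the course materials; the vocabulary and example sentences are made up) of the simplest supervised text-classification setup listed above: a bag-of-words sentiment classifier trained with gradient descent in plain Python.

```python
# Toy illustration: bag-of-words sentiment classification with
# logistic regression trained by gradient descent (pure Python).
import math

# Tiny made-up training set: (text, label), 1 = positive, 0 = negative.
docs = [("good great fun", 1), ("bad awful boring", 0),
        ("great movie", 1), ("boring bad plot", 0)]

# Build the vocabulary and a word -> index map.
vocab = sorted({w for text, _ in docs for w in text.split()})
idx = {w: i for i, w in enumerate(vocab)}

def featurize(text):
    """Map a text to a bag-of-words count vector."""
    x = [0.0] * len(vocab)
    for w in text.split():
        if w in idx:
            x[idx[w]] += 1.0
    return x

w = [0.0] * len(vocab)  # weights, one per vocabulary word
b = 0.0                 # bias term
lr = 0.5                # learning rate

def predict(x):
    """Sigmoid of the weighted sum: probability the text is positive."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Stochastic gradient descent on the cross-entropy loss:
# the gradient w.r.t. z is simply (prediction - label).
for epoch in range(100):
    for text, y in docs:
        x = featurize(text)
        err = predict(x) - y
        for i, xi in enumerate(x):
            w[i] -= lr * err * xi
        b -= lr * err

print(predict(featurize("great fun")))  # probability the new text is positive
```

The deep learning methods covered in the course replace the fixed bag-of-words features and linear model here with learned distributed representations and multi-layer architectures, but the training loop (forward pass, loss gradient, parameter update) follows the same pattern.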
4. Prerequisites:
- familiarity with basic linear algebra, basic calculus, and basic probability (basic = high school level)
- have programming experience in Python
- familiarity with at least one area of linguistics
- have access to a computer with a GPU on a regular basis
- ability to work individually as well as in a team
Students lacking any of the above prerequisites must be willing to learn outside their comfort zone to make up, including investing time outside class learning these prerequisites on their own. Some relevant material across these prerequisites will be linked from the syllabus. Although very light support might be provided, the engineering is exclusively the students' responsibility. This course has no lab section. Should you have questions about prerequisites, please email the instructor.
5. Format of the course:
- This course will involve lectures, class hands-on activities, individual and group work, and instructor-, peer-, and self-assessment.
6. Course syllabus:
Recommended books:
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. Cambridge, MA: MIT Press. Available at: [link].
- Jurafsky, D., & Martin, J. H. (2017). Speech and Language Processing. London: Pearson. Available at [link].
Other related material:
- Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with Python. O'Reilly Media, Inc. [link].
- Weekly readings will be assigned from materials available through the UBC library and online.
- See the "Readings" section below.
7. Calendar / Weekly schedule (tentative)
| Date | Topic | Slides & Related Content | What is due/out |
|------|-------|--------------------------|-----------------|
| Tues Sept 3 | No class; TA Training | NA | |
| Thurs Sept 5 | Course overview | [overview_slides] | |
| Tues Sept 10 | Probability I | [prob_slides]; [DLB CH03]; [KA]; [Harvard Stats] | |
| Thurs Sept 12 | Probability II | [prob_slides] | hw01 out (Canvas) |
| Tues Sept 17 | Probability III | [prob_slides] | |
| Thurs Sept 19 | Information Theory | [info_theory_slides]; [entropy_basics_slides] | |
| Tues Sept 24 | ML Basics | [ml_basics_slides] | |
| Thurs Sept 26 | Word meaning I | [word_meaning_slides] | hw01 due & hw02 out |
| Tues Oct 1 | Word meaning II & project discussion | [word_meaning_slides] | |
| Thurs Oct 3 | Linear Algebra | [lin_alg_notebook]; [DLB CH02] | |
| Tues Oct 8 | Language models | [lang_models_slides]; [JM_CH03]; [Brown et al 1992] | |
| Thurs Oct 10 | Word embeddings | [word2vec_slides]; [Mikolov et al. 2013a]; [Mikolov et al. 2013b]; [Bojanowski et al. 2017] | |
| Tues Oct 15 | Feedforward Networks | [ff_slides]; [DLB CH06] | |
| Thurs Oct 17 | Recurrent Neural Networks | [RNN_slides]; [DLB CH10] | |
| Tues Oct 22 | RNNs II | [RNN_slides] | |
| Thurs Oct 24 | GRUs & LSTMs | [gru_lstm_slides]; [Chung et al. 2014_Empirical_Eval_GRU] | hw02 due |
| Tues Oct 29 | Applications | [[applicati
