This course presents the main models, formalisms and algorithms necessary for the development of applications in the field of natural language information processing, together with application domains such as linguistic engineering, information retrieval and textual data analysis. Automated textual data processing is explored at three levels: morpho-lexical, syntactic and semantic.
Natural language processing is ubiquitous in modern intelligent technologies, serving as a foundation for language translators, virtual assistants, search engines, and many more. In this course, we cover the foundations of modern methods for natural language processing, such as word embeddings, recurrent neural networks, transformers, and pretraining, and how they can be applied to important tasks in the field, such as machine translation and text classification. We also cover open issues with these state-of-the-art approaches (such as robustness, interpretability, and sensitivity), identify their failure modes in different NLP applications, and discuss techniques for analyzing and mitigating these issues.
In recent years, NLP methods based on machine learning have become the core drivers of progress toward general natural language understanding. Their flexibility allows for rapid adaptation to new tasks, new domains, and new problems, but often at the cost of interpretability and robustness. The goal of this seminar is to introduce students to the most advanced methods in natural language processing, their shortcomings, and fruitful directions for continued investigation.
Students will be expected to read, review, present, and discuss relevant research papers in this area. Every week, they will be responsible for reading one or more research papers relevant to the topic of focus for that particular week. One or more students will prepare a presentation highlighting the important points of the paper and lead a discussion around those points. All students will be responsible for reading the paper and contributing to the discussion of its merits and weaknesses.
Over the course of the seminar, students will learn to critically read NLP research papers, critique work in this area, and propose extensions of current methods.