Natural Language Processing 2020

Last modified at: 2019-03-04 20:40:00+01:00

The 2020 edition at the Faculty of Mathematics, Physics and Informatics of Comenius University

Lectures (I-23)
Wednesday, 9:50 - 11:30 (voluntary)
Labs (H-6)
Tuesday, 11:30 - 13:00 (voluntary)


Lesson 1: Intro

28th of February

Discussed material:
Supplementary resources:
  • Eliza Bot Demo: one of the most famous use cases of regular expressions. It is really worth trying out -- you may end up having some surprisingly good conversations.
  • Unix for poets: about 25 pages of examples showing how to process text on the Unix command line. Here is a shorter version by the authors of the SLP book.
  • Scriptio continua: the reason why English also nearly ended up without word and sentence separators.
  • Regex101: a very nice application for writing and trying out regular expressions. Note that the link goes to the Python flavor of regular expressions.
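
To make the link between Eliza and regular expressions concrete, here is a minimal sketch of an Eliza-style responder in Python. The rules, the reply templates and the names RULES, DEFAULT_REPLY and respond are made up for illustration and are not taken from the linked demo.

```python
import re

# Hypothetical rules: (compiled pattern, reply template).
# The captured group is pasted back into the reply.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bbecause\b", re.IGNORECASE), "Is that the real reason?"),
]

DEFAULT_REPLY = "Please tell me more."


def respond(utterance: str) -> str:
    """Return the reply of the first rule whose pattern matches."""
    # Drop surrounding whitespace and trailing punctuation so it is
    # not echoed back inside the reply.
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT_REPLY
```

For example, respond("I need a holiday.") returns "Why do you need a holiday?", and any input that matches no rule falls back to DEFAULT_REPLY. The rules are tried in order, so more specific patterns should come first.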

Lesson 2: Edit Distance

4th of March

Discussed material:
Supplementary resources:
  • subsync: a tool for automatically synchronizing subtitles with video (a nice use of alignment in a not-so-ordinary context)
  • Autocomplete using Markov chains: a nice example (with code in Python) that shows how language models can be used to generate "text resembling language" and to build a simple autocomplete engine.
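
The autocomplete idea can be sketched in a few lines of Python. This is a toy first-order (bigram) Markov model under simplifying assumptions: whitespace tokenization, lowercasing and no smoothing. The function names train_bigram_model and suggest are hypothetical and do not come from the linked article.

```python
from collections import Counter, defaultdict


def train_bigram_model(text: str) -> dict:
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1
    return followers


def suggest(model: dict, word: str, k: int = 3) -> list:
    """Return up to k of the most frequent continuations of `word`."""
    # .get avoids adding an empty entry for words never seen in training.
    counts = model.get(word.lower(), Counter())
    return [candidate for candidate, _ in counts.most_common(k)]
```

Trained on the toy sentence "the cat sat on the mat and the cat slept", suggest(model, "the") returns ["cat", "mat"], because "cat" follows "the" twice and "mat" once. A word that never occurred in the training text gets an empty list of suggestions.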


Assignments: 50%
Project: 50%

Assignments are available via Google Classroom (class code: yyozsin) and in the following repository on GitHub:

A list of project ideas can be found here.

Points Grade
(90, inf) A
(80, 90] B
(70, 80] C
(60, 70] D
(50, 60] E
[0, 50] FX