Monday
Opening session
13:30-14:00, Kari Tanácsterem (A039)
1 Introduction to Python, Session 1
14:00-15:45, A330
- Variables
- Data Types
- Operators
- Loops
- Indexing
- Functions
- Classes
- Flow Control
2 Introduction to Python, Session 2
16:15-18:00, Leonard
- Input/Output
- Modules
- LXML
- Pandas
- Plotting
If you would like a free Prodigy license, please fill out the form here
Tuesday
3 Intro to spaCy
9:00-10:45, Andy
- Python strings
- Language objects, doc, sents, tokens
- POS
- NER (w/ pre-trained models)
- displaCy
- available spaCy models
- Adding models from spacy-stanfordnlp
4 spaCy & TEI
11:15-13:00, David
- standoff converter
- adding extensions
- automated markup
- NER
- linguistic features
5 Rule-based matching and new pipeline components
14:00-15:15, Andy and David (short session)
- Rule-based matching
- adding pipeline components
- if time: fastText, MUSE
6 Training custom models
15:45-17:00, Andy
- training data
- training spaCy models (ner, textcat, pos, dep, semantic similarity)
- Prodigy
- discussion
Wednesday
7 spaCy Universe (slides)
9:00-10:45, Andy and David
- spaCy IRL
- course.spacy.io
- other learning resources
- scattertext (finding distinguishing terms in small-to-medium-sized corpora, and presenting them in a sexy, interactive scatter plot with non-overlapping term labels)
- Named Entity Linking
- spacy-pytorch-transformers & spacy pretrain
- concluding discussion