

Opening session

13:30-14:00, Kari Tanácsterem (A039)

1 Introduction to Python, Session 1

14:00-15:45, A330

  • Variables
  • Data Types
  • Operators
  • Loops
  • Indexing
  • Functions
  • Classes
  • Flow Control
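As a rough preview of how the Session 1 topics fit together, the following minimal sketch touches variables, data types, operators, a loop, indexing, a function, a class, and flow control in one place. The names `WordTally` and `count_long_words` and the sample word list are hypothetical, chosen only for illustration; they are not taken from the workshop materials.

```python
class WordTally:
    """A small class that counts how often each item has been seen."""

    def __init__(self):
        self.counts = {}  # dict: maps an item to the number of times it was added

    def add(self, item):
        # dict.get returns 0 when the item has not been seen yet
        self.counts[item] = self.counts.get(item, 0) + 1


def count_long_words(words, min_length=4):
    """Count lowercased words that have at least min_length characters."""
    tally = WordTally()
    for word in words:                 # loop over a list
        if len(word) >= min_length:    # flow control with a comparison operator
            tally.add(word.lower())
    return tally.counts


# Variables holding a list of strings (hypothetical sample data)
words = ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog", "Over"]

result = count_long_words(words)
first_word = words[0]     # indexing: the first element
last_three = words[-3:]   # slicing: the last three elements

print(result)
print(first_word, last_three)
```

Short words such as "The" and "fox" are skipped by the length check, and "over" and "Over" are counted together because each word is lowercased before it is tallied.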

2 Introduction to Python, Session 2

16:15-18:00, Leonard

  • Input/Output
  • Modules
  • LXML
  • Pandas
  • Plotting

If you would like a free Prodigy license, please fill out the form here


3 Intro to spaCy

9:00-10:45, Andy

  • Python strings
  • Language objects, doc, sents, tokens
  • POS
  • NER (w/ pre-trained models)
  • displacy
  • available spaCy models
  • Adding models from spacy-stanfordnlp

4 spaCy & TEI

11:15-13:00, David

  • standoff converter
  • adding extensions
  • automated markup
    • NER
    • linguistic features

5 Rule-based matching and new pipeline components

14:00-15:15, Andy and David (short session)

  • Rule-based matching
  • adding pipeline components
  • if time, fasttext, MUSE

6 Training custom models

15:45-17:00, Andy

  • training data
  • training spaCy models (ner, textcat, pos, dep, semantic similarity)
  • Prodigy
  • discussion


7 spaCy Universe (slides)

9:00-10:45, Andy and David

  • spaCy IRL
  • other learning resources
  • scattertext (finding distinguishing terms in small-to-medium-sized corpora, and presenting them in a sexy, interactive scatter plot with non-overlapping term labels)
  • Named Entity Linking
  • spacy-pytorch-transformers & spacy pretrain

  • concluding discussion
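To give a feel for what "finding distinguishing terms" means in the scattertext item above, here is a minimal sketch that ranks terms by a smoothed log-odds ratio between two small groups of documents. This is an illustrative stand-in using only the standard library, not scattertext's own scoring method or API; the function name `distinguishing_terms` and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def distinguishing_terms(docs_a, docs_b, top_n=3):
    """Rank terms by add-one-smoothed log-odds between two document groups.

    Returns (terms most typical of A, terms most typical of B).
    Assumption: whitespace tokenization and lowercasing are good enough here.
    """
    counts_a = Counter(w for doc in docs_a for w in doc.lower().split())
    counts_b = Counter(w for doc in docs_b for w in doc.lower().split())
    vocab = set(counts_a) | set(counts_b)

    # Add-one smoothing so that terms missing from one group do not divide by zero
    total_a = sum(counts_a.values()) + len(vocab)
    total_b = sum(counts_b.values()) + len(vocab)

    scores = {}
    for term in vocab:
        p_a = (counts_a[term] + 1) / total_a
        p_b = (counts_b[term] + 1) / total_b
        scores[term] = math.log(p_a / p_b)  # > 0 leans toward A, < 0 toward B

    # Ties are broken alphabetically so that the ranking is deterministic
    top_a = sorted(vocab, key=lambda t: (-scores[t], t))[:top_n]
    top_b = sorted(vocab, key=lambda t: (scores[t], t))[:top_n]
    return top_a, top_b


# Hypothetical sample corpora
docs_a = ["the king rules the castle", "the king and the queen"]
docs_b = ["the farmer plows the field", "the farmer and the cow"]

top_a, top_b = distinguishing_terms(docs_a, docs_b)
print(top_a)
print(top_b)
```

Terms that are equally frequent in both groups, such as "the", score zero and drop out of both rankings, which is the basic intuition behind plotting terms by how strongly they are associated with one category or the other.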