Meta
======

Zipf's Law
--------------

A word's frequency is roughly inversely proportional to its rank in the
frequency table, i.e. the product of rank and frequency is roughly a
constant :math:`k` (see the rank-frequency sketch at the end of these notes):

.. math::

   \text{frequency} \times \text{rank} \approx k

.. math::

   \text{frequency} \approx \frac{k}{\text{rank}}

Evaluating similarity measures
-----------------------------------

* Extrinsic - plug the measure into a downstream system and see how well it performs
* Intrinsic - ask humans to evaluate it directly

Inter-annotator agreement
-----------------------------------

Basic probability
********************

.. math::

   P(\text{label}) = \frac{\text{agree}}{\text{agree} + \text{disagree}}

Cohen's Kappa
********************

.. math::

   \kappa = \frac{P(a) - E[a]}{1 - E[a]}

where :math:`P(a)` is the observed agreement between the annotators and
:math:`E[a]` is the probability of the annotators agreeing by chance,
estimated from each annotator's label distribution (see the kappa sketch
at the end of these notes).

Baselines for word sense prediction
---------------------------------------------

* Just predict the word's most frequent sense
* Lesk's algorithm - compare the word's context against the dictionary
  definitions (glosses) of its candidate senses and pick the sense with
  the most overlap (see the simplified Lesk sketch at the end of these notes)

Closed vs open vocab
-----------------------

* A vocab is called closed if the entire vocab is known in advance
  (no out-of-vocabulary words at test time)

Perplexity
-----------------------

* Intrinsic performance measure used for language models
* "Inverse probability of the test set, normalized by the number of words"
* "Kind of like the weighted average branching factor of the language"
* Should only be used to compare models which use the same vocab
* Low perplexity is good
* Is 2 to the cross entropy (written out in the sketches at the end of these notes)

For a bigram model, it can be defined as:

.. math::

   \text{PP}(W) = \sqrt[N]{\prod_{i=1}^{N}\frac{1}{P(w_i \mid w_{i-1})}}

Meta
-------

* Distributional semantics defines word meanings by the contexts in which they occur
* First-order co-occurrence - words are similar to the words that occur near them
* Second-order co-occurrence - similar words have similar neighbors

Exercises
-------------

* Derive the perplexity / cross-entropy relationship
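Sketches
------------

A minimal sketch of the rank-frequency check behind Zipf's Law. The toy
word list below is invented and far too small to show the law clearly; on
a real corpus the rank * frequency products cluster around a single
constant :math:`k`.

.. code-block:: python

   from collections import Counter

   words = ("the cat sat on the mat and the dog sat on the log "
            "and the cat saw the dog").split()
   counts = Counter(words)

   # frequency * rank ~ k  <=>  frequency ~ k / rank
   for rank, (word, freq) in enumerate(counts.most_common(5), start=1):
       print(rank, word, freq, rank * freq)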
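A minimal sketch of Cohen's kappa for two annotators. The label sequences
are made up, and ``cohens_kappa`` is a hypothetical helper written here,
not a library call.

.. code-block:: python

   from collections import Counter

   def cohens_kappa(labels_a, labels_b):
       """(P(a) - E[a]) / (1 - E[a]) for two annotators."""
       n = len(labels_a)
       # P(a): observed agreement rate
       p_a = sum(a == b for a, b in zip(labels_a, labels_b)) / n
       # E[a]: chance agreement from each annotator's label distribution
       dist_a, dist_b = Counter(labels_a), Counter(labels_b)
       e_a = sum((dist_a[l] / n) * (dist_b[l] / n) for l in dist_a)
       return (p_a - e_a) / (1 - e_a)

   ann1 = ["noun", "noun", "verb", "verb", "noun"]
   ann2 = ["noun", "verb", "verb", "verb", "noun"]
   print(cohens_kappa(ann1, ann2))  # ~0.615; 1 = perfect, 0 = chance level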
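A minimal sketch of the *simplified* Lesk algorithm: score each candidate
sense by how many words its gloss shares with the target word's context.
The two glosses for "bank" are paraphrased for illustration.

.. code-block:: python

   def simplified_lesk(context_words, sense_glosses):
       context = set(context_words)
       best_sense, best_overlap = None, -1
       for sense, gloss in sense_glosses.items():
           # overlap = shared words between context and this sense's gloss
           overlap = len(context & set(gloss.split()))
           if overlap > best_overlap:
               best_sense, best_overlap = sense, overlap
       return best_sense

   glosses = {
       "bank_financial": "an institution that accepts deposits and lends money",
       "bank_river": "sloping land beside a body of water such as a river",
   }
   context = "he sat by the river and watched the water flow past".split()
   print(simplified_lesk(context, glosses))  # bank_river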
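A minimal sketch of bigram perplexity under maximum-likelihood estimates,
computed in log space to avoid underflow. The training and test sentences
are toy data; a real model needs smoothing, since MLE assigns unseen
bigrams zero probability (and this sketch would then divide by zero).

.. code-block:: python

   import math
   from collections import Counter

   train = "the cat sat on the mat the cat ate".split()
   bigrams = Counter(zip(train, train[1:]))
   contexts = Counter(train[:-1])

   def p(prev, word):
       # MLE bigram probability: count(prev, word) / count(prev)
       return bigrams[(prev, word)] / contexts[prev]

   test = "the cat sat".split()
   n = len(test) - 1  # number of bigrams scored

   # PP(W) = (prod 1 / p(w_i | w_{i-1}))^(1/N) = 2 ** cross-entropy
   cross_entropy = -sum(math.log2(p(prev, w))
                        for prev, w in zip(test, test[1:])) / n
   print(2 ** cross_entropy)  # ~1.73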
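The "2 to the cross entropy" claim, written out for the bigram case. With
per-word cross-entropy

.. math::

   H(W) = -\frac{1}{N}\sum_{i=1}^{N}\log_2 P(w_i \mid w_{i-1})

exponentiating gives

.. math::

   2^{H(W)} = \left(\prod_{i=1}^{N} P(w_i \mid w_{i-1})\right)^{-1/N}
            = \sqrt[N]{\prod_{i=1}^{N}\frac{1}{P(w_i \mid w_{i-1})}}
            = \text{PP}(W)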