
+Detecting Genre Shift

Mark Dredze, Tim Oates, Christine Piatko

Paper to appear at EMNLP-10

+Natural Language Processing and Machine Learning

Extracting findings from scientific papers

•Genetic epidemiology (development domain)

•PubMed search produces thousands of papers

•Manually reviewed to extract findings

•Findings determine relevant papers/studies

•Automate this process with ML/NLP methods

•Create searchable database of findings

•Allow machine inference over findings

•Suggest new scientific hypotheses

+Genre Shift in Statistical NLP

… told that John Paul Stevens is retiring this summer …

Named Entity Recognition

… President Barack Obama is urging members to …

+Supervised Machine Learning for Named Entity Recognition

Today the Atlantic Ocean is in an uproar and North Carolina remains in a state of anxiety.

Windowed Text                       Label
Today the Atlantic Ocean is         B
the Atlantic Ocean is in            I
Atlantic Ocean is in an             O
Ocean is in an uproar               O
is in an uproar and                 O
in an uproar and North              O
an uproar and North Carolina        O
uproar and North Carolina remains   B
and North Carolina remains in       I
North Carolina remains in a         O
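
As a concrete illustration of the windowing above, here is a minimal Python sketch that slides a five-token window over a BIO-tagged sentence and pairs each window with the tag of its centre token. The centre-token convention and the window size of five are assumptions that match the example rows, not details stated on the slide.

```python
# Sketch: turn a BIO-tagged sentence into windowed training examples.
# Assumption: each window of five tokens takes the BIO label of its
# centre token, which is consistent with the rows in the table above.

def make_windows(tokens, bio_labels, size=5):
    """Return (window, label) pairs; the label belongs to the centre token."""
    half = size // 2
    examples = []
    for i in range(half, len(tokens) - half):
        window = tokens[i - half:i + half + 1]
        examples.append((window, bio_labels[i]))
    return examples

if __name__ == "__main__":
    sentence = ("Today the Atlantic Ocean is in an uproar and "
                "North Carolina remains in a state of anxiety .").split()
    labels = ["O", "O", "B", "I", "O", "O", "O", "O", "O",
              "B", "I", "O", "O", "O", "O", "O", "O", "O"]
    for window, label in make_windows(sentence, labels)[:3]:
        print(" ".join(window), "->", label)
    # Today the Atlantic Ocean is -> B
    # the Atlantic Ocean is in -> I
    # Atlantic Ocean is in an -> O
```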

+Supervised Machine Learning for Named Entity Recognition

Windowed Text                 Label
Today the Atlantic Ocean is   B
the Atlantic Ocean is in      I
Atlantic Ocean is in an       O

Feature Vector                                     Label
[today, the, atlantic, ocean, is, U, L, U, U, L]   B
[the, atlantic, ocean, is, in, L, U, U, L, L]      I
[atlantic, ocean, is, in, an, U, U, L, L, L]       O
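
A small sketch of the feature extraction implied by the vectors above: the lowercased window words plus one case flag per token. Reading U/L as "starts upper-case" vs. "lower-case" is an assumption that reproduces the example vectors.

```python
# Sketch: map a five-token window to the representation shown above:
# lowercased words followed by one case flag per token ("U" if the token
# starts with an upper-case letter, "L" otherwise). The U/L reading is an
# assumption that matches the example vectors.

def window_features(window):
    words = [w.lower() for w in window]
    case_flags = ["U" if w[0].isupper() else "L" for w in window]
    return words + case_flags

if __name__ == "__main__":
    print(window_features(["Today", "the", "Atlantic", "Ocean", "is"]))
    # ['today', 'the', 'atlantic', 'ocean', 'is', 'U', 'L', 'U', 'U', 'L']
```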

+Genre Shift in Statistical NLP

… told that John Paul Stevens is retiring this summer …

Named Entity Recognition

… PRESIDENT BARACK OBAMA IS URGING MEMBERS TO…

???

+This is a Pervasive Problem

Extracting regulatory pathways from online bioinformatics journals using a parser trained on the WSJ

Finding faces in images of disaster victims using a model trained on “mug shot” images

Identifying RNA sequences that regulate gene expression in a lab in Baltimore using a model trained on data gathered in a lab in Germany

When things change in a way that’s harmful, we’d like to know!

+Data Streams Change Over Time

Natural drift

Users unaware of system limitations

Sentiment classification from movie reviews

+Detecting Genre Shift

Two problems:

1) Detect changes in a stream of numbers (A-distance)

2) Convert the document stream into a stream of informative numbers (margin)

Genre shift hurts system performance (accuracy)

+Detecting Genre Shift

Measure accuracy directly

•Requires labeled examples!

Look for changes in feature distributions

•Words become more/less common

•New words appear

Genre shift hurts system performance (accuracy)

+Measuring Changes in Streams: The A-Distance

A nonparametric, distribution-independent measure of changes in univariate, real-valued data streams (Kifer, Ben-David, and Gehrke, 2004)

[Figure: two distributions P and P’; a change is flagged when the probability of some interval A differs between them by more than ε]
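
A rough Python sketch of this idea: tile the real line, estimate the probability of each tile in a reference window and in a sliding recent window, and flag a change when some tile's probability differs by more than ε. The equal-width tiling and the window handling are illustrative choices, not the exact construction of Kifer, Ben-David, and Gehrke (2004).

```python
# Sketch of an A-distance-style change test on a univariate stream:
# compare per-tile empirical probabilities between a reference window and
# the most recent window, and flag a change when any tile differs by more
# than epsilon. Tiling and window handling are illustrative assumptions.

def tile_probs(window, edges):
    """Empirical probability of each tile [edges[i], edges[i+1])."""
    n = len(window)
    return [sum(1 for x in window if lo <= x < hi) / n
            for lo, hi in zip(edges, edges[1:])]

def a_distance(window1, window2, edges):
    p, q = tile_probs(window1, edges), tile_probs(window2, edges)
    return max(abs(a - b) for a, b in zip(p, q))

def detect_change(stream, n=200, epsilon=0.2, edges=None):
    """Return the index at which a change is flagged, or None."""
    if edges is None:
        edges = [i / 10 for i in range(-50, 51)]  # tiles of width 0.1 on [-5, 5)
    reference = stream[:n]
    for t in range(2 * n, len(stream) + 1):
        recent = stream[t - n:t]
        if a_distance(reference, recent, edges) > epsilon:
            return t
    return None

if __name__ == "__main__":
    import random
    random.seed(0)
    stream = ([random.gauss(0, 1) for _ in range(500)] +
              [random.gauss(2, 1) for _ in range(500)])
    print(detect_change(stream))  # flags a change some time after index 500
```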

+Changes in Document Streams

… President Barack Obama is urging members to …

Feature counts X and learned weights W (example):

Feature   X   W
Obama     4   1.6
embassy   1   0.1

WX = 1.6 * 4 + 0.1 * 1 + … = 3.7

• WX = margin
• sign of WX is class label (+/-)
• magnitude of WX is “certainty” in label
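
A minimal sketch of the margin computation in the example above, with sparse dictionaries standing in for X and W. Only the two features shown are included; the "…" terms from the slide are omitted, so the number printed here covers just these two features.

```python
# Sketch: the margin is the dot product W.X of the weight vector with the
# document's feature counts; its sign is the predicted class and its
# magnitude a rough certainty. Only the two example features are included,
# so the total differs from the slide's 3.7, which sums over all features.

def margin(weights, features):
    """Dot product of a sparse weight vector and sparse feature counts."""
    return sum(weights.get(f, 0.0) * count for f, count in features.items())

if __name__ == "__main__":
    W = {"obama": 1.6, "embassy": 0.1}
    X = {"obama": 4, "embassy": 1}
    m = margin(W, X)                      # 1.6 * 4 + 0.1 * 1 = 6.5
    label = "+" if m >= 0 else "-"
    print(m, label)
```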

+Why Margins?

We have an easy way of producing them from unlabeled examples!

We want to track feature changes:

• Margins are linear combinations of feature values

• Removing important features yields smaller margins

• Only track features that matter: features with zero (small) weight don’t affect the margin (much)

Spoiler alert! Tracking margins works really well for unsupervised detection of genre shifts.

+Accuracy vs. Margins

[Figure: accuracy vs. average margin for the DVD to Electronics shift; curves show the average within each block and the average over the last 100 instances]

+Confidence Weighted Margins

Margins can be viewed as measure of confidence

We detect when confidence in classifications drops

Confidence Weighted (CW) learning refines this idea:

• Gaussian distribution over weight vectors

• Mean of weight vector: μ ∈ R^N

• Diagonal covariance matrix: σ ∈ R^(N×N)

• Low variance → high confidence

Normalized margin: μ·x / (xᵀσx)^(1/2)

Called VARIANCE in the slides that follow

[Figure: example weight vector with μ = (1.6, 0.1) and per-feature variances σ = 0.02 and σ = 1.74]
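
A small sketch of the normalized margin used as the VARIANCE signal, assuming a diagonal covariance so xᵀσx reduces to a per-feature sum. The numbers echo the example above; the default variance of 1.0 for unseen features is an assumption.

```python
# Sketch of the Confidence Weighted normalized margin mu.x / sqrt(x^T sigma x)
# with a diagonal covariance. Feature values echo the example above; the
# prior variance of 1.0 for unseen features is an assumption.

import math

def cw_normalized_margin(mu, sigma_diag, x):
    """mu, sigma_diag, x are dicts keyed by feature name."""
    mean_margin = sum(mu.get(f, 0.0) * v for f, v in x.items())
    variance = sum(sigma_diag.get(f, 1.0) * v * v for f, v in x.items())
    return mean_margin / math.sqrt(variance)

if __name__ == "__main__":
    mu = {"obama": 1.6, "embassy": 0.1}
    sigma = {"obama": 0.02, "embassy": 1.74}
    x = {"obama": 4, "embassy": 1}
    print(cw_normalized_margin(mu, sigma, x))
```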

+Experiments

Datasets

• Sentiment classification between domains (Blitzer et al., 2007): DVDs, electronics, books, kitchen appliances

• Spam classification between users (Jiang and Zhai, 2007)

• Named entity classification between genres (ACE 2005): news articles, broadcast news, telephone, blogs, etc.

Algorithms

• Baselines: SVM, MIRA, CW

• Our method: VARIANCE

+Experiments

Simulated domain shifts between each pair of genres

• 38 pairs, 10 trials each with different random instance orderings

• 500 source examples, 1500 target examples

False change

• 11 datasets with no shift, 10 trials with different random instance orderings

If no shift is found, the detection point is recorded as the end of the target examples when computing averages.
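
A short sketch of how such a simulated-shift stream can be assembled: a random ordering of source-domain examples followed by a random ordering of target-domain examples, with the true shift point recorded for evaluation. The function and argument names are illustrative.

```python
# Sketch of the simulated-shift protocol above: 500 randomly ordered source
# examples followed by 1500 randomly ordered target examples, so the true
# shift point is known. Assumes each domain has at least that many examples.

import random

def simulated_shift_stream(source_examples, target_examples,
                           n_source=500, n_target=1500, seed=0):
    rng = random.Random(seed)
    source = rng.sample(source_examples, n_source)   # random instance ordering
    target = rng.sample(target_examples, n_target)
    shift_point = len(source)
    return source + target, shift_point
```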

+Comparing Algorithms

[Figure: instances from the point of shift until detection, annotated “Good for our approach!” and “Good for baseline”]

+SVM vs. VARIANCE

+Summary of Results Thus Far

VARIANCE detected shifts faster than …

• SVM: 34 times out of 38

• MIRA: 26 times out of 38

• CW: 27 times out of 38

+Gradual Shifts

+What if you have labels?

STEPD: a Statistical Test of Equal Proportions to Detect concept drift (Nishida and Yamauchi, 2007)

Monitors accuracy of classifier from stream of labeled examples

Parameters: window size, W, and threshold, α
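
For comparison, here is a sketch in the spirit of STEPD: test whether the accuracy on the most recent W labeled examples differs from the accuracy on the earlier examples using a two-proportion test, and signal drift when the p-value drops below α. The continuity-corrected z-test used here is a standard choice and may differ in detail from the paper's exact statistic.

```python
# Sketch in the spirit of STEPD (Nishida and Yamauchi, 2007): compare recent
# accuracy (last W labeled examples) with earlier accuracy using a
# two-proportion z-test with continuity correction; signal drift when the
# p-value falls below alpha. The exact statistic is an assumption.

import math

def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def stepd_drift(correct_flags, W=30, alpha=0.003):
    """correct_flags: 0/1 per prediction, 1 if the classifier was right."""
    n = len(correct_flags)
    if n < 2 * W:
        return False
    older, recent = correct_flags[:-W], correct_flags[-W:]
    r_o, n_o = sum(older), len(older)
    r_r, n_r = sum(recent), len(recent)
    p_hat = (r_o + r_r) / (n_o + n_r)
    num = abs(r_o / n_o - r_r / n_r) - 0.5 * (1 / n_o + 1 / n_r)
    den = math.sqrt(p_hat * (1 - p_hat) * (1 / n_o + 1 / n_r))
    if den == 0:
        return False
    p_value = 2 * normal_sf(num / den)   # two-sided test
    return p_value < alpha
```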

+Comparison to STEPD

+What about false positives?

+The A-Distance: Choosing Parameters

[Figure: a tile A over the distribution P; the parameters are the window size n and the detection threshold ε]

• The A-distance paper gives bounds on FPs and FNs

• Bounds depend on n and ε

• Bounds do not depend on the tiling!

• So loose as to be meaningless

• No guidance on how to choose the tiling

• What if tiles lie outside the support of the data?

+Better Bounds

P_A = true probability of a point falling in tile A

h = number of points that actually fell in A

p_A = h / n = ML estimate of P_A

Define P'_A, h', and p'_A for the second window

Suppose P_A = P'_A; then any change detected is a false positive

What is the probability that |p_A − p'_A| > ε/2?

+Posterior Over PA

B(α, β) is the Beta distribution over α + β Bernoulli trials

α trials have one outcome (the point lands in tile A)

β trials have the other (the point lands in some other tile)
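
One way to get at the question on the previous slide without the analytic derivation: draw P_A from its Beta posterior given the observed counts, simulate two windows of n points that share that probability, and count how often |p_A − p'_A| exceeds ε/2. The uniform prior (giving Beta(h+1, n−h+1)) and the Monte Carlo approach are assumptions for illustration; the talk works this out analytically.

```python
# Monte Carlo sketch of the false-positive probability discussed above:
# assuming both windows share the same true tile probability P_A (drawn from
# a Beta posterior under an assumed uniform prior), how often do the two
# empirical estimates differ by more than eps/2?

import random

def false_positive_prob(h, n, eps, trials=10000, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        p_true = rng.betavariate(h + 1, n - h + 1)   # posterior over P_A
        p1 = sum(rng.random() < p_true for _ in range(n)) / n
        p2 = sum(rng.random() < p_true for _ in range(n)) / n
        if abs(p1 - p2) > eps / 2:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    print(false_positive_prob(h=20, n=200, eps=0.1))
```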

+False Positives: Two Cases

+Don’t worry, I’m not going to explain this (much)

+Probability of a FP (n = 200)

+Probability of FN

+Minimizing Expected Loss

+Moving Forward

[Diagram: a genre classifier routes incoming text to genre-specific models for Newswire, Transcribed Broadcast News, and Twitter]
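
A tiny sketch of the routing idea in the diagram, with the genre classifier and the per-genre models left as hypothetical placeholders.

```python
# Sketch of the architecture above: classify the genre of an incoming
# document, then hand it to a model trained for that genre. Both the genre
# classifier and the per-genre models are hypothetical placeholders.

def route(document, genre_classifier, models):
    """models maps a genre name (e.g. "newswire", "broadcast", "twitter")
    to a genre-specific tagger."""
    genre = genre_classifier(document)
    return models[genre](document)
```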

+Genre Shift “Fix”

… told that John Paul Stevens is retiring this summer …

Named Entity Recognition

… PRESIDENT BARACK OBAMA IS URGING MEMBERS TO…

… President Barack Obama is urging members to …
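
A minimal sketch of what such a “fix” could look like in code: restore case in the all-caps input before tagging. The dictionary-lookup approach and its contents are purely illustrative; the slide does not specify how the case is restored.

```python
# Sketch of a case-restoration "fix": map all-caps tokens back to known
# cased forms before running the news-trained NER model. The lookup table
# here is a hypothetical illustration, not the method from the talk.

def truecase(text, known_forms):
    restored = [known_forms.get(tok.lower(), tok.lower()) for tok in text.split()]
    return " ".join(restored)

if __name__ == "__main__":
    known = {"president": "President", "barack": "Barack", "obama": "Obama"}
    print(truecase("PRESIDENT BARACK OBAMA IS URGING MEMBERS TO", known))
    # President Barack Obama is urging members to
```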

+Conclusion

Changes in margins convey useful information about changes in classification accuracy. No need for labeled examples!

The A-distance applied to margin streams finds genre shifts with few false positives/negatives

Confidence weighted margins normalized by variance detect shifts faster than SVM, MIRA, or (non-normalized) CW margins

Our approach even works with gradual shifts and compares favorably to shift detectors that use labeled examples

+Thank you!