Importance of Semantic Representation: Dataless Classification

Ming-Wei Chang Lev Ratinov Dan Roth Vivek Srikumar

University of Illinois, Urbana-Champaign

Text Categorization

Classify the following sentence:

Syd Millar was the chairman of the International Rugby Board in 2003.

Pick a label:

Class1 vs. Class2

Traditionally, we need annotated data to train a classifier

Text Categorization

Humans don’t seem to need labeled data

Syd Millar was the chairman of the International Rugby Board in 2003.

Pick a label:

Sports vs. Finance

Label names carry a lot of information!

Text Categorization

Do we really always need labeled data?

Contributions

We can often go quite far without annotated data … if we “know” the meaning of text

This works for text categorization … and is consistent across different domains

Outline

Semantic Representation

On-the-fly Classification

Datasets

Exploiting unlabeled data

Robustness to different domains

One common representation is the Bag of Words representation

All text is a vector in the space of words.

Explicit Semantic Analysis [Gabrilovich & Markovitch, 2006, 2007]

Text is a vector in the space of concepts

Concepts are defined by Wikipedia articles

Explicit Semantic Analysis: Example

Text: Monetary Policy

ESA representation (Wikipedia article titles): International Monetary Fund, Monetary policy, Economic and Monetary Union, Hong Kong Monetary Authority, Monetarism, Central bank

Text: Apple IPod

ESA representation (Wikipedia article titles): IPod mini, IPod photo, IPod nano, Apple Computer, IPod shuffle, ITunes
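The mapping from a text to weighted Wikipedia concepts can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the four tiny "articles" are hypothetical stand-ins for Wikipedia text, and the plain TF-IDF weighting is one simple choice, not necessarily the weighting used in the original ESA work.

```python
import math
from collections import Counter

# Hypothetical stand-ins for Wikipedia articles (title -> text); not real article text.
ARTICLES = {
    "Central bank": "central bank sets monetary policy and interest rates",
    "Monetarism": "monetarism studies money supply and monetary policy",
    "IPod nano": "ipod nano is a portable music player by apple",
    "ITunes": "itunes is apple software for music and ipod sync",
}

def tokenize(text):
    return text.lower().split()

def build_index(articles):
    """Map each word to {concept title: TF-IDF weight of the word in that article}."""
    n = len(articles)
    counts = {title: Counter(tokenize(body)) for title, body in articles.items()}
    doc_freq = Counter(word for c in counts.values() for word in c)
    index = {}
    for title, c in counts.items():
        for word, tf in c.items():
            idf = math.log(n / doc_freq[word])
            index.setdefault(word, {})[title] = tf * idf
    return index

def esa_vector(text, index):
    """Represent a text as weights over concepts by summing its words' concept weights."""
    vec = Counter()
    for word in tokenize(text):
        for title, weight in index.get(word, {}).items():
            vec[title] += weight
    # Keep only concepts that received a positive weight.
    return {title: w for title, w in vec.items() if w > 0}

INDEX = build_index(ARTICLES)
print(sorted(esa_vector("Monetary Policy", INDEX).items(), key=lambda kv: -kv[1]))
print(sorted(esa_vector("Apple IPod", INDEX).items(), key=lambda kv: -kv[1]))
```

With this toy inventory, each input text activates only the concepts whose articles share words with it, which is the behavior the example above depicts at Wikipedia scale.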

Two semantic representations: Bag of Words and ESA

Traditional Text Categorization

[Diagram: classes Sports and Finance; labeled corpus → semantic space → a classifier]

[Diagram: classes Sports and Finance; labeled corpus and labels]

What can we do using just the labels?

But labels are text too!

[Diagram: the labels Sports and Finance and a new unlabeled document in the same semantic space]

What is Dataless Classification?

Humans don’t need training for classification

Annotated training data not always needed

Look for the meaning of words

No training data needed

We know the meaning of label names

Pick the label that is closest in meaning to the document

Nearest neighbors
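The nearest-neighbor rule above can be sketched as follows. This is a minimal illustration under stated assumptions: `CONCEPTS` is a hypothetical hand-written concept inventory standing in for an ESA representation, and cosine similarity is assumed as the measure of closeness in meaning.

```python
import math
from collections import Counter

# Hypothetical concept inventory (concept title -> associated words).
# A real system would derive concept weights from Wikipedia, as in ESA.
CONCEPTS = {
    "Rugby union": {"rugby", "board", "sports", "chairman", "team"},
    "Olympic Games": {"sports", "international", "team", "medal"},
    "Central bank": {"finance", "bank", "monetary", "chairman"},
    "Stock market": {"finance", "stock", "shares", "market"},
}

def embed(text):
    """Map a text to concept space: one count per concept."""
    words = Counter(text.lower().replace(".", " ").split())
    return {c: sum(words[w] for w in vocab) for c, vocab in CONCEPTS.items()}

def cosine(u, v):
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def classify(document, label_names):
    """Pick the label whose name is closest to the document; no training data used."""
    doc_vec = embed(document)
    return max(label_names, key=lambda label: cosine(doc_vec, embed(label)))

sentence = "Syd Millar was the chairman of the International Rugby Board in 2003."
print(classify(sentence, ["Sports", "Finance"]))
```

Because label names are embedded at prediction time, the list passed to `classify` can change from call to call without retraining anything.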

[Diagram: new labels Hockey and Baseball and a new unlabeled document in the semantic space]

No need to even know the labels beforehand

Compare with traditional classification: annotated training data is needed for each label

Dataset 1: Twenty Newsgroups

Posts to newsgroups

Newsgroups have descriptive names

sci.electronics = Science Electronics

rec.motorbikes = Motorbikes

Dataset 2: Yahoo Answers

Posts to Yahoo! Answers

Posts categorized into a two-level hierarchy

20 top-level categories

280 categories in total at the second level

Arts and Humanities, Theater Acting

Sports, Rugby League

Experiments

20 Newsgroups: 10 binary problems (from [Raina et al., ‘06])

Religion vs. Politics.guns

Motorcycles vs. MS Windows

Yahoo! Answers: 20 binary problems

Health, Diet fitness vs. Health Allergies

Consumer Electronics DVRs vs. Pets Rodents

Results: On-the-fly classification

Dataset      Supervised Baseline   Bag of Words   ESA
Newsgroup    71.7                  65.7           85.3
Yahoo!       84.3                  66.8           88.6

Supervised Baseline: Naïve Bayes classifier; uses annotated data, ignores labels

Bag of Words and ESA: nearest neighbors; uses labels, no annotated data

Using Unlabeled Data

Knowing the data collection helps

We can learn specific biases of the dataset

Potential for semi-supervised learning

Bootstrapping

Each label name is a “labeled” document

One “example” in word or concept space

Train initial classifier

Same as the on-the-fly classifier

Loop:

Classify all documents with current classifier

Retrain classifier with highly confident predictions
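The bootstrapping loop can be sketched as below. The details are assumptions made for illustration: the classifier is a simple prototype (summed bag-of-words vector) per label, and "highly confident" is taken to mean that the best cosine score beats the runner-up by a hypothetical `margin`. The slides do not specify either choice at this point.

```python
import math
from collections import Counter

def bow(text):
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def bootstrap(label_names, documents, rounds=3, margin=0.1):
    """Needs at least two labels. Returns one predicted label per document."""
    # Initial classifier: each label name is the only "labeled" example.
    prototypes = {label: bow(label) for label in label_names}
    vectors = [bow(d) for d in documents]
    predictions = [None] * len(documents)
    for _ in range(rounds):
        confident = {label: [] for label in label_names}
        # Classify all documents with the current classifier.
        for i, vec in enumerate(vectors):
            ranked = sorted(((cosine(vec, prototypes[l]), l) for l in label_names), reverse=True)
            (best, label), (second, _) = ranked[0], ranked[1]
            predictions[i] = label
            if best - second >= margin:
                confident[label].append(vec)
        # Retrain: label name plus the confidently labeled documents.
        prototypes = {l: sum(confident[l], bow(l)) for l in label_names}
    return predictions

docs = [
    "sports fans watched the rugby match",
    "the rugby match ended in a draw",
    "finance ministers discussed the bank rates",
    "the bank raised interest rates",
]
print(bootstrap(["sports", "finance"], docs))
```

In this toy run the second and fourth documents share no word with any label name, so they can only be labeled after the first round has absorbed vocabulary from the confidently labeled documents.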

Co-training

Words and concepts are two independent “views”

Each view is a teacher for the other

[Blum & Mitchell ‘98]

Co-training

Train initial classifiers in word space and concept space

Loop:

Classify documents with current classifiers

Retrain with highly confident predictions of both classifiers
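A minimal sketch of this co-training loop follows, under several assumptions: `CONCEPTS` is a hypothetical two-concept inventory standing in for ESA, each view uses a prototype classifier with cosine similarity, confidence is a hypothetical score `margin`, and documents on which the two views confidently disagree are skipped. None of these specifics come from the slides.

```python
import math
from collections import Counter

# Hypothetical concept inventory standing in for an ESA concept space.
CONCEPTS = {
    "Rugby union": {"rugby", "sports", "match", "scrum"},
    "Central bank": {"bank", "finance", "rates", "monetary"},
}

def word_view(text):
    return Counter(text.lower().split())

def concept_view(text):
    words = word_view(text)
    return Counter({c: sum(words[w] for w in vocab) for c, vocab in CONCEPTS.items()})

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def predict(vec, prototypes):
    """Return (best label, score gap to the runner-up); needs at least two labels."""
    ranked = sorted(((cosine(vec, p), label) for label, p in prototypes.items()), reverse=True)
    (best, label), (second, _) = ranked[0], ranked[1]
    return label, best - second

def co_train(label_names, documents, rounds=3, margin=0.1):
    views = {"words": word_view, "concepts": concept_view}
    # Initial classifiers: the label names embedded in each view.
    protos = {n: {l: view(l) for l in label_names} for n, view in views.items()}
    for _ in range(rounds):
        votes = {}
        for i, doc in enumerate(documents):
            for name, view in views.items():
                label, gap = predict(view(doc), protos[name])
                if gap >= margin:
                    votes.setdefault(i, set()).add(label)
        # Keep documents that received one unambiguous confident label.
        confident = {i: next(iter(v)) for i, v in votes.items() if len(v) == 1}
        # Retrain both classifiers on the confident predictions of either view.
        protos = {
            n: {l: sum((view(documents[i]) for i, lab in confident.items() if lab == l), view(l))
                for l in label_names}
            for n, view in views.items()
        }

    def score(doc, label):
        return sum(cosine(view(doc), protos[n][label]) for n, view in views.items())

    return [max(label_names, key=lambda l: score(doc, l)) for doc in documents]

docs = [
    "sports fans watched the rugby match",
    "the scrum collapsed twice",
    "finance ministers discussed the bank rates",
    "monetary tightening surprised markets",
]
print(co_train(["sports", "finance"], docs))
```

Here the second and fourth documents share no word with the label names, so the word-space classifier learns their vocabulary only because the concept-space classifier labels them confidently first.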

Using unlabeled data

Three approaches

Bootstrapping with labels using Bag of Words

Bootstrapping with labels using ESA

Co-training

More Results

No annotated data

Co-training using just labels does as well as supervision with 100 examples

Domain Adaptation

Classifiers trained on one domain and tested on another

Performance usually decreases across domains

But the label names are the same

Label names don’t depend on the domain

Label names are robust across domains

On-the-fly classifiers are domain independent

Example: Baseball vs. Hockey

Conclusion

Sometimes, label names tell us more about a class than annotated examples

Standard learning practice of treating labels as unique identifiers loses information

The right semantic representation helps

What is the right one?
