Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization

---Lei Tang, Jianping Zhang and Huan Liu

Taxonomies and Hierarchical Models

Web pages can be organized as a tree-structured taxonomy (Yahoo!, Google directory)

Parental control: Web filters block children’s access to undesirable web sites.

Parents want accurate content categorization at different levels of granularity.

Service providers appreciate seeing the decision path by which a blocking/non-blocking decision is made, for fine-tuning.

Hierarchical model: exploit the taxonomy in the classification strategy or the loss function.

Quality of Taxonomy

Most hierarchical models use a predefined taxonomy, typically semantically sound.

A librarian is often employed to construct the semantic taxonomy.

Is a semantically sound taxonomy always good?

Subjectivity can result in different taxonomies.

Semantics change for specific data.

A Motivating Example

[Figure: placement of the topics Hurricane and Federal Emergency Management Agency relative to Geography and Politics, in two panels: Normally and During Katrina.]

A “Bayesian” View

Stagnant nature of predefined Taxonomy (Prior Knowledge)

Dynamic change of Semantics reflected in Data

Data-Driven Taxonomy

Inconsistent

“Start from Scratch” - Clustering

Throw away the predefined taxonomy information and cluster based on the labeled data.

Two categories: divisive or agglomerative.

Usually requires human experts to specify some parameters, such as the maximum height of the tree, the number of nodes in each branch, etc.

It is difficult to specify these parameters without looking at the data.

Optimal Hierarchy

Optimal hierarchy: the hierarchy with the highest likelihood given the data.

How to estimate the likelihood?

A hierarchical model’s performance and the likelihood are positively related.

Use the hierarchical models’ performance statistics on a validation set to gauge the likelihood.

A brute-force approach that enumerates all taxonomies is infeasible.

Constrained Optimal Hierarchy

A predefined taxonomy can help.

Assumption: the optimal hierarchy is in the neighborhood of the predefined taxonomy H0.

Constrained optimal hierarchy H’ for H0: H’ results from a series of elementary operations that adjust H0 until no likelihood increase is observed.

Elementary Operations

[Figure: example hierarchies illustrating the three elementary operations: ‘Promote’, ‘Merge’ and ‘Demote’.]

(All the leaf nodes remain unchanged)
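
The three operations can be sketched in Python on a child-to-parent map. The figure that defined them did not survive in this transcript, so the semantics below are an assumption: ‘promote’ moves a node up to its grandparent, ‘demote’ moves a node under one of its internal siblings, and ‘merge’ puts two siblings under a new super node. The `Tree` alias, the function names and the node labels are hypothetical, chosen only for illustration.

```python
from typing import Dict, Optional, Set

# A hierarchy as a child -> parent map; the root maps to None.
Tree = Dict[str, Optional[str]]


def leaves(tree: Tree) -> Set[str]:
    """Nodes that are nobody's parent, i.e. the classes."""
    parents = set(tree.values())
    return {n for n in tree if n not in parents}


def _drop_empty_internal(tree: Tree, classes: Set[str]) -> Tree:
    """Remove internal nodes left without children, so the leaf set is unchanged."""
    tree = dict(tree)
    while True:
        parents = set(tree.values())
        empty = [n for n in tree
                 if n not in classes and n not in parents and tree[n] is not None]
        if not empty:
            return tree
        for n in empty:
            del tree[n]


def promote(tree: Tree, node: str) -> Tree:
    """Assumed semantics: `node` becomes a child of its grandparent."""
    parent = tree[node]
    if parent is None or tree[parent] is None:
        raise ValueError("node has no grandparent to be promoted to")
    new = dict(tree)
    new[node] = tree[parent]
    return _drop_empty_internal(new, leaves(tree))


def demote(tree: Tree, node: str, sibling: str) -> Tree:
    """Assumed semantics: `node` becomes a child of an internal sibling."""
    if node == sibling or tree[node] != tree[sibling]:
        raise ValueError("nodes must be distinct siblings")
    if sibling in leaves(tree):
        raise ValueError("demoting under a leaf would change the leaf set")
    new = dict(tree)
    new[node] = sibling
    return new


def merge(tree: Tree, a: str, b: str, super_node: str) -> Tree:
    """Assumed semantics: siblings `a` and `b` move under a new super node."""
    if a == b or tree[a] != tree[b]:
        raise ValueError("nodes must be distinct siblings")
    if super_node in tree:
        raise ValueError("super node name already in use")
    new = dict(tree)
    new[super_node] = tree[a]
    new[a] = new[b] = super_node
    return new


# Hypothetical six-node example: internal nodes 1 and 2, leaves 3 to 6.
h = {"root": None, "1": "root", "2": "root",
     "3": "1", "4": "1", "5": "2", "6": "2"}
promoted = promote(h, "3")          # 3 moves from under 1 to under root
demoted = demote(h, "1", "2")       # the subtree rooted at 1 moves under 2
merged = merge(h, "1", "2", "1+2")  # 1 and 2 share a new parent
```

Each function returns a new map instead of mutating its input, so one hierarchy can be expanded into many neighbors without copying by hand.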

Search in Hierarchy Space

Given a predefined taxonomy, find its best constrained optimal hierarchy.

Search in the hierarchy space.

Finding Best COH

Greedy search: follow the track with the largest likelihood increase at each step to search for the best hierarchy.

Framework (a wrapper approach)

Given: H0, training data T, validation data V

1. Generate neighbor hierarchies for H0.

2. For each neighbor hierarchy, train hierarchical classification models on T.

3. Evaluate the hierarchical classifiers on V.

4. Pick the best neighbor hierarchy as the new H0.

5. Repeat from step 1 until no improvement.
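
The five steps can be sketched as a generic greedy loop. This is a minimal sketch, not the authors' implementation: the callables `neighbors` and `score` are hypothetical placeholders, where `score` stands for training the hierarchical classifiers on T and evaluating them on V, which is not implemented here.

```python
from typing import Callable, Iterable, Tuple, TypeVar

H = TypeVar("H")


def greedy_search(h0: H,
                  neighbors: Callable[[H], Iterable[H]],
                  score: Callable[[H], float]) -> Tuple[H, float]:
    """Follow the best-scoring neighbor until no neighbor improves the score."""
    best, best_score = h0, score(h0)
    while True:
        # Steps 1-3: generate the neighbors and score each one.
        scored = [(score(h), h) for h in neighbors(best)]
        if not scored:
            return best, best_score
        # Step 4: pick the best neighbor.
        top_score, top = max(scored, key=lambda pair: pair[0])
        # Step 5: stop as soon as the best neighbor is no improvement.
        if top_score <= best_score:
            return best, best_score
        best, best_score = top, top_score


# Toy stand-in: "hierarchies" are integers and the score peaks at 5.
result, value = greedy_search(
    0,
    neighbors=lambda n: [n - 1, n + 1],
    score=lambda n: -(n - 5) ** 2,
)
```

Because every neighbor is scored by retraining, the cost per step grows with the number of neighbors, which is why restricting the neighborhood matters.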

Hierarchy Neighbors

Elementary operations can be applied to any node in the tree.

The number of neighbors of a hierarchy could be huge.

Most operations are repeated for evaluation.

Finding Neighbors

Check nodes one by one rather than all the nodes at the same time in each search step.

‘Merge’ and ‘Demote’ only consider the node most similar to the current one.

Nodes at higher levels affect classification more.

Top-down traversal: generate neighbors by performing all possible elementary operations on the shallowest node first.
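
A shallowest-first visiting order can be sketched with a breadth-first walk over a child-to-parent map. The representation and the function name `top_down_order` are assumptions made for illustration; siblings are visited in alphabetical order only to make the result deterministic.

```python
from collections import deque
from typing import Dict, List, Optional


def top_down_order(tree: Dict[str, Optional[str]]) -> List[str]:
    """Return the non-root nodes in breadth-first order, shallowest first."""
    children: Dict[Optional[str], List[str]] = {}
    for node, parent in tree.items():
        children.setdefault(parent, []).append(node)
    order: List[str] = []
    queue = deque(children.get(None, []))   # start from the root
    while queue:
        node = queue.popleft()
        if tree[node] is not None:          # the root itself is never moved
            order.append(node)
        queue.extend(sorted(children.get(node, [])))
    return order


# Hypothetical hierarchy: two internal nodes with three leaves in total.
tree = {"root": None, "A": "root", "B": "root",
        "a1": "A", "b1": "B", "b2": "B"}
order = top_down_order(tree)
```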

Further consideration: 2 types of top-down traversal:

1. Use only the ‘Promote’ operation to generate neighbors.

2. Use only the ‘Demote’ and ‘Merge’ operations to generate neighbors.

Repeat the 2-traversals procedure until no improvement.
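
The alternation of the two traversals can be sketched as follows. All names are hypothetical: `promote_pass` and `demote_merge_pass` stand for the two top-down traversals, `score` stands for the estimated likelihood of a hierarchy, and `max_rounds` is an optional cap added here for convenience.

```python
from typing import Callable, Optional, Tuple, TypeVar

H = TypeVar("H")


def two_traversals(h0: H,
                   promote_pass: Callable[[H], H],
                   demote_merge_pass: Callable[[H], H],
                   score: Callable[[H], float],
                   max_rounds: Optional[int] = None) -> Tuple[H, float, int]:
    """Alternate the two traversals until a full round brings no improvement."""
    best, best_score = h0, score(h0)
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        # One round: a 'Promote'-only pass, then a 'Demote'/'Merge'-only pass.
        candidate = demote_merge_pass(promote_pass(best))
        candidate_score = score(candidate)
        if candidate_score <= best_score:
            break
        best, best_score = candidate, candidate_score
        rounds += 1
    return best, best_score, rounds


# Toy stand-in: "hierarchies" are integers and the score peaks at 4.
toy_score = lambda n: -(n - 4) ** 2
full = two_traversals(0, lambda n: min(n + 2, 4), lambda n: min(n + 1, 4), toy_score)
once = two_traversals(0, lambda n: min(n + 2, 4), lambda n: min(n + 1, 4), toy_score,
                      max_rounds=1)
```

Running ‘Promote’ before ‘Demote’ and ‘Merge’ in each round gives a misplaced node the chance to leave its parent before it is attached elsewhere.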

[Figure: taxonomy fragment with the nodes Geography, Hurricane and Politics.]

If a node is improperly placed under a parent, we need to ‘promote’ it first.

Experiment Setting

10-fold cross-validation

Naïve Bayes classifier (multinomial)

Use information gain to select features

Due to the scarcity of documents in each class, we use the training data to validate the likelihood of a hierarchy.
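
Information gain for a term can be sketched as below, using the standard definition over term presence and absence. The slides do not say which variant was used, so this is an assumption; the helper names and the four toy documents are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Set


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(term: str,
                     docs: Sequence[Set[str]],
                     labels: Sequence[str]) -> float:
    """IG(t) = H(C) - P(t) H(C | t present) - P(not t) H(C | t absent)."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    return (entropy(labels)
            - len(with_term) / n * entropy(with_term)
            - len(without) / n * entropy(without))


def select_features(docs: Sequence[Set[str]],
                    labels: Sequence[str],
                    k: int) -> List[str]:
    """Keep the k terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab,
                    key=lambda t: information_gain(t, docs, labels),
                    reverse=True)
    return ranked[:k]


# Toy documents as sets of terms, with hypothetical class labels.
docs = [{"storm", "wind"}, {"storm", "coast"},
        {"vote", "wind"}, {"vote", "coast"}]
labels = ["geo", "geo", "pol", "pol"]
top_terms = select_features(docs, labels, 2)
```

In this toy set, "storm" and "vote" separate the two classes perfectly, while "wind" and "coast" appear equally often in both and carry no information.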

Data Sets

Data: Soc and Kids

Human-labeled web pages with a predefined taxonomy

             Soc      Kids
Classes      69       244
Nodes        83       299
Height       4        5
Instances    5248     15795
Vocabulary   34003    48115

Results on Soc

Results on Kids

Over-fitting?

As we optimize the hierarchy just based on training data, it’s possible to over-fit the data.

[Figure: one curve per fold (Fold 1 to Fold 10); x-axis: Iteration No., 1 to 7.]

Robust method: instead of multiple traversals (iterations), just do the 2-traversals procedure once.

Conclusions

A semantically sound taxonomy does not necessarily lead to the intended good classification performance.

Given a predefined taxonomy, we can adapt it into a data-driven taxonomy for more accurate classification.

The taxonomy generated by our method outperforms both the human-constructed taxonomy and the taxonomy generated “starting from scratch”.

Future work

This is initial work on combining “noisy” prior knowledge and data.

How to implement an efficient filter model that can find a good taxonomy by exploiting the predefined taxonomy?

Feature selection could reduce the difference between taxonomies. How to use the taxonomy information for feature selection?
